From patchwork Tue Jan 24 17:04:51 2023
X-Patchwork-Submitter: Jamal Hadi Salim
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Cc: kernel@mojatatu.com, deb.chatterjee@intel.com, anjali.singhai@intel.com, namrata.limaye@intel.com, khalidm@nvidia.com, tom@sipanda.io, pratyush@sipanda.io, jiri@resnulli.us, xiyou.wangcong@gmail.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, vladbu@nvidia.com, simon.horman@corigine.com
Subject: [PATCH net-next RFC 01/20] net/sched: act_api: change act_base into an IDR
Date: Tue, 24 Jan 2023 12:04:51 -0500
Message-Id: <20230124170510.316970-1-jhs@mojatatu.com>

Convert act_base from a list to an IDR.
With the introduction of P4TC action templates, we introduce the concept of dynamically creating actions on the fly. Dynamic action IDs are not statically defined (as was the case previously) and are therefore harder to manage with the existing linked-list approach. We convert to an IDR because it has built-in ID management, which we would otherwise have to reinvent on top of linked lists.

Co-developed-by: Victor Nogueira
Signed-off-by: Victor Nogueira
Co-developed-by: Pedro Tammela
Signed-off-by: Pedro Tammela
Signed-off-by: Jamal Hadi Salim
---
 include/uapi/linux/pkt_cls.h |  1 +
 net/sched/act_api.c          | 39 +++++++++++++++++++++---------------
 2 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 648a82f32..4d716841c 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -139,6 +139,7 @@ enum tca_id {
 	TCA_ID_MPLS,
 	TCA_ID_CT,
 	TCA_ID_GATE,
+	TCA_ID_DYN,
 	/* other actions go here */
 	__TCA_ID_MAX = 255
 };
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index cd09ef49d..811dddc3b 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -890,7 +890,7 @@ void tcf_idrinfo_destroy(const struct tc_action_ops *ops,
 }
 EXPORT_SYMBOL(tcf_idrinfo_destroy);
 
-static LIST_HEAD(act_base);
+static DEFINE_IDR(act_base);
 static DEFINE_RWLOCK(act_mod_lock);
 
 /* since act ops id is stored in pernet subsystem list,
  * then there is no way to walk through only all the action
@@ -949,7 +949,6 @@ static void tcf_pernet_del_id_list(unsigned int id)
 int tcf_register_action(struct tc_action_ops *act,
 			struct pernet_operations *ops)
 {
-	struct tc_action_ops *a;
 	int ret;
 
 	if (!act->act || !act->dump || !act->init)
@@ -970,13 +969,24 @@ int tcf_register_action(struct tc_action_ops *act,
 	}
 
 	write_lock(&act_mod_lock);
-	list_for_each_entry(a, &act_base, head) {
-		if (act->id == a->id || (strcmp(act->kind, a->kind) == 0)) {
+	if (act->id) {
+		if (idr_find(&act_base, act->id)) {
 			ret = -EEXIST;
 			goto err_out;
 		}
+		ret = idr_alloc_u32(&act_base, act, &act->id, act->id,
+				    GFP_ATOMIC);
+		if (ret < 0)
+			goto err_out;
+	} else {
+		/* Only dynamic actions will require ID generation */
+		act->id = TCA_ID_DYN;
+
+		ret = idr_alloc_u32(&act_base, act, &act->id, TCA_ID_MAX,
+				    GFP_ATOMIC);
+		if (ret < 0)
+			goto err_out;
 	}
-	list_add_tail(&act->head, &act_base);
 	write_unlock(&act_mod_lock);
 
 	return 0;
@@ -994,17 +1004,12 @@ EXPORT_SYMBOL(tcf_register_action);
 int tcf_unregister_action(struct tc_action_ops *act,
 			  struct pernet_operations *ops)
 {
-	struct tc_action_ops *a;
-	int err = -ENOENT;
+	int err = 0;
 
 	write_lock(&act_mod_lock);
-	list_for_each_entry(a, &act_base, head) {
-		if (a == act) {
-			list_del(&act->head);
-			err = 0;
-			break;
-		}
-	}
+	if (!idr_remove(&act_base, act->id))
+		err = -EINVAL;
+
 	write_unlock(&act_mod_lock);
 	if (!err) {
 		unregister_pernet_subsys(ops);
@@ -1019,10 +1024,11 @@ EXPORT_SYMBOL(tcf_unregister_action);
 static struct tc_action_ops *tc_lookup_action_n(char *kind)
 {
 	struct tc_action_ops *a, *res = NULL;
+	unsigned long tmp, id;
 
 	if (kind) {
 		read_lock(&act_mod_lock);
-		list_for_each_entry(a, &act_base, head) {
+		idr_for_each_entry_ul(&act_base, a, tmp, id) {
 			if (strcmp(kind, a->kind) == 0) {
 				if (try_module_get(a->owner))
 					res = a;
@@ -1038,10 +1044,11 @@ static struct tc_action_ops *tc_lookup_action_n(char *kind)
 static struct tc_action_ops *tc_lookup_action(struct nlattr *kind)
 {
 	struct tc_action_ops *a, *res = NULL;
+	unsigned long tmp, id;
 
 	if (kind) {
 		read_lock(&act_mod_lock);
-		list_for_each_entry(a, &act_base, head) {
+		idr_for_each_entry_ul(&act_base, a, tmp, id) {
 			if (nla_strcmp(kind, a->kind) == 0) {
 				if (try_module_get(a->owner))
 					res = a;

From patchwork Tue Jan 24 17:04:52 2023
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Subject: [PATCH net-next RFC 02/20] net/sched: act_api: increase action kind string length
Date: Tue, 24 Jan 2023 12:04:52 -0500
Message-Id: <20230124170510.316970-2-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>

Increase the action kind string length from IFNAMSIZ to 64.

The new P4TC dynamic actions, created via templates, will have longer names of the format "pipeline_name/act_name".
IFNAMSIZ is currently 16, which is usually too small for the above format. To conform to the new format, we increase the maximum name length to account for the extra string (the pipeline name) and the '/' separator.

Co-developed-by: Victor Nogueira
Signed-off-by: Victor Nogueira
Co-developed-by: Pedro Tammela
Signed-off-by: Pedro Tammela
Signed-off-by: Jamal Hadi Salim
---
 include/net/act_api.h        | 2 +-
 include/uapi/linux/pkt_cls.h | 1 +
 net/sched/act_api.c          | 6 +++---
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index 2a6f443f0..5557c55d5 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -105,7 +105,7 @@ typedef void (*tc_action_priv_destructor)(void *priv);
 
 struct tc_action_ops {
 	struct list_head head;
-	char	kind[IFNAMSIZ];
+	char	kind[ACTNAMSIZ];
 	enum tca_id id; /* identifier should match kind */
 	unsigned int	net_id;
 	size_t	size;
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 4d716841c..5b66df3ec 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -6,6 +6,7 @@
 #include
 
 #define TC_COOKIE_MAX_SIZE 16
+#define ACTNAMSIZ 64
 
 /* Action attributes */
 enum {
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 811dddc3b..2e5a6ebb1 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -449,7 +449,7 @@ static size_t tcf_action_shared_attrs_size(const struct tc_action *act)
 	rcu_read_unlock();
 
 	return  nla_total_size(0) /* action number nested */
-		+ nla_total_size(IFNAMSIZ) /* TCA_ACT_KIND */
+		+ nla_total_size(ACTNAMSIZ) /* TCA_ACT_KIND */
 		+ cookie_len /* TCA_ACT_COOKIE */
 		+ nla_total_size(sizeof(struct nla_bitfield32)) /* TCA_ACT_HW_STATS */
 		+ nla_total_size(0) /* TCA_ACT_STATS nested */
@@ -1312,7 +1312,7 @@ struct tc_action_ops *tc_action_load_ops(struct nlattr *nla, bool police,
 {
 	struct nlattr *tb[TCA_ACT_MAX + 1];
 	struct tc_action_ops *a_o;
-	char act_name[IFNAMSIZ];
+	char act_name[ACTNAMSIZ];
 	struct nlattr *kind;
 	int err;
 
@@ -1327,7 +1327,7 @@ struct tc_action_ops *tc_action_load_ops(struct nlattr *nla, bool police,
 			NL_SET_ERR_MSG(extack, "TC action kind must be specified");
 			return ERR_PTR(err);
 		}
-		if (nla_strscpy(act_name, kind, IFNAMSIZ) < 0) {
+		if (nla_strscpy(act_name, kind, ACTNAMSIZ) < 0) {
 			NL_SET_ERR_MSG(extack, "TC action name too long");
 			return ERR_PTR(err);
 		}

From patchwork Tue Jan 24 17:04:53 2023
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Subject: [PATCH net-next RFC 03/20] net/sched: act_api: increase TCA_ID_MAX
Date: Tue, 24 Jan 2023 12:04:53 -0500
Message-Id: <20230124170510.316970-3-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>

Increase TCA_ID_MAX from 255 to 1023.

Given that P4TC dynamic actions require new IDs (allocated dynamically) and 30 of the existing IDs are already taken by the standard actions (such as gact, mirred and ife), we are left with only 225 actions to create, which seems like a small number.
Signed-off-by: Victor Nogueira
Signed-off-by: Pedro Tammela
Signed-off-by: Jamal Hadi Salim
---
 include/uapi/linux/pkt_cls.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 5b66df3ec..5d6e22f2a 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -142,7 +142,7 @@ enum tca_id {
 	TCA_ID_GATE,
 	TCA_ID_DYN,
 	/* other actions go here */
-	__TCA_ID_MAX = 255
+	__TCA_ID_MAX = 1023
 };
 
 #define TCA_ID_MAX __TCA_ID_MAX

From patchwork Tue Jan 24 17:04:54 2023
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Subject: [PATCH net-next RFC 04/20] net/sched: act_api: add init_ops to struct tc_action_ops
Date: Tue, 24 Jan 2023 12:04:54 -0500
Message-Id: <20230124170510.316970-4-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>

The initialisation of P4TC action instances requires access to a struct p4tc_act (which appears in later patches) to retrieve information such as the dynamic action parameters. In order to retrieve struct p4tc_act we need the pipeline name or id and the action name or id. The init callback from tc_action_ops had no way of supplying that information. To solve this, we create a new tc_action_ops callback (init_ops) that also provides the tc_action_ops struct itself, which in turn gives us the pipeline and action name.

In addition, we add a new refcount to struct tc_action_ops called dyn_ref, which accounts for how many action instances exist for a specific dynamic action.
Co-developed-by: Victor Nogueira
Signed-off-by: Victor Nogueira
Co-developed-by: Pedro Tammela
Signed-off-by: Pedro Tammela
Signed-off-by: Jamal Hadi Salim
---
 include/net/act_api.h |  6 ++++++
 net/sched/act_api.c   | 11 ++++++++---
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index 5557c55d5..64dc75ba6 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -108,6 +108,7 @@ struct tc_action_ops {
 	char	kind[ACTNAMSIZ];
 	enum tca_id id; /* identifier should match kind */
 	unsigned int	net_id;
+	refcount_t dyn_ref;
 	size_t	size;
 	struct module		*owner;
 	int     (*act)(struct sk_buff *, const struct tc_action *,
@@ -119,6 +120,11 @@ struct tc_action_ops {
 			struct nlattr *est, struct tc_action **act,
 			struct tcf_proto *tp,
 			u32 flags, struct netlink_ext_ack *extack);
+	/* This should be merged with the original init action */
+	int     (*init_ops)(struct net *net, struct nlattr *nla,
+			    struct nlattr *est, struct tc_action **act,
+			    struct tcf_proto *tp, struct tc_action_ops *ops,
+			    u32 flags, struct netlink_ext_ack *extack);
 	int     (*walk)(struct net *, struct sk_buff *,
 			struct netlink_callback *, int,
 			const struct tc_action_ops *,
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 2e5a6ebb1..622b8d3c5 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -951,7 +951,7 @@ int tcf_register_action(struct tc_action_ops *act,
 {
 	int ret;
 
-	if (!act->act || !act->dump || !act->init)
+	if (!act->act || !act->dump || (!act->init && !act->init_ops))
 		return -EINVAL;
 
 	/* We have to register pernet ops before making the action ops visible,
@@ -1403,8 +1403,13 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 			}
 		}
 
-		err = a_o->init(net, tb[TCA_ACT_OPTIONS], est, &a, tp,
-				userflags.value | flags, extack);
+		if (a_o->init)
+			err = a_o->init(net, tb[TCA_ACT_OPTIONS], est, &a, tp,
+					userflags.value | flags, extack);
+		else if (a_o->init_ops)
+			err = a_o->init_ops(net, tb[TCA_ACT_OPTIONS], est, &a,
+					    tp, a_o, userflags.value | flags,
+					    extack);
 	} else {
 		err = a_o->init(net, nla, est, &a, tp, userflags.value | flags,
 				extack);

From patchwork Tue Jan 24 17:04:55 2023
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Subject: [PATCH net-next RFC 05/20] net/sched: act_api: introduce tc_lookup_action_byid()
Date: Tue, 24 Jan 2023 12:04:55 -0500
Message-Id: <20230124170510.316970-5-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>

Introduce a lookup helper to retrieve the tc_action_ops instance given its action id.
Co-developed-by: Victor Nogueira
Signed-off-by: Victor Nogueira
Co-developed-by: Pedro Tammela
Signed-off-by: Pedro Tammela
Signed-off-by: Jamal Hadi Salim
---
 include/net/act_api.h |  1 +
 net/sched/act_api.c   | 20 ++++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index 64dc75ba6..083557e51 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -204,6 +204,7 @@ int tcf_idr_check_alloc(struct tc_action_net *tn, u32 *index,
 int tcf_idr_release(struct tc_action *a, bool bind);
 int tcf_register_action(struct tc_action_ops *a,
 			struct pernet_operations *ops);
+struct tc_action_ops *tc_lookup_action_byid(u32 act_id);
 int tcf_unregister_action(struct tc_action_ops *a,
 			  struct pernet_operations *ops);
 int tcf_action_destroy(struct tc_action *actions[], int bind);
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 622b8d3c5..c5dca2085 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -1020,6 +1020,26 @@ int tcf_unregister_action(struct tc_action_ops *act,
 }
 EXPORT_SYMBOL(tcf_unregister_action);
 
+/* lookup by ID */
+struct tc_action_ops *tc_lookup_action_byid(u32 act_id)
+{
+	struct tc_action_ops *a, *res = NULL;
+
+	if (!act_id)
+		return NULL;
+
+	read_lock(&act_mod_lock);
+
+	a = idr_find(&act_base, act_id);
+	if (a && try_module_get(a->owner))
+		res = a;
+
+	read_unlock(&act_mod_lock);
+
+	return res;
+}
+EXPORT_SYMBOL(tc_lookup_action_byid);
+
 /* lookup by name */
 static struct tc_action_ops *tc_lookup_action_n(char *kind)
 {

From patchwork Tue Jan 24 17:04:56 2023
X-Patchwork-Submitter: Jamal Hadi Salim
X-Patchwork-Id: 13114383
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Subject: [PATCH net-next RFC 06/20] net/sched: act_api: export generic tc
 action searcher
Date: Tue, 24 Jan 2023 12:04:56 -0500
Message-Id: <20230124170510.316970-6-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>
References: <20230124170510.316970-1-jhs@mojatatu.com>
X-Patchwork-State: RFC

In P4TC we need to query the tc actions directly in a net namespace.
Therefore export __tcf_idr_search().
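Though not part of this patch, an in-kernel consumer of the exported helper might look like the following sketch. The function name and error handling are illustrative assumptions; tc_lookup_action_byid() is the ID-based lookup introduced earlier in this series, and per the existing tcf_idr_search() convention the search returns nonzero when the action was found.

```c
/* Hypothetical consumer sketch, not from this series: resolve an action
 * ops by ID (which takes a module reference), then search one netns for
 * an action instance with the given index.
 */
static int example_find_action(struct net *net, u32 act_id, u32 index)
{
	struct tc_action_ops *ops;
	struct tc_action *a;
	int found;

	ops = tc_lookup_action_byid(act_id);
	if (!ops)
		return -ENOENT;

	found = __tcf_idr_search(net, ops, &a, index);
	if (found) {
		/* ... use "a", then release its reference ... */
	}

	module_put(ops->owner);	/* drop the ref tc_lookup_action_byid() took */
	return found ? 0 : -ENOENT;
}
```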
Signed-off-by: Victor Nogueira
Signed-off-by: Pedro Tammela
Signed-off-by: Jamal Hadi Salim
---
 include/net/act_api.h | 2 ++
 net/sched/act_api.c   | 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index 083557e51..7328183b4 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -190,6 +190,8 @@ int tcf_generic_walker(struct tc_action_net *tn, struct sk_buff *skb,
 		       const struct tc_action_ops *ops,
 		       struct netlink_ext_ack *extack);
 int tcf_idr_search(struct tc_action_net *tn, struct tc_action **a, u32 index);
+int __tcf_idr_search(struct net *net, const struct tc_action_ops *ops,
+		     struct tc_action **a, u32 index);
 int tcf_idr_create(struct tc_action_net *tn, u32 index, struct nlattr *est,
 		   struct tc_action **a, const struct tc_action_ops *ops,
 		   int bind, bool cpustats, u32 flags);
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index c5dca2085..c730078bb 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -690,9 +690,8 @@ static int __tcf_generic_walker(struct net *net, struct sk_buff *skb,
 	return tcf_generic_walker(tn, skb, cb, type, ops, extack);
 }
 
-static int __tcf_idr_search(struct net *net,
-			    const struct tc_action_ops *ops,
-			    struct tc_action **a, u32 index)
+int __tcf_idr_search(struct net *net, const struct tc_action_ops *ops,
+		     struct tc_action **a, u32 index)
 {
 	struct tc_action_net *tn = net_generic(net, ops->net_id);
 
@@ -701,6 +700,7 @@ static int __tcf_idr_search(struct net *net,
 
 	return tcf_idr_search(tn, a, index);
 }
+EXPORT_SYMBOL(__tcf_idr_search);
 
 static int tcf_idr_delete_index(struct tcf_idrinfo *idrinfo, u32 index)
 {

From patchwork Tue Jan 24 17:04:57 2023
X-Patchwork-Submitter: Jamal Hadi Salim
X-Patchwork-Id: 13114386
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Subject: [PATCH net-next RFC 07/20] net/sched: act_api: create and export
 __tcf_register_action
Date: Tue, 24 Jan 2023 12:04:57 -0500
Message-Id: <20230124170510.316970-7-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>
References: <20230124170510.316970-1-jhs@mojatatu.com>
X-Patchwork-State: RFC

Create and export __tcf_register_action, which will register an action
without registering per net ops for it.
This is necessary for dynamic P4TC actions, which are bound to a P4
pipeline that is in turn bound to a namespace; for this reason they only
need to store themselves in the act_base IDR, but don't need to be
propagated to all net namespaces.

Signed-off-by: Victor Nogueira
Signed-off-by: Pedro Tammela
Signed-off-by: Jamal Hadi Salim
---
 include/net/act_api.h |  2 ++
 net/sched/act_api.c   | 74 +++++++++++++++++++++++++++++++++----------
 2 files changed, 60 insertions(+), 16 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index 7328183b4..26d8d33f9 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -206,9 +206,11 @@ int tcf_idr_check_alloc(struct tc_action_net *tn, u32 *index,
 int tcf_idr_release(struct tc_action *a, bool bind);
 int tcf_register_action(struct tc_action_ops *a,
 			struct pernet_operations *ops);
+int __tcf_register_action(struct tc_action_ops *a);
 struct tc_action_ops *tc_lookup_action_byid(u32 act_id);
 int tcf_unregister_action(struct tc_action_ops *a,
 			  struct pernet_operations *ops);
+int __tcf_unregister_action(struct tc_action_ops *a);
 int tcf_action_destroy(struct tc_action *actions[], int bind);
 int tcf_action_exec(struct sk_buff *skb, struct tc_action **actions,
 		    int nr_actions, struct tcf_result *res);
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index c730078bb..628447669 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -946,18 +946,10 @@ static void tcf_pernet_del_id_list(unsigned int id)
 	mutex_unlock(&act_id_mutex);
 }
 
-int tcf_register_action(struct tc_action_ops *act,
-			struct pernet_operations *ops)
+static int tcf_register_action_pernet(struct pernet_operations *ops)
 {
 	int ret;
 
-	if (!act->act || !act->dump || (!act->init && !act->init_ops))
-		return -EINVAL;
-
-	/* We have to register pernet ops before making the action ops visible,
-	 * otherwise tcf_action_init_1() could get a partially initialized
-	 * netns.
-	 */
 	ret = register_pernet_subsys(ops);
 	if (ret)
 		return ret;
@@ -968,6 +960,17 @@ int tcf_register_action(struct tc_action_ops *act,
 		goto err_id;
 	}
 
+	return 0;
+
+err_id:
+	unregister_pernet_subsys(ops);
+	return ret;
+}
+
+int __tcf_register_action(struct tc_action_ops *act)
+{
+	int ret;
+
 	write_lock(&act_mod_lock);
 	if (act->id) {
 		if (idr_find(&act_base, act->id)) {
@@ -993,16 +996,46 @@ int tcf_register_action(struct tc_action_ops *act,
 err_out:
 	write_unlock(&act_mod_lock);
 
-	if (ops->id)
-		tcf_pernet_del_id_list(*ops->id);
+	return ret;
+}
+EXPORT_SYMBOL(__tcf_register_action);
+
+int tcf_register_action(struct tc_action_ops *act,
+			struct pernet_operations *ops)
+{
+	int ret;
+
+	if (!act->act || !act->dump || !act->init)
+		return -EINVAL;
+
+	/* We have to register pernet ops before making the action ops visible,
+	 * otherwise tcf_action_init_1() could get a partially initialized
+	 * netns.
+	 */
+	ret = tcf_register_action_pernet(ops);
+	if (ret)
+		return ret;
+
+	ret = __tcf_register_action(act);
+	if (ret < 0)
+		goto err_id;
+
+	return 0;
+
 err_id:
 	unregister_pernet_subsys(ops);
 	return ret;
 }
 EXPORT_SYMBOL(tcf_register_action);
 
-int tcf_unregister_action(struct tc_action_ops *act,
-			  struct pernet_operations *ops)
+static void tcf_unregister_action_pernet(struct pernet_operations *ops)
+{
+	unregister_pernet_subsys(ops);
+	if (ops->id)
+		tcf_pernet_del_id_list(*ops->id);
+}
+
+int __tcf_unregister_action(struct tc_action_ops *act)
 {
 	int err = 0;
 
@@ -1011,10 +1044,19 @@ int tcf_unregister_action(struct tc_action_ops *act,
 		err = -EINVAL;
 	write_unlock(&act_mod_lock);
+
+	return err;
+}
+EXPORT_SYMBOL(__tcf_unregister_action);
+
+int tcf_unregister_action(struct tc_action_ops *act,
+			  struct pernet_operations *ops)
+{
+	int err;
+
+	err = __tcf_unregister_action(act);
 	if (!err) {
-		unregister_pernet_subsys(ops);
-		if (ops->id)
-			tcf_pernet_del_id_list(*ops->id);
+		tcf_unregister_action_pernet(ops);
 	}
 	return err;
 }

From patchwork Tue Jan 24 17:04:58 2023
X-Patchwork-Submitter: Jamal Hadi Salim
X-Patchwork-Id: 13114387
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Subject: [PATCH net-next RFC 08/20] net/sched: act_api: add struct
 p4tc_action_ops as a parameter to lookup callback
Date: Tue, 24 Jan 2023 12:04:58 -0500
Message-Id: <20230124170510.316970-8-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>
References: <20230124170510.316970-1-jhs@mojatatu.com>
X-Patchwork-State: RFC

For P4TC dynamic actions, we require information from struct
tc_action_ops, specifically the action kind, to locate the dynamic
action information for the lookup operation.

Signed-off-by: Victor Nogueira
Signed-off-by: Pedro Tammela
Signed-off-by: Jamal Hadi Salim
---
 include/net/act_api.h | 3 ++-
 net/sched/act_api.c   | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index 26d8d33f9..fd012270d 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -115,7 +115,8 @@ struct tc_action_ops {
 		      struct tcf_result *); /* called under RCU BH lock*/
 	int     (*dump)(struct sk_buff *, struct tc_action *, int, int);
 	void	(*cleanup)(struct tc_action *);
-	int     (*lookup)(struct net *net, struct tc_action **a, u32 index);
+	int     (*lookup)(struct net *net, const struct tc_action_ops *ops,
+			  struct tc_action **a, u32 index);
 	int     (*init)(struct net *net, struct nlattr *nla,
 			struct nlattr *est, struct tc_action **act,
 			struct tcf_proto *tp,
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 628447669..e33b0c248 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -696,7 +696,7 @@ int __tcf_idr_search(struct net *net, const struct tc_action_ops *ops,
 	struct tc_action_net *tn = net_generic(net, ops->net_id);
 
 	if (unlikely(ops->lookup))
-		return ops->lookup(net, a, index);
+		return ops->lookup(net, ops, a, index);
 
 	return tcf_idr_search(tn, a, index);
 }

From patchwork Tue Jan 24 17:04:59 2023
X-Patchwork-Submitter: Jamal Hadi Salim
X-Patchwork-Id: 13114388
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Subject: [PATCH net-next RFC 09/20] net: introduce rcu_replace_pointer_rtnl
Date: Tue, 24 Jan 2023 12:04:59 -0500
Message-Id: <20230124170510.316970-9-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>
References: <20230124170510.316970-1-jhs@mojatatu.com>
X-Patchwork-State: RFC

We use rcu_replace_pointer(rcu_ptr, ptr, lockdep_rtnl_is_held())
throughout the P4TC infrastructure code. It may be useful for other use
cases, so we create a helper.
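As a sketch of the intended usage (the struct and field names below are illustrative assumptions, not from the patch), the helper replaces the open-coded rcu_replace_pointer(..., lockdep_rtnl_is_held()) pattern in code paths that already run under rtnl_lock:

```c
/* Hypothetical consumer sketch, not part of this patch. */
struct example_entry {
	struct rcu_head rcu;
	/* ... payload ... */
};

struct example_head {
	struct example_entry __rcu *cur;
};

/* Swap in a new entry under rtnl_lock; free the old one after a grace
 * period via its embedded rcu_head.
 */
static void example_swap(struct example_head *h, struct example_entry *newp)
{
	struct example_entry *old;

	ASSERT_RTNL();
	old = rcu_replace_pointer_rtnl(h->cur, newp);
	if (old)
		kfree_rcu(old, rcu);
}
```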
Co-developed-by: Victor Nogueira
Signed-off-by: Victor Nogueira
Co-developed-by: Pedro Tammela
Signed-off-by: Pedro Tammela
Signed-off-by: Jamal Hadi Salim
---
 include/linux/rtnetlink.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 92ad75549..56a1e80fe 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -71,6 +71,18 @@ static inline bool lockdep_rtnl_is_held(void)
 #define rcu_dereference_bh_rtnl(p)			\
 	rcu_dereference_bh_check(p, lockdep_rtnl_is_held())
 
+/**
+ * rcu_replace_pointer_rtnl - replace an RCU pointer under rtnl_lock, returning
+ * its old value
+ * @rcu_ptr: RCU pointer, whose old value is returned
+ * @ptr: regular pointer
+ *
+ * Perform a replacement under rtnl_lock, where @rcu_ptr is an RCU-annotated
+ * pointer. The old value of @rcu_ptr is returned, and @rcu_ptr is set to @ptr.
+ */
+#define rcu_replace_pointer_rtnl(rcu_ptr, ptr)		\
+	rcu_replace_pointer(rcu_ptr, ptr, lockdep_rtnl_is_held())
+
 /**
  * rtnl_dereference - fetch RCU pointer when updates are prevented by RTNL
  * @p: The pointer to read, prior to dereferencing

From patchwork Tue Jan 24 17:05:00 2023
X-Patchwork-Submitter: Jamal Hadi Salim
X-Patchwork-Id: 13114392
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Subject: [PATCH net-next RFC 10/20] net/kparser: add kParser
Date: Tue, 24 Jan 2023 12:05:00 -0500
Message-Id: <20230124170510.316970-10-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>
References: <20230124170510.316970-1-jhs@mojatatu.com>
X-Patchwork-State: RFC

From: Pratyush Khan

kParser stands for "The Kernel Parser". This is a
programmable/configurable network packet parser, a ported version of
the PANDA parser that is used as a P4 parser by P4TC. For an
introduction to the basic building blocks of the PANDA parser refer to:

https://github.com/panda-net/panda/blob/main/documentation/parser.md

The kParser-enabled iproute2 CLI can be used to configure the kernel
counterpart. kParser removes the need to write new kernel code for
parsing new, unknown headers. To add new header definitions and their
associated parsing semantics, one would use the kParser-enabled
iproute2::ip command utility to teach the kernel.
kParser objects/namespaces
--------------------------

The building blocks of kParser are various objects from different
namespaces/object types. The various namespaces are listed below.

Each object is identified by a human-readable ASCII name of at most 128
bytes, '\0' terminated (128 bytes including the '\0' character). The
character '/' is not allowed in the name, and names cannot start with
'-'. Alternatively, an unsigned 16-bit ID, or both an ID and a name, can
be used to identify an object.

NOTE: During CLI create operations on these objects, it is mandatory to
specify either the name or the ID; both can also be specified. Whichever
is not specified during create will be auto-generated by the kernel
kParser, and the CLI will convey the identifiers to the user for later
use. The user should save these identifiers.

NOTE: Name and ID must always be unique for any specific object type.
The name or ID can later be used to identify the associated object.

The various objects are:

1. condexprs: "Conditional expressions" used to define and configure
   various complex conditional expressions in kParser. They are used to
   validate certain conditions on protocol packet field values.

2. condexprslist: "List of conditional expressions" used to create more
   complex and composite expressions involving more than one conditional
   expression.

3. condexprstable: "A table of conditional expressions" used to
   associate one or more lists of conditional expressions with a packet
   parsing action handler, i.e. a parse node.

4. counter: Used to create and configure counter objects, which can be
   used for a wide range of purposes, such as counting how many VLAN
   headers were parsed, how many TCP options were encountered, etc.

5. countertable: There is only a single global table of counters; the
   size of this table is 7. Multiple kParser parser instances can share
   this countertable.

6. metadata-rule: Defines the metadata structures that will be passed to
   the kParser datapath parser API by the user. This basically defines a
   specific metadata extraction rule. It must match the user-passed
   metadata structure in the datapath API.

7. metadata-ruleset: A list of metadata rules to associate with a packet
   parsing action handler, i.e. a parse node.

8. node: A node (a.k.a. parse node) represents a specific protocol
   header. Defining a protocol handler involves multiple pieces of work,
   i.e. configuring the parser with the associated protocol's packet
   header properties, e.g. the minimum header length, where to look for
   the next protocol field in the packet, etc. Along with that, it also
   defines the rules/handlers to parse and store the required metadata
   by associating a metalist. The table used to find the next protocol
   node is attached to the node. A node can be one of 3 types: PLAIN,
   TLVS and FLAGFIELDS. PLAIN nodes are the basic protocol headers. TLVS
   nodes are for Type-Length-Value protocol headers, such as TCP; they
   also bind a tlvtable to a node. FLAGFIELDS nodes are for indexed
   flags and associated flag-fields protocol headers, such as GRE
   headers; they bind a flagstable to a node.

9. table: A table is a protocol table, which associates a protocol
   number with a node, e.g. ethernet protocol type 0x0800 in network
   order means the next node after the ethernet header is IPv4. NOTE: a
   table has a key, and the key must be unique. Usually this key is a
   protocol number, such as the ethernet type or the IPv4 protocol
   number.

10. tlvnode: A tlvnode defines a specific TLV parsing rule, e.g. to
    parse the TCP option MSS, a new tlvnode needs to be defined. Each
    tlvnode can also associate a metalist with the TLV parsing rule.

11. tlvtable: This is a table of multiple tlvnodes where the keys are
    the types of the TLVs (e.g. a tlvnode defined for TCP MSS should
    have the type/kind value set to 2).

12. flags: Describes a certain protocol's flags, e.g. GRE flags.

13. flagfields: Defines flag fields for the above-mentioned flags, e.g.
    GRE flag fields such as checksum, key, sequence number, etc.
14. flagstable: Defines a table of flagfields and associates them with
    their respective flag values via their indexes. Here the keys are
    usually indexes, because in a typical flag based protocol header, such
    as GRE, the flagfields appear in the packet in the same order as the
    set flag bits. Each flag is defined by the flag value, mask, size and
    an associated metalist.

15. parser: A parser represents a parse tree. It defines the user metadata
    and metametadata structure sizes, the limits on the number of parsing
    nodes and encapsulations, the root node of the parse tree, and the
    exit nodes for the success and failure cases.

16. parserlockunlock: Locks a parser, to be unlocked later when usage is
    done. While locked, the whole parse tree becomes immutable and cannot
    be modified/deleted until unlocked. This is needed to protect against
    modification/deletion during datapath operations.

kParser kernel datapath APIs
----------------------------

The following datapath APIs are exposed by kParser to kernel consumers
(such as P4TC).

/* kParser datapath API 1: parse an skb using a parser instance key.
 * skb: input packet skb
 * kparser_key: key of the associated kParser parser object, which must
 * already have been created via the CLI.
 * _metadata: user provided metadata buffer. It must match the metadata
 * objects configured via the CLI.
 * metadata_len: total length of the user provided metadata buffer.
 * return: kParser error code as defined in include/uapi/linux/kparser.h
 */
int kparser_parse(struct sk_buff *skb,
		  const struct kparser_hkey *kparser_key,
		  void *_metadata, size_t metadata_len);

/* kParser datapath API 2: get/freeze a parser instance using a key.
 * kparser_key: key of the associated kParser parser object, which must
 * already have been created via the CLI.
 * return: NULL if the key is not found, else an opaque parser instance
 * pointer which can be used in the following APIs 3 and 4.
 * NOTE: This call makes the whole parser tree immutable.
 * If the caller calls this more than once, it will need to release the
 * same parser exactly that many times using kparser_put_parser().
 */
const void *kparser_get_parser(const struct kparser_hkey *kparser_key);

/* kParser datapath API 3: parse a void * packet buffer using a cached
 * opaque parser instance pointer.
 * parser: non-NULL opaque pointer returned by kparser_get_parser() and
 * cached by the caller, referencing a valid parser instance.
 * _hdr: input packet buffer
 * parse_len: length of the input packet buffer
 * _metadata: user provided metadata buffer. It must match the metadata
 * objects configured via the CLI.
 * metadata_len: total length of the user provided metadata buffer.
 * return: kParser error code as defined in include/uapi/linux/kparser.h
 */
int __kparser_parse(const void *parser, void *_hdr, size_t parse_len,
		    void *_metadata, size_t metadata_len);

/* kParser datapath API 4: put/un-freeze a parser instance using an opaque
 * parser pointer previously obtained via kparser_get_parser().
 * parser: non-NULL opaque pointer returned by kparser_get_parser() and
 * cached by the caller, referencing a valid parser instance.
 * return: true if the put operation succeeds, else false.
 * NOTE: The very last put makes the whole parser tree deletable again.
*/ bool kparser_put_parser(const void *parser); Signed-off-by: Pratyush Khan Signed-off-by: Tom Herbert Signed-off-by: Jamal Hadi Salim --- include/net/kparser.h | 110 + include/uapi/linux/kparser.h | 674 +++++ net/Kconfig | 9 + net/Makefile | 1 + net/kparser/Makefile | 17 + net/kparser/kparser.h | 418 +++ net/kparser/kparser_cmds.c | 917 +++++++ net/kparser/kparser_cmds_dump_ops.c | 586 +++++ net/kparser/kparser_cmds_ops.c | 3778 +++++++++++++++++++++++++++ net/kparser/kparser_condexpr.h | 52 + net/kparser/kparser_datapath.c | 1266 +++++++++ net/kparser/kparser_main.c | 329 +++ net/kparser/kparser_metaextract.h | 891 +++++++ net/kparser/kparser_types.h | 586 +++++ 14 files changed, 9634 insertions(+) create mode 100644 include/net/kparser.h create mode 100644 include/uapi/linux/kparser.h create mode 100644 net/kparser/Makefile create mode 100644 net/kparser/kparser.h create mode 100644 net/kparser/kparser_cmds.c create mode 100644 net/kparser/kparser_cmds_dump_ops.c create mode 100644 net/kparser/kparser_cmds_ops.c create mode 100644 net/kparser/kparser_condexpr.h create mode 100644 net/kparser/kparser_datapath.c create mode 100644 net/kparser/kparser_main.c create mode 100644 net/kparser/kparser_metaextract.h create mode 100644 net/kparser/kparser_types.h diff --git a/include/net/kparser.h b/include/net/kparser.h new file mode 100644 index 000000000..89575a519 --- /dev/null +++ b/include/net/kparser.h @@ -0,0 +1,110 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright (c) 2022, SiPanda Inc. + * + * kparser.h - kParser global net header file + * + * Authors: Pratyush Kumar Khan + */ + +#ifndef _NET_KPARSER_H +#define _NET_KPARSER_H + +#include +#include + +/* The kParser data path API can consume max 512 bytes */ +#define KPARSER_MAX_SKB_PACKET_LEN 512 + +/* kparser_parse(): Function to parse a skb using a parser instance key. 
+ * + * skb: input packet skb + * kparser_key: key of the associated kParser parser object which must be + * already created via CLI. + * _metadata: User provided metadata buffer. It must be same as configured + * metadata objects in CLI. + * metadata_len: Total length of the user provided metadata buffer. + * avoid_ref: Set this flag in case caller wants to avoid holding the reference + * of the active parser object to save performance on the data path. + * But please be advised, caller should hold the reference of the + * parser object while using this data path. In this case, the CLI + * can be used in advance to get the reference, and caller will also + * need to release the reference via CLI once it is done with the + * data path. + * + * return: kParser error code as defined in include/uapi/linux/kparser.h + */ +extern int kparser_parse(struct sk_buff *skb, + const struct kparser_hkey *kparser_key, + void *_metadata, size_t metadata_len, + bool avoid_ref); + +/* __kparser_parse(): Function to parse a void * packet buffer using a parser instance key. + * + * parser: Non NULL kparser_get_parser() returned and cached opaque pointer + * referencing a valid parser instance. + * _hdr: input packet buffer + * parse_len: length of input packet buffer + * _metadata: User provided metadata buffer. It must be same as configured + * metadata objects in CLI. + * metadata_len: Total length of the user provided metadata buffer. + * + * return: kParser error code as defined in include/uapi/linux/kparser.h + */ +extern int __kparser_parse(const void *parser, void *_hdr, + size_t parse_len, void *_metadata, size_t metadata_len); + +/* kparser_get_parser(): Function to get an opaque reference of a parser instance and mark it + * immutable so that while actively using, it can not be deleted. The parser is identified by a key. + * It marks the associated parser and whole parse tree immutable so that when it is locked, it can + * not be deleted. 
+ * + * kparser_key: key of the associated kParser parser object which must be + * already created via CLI. + * avoid_ref: Set this flag in case caller wants to avoid holding the reference + * of the active parser object to save performance on the data path. + * But please be advised, caller should hold the reference of the + * parser object while using this data path. In this case, the CLI + * can be used in advance to get the reference, and caller will also + * need to release the reference via CLI once it is done with the + * data path. + * + * return: NULL if key not found, else an opaque parser instance pointer which + * can be used in the following APIs 3 and 4. + * + * NOTE: This call makes the whole parser tree immutable. If caller calls this + * more than once, later caller will need to release the same parser exactly that + * many times using the API kparser_put_parser(). + */ +extern const void *kparser_get_parser(const struct kparser_hkey *kparser_key, + bool avoid_ref); + +/* kparser_put_parser(): Function to return and undo the read only operation done previously by + * kparser_get_parser(). The parser instance is identified by using a previously obtained opaque + * parser pointer via API kparser_get_parser(). This undo the immutable change so that any component + * of the whole parse tree can be deleted again. + * + * parser: void *, Non NULL opaque pointer which was previously returned by kparser_get_parser(). + * Caller can use cached opaque pointer as long as system does not restart and kparser.ko is not + * reloaded. + * avoid_ref: Set this flag only when this was used in the prio call to + * kparser_get_parser(). Incorrect usage of this flag might cause + * error and make the parser state unstable. + * + * return: boolean, true if put operation is success, else false. + * + * NOTE: This call makes the whole parser tree deletable for the very last call. 
+ */ +extern bool kparser_put_parser(const void *parser, bool avoid_ref); + +/* net/core/filter.c's callback hook structure to use kParser APIs if kParser enabled */ +struct get_kparser_funchooks { + const void * (*kparser_get_parser_hook)(const struct kparser_hkey + *kparser_key, bool avoid_ref); + int (*__kparser_parse_hook)(const void *parser, void *_hdr, + size_t parse_len, void *_metadata, size_t metadata_len); + bool (*kparser_put_parser_hook)(const void *parser, bool avoid_ref); +}; + +extern struct get_kparser_funchooks kparser_funchooks; + +#endif /* _NET_KPARSER_H */ diff --git a/include/uapi/linux/kparser.h b/include/uapi/linux/kparser.h new file mode 100644 index 000000000..dad9621ee --- /dev/null +++ b/include/uapi/linux/kparser.h @@ -0,0 +1,674 @@ +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */ +/* Copyright (c) 2022, SiPanda Inc. + * + * kparser.h - kParser global Linux header file + * + * Authors: Tom Herbert + * Pratyush Kumar Khan + */ + +#ifndef _LINUX_KPARSER_H +#define _LINUX_KPARSER_H + +#include +#include + +/* *********************** NETLINK_GENERIC *********************** */ +#define KPARSER_GENL_NAME "kParser" +#define KPARSER_GENL_VERSION 0x1 + +/* *********************** NETLINK CLI *********************** */ +/* *********************** Namespaces/objects *********************** */ +enum kparser_global_namespace_ids { + KPARSER_NS_INVALID, + KPARSER_NS_CONDEXPRS, + KPARSER_NS_CONDEXPRS_TABLE, + KPARSER_NS_CONDEXPRS_TABLES, + KPARSER_NS_COUNTER, + KPARSER_NS_COUNTER_TABLE, + KPARSER_NS_METADATA, + KPARSER_NS_METALIST, + KPARSER_NS_NODE_PARSE, + KPARSER_NS_PROTO_TABLE, + KPARSER_NS_TLV_NODE_PARSE, + KPARSER_NS_TLV_PROTO_TABLE, + KPARSER_NS_FLAG_FIELD, + KPARSER_NS_FLAG_FIELD_TABLE, + KPARSER_NS_FLAG_FIELD_NODE_PARSE, + KPARSER_NS_FLAG_FIELD_PROTO_TABLE, + KPARSER_NS_PARSER, + KPARSER_NS_OP_PARSER_LOCK_UNLOCK, + KPARSER_NS_MAX +}; + +#define KPARSER_ATTR_RSP(id) KPARSER_ATTR_RSP_##id + +#define 
KPARSER_DEFINE_ATTR_IDS(id) \ + KPARSER_ATTR_CREATE_##id, /* NLA_BINARY */\ + KPARSER_ATTR_UPDATE_##id, /* NLA_BINARY */\ + KPARSER_ATTR_READ_##id, /* NLA_BINARY */\ + KPARSER_ATTR_DELETE_##id, /* NLA_BINARY */\ + KPARSER_ATTR_RSP(id) + +enum { + KPARSER_ATTR_UNSPEC, /* Add more entries after this */ + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_CONDEXPRS), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_CONDEXPRS_TABLE), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_CONDEXPRS_TABLES), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_COUNTER), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_COUNTER_TABLE), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_METADATA), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_METALIST), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_NODE_PARSE), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_PROTO_TABLE), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_TLV_NODE_PARSE), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_TLV_PROTO_TABLE), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_FLAG_FIELD), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_FLAG_FIELD_TABLE), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_FLAG_FIELD_NODE_PARSE), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_FLAG_FIELD_PROTO_TABLE), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_PARSER), + KPARSER_DEFINE_ATTR_IDS(KPARSER_NS_OP_PARSER_LOCK_UNLOCK), + KPARSER_ATTR_MAX /* Add more entries before this */ +}; + +enum { + KPARSER_CMD_UNSPEC, + KPARSER_CMD_CONFIGURE, + KPARSER_CMD_MAX +}; + +/* *********************** kparser hash key (hkey) *********************** */ +#define KPARSER_INVALID_ID 0xffff + +#define KPARSER_USER_ID_MIN 0 +#define KPARSER_USER_ID_MAX 0x8000 +#define KPARSER_KMOD_ID_MIN 0x8001 +#define KPARSER_KMOD_ID_MAX 0xfffe + +#define KPARSER_MAX_NAME 128 +#define KPARSER_MAX_DIGIT_STR_LEN 16 +#define KPARSER_DEF_NAME_PREFIX "kparser_default_name" +#define KPARSER_USER_ID_MIN 0 +#define KPARSER_USER_ID_MAX 0x8000 +#define KPARSER_KMOD_ID_MIN 0x8001 +#define KPARSER_KMOD_ID_MAX 0xfffe + +struct kparser_hkey { + __u16 id; + char name[KPARSER_MAX_NAME]; +}; + +/* *********************** conditional expressions 
*********************** */ +enum kparser_condexpr_types { + KPARSER_CONDEXPR_TYPE_OR, + KPARSER_CONDEXPR_TYPE_AND, +}; + +enum kparser_expr_types { + KPARSER_CONDEXPR_TYPE_EQUAL, + KPARSER_CONDEXPR_TYPE_NOTEQUAL, + KPARSER_CONDEXPR_TYPE_LT, + KPARSER_CONDEXPR_TYPE_LTE, + KPARSER_CONDEXPR_TYPE_GT, + KPARSER_CONDEXPR_TYPE_GTE, +}; + +/* One boolean condition expressions */ +struct kparser_condexpr_expr { + enum kparser_expr_types type; + __u16 src_off; + __u8 length; + __u32 mask; + __u32 value; +}; + +struct kparser_conf_condexpr { + struct kparser_hkey key; + struct kparser_condexpr_expr config; +}; + +struct kparser_conf_condexpr_table { + struct kparser_hkey key; + int idx; + int default_fail; + enum kparser_condexpr_types type; + struct kparser_hkey condexpr_expr_key; +}; + +struct kparser_conf_condexpr_tables { + struct kparser_hkey key; + int idx; + struct kparser_hkey condexpr_expr_table_key; +}; + +/* *********************** counter *********************** */ +#define KPARSER_CNTR_NUM_CNTRS 7 + +struct kparser_cntr_conf { + bool valid_entry; + __u8 index; + __u32 max_value; + __u32 array_limit; + size_t el_size; + bool reset_on_encap; + bool overwrite_last; + bool error_on_exceeded; +}; + +struct kparser_conf_cntr { + struct kparser_hkey key; + struct kparser_cntr_conf conf; +}; + +/* *********************** metadata *********************** */ +enum kparser_metadata_type { + KPARSER_METADATA_INVALID, + KPARSER_METADATA_HDRDATA, + KPARSER_METADATA_HDRDATA_NIBBS_EXTRACT, + KPARSER_METADATA_HDRLEN, + KPARSER_METADATA_CONSTANT_BYTE, + KPARSER_METADATA_CONSTANT_HALFWORD, + KPARSER_METADATA_OFFSET, + KPARSER_METADATA_BIT_OFFSET, + KPARSER_METADATA_NUMENCAPS, + KPARSER_METADATA_NUMNODES, + KPARSER_METADATA_TIMESTAMP, + KPARSER_METADATA_RETURN_CODE, + KPARSER_METADATA_COUNTER, + KPARSER_METADATA_NOOP, + KPARSER_METADATA_MAX +}; + +enum kparser_metadata_counter_op_type { + KPARSER_METADATA_COUNTEROP_NOOP, + KPARSER_METADATA_COUNTEROP_INCR, + 
KPARSER_METADATA_COUNTEROP_RST +}; + +#define KPARSER_METADATA_OFFSET_MIN 0 +#define KPARSER_METADATA_OFFSET_MAX 0xffffff +#define KPARSER_METADATA_OFFSET_INVALID 0xffffffff + +/* TODO: align and pack all struct members + */ +struct kparser_conf_metadata { + struct kparser_hkey key; + enum kparser_metadata_type type; + enum kparser_metadata_counter_op_type cntr_op; // 3 bit + bool frame; + bool e_bit; + __u16 constant_value; + size_t soff; + size_t doff; + size_t len; + size_t add_off; + struct kparser_hkey counterkey; + struct kparser_hkey counter_data_key; +}; + +/* *********************** metadata list/table *********************** */ +struct kparser_conf_metadata_table { + struct kparser_hkey key; + size_t metadata_keys_count; + struct kparser_hkey metadata_keys[0]; +}; + +/* *********************** parse nodes *********************** */ +/* kParser protocol node types + */ +enum kparser_node_type { + /* Plain node, no super structure */ + KPARSER_NODE_TYPE_PLAIN, + /* TLVs node with super structure for TLVs */ + KPARSER_NODE_TYPE_TLVS, + /* Flag-fields with super structure for flag-fields */ + KPARSER_NODE_TYPE_FLAG_FIELDS, + /* It represents the limit value */ + KPARSER_NODE_TYPE_MAX, +}; + +/* Types for parameterized functions */ +struct kparser_parameterized_len { + __u16 src_off; + __u8 size; + bool endian; + __u32 mask; + __u8 right_shift; + __u8 multiplier; + __u8 add_value; +}; + +struct kparser_parameterized_next_proto { + __u16 src_off; + __u32 mask; + __u8 size; + __u8 right_shift; +}; + +struct kparser_conf_parse_ops { + bool len_parameterized; + struct kparser_parameterized_len pflen; + struct kparser_parameterized_next_proto pfnext_proto; + bool cond_exprs_parameterized; + struct kparser_hkey cond_exprs_table; +}; + +/* base nodes */ +struct kparser_conf_node_proto { + bool encap; + bool overlay; + size_t min_len; + struct kparser_conf_parse_ops ops; +}; + +struct kparser_conf_node_parse { + int unknown_ret; + struct kparser_hkey proto_table_key; 
+ struct kparser_hkey wildcard_parse_node_key; + struct kparser_hkey metadata_table_key; + struct kparser_conf_node_proto proto_node; +}; + +/* TLVS */ +struct kparser_proto_tlvs_opts { + struct kparser_parameterized_len pfstart_offset; + bool len_parameterized; + struct kparser_parameterized_len pflen; + struct kparser_parameterized_next_proto pftype; +}; + +struct kparser_conf_proto_tlvs_node { + struct kparser_proto_tlvs_opts ops; + bool tlvsstdfmt; + bool fixed_start_offset; + size_t start_offset; + __u8 pad1_val; + __u8 padn_val; + __u8 eol_val; + bool pad1_enable; + bool padn_enable; + bool eol_enable; + size_t min_len; +}; + +#define KPARSER_DEFAULT_TLV_MAX_LOOP 255 +#define KPARSER_DEFAULT_TLV_MAX_NON_PADDING 255 +#define KPARSER_DEFAULT_TLV_MAX_CONSEC_PAD_BYTES 255 +#define KPARSER_DEFAULT_TLV_MAX_CONSEC_PAD_OPTS 255 +#define KPARSER_DEFAULT_TLV_DISP_LIMIT_EXCEED 0 +#define KPARSER_DEFAULT_TLV_EXCEED_LOOP_CNT_ERR false + +/* Two bit code that describes the action to take when a loop node + * exceeds a limit + */ +enum { + KPARSER_LOOP_DISP_STOP_OKAY = 0, + KPARSER_LOOP_DISP_STOP_NODE_OKAY = 1, + KPARSER_LOOP_DISP_STOP_SUB_NODE_OKAY = 2, + KPARSER_LOOP_DISP_STOP_FAIL = 3, +}; + +/* Configuration for a TLV node (generally loop nodes) + * + * max_loop: Maximum number of TLVs to process + * max_non: Maximum number of non-padding TLVs to process + * max_plen: Maximum consecutive padding bytes + * max_c_pad: Maximum number of consecutive padding options + * disp_limit_exceed: Disposition when a TLV parsing limit is exceeded. 
See + * KPARSER_LOOP_DISP_STOP_* in parser.h + * exceed_loop_cnt_is_err: True is exceeding maximum number of TLVS is an error + */ +struct kparser_loop_node_config { + __u16 max_loop; + __u16 max_non; + __u8 max_plen; + __u8 max_c_pad; + __u8 disp_limit_exceed; + bool exceed_loop_cnt_is_err; +}; + +/* TODO: + * disp_limit_exceed: 2; + * exceed_loop_cnt_is_err: 1; + */ +struct kparser_conf_parse_tlvs { + struct kparser_conf_proto_tlvs_node proto_node; + struct kparser_hkey tlv_proto_table_key; + int unknown_tlv_type_ret; + struct kparser_hkey tlv_wildcard_node_key; + struct kparser_loop_node_config config; +}; + +/* flag fields */ +struct kparser_parameterized_get_value { + __u16 src_off; + __u32 mask; + __u8 size; +}; + +struct kparser_proto_flag_fields_ops { + bool get_flags_parameterized; + struct kparser_parameterized_get_value pfget_flags; + bool start_fields_offset_parameterized; + struct kparser_parameterized_len pfstart_fields_offset; + bool flag_fields_len; + __u16 hdr_length; +}; + +struct kparser_conf_node_proto_flag_fields { + struct kparser_proto_flag_fields_ops ops; + struct kparser_hkey flag_fields_table_hkey; +}; + +struct kparser_conf_parse_flag_fields { + struct kparser_conf_node_proto_flag_fields proto_node; + struct kparser_hkey flag_fields_proto_table_key; +}; + +struct kparser_conf_node { + struct kparser_hkey key; + enum kparser_node_type type; + struct kparser_conf_node_parse plain_parse_node; + struct kparser_conf_parse_tlvs tlvs_parse_node; + struct kparser_conf_parse_flag_fields flag_fields_parse_node; +}; + +/* *********************** tlv parse node *********************** */ +struct kparser_conf_proto_tlv_node_ops { + bool overlay_type_parameterized; + struct kparser_parameterized_next_proto pfoverlay_type; + bool cond_exprs_parameterized; + struct kparser_hkey cond_exprs_table; +}; + +struct kparser_conf_node_proto_tlv { + size_t min_len; + size_t max_len; + bool is_padding; + struct kparser_conf_proto_tlv_node_ops ops; +}; + +struct 
kparser_conf_node_parse_tlv { + struct kparser_hkey key; + struct kparser_conf_node_proto_tlv node_proto; + struct kparser_hkey overlay_proto_tlvs_table_key; + struct kparser_hkey overlay_wildcard_parse_node_key; + int unknown_ret; + struct kparser_hkey metadata_table_key; +}; + +/* *********************** flag field *********************** */ +/* One descriptor for a flag + * + * flag: protocol value + * mask: mask to apply to field + * size: size for associated field data + */ +struct kparser_flag_field { + __u32 flag; + __u32 networkflag; + __u32 mask; + size_t size; + bool endian; +}; + +struct kparser_conf_flag_field { + struct kparser_hkey key; + struct kparser_flag_field conf; +}; + +/* *********************** flag field parse node *********************** */ +struct kparser_parse_flag_field_node_ops_conf { + struct kparser_hkey cond_exprs_table_key; +}; + +struct kparser_conf_node_parse_flag_field { + struct kparser_hkey key; + struct kparser_hkey metadata_table_key; + struct kparser_parse_flag_field_node_ops_conf ops; +}; + +/* *********************** generic tables *********************** */ +struct kparser_conf_table { + struct kparser_hkey key; + bool add_entry; + __u16 elems_cnt; + int optional_value1; + int optional_value2; + struct kparser_hkey elem_key; +}; + +/* *********************** parser *********************** */ +/* Flags for parser configuration */ +#define KPARSER_F_DEBUG_DATAPATH (1 << 0) +#define KPARSER_F_DEBUG_CLI (1 << 1) + +#define KPARSER_MAX_NODES 10 +#define KPARSER_MAX_ENCAPS 1 +#define KPARSER_MAX_FRAMES 255 + +/* Configuration for a KPARSER parser + * + * flags: Flags KPARSER_F_* + * max_nodes: Maximum number of nodes to parse + * max_encaps: Maximum number of encapsulations to parse + * max_frames: Maximum number of metadata frames + * metameta_size: Size of metameta data. The metameta data is at the head + * of the user defined metadata structure. 
This also serves as the + * offset of the first metadata frame + * frame_size: Size of one metadata frame + */ +struct kparser_config { + __u16 flags; + __u16 max_nodes; + __u16 max_encaps; + __u16 max_frames; + size_t metameta_size; + size_t frame_size; +}; + +struct kparser_conf_parser { + struct kparser_hkey key; + struct kparser_config config; + struct kparser_hkey root_node_key; + struct kparser_hkey ok_node_key; + struct kparser_hkey fail_node_key; + struct kparser_hkey atencap_node_key; +}; + +/* *********************** CLI config interface *********************** */ + +/* NOTE: we can't use BITS_PER_TYPE from kernel header here and had to redefine BITS_IN_U32 + * since this is shared with user space code. + */ +#define KPARSER_CONFIG_MAX_KEYS 128 +#define KPARSER_CONFIG_MAX_KEYS_BV_LEN ((KPARSER_CONFIG_MAX_KEYS /\ + (sizeof(__u32) * 8)) + 1) +struct kparser_config_set_keys_bv { + __u32 ns_keys_bvs[KPARSER_CONFIG_MAX_KEYS_BV_LEN]; +}; + +struct kparser_conf_cmd { + enum kparser_global_namespace_ids namespace_id; + struct kparser_config_set_keys_bv conf_keys_bv; + __u8 recursive_read_delete; + union { + /* for read/delete commands */ + /* KPARSER_NS_OP_PARSER_LOCK_UNLOCK */ + struct kparser_hkey obj_key; + + /* KPARSER_NS_CONDEXPRS */ + struct kparser_conf_condexpr cond_conf; + + /* KPARSER_NS_COUNTER */ + struct kparser_conf_cntr cntr_conf; + + /* KPARSER_NS_METADATA */ + struct kparser_conf_metadata md_conf; + + /* KPARSER_NS_METALIST */ + struct kparser_conf_metadata_table mdl_conf; + + /* KPARSER_NS_NODE_PARSE */ + struct kparser_conf_node node_conf; + + /* KPARSER_NS_TLV_NODE_PARSE */ + struct kparser_conf_node_parse_tlv tlv_node_conf; + + /* KPARSER_NS_FLAG_FIELD */ + struct kparser_conf_flag_field flag_field_conf; + + /* KPARSER_NS_FLAG_FIELD_NODE_PARSE */ + struct kparser_conf_node_parse_flag_field flag_field_node_conf; + + /* KPARSER_NS_PROTO_TABLE */ + /* KPARSER_NS_TLV_PROTO_TABLE */ + /* KPARSER_NS_FLAG_FIELD_TABLE */ + /* 
KPARSER_NS_FLAG_FIELD_PROTO_TABLE */ + /* KPARSER_NS_CONDEXPRS_TABLE */ + /* KPARSER_NS_CONDEXPRS_TABLES */ + /* KPARSER_NS_COUNTER_TABLE */ + struct kparser_conf_table table_conf; + + /* KPARSER_NS_PARSER */ + struct kparser_conf_parser parser_conf; + }; +}; + +struct kparser_cmd_rsp_hdr { + int op_ret_code; + struct kparser_hkey key; + struct kparser_conf_cmd object; + size_t objects_len; + /* array of fixed size kparser_conf_cmd objects */ + struct kparser_conf_cmd objects[0]; +}; + +/* *********************** kParser error code *********************** */ +/* + * There are two variants of the KPARSER return codes. The normal variant is + * a number between -15 and 0 inclusive where the name for the code is + * prefixed by KPARSER_. There is also a special 16-bit encoding which is + * 0xfff0 + -val where val is the negative number for the code so that + * corresponds to values 0xfff0 to 0xffff. Names for the 16-bit encoding + * are prefixed by KPARSER_16BIT_ + */ +enum { + KPARSER_OKAY = 0, /* Okay and continue */ + KPARSER_RET_OKAY = -1, /* Encoding of OKAY in ret code */ + + KPARSER_OKAY_USE_WILD = -2, /* cam instruction */ + KPARSER_OKAY_USE_ALT_WILD = -3, /* cam instruction */ + + KPARSER_STOP_OKAY = -4, /* Okay and stop parsing */ + KPARSER_STOP_NODE_OKAY = -5, /* Stop parsing current node */ + KPARSER_STOP_SUB_NODE_OKAY = -6,/* Stop parsing currnet sub-node */ + + /* Parser failure */ + KPARSER_STOP_FAIL = -12, + KPARSER_STOP_LENGTH = -13, + KPARSER_STOP_UNKNOWN_PROTO = -14, + KPARSER_STOP_ENCAP_DEPTH = -15, + KPARSER_STOP_UNKNOWN_TLV = -16, + KPARSER_STOP_TLV_LENGTH = -17, + KPARSER_STOP_BAD_FLAG = -18, + KPARSER_STOP_FAIL_CMP = -19, + KPARSER_STOP_LOOP_CNT = -20, + KPARSER_STOP_TLV_PADDING = -21, + KPARSER_STOP_OPTION_LIMIT = -22, + KPARSER_STOP_MAX_NODES = -23, + KPARSER_STOP_COMPARE = -24, + KPARSER_STOP_BAD_EXTRACT = -25, + KPARSER_STOP_BAD_CNTR = -26, + KPARSER_STOP_CNTR1 = -27, + KPARSER_STOP_CNTR2 = -28, + KPARSER_STOP_CNTR3 = -29, + 
KPARSER_STOP_CNTR4 = -30, + KPARSER_STOP_CNTR5 = -31, + KPARSER_STOP_CNTR6 = -32, + KPARSER_STOP_CNTR7 = -33, +}; + +static inline const char *kparser_code_to_text(int code) +{ + switch (code) { + case KPARSER_OKAY: + return "okay"; + case KPARSER_RET_OKAY: + return "okay-ret"; + case KPARSER_OKAY_USE_WILD: + return "okay-use-wild"; + case KPARSER_OKAY_USE_ALT_WILD: + return "okay-use-alt-wild"; + case KPARSER_STOP_OKAY: + return "stop-okay"; + case KPARSER_STOP_NODE_OKAY: + return "stop-node-okay"; + case KPARSER_STOP_SUB_NODE_OKAY: + return "stop-sub-node-okay"; + case KPARSER_STOP_FAIL: + return "stop-fail"; + case KPARSER_STOP_LENGTH: + return "stop-length"; + case KPARSER_STOP_UNKNOWN_PROTO: + return "stop-unknown-proto"; + case KPARSER_STOP_ENCAP_DEPTH: + return "stop-encap-depth"; + case KPARSER_STOP_UNKNOWN_TLV: + return "stop-unknown-tlv"; + case KPARSER_STOP_TLV_LENGTH: + return "stop-tlv-length"; + case KPARSER_STOP_BAD_FLAG: + return "stop-bad-flag"; + case KPARSER_STOP_FAIL_CMP: + return "stop-fail-cmp"; + case KPARSER_STOP_LOOP_CNT: + return "stop-loop-cnt"; + case KPARSER_STOP_TLV_PADDING: + return "stop-tlv-padding"; + case KPARSER_STOP_OPTION_LIMIT: + return "stop-option-limit"; + case KPARSER_STOP_MAX_NODES: + return "stop-max-nodes"; + case KPARSER_STOP_COMPARE: + return "stop-compare"; + case KPARSER_STOP_BAD_EXTRACT: + return "stop-bad-extract"; + case KPARSER_STOP_BAD_CNTR: + return "stop-bad-counter"; + default: + return "unknown-code"; + } +} + +/* *********************** HKey utility APIs *********************** */ +static inline bool kparser_hkey_id_empty(const struct kparser_hkey *key) +{ + if (!key) + return true; + return (key->id == KPARSER_INVALID_ID); +} + +static inline bool kparser_hkey_name_empty(const struct kparser_hkey *key) +{ + if (!key) + return true; + return ((key->name[0] == '\0') || + !strcmp(key->name, KPARSER_DEF_NAME_PREFIX)); +} + +static inline bool kparser_hkey_empty(const struct kparser_hkey *key) +{ + return 
(kparser_hkey_id_empty(key) && kparser_hkey_name_empty(key)); +} + +static inline bool kparser_hkey_user_id_invalid(const struct kparser_hkey *key) +{ + if (!key) + return true; + return ((key->id == KPARSER_INVALID_ID) || + (key->id > KPARSER_USER_ID_MAX)); +} + +#endif /* _LINUX_KPARSER_H */ diff --git a/net/Kconfig b/net/Kconfig index 48c33c222..3bd2a507b 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -471,4 +471,13 @@ config NETDEV_ADDR_LIST_TEST default KUNIT_ALL_TESTS depends on KUNIT +config KPARSER + tristate "Parser in Kernel" + help + kParser stands for "The Kernel Parser". This is a programmable + network packet parser which is a ported version of the PANDA + parser. This module exposes kParser APIs in Kernel. + + If unsure, say N. + endif # if NET diff --git a/net/Makefile b/net/Makefile index 0914bea9c..58176c2fd 100644 --- a/net/Makefile +++ b/net/Makefile @@ -79,3 +79,4 @@ obj-$(CONFIG_NET_NCSI) += ncsi/ obj-$(CONFIG_XDP_SOCKETS) += xdp/ obj-$(CONFIG_MPTCP) += mptcp/ obj-$(CONFIG_MCTP) += mctp/ +obj-$(CONFIG_KPARSER) += kparser/ diff --git a/net/kparser/Makefile b/net/kparser/Makefile new file mode 100644 index 000000000..d5a657482 --- /dev/null +++ b/net/kparser/Makefile @@ -0,0 +1,17 @@ +# SPDX-License-Identifier: GPL-2.0-only +# +# Makefile for KPARSER module +# +GCOV_PROFILE := y + +##KBUILD_CFLAGS := -Wall -Wundef -Wno-trigraphs \ +# -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE \ +# -Werror=implicit-function-declaration -Werror=implicit-int \ +# -Wno-format-security \ +# -fanalyzer \ +# -std=gnu89 +ccflags-y := -DDEBUG -DKERNEL_MOD -Wall + +obj-$(CONFIG_KPARSER) += kparser.o + +kparser-objs := kparser_main.o kparser_cmds.o kparser_cmds_ops.o kparser_cmds_dump_ops.o kparser_datapath.o diff --git a/net/kparser/kparser.h b/net/kparser/kparser.h new file mode 100644 index 000000000..836780e6a --- /dev/null +++ b/net/kparser/kparser.h @@ -0,0 +1,418 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright (c) 2022, SiPanda Inc. 
+ * + * kparser.h - kParser local header file + * + * Author: Pratyush Kumar Khan + */ + +#ifndef __KPARSER_H +#define __KPARSER_H + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "kparser_types.h" +#include "kparser_condexpr.h" +#include "kparser_metaextract.h" +#include "kparser_types.h" + +/* These are used to track owner/owned relationship between different objects + */ +struct kparser_ref_ctx { + int nsid; + const void *obj; + const void __rcu **link_ptr; + struct kref *refcount; + struct list_head *list; + struct list_head list_node; +}; + +#define KPARSER_LINK_OBJ_SIGNATURE 0xffaabbff + +/* bookkeeping structure to manage the above struct kparser_ref_ctx and map an owner with owned both + * ways + */ +struct kparser_obj_link_ctx { + int sig; + struct kparser_ref_ctx owner_obj; + struct kparser_ref_ctx owned_obj; +}; + +/* global hash table structures */ +struct kparser_htbl { + struct rhashtable tbl; + struct rhashtable_params tbl_params; +}; + +/* it binds a netlink cli structure to an internal namespace object structure + * + * key: hash key, must be always the very first entry for hash functions to work correctly. + * ht_node_id: ID based hash table's linking object. + * ht_node_name: name based hash table's linking object. + * refcount: tracks how many other objects are linked using refcount. + * config: netlink msg's config structure cached, it is replayed back during read operations. 
+ * owner_list: list pointer for kparser_obj_link_ctx.owner_obj.list + * owned_list: list pointer for kparser_obj_link_ctx.owned_obj.list + */ +struct kparser_glue { + struct kparser_hkey key; + struct rhash_head ht_node_id; + struct rhash_head ht_node_name; + struct kref refcount; + struct kparser_conf_cmd config; + struct list_head owner_list; + struct list_head owned_list; +}; + +/* internal namespace structures for conditional expressions + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_condexpr_expr { + struct kparser_glue glue; + struct kparser_condexpr_expr expr; +}; + +/* internal namespace structures for conditional expressions table + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_condexpr_table { + struct kparser_glue glue; + struct kparser_condexpr_table table; +}; + +/* internal namespace structures for table of conditional expressions table + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_condexpr_tables { + struct kparser_glue glue; + struct kparser_condexpr_tables table; +}; + +/* internal namespace structures for counters + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_counter { + struct kparser_glue glue; + struct kparser_cntr_conf counter_cnf; +}; + +/* internal namespace structures for counter table + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_counter_table { + struct kparser_glue glue; + __u8 elems_cnt; + struct kparser_glue_counter k_cntrs[KPARSER_CNTR_NUM_CNTRS]; +}; + +/* internal namespace structures for metadata + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_metadata_extract { + struct kparser_glue glue; + struct kparser_metadata_extract mde; +}; + +/* internal namespace structures for metadata list + * 
it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_metadata_table { + struct kparser_glue glue; + size_t md_configs_len; + struct kparser_conf_cmd *md_configs; + struct kparser_metadata_table metadata_table; +}; + +/* internal namespace structures for node + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_node { + struct kparser_glue glue; +}; + +struct kparser_glue_glue_parse_node { + struct kparser_glue_node glue; + union { + struct kparser_parse_node node; + struct kparser_parse_flag_fields_node flags_parse_node; + struct kparser_parse_tlvs_node tlvs_parse_node; + } parse_node; +}; + +/* internal namespace structures for table + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_protocol_table { + struct kparser_glue glue; + struct kparser_proto_table proto_table; +}; + +/* internal namespace structures for tlv nodes and tables + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_parse_tlv_node { + struct kparser_glue_node glue; + struct kparser_parse_tlv_node tlv_parse_node; +}; + +/* internal namespace structures for tlvs proto table + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_proto_tlvs_table { + struct kparser_glue glue; + struct kparser_proto_tlvs_table tlvs_proto_table; +}; + +/* internal namespace structures for flagfields and tables + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_flag_field { + struct kparser_glue glue; + struct kparser_flag_field flag_field; +}; + +/* internal namespace structures for flag field node + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_flag_fields { + struct kparser_glue glue; + struct kparser_flag_fields flag_fields; +}; + +struct 
kparser_glue_flag_field_node { + struct kparser_glue_node glue; + struct kparser_parse_flag_field_node node_flag_field; +}; + +/* internal namespace structures for flag field table + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_proto_flag_fields_table { + struct kparser_glue glue; + struct kparser_proto_flag_fields_table flags_proto_table; +}; + +/* internal namespace structures for parser + * it binds a netlink cli structure to an internal namespace object structure + */ +struct kparser_glue_parser { + struct kparser_glue glue; + struct list_head list_node; + struct kparser_parser parser; +}; + +/* name hash table's hash object comparison function callback */ +static inline int kparser_cmp_fn_name(struct rhashtable_compare_arg *arg, + const void *ptr) +{ + const char *key2 = arg->key; + const struct kparser_hkey *key1 = ptr; + + return strcmp(key1->name, key2); +} + +/* ID hash table's hash object comparison function callback */ +static inline int kparser_cmp_fn_id(struct rhashtable_compare_arg *arg, + const void *ptr) +{ + const __u16 *key2 = arg->key; + const __u16 *key1 = ptr; + + return (*key1 != *key2); +} + +/* name hash table's hash calculation function callback from hash key */ +static inline __u32 kparser_generic_hash_fn_name(const void *hkey, __u32 key_len, __u32 seed) +{ + const char *key = hkey; + + /* TODO: check if seed needs to be used here + * TODO: replace xxh32() with siphash + */ + return xxh32(hkey, strlen(key), 0); +} + +/* ID hash table's hash calculation function callback from hash key */ +static inline __u32 kparser_generic_hash_fn_id(const void *hkey, __u32 key_len, __u32 seed) +{ + const __u16 *key = hkey; + /* TODO: check if seed needs to be used here + */ + return *key; +} + +/* name hash table's hash calculation function callback from object */ +static inline __u32 kparser_generic_obj_hashfn_name(const void *obj, __u32 key_len, __u32 seed) +{ + const struct kparser_hkey *key; + 
+ + key = obj; + /* TODO: check if seed needs to be used here + * TODO: replace xxh32() with siphash + * Note: this only works because the key is always the first member + * of all the different kparser objects + */ + return xxh32(key->name, strlen(key->name), 0); +} + +/* ID hash table's hash calculation function callback from object */ +static inline __u32 kparser_generic_obj_hashfn_id(const void *obj, __u32 key_len, __u32 seed) +{ + /* TODO: check if seed needs to be used here + * TODO: replace xxh32() with siphash + * Note: this only works because the key is always the very first member in all the different + * kparser objects + */ + return ((const struct kparser_hkey *)obj)->id; +} + +/* internal shared functions */ +int kparser_init(void); +int kparser_deinit(void); +int kparser_config_handler_add(const void *cmdarg, size_t cmdarglen, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, + void *extack, int *err); +int kparser_config_handler_update(const void *cmdarg, size_t cmdarglen, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, + void *extack, int *err); +int kparser_config_handler_read(const void *cmdarg, size_t cmdarglen, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, + void *extack, int *err); +int kparser_config_handler_delete(const void *cmdarg, size_t cmdarglen, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, + void *extack, int *err); +void *kparser_namespace_lookup(enum kparser_global_namespace_ids ns_id, + const struct kparser_hkey *key); +void kparser_ref_get(struct kref *refcount); +void kparser_ref_put(struct kref *refcount); +int kparser_conf_key_manager(enum kparser_global_namespace_ids ns_id, + const struct kparser_hkey *key, + struct kparser_hkey *new_key, + struct kparser_cmd_rsp_hdr *rsp, + const char *op, + void *extack, int *err); +void kparser_free(void *ptr); +int kparser_namespace_remove(enum kparser_global_namespace_ids ns_id, + struct rhash_head *obj_id, + struct rhash_head *obj_name); +int kparser_namespace_insert(enum
kparser_global_namespace_ids ns_id, + struct rhash_head *obj_id, + struct rhash_head *obj_name); + +/* Generic kParser KMOD's netlink msg handler's definitions for create */ +typedef int kparser_obj_create_update(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err); +/* Generic kParser KMOD's netlink msg handler's definitions for read and delete */ +typedef int kparser_obj_read_del(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, void *extack, int *err); +/* Generic kParser KMOD's netlink msg handler's free callbacks */ +typedef void kparser_free_obj(void *ptr, void *arg); +int kparser_link_attach(const void *owner_obj, + int owner_nsid, + const void **owner_obj_link_ptr, + struct kref *owner_obj_refcount, + struct list_head *owner_list, + const void *owned_obj, + int owned_nsid, + struct kref *owned_obj_refcount, + struct list_head *owned_list, + struct kparser_cmd_rsp_hdr *rsp, + const char *op, + void *extack, int *err); +int kparser_link_detach(const void *obj, + struct list_head *owner_list, + struct list_head *owned_list, + struct kparser_cmd_rsp_hdr *rsp, + void *extack, int *err); +int alloc_first_rsp(struct kparser_cmd_rsp_hdr **rsp, size_t *rsp_len, int nsid); +void kparser_start_new_tree_traversal(void); +void kparser_dump_parser_tree(const struct kparser_parser *obj); + +/* kParser KMOD's netlink msg/cmd handler's, these are innermost handlers */ +kparser_obj_create_update + kparser_create_cond_exprs, + kparser_create_cond_table, + kparser_create_cond_tables, + kparser_create_counter, + kparser_create_counter_table, + kparser_create_metadata, + kparser_create_metalist, + kparser_create_parse_node, + kparser_create_proto_table, + kparser_create_parse_tlv_node, + kparser_create_tlv_proto_table, + kparser_create_flag_field, + kparser_create_flag_field_table, + 
kparser_create_parse_flag_field_node, + kparser_create_flag_field_proto_table, + kparser_create_parser, + kparser_parser_lock; + +kparser_obj_read_del + kparser_read_cond_exprs, + kparser_read_cond_table, + kparser_read_cond_tables, + kparser_read_counter, + kparser_read_counter_table, + kparser_read_metadata, + kparser_del_metadata, + kparser_read_metalist, + kparser_del_metalist, + kparser_read_parse_node, + kparser_del_parse_node, + kparser_read_proto_table, + kparser_del_proto_table, + kparser_read_parse_tlv_node, + kparser_read_tlv_proto_table, + kparser_read_flag_field, + kparser_read_flag_field_table, + kparser_read_parse_flag_field_node, + kparser_read_flag_field_proto_table, + kparser_read_parser, + kparser_del_parser, + kparser_parser_unlock; + +kparser_free_obj + kparser_free_metadata, + kparser_free_metalist, + kparser_free_node, + kparser_free_proto_tbl, + kparser_free_parser; + +#define KPARSER_PARSER_FAST_LOOKUP_RSVD_ID_START 0 +#define KPARSER_PARSER_FAST_LOOKUP_RSVD_ID_STOP 255 + +extern void __rcu + *kparser_fast_lookup_array[KPARSER_PARSER_FAST_LOOKUP_RSVD_ID_STOP - + KPARSER_PARSER_FAST_LOOKUP_RSVD_ID_START + 1]; + +#define KPARSER_KMOD_DEBUG_PRINT(PARSER_FLAG, FMT, ARGS...) \ +do { \ + unsigned int parser_flag = PARSER_FLAG; \ + if ((parser_flag) & KPARSER_F_DEBUG_DATAPATH) \ + pr_alert("kParser:DATA:[%s:%d]" FMT, __func__, __LINE__, ## ARGS);\ + else if ((parser_flag) & KPARSER_F_DEBUG_CLI) \ + pr_alert("kParser:CLI:[%s:%d]" FMT, __func__, __LINE__, ## ARGS);\ + else \ + pr_debug("kParser:[%s:%d]" FMT, __func__, __LINE__, ## ARGS); \ +} \ +while (0) + +#endif /* __KPARSER_H */ diff --git a/net/kparser/kparser_cmds.c b/net/kparser/kparser_cmds.c new file mode 100644 index 000000000..f1daeedb2 --- /dev/null +++ b/net/kparser/kparser_cmds.c @@ -0,0 +1,917 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (c) 2022, SiPanda Inc. 
+ * + * kparser_cmds.c - kParser KMOD-CLI management API layer + * + * Author: Pratyush Kumar Khan + */ + +#include +#include +#include +#include +#include + +#include "kparser.h" + +#define KREF_INIT_VALUE 1 + +/* These are used to track node loops in parse tree traversal operations */ +static __u64 curr_traversal_ts_id_ns; + +/* This function marks the start of a new parse tree traversal operation */ +void kparser_start_new_tree_traversal(void) +{ + curr_traversal_ts_id_ns = ktime_get_ns(); +} + +/* A simple wrapper around kfree() to allow adding internal debug info in the future, particularly to + * track memleaks + */ +void kparser_free(void *ptr) +{ + if (ptr) + kfree(ptr); +} + +/* The kernel API kref_put() requires a non-NULL release callback; since nothing needs to be done + * during refcount release, kparser_release_ref() is just empty. + */ +static void kparser_release_ref(struct kref *kref) +{ +} + +/* Consumer of this is datapath */ +void kparser_ref_get(struct kref *refcount) +{ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "refcnt:%u\n", kref_read(refcount)); + + kref_get(refcount); +} + +/* Consumer of this is datapath */ +void kparser_ref_put(struct kref *refcount) +{ + unsigned int refcnt; + + refcnt = kref_read(refcount); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "refcnt:%u\n", refcnt); + + if (refcnt > KREF_INIT_VALUE) + kref_put(refcount, kparser_release_ref); + else + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "refcount violation detected, val:%u", refcnt); +} + +/* These track/bookkeep owner/owned relationships (both ways) when refcounting is involved among + * various different types of namespace objects + */ +int kparser_link_attach(const void *owner_obj, + int owner_nsid, + const void **owner_obj_link_ptr, + struct kref *owner_obj_refcount, + struct list_head *owner_list, + const void *owned_obj, + int owned_nsid, + struct kref *owned_obj_refcount, + struct list_head *owned_list, + struct kparser_cmd_rsp_hdr *rsp, + const char *op, + void
*extack, int *err) +{ + struct kparser_obj_link_ctx *reflist = NULL; + + reflist = kzalloc(sizeof(*reflist), GFP_KERNEL); + if (!reflist) { + rsp->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kzalloc failed, size: %lu", + op, sizeof(*reflist)); + return -ENOMEM; + } + + reflist->sig = KPARSER_LINK_OBJ_SIGNATURE; + reflist->owner_obj.nsid = owner_nsid; + reflist->owner_obj.obj = owner_obj; + reflist->owner_obj.link_ptr = owner_obj_link_ptr; + reflist->owner_obj.list = owner_list; + reflist->owner_obj.refcount = owner_obj_refcount; + + reflist->owned_obj.nsid = owned_nsid; + reflist->owned_obj.obj = owned_obj; + reflist->owned_obj.list = owned_list; + reflist->owned_obj.refcount = owned_obj_refcount; + + /* reflist is a bookkeeping tracker which maps an owner to an owned object, both ways. + * Hence, for both the owner and owned map contexts, it is kept in their respective lists. + */ + list_add_tail(&reflist->owner_obj.list_node, reflist->owner_obj.list); + list_add_tail(&reflist->owned_obj.list_node, reflist->owned_obj.list); + + if (reflist->owned_obj.refcount) + kref_get(reflist->owned_obj.refcount); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "owner:%p owned:%p ref:%p\n", + owner_obj, owned_obj, reflist); + + synchronize_rcu(); + + return 0; +} + +/* This is the reverse bookkeeping action of kparser_link_attach(). When an object is detached (be it + * owner or owned), the respective map links need to be properly updated to indicate this detachment. + * kparser_link_break() is responsible for this removal update.
+ */ +static inline int kparser_link_break(const void *owner, const void *owned, + struct kparser_obj_link_ctx *ref, + struct kparser_cmd_rsp_hdr *rsp, + void *extack, int *err) +{ + if (!ref) { + if (rsp) { + rsp->op_ret_code = EFAULT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "link is NULL!"); + } + return -EFAULT; + } + + if (ref->sig != KPARSER_LINK_OBJ_SIGNATURE) { + if (rsp) { + rsp->op_ret_code = EFAULT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "link is corrupt!"); + } + return -EFAULT; + } + + if (owner && ref->owner_obj.obj != owner) { + if (rsp) { + rsp->op_ret_code = EFAULT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "link owner corrupt!"); + } + return -EFAULT; + } + + if (owned && ref->owned_obj.obj != owned) { + if (rsp) { + rsp->op_ret_code = EFAULT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "link owned corrupt!"); + } + return -EFAULT; + } + + if (ref->owned_obj.refcount) + kref_put(ref->owned_obj.refcount, kparser_release_ref); + + if (ref->owner_obj.link_ptr) + rcu_assign_pointer(*ref->owner_obj.link_ptr, NULL); + + list_del_init_careful(&ref->owner_obj.list_node); + list_del_init_careful(&ref->owned_obj.list_node); + + synchronize_rcu(); + + return 0; +} + +/* when a detachment happens, from owner object perspective, it needs to remove the bookkeeping + * map contexts with respect to mapped owned objects. + */ +static inline int kparser_link_detach_owner(const void *obj, + struct list_head *list, + struct kparser_cmd_rsp_hdr *rsp, + void *extack, int *err) +{ + struct kparser_obj_link_ctx *tmp_list_ref = NULL, *curr_ref = NULL; + + list_for_each_entry_safe(curr_ref, tmp_list_ref, list, owner_obj.list_node) { + if (kparser_link_break(obj, NULL, curr_ref, rsp, extack, err) != 0) + return -EFAULT; + kparser_free(curr_ref); + } + + return 0; +} + +/* when a detachment happens, from owned object perspective, it needs to remove the bookkeeping + * map contexts with respect to mapped owner objects. 
+ */ +static inline int kparser_link_detach_owned(const void *obj, + struct list_head *list, + struct kparser_cmd_rsp_hdr *rsp, + void *extack, int *err) +{ + struct kparser_obj_link_ctx *tmp_list_ref = NULL, *curr_ref = NULL; + const struct kparser_glue_glue_parse_node *kparsenode; + const struct kparser_glue_protocol_table *proto_table; + int i; + + list_for_each_entry_safe(curr_ref, tmp_list_ref, list, owned_obj.list_node) { + /* Special case handling: + * if this is parse node and owned by a prototable, make sure + * to remove that table's entry from array separately + */ + if (curr_ref->owner_obj.nsid == KPARSER_NS_PROTO_TABLE && + curr_ref->owned_obj.nsid == KPARSER_NS_NODE_PARSE) { + proto_table = curr_ref->owner_obj.obj; + kparsenode = curr_ref->owned_obj.obj; + for (i = 0; i < proto_table->proto_table.num_ents; + i++) { + if (proto_table->proto_table.entries[i].node != + &kparsenode->parse_node.node) + continue; + rcu_assign_pointer(proto_table->proto_table.entries[i].node, NULL); + memset(&proto_table->proto_table.entries[i], 0, + sizeof(proto_table->proto_table.entries[i])); + synchronize_rcu(); + break; + } + } + + if (kparser_link_break(NULL, obj, curr_ref, rsp, extack, err) != 0) + return -EFAULT; + kparser_free(curr_ref); + } + + return 0; +} + +/* bookkeeping function to break a link between an owner and owned object */ +int kparser_link_detach(const void *obj, + struct list_head *owner_list, + struct list_head *owned_list, + struct kparser_cmd_rsp_hdr *rsp, + void *extack, int *err) +{ + if (kparser_link_detach_owner(obj, owner_list, rsp, extack, err) != 0) + return -EFAULT; + + if (kparser_link_detach_owned(obj, owned_list, rsp, extack, err) != 0) + return -EFAULT; + + return 0; +} + +/* kParser KMOD's namespace definitions */ +struct kparser_mod_namespaces { + enum kparser_global_namespace_ids namespace_id; + const char *name; + struct kparser_htbl htbl_name; + struct kparser_htbl htbl_id; + kparser_obj_create_update *create_handler; + 
kparser_obj_create_update *update_handler; + kparser_obj_read_del *read_handler; + kparser_obj_read_del *del_handler; + kparser_free_obj *free_handler; + size_t bv_len; + unsigned long *bv; +}; + +/* Statically define kParser KMOD's namespaces with all the parameters */ +#define KPARSER_DEFINE_MOD_NAMESPACE(g_ns_obj, NSID, OBJ_NAME, FIELD, CREATE, \ + READ, UPDATE, DELETE, FREE) \ +static struct kparser_mod_namespaces g_ns_obj = { \ + .namespace_id = NSID, \ + .name = #NSID, \ + .htbl_name = { \ + .tbl_params = { \ + .head_offset = offsetof( \ + struct OBJ_NAME, \ + FIELD.ht_node_name), \ + .key_offset = offsetof( \ + struct OBJ_NAME, \ + FIELD.key.name), \ + .key_len = sizeof(((struct kparser_hkey *)0)->name), \ + .automatic_shrinking = true, \ + .hashfn = kparser_generic_hash_fn_name, \ + .obj_hashfn = kparser_generic_obj_hashfn_name, \ + .obj_cmpfn = kparser_cmp_fn_name, \ + } \ + }, \ + .htbl_id = { \ + .tbl_params = { \ + .head_offset = offsetof( \ + struct OBJ_NAME, \ + FIELD.ht_node_id), \ + .key_offset = offsetof( \ + struct OBJ_NAME, \ + FIELD.key.id), \ + .key_len = sizeof(((struct kparser_hkey *)0)->id), \ + .automatic_shrinking = true, \ + .hashfn = kparser_generic_hash_fn_id, \ + .obj_hashfn = kparser_generic_obj_hashfn_id, \ + .obj_cmpfn = kparser_cmp_fn_id, \ + } \ + }, \ + \ + .create_handler = CREATE, \ + .read_handler = READ, \ + .update_handler = UPDATE, \ + .del_handler = DELETE, \ + .free_handler = FREE, \ +} + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_condexprs, + KPARSER_NS_CONDEXPRS, + kparser_glue_condexpr_expr, + glue, + kparser_create_cond_exprs, + kparser_read_cond_exprs, + NULL, NULL, NULL); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_condexprs_table, + KPARSER_NS_CONDEXPRS_TABLE, + kparser_glue_condexpr_table, + glue, + kparser_create_cond_table, + kparser_read_cond_table, + NULL, NULL, NULL); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_condexprs_tables, + KPARSER_NS_CONDEXPRS_TABLES, + 
kparser_glue_condexpr_tables, + glue, + kparser_create_cond_tables, + kparser_read_cond_tables, + NULL, NULL, NULL); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_counter, + KPARSER_NS_COUNTER, + kparser_glue_counter, + glue, + kparser_create_counter, + kparser_read_counter, + NULL, NULL, NULL); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_counter_table, + KPARSER_NS_COUNTER_TABLE, + kparser_glue_counter_table, + glue, + kparser_create_counter_table, + kparser_read_counter_table, + NULL, NULL, NULL); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_metadata, + KPARSER_NS_METADATA, + kparser_glue_metadata_extract, + glue, + kparser_create_metadata, + kparser_read_metadata, + NULL, + kparser_del_metadata, + kparser_free_metadata); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_metalist, + KPARSER_NS_METALIST, + kparser_glue_metadata_table, + glue, + kparser_create_metalist, + kparser_read_metalist, + NULL, + kparser_del_metalist, + kparser_free_metalist); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_node_parse, + KPARSER_NS_NODE_PARSE, + kparser_glue_glue_parse_node, + glue.glue, + kparser_create_parse_node, + kparser_read_parse_node, + NULL, + kparser_del_parse_node, + kparser_free_node); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_proto_table, + KPARSER_NS_PROTO_TABLE, + kparser_glue_protocol_table, + glue, + kparser_create_proto_table, + kparser_read_proto_table, + NULL, + kparser_del_proto_table, + kparser_free_proto_tbl); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_tlv_node_parse, + KPARSER_NS_TLV_NODE_PARSE, + kparser_glue_parse_tlv_node, + glue.glue, + kparser_create_parse_tlv_node, + kparser_read_parse_tlv_node, + NULL, NULL, NULL); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_tlv_proto_table, + KPARSER_NS_TLV_PROTO_TABLE, + kparser_glue_proto_tlvs_table, + glue, + kparser_create_tlv_proto_table, + kparser_read_tlv_proto_table, + NULL, NULL, NULL); + 
+KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_flag_field, + KPARSER_NS_FLAG_FIELD, + kparser_glue_flag_field, + glue, + kparser_create_flag_field, + kparser_read_flag_field, + NULL, NULL, NULL); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_flag_field_table, + KPARSER_NS_FLAG_FIELD_TABLE, + kparser_glue_flag_fields, + glue, + kparser_create_flag_field_table, + kparser_read_flag_field_table, + NULL, NULL, NULL); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_flag_field_parse_node, + KPARSER_NS_FLAG_FIELD_NODE_PARSE, + kparser_glue_flag_field_node, + glue.glue, + kparser_create_parse_flag_field_node, + kparser_read_parse_flag_field_node, + NULL, NULL, NULL); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_flag_field_proto_table, + KPARSER_NS_FLAG_FIELD_PROTO_TABLE, + kparser_glue_proto_flag_fields_table, + glue, + kparser_create_flag_field_proto_table, + kparser_read_flag_field_proto_table, + NULL, NULL, NULL); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_parser, + KPARSER_NS_PARSER, + kparser_glue_parser, + glue, + kparser_create_parser, + kparser_read_parser, + NULL, + kparser_del_parser, + kparser_free_parser); + +KPARSER_DEFINE_MOD_NAMESPACE(kparser_mod_namespace_parser_lock_unlock, + KPARSER_NS_OP_PARSER_LOCK_UNLOCK, + kparser_glue_parser, + glue, + kparser_parser_lock, + NULL, NULL, + kparser_parser_unlock, + NULL); + +static struct kparser_mod_namespaces *g_mod_namespaces[] = { + [KPARSER_NS_INVALID] = NULL, + [KPARSER_NS_CONDEXPRS] = &kparser_mod_namespace_condexprs, + [KPARSER_NS_CONDEXPRS_TABLE] = &kparser_mod_namespace_condexprs_table, + [KPARSER_NS_CONDEXPRS_TABLES] = + &kparser_mod_namespace_condexprs_tables, + [KPARSER_NS_COUNTER] = &kparser_mod_namespace_counter, + [KPARSER_NS_COUNTER_TABLE] = &kparser_mod_namespace_counter_table, + [KPARSER_NS_METADATA] = &kparser_mod_namespace_metadata, + [KPARSER_NS_METALIST] = &kparser_mod_namespace_metalist, + [KPARSER_NS_NODE_PARSE] = &kparser_mod_namespace_node_parse, + 
[KPARSER_NS_PROTO_TABLE] = &kparser_mod_namespace_proto_table, + [KPARSER_NS_TLV_NODE_PARSE] = &kparser_mod_namespace_tlv_node_parse, + [KPARSER_NS_TLV_PROTO_TABLE] = &kparser_mod_namespace_tlv_proto_table, + [KPARSER_NS_FLAG_FIELD] = &kparser_mod_namespace_flag_field, + [KPARSER_NS_FLAG_FIELD_TABLE] = + &kparser_mod_namespace_flag_field_table, + [KPARSER_NS_FLAG_FIELD_NODE_PARSE] = + &kparser_mod_namespace_flag_field_parse_node, + [KPARSER_NS_FLAG_FIELD_PROTO_TABLE] = + &kparser_mod_namespace_flag_field_proto_table, + [KPARSER_NS_PARSER] = &kparser_mod_namespace_parser, + [KPARSER_NS_OP_PARSER_LOCK_UNLOCK] = + &kparser_mod_namespace_parser_lock_unlock, + [KPARSER_NS_MAX] = NULL, +}; + +/* Function to allocate autogen IDs for hash keys if user did not allocate themselves + * TODO: free ids + */ +static inline __u16 allocate_id(__u16 id, unsigned long *bv, size_t bvsize) +{ + int i; + + if (id != KPARSER_INVALID_ID) { + /* try to allocate passed id */ + /* already allocated, conflict */ + if (!test_bit(id, bv)) + return KPARSER_INVALID_ID; + __clear_bit(id, bv); + return id; + } + + /* allocate internally, scan bitvector */ + for (i = 0; i < bvsize; i++) { + /* avoid bit vectors which are already full */ + if (bv[i]) { + id = __builtin_ffsl(bv[i]); + if (id) { + id--; + id += (i * BITS_PER_TYPE(unsigned long)); + __clear_bit(id, bv); + return (id + KPARSER_KMOD_ID_MIN); + } + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "ID alloc failed: {%d:%d}\n", + id, i); + return KPARSER_INVALID_ID; + } + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "ID alloc failed: {%d:%d}\n", id, i); + return KPARSER_INVALID_ID; +} + +/* allocate hash key's autogen ID */ +static inline int kparser_allocate_key_id(enum kparser_global_namespace_ids ns_id, + const struct kparser_hkey *key, + struct kparser_hkey *new_key) +{ + *new_key = *key; + new_key->id = allocate_id(KPARSER_INVALID_ID, + g_mod_namespaces[ns_id]->bv, + g_mod_namespaces[ns_id]->bv_len); + + if (new_key->id == 
KPARSER_INVALID_ID) + return -ENOENT; + + return 0; +} + +/* allocate hash key's autogen name */ +static inline bool kparser_allocate_key_name(enum kparser_global_namespace_ids ns_id, + const struct kparser_hkey *key, + struct kparser_hkey *new_key) +{ + *new_key = *key; + memset(new_key->name, 0, sizeof(new_key->name)); + snprintf(new_key->name, sizeof(new_key->name), + "%s-%s-%u", KPARSER_DEF_NAME_PREFIX, + g_mod_namespaces[ns_id]->name, key->id); + new_key->name[sizeof(new_key->name) - 1] = '\0'; + return true; +} + +/* check and decide which component of hash key needs to be allocated using autogen code */ +int kparser_conf_key_manager(enum kparser_global_namespace_ids ns_id, + const struct kparser_hkey *key, + struct kparser_hkey *new_key, + struct kparser_cmd_rsp_hdr *rsp, + const char *op, + void *extack, int *err) +{ + if (kparser_hkey_empty(key)) { + rsp->op_ret_code = -EINVAL; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:HKey missing", op); + return -EINVAL; + } + + if (kparser_hkey_id_empty(key) && new_key) + return kparser_allocate_key_id(ns_id, key, new_key); + + if (kparser_hkey_user_id_invalid(key)) { + rsp->op_ret_code = -EINVAL; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:HKey id invalid:%u", + op, key->id); + return -EINVAL; + } + + if (kparser_hkey_name_empty(key) && new_key) + return kparser_allocate_key_name(ns_id, key, new_key); + + if (new_key) + *new_key = *key; + + return 0; +} + +/* remove an object from namespace */ +int kparser_namespace_remove(enum kparser_global_namespace_ids ns_id, + struct rhash_head *obj_id, + struct rhash_head *obj_name) +{ + int rc; + + if (ns_id <= KPARSER_NS_INVALID || ns_id >= KPARSER_NS_MAX) + return -EINVAL; + + if (!g_mod_namespaces[ns_id]) + return -ENOENT; + + rc = rhashtable_remove_fast(&g_mod_namespaces[ns_id]->htbl_id.tbl, obj_id, + g_mod_namespaces[ns_id]->htbl_id.tbl_params); + + if (rc) + return rc; + + rc = rhashtable_remove_fast(&g_mod_namespaces[ns_id]->htbl_name.tbl, obj_name, + 
g_mod_namespaces[ns_id]->htbl_name.tbl_params); + + return rc; +} + +/* look up an object in a namespace using its hash key */ +void *kparser_namespace_lookup(enum kparser_global_namespace_ids ns_id, + const struct kparser_hkey *key) +{ + void *ret; + + if (ns_id <= KPARSER_NS_INVALID || ns_id >= KPARSER_NS_MAX) + return NULL; + + if (!g_mod_namespaces[ns_id]) + return NULL; + + ret = rhashtable_lookup(&g_mod_namespaces[ns_id]->htbl_id.tbl, + &key->id, + g_mod_namespaces[ns_id]->htbl_id.tbl_params); + + if (ret) + return ret; + + ret = rhashtable_lookup(&g_mod_namespaces[ns_id]->htbl_name.tbl, + key->name, + g_mod_namespaces[ns_id]->htbl_name.tbl_params); + + return ret; +} + +/* insert an object into a namespace using its hash key */ +int kparser_namespace_insert(enum kparser_global_namespace_ids ns_id, + struct rhash_head *obj_id, + struct rhash_head *obj_name) +{ + int rc; + + if (ns_id <= KPARSER_NS_INVALID || ns_id >= KPARSER_NS_MAX) + return -EINVAL; + + if (!g_mod_namespaces[ns_id]) + return -ENOENT; + + rc = rhashtable_insert_fast(&g_mod_namespaces[ns_id]->htbl_id.tbl, obj_id, + g_mod_namespaces[ns_id]->htbl_id.tbl_params); + if (rc) + return rc; + + rc = rhashtable_insert_fast(&g_mod_namespaces[ns_id]->htbl_name.tbl, obj_name, + g_mod_namespaces[ns_id]->htbl_name.tbl_params); + + return rc; +} + +/* allocate the mandatory very first response header (rsp) for netlink reply msg */ +int alloc_first_rsp(struct kparser_cmd_rsp_hdr **rsp, size_t *rsp_len, int nsid) +{ + if (!rsp || *rsp || !rsp_len || (*rsp_len != 0)) + return -EINVAL; + + *rsp = kzalloc(sizeof(**rsp), GFP_KERNEL); + if (!(*rsp)) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, ":kzalloc failed for rsp, size:%lu\n", + sizeof(**rsp)); + return -ENOMEM; + } + + *rsp_len = sizeof(struct kparser_cmd_rsp_hdr); + (*rsp)->object.namespace_id = nsid; + (*rsp)->objects_len = 0; + return 0; +} + +/* initialize kParser's name space manager contexts */ +int kparser_init(void) +{ + int err, i, j; +
KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + for (i = 0; i < (sizeof(g_mod_namespaces) / + sizeof(g_mod_namespaces[0])); i++) { + if (!g_mod_namespaces[i]) + continue; + + err = rhashtable_init(&g_mod_namespaces[i]->htbl_name.tbl, + &g_mod_namespaces[i]->htbl_name.tbl_params); + if (err) + goto handle_error; + + err = rhashtable_init(&g_mod_namespaces[i]->htbl_id.tbl, + &g_mod_namespaces[i]->htbl_id.tbl_params); + if (err) + goto handle_error; + + g_mod_namespaces[i]->bv_len = + ((KPARSER_KMOD_ID_MAX - KPARSER_KMOD_ID_MIN) / + BITS_PER_TYPE(unsigned long)) + 1; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "bv_len:%lu, total_bytes:%lu, range:[%d:%d]\n", + g_mod_namespaces[i]->bv_len, + sizeof(unsigned long) * g_mod_namespaces[i]->bv_len, + KPARSER_KMOD_ID_MAX, KPARSER_KMOD_ID_MIN); + + g_mod_namespaces[i]->bv = kcalloc(g_mod_namespaces[i]->bv_len, + sizeof(unsigned long), + GFP_KERNEL); + + if (!g_mod_namespaces[i]->bv) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "kzalloc() failed"); + goto handle_error; + } + + memset(g_mod_namespaces[i]->bv, 0xff, + g_mod_namespaces[i]->bv_len * sizeof(unsigned long)); + } + + memset(kparser_fast_lookup_array, 0, sizeof(kparser_fast_lookup_array)); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + + return 0; + +handle_error: + for (j = 0; j < i; j++) { + if (!g_mod_namespaces[j]) + continue; + + rhashtable_destroy(&g_mod_namespaces[j]->htbl_name.tbl); + rhashtable_destroy(&g_mod_namespaces[j]->htbl_id.tbl); + + kparser_free(g_mod_namespaces[j]->bv); + g_mod_namespaces[j]->bv_len = 0; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + + return err; +} + +/* de-initialize kParser's name space manager contexts and free and remove all entries */ +int kparser_deinit(void) +{ + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + for (i = 0; i < ARRAY_SIZE(g_mod_namespaces); i++) { + if (!g_mod_namespaces[i]) + continue; + + 
rhashtable_destroy(&g_mod_namespaces[i]->htbl_name.tbl); + rhashtable_free_and_destroy(&g_mod_namespaces[i]->htbl_id.tbl, + g_mod_namespaces[i]->free_handler, NULL); + + kparser_free(g_mod_namespaces[i]->bv); + + g_mod_namespaces[i]->bv_len = 0; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return 0; +} + +/* pre-process handler for all the netlink msg processors */ +static inline const struct kparser_conf_cmd +*kparser_config_handler_preprocess(const void *cmdarg, + size_t cmdarglen, struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len) +{ + enum kparser_global_namespace_ids ns_id; + const struct kparser_conf_cmd *conf; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + conf = cmdarg; + if (!conf || cmdarglen < sizeof(*conf) || !rsp || *rsp || !rsp_len || + (*rsp_len != 0) || conf->namespace_id <= KPARSER_NS_INVALID || + conf->namespace_id >= KPARSER_NS_MAX) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "[%p %lu %p %p %p %lu %d]\n", + conf, cmdarglen, rsp, *rsp, rsp_len, + *rsp_len, conf->namespace_id); + goto err_return; + } + + ns_id = conf->namespace_id; + + if (!g_mod_namespaces[ns_id]) + goto err_return; + + if (!g_mod_namespaces[ns_id]->create_handler) + goto err_return; + + rc = alloc_first_rsp(rsp, rsp_len, conf->namespace_id); + if (rc) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "alloc_first_rsp() failed, rc:%d\n", rc); + goto err_return; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return cmdarg; + +err_return: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return NULL; +} + +#define KPARSER_CONFIG_HANDLER_PRE() \ +do { \ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); \ + conf = kparser_config_handler_preprocess(cmdarg, cmdarglen, \ + rsp, rsp_len); \ + if (!conf) \ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); \ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); \ +} \ +while (0) + +/* netlink msg processors for create */ +int 
kparser_config_handler_add(const void *cmdarg, size_t cmdarglen, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, + void *extack, int *err) +{ + const struct kparser_conf_cmd *conf; + + KPARSER_CONFIG_HANDLER_PRE(); + + if (!conf) + return KPARSER_ATTR_UNSPEC; + + if (!g_mod_namespaces[conf->namespace_id]->create_handler) + return KPARSER_ATTR_UNSPEC; + + return g_mod_namespaces[conf->namespace_id]->create_handler(conf, cmdarglen, + rsp, + rsp_len, + "create", + extack, err); +} + +/* netlink msg processors for update */ +int kparser_config_handler_update(const void *cmdarg, size_t cmdarglen, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, void *extack, int *err) +{ + const struct kparser_conf_cmd *conf; + + KPARSER_CONFIG_HANDLER_PRE(); + + if (!conf) + return KPARSER_ATTR_UNSPEC; + + if (!g_mod_namespaces[conf->namespace_id]->update_handler) + return KPARSER_ATTR_UNSPEC; + + return g_mod_namespaces[conf->namespace_id]->update_handler(conf, cmdarglen, + rsp, + rsp_len, + "update", + extack, err); +} + +/* netlink msg processors for read */ +int kparser_config_handler_read(const void *cmdarg, size_t cmdarglen, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, void *extack, int *err) +{ + const struct kparser_conf_cmd *conf; + + KPARSER_CONFIG_HANDLER_PRE(); + + if (!conf) + return KPARSER_ATTR_UNSPEC; + + if (!g_mod_namespaces[conf->namespace_id]->read_handler) + return KPARSER_ATTR_UNSPEC; + + return g_mod_namespaces[conf->namespace_id]->read_handler(&conf->obj_key, rsp, rsp_len, + conf->recursive_read_delete, "read", extack, err); +} + +/* netlink msg processors for delete */ +int kparser_config_handler_delete(const void *cmdarg, size_t cmdarglen, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, void *extack, int *err) +{ + const struct kparser_conf_cmd *conf; + + KPARSER_CONFIG_HANDLER_PRE(); + + if (!conf) + return KPARSER_ATTR_UNSPEC; + + if (!g_mod_namespaces[conf->namespace_id]->del_handler) + return KPARSER_ATTR_UNSPEC; + + return 
g_mod_namespaces[conf->namespace_id]->del_handler(&conf->obj_key, rsp, rsp_len, + conf->recursive_read_delete, "delete", extack, err); +} diff --git a/net/kparser/kparser_cmds_dump_ops.c b/net/kparser/kparser_cmds_dump_ops.c new file mode 100644 index 000000000..58c867a7e --- /dev/null +++ b/net/kparser/kparser_cmds_dump_ops.c @@ -0,0 +1,586 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (c) 2022, SiPanda Inc. + * + * kparser_cmds_dump_ops.c - kParser KMOD-CLI debug dump operations + * + * Author: Pratyush Kumar Khan + */ + +#include "kparser.h" + +/* forward declarations of dump functions which dump config structures for debug purposes */ +static void kparser_dump_node(const struct kparser_parse_node *obj); +static void kparser_dump_proto_table(const struct kparser_proto_table *obj); +static void kparser_dump_tlv_parse_node(const struct kparser_parse_tlv_node *obj); +static void kparser_dump_metadatatable(const struct kparser_metadata_table *obj); +static void kparser_dump_cond_tables(const struct kparser_condexpr_tables *obj); + +/* debug code: dump kparser_parameterized_len structure */ +static void kparser_dump_param_len(const struct kparser_parameterized_len *pflen) +{ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!pflen) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "pflen.src_off:%u\n", pflen->src_off); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "pflen.size:%u\n", pflen->size); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "pflen.endian:%d\n", pflen->endian); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "pflen.mask:%u\n", pflen->mask); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "pflen.right_shift:%u\n", pflen->right_shift); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "pflen.multiplier:%u\n", pflen->multiplier); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "pflen.add_value:%u\n", pflen->add_value); + +done: + 
KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_parameterized_next_proto structure */ +static void kparser_dump_param_next_proto(const struct kparser_parameterized_next_proto + *pfnext_proto) +{ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!pfnext_proto) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "pfnext_proto.src_off:%u\n", pfnext_proto->src_off); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "pfnext_proto.mask:%u\n", pfnext_proto->mask); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "pfnext_proto.size:%u\n", pfnext_proto->size); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "pfnext_proto.right_shift:%u\n", pfnext_proto->right_shift); + +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_condexpr_expr structure */ +static void kparser_dump_cond_expr(const struct kparser_condexpr_expr *obj) +{ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "type:%u, src_off:%u, len:%u, mask:%04x value:%04x\n", + obj->type, obj->src_off, + obj->length, obj->mask, obj->value); +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_condexpr_table structure */ +static void kparser_dump_cond_table(const struct kparser_condexpr_table *obj) +{ + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "default_fail:%d, type:%u\n", obj->default_fail, obj->type); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "num_ents:%u, entries:%p\n", obj->num_ents, obj->entries); + + if (!obj->entries) + goto done; + + for (i = 0; i < 
obj->num_ents; i++) + kparser_dump_cond_expr(obj->entries[i]); + +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_condexpr_tables structure */ +static void kparser_dump_cond_tables(const struct kparser_condexpr_tables *obj) +{ + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "num_ents:%u, entries:%p\n", obj->num_ents, obj->entries); + if (!obj->entries) + goto done; + + for (i = 0; i < obj->num_ents; i++) + kparser_dump_cond_table(obj->entries[i]); + +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_proto_node structure */ +static void kparser_dump_proto_node(const struct kparser_proto_node *obj) +{ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "encap:%u\n", obj->encap); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "overlay:%u\n", obj->overlay); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "min_len:%lu\n", obj->min_len); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "ops.flag_fields_length:%d\n", obj->ops.flag_fields_length); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "ops.len_parameterized:%d\n", obj->ops.len_parameterized); + kparser_dump_param_len(&obj->ops.pflen); + + kparser_dump_param_next_proto(&obj->ops.pfnext_proto); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "ops.cond_exprs_parameterized:%d\n", + obj->ops.cond_exprs_parameterized); + kparser_dump_cond_tables(&obj->ops.cond_exprs); + +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_proto_tlvs_table structure */ +static void kparser_dump_proto_tlvs_table(const struct kparser_proto_tlvs_table *obj) +{ + int i; + + 
KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "num_ents:%u, entries:%p\n", obj->num_ents, obj->entries); + if (!obj->entries) + goto done; + + for (i = 0; i < obj->num_ents; i++) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "[%d]: val: %04x\n", i, obj->entries[i].type); + kparser_dump_tlv_parse_node(obj->entries[i].node); + } + +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_parse_tlv_node structure */ +static void kparser_dump_tlv_parse_node(const struct kparser_parse_tlv_node *obj) +{ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "name: %s\n", obj->name); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "unknown_overlay_ret:%d\n", obj->unknown_overlay_ret); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "proto_tlv_node.min_len: %lu\n", obj->proto_tlv_node.min_len); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "proto_tlv_node.max_len: %lu\n", obj->proto_tlv_node.max_len); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "proto_tlv_node.is_padding: %u\n", obj->proto_tlv_node.is_padding); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "proto_tlv_node.overlay_type_parameterized: %u\n", + obj->proto_tlv_node.ops.overlay_type_parameterized); + kparser_dump_param_next_proto(&obj->proto_tlv_node.ops.pfoverlay_type); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "proto_tlv_node.cond_exprs_parameterized: %u\n", + obj->proto_tlv_node.ops.cond_exprs_parameterized); + kparser_dump_cond_tables(&obj->proto_tlv_node.ops.cond_exprs); + + kparser_dump_proto_tlvs_table(obj->overlay_table); + kparser_dump_tlv_parse_node(obj->overlay_wildcard_node); + kparser_dump_metadatatable(obj->metadata_table); 
+done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_parse_tlvs_node structure */ +static void kparser_dump_tlvs_parse_node(const struct kparser_parse_tlvs_node *obj) +{ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + kparser_dump_proto_tlvs_table(obj->tlv_proto_table); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "unknown_tlv_type_ret:%d\n", obj->unknown_tlv_type_ret); + + kparser_dump_tlv_parse_node(obj->tlv_wildcard_node); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "config:max_loop: %u\n", obj->config.max_loop); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "config:max_non: %u\n", obj->config.max_non); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "config:max_plen: %u\n", obj->config.max_plen); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "config:max_c_pad: %u\n", obj->config.max_c_pad); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "config:disp_limit_exceed: %u\n", obj->config.disp_limit_exceed); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "config:exceed_loop_cnt_is_err: %u\n", + obj->config.exceed_loop_cnt_is_err); + +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_proto_tlvs_node structure */ +static void kparser_dump_tlvs_proto_node(const struct kparser_proto_tlvs_node *obj) +{ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + kparser_dump_proto_node(&obj->proto_node); + + kparser_dump_param_len(&obj->ops.pfstart_offset); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "ops.len_parameterized:%d\n", obj->ops.len_parameterized); + kparser_dump_param_len(&obj->ops.pflen); + kparser_dump_param_next_proto(&obj->ops.pftype); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "start_offset:%lu\n", obj->start_offset); + 
KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "pad1_val:%u\n", obj->pad1_val); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "padn_val:%u\n", obj->padn_val); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "eol_val:%u\n", obj->eol_val); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "pad1_enable:%u\n", obj->pad1_enable); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "padn_enable:%u\n", obj->padn_enable); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "eol_enable:%u\n", obj->eol_enable); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "fixed_start_offset:%u\n", obj->fixed_start_offset); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "min_len:%lu\n", obj->min_len); + +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_flag_field structure */ +static void kparser_dump_flag_field(const struct kparser_flag_field *obj) +{ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "flag:%04x, mask:%04x size:%lu\n", + obj->flag, obj->mask, obj->size); + +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_flag_fields structure */ +static void kparser_dump_flag_fields(const struct kparser_flag_fields *obj) +{ + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "num_idx:%lu, fields:%p\n", obj->num_idx, obj->fields); + + if (!obj->fields) + goto done; + + for (i = 0; i < obj->num_idx; i++) + kparser_dump_flag_field(&obj->fields[i]); + +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_parse_flag_field_node structure */ +static void kparser_dump_parse_flag_field_node(const struct kparser_parse_flag_field_node *obj) +{ + 
KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "name: %s\n", obj->name); + + kparser_dump_metadatatable(obj->metadata_table); + kparser_dump_cond_tables(&obj->ops.cond_exprs); +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_proto_flag_fields_table structure */ +static void kparser_dump_proto_flag_fields_table(const struct kparser_proto_flag_fields_table *obj) +{ + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "num_ents:%d, entries:%p\n", obj->num_ents, obj->entries); + + if (!obj->entries) + goto done; + + for (i = 0; i < obj->num_ents; i++) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "proto_flag_fields_table_entry_flag:%x\n", + obj->entries[i].flag); + kparser_dump_parse_flag_field_node(obj->entries[i].node); + } +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_parse_flag_fields_node structure */ +static void kparser_dump_flags_parse_node(const struct kparser_parse_flag_fields_node *obj) +{ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + kparser_dump_proto_flag_fields_table(obj->flag_fields_proto_table); +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_proto_flag_fields_node structure */ +static void kparser_dump_flags_proto_node(const struct kparser_proto_flag_fields_node *obj) +{ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + kparser_dump_proto_node(&obj->proto_node); + + 
KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "ops.get_flags_parameterized:%d\n", + obj->ops.get_flags_parameterized); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "ops.pfget_flags: src_off:%u mask:%04x size:%u\n", + obj->ops.pfget_flags.src_off, + obj->ops.pfget_flags.mask, + obj->ops.pfget_flags.size); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "ops.start_fields_offset_parameterized:%d\n", + obj->ops.start_fields_offset_parameterized); + kparser_dump_param_len(&obj->ops.pfstart_fields_offset); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "ops.flag_fields_len:%u ops.hdr_length:%u\n", + obj->ops.flag_fields_len, obj->ops.hdr_length); + + kparser_dump_flag_fields(obj->flag_fields); +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_metadata_table structure */ +static void kparser_dump_metadatatable(const struct kparser_metadata_table *obj) +{ + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "num_ents:%u, entries:%p\n", obj->num_ents, obj->entries); + if (!obj->entries) + goto done; + + for (i = 0; i < obj->num_ents; i++) + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "mde[%d]:%04x\n", i, obj->entries[i].val); + +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_proto_table structure */ +static void kparser_dump_proto_table(const struct kparser_proto_table *obj) +{ + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "num_ents:%u, entries:%p\n", obj->num_ents, obj->entries); + if (!obj->entries) + goto done; + + for (i = 0; i < obj->num_ents; i++) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "[%d]: val: %d\n", i, 
obj->entries[i].value); + kparser_dump_node(obj->entries[i].node); + } + +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump kparser_parse_node structure */ +static void kparser_dump_node(const struct kparser_parse_node *obj) +{ + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "name: %s: type: %d\n", obj->name, obj->node_type); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "unknown_ret:%d\n", obj->unknown_ret); + + switch (obj->node_type) { + case KPARSER_NODE_TYPE_PLAIN: + kparser_dump_proto_node(&obj->proto_node); + break; + + case KPARSER_NODE_TYPE_TLVS: + kparser_dump_tlvs_proto_node(&obj->tlvs_proto_node); + kparser_dump_tlvs_parse_node((const struct kparser_parse_tlvs_node *)obj); + break; + + case KPARSER_NODE_TYPE_FLAG_FIELDS: + kparser_dump_flags_proto_node(&obj->flag_fields_proto_node); + kparser_dump_flags_parse_node((const struct kparser_parse_flag_fields_node *)obj); + break; + + default: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "unknown node type:%d\n", obj->node_type); + break; + } + + kparser_dump_proto_table(obj->proto_table); + + kparser_dump_node(obj->wildcard_node); + + kparser_dump_metadatatable(obj->metadata_table); + +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} + +/* debug code: dump whole parse tree from kparser_parser structure */ +void kparser_dump_parser_tree(const struct kparser_parser *obj) +{ + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + if (!obj) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "obj NULL"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "name: %s\n", obj->name); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "config: flags:%02x\n", obj->config.flags); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "config: max_nodes:%u\n", 
obj->config.max_nodes); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "config: max_encaps:%u\n", obj->config.max_encaps); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "config: max_frames:%u\n", obj->config.max_frames); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "config: metameta_size:%lu\n", obj->config.metameta_size); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "config: frame_size:%lu\n", obj->config.frame_size); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "cntrs_len: %lu\n", obj->cntrs_len); + for (i = 0; i < (sizeof(obj->cntrs_conf.cntrs) / + sizeof(obj->cntrs_conf.cntrs[0])); i++) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "cntrs:%d: max_value:%u\n", i, + obj->cntrs_conf.cntrs[i].max_value); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "cntrs:%d: array_limit:%u\n", i, + obj->cntrs_conf.cntrs[i].array_limit); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "cntrs:%d: el_size:%lu\n", i, + obj->cntrs_conf.cntrs[i].el_size); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "cntrs:%d: reset_on_encap:%d\n", i, + obj->cntrs_conf.cntrs[i].reset_on_encap); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "cntrs:%d: overwrite_last:%d\n", i, + obj->cntrs_conf.cntrs[i].overwrite_last); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "cntrs:%d: error_on_exceeded:%d\n", i, + obj->cntrs_conf.cntrs[i].error_on_exceeded); + if (obj->cntrs) + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "cntr[%d]:%d", i, obj->cntrs->cntr[i]); + } + + kparser_dump_node(obj->root_node); + kparser_dump_node(obj->okay_node); + kparser_dump_node(obj->fail_node); + kparser_dump_node(obj->atencap_node); + +done: + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); +} diff --git a/net/kparser/kparser_cmds_ops.c b/net/kparser/kparser_cmds_ops.c new file mode 100644 index 000000000..b642a8d14 --- /dev/null +++ b/net/kparser/kparser_cmds_ops.c @@ -0,0 +1,3778 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (c) 2022, SiPanda Inc. 
+ * + * kparser_cmds_ops.c - kParser KMOD-CLI netlink request operations handlers + * + * Author: Pratyush Kumar Khan + */ + +#include +#include +#include +#include + +#include "kparser.h" + +/* global netlink cmd handler mutex, all handlers must run within protection of this mutex + * NOTE: never use this mutex on data path operations since they can run under interrupt contexts + */ +static DEFINE_MUTEX(kparser_config_lock); + +/* global counter config, shared among all the parsers */ +static struct kparser_cntrs_conf cntrs_conf = {}; +static __u8 cntrs_conf_idx; + +void *kparser_fast_lookup_array[KPARSER_PARSER_FAST_LOOKUP_RSVD_ID_STOP - + KPARSER_PARSER_FAST_LOOKUP_RSVD_ID_START + 1]; + +/* common pre-process code for create handlers */ +static inline bool +kparser_cmd_create_pre_process(const char *op, + const struct kparser_conf_cmd *conf, + const struct kparser_hkey *argkey, struct kparser_hkey *newkey, + void **kobj, size_t kobjsize, struct kparser_cmd_rsp_hdr *rsp, + size_t glueoffset, + void *extack, int *err) +{ + struct kparser_glue *glue; + + if (kparser_conf_key_manager(conf->namespace_id, argkey, newkey, rsp, + op, extack, err) != 0) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "error"); + return false; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OP:%s Key{%s:%d}:{%s:%d}\n", + op, argkey->name, argkey->id, + newkey->name, newkey->id); + + if (kparser_namespace_lookup(conf->namespace_id, newkey)) { + rsp->op_ret_code = EEXIST; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:Duplicate object HKey:{%s:%u}", + op, newkey->name, newkey->id); + return false; + } + + *kobj = kzalloc(kobjsize, GFP_KERNEL); + if (!(*kobj)) { + rsp->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:Object allocation failed for size:%lu", + op, kobjsize); + return false; + } + + glue = (*kobj) + glueoffset; + glue->key = *newkey; + + rsp->op_ret_code = kparser_namespace_insert(conf->namespace_id, + &glue->ht_node_id, &glue->ht_node_name); + if 
(rsp->op_ret_code) { + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:Htbl insert err:%d", + op, rsp->op_ret_code); + return false; + } + + glue->config = *conf; + kref_init(&glue->refcount); + + rsp->key = *newkey; + rsp->object = *conf; + rsp->object.conf_keys_bv = conf->conf_keys_bv; + + return true; +} + +/* The following functions are the kParser object handlers for netlink msgs. + * create handler for object conditionals + * NOTE: All handlers starting from here must hold the mutex kparser_config_lock + * before any work can be done and must release that mutex before return. + */ +int kparser_create_cond_exprs(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + struct kparser_glue_condexpr_expr *kobj = NULL; + const struct kparser_conf_condexpr *arg; + struct kparser_hkey key; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->cond_conf; + + if (!kparser_cmd_create_pre_process(op, conf, &arg->key, &key, + (void **)&kobj, sizeof(*kobj), *rsp, + offsetof(struct + kparser_glue_condexpr_expr, + glue), extack, err)) + goto done; + + kobj->expr = arg->config; + + (*rsp)->key = key; + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.cond_conf = kobj->glue.config.cond_conf; +done: + mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code != 0) + kparser_free(kobj); + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_CONDEXPRS); +} + +/* read handler for object conditionals */ +int kparser_read_cond_exprs(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + struct kparser_glue_condexpr_expr *kobj; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u 
Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + kobj = kparser_namespace_lookup(KPARSER_NS_CONDEXPRS, key); + if (!kobj) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:Object key not found:{%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = kobj->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = kobj->glue.config.conf_keys_bv; + (*rsp)->object.cond_conf = kobj->glue.config.cond_conf; +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_CONDEXPRS); +} + +/* create handler for object conditionals table entry */ +static bool kparser_create_cond_table_ent(const struct kparser_conf_table *arg, + struct kparser_glue_condexpr_table **proto_table, + struct kparser_cmd_rsp_hdr *rsp, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_condexpr_expr *kcondent; + void *realloced_mem; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + *proto_table = kparser_namespace_lookup(KPARSER_NS_CONDEXPRS_TABLE, &arg->key); + if (!(*proto_table)) { + rsp->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:Object key not found:{%s:%u}", + op, arg->key.name, arg->key.id); + return false; + } + + kcondent = kparser_namespace_lookup(KPARSER_NS_CONDEXPRS, &arg->elem_key); + if (!kcondent) { + rsp->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:Object key not found:{%s:%u}", + op, arg->elem_key.name, arg->elem_key.id); + return false; + } + + realloced_mem = krealloc((*proto_table)->table.entries, + ((*proto_table)->table.num_ents + 1) * + sizeof(struct kparser_condexpr_expr *), + GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + rsp->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:krealloc() err, ents:%d, size:%lu", + op, (*proto_table)->table.num_ents + 1, + sizeof(struct kparser_condexpr_expr *)); + return false; + } + rcu_assign_pointer((*proto_table)->table.entries, realloced_mem); + + (*proto_table)->table.entries[(*proto_table)->table.num_ents] = &kcondent->expr; + (*proto_table)->table.num_ents++; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return true; +} + +/* create handler for object conditionals table */ +int kparser_create_cond_table(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + struct kparser_glue_condexpr_table *proto_table = NULL; + const struct kparser_conf_table *arg; + struct kparser_hkey key; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->table_conf; + + /* create a table entry */ + if (arg->add_entry) { + if (kparser_create_cond_table_ent(arg, &proto_table, *rsp, op, + extack, err) == false) + goto done; + goto skip_table_create; + } + + if (!kparser_cmd_create_pre_process(op, conf, &arg->key, &key, + (void **)&proto_table, sizeof(*proto_table), *rsp, + offsetof(struct + kparser_glue_condexpr_table, + glue), extack, err)) + goto done; + + proto_table->glue.config.namespace_id = conf->namespace_id; + proto_table->glue.config.conf_keys_bv = conf->conf_keys_bv; + proto_table->glue.config.table_conf = *arg; + proto_table->glue.config.table_conf.key = key; + kref_init(&proto_table->glue.refcount); + proto_table->table.default_fail = arg->optional_value1; + proto_table->table.type = arg->optional_value2; + +skip_table_create: + (*rsp)->key = key; + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.table_conf = *arg; +done: + mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code != 0) { + if (proto_table && !arg->add_entry) + kparser_free(proto_table); + } + + synchronize_rcu(); + 
+ KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_CONDEXPRS_TABLE); +} + +/* read handler for object conditionals table */ +int kparser_read_cond_table(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_condexpr_table *proto_table; + const struct kparser_glue_condexpr_expr *kcondent; + struct kparser_conf_cmd *objects = NULL; + void *realloced_mem; + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + proto_table = kparser_namespace_lookup(KPARSER_NS_CONDEXPRS_TABLE, key); + if (!proto_table) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:Object key not found, key:{%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = proto_table->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", + (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = proto_table->glue.config.conf_keys_bv; + (*rsp)->object.table_conf = proto_table->glue.config.table_conf; + (*rsp)->object.table_conf.optional_value1 = proto_table->table.default_fail; + (*rsp)->object.table_conf.optional_value2 = proto_table->table.type; + + for (i = 0; i < proto_table->table.num_ents; i++) { + (*rsp)->objects_len++; + *rsp_len = *rsp_len + sizeof(struct kparser_conf_cmd); + realloced_mem = krealloc(*rsp, *rsp_len, GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "krealloc failed for rsp, len:%lu\n", + *rsp_len); + *rsp_len = 0; + mutex_unlock(&kparser_config_lock); + return KPARSER_ATTR_UNSPEC; + } + *rsp = realloced_mem; + objects = (struct kparser_conf_cmd *)(*rsp)->objects; + objects[i].namespace_id = proto_table->glue.config.namespace_id; + 
objects[i].table_conf = proto_table->glue.config.table_conf; + if (!proto_table->table.entries) + continue; + kcondent = container_of(proto_table->table.entries[i], + struct kparser_glue_condexpr_expr, expr); + objects[i].table_conf.elem_key = kcondent->glue.key; + } +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_CONDEXPRS_TABLE); +} + +/* create handler for object conditionals table's list entry */ +static bool kparser_create_cond_tables_ent(const struct kparser_conf_table *arg, + struct kparser_glue_condexpr_tables **proto_table, + struct kparser_cmd_rsp_hdr *rsp, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_condexpr_table *kcondent; + void *realloced_mem; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + *proto_table = kparser_namespace_lookup(KPARSER_NS_CONDEXPRS_TABLES, &arg->key); + if (!(*proto_table)) { + rsp->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:Object key not found, key:{%s:%u}", + op, arg->key.name, arg->key.id); + return false; + } + + kcondent = kparser_namespace_lookup(KPARSER_NS_CONDEXPRS_TABLE, &arg->elem_key); + if (!kcondent) { + rsp->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:Object key not found, key:{%s:%u}", + op, arg->elem_key.name, arg->elem_key.id); + return false; + } + + realloced_mem = krealloc((*proto_table)->table.entries, + ((*proto_table)->table.num_ents + 1) * + sizeof(struct kparser_condexpr_table *), GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + rsp->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: krealloc() err, ents:%d, size:%lu", + op, (*proto_table)->table.num_ents + 1, + sizeof(struct kparser_condexpr_table *)); + return false; + } + rcu_assign_pointer((*proto_table)->table.entries, realloced_mem); + + (*proto_table)->table.entries[(*proto_table)->table.num_ents] = &kcondent->table; + (*proto_table)->table.num_ents++; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return true; +} + +/* create handler for object conditionals table's list */ +int kparser_create_cond_tables(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + struct kparser_glue_condexpr_tables *proto_table = NULL; + const struct kparser_conf_table *arg; + struct kparser_hkey key; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->table_conf; + + /* create a table entry */ + if (arg->add_entry) { + if (kparser_create_cond_tables_ent(arg, &proto_table, *rsp, op, + extack, err) == false) + goto done; + goto skip_table_create; + } + + if (!kparser_cmd_create_pre_process(op, conf, &arg->key, &key, + (void **)&proto_table, sizeof(*proto_table), *rsp, + offsetof(struct + kparser_glue_condexpr_tables, + glue), extack, err)) + goto done; + + proto_table->glue.config.namespace_id = conf->namespace_id; + proto_table->glue.config.conf_keys_bv = conf->conf_keys_bv; + proto_table->glue.config.table_conf = *arg; + proto_table->glue.config.table_conf.key = key; + kref_init(&proto_table->glue.refcount); + +skip_table_create: + (*rsp)->key = key; + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.table_conf = *arg; +done: + mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code != 0) { + if (proto_table && !arg->add_entry) + kparser_free(proto_table); + } + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_CONDEXPRS_TABLES); +} + +/* read handler for object conditionals table's list */ +int kparser_read_cond_tables(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) + +{ + 
const struct kparser_glue_condexpr_tables *proto_table; + const struct kparser_glue_condexpr_table *kcondent; + struct kparser_conf_cmd *objects = NULL; + void *realloced_mem; + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + proto_table = kparser_namespace_lookup(KPARSER_NS_CONDEXPRS_TABLES, key); + if (!proto_table) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object key not found, key:{%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = proto_table->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = proto_table->glue.config.conf_keys_bv; + (*rsp)->object.table_conf = proto_table->glue.config.table_conf; + + for (i = 0; i < proto_table->table.num_ents; i++) { + (*rsp)->objects_len++; + *rsp_len = *rsp_len + sizeof(struct kparser_conf_cmd); + realloced_mem = krealloc(*rsp, *rsp_len, GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + ":krealloc failed for rsp, len:%lu\n", + *rsp_len); + *rsp_len = 0; + mutex_unlock(&kparser_config_lock); + return KPARSER_ATTR_UNSPEC; + } + *rsp = realloced_mem; + objects = (struct kparser_conf_cmd *)(*rsp)->objects; + objects[i].namespace_id = proto_table->glue.config.namespace_id; + objects[i].table_conf = proto_table->glue.config.table_conf; + if (!proto_table->table.entries) + continue; + kcondent = container_of(proto_table->table.entries[i], + struct kparser_glue_condexpr_table, table); + objects[i].table_conf.elem_key = kcondent->glue.key; + } + +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_CONDEXPRS_TABLES); +} + +/* create handler for object counter */ +int 
kparser_create_counter(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + struct kparser_glue_counter *kcntr = NULL; + const struct kparser_conf_cntr *arg; + struct kparser_hkey key; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->cntr_conf; + + if (!arg->conf.valid_entry) { + (*rsp)->op_ret_code = EINVAL; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: counter entry is not valid", op); + goto done; + } + + if (cntrs_conf_idx >= KPARSER_CNTR_NUM_CNTRS) { + (*rsp)->op_ret_code = EINVAL; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: counter index %d can not be >= %d", + op, cntrs_conf_idx, + KPARSER_CNTR_NUM_CNTRS); + goto done; + } + + if (kparser_conf_key_manager(conf->namespace_id, &arg->key, &key, *rsp, + op, extack, err) != 0) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "error"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + if (kparser_namespace_lookup(conf->namespace_id, &key)) { + (*rsp)->op_ret_code = EEXIST; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Duplicate object key {%s:%u}", + op, arg->key.name, arg->key.id); + goto done; + } + + kcntr = kzalloc(sizeof(*kcntr), GFP_KERNEL); + if (!kcntr) { + (*rsp)->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kzalloc() failed, size: %lu", + op, sizeof(*kcntr)); + goto done; + } + + kcntr->glue.key = key; + + rc = kparser_namespace_insert(conf->namespace_id, + &kcntr->glue.ht_node_id, &kcntr->glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kparser_namespace_insert() err:%d", + op, rc); + goto done; + } + + kcntr->glue.config.namespace_id = conf->namespace_id; + kcntr->glue.config.conf_keys_bv = conf->conf_keys_bv; + kcntr->glue.config.cntr_conf = *arg; + kcntr->glue.config.cntr_conf.key = key; + 
kref_init(&kcntr->glue.refcount); + + kcntr->counter_cnf = arg->conf; + kcntr->counter_cnf.index = cntrs_conf_idx; + + cntrs_conf.cntrs[cntrs_conf_idx] = kcntr->counter_cnf; + + cntrs_conf_idx++; + + (*rsp)->key = key; + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.cntr_conf = kcntr->glue.config.cntr_conf; +done: + mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code != 0) + kparser_free(kcntr); + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_COUNTER); +} + +/* read handler for object counter */ +int kparser_read_counter(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + struct kparser_glue_counter *kcntr; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + kcntr = kparser_namespace_lookup(KPARSER_NS_COUNTER, key); + if (!kcntr) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:Object key not found, key:{%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = kcntr->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = kcntr->glue.config.conf_keys_bv; + (*rsp)->object.cntr_conf = kcntr->glue.config.cntr_conf; +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_COUNTER); +} + +/* create handler for object counter table */ +int kparser_create_counter_table(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + struct kparser_glue_counter_table *table = NULL; + const struct kparser_conf_table 
*arg; + struct kparser_glue_counter *kcntr; + struct kparser_hkey key; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->table_conf; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + /* create a table entry */ + if (arg->add_entry) { + table = kparser_namespace_lookup(conf->namespace_id, &arg->key); + if (!table) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:Object key not found, key:{%s:%u}", + op, arg->key.name, arg->key.id); + goto done; + } + if (table->elems_cnt >= KPARSER_CNTR_NUM_CNTRS) { + (*rsp)->op_ret_code = EINVAL; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:table full, elem cnt:%u", + op, table->elems_cnt); + goto done; + } + kcntr = kparser_namespace_lookup(KPARSER_NS_COUNTER, + &arg->elem_key); + if (!kcntr) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:Object key not found, key:{%s:%u}", + op, arg->elem_key.name, + arg->elem_key.id); + goto done; + } + table->k_cntrs[table->elems_cnt++] = *kcntr; + goto skip_table_create; + } + + if (kparser_conf_key_manager(conf->namespace_id, &arg->key, &key, *rsp, + op, extack, err) != 0) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "error"); + goto done; + } + + if (kparser_namespace_lookup(conf->namespace_id, &key)) { + (*rsp)->op_ret_code = EEXIST; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:Object key duplicate, key:{%s:%u}", + op, key.name, key.id); + goto done; + } + + /* create counter table */ + table = kzalloc(sizeof(*table), GFP_KERNEL); + if (!table) { + (*rsp)->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kzalloc() failed, size: %lu", + op, sizeof(*table)); + goto done; + } + + table->glue.key = key; + + rc = kparser_namespace_insert(conf->namespace_id, + &table->glue.ht_node_id, &table->glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: 
kparser_namespace_insert() err, rc:%d", + op, rc); + goto done; + } + + table->glue.config.namespace_id = conf->namespace_id; + table->glue.config.conf_keys_bv = conf->conf_keys_bv; + table->glue.config.table_conf = *arg; + table->glue.config.table_conf.key = key; + kref_init(&table->glue.refcount); + +skip_table_create: + (*rsp)->key = key; + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.table_conf = table->glue.config.table_conf; +done: + mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code != 0) + if (table && !arg->add_entry) + kparser_free(table); + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_COUNTER_TABLE); +} + +/* read handler for object counter table */ +int kparser_read_counter_table(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_counter_table *table; + struct kparser_conf_cmd *objects = NULL; + void *realloced_mem; + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + table = kparser_namespace_lookup(KPARSER_NS_COUNTER_TABLE, key); + if (!table) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object key not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = table->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = table->glue.config.conf_keys_bv; + (*rsp)->object.table_conf = table->glue.config.table_conf; + + for (i = 0; i < KPARSER_CNTR_NUM_CNTRS; i++) { + (*rsp)->objects_len++; + *rsp_len = *rsp_len + sizeof(struct kparser_conf_cmd); + realloced_mem = krealloc(*rsp, *rsp_len, GFP_KERNEL | ___GFP_ZERO); + 
if (!realloced_mem) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "krealloc failed for rsp, len:%lu\n", + *rsp_len); + *rsp_len = 0; + mutex_unlock(&kparser_config_lock); + return KPARSER_ATTR_UNSPEC; + } + *rsp = realloced_mem; + objects = (struct kparser_conf_cmd *)(*rsp)->objects; + objects[i].namespace_id = table->k_cntrs[i].glue.config.namespace_id; + objects[i].cntr_conf = table->k_cntrs[i].glue.config.cntr_conf; + objects[i].cntr_conf.conf = cntrs_conf.cntrs[i]; + } + +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_COUNTER_TABLE); +} + +/* create handler for object metadata */ +int kparser_create_metadata(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + struct kparser_glue_metadata_extract *kmde = NULL; + int rc, cntridx = 0, cntr_arr_idx = 0; + const struct kparser_conf_metadata *arg; + struct kparser_glue_counter *kcntr; + struct kparser_hkey key; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->md_conf; + + if (kparser_conf_key_manager(conf->namespace_id, &arg->key, &key, *rsp, + op, extack, err) != 0) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "error"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + if (kparser_namespace_lookup(conf->namespace_id, &key)) { + (*rsp)->op_ret_code = EEXIST; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object key duplicate, key: {%s:%u}", + op, arg->key.name, arg->key.id); + goto done; + } + + kcntr = kparser_namespace_lookup(KPARSER_NS_COUNTER, &arg->counterkey); + if (kcntr) + cntridx = kcntr->counter_cnf.index + 1; + + if (arg->type == KPARSER_METADATA_COUNTER) { + /* In this case, one of the counters must be provided. 
If not,
+	 * that is an error
+	 */
+	kcntr = kparser_namespace_lookup(KPARSER_NS_COUNTER,
+					 &arg->counter_data_key);
+	if (kcntr)
+		cntr_arr_idx = kcntr->counter_cnf.index + 1;
+
+	if (cntridx == 0 && cntr_arr_idx == 0) {
+		(*rsp)->op_ret_code = ENOENT;
+		NL_SET_ERR_MSG_FMT_MOD(extack,
+				       "%s: neither counteridx nor counterdata object key was found",
+				       op);
+		goto done;
+	} else {
+		if (cntr_arr_idx == 0)
+			cntr_arr_idx = cntridx;
+		else if (cntridx == 0)
+			cntridx = cntr_arr_idx;
+	}
+	}
+
+	kmde = kzalloc(sizeof(*kmde), GFP_KERNEL);
+	if (!kmde) {
+		(*rsp)->op_ret_code = ENOMEM;
+		NL_SET_ERR_MSG_FMT_MOD(extack, "%s: kzalloc() failed, size:%lu",
+				       op, sizeof(*kmde));
+		goto done;
+	}
+
+	kmde->glue.key = key;
+
+	rc = kparser_namespace_insert(conf->namespace_id,
+				      &kmde->glue.ht_node_id, &kmde->glue.ht_node_name);
+	if (rc) {
+		(*rsp)->op_ret_code = rc;
+		NL_SET_ERR_MSG_FMT_MOD(extack,
+				       "%s: kparser_namespace_insert() err, rc:%d",
+				       op, rc);
+		goto done;
+	}
+
+	kmde->glue.config.namespace_id = conf->namespace_id;
+	kmde->glue.config.conf_keys_bv = conf->conf_keys_bv;
+	kmde->glue.config.md_conf = *arg;
+	kmde->glue.config.md_conf.key = key;
+	kref_init(&kmde->glue.refcount);
+	INIT_LIST_HEAD(&kmde->glue.owner_list);
+	INIT_LIST_HEAD(&kmde->glue.owned_list);
+
+	if (!kparser_metadata_convert(arg, &kmde->mde, cntridx, cntr_arr_idx)) {
+		(*rsp)->op_ret_code = EINVAL;
+		NL_SET_ERR_MSG_FMT_MOD(extack,
+				       "%s: kparser_metadata_convert() failed",
+				       op);
+		goto done;
+	}
+
+	(*rsp)->key = key;
+	(*rsp)->object.conf_keys_bv = conf->conf_keys_bv;
+	(*rsp)->object.md_conf = kmde->glue.config.md_conf;
+done:
+	mutex_unlock(&kparser_config_lock);
+
+	if ((*rsp)->op_ret_code != 0)
+		kparser_free(kmde);
+
+	synchronize_rcu();
+
+	KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: ");
+	return KPARSER_ATTR_RSP(KPARSER_NS_METADATA);
+}
+
+/* read handler for object metadata */
+int kparser_read_metadata(const struct kparser_hkey *key,
+			  struct
kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_metadata_extract *kmde; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + kmde = kparser_namespace_lookup(KPARSER_NS_METADATA, key); + if (!kmde) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, "%s: Object key not found," + " key:{%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = kmde->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = kmde->glue.config.conf_keys_bv; + (*rsp)->object.md_conf = kmde->glue.config.md_conf; +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_METADATA); +} + +/* delete handler for object metadata */ +int kparser_del_metadata(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + struct kparser_glue_metadata_extract *kmde; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + kmde = kparser_namespace_lookup(KPARSER_NS_METADATA, key); + if (!kmde) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, "%s: Object key not found," + " key:{%s:%u}", + op, key->name, key->id); + goto done; + } + + if (kref_read(&kmde->glue.refcount) != 0) { + (*rsp)->op_ret_code = EBUSY; + NL_SET_ERR_MSG_FMT_MOD(extack, "%s: Metadata object is" + " associated with a metalist, delete" + " that metalist instead", + op); + goto done; + } + + rc = 
kparser_namespace_remove(KPARSER_NS_METADATA, + &kmde->glue.ht_node_id, &kmde->glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, "%s: namespace remove error, rc: %d", + op, rc); + goto done; + } + + (*rsp)->key = kmde->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = kmde->glue.config.conf_keys_bv; + (*rsp)->object.md_conf = kmde->glue.config.md_conf; + + kparser_free(kmde); +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_METADATA); +} + +/* free handler for object metadata */ +void kparser_free_metadata(void *ptr, void *arg) +{ + /* TODO: */ +} + +/* create handler for object metadata list */ +int kparser_create_metalist(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + struct kparser_glue_metadata_extract *kmde = NULL; + struct kparser_glue_metadata_table *kmdl = NULL; + const struct kparser_conf_metadata_table *arg; + struct kparser_conf_cmd *objects = NULL; + struct kparser_hkey key; + void *realloced_mem; + int rc, i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->mdl_conf; + + if (kparser_conf_key_manager(conf->namespace_id, &arg->key, &key, *rsp, + op, extack, err) != 0) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "error"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + if (kparser_namespace_lookup(conf->namespace_id, &key)) { + (*rsp)->op_ret_code = EEXIST; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Duplicate object key, {%s:%u}", + op, arg->key.name, arg->key.id); + goto done; + } + + kmdl = kzalloc(sizeof(*kmdl), GFP_KERNEL); + if (!kmdl) { + (*rsp)->op_ret_code = 
ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kzalloc() failed, size: %lu", + op, sizeof(*kmdl)); + goto done; + } + + kmdl->glue.key = key; + kmdl->glue.config.namespace_id = conf->namespace_id; + kmdl->glue.config.conf_keys_bv = conf->conf_keys_bv; + kmdl->glue.config.mdl_conf = *arg; + kmdl->glue.config.mdl_conf.key = key; + kmdl->glue.config.mdl_conf.metadata_keys_count = 0; + kref_init(&kmdl->glue.refcount); + INIT_LIST_HEAD(&kmdl->glue.owner_list); + INIT_LIST_HEAD(&kmdl->glue.owned_list); + + conf_len -= sizeof(*conf); + + for (i = 0; i < arg->metadata_keys_count; i++) { + if (conf_len < sizeof(struct kparser_hkey)) { + (*rsp)->op_ret_code = EINVAL; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: conf len/buffer incomplete", + op); + goto done; + } + + conf_len -= sizeof(struct kparser_hkey); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", + arg->metadata_keys[i].id, arg->metadata_keys[i].name); + + kmde = kparser_namespace_lookup(KPARSER_NS_METADATA, &arg->metadata_keys[i]); + if (!kmde) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, arg->metadata_keys[i].name, + arg->metadata_keys[i].id); + goto done; + } + kmdl->metadata_table.num_ents++; + realloced_mem = krealloc(kmdl->metadata_table.entries, + kmdl->metadata_table.num_ents * sizeof(*kmde), + GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + (*rsp)->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: krealloc() err, ents:%d, size:%lu", + op, + kmdl->metadata_table.num_ents, + sizeof(*kmde)); + goto done; + } + rcu_assign_pointer(kmdl->metadata_table.entries, realloced_mem); + + kmdl->metadata_table.entries[i] = kmde->mde; + kref_get(&kmde->glue.refcount); + + (*rsp)->objects_len++; + *rsp_len = *rsp_len + sizeof(struct kparser_conf_cmd); + realloced_mem = krealloc(*rsp, *rsp_len, GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "krealloc failed for rsp, 
len:%lu\n", + *rsp_len); + *rsp_len = 0; + mutex_unlock(&kparser_config_lock); + if (kmdl) { + kparser_free(kmdl->metadata_table.entries); + kparser_free(kmdl); + } + return KPARSER_ATTR_UNSPEC; + } + *rsp = realloced_mem; + + objects = (struct kparser_conf_cmd *)(*rsp)->objects; + objects[i].namespace_id = kmde->glue.config.namespace_id; + objects[i].conf_keys_bv = kmde->glue.config.conf_keys_bv; + objects[i].md_conf = kmde->glue.config.md_conf; + + kmdl->md_configs_len++; + realloced_mem = krealloc(kmdl->md_configs, + kmdl->md_configs_len * + sizeof(struct kparser_conf_cmd), + GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + (*rsp)->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: krealloc() err, ents:%lu, size:%lu", + op, + kmdl->md_configs_len, + sizeof(struct kparser_conf_cmd)); + goto done; + } + kmdl->md_configs = realloced_mem; + kmdl->md_configs[i].namespace_id = kmde->glue.config.namespace_id; + kmdl->md_configs[i].conf_keys_bv = kmde->glue.config.conf_keys_bv; + kmdl->md_configs[i].md_conf = kmde->glue.config.md_conf; + } + + rc = kparser_namespace_insert(conf->namespace_id, + &kmdl->glue.ht_node_id, &kmdl->glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kparser_namespace_insert() err, rc:%d", + op, rc); + goto done; + } + + (*rsp)->key = key; + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.mdl_conf = kmdl->glue.config.mdl_conf; + (*rsp)->object.mdl_conf.metadata_keys_count = 0; +done: + mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code != 0 && kmdl) { + kparser_free(kmdl->metadata_table.entries); + kparser_free(kmdl->md_configs); + kparser_free(kmdl); + } + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_METALIST); +} + +/* read handler for object metadata list */ +int kparser_read_metalist(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 
recursive_read, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_metadata_table *kmdl; + struct kparser_conf_cmd *objects = NULL; + void *realloced_mem; + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + kmdl = kparser_namespace_lookup(KPARSER_NS_METALIST, key); + if (!kmdl) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object key not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = kmdl->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = kmdl->glue.config.conf_keys_bv; + (*rsp)->object.mdl_conf = kmdl->glue.config.mdl_conf; + + for (i = 0; i < kmdl->md_configs_len; i++) { + (*rsp)->objects_len++; + *rsp_len = *rsp_len + sizeof(struct kparser_conf_cmd); + realloced_mem = krealloc(*rsp, *rsp_len, GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "%s:krealloc failed for rsp, len:%lu\n", + op, *rsp_len); + *rsp_len = 0; + mutex_unlock(&kparser_config_lock); + return KPARSER_ATTR_UNSPEC; + } + *rsp = realloced_mem; + objects = (struct kparser_conf_cmd *)(*rsp)->objects; + objects[i].namespace_id = kmdl->md_configs[i].namespace_id; + objects[i].conf_keys_bv = kmdl->md_configs[i].conf_keys_bv; + objects[i].md_conf = kmdl->md_configs[i].md_conf; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", + objects[i].md_conf.key.id, objects[i].md_conf.key.name); + } +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_METALIST); +} + +/* delete handler for object metadata list */ +int kparser_del_metalist(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t 
*rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + struct kparser_obj_link_ctx *tmp_list_ref = NULL, *curr_ref = NULL; + struct kparser_obj_link_ctx *node_tmp_list_ref = NULL; + struct kparser_obj_link_ctx *node_curr_ref = NULL; + struct kparser_glue_glue_parse_node *kparsenode; + struct kparser_glue_metadata_extract *kmde; + struct kparser_glue_metadata_table *kmdl; + struct kparser_conf_cmd *objects = NULL; + void *realloced_mem; + int i, rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + kmdl = kparser_namespace_lookup(KPARSER_NS_METALIST, key); + if (!kmdl) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object key not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + /* verify if there is any associated immutable parser */ + list_for_each_entry_safe(curr_ref, tmp_list_ref, + &kmdl->glue.owned_list, owned_obj.list_node) { + if (curr_ref->owner_obj.nsid != KPARSER_NS_NODE_PARSE) + continue; + if (kref_read(curr_ref->owner_obj.refcount) == 0) + continue; + kparsenode = (struct kparser_glue_glue_parse_node *)curr_ref->owner_obj.obj; + list_for_each_entry_safe(node_curr_ref, node_tmp_list_ref, + &kparsenode->glue.glue.owned_list, owned_obj.list_node) { + if (node_curr_ref->owner_obj.nsid != KPARSER_NS_PARSER) + continue; + if (kref_read(node_curr_ref->owner_obj.refcount) != 0) { + (*rsp)->op_ret_code = EBUSY; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: attached parser `%s` is immutable", + op, + ((struct kparser_glue_parser *) + node_curr_ref->owner_obj.obj)->glue.key.name); + goto done; + } + } + } + + if (kparser_link_detach(kmdl, &kmdl->glue.owner_list, + &kmdl->glue.owned_list, *rsp, + extack, err) != 0) + goto done; + + (*rsp)->key = kmdl->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, 
(*rsp)->key.name); + (*rsp)->object.conf_keys_bv = kmdl->glue.config.conf_keys_bv; + (*rsp)->object.mdl_conf = kmdl->glue.config.mdl_conf; + + for (i = 0; i < kmdl->md_configs_len; i++) { + (*rsp)->objects_len++; + *rsp_len = *rsp_len + sizeof(struct kparser_conf_cmd); + realloced_mem = krealloc(*rsp, *rsp_len, GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "krealloc failed for rsp, len:%lu\n", + *rsp_len); + *rsp_len = 0; + mutex_unlock(&kparser_config_lock); + return KPARSER_ATTR_UNSPEC; + } + *rsp = realloced_mem; + objects = (struct kparser_conf_cmd *)(*rsp)->objects; + objects[i].namespace_id = kmdl->md_configs[i].namespace_id; + objects[i].conf_keys_bv = kmdl->md_configs[i].conf_keys_bv; + objects[i].md_conf = kmdl->md_configs[i].md_conf; + + kmde = kparser_namespace_lookup(KPARSER_NS_METADATA, &objects[i].md_conf.key); + if (!kmde) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, objects[i].md_conf.key.name, + objects[i].md_conf.key.id); + goto done; + } + + rc = kparser_namespace_remove(KPARSER_NS_METADATA, + &kmde->glue.ht_node_id, &kmde->glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: namespace remove error, rc:%d", + op, rc); + goto done; + } + + kparser_free(kmde); + } + + rc = kparser_namespace_remove(KPARSER_NS_METALIST, + &kmdl->glue.ht_node_id, &kmdl->glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: namespace remove error, rc:%d", + op, rc); + goto done; + } + + kparser_free(kmdl->metadata_table.entries); + + kmdl->metadata_table.num_ents = 0; + + kparser_free(kmdl->md_configs); + + kparser_free(kmdl); + +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_METALIST); +} + +/* free handler for object metadata list */ +void 
kparser_free_metalist(void *ptr, void *arg) +{ + /* TODO: */ +} + +/* handler to convert and map netlink node context to kParser KMOD's node context */ +static inline bool kparser_conf_node_convert(const struct kparser_conf_node *conf, + void *node, size_t node_len) +{ + struct kparser_glue_proto_flag_fields_table *kflag_fields_proto_table; + struct kparser_parse_flag_fields_node *flag_fields_parse_node; + struct kparser_glue_parse_tlv_node *kparsetlvwildcardnode; + struct kparser_glue_glue_parse_node *kparsewildcardnode; + struct kparser_glue_proto_tlvs_table *kprototlvstbl; + struct kparser_glue_condexpr_tables *kcond_tables; + struct kparser_parse_tlvs_node *tlvs_parse_node; + struct kparser_glue_flag_fields *kflag_fields; + struct kparser_glue_protocol_table *kprototbl; + struct kparser_parse_node *plain_parse_node; + struct kparser_glue_metadata_table *kmdl; + + if (!conf || !node || node_len < sizeof(*plain_parse_node)) + return false; + + plain_parse_node = node; + plain_parse_node->node_type = conf->type; + plain_parse_node->unknown_ret = conf->plain_parse_node.unknown_ret; + plain_parse_node->proto_node.encap = conf->plain_parse_node.proto_node.encap; + plain_parse_node->proto_node.overlay = conf->plain_parse_node.proto_node.overlay; + plain_parse_node->proto_node.min_len = conf->plain_parse_node.proto_node.min_len; + plain_parse_node->proto_node.ops.len_parameterized = + conf->plain_parse_node.proto_node.ops.len_parameterized; + plain_parse_node->proto_node.ops.pflen = conf->plain_parse_node.proto_node.ops.pflen; + plain_parse_node->proto_node.ops.pfnext_proto = + conf->plain_parse_node.proto_node.ops.pfnext_proto; + + kcond_tables = + kparser_namespace_lookup(KPARSER_NS_CONDEXPRS_TABLES, + &conf->plain_parse_node.proto_node.ops.cond_exprs_table); + if (kcond_tables) { + plain_parse_node->proto_node.ops.cond_exprs = kcond_tables->table; + plain_parse_node->proto_node.ops.cond_exprs_parameterized = true; + } + + strcpy(plain_parse_node->name, 
conf->key.name); + + kprototbl = kparser_namespace_lookup(KPARSER_NS_PROTO_TABLE, + &conf->plain_parse_node.proto_table_key); + if (kprototbl) + rcu_assign_pointer(plain_parse_node->proto_table, &kprototbl->proto_table); + + kparsewildcardnode = + kparser_namespace_lookup(KPARSER_NS_NODE_PARSE, + &conf->plain_parse_node.wildcard_parse_node_key); + if (kparsewildcardnode) + rcu_assign_pointer(plain_parse_node->wildcard_node, + &kparsewildcardnode->parse_node); + + kmdl = kparser_namespace_lookup(KPARSER_NS_METALIST, + &conf->plain_parse_node.metadata_table_key); + if (kmdl) + rcu_assign_pointer(plain_parse_node->metadata_table, &kmdl->metadata_table); + + switch (conf->type) { + case KPARSER_NODE_TYPE_PLAIN: + break; + + case KPARSER_NODE_TYPE_TLVS: + if (node_len < sizeof(*tlvs_parse_node)) + return false; + + tlvs_parse_node = node; + + tlvs_parse_node->parse_node.tlvs_proto_node.ops = + conf->tlvs_parse_node.proto_node.ops; + + tlvs_parse_node->parse_node.tlvs_proto_node.start_offset = + conf->tlvs_parse_node.proto_node.start_offset; + tlvs_parse_node->parse_node.tlvs_proto_node.pad1_val = + conf->tlvs_parse_node.proto_node.pad1_val; + tlvs_parse_node->parse_node.tlvs_proto_node.padn_val = + conf->tlvs_parse_node.proto_node.padn_val; + tlvs_parse_node->parse_node.tlvs_proto_node.eol_val = + conf->tlvs_parse_node.proto_node.eol_val; + tlvs_parse_node->parse_node.tlvs_proto_node.pad1_enable = + conf->tlvs_parse_node.proto_node.pad1_enable; + tlvs_parse_node->parse_node.tlvs_proto_node.padn_enable = + conf->tlvs_parse_node.proto_node.padn_enable; + tlvs_parse_node->parse_node.tlvs_proto_node.eol_enable = + conf->tlvs_parse_node.proto_node.eol_enable; + tlvs_parse_node->parse_node.tlvs_proto_node.fixed_start_offset = + conf->tlvs_parse_node.proto_node.fixed_start_offset; + tlvs_parse_node->parse_node.tlvs_proto_node.min_len = + conf->tlvs_parse_node.proto_node.min_len; + + kprototlvstbl = + kparser_namespace_lookup(KPARSER_NS_TLV_PROTO_TABLE, + 
&conf->tlvs_parse_node.tlv_proto_table_key); + if (kprototlvstbl) + rcu_assign_pointer(tlvs_parse_node->tlv_proto_table, + &kprototlvstbl->tlvs_proto_table); + + kparsetlvwildcardnode = + kparser_namespace_lookup(KPARSER_NS_TLV_NODE_PARSE, + &conf->tlvs_parse_node.tlv_wildcard_node_key); + if (kparsetlvwildcardnode) + rcu_assign_pointer(tlvs_parse_node->tlv_wildcard_node, + &kparsetlvwildcardnode->tlv_parse_node); + + tlvs_parse_node->unknown_tlv_type_ret = + conf->tlvs_parse_node.unknown_tlv_type_ret; + + tlvs_parse_node->config = + conf->tlvs_parse_node.config; + break; + + case KPARSER_NODE_TYPE_FLAG_FIELDS: + if (node_len < sizeof(*flag_fields_parse_node)) + return false; + flag_fields_parse_node = node; + + flag_fields_parse_node->parse_node.flag_fields_proto_node.ops = + conf->flag_fields_parse_node.proto_node.ops; + + kflag_fields = + kparser_namespace_lookup(KPARSER_NS_FLAG_FIELD_TABLE, + &conf->flag_fields_parse_node.proto_node. + flag_fields_table_hkey); + if (kflag_fields) + rcu_assign_pointer(flag_fields_parse_node-> + parse_node.flag_fields_proto_node.flag_fields, + &kflag_fields->flag_fields); + + kflag_fields_proto_table = + kparser_namespace_lookup(KPARSER_NS_FLAG_FIELD_PROTO_TABLE, + &conf->flag_fields_parse_node. 
+ flag_fields_proto_table_key); + if (kflag_fields_proto_table) + rcu_assign_pointer(flag_fields_parse_node->flag_fields_proto_table, + &kflag_fields_proto_table->flags_proto_table); + break; + + default: + return false; + } + return true; +} + +/* create handler for object parse node */ +int kparser_create_parse_node(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + struct kparser_glue_glue_parse_node *kparsenode = NULL; + struct kparser_glue_protocol_table *proto_table; + struct kparser_glue_metadata_table *mdl; + const struct kparser_conf_node *arg; + struct kparser_hkey key; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->node_conf; + + if (kparser_conf_key_manager(conf->namespace_id, &arg->key, &key, *rsp, + op, extack, err) != 0) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "error"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + if (kparser_namespace_lookup(conf->namespace_id, &key)) { + (*rsp)->op_ret_code = EEXIST; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Duplicate object, key:{%s:%u}", + op, arg->key.name, arg->key.id); + goto done; + } + + kparsenode = kzalloc(sizeof(*kparsenode), GFP_KERNEL); + if (!kparsenode) { + (*rsp)->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kzalloc() failed, size:%lu", + op, sizeof(*kparsenode)); + goto done; + } + + kparsenode->glue.glue.key = key; + INIT_LIST_HEAD(&kparsenode->glue.glue.owner_list); + INIT_LIST_HEAD(&kparsenode->glue.glue.owned_list); + + rc = kparser_namespace_insert(conf->namespace_id, + &kparsenode->glue.glue.ht_node_id, + &kparsenode->glue.glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kparser_namespace_insert() err, rc: %d", + op, rc); + goto done; + } + + 
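[Editorial note: every create handler in this patch follows the same shape under kparser_config_lock: resolve the key, fail with EEXIST on a duplicate lookup, kzalloc() the glue object, then insert it into the namespace. A minimal userspace sketch of that contract follows; `reg_create`, `kobj_t`, and the fixed-size registry are illustrative stand-ins, not kParser API, and the kernel's mutex and hashtable are elided.]

```c
/* Userspace sketch of the create-handler flow: look the key up first,
 * fail with EEXIST on a duplicate, otherwise allocate and insert.
 * All names here are illustrative, not part of the kParser KMOD. */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define REG_CAP 8

typedef struct kobj {
	unsigned int id;
	char name[32];
} kobj_t;

static kobj_t *registry[REG_CAP];

static kobj_t *reg_lookup(unsigned int id)
{
	for (int i = 0; i < REG_CAP; i++)
		if (registry[i] && registry[i]->id == id)
			return registry[i];
	return NULL;
}

/* Mirrors the handler's contract: 0 on success, positive errno otherwise. */
static int reg_create(unsigned int id, const char *name)
{
	kobj_t *obj;
	int slot = -1;

	if (reg_lookup(id))
		return EEXIST;		/* duplicate object, as in the patch */
	for (int i = 0; i < REG_CAP; i++)
		if (!registry[i]) {
			slot = i;
			break;
		}
	if (slot < 0)
		return ENOMEM;
	obj = calloc(1, sizeof(*obj));	/* kzalloc() analogue */
	if (!obj)
		return ENOMEM;
	obj->id = id;
	snprintf(obj->name, sizeof(obj->name), "%s", name);
	registry[slot] = obj;
	return 0;
}
```

In the patch proper, the error path additionally fills `(*rsp)->op_ret_code` and an extack message before the `goto done` unwind.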
kparsenode->glue.glue.config.namespace_id = conf->namespace_id; + kparsenode->glue.glue.config.conf_keys_bv = conf->conf_keys_bv; + kparsenode->glue.glue.config.node_conf = *arg; + kparsenode->glue.glue.config.node_conf.key = key; + kref_init(&kparsenode->glue.glue.refcount); + + if (!kparser_conf_node_convert(arg, &kparsenode->parse_node, + sizeof(kparsenode->parse_node))) { + (*rsp)->op_ret_code = EINVAL; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kparser_conf_node_convert() err", + op); + goto done; + } + + if (kparsenode->parse_node.node.proto_table) { + proto_table = container_of(kparsenode->parse_node.node.proto_table, + struct kparser_glue_protocol_table, + proto_table); + if (kparser_link_attach(kparsenode, + KPARSER_NS_NODE_PARSE, + (const void **)&kparsenode->parse_node.node.proto_table, + &kparsenode->glue.glue.refcount, + &kparsenode->glue.glue.owner_list, + proto_table, + KPARSER_NS_PROTO_TABLE, + &proto_table->glue.refcount, + &proto_table->glue.owned_list, + *rsp, op, extack, err) != 0) + goto done; + } + + if (kparsenode->parse_node.node.metadata_table) { + mdl = container_of(kparsenode->parse_node.node.metadata_table, + struct kparser_glue_metadata_table, + metadata_table); + if (kparser_link_attach(kparsenode, + KPARSER_NS_NODE_PARSE, + (const void **)&kparsenode->parse_node.node.metadata_table, + &kparsenode->glue.glue.refcount, + &kparsenode->glue.glue.owner_list, + mdl, + KPARSER_NS_METALIST, + &mdl->glue.refcount, + &mdl->glue.owned_list, + *rsp, op, extack, err) != 0) + goto done; + } + + (*rsp)->key = key; + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.node_conf = kparsenode->glue.glue.config.node_conf; +done: + mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code != 0) + kparser_free(kparsenode); + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_NODE_PARSE); +} + +/* read handler for object parse node */ +int kparser_read_parse_node(const 
struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_glue_parse_node *kparsenode; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + kparsenode = kparser_namespace_lookup(KPARSER_NS_NODE_PARSE, key); + if (!kparsenode) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = kparsenode->glue.glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = kparsenode->glue.glue.config.conf_keys_bv; + (*rsp)->object.node_conf = kparsenode->glue.glue.config.node_conf; +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_NODE_PARSE); +} + +/* delete handler for object parse node */ +int kparser_del_parse_node(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + struct kparser_obj_link_ctx *tmp_list_ref = NULL, *curr_ref = NULL; + struct kparser_glue_glue_parse_node *kparsenode; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + kparsenode = kparser_namespace_lookup(KPARSER_NS_NODE_PARSE, key); + if (!kparsenode) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + /* verify if there is any associated immutable parser */ + list_for_each_entry_safe(curr_ref, 
tmp_list_ref, + &kparsenode->glue.glue.owned_list, + owned_obj.list_node) { + if (curr_ref->owner_obj.nsid != KPARSER_NS_PARSER) + continue; + if (kref_read(curr_ref->owner_obj.refcount) != 0) { + (*rsp)->op_ret_code = EBUSY; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:attached parser `%s` is immutable", + op, + ((struct kparser_glue_parser *) + curr_ref->owner_obj.obj)->glue.key.name); + goto done; + } + } + + if (kparser_link_detach(kparsenode, &kparsenode->glue.glue.owner_list, + &kparsenode->glue.glue.owned_list, *rsp, extack, + err) != 0) + goto done; + + rc = kparser_namespace_remove(KPARSER_NS_NODE_PARSE, + &kparsenode->glue.glue.ht_node_id, + &kparsenode->glue.glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: namespace remove error, rc:%d", + op, rc); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = kparsenode->glue.glue.config.conf_keys_bv; + (*rsp)->object.node_conf = kparsenode->glue.glue.config.node_conf; + + kparser_free(kparsenode); +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_NODE_PARSE); +} + +/* free handler for object parse node */ +void kparser_free_node(void *ptr, void *arg) +{ + /* TODO: */ +} + +/* create handler for object protocol table entry */ +static bool kparser_create_proto_table_ent(const struct kparser_conf_table *arg, + struct kparser_glue_protocol_table **proto_table, + struct kparser_cmd_rsp_hdr *rsp, + const char *op, + void *extack, int *err) +{ + struct kparser_glue_glue_parse_node *kparsenode; + void *realloced_mem; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + *proto_table = kparser_namespace_lookup(KPARSER_NS_PROTO_TABLE, &arg->key); + if 
(!(*proto_table)) { + rsp->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, arg->key.name, arg->key.id); + return false; + } + + kparsenode = kparser_namespace_lookup(KPARSER_NS_NODE_PARSE, &arg->elem_key); + if (!kparsenode) { + rsp->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: parse node key:{%s:%u} not found", + op, arg->elem_key.name, + arg->elem_key.id); + return false; + } + + (*proto_table)->proto_table.num_ents++; + realloced_mem = krealloc((*proto_table)->proto_table.entries, + (*proto_table)->proto_table.num_ents * + sizeof(struct kparser_proto_table_entry), + GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + rsp->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: krealloc() err, ents:%d, size:%lu", + op, + (*proto_table)->proto_table.num_ents, + sizeof(struct kparser_proto_table_entry)); + return false; + } + rcu_assign_pointer((*proto_table)->proto_table.entries, realloced_mem); + + if (kparser_link_attach(*proto_table, + KPARSER_NS_PROTO_TABLE, + NULL, /* due to realloc, can't cache pointer here */ + &(*proto_table)->glue.refcount, + &(*proto_table)->glue.owner_list, + kparsenode, + KPARSER_NS_NODE_PARSE, + &kparsenode->glue.glue.refcount, + &kparsenode->glue.glue.owned_list, + rsp, op, extack, err) != 0) + return false; + + (*proto_table)->proto_table.entries[(*proto_table)->proto_table.num_ents - 1].value = + arg->optional_value1; + (*proto_table)->proto_table.entries[(*proto_table)->proto_table.num_ents - 1].encap = + arg->optional_value2; + (*proto_table)->proto_table.entries[(*proto_table)->proto_table.num_ents - 1].node = + &kparsenode->parse_node.node; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return true; +} + +/* create handler for object protocol table */ +int kparser_create_proto_table(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + 
struct kparser_glue_protocol_table *proto_table = NULL; + const struct kparser_conf_table *arg; + struct kparser_hkey key; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->table_conf; + + /* create a table entry */ + if (arg->add_entry) { + if (kparser_create_proto_table_ent(arg, &proto_table, *rsp, op, + extack, err) == false) + goto done; + goto skip_table_create; + } + + if (kparser_conf_key_manager(conf->namespace_id, &arg->key, &key, *rsp, + op, extack, err) != 0) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "error"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + if (kparser_namespace_lookup(conf->namespace_id, &key)) { + (*rsp)->op_ret_code = EEXIST; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Duplicate object, key: {%s:%u}", + op, arg->key.name, arg->key.id); + goto done; + } + + /* create protocol table */ + proto_table = kzalloc(sizeof(*proto_table), GFP_KERNEL); + if (!proto_table) { + (*rsp)->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kzalloc() failed, size: %lu", + op, sizeof(*proto_table)); + goto done; + } + + proto_table->glue.key = key; + + rc = kparser_namespace_insert(conf->namespace_id, + &proto_table->glue.ht_node_id, + &proto_table->glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kparser_namespace_insert() err, rc:%d", + op, rc); + goto done; + } + + proto_table->glue.config.namespace_id = conf->namespace_id; + proto_table->glue.config.conf_keys_bv = conf->conf_keys_bv; + proto_table->glue.config.table_conf = *arg; + proto_table->glue.config.table_conf.key = key; + kref_init(&proto_table->glue.refcount); + INIT_LIST_HEAD(&proto_table->glue.owner_list); + INIT_LIST_HEAD(&proto_table->glue.owned_list); + +skip_table_create: + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.table_conf = *arg; + +done: + 
mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code != 0) + if (proto_table && !arg->add_entry) + kparser_free(proto_table); + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_PROTO_TABLE); +} + +/* read handler for object protocol table */ +int kparser_read_proto_table(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_protocol_table *proto_table; + const struct kparser_glue_glue_parse_node *parse_node; + struct kparser_conf_cmd *objects = NULL; + void *realloced_mem; + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + proto_table = kparser_namespace_lookup(KPARSER_NS_PROTO_TABLE, key); + if (!proto_table) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = proto_table->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = proto_table->glue.config.conf_keys_bv; + (*rsp)->object.table_conf = proto_table->glue.config.table_conf; + + for (i = 0; i < proto_table->proto_table.num_ents; i++) { + (*rsp)->objects_len++; + *rsp_len = *rsp_len + sizeof(struct kparser_conf_cmd); + realloced_mem = krealloc(*rsp, *rsp_len, GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "krealloc failed for rsp, len:%lu\n", + *rsp_len); + *rsp_len = 0; + mutex_unlock(&kparser_config_lock); + return KPARSER_ATTR_UNSPEC; + } + *rsp = realloced_mem; + objects = (struct kparser_conf_cmd *)(*rsp)->objects; + objects[i].namespace_id = proto_table->glue.config.namespace_id; + 
objects[i].table_conf = proto_table->glue.config.table_conf; + objects[i].table_conf.optional_value1 = proto_table->proto_table.entries[i].value; + if (!proto_table->proto_table.entries[i].node) + continue; + parse_node = container_of(proto_table->proto_table.entries[i].node, + struct kparser_glue_glue_parse_node, + parse_node.node); + objects[i].table_conf.elem_key = parse_node->glue.glue.key; + } + +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_PROTO_TABLE); +} + +/* delete handler for object protocol table */ +int kparser_del_proto_table(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + struct kparser_obj_link_ctx *tmp_list_ref = NULL, *curr_ref = NULL; + struct kparser_obj_link_ctx *node_tmp_list_ref = NULL; + struct kparser_obj_link_ctx *node_curr_ref = NULL; + struct kparser_glue_protocol_table *proto_table; + struct kparser_glue_glue_parse_node *kparsenode; + struct kparser_glue_glue_parse_node *parse_node; + struct kparser_conf_cmd *objects = NULL; + void *realloced_mem; + int i, rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + proto_table = kparser_namespace_lookup(KPARSER_NS_PROTO_TABLE, key); + if (!proto_table) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + /* verify if there is any associated immutable parser */ + list_for_each_entry_safe(curr_ref, tmp_list_ref, + &proto_table->glue.owned_list, owned_obj.list_node) { + if (curr_ref->owner_obj.nsid != KPARSER_NS_NODE_PARSE) + continue; + if (kref_read(curr_ref->owner_obj.refcount) == 0) + continue; + kparsenode = (struct kparser_glue_glue_parse_node 
*) + curr_ref->owner_obj.obj; + list_for_each_entry_safe(node_curr_ref, node_tmp_list_ref, + &kparsenode->glue.glue.owned_list, owned_obj.list_node) { + if (node_curr_ref->owner_obj.nsid != KPARSER_NS_PARSER) + continue; + if (kref_read(node_curr_ref->owner_obj.refcount) != 0) { + (*rsp)->op_ret_code = EBUSY; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s:attached parser `%s` is immutable", + op, + ((struct kparser_glue_parser *) + node_curr_ref->owner_obj.obj)->glue.key.name); + goto done; + } + } + } + + (*rsp)->key = proto_table->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = proto_table->glue.config.conf_keys_bv; + (*rsp)->object.table_conf = proto_table->glue.config.table_conf; + + for (i = 0; i < proto_table->proto_table.num_ents; i++) { + (*rsp)->objects_len++; + *rsp_len = *rsp_len + sizeof(struct kparser_conf_cmd); + realloced_mem = krealloc(*rsp, *rsp_len, GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "krealloc failed for rsp, len:%lu\n", + *rsp_len); + *rsp_len = 0; + mutex_unlock(&kparser_config_lock); + return KPARSER_ATTR_UNSPEC; + } + *rsp = realloced_mem; + objects = (struct kparser_conf_cmd *)(*rsp)->objects; + objects[i].namespace_id = proto_table->glue.config.namespace_id; + objects[i].table_conf = proto_table->glue.config.table_conf; + objects[i].table_conf.optional_value1 = proto_table->proto_table.entries[i].value; + if (!proto_table->proto_table.entries[i].node) + continue; + parse_node = container_of(proto_table->proto_table.entries[i].node, + struct kparser_glue_glue_parse_node, + parse_node.node); + objects[i].table_conf.elem_key = parse_node->glue.glue.key; + } + + if (kparser_link_detach(proto_table, &proto_table->glue.owner_list, + &proto_table->glue.owned_list, *rsp, + extack, err) != 0) + goto done; + + rc = kparser_namespace_remove(KPARSER_NS_PROTO_TABLE, + 
&proto_table->glue.ht_node_id, + &proto_table->glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: namespace remove error, rc:%d", + op, rc); + goto done; + } + + kparser_free(proto_table->proto_table.entries); + kparser_free(proto_table); +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_PROTO_TABLE); +} + +/* free handler for object protocol table */ +void kparser_free_proto_tbl(void *ptr, void *arg) +{ + /* TODO: */ +} + +/* handler to convert and map from netlink tlv node to kParser KMOD's tlv node */ +static inline bool kparser_conf_tlv_node_convert(const struct kparser_conf_node_parse_tlv *conf, + struct kparser_parse_tlv_node *node) +{ + struct kparser_glue_parse_tlv_node *kparsewildcardnode; + struct kparser_glue_condexpr_tables *kcond_tables; + struct kparser_glue_proto_tlvs_table *kprototbl; + struct kparser_glue_metadata_table *kmdl; + + if (!conf || !node) + return false; + + node->proto_tlv_node.min_len = conf->node_proto.min_len; + node->proto_tlv_node.max_len = conf->node_proto.max_len; + node->proto_tlv_node.is_padding = conf->node_proto.is_padding; + + node->proto_tlv_node.ops.pfoverlay_type = conf->node_proto.ops.pfoverlay_type; + if (node->proto_tlv_node.ops.pfoverlay_type.src_off || + node->proto_tlv_node.ops.pfoverlay_type.size || + node->proto_tlv_node.ops.pfoverlay_type.right_shift) + node->proto_tlv_node.ops.overlay_type_parameterized = true; + + kcond_tables = kparser_namespace_lookup(KPARSER_NS_CONDEXPRS_TABLES, + &conf->node_proto.ops.cond_exprs_table); + if (kcond_tables) { + node->proto_tlv_node.ops.cond_exprs = kcond_tables->table; + node->proto_tlv_node.ops.cond_exprs_parameterized = true; + } + + kprototbl = kparser_namespace_lookup(KPARSER_NS_TLV_PROTO_TABLE, + &conf->overlay_proto_tlvs_table_key); + if (kprototbl) + rcu_assign_pointer(node->overlay_table, &kprototbl->tlvs_proto_table); + + 
kparsewildcardnode = kparser_namespace_lookup(KPARSER_NS_TLV_NODE_PARSE, + &conf->overlay_wildcard_parse_node_key); + if (kparsewildcardnode) + rcu_assign_pointer(node->overlay_wildcard_node, + &kparsewildcardnode->tlv_parse_node); + + node->unknown_overlay_ret = conf->unknown_ret; + strcpy(node->name, conf->key.name); + + kmdl = kparser_namespace_lookup(KPARSER_NS_METALIST, + &conf->metadata_table_key); + if (kmdl) + rcu_assign_pointer(node->metadata_table, &kmdl->metadata_table); + + return true; +} + +/* create handler for object tlv node */ +int kparser_create_parse_tlv_node(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + struct kparser_glue_parse_tlv_node *node = NULL; + const struct kparser_conf_node_parse_tlv *arg; + struct kparser_hkey key; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->tlv_node_conf; + + if (kparser_conf_key_manager(conf->namespace_id, &arg->key, &key, *rsp, + op, extack, err) != 0) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "error"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + if (kparser_namespace_lookup(conf->namespace_id, &key)) { + (*rsp)->op_ret_code = EEXIST; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Duplicate object key, {%s:%u}", + op, arg->key.name, arg->key.id); + goto done; + } + + node = kzalloc(sizeof(*node), GFP_KERNEL); + if (!node) { + (*rsp)->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kzalloc() failed, size: %lu", + op, sizeof(*node)); + goto done; + } + + node->glue.glue.key = key; + + rc = kparser_namespace_insert(conf->namespace_id, + &node->glue.glue.ht_node_id, + &node->glue.glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kparser_namespace_insert() err, rc:%d", + op, rc); + 
goto done; + } + + node->glue.glue.config.namespace_id = conf->namespace_id; + node->glue.glue.config.conf_keys_bv = conf->conf_keys_bv; + node->glue.glue.config.tlv_node_conf = *arg; + node->glue.glue.config.tlv_node_conf.key = key; + kref_init(&node->glue.glue.refcount); + + if (!kparser_conf_tlv_node_convert(arg, &node->tlv_parse_node)) { + (*rsp)->op_ret_code = EINVAL; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kparser_conf_tlv_node_convert() err", + op); + goto done; + } + + (*rsp)->key = key; + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.tlv_node_conf = node->glue.glue.config.tlv_node_conf; +done: + mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code != 0) + kparser_free(node); + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_TLV_NODE_PARSE); +} + +/* read handler for object tlv node */ +int kparser_read_parse_tlv_node(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_parse_tlv_node *node; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + node = kparser_namespace_lookup(KPARSER_NS_TLV_NODE_PARSE, key); + if (!node) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = node->glue.glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = node->glue.glue.config.conf_keys_bv; + (*rsp)->object.tlv_node_conf = node->glue.glue.config.tlv_node_conf; +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return 
KPARSER_ATTR_RSP(KPARSER_NS_TLV_NODE_PARSE); +} + +/* create handler for object tlv proto table's entry */ +static bool kparser_create_tlv_proto_table_ent(const struct kparser_conf_table *arg, + struct kparser_glue_proto_tlvs_table **proto_table, + struct kparser_cmd_rsp_hdr *rsp, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_parse_tlv_node *kparsenode; + void *realloced_mem; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + *proto_table = kparser_namespace_lookup(KPARSER_NS_TLV_PROTO_TABLE, &arg->key); + if (!(*proto_table)) { + rsp->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, arg->key.name, arg->key.id); + return false; + } + + kparsenode = kparser_namespace_lookup(KPARSER_NS_TLV_NODE_PARSE, &arg->elem_key); + if (!kparsenode) { + rsp->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, arg->elem_key.name, arg->elem_key.id); + return false; + } + + (*proto_table)->tlvs_proto_table.num_ents++; + realloced_mem = krealloc((*proto_table)->tlvs_proto_table.entries, + (*proto_table)->tlvs_proto_table.num_ents * + sizeof(struct kparser_proto_tlvs_table_entry), + GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + rsp->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: krealloc() err, ents:%d, size:%lu", + op, + (*proto_table)->tlvs_proto_table.num_ents, + sizeof(struct kparser_proto_tlvs_table_entry)); + return false; + } + rcu_assign_pointer((*proto_table)->tlvs_proto_table.entries, realloced_mem); + + (*proto_table)->tlvs_proto_table.entries[(*proto_table)->tlvs_proto_table.num_ents - + 1].type = arg->optional_value1; + (*proto_table)->tlvs_proto_table.entries[(*proto_table)->tlvs_proto_table.num_ents - + 1].node = &kparsenode->tlv_parse_node; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + 
return true; +} + +/* create handler for object tlv proto table */ +int kparser_create_tlv_proto_table(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + struct kparser_glue_proto_tlvs_table *proto_table = NULL; + const struct kparser_conf_table *arg; + struct kparser_hkey key; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->table_conf; + + /* create a table entry */ + if (arg->add_entry) { + if (kparser_create_tlv_proto_table_ent(arg, &proto_table, *rsp, + op, extack, err) == false) + goto done; + goto skip_table_create; + } + + if (kparser_conf_key_manager(conf->namespace_id, &arg->key, &key, *rsp, + op, extack, err) != 0) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "error"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + if (kparser_namespace_lookup(conf->namespace_id, &key)) { + (*rsp)->op_ret_code = EEXIST; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Duplicate object, key: {%s:%u}", + op, arg->key.name, arg->key.id); + goto done; + } + + /* create protocol table */ + proto_table = kzalloc(sizeof(*proto_table), GFP_KERNEL); + if (!proto_table) { + (*rsp)->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kzalloc() failed, size: %lu", + op, sizeof(*proto_table)); + goto done; + } + + proto_table->glue.key = key; + + rc = kparser_namespace_insert(conf->namespace_id, + &proto_table->glue.ht_node_id, + &proto_table->glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kparser_namespace_insert() err, rc: %d", + op, rc); + goto done; + } + + proto_table->glue.config.namespace_id = conf->namespace_id; + proto_table->glue.config.conf_keys_bv = conf->conf_keys_bv; + proto_table->glue.config.table_conf = *arg; + proto_table->glue.config.table_conf.key 
= key; + kref_init(&proto_table->glue.refcount); + +skip_table_create: + (*rsp)->key = key; + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.table_conf = *arg; +done: + mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code != 0) + if (proto_table && !arg->add_entry) + kparser_free(proto_table); + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_TLV_PROTO_TABLE); +} + +/* read handler for object tlv proto table */ +int kparser_read_tlv_proto_table(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_proto_tlvs_table *proto_table; + const struct kparser_glue_parse_tlv_node *parse_node; + struct kparser_conf_cmd *objects = NULL; + void *realloced_mem; + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + proto_table = kparser_namespace_lookup(KPARSER_NS_TLV_PROTO_TABLE, key); + if (!proto_table) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = proto_table->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = proto_table->glue.config.conf_keys_bv; + (*rsp)->object.table_conf = proto_table->glue.config.table_conf; + + for (i = 0; i < proto_table->tlvs_proto_table.num_ents; i++) { + (*rsp)->objects_len++; + *rsp_len = *rsp_len + sizeof(struct kparser_conf_cmd); + realloced_mem = krealloc(*rsp, *rsp_len, GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "krealloc failed for rsp, len:%lu\n", + *rsp_len); + *rsp_len = 0; + 
mutex_unlock(&kparser_config_lock); + return KPARSER_ATTR_UNSPEC; + } + *rsp = realloced_mem; + objects = (struct kparser_conf_cmd *)(*rsp)->objects; + objects[i].namespace_id = proto_table->glue.config.namespace_id; + objects[i].table_conf = proto_table->glue.config.table_conf; + objects[i].table_conf.optional_value1 = + proto_table->tlvs_proto_table.entries[i].type; + if (!proto_table->tlvs_proto_table.entries[i].node) + continue; + parse_node = container_of(proto_table->tlvs_proto_table.entries[i].node, + struct kparser_glue_parse_tlv_node, tlv_parse_node); + objects[i].table_conf.elem_key = parse_node->glue.glue.key; + } + +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_TLV_PROTO_TABLE); +} + +/* create handler for object flag field */ +int kparser_create_flag_field(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + struct kparser_glue_flag_field *kobj = NULL; + const struct kparser_conf_flag_field *arg; + struct kparser_hkey key; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->flag_field_conf; + + if (kparser_conf_key_manager(conf->namespace_id, &arg->key, &key, *rsp, + op, extack, err) != 0) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "error"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + if (kparser_namespace_lookup(conf->namespace_id, &key)) { + (*rsp)->op_ret_code = EEXIST; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Duplicate object, key: {%s:%u}", + op, arg->key.name, arg->key.id); + goto done; + } + + kobj = kzalloc(sizeof(*kobj), GFP_KERNEL); + if (!kobj) { + (*rsp)->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kzalloc() failed, size: %lu", + op, sizeof(*kobj)); + goto done; + } + 
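[Editorial note: the glue objects' lifetime hinges on the kref lifecycle seen above: `kref_init()` in the create handlers, a get when `kparser_link_attach()` links owner and owned objects, and a `kref_read()` check that makes delete fail with EBUSY while an attached parser still holds a reference. A minimal sketch of those semantics using C11 atomics follows; the kernel's real `kref_put()` also takes a release callback, which is omitted here.]

```c
/* Sketch of the kref_init()/kref_get()/kref_read() lifecycle using C11
 * atomics in place of the kernel's struct kref. Illustrative only. */
#include <stdatomic.h>

struct kref {
	atomic_uint refcount;
};

static void kref_init(struct kref *k)
{
	atomic_store(&k->refcount, 1);	/* create handler: count starts at 1 */
}

static void kref_get(struct kref *k)
{
	atomic_fetch_add(&k->refcount, 1);	/* link attach bumps the owner */
}

static unsigned int kref_read(struct kref *k)
{
	return atomic_load(&k->refcount);	/* delete checks this for EBUSY */
}

/* Returns 1 when the count dropped to zero and the object may be freed. */
static int kref_put(struct kref *k)
{
	return atomic_fetch_sub(&k->refcount, 1) == 1;
}
```

Under this model, the delete handlers' `kref_read(...) != 0` test on an attached parser's refcount is what renders an in-use parse node "immutable".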
+ kobj->glue.key = key; + + rc = kparser_namespace_insert(conf->namespace_id, + &kobj->glue.ht_node_id, &kobj->glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kparser_namespace_insert() err, rc:%d", + op, rc); + goto done; + } + + kobj->glue.config.namespace_id = conf->namespace_id; + kobj->glue.config.conf_keys_bv = conf->conf_keys_bv; + kobj->glue.config.flag_field_conf = *arg; + kobj->glue.config.flag_field_conf.key = key; + kref_init(&kobj->glue.refcount); + + kobj->flag_field = arg->conf; + + (*rsp)->key = key; + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.flag_field_conf = kobj->glue.config.flag_field_conf; + +done: + mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code != 0) + kparser_free(kobj); + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_FLAG_FIELD); +} + +/* read handler for object flag field */ +int kparser_read_flag_field(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + struct kparser_glue_flag_field *kobj; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + kobj = kparser_namespace_lookup(KPARSER_NS_FLAG_FIELD, key); + if (!kobj) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = kobj->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = kobj->glue.config.conf_keys_bv; + (*rsp)->object.flag_field_conf = kobj->glue.config.flag_field_conf; +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, 
"OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_FLAG_FIELD); +} + +/* compare call back to sort flag fields using their flag values in qsort API */ +static int compare(const void *lhs, const void *rhs) +{ + const struct kparser_flag_field *lhs_flag = lhs; + const struct kparser_flag_field *rhs_flag = rhs; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "lflag:%x rflag:%x\n", lhs_flag->flag, rhs_flag->flag); + + if (lhs_flag->flag < rhs_flag->flag) + return -1; + if (lhs_flag->flag > rhs_flag->flag) + return 1; + + return 0; +} + +/* create handler for object flag field table entry */ +static bool kparser_create_flag_field_table_ent(const struct kparser_conf_table *arg, + struct kparser_glue_flag_fields **proto_table, + struct kparser_cmd_rsp_hdr *rsp, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_flag_field *kflagent; + void *realloced_mem; + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + *proto_table = kparser_namespace_lookup(KPARSER_NS_FLAG_FIELD_TABLE, &arg->key); + if (!(*proto_table)) { + rsp->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, arg->key.name, arg->key.id); + return false; + } + + kflagent = kparser_namespace_lookup(KPARSER_NS_FLAG_FIELD, &arg->elem_key); + if (!kflagent) { + rsp->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, arg->elem_key.name, arg->elem_key.id); + return false; + } + + (*proto_table)->flag_fields.num_idx++; + + realloced_mem = krealloc((*proto_table)->flag_fields.fields, + (*proto_table)->flag_fields.num_idx * + sizeof(struct kparser_flag_field), + GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + rsp->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: krealloc() err, ents:%lu, size:%lu", + op, + (*proto_table)->flag_fields.num_idx, + sizeof(struct 
kparser_flag_field)); + return false; + } + rcu_assign_pointer((*proto_table)->flag_fields.fields, realloced_mem); + + (*proto_table)->flag_fields.fields[(*proto_table)->flag_fields.num_idx - 1] = + kflagent->flag_field; + + sort((*proto_table)->flag_fields.fields, + (*proto_table)->flag_fields.num_idx, + sizeof(struct kparser_flag_field), &compare, NULL); + + for (i = 0; i < (*proto_table)->flag_fields.num_idx; i++) + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "List[%d]:%x\n", + i, (*proto_table)->flag_fields.fields[i].flag); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return true; +} + +/* create handler for object flag field */ +int kparser_create_flag_field_table(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + struct kparser_glue_flag_fields *proto_table = NULL; + const struct kparser_conf_table *arg; + struct kparser_hkey key; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->table_conf; + + if (arg->add_entry) { + if (kparser_create_flag_field_table_ent(arg, &proto_table, *rsp, + op, extack, err) == false) + goto done; + goto skip_table_create; + } + + if (kparser_conf_key_manager(conf->namespace_id, &arg->key, &key, *rsp, + op, extack, err) != 0) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "error"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + if (kparser_namespace_lookup(conf->namespace_id, &key)) { + (*rsp)->op_ret_code = EEXIST; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Duplicate object, key: {%s:%u}", + op, arg->key.name, arg->key.id); + goto done; + } + + proto_table = kzalloc(sizeof(*proto_table), GFP_KERNEL); + if (!proto_table) { + (*rsp)->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kzalloc() failed, size: %lu", + op, sizeof(*proto_table)); + 
goto done; + } + + proto_table->glue.key = key; + + rc = kparser_namespace_insert(conf->namespace_id, + &proto_table->glue.ht_node_id, + &proto_table->glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kparser_namespace_insert() err, rc: %d", + op, rc); + goto done; + } + + proto_table->glue.config.namespace_id = conf->namespace_id; + proto_table->glue.config.conf_keys_bv = conf->conf_keys_bv; + proto_table->glue.config.table_conf = *arg; + proto_table->glue.config.table_conf.key = key; + kref_init(&proto_table->glue.refcount); + +skip_table_create: + (*rsp)->key = key; + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.table_conf = *arg; +done: + mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code != 0) + if (proto_table && !arg->add_entry) + kparser_free(proto_table); + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_FLAG_FIELD_TABLE); +} + +/* read handler for object flag field */ +int kparser_read_flag_field_table(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_flag_fields *proto_table; + const struct kparser_glue_flag_field *kflagent; + struct kparser_conf_cmd *objects = NULL; + void *realloced_mem; + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + proto_table = kparser_namespace_lookup(KPARSER_NS_FLAG_FIELD_TABLE, key); + if (!proto_table) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = proto_table->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", + (*rsp)->key.id, 
(*rsp)->key.name); + (*rsp)->object.conf_keys_bv = proto_table->glue.config.conf_keys_bv; + (*rsp)->object.table_conf = proto_table->glue.config.table_conf; + + for (i = 0; i < proto_table->flag_fields.num_idx; i++) { + (*rsp)->objects_len++; + *rsp_len = *rsp_len + sizeof(struct kparser_conf_cmd); + realloced_mem = krealloc(*rsp, *rsp_len, GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "krealloc failed for rsp, len:%lu\n", + *rsp_len); + *rsp_len = 0; + mutex_unlock(&kparser_config_lock); + return KPARSER_ATTR_UNSPEC; + } + *rsp = realloced_mem; + + objects = (struct kparser_conf_cmd *)(*rsp)->objects; + objects[i].namespace_id = proto_table->glue.config.namespace_id; + objects[i].table_conf = proto_table->glue.config.table_conf; + objects[i].table_conf.optional_value1 = i; + if (!proto_table->flag_fields.fields) + continue; + kflagent = container_of(&proto_table->flag_fields.fields[i], + struct kparser_glue_flag_field, flag_field); + objects[i].table_conf.elem_key = kflagent->glue.key; + } + +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_FLAG_FIELD_TABLE); +} + +/* handler to convert and map netlink's flag node to kParser KMOD's flag node */ +static inline bool +kparser_create_parse_flag_field_node_convert(const struct kparser_conf_node_parse_flag_field *conf, + struct kparser_parse_flag_field_node *node) +{ + struct kparser_glue_condexpr_tables *kcond_tables; + struct kparser_glue_metadata_table *kmdl; + + if (!conf || !node) + return false; + + strcpy(node->name, conf->key.name); + + kcond_tables = kparser_namespace_lookup(KPARSER_NS_CONDEXPRS_TABLES, + &conf->ops.cond_exprs_table_key); + if (kcond_tables) + node->ops.cond_exprs = kcond_tables->table; + + kmdl = kparser_namespace_lookup(KPARSER_NS_METALIST, &conf->metadata_table_key); + if (kmdl) + rcu_assign_pointer(node->metadata_table, &kmdl->metadata_table); + 
+ return true; +} + +/* create handler for object flag field node */ +int kparser_create_parse_flag_field_node(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + const struct kparser_conf_node_parse_flag_field *arg; + struct kparser_glue_flag_field_node *node = NULL; + struct kparser_hkey key; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->flag_field_node_conf; + + if (kparser_conf_key_manager(conf->namespace_id, &arg->key, &key, *rsp, + op, extack, err) != 0) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "error"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + if (kparser_namespace_lookup(conf->namespace_id, &key)) { + (*rsp)->op_ret_code = EEXIST; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Duplicate object, key: {%s:%u}", + op, arg->key.name, arg->key.id); + goto done; + } + + node = kzalloc(sizeof(*node), GFP_KERNEL); + if (!node) { + (*rsp)->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kzalloc() failed, size: %lu", + op, sizeof(*node)); + goto done; + } + + node->glue.glue.key = key; + + rc = kparser_namespace_insert(conf->namespace_id, + &node->glue.glue.ht_node_id, &node->glue.glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kparser_namespace_insert() err, rc:%d", + op, rc); + goto done; + } + + node->glue.glue.config.namespace_id = conf->namespace_id; + node->glue.glue.config.conf_keys_bv = conf->conf_keys_bv; + node->glue.glue.config.flag_field_node_conf = *arg; + node->glue.glue.config.flag_field_node_conf.key = key; + kref_init(&node->glue.glue.refcount); + + if (!kparser_create_parse_flag_field_node_convert(arg, &node->node_flag_field)) { + (*rsp)->op_ret_code = EINVAL; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: 
kparser_create_parse_flag_field_node_convert() err",
+ op);
+ goto done;
+ }
+
+ (*rsp)->key = key;
+ (*rsp)->object.conf_keys_bv = conf->conf_keys_bv;
+ (*rsp)->object.flag_field_node_conf = node->glue.glue.config.flag_field_node_conf;
+done:
+ mutex_unlock(&kparser_config_lock);
+
+ if ((*rsp)->op_ret_code != 0)
+ kparser_free(node);
+
+ synchronize_rcu();
+
+ KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: ");
+ return KPARSER_ATTR_RSP(KPARSER_NS_FLAG_FIELD_NODE_PARSE);
+}
+
+/* read handler for object flag field node */
+int kparser_read_parse_flag_field_node(const struct kparser_hkey *key,
+ struct kparser_cmd_rsp_hdr **rsp,
+ size_t *rsp_len, __u8 recursive_read,
+ const char *op,
+ void *extack, int *err)
+{
+ const struct kparser_glue_flag_field_node *node;
+
+ KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: ");
+
+ KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name);
+
+ mutex_lock(&kparser_config_lock);
+
+ node = kparser_namespace_lookup(KPARSER_NS_FLAG_FIELD_NODE_PARSE, key);
+ if (!node) {
+ (*rsp)->op_ret_code = ENOENT;
+ NL_SET_ERR_MSG_FMT_MOD(extack,
+ "%s: Object not found, key: {%s:%u}",
+ op, key->name, key->id);
+ goto done;
+ }
+
+ (*rsp)->key = node->glue.glue.key;
+ KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI,
+ "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name);
+ (*rsp)->object.conf_keys_bv = node->glue.glue.config.conf_keys_bv;
+ (*rsp)->object.flag_field_node_conf = node->glue.glue.config.flag_field_node_conf;
+done:
+ mutex_unlock(&kparser_config_lock);
+
+ KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: ");
+ return KPARSER_ATTR_RSP(KPARSER_NS_FLAG_FIELD_NODE_PARSE);
+}
+
+/* create handler for object flag field proto table's entry */
+static bool
+kparser_create_flag_field_proto_table_ent(const struct kparser_conf_table *arg,
+ struct kparser_glue_proto_flag_fields_table **proto_table,
+ struct kparser_cmd_rsp_hdr *rsp,
+ const char *op,
+ void *extack, int *err)
+{
+ const struct
kparser_glue_flag_field_node *kparsenode; + void *realloced_mem; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + *proto_table = kparser_namespace_lookup(KPARSER_NS_FLAG_FIELD_PROTO_TABLE, &arg->key); + if (!(*proto_table)) { + rsp->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, arg->key.name, arg->key.id); + return false; + } + + kparsenode = kparser_namespace_lookup(KPARSER_NS_FLAG_FIELD_NODE_PARSE, &arg->elem_key); + if (!kparsenode) { + rsp->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, + arg->elem_key.name, + arg->elem_key.id); + return false; + } + + (*proto_table)->flags_proto_table.num_ents++; + realloced_mem = krealloc((*proto_table)->flags_proto_table.entries, + (*proto_table)->flags_proto_table.num_ents * + sizeof(struct kparser_proto_flag_fields_table_entry), + GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + rsp->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: krealloc() err, ents:%d, size:%lu", + op, + (*proto_table)->flags_proto_table.num_ents, + sizeof(struct kparser_proto_flag_fields_table_entry)); + return false; + } + rcu_assign_pointer((*proto_table)->flags_proto_table.entries, realloced_mem); + + (*proto_table)->flags_proto_table.entries[(*proto_table)->flags_proto_table.num_ents - + 1].flag = arg->optional_value1; + (*proto_table)->flags_proto_table.entries[(*proto_table)->flags_proto_table.num_ents - + 1].node = &kparsenode->node_flag_field; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return true; +} + +/* create handler for object flag field proto table */ +int kparser_create_flag_field_proto_table(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + struct 
kparser_glue_proto_flag_fields_table *proto_table = NULL; + const struct kparser_conf_table *arg; + struct kparser_hkey key; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->table_conf; + + /* create a table entry */ + if (arg->add_entry) { + if (kparser_create_flag_field_proto_table_ent(arg, &proto_table, + *rsp, + op, extack, err) == false) + goto done; + goto skip_table_create; + } + + if (kparser_conf_key_manager(conf->namespace_id, &arg->key, &key, *rsp, + op, extack, err) != 0) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "error"); + goto done; + } + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", arg->key.id, arg->key.name); + + if (kparser_namespace_lookup(conf->namespace_id, &key)) { + (*rsp)->op_ret_code = EEXIST; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Duplicate object, key {%s:%u}", + op, arg->key.name, arg->key.id); + goto done; + } + + /* create protocol table */ + proto_table = kzalloc(sizeof(*proto_table), GFP_KERNEL); + if (!proto_table) { + (*rsp)->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kzalloc() failed, size: %lu", op, + sizeof(*proto_table)); + goto done; + } + + proto_table->glue.key = key; + + rc = kparser_namespace_insert(conf->namespace_id, + &proto_table->glue.ht_node_id, + &proto_table->glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kparser_namespace_insert() err, rc: %d", + op, rc); + goto done; + } + + proto_table->glue.config.namespace_id = conf->namespace_id; + proto_table->glue.config.conf_keys_bv = conf->conf_keys_bv; + proto_table->glue.config.table_conf = *arg; + proto_table->glue.config.table_conf.key = key; + kref_init(&proto_table->glue.refcount); + +skip_table_create: + (*rsp)->key = key; + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.table_conf = *arg; +done: + mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code 
!= 0) + if (proto_table && !arg->add_entry) + kparser_free(proto_table); + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_FLAG_FIELD_PROTO_TABLE); +} + +/* read handler for object flag field proto table */ +int kparser_read_flag_field_proto_table(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_proto_flag_fields_table *proto_table; + const struct kparser_glue_flag_field_node *parse_node; + struct kparser_conf_cmd *objects = NULL; + void *realloced_mem; + int i; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + proto_table = kparser_namespace_lookup(KPARSER_NS_FLAG_FIELD_PROTO_TABLE, key); + if (!proto_table) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = proto_table->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = proto_table->glue.config.conf_keys_bv; + (*rsp)->object.table_conf = proto_table->glue.config.table_conf; + + for (i = 0; i < proto_table->flags_proto_table.num_ents; i++) { + (*rsp)->objects_len++; + *rsp_len = *rsp_len + sizeof(struct kparser_conf_cmd); + realloced_mem = krealloc(*rsp, *rsp_len, GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "krealloc failed for rsp, len:%lu\n", + *rsp_len); + *rsp_len = 0; + mutex_unlock(&kparser_config_lock); + return KPARSER_ATTR_UNSPEC; + } + *rsp = realloced_mem; + + objects = (struct kparser_conf_cmd *)(*rsp)->objects; + objects[i].namespace_id = proto_table->glue.config.namespace_id; + 
objects[i].table_conf = proto_table->glue.config.table_conf;
+ if (!proto_table->flags_proto_table.entries[i].node)
+ continue;
+ objects[i].table_conf.optional_value1 =
+ proto_table->flags_proto_table.entries[i].flag;
+ parse_node = container_of(proto_table->flags_proto_table.entries[i].node,
+ struct kparser_glue_flag_field_node,
+ node_flag_field);
+ objects[i].table_conf.elem_key = parse_node->glue.glue.key;
+ }
+
+done:
+ mutex_unlock(&kparser_config_lock);
+
+ KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: ");
+ return KPARSER_ATTR_RSP(KPARSER_NS_FLAG_FIELD_PROTO_TABLE);
+}
+
+/* convert and map from netlink's parser to kParser KMOD's parser */
+static inline bool kparser_parser_convert(const struct kparser_conf_parser *conf,
+ struct kparser_parser *parser)
+{
+ struct kparser_glue_glue_parse_node *node;
+
+ strcpy(parser->name, conf->key.name);
+
+ node = kparser_namespace_lookup(KPARSER_NS_NODE_PARSE, &conf->root_node_key);
+ if (node)
+ rcu_assign_pointer(parser->root_node, &node->parse_node.node);
+ else
+ return false;
+
+ node = kparser_namespace_lookup(KPARSER_NS_NODE_PARSE, &conf->ok_node_key);
+ if (node)
+ rcu_assign_pointer(parser->okay_node, &node->parse_node.node);
+
+ node = kparser_namespace_lookup(KPARSER_NS_NODE_PARSE, &conf->fail_node_key);
+ if (node)
+ rcu_assign_pointer(parser->fail_node, &node->parse_node.node);
+
+ node = kparser_namespace_lookup(KPARSER_NS_NODE_PARSE, &conf->atencap_node_key);
+ if (node)
+ rcu_assign_pointer(parser->atencap_node, &node->parse_node.node);
+
+ parser->cntrs_conf = cntrs_conf;
+
+ parser->config = conf->config;
+ return true;
+}
+
+/* create handler for object parser */
+int kparser_create_parser(const struct kparser_conf_cmd *conf,
+ size_t conf_len,
+ struct kparser_cmd_rsp_hdr **rsp,
+ size_t *rsp_len, const char *op,
+ void *extack, int *err)
+{
+ struct kparser_glue_glue_parse_node *parse_node;
+ struct kparser_glue_parser *kparser = NULL;
+ struct kparser_counters *cntrs = NULL;
+ const
struct kparser_conf_parser *arg; + struct kparser_parser parser = {}; + struct kparser_hkey key; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + arg = &conf->parser_conf; + + cntrs = kzalloc(sizeof(*cntrs), GFP_KERNEL); + if (!cntrs) { + (*rsp)->op_ret_code = ENOMEM; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: kzalloc() failed, size:%lu", + op, sizeof(*cntrs)); + goto done; + } + rcu_assign_pointer(parser.cntrs, cntrs); + parser.cntrs_len = sizeof(*cntrs); + parser.kparser_start_signature = KPARSERSTARTSIGNATURE; + parser.kparser_end_signature = KPARSERENDSIGNATURE; + if (!kparser_parser_convert(arg, &parser)) { + (*rsp)->op_ret_code = EINVAL; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: parser arg convert error", op); + goto done; + } + + if (!kparser_cmd_create_pre_process(op, conf, &arg->key, &key, + (void **)&kparser, sizeof(*kparser), *rsp, + offsetof(struct kparser_glue_parser, + glue), extack, err)) + goto done; + + kparser->parser = parser; + + INIT_LIST_HEAD(&kparser->glue.owner_list); + INIT_LIST_HEAD(&kparser->glue.owned_list); + + if (kparser->parser.root_node) { + parse_node = container_of(kparser->parser.root_node, + struct kparser_glue_glue_parse_node, + parse_node.node); + if (kparser_link_attach(kparser, + KPARSER_NS_PARSER, + (const void **)&kparser->parser.root_node, + &kparser->glue.refcount, + &kparser->glue.owner_list, + parse_node, + KPARSER_NS_NODE_PARSE, + &parse_node->glue.glue.refcount, + &parse_node->glue.glue.owned_list, + *rsp, op, extack, err) != 0) + goto done; + } + + if (kparser->parser.okay_node) { + parse_node = container_of(kparser->parser.okay_node, + struct kparser_glue_glue_parse_node, + parse_node.node); + if (kparser_link_attach(kparser, + KPARSER_NS_PARSER, + (const void **)&kparser->parser.okay_node, + &kparser->glue.refcount, + &kparser->glue.owner_list, + parse_node, + KPARSER_NS_NODE_PARSE, + &parse_node->glue.glue.refcount, + &parse_node->glue.glue.owned_list, + *rsp, op, 
extack, err) != 0) + goto done; + } + + if (kparser->parser.fail_node) { + parse_node = container_of(kparser->parser.fail_node, + struct kparser_glue_glue_parse_node, + parse_node.node); + if (kparser_link_attach(kparser, + KPARSER_NS_PARSER, + (const void **)&kparser->parser.fail_node, + &kparser->glue.refcount, + &kparser->glue.owner_list, + parse_node, + KPARSER_NS_NODE_PARSE, + &parse_node->glue.glue.refcount, + &parse_node->glue.glue.owned_list, + *rsp, op, extack, err) != 0) + goto done; + } + + if (kparser->glue.key.id >= KPARSER_PARSER_FAST_LOOKUP_RSVD_ID_START && + kparser->glue.key.id <= KPARSER_PARSER_FAST_LOOKUP_RSVD_ID_STOP) + rcu_assign_pointer(kparser_fast_lookup_array[kparser->glue.key.id], kparser); +done: + mutex_unlock(&kparser_config_lock); + + if ((*rsp)->op_ret_code != 0) { + kparser_free(kparser); + kparser_free(cntrs); + } + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_PARSER); +} + +static bool kparser_dump_protocol_table(const struct kparser_proto_table *obj, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, + void *extack, int *err); + +/* dump metadata list to netlink msg rsp */ +static bool kparser_dump_metadata_table(const struct kparser_metadata_table *obj, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, + void *extack, int *err) +{ + const struct kparser_glue_metadata_table *glue_obj; + struct kparser_cmd_rsp_hdr *new_rsp = NULL; + size_t new_rsp_len = 0; + void *realloced_mem; + void *ptr; + int rc; + + if (!obj) + return true; + + rc = alloc_first_rsp(&new_rsp, &new_rsp_len, KPARSER_NS_METALIST); + if (rc) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "alloc_first_rsp() failed, rc:%d\n", + rc); + return false; + } + + glue_obj = container_of(obj, struct kparser_glue_metadata_table, metadata_table); + + /* NOTE: TODO: kparser_config_lock should not be released and reacquired here. Fix later. 
*/ + mutex_unlock(&kparser_config_lock); + rc = kparser_read_metalist(&glue_obj->glue.key, + &new_rsp, &new_rsp_len, false, "read", + extack, err); + mutex_lock(&kparser_config_lock); + + if (rc != KPARSER_ATTR_RSP(KPARSER_NS_METALIST)) + goto error; + + realloced_mem = krealloc(*rsp, *rsp_len + new_rsp_len, GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) + goto error; + *rsp = realloced_mem; + + ptr = (*rsp); + ptr += (*rsp_len); + (*rsp_len) = (*rsp_len) + new_rsp_len; + memcpy(ptr, new_rsp, new_rsp_len); + kparser_free(new_rsp); + new_rsp = NULL; + + return true; +error: + kparser_free(new_rsp); + + return false; +} + +/* dump parse node to netlink msg rsp */ +static bool kparser_dump_parse_node(const struct kparser_parse_node *obj, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, + void *extack, int *err) +{ + const struct kparser_glue_glue_parse_node *glue_obj; + struct kparser_cmd_rsp_hdr *new_rsp = NULL; + size_t new_rsp_len = 0; + void *realloced_mem; + void *ptr; + int rc; + + if (!obj) + return true; + + rc = alloc_first_rsp(&new_rsp, &new_rsp_len, KPARSER_NS_NODE_PARSE); + if (rc) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "alloc_first_rsp() failed, rc:%d\n", rc); + return false; + } + + glue_obj = container_of(obj, struct kparser_glue_glue_parse_node, parse_node.node); + + /* NOTE: TODO: kparser_config_lock should not be released and reacquired here. Fix later. 
*/ + mutex_unlock(&kparser_config_lock); + rc = kparser_read_parse_node(&glue_obj->glue.glue.key, + &new_rsp, &new_rsp_len, false, "read", + extack, err); + mutex_lock(&kparser_config_lock); + + if (rc != KPARSER_ATTR_RSP(KPARSER_NS_NODE_PARSE)) + goto error; + + realloced_mem = krealloc(*rsp, *rsp_len + new_rsp_len, GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) + goto error; + *rsp = realloced_mem; + + ptr = (*rsp); + ptr += (*rsp_len); + (*rsp_len) = (*rsp_len) + new_rsp_len; + memcpy(ptr, new_rsp, new_rsp_len); + kparser_free(new_rsp); + new_rsp = NULL; + + if (!kparser_dump_protocol_table(obj->proto_table, rsp, rsp_len, extack, + err)) + goto error; + + if (!kparser_dump_metadata_table(obj->metadata_table, rsp, rsp_len, + extack, err)) + goto error; + + return true; +error: + kparser_free(new_rsp); + + return false; +} + +/* dump protocol table to netlink msg rsp */ +static bool kparser_dump_protocol_table(const struct kparser_proto_table *obj, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, + void *extack, int *err) +{ + const struct kparser_glue_protocol_table *glue_obj; + struct kparser_cmd_rsp_hdr *new_rsp = NULL; + size_t new_rsp_len = 0; + void *realloced_mem; + void *ptr; + int rc, i; + + if (!obj) + return true; + + rc = alloc_first_rsp(&new_rsp, &new_rsp_len, KPARSER_NS_PROTO_TABLE); + if (rc) { + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "alloc_first_rsp() failed, rc:%d\n", rc); + return false; + } + + glue_obj = container_of(obj, struct kparser_glue_protocol_table, + proto_table); + + /* NOTE: TODO: kparser_config_lock should not be released and reacquired here. Fix later. 
*/ + mutex_unlock(&kparser_config_lock); + rc = kparser_read_proto_table(&glue_obj->glue.key, + &new_rsp, &new_rsp_len, false, "read", + extack, err); + mutex_lock(&kparser_config_lock); + + if (rc != KPARSER_ATTR_RSP(KPARSER_NS_PROTO_TABLE)) + goto error; + + realloced_mem = krealloc(*rsp, *rsp_len + new_rsp_len, GFP_KERNEL | ___GFP_ZERO); + if (!realloced_mem) + goto error; + *rsp = realloced_mem; + + ptr = (*rsp); + ptr += (*rsp_len); + (*rsp_len) = (*rsp_len) + new_rsp_len; + memcpy(ptr, new_rsp, new_rsp_len); + kparser_free(new_rsp); + new_rsp = NULL; + + for (i = 0; i < glue_obj->proto_table.num_ents; i++) + if (!kparser_dump_parse_node(glue_obj->proto_table.entries[i].node, + rsp, rsp_len, extack, err)) + goto error; + + return true; +error: + kparser_free(new_rsp); + + return false; +} + +/* dump parser to netlink msg rsp */ +static bool kparser_dump_parser(const struct kparser_glue_parser *kparser, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, + void *extack, int *err) +{ + /* DEBUG code, if(0) avoids warning for both compiler and checkpatch */ + if (0) + kparser_dump_parser_tree(&kparser->parser); + + kparser_start_new_tree_traversal(); + + if (!kparser_dump_parse_node(kparser->parser.root_node, rsp, rsp_len, + extack, err)) + goto error; + + return true; +error: + return false; +} + +/* read handler for object parser */ +int kparser_read_parser(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_parser *kparser; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + kparser = kparser_namespace_lookup(KPARSER_NS_PARSER, key); + if (!kparser) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, key->name, key->id); + 
goto done; + } + + (*rsp)->key = kparser->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = kparser->glue.config.conf_keys_bv; + (*rsp)->object.parser_conf = kparser->glue.config.parser_conf; + + if (recursive_read && + kparser_dump_parser(kparser, rsp, rsp_len, extack, err) == false) + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "kparser_dump_parser failed"); + +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_PARSER); +} + +/* delete handler for object parser */ +int kparser_del_parser(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + struct kparser_glue_parser *kparser; + int rc; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + mutex_lock(&kparser_config_lock); + + kparser = kparser_namespace_lookup(KPARSER_NS_PARSER, key); + if (!kparser) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Object not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + if (kparser_link_detach(kparser, &kparser->glue.owner_list, + &kparser->glue.owned_list, *rsp, + extack, err) != 0) + goto done; + + rc = kparser_namespace_remove(KPARSER_NS_PARSER, + &kparser->glue.ht_node_id, &kparser->glue.ht_node_name); + if (rc) { + (*rsp)->op_ret_code = rc; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: namespace remove error, rc:%d", + op, rc); + goto done; + } + + (*rsp)->key = kparser->glue.key; + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, + "Key: {ID:%u Name:%s}\n", (*rsp)->key.id, (*rsp)->key.name); + (*rsp)->object.conf_keys_bv = kparser->glue.config.conf_keys_bv; + (*rsp)->object.parser_conf = kparser->glue.config.parser_conf; + + if 
(kparser->glue.key.id >= KPARSER_PARSER_FAST_LOOKUP_RSVD_ID_START && + kparser->glue.key.id <= KPARSER_PARSER_FAST_LOOKUP_RSVD_ID_STOP) + rcu_assign_pointer(kparser_fast_lookup_array[kparser->glue.key.id], NULL); + + kparser_free(kparser->parser.cntrs); + kparser_free(kparser); +done: + mutex_unlock(&kparser_config_lock); + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_PARSER); +} + +/* free handler for object parser */ +void kparser_free_parser(void *ptr, void *arg) +{ + /* TODO: */ +} + +/* handler for object parser lock */ +int kparser_parser_lock(const struct kparser_conf_cmd *conf, + size_t conf_len, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, const char *op, + void *extack, int *err) +{ + const struct kparser_parser *parser; + const struct kparser_hkey *key; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + mutex_lock(&kparser_config_lock); + + key = &conf->obj_key; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", key->id, key->name); + + parser = kparser_get_parser(key, false); + if (!parser) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: object not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + (*rsp)->key = *key; + (*rsp)->object.conf_keys_bv = conf->conf_keys_bv; + (*rsp)->object.obj_key = *key; +done: + mutex_unlock(&kparser_config_lock); + + synchronize_rcu(); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_OP_PARSER_LOCK_UNLOCK); +} + +/* handler for object parser unlock */ +int kparser_parser_unlock(const struct kparser_hkey *key, + struct kparser_cmd_rsp_hdr **rsp, + size_t *rsp_len, __u8 recursive_read, + const char *op, + void *extack, int *err) +{ + const struct kparser_glue_parser *kparser; + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: "); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "Key: {ID:%u Name:%s}\n", 
key->id, key->name); + + mutex_lock(&kparser_config_lock); + + kparser = kparser_namespace_lookup(KPARSER_NS_PARSER, key); + if (!kparser) { + (*rsp)->op_ret_code = ENOENT; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: object not found, key: {%s:%u}", + op, key->name, key->id); + goto done; + } + + if (!kparser_put_parser(&kparser->parser, false)) { + (*rsp)->op_ret_code = EINVAL; + NL_SET_ERR_MSG_FMT_MOD(extack, + "%s: Parser unlock failed", + op); + goto done; + } + + (*rsp)->key = *key; + (*rsp)->object.obj_key = *key; +done: + mutex_unlock(&kparser_config_lock); + + KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: "); + return KPARSER_ATTR_RSP(KPARSER_NS_OP_PARSER_LOCK_UNLOCK); +} diff --git a/net/kparser/kparser_condexpr.h b/net/kparser/kparser_condexpr.h new file mode 100644 index 000000000..247e8cb3f --- /dev/null +++ b/net/kparser/kparser_condexpr.h @@ -0,0 +1,52 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright (c) 2022, SiPanda Inc. + * + * kparser_condexpr.h - kParser conditionals helper and structures header file + * + * Authors: Tom Herbert + * Pratyush Kumar Khan + */ + +#ifndef __KPARSER_CONDEXPR_H__ +#define __KPARSER_CONDEXPR_H__ + +/* Definitions for parameterized conditional expressions */ + +#include "kparser_types.h" +#include "kparser_metaextract.h" + +/* Evaluate one conditional expression */ +static inline bool kparser_expr_evaluate(const struct kparser_condexpr_expr *expr, void *hdr) +{ + __u64 val; + + pr_debug("{%s:%d}: soff:%u len:%u mask:%x type:%d\n", + __func__, __LINE__, expr->src_off, expr->length, expr->mask, expr->type); + + __kparser_metadata_bytes_extract(hdr + expr->src_off, (__u8 *)&val, expr->length, false); + + val &= expr->mask; + + pr_debug("{%s:%d}: type:%d val:%llx expr->value:%u\n", + __func__, __LINE__, expr->type, val, expr->value); + + switch (expr->type) { + case KPARSER_CONDEXPR_TYPE_EQUAL: + return (val == expr->value); + case KPARSER_CONDEXPR_TYPE_NOTEQUAL: + return (val != expr->value); + case 
KPARSER_CONDEXPR_TYPE_LT: + return (val < expr->value); + case KPARSER_CONDEXPR_TYPE_LTE: + return (val <= expr->value); + case KPARSER_CONDEXPR_TYPE_GT: + return (val > expr->value); + case KPARSER_CONDEXPR_TYPE_GTE: + return (val >= expr->value); + default: + break; + } + + return false; +} +#endif /* __KPARSER_CONDEXPR_H__ */ diff --git a/net/kparser/kparser_datapath.c b/net/kparser/kparser_datapath.c new file mode 100644 index 000000000..556d3d055 --- /dev/null +++ b/net/kparser/kparser_datapath.c @@ -0,0 +1,1266 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (c) 2022, SiPanda Inc. + * + * kparser_datapath.c - kParser main datapath source file for parsing logic - data path + * + * Authors: Tom Herbert + * Pratyush Kumar Khan + */ + +#include +#include +#include + +#include "kparser.h" +#include "kparser_condexpr.h" +#include "kparser_metaextract.h" +#include "kparser_types.h" + +/* Lookup a type in a node table + * TODO: as of now, this table is an array, but in future, this needs to be + * converted to hash table for performance benefits + */ +static const struct kparser_parse_node *lookup_node(__u32 dflags, + int type, + const struct kparser_proto_table *table, + bool *isencap) +{ + struct kparser_proto_table_entry __rcu *entries; + __u32 tmp; + int i; + + if (!table) + return NULL; + + KPARSER_KMOD_DEBUG_PRINT(dflags, + "type:0x%04x ents:%d, types:[%x, %x]\n", + type, table->num_ents, ntohs(type), ntohl(type)); + + for (i = 0; i < table->num_ents; i++) { + entries = rcu_dereference(table->entries); + KPARSER_KMOD_DEBUG_PRINT(dflags, + "type:0x%x evalue:0x%x\n", + type, entries[i].value); + if (type == entries[i].value) { + *isencap = entries[i].encap; + return entries[i].node; + } else if (ntohs(type) == entries[i].value) { + // for 2 bytes + *isencap = entries[i].encap; + return entries[i].node; + } else if (ntohl(type) == entries[i].value) { + // for 4 bytes + *isencap = entries[i].encap; + return entries[i].node; + } + // for 3 bytes + tmp = 
ntohl(type);
+		tmp = tmp >> 8;
+		KPARSER_KMOD_DEBUG_PRINT(dflags, "tmp:%x", tmp);
+		if (tmp == entries[i].value) {
+			*isencap = entries[i].encap;
+			return entries[i].node;
+		}
+	}
+
+	return NULL;
+}
+
+/* Lookup a type in a node TLV table */
+static const struct kparser_parse_tlv_node
+*lookup_tlv_node(__u32 dflags, __u8 type,
+		 const struct kparser_proto_tlvs_table *table)
+{
+	int i;
+
+	KPARSER_KMOD_DEBUG_PRINT(dflags, "type:%d\n", type);
+
+	for (i = 0; i < table->num_ents; i++) {
+		KPARSER_KMOD_DEBUG_PRINT(dflags, "table_type:%d\n",
+					 table->entries[i].type);
+		if (type == table->entries[i].type)
+			return table->entries[i].node;
+	}
+
+	return NULL;
+}
+
+/* Lookup a flag-fields index in a protocol node flag-fields table
+ * TODO: This needs to be optimized later to use an array for better performance
+ */
+static const struct kparser_parse_flag_field_node
+*lookup_flag_field_node(__u32 dflags, __u32 flag,
+			const struct kparser_proto_flag_fields_table *table)
+{
+	int i;
+
+	for (i = 0; i < table->num_ents; i++) {
+		KPARSER_KMOD_DEBUG_PRINT(dflags,
+					 "flag:%x eflag[%d]:%x\n", flag, i,
+					 table->entries[i].flag);
+		if (flag == table->entries[i].flag)
+			return table->entries[i].node;
+	}
+
+	return NULL;
+}
+
+/* Metadata table processing */
+static int extract_metadata_table(__u32 dflags,
+				  const struct kparser_parser *parser,
+				  const struct kparser_metadata_table *metadata_table,
+				  const void *_hdr, size_t hdr_len, size_t hdr_offset,
+				  void *_metadata, void *_frame,
+				  const struct kparser_ctrl_data *ctrl)
+{
+	struct kparser_metadata_extract *entries;
+	int i, ret = KPARSER_OKAY;	/* initialized so an empty table returns OKAY */
+
+	KPARSER_KMOD_DEBUG_PRINT(dflags, "cnt:%d\n", metadata_table->num_ents);
+
+	for (i = 0; i < metadata_table->num_ents; i++) {
+		entries = rcu_dereference(metadata_table->entries);
+		ret = kparser_metadata_extract(parser, entries[i],
+					       _hdr, hdr_len, hdr_offset,
+					       _metadata, _frame, ctrl);
+		if (ret != KPARSER_OKAY)
+			break;
+	}
+	return ret;
+}
+
+/* evaluate next proto parameterized context */
+static int eval_parameterized_next_proto(__u32 dflags, + const struct kparser_parameterized_next_proto *pf, + void *_hdr) +{ + __u32 next_proto; + __u32 mask = pf->mask; + + _hdr += pf->src_off; + + switch (pf->size) { + case 1: + next_proto = *(__u8 *)_hdr; + if (pf->mask > 0xff) + mask = 0xff; + break; + case 2: + next_proto = *(__u16 *)_hdr; + if (pf->mask > 0xffff) + mask = 0xffff; + break; + case 3: + memcpy(&next_proto, _hdr, 3); + if (pf->mask > 0xffffff) + mask = 0xffffff; + break; + case 4: + next_proto = *(__u32 *)_hdr; + if (pf->mask > 0xffffffff) + mask = 0xffffffff; + break; + + default: + return KPARSER_STOP_UNKNOWN_PROTO; + } + + KPARSER_KMOD_DEBUG_PRINT(dflags, + "next_proto:%x mask:%x rs:%x pf->src_off:%u pf->size:%u", + next_proto, mask, + pf->right_shift, pf->src_off, pf->size); + + return (next_proto & mask) >> pf->right_shift; +} + +/* evaluate len parameterized context */ +static ssize_t eval_parameterized_len(const struct kparser_parameterized_len *pf, void *_hdr) +{ + __u32 len; + + _hdr += pf->src_off; + + switch (pf->size) { + case 1: + len = *(__u8 *)_hdr; + break; + case 2: + len = *(__u16 *)_hdr; + break; + case 3: + len = 0; + memcpy(&len, _hdr, 3); + break; /* TODO */ + case 4: + len = *(__u32 *)_hdr; + break; + default: + return KPARSER_STOP_LENGTH; + } + + len = (len & pf->mask) >> pf->right_shift; + + return (len * pf->multiplier) + pf->add_value; +} + +/* evaluate conditionals */ +static bool eval_cond_exprs_and_table(const struct kparser_condexpr_table *table, void *_hdr) +{ + int i; + + for (i = 0; i < table->num_ents; i++) + if (!kparser_expr_evaluate(table->entries[i], _hdr)) + return false; + + return true; +} + +/* evaluate table of conditionals */ +static bool eval_cond_exprs_or_table(const struct kparser_condexpr_table *table, void *_hdr) +{ + int i; + + for (i = 0; i < table->num_ents; i++) + if (kparser_expr_evaluate(table->entries[i], _hdr)) + return true; + + return false; +} + +/* evaluate list of table of 
conditionals */
+static int eval_cond_exprs(__u32 dflags,
+			   const struct kparser_condexpr_tables *tables,
+			   void *_hdr)
+{
+	bool res = false;	/* default for an unrecognized table type */
+	int i;
+
+	for (i = 0; i < tables->num_ents; i++) {
+		const struct kparser_condexpr_table *table = tables->entries[i];
+
+		KPARSER_KMOD_DEBUG_PRINT(dflags,
+					 "type:%d err:%d\n",
+					 table->type, table->default_fail);
+
+		switch (table->type) {
+		case KPARSER_CONDEXPR_TYPE_OR:
+			res = eval_cond_exprs_or_table(table, _hdr);
+			break;
+		case KPARSER_CONDEXPR_TYPE_AND:
+			res = eval_cond_exprs_and_table(table, _hdr);
+			break;
+		}
+		if (!res) {
+			KPARSER_KMOD_DEBUG_PRINT(dflags,
+						 "i:%d type:%d err:%d\n",
+						 i, table->type,
+						 table->default_fail);
+
+			return table->default_fail;
+		}
+	}
+
+	return KPARSER_OKAY;
+}
+
+/* process one tlv node */
+static int kparser_parse_one_tlv(__u32 dflags,
+				 const struct kparser_parser *parser,
+				 const struct kparser_parse_tlvs_node *parse_tlvs_node,
+				 const struct kparser_parse_tlv_node *parse_tlv_node,
+				 void *_obj_ref, void *_hdr,
+				 size_t tlv_len, size_t tlv_offset, void *_metadata,
+				 void *_frame, struct kparser_ctrl_data *ctrl)
+{
+	const struct kparser_parse_tlv_node *next_parse_tlv_node;
+	const struct kparser_metadata_table *metadata_table;
+	const struct kparser_proto_tlv_node *proto_tlv_node;
+	const struct kparser_proto_tlv_node_ops *proto_ops;
+	struct kparser_proto_tlvs_table *overlay_table;
+	int type, ret;
+
+parse_again:
+
+	proto_tlv_node = &parse_tlv_node->proto_tlv_node;
+
+	KPARSER_KMOD_DEBUG_PRINT(dflags,
+				 "kParser parsing TLV %s\n",
+				 parse_tlv_node->name);
+
+	KPARSER_KMOD_DEBUG_PRINT(dflags,
+				 "tlv_len:%lu min_len:%lu\n",
+				 tlv_len, proto_tlv_node->min_len);
+
+	if (tlv_len < proto_tlv_node->min_len || tlv_len > proto_tlv_node->max_len) {
+		/* Treat a length check error as an unrecognized TLV */
+		parse_tlv_node = rcu_dereference(parse_tlvs_node->tlv_wildcard_node);
+		if (parse_tlv_node)
+			goto parse_again;
+		else
+			return parse_tlvs_node->unknown_tlv_type_ret;
+	}
+
+	proto_ops = 
&proto_tlv_node->ops; + + KPARSER_KMOD_DEBUG_PRINT(dflags, "cond_exprs_parameterized:%d\n", + proto_ops->cond_exprs_parameterized); + + if (proto_ops->cond_exprs_parameterized) { + ret = eval_cond_exprs(dflags, + &proto_ops->cond_exprs, _hdr); + if (ret != KPARSER_OKAY) + return ret; + } + + metadata_table = rcu_dereference(parse_tlv_node->metadata_table); + if (metadata_table) { + ret = extract_metadata_table(dflags, + parser, + metadata_table, + _hdr, tlv_len, tlv_offset, + _metadata, + _frame, ctrl); + if (ret != KPARSER_OKAY) + return ret; + } + + overlay_table = rcu_dereference(parse_tlv_node->overlay_table); + if (!overlay_table) + return KPARSER_OKAY; + + /* We have an TLV overlay node */ + if (proto_ops && proto_ops->overlay_type_parameterized) + type = eval_parameterized_next_proto(dflags, + &proto_ops->pfoverlay_type, + _hdr); + else + type = tlv_len; + + if (type < 0) + return type; + + /* Get TLV node */ + next_parse_tlv_node = lookup_tlv_node(dflags, type, overlay_table); + if (next_parse_tlv_node) { + parse_tlv_node = next_parse_tlv_node; + goto parse_again; + } + + /* Unknown TLV overlay node */ + next_parse_tlv_node = rcu_dereference(parse_tlv_node->overlay_wildcard_node); + if (next_parse_tlv_node) { + parse_tlv_node = next_parse_tlv_node; + goto parse_again; + } + + return parse_tlv_node->unknown_overlay_ret; +} + +/* tlv loop limit validator */ +static int loop_limit_exceeded(int ret, unsigned int disp) +{ + switch (disp) { + case KPARSER_LOOP_DISP_STOP_OKAY: + return KPARSER_STOP_OKAY; + case KPARSER_LOOP_DISP_STOP_NODE_OKAY: + return KPARSER_STOP_NODE_OKAY; + case KPARSER_LOOP_DISP_STOP_SUB_NODE_OKAY: + return KPARSER_STOP_SUB_NODE_OKAY; + case KPARSER_LOOP_DISP_STOP_FAIL: + default: + return ret; + } +} + +/* process packet value using parameters provided */ +static __u64 eval_get_value(const struct kparser_parameterized_get_value *pf, void *_hdr) +{ + __u64 ret; + + (void)__kparser_metadata_bytes_extract(_hdr + pf->src_off, (__u8 *)&ret, 
pf->size, false); + + return ret; +} + +/* process and parse multiple tlvs */ +static int kparser_parse_tlvs(__u32 dflags, + const struct kparser_parser *parser, + const struct kparser_parse_node *parse_node, + void *_obj_ref, + void *_hdr, size_t hdr_len, size_t hdr_offset, + void *_metadata, void *_frame, + const struct kparser_ctrl_data *ctrl) +{ + unsigned int loop_cnt = 0, non_pad_cnt = 0, pad_len = 0; + const struct kparser_proto_tlvs_table *tlv_proto_table; + const struct kparser_parse_tlvs_node *parse_tlvs_node; + const struct kparser_proto_tlvs_node *proto_tlvs_node; + const struct kparser_parse_tlv_node *parse_tlv_node; + struct kparser_ctrl_data tlv_ctrl = {}; + unsigned int consec_pad = 0; + size_t len, tlv_offset; + ssize_t off, tlv_len; + __u8 *cp = _hdr; + int type = -1, ret; + + parse_tlvs_node = (struct kparser_parse_tlvs_node *)parse_node; + proto_tlvs_node = (struct kparser_proto_tlvs_node *)&parse_node->tlvs_proto_node; + + KPARSER_KMOD_DEBUG_PRINT(dflags, + "fixed_start_offset:%d start_offset:%lu\n", + proto_tlvs_node->fixed_start_offset, + proto_tlvs_node->start_offset); + + /* Assume hlen marks end of TLVs */ + if (proto_tlvs_node->fixed_start_offset) + off = proto_tlvs_node->start_offset; + else + off = eval_parameterized_len(&proto_tlvs_node->ops.pfstart_offset, cp); + + KPARSER_KMOD_DEBUG_PRINT(dflags, "off:%ld\n", off); + + if (off < 0) + return KPARSER_STOP_LENGTH; + + /* We assume start offset is less than or equal to minimal length */ + len = hdr_len - off; + + cp += off; + tlv_offset = hdr_offset + off; + + KPARSER_KMOD_DEBUG_PRINT(dflags, "len:%ld tlv_offset:%ld\n", + len, tlv_offset); + + /* This is the main TLV processing loop */ + while (len > 0) { + if (++loop_cnt > parse_tlvs_node->config.max_loop) + return loop_limit_exceeded(KPARSER_STOP_LOOP_CNT, + parse_tlvs_node->config.disp_limit_exceed); + + if (proto_tlvs_node->pad1_enable && + *cp == proto_tlvs_node->pad1_val) { + /* One byte padding, just advance */ + cp++; + 
tlv_offset++; + len--; + if (++pad_len > parse_tlvs_node->config.max_plen || + ++consec_pad > parse_tlvs_node->config.max_c_pad) + return loop_limit_exceeded(KPARSER_STOP_TLV_PADDING, + parse_tlvs_node-> + config.disp_limit_exceed); + continue; + } + + if (proto_tlvs_node->eol_enable && + *cp == proto_tlvs_node->eol_val) { + cp++; + tlv_offset++; + len--; + + /* Hit EOL, we're done */ + break; + } + + if (len < proto_tlvs_node->min_len) { + /* Length error */ + return loop_limit_exceeded(KPARSER_STOP_TLV_LENGTH, + parse_tlvs_node->config.disp_limit_exceed); + } + + /* If the len function is not set this degenerates to an + * array of fixed sized values (which maybe be useful in + * itself now that I think about it) + */ + do { + KPARSER_KMOD_DEBUG_PRINT(dflags, + "len_parameterized:%d min_len:%lu\n", + proto_tlvs_node->ops.len_parameterized, + proto_tlvs_node->min_len); + if (proto_tlvs_node->ops.len_parameterized) { + tlv_len = eval_parameterized_len(&proto_tlvs_node->ops.pflen, cp); + } else { + tlv_len = proto_tlvs_node->min_len; + break; + } + + KPARSER_KMOD_DEBUG_PRINT(dflags, + "tlv_len:%lu\n", tlv_len); + + if (!tlv_len || len < tlv_len) + return loop_limit_exceeded(KPARSER_STOP_TLV_LENGTH, + parse_tlvs_node->config. + disp_limit_exceed); + + if (tlv_len < proto_tlvs_node->min_len) + return loop_limit_exceeded(KPARSER_STOP_TLV_LENGTH, + parse_tlvs_node->config. + disp_limit_exceed); + } while (0); + + type = eval_parameterized_next_proto(dflags, + &proto_tlvs_node->ops.pftype, + cp); + + KPARSER_KMOD_DEBUG_PRINT(dflags, "type:%d\n", type); + + if (proto_tlvs_node->padn_enable && + type == proto_tlvs_node->padn_val) { + /* N byte padding, just advance */ + pad_len += tlv_len; + if (pad_len > parse_tlvs_node->config.max_plen || + ++consec_pad > parse_tlvs_node->config.max_c_pad) + return loop_limit_exceeded(KPARSER_STOP_TLV_PADDING, + parse_tlvs_node->config. 
+							   disp_limit_exceed);
+			goto next_tlv;
+		}
+
+		/* Get TLV node */
+		tlv_proto_table = rcu_dereference(parse_tlvs_node->tlv_proto_table);
+		if (tlv_proto_table)
+			parse_tlv_node = lookup_tlv_node(dflags, type, tlv_proto_table);
+parse_one_tlv:
+		if (parse_tlv_node) {
+			const struct kparser_proto_tlv_node *proto_tlv_node =
+				&parse_tlv_node->proto_tlv_node;
+
+			if (proto_tlv_node) {
+				if (proto_tlv_node->is_padding) {
+					pad_len += tlv_len;
+					if (pad_len > parse_tlvs_node->config.max_plen ||
+					    ++consec_pad > parse_tlvs_node->config.max_c_pad)
+						return loop_limit_exceeded(KPARSER_STOP_TLV_PADDING,
+									   parse_tlvs_node->config.
+									   disp_limit_exceed);
+				} else if (++non_pad_cnt > parse_tlvs_node->config.max_non) {
+					return loop_limit_exceeded(KPARSER_STOP_OPTION_LIMIT,
+								   parse_tlvs_node->
+								   config.disp_limit_exceed);
+				}
+			}
+
+			ret = kparser_parse_one_tlv(dflags, parser,
+						    parse_tlvs_node,
+						    parse_tlv_node,
+						    _obj_ref, cp, tlv_len,
+						    tlv_offset, _metadata,
+						    _frame, &tlv_ctrl);
+			if (ret != KPARSER_OKAY)
+				return ret;
+		} else {
+			/* Unknown TLV */
+			parse_tlv_node = rcu_dereference(parse_tlvs_node->tlv_wildcard_node);
+			if (parse_tlv_node) {
+				/* If a wildcard node is present, parse that
+				 * node as an overlay to this one. The
+				 * wildcard node can perform error processing
+				 */
+				goto parse_one_tlv;
+			}
+			/* Return default error code.
Returning + * KPARSER_OKAY means skip + */ + if (parse_tlvs_node->unknown_tlv_type_ret != KPARSER_OKAY) + return parse_tlvs_node->unknown_tlv_type_ret; + } + + /* Move over current header */ +next_tlv: + cp += tlv_len; + tlv_offset += tlv_len; + len -= tlv_len; + } + + return KPARSER_OKAY; +} + +/* process and parse flag fields */ +static ssize_t kparser_parse_flag_fields(__u32 dflags, + const struct kparser_parser *parser, + const struct kparser_parse_node *parse_node, + void *_obj_ref, + void *_hdr, size_t hdr_len, + size_t hdr_offset, void *_metadata, + void *_frame, + const struct kparser_ctrl_data *ctrl, + size_t parse_len) +{ + const struct kparser_parse_flag_fields_node *parse_flag_fields_node; + const struct kparser_proto_flag_fields_node *proto_flag_fields_node; + const struct kparser_parse_flag_field_node *parse_flag_field_node; + const struct kparser_metadata_table *metadata_table; + ssize_t off = -1, field_len, field_offset, res = 0; + const struct kparser_flag_fields *flag_fields; + __u32 flags = 0; + int i, ret; + + parse_flag_fields_node = (struct kparser_parse_flag_fields_node *)parse_node; + proto_flag_fields_node = (struct kparser_proto_flag_fields_node *)&parse_node->proto_node; + + flag_fields = rcu_dereference(proto_flag_fields_node->flag_fields); + if (!flag_fields) + return KPARSER_OKAY; + + if (proto_flag_fields_node->ops.get_flags_parameterized) + flags = eval_get_value(&proto_flag_fields_node->ops.pfget_flags, _hdr); + + /* Position at start of field data */ + if (proto_flag_fields_node->ops.flag_fields_len) + off = proto_flag_fields_node->ops.hdr_length; + else if (proto_flag_fields_node->ops.start_fields_offset_parameterized) + off = eval_parameterized_len(&proto_flag_fields_node->ops.pfstart_fields_offset, + _hdr); + else + return KPARSER_STOP_LENGTH; + + if (off < 0) + return off; + + if (hdr_offset + off > parse_len) + return KPARSER_STOP_LENGTH; + _hdr += off; + hdr_offset += off; + + KPARSER_KMOD_DEBUG_PRINT(dflags, + 
"flag_fields->num_idx:%lu\n", + flag_fields->num_idx); + + for (i = 0; i < flag_fields->num_idx; i++) { + off = kparser_flag_fields_offset(i, flags, flag_fields); + KPARSER_KMOD_DEBUG_PRINT(dflags, + "off:%ld pflag:%x flag:%x\n", + off, flags, flag_fields->fields[i].flag); + if (off < 0) + continue; + + if (hdr_offset + flag_fields->fields[i].size > parse_len) + return KPARSER_STOP_LENGTH; + + res += flag_fields->fields[i].size; + + /* Flag field is present, try to find in the parse node + * table based on index in proto flag-fields + */ + parse_flag_field_node = + lookup_flag_field_node(dflags, + flag_fields->fields[i].flag, + parse_flag_fields_node->flag_fields_proto_table); + if (parse_flag_field_node) { + const struct kparser_parse_flag_field_node_ops + *ops = &parse_flag_field_node->ops; + __u8 *cp = _hdr + off; + + field_len = flag_fields->fields[i].size; + field_offset = hdr_offset + off; + + if (field_offset > parse_len) + return KPARSER_STOP_LENGTH; + + KPARSER_KMOD_DEBUG_PRINT(dflags, + "kParser parsing flag-field %s\n", + parse_flag_field_node->name); + + if (eval_cond_exprs(dflags, &ops->cond_exprs, cp) < 0) + return KPARSER_STOP_COMPARE; + + metadata_table = rcu_dereference(parse_flag_field_node->metadata_table); + if (metadata_table) { + ret = extract_metadata_table(dflags, + parser, + parse_flag_field_node->metadata_table, + cp, field_len, field_offset, _metadata, + _frame, ctrl); + if (ret != KPARSER_OKAY) + return ret; + } + } + } + + return res; +} + +/* process ok/fail/atencap nodes */ +static int __kparser_run_exit_node(__u32 dflags, + const struct kparser_parser *parser, + const struct kparser_parse_node *parse_node, + void *_obj_ref, void *_hdr, + size_t hdr_offset, ssize_t hdr_len, + void *_metadata, void *_frame, + struct kparser_ctrl_data *ctrl) +{ + const struct kparser_metadata_table *metadata_table; + int ret; + + KPARSER_KMOD_DEBUG_PRINT(dflags, + "exit node:%s\n", parse_node->name); + /* Run an exit parse node. 
This is an okay_node, fail_node, or
+	 * atencap_node
+	 */
+	metadata_table = rcu_dereference(parse_node->metadata_table);
+	if (metadata_table) {
+		ret = extract_metadata_table(dflags,
+					     parser, metadata_table, _hdr,
+					     hdr_len, hdr_offset, _metadata,
+					     _frame, ctrl);
+		if (ret != KPARSER_OKAY)
+			return ret;
+	}
+
+	return KPARSER_OKAY;
+}
+
+/* __kparser_parse(): Function to parse a void * packet buffer using a parser instance key.
+ *
+ * parser: non-NULL opaque pointer, returned and cached by kparser_get_parser(),
+ *         referencing a valid parser instance.
+ * _hdr: input packet buffer
+ * parse_len: length of input packet buffer
+ * _metadata: user provided metadata buffer. It must match the metadata
+ *            objects configured via the CLI.
+ * metadata_len: total length of the user provided metadata buffer.
+ *
+ * return: kParser error code as defined in include/uapi/linux/kparser.h
+ *
+ * The RCU read lock must be held before calling this function.
+ */
+int ___kparser_parse(const void *obj, void *_hdr, size_t parse_len,
+		     struct sk_buff *skb, void *_metadata, size_t metadata_len)
+{
+	return 0;
+}
+
+int __kparser_parse(const void *obj, void *_hdr, size_t parse_len,
+		    void *_metadata, size_t metadata_len)
+{
+	const struct kparser_parse_node *next_parse_node, *atencap_node;
+	const struct kparser_parse_node *parse_node, *wildcard_node;
+	struct kparser_ctrl_data ctrl = { .ret = KPARSER_OKAY };
+	const struct kparser_metadata_table *metadata_table;
+	const struct kparser_proto_table *proto_table;
+	const struct kparser_proto_node *proto_node;
+	const struct kparser_parser *parser = obj;
+	int type = -1, i, ret, framescnt = 0;
+	struct kparser_counters *cntrs;
+	void *_frame, *_obj_ref = NULL;
+	const void *base_hdr = _hdr;
+	ssize_t hdr_offset = 0;
+	ssize_t hdr_len, res;
+	__u32 frame_num = 0;
+	__u32 dflags = 0;
+	bool currencap;
+
+	if (parser && parser->config.max_encaps > framescnt)
+		framescnt = parser->config.max_encaps;
+
+	if (!parser || !_metadata || metadata_len == 0 || !_hdr
 || parse_len == 0 ||
+	    (((framescnt * parser->config.frame_size) +
+	      parser->config.metameta_size) > metadata_len)) {
+		KPARSER_KMOD_DEBUG_PRINT(dflags,
+					 "one or more empty/invalid param(s)\n");
+		return -EINVAL;
+	}
+
+	if (parser->kparser_start_signature != KPARSERSTARTSIGNATURE ||
+	    parser->kparser_end_signature != KPARSERENDSIGNATURE) {
+		KPARSER_KMOD_DEBUG_PRINT(dflags,
+					 "%s:corrupted kparser signature:start:0x%02x, end:0x%02x\n",
+					 __func__, parser->kparser_start_signature,
+					 parser->kparser_end_signature);
+		return -EINVAL;
+	}
+
+	if (parse_len < parser->config.metameta_size) {
+		KPARSER_KMOD_DEBUG_PRINT(dflags,
+					 "parse buf err, parse_len:%lu, mmd_len:%lu\n",
+					 parse_len, parser->config.metameta_size);
+		return -EINVAL;
+	}
+
+	_frame = _metadata + parser->config.metameta_size;
+	dflags = parser->config.flags;
+
+	if (dflags & KPARSER_F_DEBUG_DATAPATH) {
+		/* This code is required for regression tests also */
+		pr_alert("kParserdump:len:%lu\n", parse_len);
+		print_hex_dump_bytes("kParserdump:rcvd_pkt:",
+				     DUMP_PREFIX_OFFSET, _hdr, parse_len);
+	}
+
+	ctrl.hdr_base = _hdr;
+	ctrl.node_cnt = 0;
+	ctrl.encap_levels = 0;
+
+	cntrs = rcu_dereference(parser->cntrs);
+	if (cntrs) {
+		/* Initialize parser counters */
+		memset(cntrs, 0, parser->cntrs_len);
+	}
+
+	parse_node = rcu_dereference(parser->root_node);
+	if (!parse_node) {
+		KPARSER_KMOD_DEBUG_PRINT(dflags,
+					 "root node missing, parser:%s\n",
+					 parser->name);
+		return -ENOENT;
+	}
+
+	/* Main parsing loop. The loop normally terminates when we encounter a
+	 * leaf protocol node, an error condition, hit the limit on layers of
+	 * encapsulation, meet a protocol condition to stop (i.e. flags that
+	 * indicate to stop at a flow label or on hitting a fragment), or get
+	 * an unknown protocol result in the table lookup for the next node.
+ */ + do { + KPARSER_KMOD_DEBUG_PRINT(dflags, + "Parsing node:%s\n", + parse_node->name); + currencap = false; + proto_node = &parse_node->proto_node; + hdr_len = proto_node->min_len; + + if (++ctrl.node_cnt > parser->config.max_nodes) { + ctrl.ret = KPARSER_STOP_MAX_NODES; + goto parser_out; + } + /* Protocol node length checks */ + KPARSER_KMOD_DEBUG_PRINT(dflags, + "kParser parsing %s\n", + parse_node->name); + /* when SKB is passed, if parse_len < hdr_len, then + * try to do skb_pullup(hdr_len) here. reset parse_len based on + * new parse_len, reset data ptr. Do this inside this loop. + */ + if (parse_len < hdr_len) { + ctrl.ret = KPARSER_STOP_LENGTH; + goto parser_out; + } + + do { + if (!proto_node->ops.len_parameterized) + break; + + hdr_len = eval_parameterized_len(&proto_node->ops.pflen, _hdr); + + KPARSER_KMOD_DEBUG_PRINT(dflags, + "eval_hdr_len:%ld min_len:%lu\n", + hdr_len, proto_node->min_len); + + if (hdr_len < proto_node->min_len) { + ctrl.ret = hdr_len < 0 ? hdr_len : KPARSER_STOP_LENGTH; + goto parser_out; + } + if (parse_len < hdr_len) { + ctrl.ret = KPARSER_STOP_LENGTH; + goto parser_out; + } + } while (0); + + hdr_offset = _hdr - base_hdr; + ctrl.pkt_len = parse_len; + + /* Callback processing order + * 1) Extract Metadata + * 2) Process TLVs + * 2.a) Extract metadata from TLVs + * 2.b) Process TLVs + * 3) Process protocol + */ + + metadata_table = rcu_dereference(parse_node->metadata_table); + /* Extract metadata, per node processing */ + if (metadata_table) { + ctrl.ret = extract_metadata_table(dflags, + parser, + metadata_table, + _hdr, hdr_len, hdr_offset, + _metadata, _frame, &ctrl); + if (ctrl.ret != KPARSER_OKAY) + goto parser_out; + } + + /* Process node type */ + switch (parse_node->node_type) { + case KPARSER_NODE_TYPE_PLAIN: + default: + break; + case KPARSER_NODE_TYPE_TLVS: + /* Process TLV nodes */ + ctrl.ret = kparser_parse_tlvs(dflags, parser, + parse_node, + _obj_ref, _hdr, hdr_len, + hdr_offset, _metadata, + _frame, &ctrl); 
+check_processing_return:
+			switch (ctrl.ret) {
+			case KPARSER_STOP_OKAY:
+				goto parser_out;
+			case KPARSER_OKAY:
+				break; /* Go to the next node */
+			case KPARSER_STOP_NODE_OKAY:
+				/* Note KPARSER_STOP_NODE_OKAY means that
+				 * post loop processing is not
+				 * performed
+				 */
+				ctrl.ret = KPARSER_OKAY;
+				goto after_post_processing;
+			case KPARSER_STOP_SUB_NODE_OKAY:
+				ctrl.ret = KPARSER_OKAY;
+				break; /* Just go to next node */
+			default:
+				goto parser_out;
+			}
+			break;
+		case KPARSER_NODE_TYPE_FLAG_FIELDS:
+			/* Process flag-fields */
+			res = kparser_parse_flag_fields(dflags, parser,
+							parse_node,
+							_obj_ref,
+							_hdr, hdr_len,
+							hdr_offset,
+							_metadata,
+							_frame,
+							&ctrl, parse_len);
+			if (res < 0) {
+				ctrl.ret = res;
+				goto check_processing_return;
+			}
+			hdr_len += res;
+		}
+
+after_post_processing:
+		/* Proceed to next protocol layer */
+
+		proto_table = rcu_dereference(parse_node->proto_table);
+		wildcard_node = rcu_dereference(parse_node->wildcard_node);
+		if (!proto_table && !wildcard_node) {
+			/* Leaf parse node */
+			KPARSER_KMOD_DEBUG_PRINT(dflags, "Leaf node");
+			goto parser_out;
+		}
+
+		if (proto_table) {
+			do {
+				if (proto_node->ops.cond_exprs_parameterized) {
+					ctrl.ret =
+						eval_cond_exprs(dflags,
+								&proto_node->ops.cond_exprs,
+								_hdr);
+					if (ctrl.ret != KPARSER_OKAY)
+						goto parser_out;
+				}
+
+				if (!proto_table)
+					break;
+				type =
+					eval_parameterized_next_proto(dflags,
+								      &proto_node->ops.pfnext_proto,
+								      _hdr);
+				KPARSER_KMOD_DEBUG_PRINT(dflags,
+							 "nxt_proto key:%x\n",
+							 type);
+				if (type < 0) {
+					ctrl.ret = type;
+					goto parser_out;
+				}
+
+				/* Get next node */
+				next_parse_node = lookup_node(dflags,
+							      type,
+							      proto_table,
+							      &currencap);
+
+				if (next_parse_node)
+					goto found_next;
+			} while (0);
+		}
+
+		/* Try wildcard node. Either table lookup failed to find a
+		 * node or there is only a wildcard
+		 */
+		if (wildcard_node) {
+			/* Perform default processing in a wildcard node */
+			next_parse_node = wildcard_node;
+		} else {
+			/* Return default code.
Parsing will stop
+			 * with the indicated code
+			 */
+			ctrl.ret = parse_node->unknown_ret;
+			goto parser_out;
+		}
+
+found_next:
+		/* Found next protocol node, set up to process */
+		if (!proto_node->overlay) {
+			/* Move over current header */
+			_hdr += hdr_len;
+			parse_len -= hdr_len;
+		}
+
+		parse_node = next_parse_node;
+		if (currencap || proto_node->encap) {
+			/* Check if there is an atencap_node configured for
+			 * the parser
+			 */
+			atencap_node = rcu_dereference(parser->atencap_node);
+			if (atencap_node) {
+				ret = __kparser_run_exit_node(dflags,
+							      parser,
+							      atencap_node,
+							      _obj_ref,
+							      _hdr, hdr_offset,
+							      hdr_len,
+							      _metadata, _frame,
+							      &ctrl);
+				if (ret != KPARSER_OKAY)
+					goto parser_out;
+			}
+
+			/* New encapsulation layer. Check against
+			 * number of encap layers allowed and also
+			 * if we need a new metadata frame.
+			 */
+			if (++ctrl.encap_levels > parser->config.max_encaps) {
+				ctrl.ret = KPARSER_STOP_ENCAP_DEPTH;
+				goto parser_out;
+			}
+
+			if (frame_num < parser->config.max_frames) {
+				_frame += parser->config.frame_size;
+				frame_num++;
+			}
+
+			/* Check if parser has counters that need to be reset
+			 * at encap
+			 */
+			if (parser->cntrs)
+				for (i = 0; i < KPARSER_CNTR_NUM_CNTRS; i++)
+					if (parser->cntrs_conf.cntrs[i].reset_on_encap)
+						cntrs->cntr[i] = 0;
+		}
+
+	} while (1);
+
+parser_out:
+	/* Convert KPARSER_OKAY to KPARSER_STOP_OKAY if the parser is exiting
+	 * normally. This means that okay_node will see KPARSER_STOP_OKAY in
+	 * ctrl.ret
+	 */
+	ctrl.ret = ctrl.ret == KPARSER_OKAY ? KPARSER_STOP_OKAY : ctrl.ret;
+
+	parse_node = (ctrl.ret == KPARSER_OKAY || KPARSER_IS_OK_CODE(ctrl.ret)) ?
+ rcu_dereference(parser->okay_node) : rcu_dereference(parser->fail_node); + + if (!parse_node) { + if (dflags & KPARSER_F_DEBUG_DATAPATH) { + /* This code is required for regression tests also */ + pr_alert("kParserdump:metadata_len:%lu\n", metadata_len); + print_hex_dump_bytes("kParserdump:md:", + DUMP_PREFIX_OFFSET, + _metadata, metadata_len); + } + return ctrl.ret; + } + + /* Run an exit parse node. This is either the okay node or the fail + * node that is set in parser config + */ + ret = __kparser_run_exit_node(dflags, parser, parse_node, _obj_ref, + _hdr, hdr_offset, hdr_len, + _metadata, _frame, &ctrl); + if (ret != KPARSER_OKAY) + ctrl.ret = (ctrl.ret == KPARSER_STOP_OKAY) ? ret : ctrl.ret; + + if (dflags & KPARSER_F_DEBUG_DATAPATH) { + /* This code is required for regression tests also */ + pr_alert("kParserdump:metadata_len:%lu\n", metadata_len); + print_hex_dump_bytes("kParserdump:md:", DUMP_PREFIX_OFFSET, + _metadata, metadata_len); + } + + return ctrl.ret; +} +EXPORT_SYMBOL(__kparser_parse); + +static inline void * +kparser_get_parser_ctx(const struct kparser_hkey *kparser_key) +{ + void *ptr, *parser; + + if (!kparser_key) + return NULL; + + if (kparser_key->id >= KPARSER_PARSER_FAST_LOOKUP_RSVD_ID_START && + kparser_key->id <= KPARSER_PARSER_FAST_LOOKUP_RSVD_ID_STOP) { + rcu_read_lock(); + ptr = kparser_fast_lookup_array[kparser_key->id]; + rcu_read_unlock(); + } else { + ptr = kparser_namespace_lookup(KPARSER_NS_PARSER, kparser_key); + } + + parser = rcu_dereference(ptr); + + return parser; +} + +/* kparser_parse(): Function to parse a skb using a parser instance key. + * + * skb: input packet skb + * kparser_key: key of the associated kParser parser object which must be + * already created via CLI. + * _metadata: User provided metadata buffer. It must be same as configured + * metadata objects in CLI. + * metadata_len: Total length of the user provided metadata buffer. 
+ * avoid_ref: Set this flag in case caller wants to avoid holding the reference + * of the active parser object to save performance on the data path. + * But please be advised, caller should hold the reference of the + * parser object while using this data path. In this case, the CLI + * can be used in advance to get the reference, and caller will also + * need to release the reference via CLI once it is done with the + * data path. + * + * return: kParser error code as defined in include/uapi/linux/kparser.h + */ +int kparser_parse(struct sk_buff *skb, + const struct kparser_hkey *kparser_key, + void *_metadata, size_t metadata_len, bool avoid_ref) +{ + struct kparser_glue_parser *k_prsr; + struct kparser_parser *parser; + void *data, *ptr; + size_t pktlen; + int err; + __u32 dflags = 0; + + data = skb_mac_header(skb); + pktlen = skb_mac_header_len(skb) + skb->len; + if (pktlen > KPARSER_MAX_SKB_PACKET_LEN) { + skb_pull(skb, KPARSER_MAX_SKB_PACKET_LEN); + data = skb_mac_header(skb); + pktlen = skb_mac_header_len(skb) + skb->len; + } + + err = skb_linearize(skb); + if (err < 0) + return err; + WARN_ON(skb->data_len); + + /* TODO: do this pullup inside the loop of ___kparser_parse(), when + * parse_len < hdr_len + * if (pktlen > KPARSER_MAX_SKB_PACKET_LEN) { + * skb_pull(skb, KPARSER_MAX_SKB_PACKET_LEN); + * data = skb_mac_header(skb); + * pktlen = skb_mac_header_len(skb) + skb->len; + * } + * err = skb_linearize(skb); + * if (err < 0) + * return err; + * WARN_ON(skb->data_len); + * ___kparser_parse(parser, skb, _metadata, metadata_len); + */ + k_prsr = kparser_get_parser_ctx(kparser_key); + if (!k_prsr) { + if (kparser_key) + KPARSER_KMOD_DEBUG_PRINT(dflags, "parser {%s:%u} is not found\n", + kparser_key->name, kparser_key->id); + return -EINVAL; + } + + rcu_read_lock(); + + if (likely(!avoid_ref)) + kparser_ref_get(&k_prsr->glue.refcount); + parser = &k_prsr->parser; + + ptr = kparser_namespace_lookup(KPARSER_NS_PARSER, kparser_key); + k_prsr = 
rcu_dereference(ptr);
+	if (!k_prsr) {
+		KPARSER_KMOD_DEBUG_PRINT(dflags,
+					 "parser htbl lookup failure for key:{%s:%u}\n",
+					 kparser_key->name, kparser_key->id);
+		rcu_read_unlock();
+		return -ENOENT;
+	}
+	parser = &k_prsr->parser;
+
+	err = __kparser_parse(parser, data, pktlen, _metadata, metadata_len);
+
+	rcu_read_unlock();
+
+	if (likely(!avoid_ref))
+		kparser_ref_put(&k_prsr->glue.refcount);
+
+	return err;
+}
+EXPORT_SYMBOL(kparser_parse);
+
+/* kparser_get_parser(): Get an opaque reference to a parser instance,
+ * identified by a key, and mark the associated parser and its whole parse
+ * tree immutable so that they can not be deleted while in active use.
+ *
+ * kparser_key: key of the associated kParser parser object which must be
+ *		already created via CLI.
+ * avoid_ref: Set this flag in case the caller wants to avoid holding the
+ *	      reference of the active parser object to save performance on the
+ *	      data path. But please be advised, the caller should hold a
+ *	      reference to the parser object while using this data path. In
+ *	      that case, the CLI can be used in advance to get the reference,
+ *	      and the caller will also need to release the reference via CLI
+ *	      once done with the data path.
+ *
+ * return: NULL if the key is not found, else an opaque parser instance
+ *	   pointer which can be used in the APIs that take an opaque parser
+ *	   pointer, e.g. kparser_put_parser().
+ *
+ * NOTE: This call makes the whole parser tree immutable. If the caller calls
+ * this more than once, it will later need to release the same parser exactly
+ * that many times using the API kparser_put_parser().
+ */
+const void *kparser_get_parser(const struct kparser_hkey *kparser_key,
+			       bool avoid_ref)
+{
+	struct kparser_glue_parser *k_prsr;
+
+	k_prsr = kparser_get_parser_ctx(kparser_key);
+	if (!k_prsr)
+		return NULL;
+
+	if (likely(!avoid_ref))
+		kparser_ref_get(&k_prsr->glue.refcount);
+
+	return &k_prsr->parser;
+}
+EXPORT_SYMBOL(kparser_get_parser);
+
+/* kparser_put_parser(): Release and undo the read-only operation done
+ * previously by kparser_get_parser(). The parser instance is identified by a
+ * previously obtained opaque parser pointer via the API kparser_get_parser().
+ * This undoes the immutable change so that any component of the whole parse
+ * tree can be deleted again.
+ *
+ * obj: void *, non-NULL opaque pointer which was previously returned by
+ * kparser_get_parser(). The caller can use a cached opaque pointer as long as
+ * the system does not restart and kparser.ko is not reloaded.
+ * avoid_ref: Set this flag only when it was also used in the prior call to
+ *	      kparser_get_parser(). Incorrect usage of this flag might cause
+ *	      errors and make the parser state unstable.
+ *
+ * return: boolean, true if the put operation succeeds, else false.
+ *
+ * NOTE: The very last of these calls makes the whole parser tree deletable
+ * again.
+ */
+bool kparser_put_parser(const void *obj, bool avoid_ref)
+{
+	const struct kparser_parser *parser = obj;
+	struct kparser_glue_parser *k_parser;
+
+	if (!parser)
+		return false;
+
+	if (likely(!avoid_ref)) {
+		k_parser = container_of(parser, struct kparser_glue_parser, parser);
+		kparser_ref_put(&k_parser->glue.refcount);
+	}
+
+	return true;
+}
+EXPORT_SYMBOL(kparser_put_parser);
diff --git a/net/kparser/kparser_main.c b/net/kparser/kparser_main.c
new file mode 100644
index 000000000..8a100e191
--- /dev/null
+++ b/net/kparser/kparser_main.c
@@ -0,0 +1,329 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2022, SiPanda Inc.
+ * + * kParser KMOD main module source file with netlink handlers + * + * Author: Pratyush Kumar Khan + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "kparser.h" + +static int kparser_cli_cmd_handler(struct sk_buff *skb, struct genl_info *info); + +/* define netlink msg policies */ +#define NS_DEFINE_POLICY_ATTR_ENTRY(ID, STRUC_NAME, RSP_STRUC_NAME) \ + [KPARSER_ATTR_CREATE_##ID] = { \ + .type = NLA_BINARY, \ + .validation_type = NLA_VALIDATE_MIN, \ + .min = sizeof(struct STRUC_NAME) \ + }, \ + [KPARSER_ATTR_UPDATE_##ID] = { \ + .type = NLA_BINARY, \ + .len = sizeof(struct STRUC_NAME), \ + .validation_type = NLA_VALIDATE_MIN, \ + .min = sizeof(struct STRUC_NAME) \ + }, \ + [KPARSER_ATTR_READ_##ID] = { \ + .type = NLA_BINARY, \ + .len = sizeof(struct STRUC_NAME), \ + .validation_type = NLA_VALIDATE_MIN, \ + .min = sizeof(struct STRUC_NAME) \ + }, \ + [KPARSER_ATTR_DELETE_##ID] = { \ + .type = NLA_BINARY, \ + .len = sizeof(struct STRUC_NAME), \ + .validation_type = NLA_VALIDATE_MIN, \ + .min = sizeof(struct STRUC_NAME) \ + }, \ + [KPARSER_ATTR_RSP_##ID] = { \ + .type = NLA_BINARY, \ + .len = sizeof(struct RSP_STRUC_NAME), \ + .validation_type = NLA_VALIDATE_MIN, \ + .min = sizeof(struct RSP_STRUC_NAME) \ + } + +static const struct nla_policy kparser_nl_policy[KPARSER_ATTR_MAX] = { + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_CONDEXPRS, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_CONDEXPRS_TABLE, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_CONDEXPRS_TABLES, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_COUNTER, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_COUNTER_TABLE, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_METADATA, + kparser_conf_cmd, + 
kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_METALIST, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_NODE_PARSE, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_PROTO_TABLE, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_TLV_NODE_PARSE, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_TLV_PROTO_TABLE, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_FLAG_FIELD, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_FLAG_FIELD_TABLE, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_FLAG_FIELD_NODE_PARSE, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_FLAG_FIELD_PROTO_TABLE, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_PARSER, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), + NS_DEFINE_POLICY_ATTR_ENTRY(KPARSER_NS_OP_PARSER_LOCK_UNLOCK, + kparser_conf_cmd, + kparser_cmd_rsp_hdr), +}; + +/* define netlink operations and family */ +static const struct genl_ops kparser_nl_ops[] = { + { + .cmd = KPARSER_CMD_CONFIGURE, + .doit = kparser_cli_cmd_handler, + .flags = GENL_ADMIN_PERM, + }, +}; + +struct genl_family kparser_nl_family __ro_after_init = { + .hdrsize = 0, + .name = KPARSER_GENL_NAME, + .version = KPARSER_GENL_VERSION, + .maxattr = KPARSER_ATTR_MAX - 1, + .policy = kparser_nl_policy, + .netnsok = true, + .parallel_ops = true, + .module = THIS_MODULE, + .ops = kparser_nl_ops, + .n_ops = ARRAY_SIZE(kparser_nl_ops), + .resv_start_op = KPARSER_CMD_CONFIGURE + 1, +}; + +/* send response to netlink msg requests */ +static int kparser_send_cmd_rsp(int cmd, int attrtype, + const struct kparser_cmd_rsp_hdr *rsp, + size_t rsp_len, struct genl_info *info, int err) +{ + struct sk_buff *msg; + size_t msgsz = NLMSG_DEFAULT_SIZE; + void 
*hdr; + int ret; + + if (rsp_len > msgsz) + msgsz = rsp_len; + + msg = nlmsg_new(msgsz, GFP_KERNEL); + if (!msg) + return -ENOMEM; + + hdr = genlmsg_put(msg, info->snd_portid, info->snd_seq, + &kparser_nl_family, 0, cmd); + if (!hdr) { + nlmsg_free(msg); + return -ENOBUFS; + } + + if (rsp->op_ret_code != 0) { + struct nlmsghdr *nlh = hdr - GENL_HDRLEN - NLMSG_HDRLEN; + struct nlmsgerr *e; + + nlh->nlmsg_type = NLMSG_ERROR; + nlh->nlmsg_len += nlmsg_msg_size(sizeof(*e)); + nlh->nlmsg_flags |= NLM_F_ACK_TLVS; + e = (struct nlmsgerr *)NLMSG_DATA(nlh); + memset(&e->msg, 0, sizeof(e->msg)); + e->error = rsp->op_ret_code; + nlmsg_free(msg); + return e->error; + } + + if (nla_put(msg, attrtype, (int)rsp_len, rsp)) { + genlmsg_cancel(msg, hdr); + nlmsg_free(msg); + return -EMSGSIZE; + } + + genlmsg_end(msg, hdr); + ret = genlmsg_reply(msg, info); + + /* pr_debug("genlmsg_reply() ret:%d\n", ret); */ + + return ret; +} + +typedef int kparser_ops(const void *, size_t, struct kparser_cmd_rsp_hdr **, + size_t *, void *extack, int *err); + +/* define netlink msg processors */ +#define KPARSER_NS_DEFINE_OP_HANDLERS(NS_ID) \ + [KPARSER_ATTR_CREATE_##NS_ID] = kparser_config_handler_add, \ + [KPARSER_ATTR_UPDATE_##NS_ID] = kparser_config_handler_update, \ + [KPARSER_ATTR_READ_##NS_ID] = kparser_config_handler_read, \ + [KPARSER_ATTR_DELETE_##NS_ID] = kparser_config_handler_delete, \ + [KPARSER_ATTR_RSP_##NS_ID] = NULL + +static kparser_ops *kparser_ns_op_handler[KPARSER_ATTR_MAX] = { + NULL, + KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_CONDEXPRS), + KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_CONDEXPRS_TABLE), + KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_CONDEXPRS_TABLES), + KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_COUNTER), + KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_COUNTER_TABLE), + KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_METADATA), + KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_METALIST), + KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_NODE_PARSE), + 
KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_PROTO_TABLE),
+	KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_TLV_NODE_PARSE),
+	KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_TLV_PROTO_TABLE),
+	KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_FLAG_FIELD),
+	KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_FLAG_FIELD_TABLE),
+	KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_FLAG_FIELD_NODE_PARSE),
+	KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_FLAG_FIELD_PROTO_TABLE),
+	KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_PARSER),
+	KPARSER_NS_DEFINE_OP_HANDLERS(KPARSER_NS_OP_PARSER_LOCK_UNLOCK),
+};
+
+/* netlink msg request handler */
+static int kparser_cli_cmd_handler(struct sk_buff *skb, struct genl_info *info)
+{
+	struct kparser_cmd_rsp_hdr *rsp = NULL;
+	size_t rsp_len = 0;
+	int ret_attr_id;
+	int attr_idx;
+	int rc = 0, err;
+
+	KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: ");
+
+	for (attr_idx = KPARSER_ATTR_UNSPEC + 1; attr_idx < KPARSER_ATTR_MAX; attr_idx++) {
+		if (!info->attrs[attr_idx] || !kparser_ns_op_handler[attr_idx])
+			continue;
+
+		ret_attr_id = kparser_ns_op_handler[attr_idx](nla_data(info->attrs[attr_idx]),
+							      nla_len(info->attrs[attr_idx]),
+							      &rsp, &rsp_len,
+							      info->extack, &err);
+
+		if (ret_attr_id <= KPARSER_ATTR_UNSPEC || ret_attr_id >= KPARSER_ATTR_MAX) {
+			KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI,
+						 "attr %d handler failed", attr_idx);
+			rc = -EIO;
+			goto out;
+		}
+
+		rc = kparser_send_cmd_rsp(KPARSER_CMD_CONFIGURE, ret_attr_id,
+					  rsp, rsp_len, info, err);
+		if (rc) {
+			KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI,
						 "kparser_send_cmd_rsp() failed,attr:%d, rc:%d\n",
						 attr_idx, rc);
+			goto out;
+		}
+
+		kfree(rsp);
+		rsp = NULL;
+		rsp_len = 0;
+	}
+
+out:
+	kfree(rsp);
+
+	KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: ");
+
+	return rc;
+}
+
+/* kParser KMOD's init handler */
+static int __init init_kparser(void)
+{
+	int rc;
+
+	KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: ");
+
+	rc = genl_register_family(&kparser_nl_family);
+	if (rc) {
KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "genl_register_family failed\n");
+		KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: ");
+		return rc;
+	}
+
+	rc = kparser_init();
+	if (rc) {
+		KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "kparser_init() err:%d\n", rc);
+		goto out;
+	}
+	KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: ");
+
+	return rc;
+
+out:
+	/* Keep the kparser_init() error code; only log an unregister failure */
+	if (genl_unregister_family(&kparser_nl_family) != 0)
+		KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI,
+					 "genl_unregister_family() failed\n");
+
+	KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "ERR OUT: ");
+
+	return rc;
+}
+
+/* kParser KMOD's exit handler */
+static void __exit exit_kparser(void)
+{
+	int rc;
+
+	KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "IN: ");
+
+	rc = genl_unregister_family(&kparser_nl_family);
+	if (rc != 0)
+		KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "genl_unregister_family() err:%d\n",
+					 rc);
+
+	rc = kparser_deinit();
+	if (rc != 0)
+		KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "kparser_deinit() err:%d\n", rc);
+
+	KPARSER_KMOD_DEBUG_PRINT(KPARSER_F_DEBUG_CLI, "OUT: ");
+}
+
+module_init(init_kparser);
+module_exit(exit_kparser);
+MODULE_AUTHOR("Pratyush Khan ");
+MODULE_AUTHOR("SiPanda Inc");
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Configurable Parameterized Parser in Kernel (KPARSER)");
diff --git a/net/kparser/kparser_metaextract.h b/net/kparser/kparser_metaextract.h
new file mode 100644
index 000000000..68eeb9c91
--- /dev/null
+++ b/net/kparser/kparser_metaextract.h
@@ -0,0 +1,891 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (c) 2022, SiPanda Inc.
+ *
+ * kparser_metaextract.h - kParser metadata helper and structures header file
+ *
+ * Authors: Tom Herbert
+ *          Pratyush Kumar Khan
+ */
+
+#ifndef __KPARSER_METAEXTRACT_H__
+#define __KPARSER_METAEXTRACT_H__
+
+#include "kparser_types.h"
+
+#include
+
+#ifdef __LITTLE_ENDIAN
+#define kparser_htonll(X) \
+	(((__u64)htonl((X) & 0xffffffff) << 32) | htonl((X) >> 32))
+#define kparser_ntohll(X) \
+	(((__u64)ntohl((X) & 0xffffffff) << 32) | ntohl((X) >> 32))
+#elif defined(__BIG_ENDIAN)
+#define kparser_htonll(X) (X)
+#define kparser_ntohll(X) (X)
+#else
+#error "Cannot determine endianness"
+#endif
+
+/* Metadata extraction pseudo instructions
+ *
+ * These instructions extract header data and set control data into metadata.
+ * Common fields are:
+ * - code: Describes the data being written to the metadata. See descriptions
+ *	below
+ * - frame: Boolean value. If true then the data is written to the current
+ *	metadata frame (frame + dst_off), else the data is written
+ *	to the metadata base (metadata + dst_off)
+ * - cntr: Counter. If nonzero the data is written to an array defined
+ *	by the specified counter. Note that dst_off in this case could
+ *	be the base offset of an array plus the offset within an
+ *	element of the array
+ * - dst_off: Destination offset into the metadata to write the extracted
+ *	data. This is nine bits, allowing an offset of 0 to 511
+ *	bytes. In the case of writing a sixteen bit constant,
+ *	dst_off is an eight-bit field that is multiplied by two
+ *	to derive the target destination offset
+ *
+ * Metadata extraction codes:
+ * - KPARSER_METADATA_BYTES_EXTRACT: bytes field
+ *	Extract some number of bytes of header data. The src_off field
+ *	indicates the source offset in bytes from the current header being
+ *	processed, and length indicates the number of bytes to be extracted.
+ *	One is added to the length to get the target length.
For example,
+ *	to extract the IPv4 source address into metadata, src_off would be
+ *	set to twelve and length would be set to three (that indicates
+ *	to extract four bytes). If e_bit is true then the bytes are endian
+ *	swapped before being stored
+ * - KPARSER_METADATA_NIBBS_EXTRACT: nibbs field
+ *	Extract some number of nibbles of header data. The src_off field
+ *	indicates the source offset in nibbles from the current header being
+ *	processed, and length indicates the number of nibbles to be
+ *	extracted. Note that nibbles are counted such that the high order
+ *	nibble of the first byte is nibble zero, and the low order is
+ *	nibble one. Nibbles are written aligned to the destination
+ *	bytes (e.g. the high order nibble of the first destination byte
+ *	contains nibble zero). If an odd number of nibbles
+ *	is written, then the last nibble is written to the high order
+ *	nibble of the last byte, and the low order nibble of the last
+ *	byte is zero. If e_bit is true then the resultant bytes are endian
+ *	swapped before being stored
+ * - KPARSER_METADATA_CONSTANT_BYTE_SET: constant_byte field
+ *	Set a byte constant in the metadata. The data field contains the
+ *	value of the byte to be written
+ * - KPARSER_METADATA_CONSTANT_HWORD_SET: constant_hword field
+ *	Set a half word (16 bits) constant in the metadata. The data field
+ *	contains the value of the halfword to be written. Note that dst_off
+ *	is multiplied by two to derive the target offset
+ * - KPARSER_METADATA_OFFSET_SET: offset field
+ *	Set the current absolute offset of a field in a packet. This is
+ *	written as two bytes: the offset of the current header being
+ *	processed plus the value in add_off, which is the offset of the
+ *	field of interest in the current header.
For instance, to get the offset of
+ *	the source IP address add_off would be set to twelve; and for a
+ *	plain IPv4 Ethernet packet the value written to metadata would
+ *	be twenty-six (offset of the IPv4 header is fourteen plus twelve
+ *	which is the value of add_off and the offset of the source address
+ *	in the IPv4 header). If bit_offset is set then the bit offset of
+ *	the field is written. This is derived as eight times the current
+ *	header byte offset plus add_off. For example, to extract the
+ *	bit offset of the fragment offset of the IPv4 header, add_off would
+ *	have the value fifty-one. For a plain IPv4 Ethernet packet, the
+ *	extracted bit offset would then be 163
+ * - KPARSER_METADATA_CTRL_HDR_LENGTH: control field
+ *	Write the length of the current header to metadata. The length is
+ *	written in two bytes. A counter operation may be specified as
+ *	described below
+ * - KPARSER_METADATA_CTRL_NUM_NODES: control field
+ *	Write the current number of parse nodes that have been visited to
+ *	metadata. The number of nodes is written in two bytes. A counter
+ *	operation may be specified as described below
+ * - KPARSER_METADATA_CTRL_NUM_ENCAPS: control field
+ *	Write the current number of encapsulation levels to metadata. The
+ *	number of levels is written in two bytes. A counter operation may be
+ *	specified as described below
+ * - KPARSER_METADATA_CTRL_TIMESTAMP: control field
+ *	Write the receive timestamp of a packet to metadata. The timestamp
+ *	is written in eight bytes. A counter operation may
+ *	be specified as described below
+ * - KPARSER_METADATA_CTRL_COUNTER: control_counter field
+ *	Write the current value of a counter to metadata. The counter is
+ *	specified in counter_for_data. The counter is written in two bytes.
+ *	A counter operation may be specified as described below
+ * - KPARSER_METADATA_CTRL_NOOP: control_noop field
+ *	"No operation". This pseudo instruction does not write any data.
+ *	Its primary purpose is to allow counter operations after performing
+ *	non-control pseudo instructions (note that the non-control variants
+ *	don't have a cntr_op field)
+ *
+ * There are two operations that may be performed on a counter and that are
+ * expressed in control type pseudo instructions: increment and reset. A
+ * counter operation is set in the cntr_op field of control pseudo instructions.
+ * The defined counter operations are:
+ *	- KPARSER_METADATA_CNTROP_NULL: No counter operation
+ *	- KPARSER_METADATA_CNTROP_INCREMENT: Increment the counter specified
+ *	  in cntr by one. The configuration for the counter is checked and
+ *	  if the limit for the counter is exceeded the appropriate behavior
+ *	  is performed
+ *	- KPARSER_METADATA_CNTROP_RESET: Reset the counter specified
+ *	  in cntr to zero
+ */
+
+/* Metadata extract codes */
+#define KPARSER_METADATA_BYTES_EXTRACT		0	/* Var bytes */
+#define KPARSER_METADATA_NIBBS_EXTRACT		1	/* Var nibbles */
+#define KPARSER_METADATA_CONSTANT_BYTE_SET	2	/* One byte */
+#define KPARSER_METADATA_CONSTANT_HWORD_SET	3	/* Two bytes */
+#define KPARSER_METADATA_OFFSET_SET		4	/* Two bytes */
+#define KPARSER_METADATA_CTRL_HDR_LENGTH	5	/* Two bytes */
+#define KPARSER_METADATA_CTRL_NUM_NODES		6	/* Two bytes */
+#define KPARSER_METADATA_CTRL_NUM_ENCAPS	7	/* Two bytes */
+#define KPARSER_METADATA_CTRL_TIMESTAMP		8	/* Eight bytes */
+#define KPARSER_METADATA_CTRL_RET_CODE		9	/* Four bytes */
+#define KPARSER_METADATA_CTRL_COUNTER		10	/* Two bytes */
+#define KPARSER_METADATA_CTRL_NOOP		11	/* Zero bytes */
+
+#define KPARSER_METADATA_CNTROP_NULL		0
+#define KPARSER_METADATA_CNTROP_INCREMENT	1
+#define KPARSER_METADATA_CNTROP_RESET		2
+
+/* Metadata extraction pseudo instructions
+ * This emulates the custom SiPANDA riscv instructions for metadata extractions,
+ * hence these are called pseudo instructions
+ */
+struct kparser_metadata_extract {
+	union {
+		struct {
+			__u32 code: 4;		// One of KPARSER_METADATA_* ops
+			__u32 frame: 1;		// 
Write to frame (true) else to meta
+			__u32 cntr: 3;		// Counter number
+			__u32 dst_off: 9;	// Target offset in frame or meta
+			__u32 rsvd: 15;
+		} gen;
+		struct {
+			__u32 code: 4;		// One of KPARSER_METADATA_* ops
+			__u32 frame: 1;		// Write to frame (true) else to meta
+			__u32 cntr: 3;		// Counter number
+			__u32 dst_off: 9;	// Target offset in frame or meta
+			__u32 e_bit: 1;		// Swap endianness (true)
+			__u32 src_off: 9;	// Src offset in header
+			__u32 length: 5;	// Byte length to read/write
+		} bytes;
+		struct {
+			__u32 code: 4;		// One of KPARSER_METADATA_* ops
+			__u32 frame: 1;		// Write to frame (true) else to meta
+			__u32 cntr: 3;		// Counter number
+			__u32 dst_off: 9;	// Target offset in frame or meta
+			__u32 e_bit: 1;		// Swap endianness (true)
+			__u32 src_off: 10;	// Src offset in header
+			__u32 length: 4;	// Nibble length to read/write
+		} nibbs;
+		struct {
+			__u32 code: 4;		// One of KPARSER_METADATA_* ops
+			__u32 frame: 1;		// Write to frame (true) else to meta
+			__u32 cntr: 3;		// Counter number
+			__u32 dst_off: 9;	// Target offset in frame or meta
+			__u32 rsvd: 7;
+			__u32 data: 8;		// Byte constant
+		} constant_byte;
+		struct {
+			__u32 code: 4;		// One of KPARSER_METADATA_* ops
+			__u32 frame: 1;		// Write to frame (true) else to meta
+			__u32 cntr: 3;		// Counter number
+			__u32 dst_off: 8;	// Target offset / 2 in frame or meta
+			__u32 data: 16;		// Halfword constant
+		} constant_hword;
+		struct {
+			__u32 code: 4;		// One of KPARSER_METADATA_* ops
+			__u32 frame: 1;		// Write to frame (true) else to meta
+			__u32 cntr: 3;		// Counter number
+			__u32 dst_off: 9;	// Target offset in frame or meta
+			__u32 bit_offset: 1;
+			__u32 rsvd: 2;
+			__u32 add_off: 12;	// 3 bits for bit offset
+		} offset;
+		struct {
+			__u32 code: 4;		// One of KPARSER_METADATA_* ops
+			__u32 frame: 1;		// Write to frame (true) else to meta
+			__u32 cntr: 3;		// Counter number
+			__u32 dst_off: 9;	// Target offset in frame or meta
+			__u32 cntr_op: 3;	// Counter operation
+			__u32 cntr_for_data: 3;
+			__u32 rsvd: 9;
+ } control; + struct { + __u32 code: 4; // One of KPARSER_METADATA_* ops + __u32 frame: 1; // Write to frame (true) else to meta + __u32 cntr: 3; // Counter number + __u32 cntr_op: 3; // Counter operation + __u32 rsvd: 21; + } control_noop; + __u32 val; + }; +}; + +/* Helper macros to make various pseudo instructions */ + +#define __KPARSER_METADATA_MAKE_BYTES_EXTRACT(FRAME, SRC_OFF, DST_OFF, LEN, E_BIT, CNTR) \ +{ \ + .bytes.code = KPARSER_METADATA_BYTES_EXTRACT, \ + .bytes.frame = FRAME, \ + .bytes.src_off = SRC_OFF, \ + .bytes.dst_off = DST_OFF, \ + .bytes.length = (LEN) - 1, /* Minimum one byte */ \ + .bytes.e_bit = E_BIT, \ + .bytes.cntr = CNTR, \ +} + +static inline struct kparser_metadata_extract +__kparser_metadata_make_bytes_extract(bool frame, size_t src_off, + size_t dst_off, size_t len, + bool e_bit, + unsigned int cntr) +{ + const struct kparser_metadata_extract mde = + __KPARSER_METADATA_MAKE_BYTES_EXTRACT(frame, src_off, + dst_off, len, + e_bit, cntr); + return mde; +} + +#define __KPARSER_METADATA_MAKE_NIBBS_EXTRACT(FRAME, NIBB_SRC_OFF, \ + DST_OFF, NIBB_LEN, E_BIT, CNTR) \ +{ \ + .nibbs.code = KPARSER_METADATA_NIBBS_EXTRACT, \ + .nibbs.frame = FRAME, \ + .nibbs.src_off = NIBB_SRC_OFF, \ + .nibbs.dst_off = DST_OFF, \ + .nibbs.length = (NIBB_LEN) - 1, /* Minimum one nibble */ \ + .nibbs.e_bit = E_BIT, \ + .nibbs.cntr = CNTR, \ +} + +static inline struct kparser_metadata_extract +__kparser_make_make_nibbs_extract(bool frame, size_t nibb_src_off, + size_t dst_off, size_t nibb_len, + bool e_bit, unsigned int cntr) +{ + const struct kparser_metadata_extract mde = + __KPARSER_METADATA_MAKE_NIBBS_EXTRACT(frame, nibb_src_off, + dst_off, nibb_len, + e_bit, cntr); + + return mde; +} + +#define __KPARSER_METADATA_MAKE_SET_CONST_BYTE(FRAME, DST_OFF, DATA, CNTR) \ +{ \ + .constant_byte.code = KPARSER_METADATA_CONSTANT_BYTE_SET, \ + .constant_byte.frame = FRAME, \ + .constant_byte.dst_off = DST_OFF, \ + .constant_byte.data = DATA, \ + .constant_byte.cntr = CNTR, 
\ +} + +static inline struct kparser_metadata_extract +__kparser_metadata_set_const_byte(bool frame, size_t dst_off, + __u8 data, unsigned int cntr) +{ + const struct kparser_metadata_extract mde = + __KPARSER_METADATA_MAKE_SET_CONST_BYTE(frame, dst_off, + data, cntr); + + return mde; +} + +#define __KPARSER_METADATA_MAKE_SET_CONST_HALFWORD(FRAME, DST_OFF, DATA, CNTR) \ +{ \ + .constant_hword.code = \ + KPARSER_METADATA_CONSTANT_HWORD_SET, \ + .constant_hword.frame = FRAME, \ + .constant_hword.dst_off = DST_OFF, \ + .constant_hword.data = DATA, \ + .constant_hword.cntr = CNTR, \ +} + +static inline struct kparser_metadata_extract +__kparser_metadata_set_const_halfword(bool frame, size_t dst_off, + __u16 data, + unsigned int cntr) +{ + const struct kparser_metadata_extract mde = + __KPARSER_METADATA_MAKE_SET_CONST_HALFWORD(frame, dst_off, + data, cntr); + + return mde; +} + +#define __KPARSER_METADATA_MAKE_OFFSET_SET(FRAME, DST_OFF, BIT_OFFSET, ADD_OFF, CNTR) \ +{ \ + .offset.code = KPARSER_METADATA_OFFSET_SET, \ + .offset.frame = FRAME, \ + .offset.dst_off = DST_OFF, \ + .offset.bit_offset = BIT_OFFSET, \ + .offset.add_off = ADD_OFF, \ + .offset.cntr = CNTR, \ +} + +static inline struct kparser_metadata_extract +__kparser_metadata_offset_set(bool frame, size_t dst_off, + bool bit_offset, size_t add_off, unsigned int cntr) +{ + const struct kparser_metadata_extract mde = + __KPARSER_METADATA_MAKE_OFFSET_SET(frame, dst_off, + bit_offset, add_off, cntr); + return mde; +} + +#define __KPARSER_METADATA_MAKE_SET_CONTROL_COUNTER(FRAME, DST_OFF, CNTR_DATA, CNTR, CNTR_OP) \ +{ \ + .control.code = KPARSER_METADATA_CTRL_COUNTER, \ + .control.frame = FRAME, \ + .control.dst_off = DST_OFF, \ + .control.cntr = CNTR, \ + .control.cntr_op = CNTR_OP, \ + .control.cntr_for_data = CNTR_DATA, \ +} + +static inline struct kparser_metadata_extract +__kparser_metadata_set_control_counter(bool frame, size_t dst_off, + unsigned int cntr_data, + unsigned int cntr, + unsigned int cntr_op) +{ 
+ const struct kparser_metadata_extract mde = + __KPARSER_METADATA_MAKE_SET_CONTROL_COUNTER(frame, + dst_off, cntr_data, cntr, + cntr_op); + return mde; +} + +#define __KPARSER_METADATA_MAKE_SET_CONTROL(FRAME, CODE, DST_OFF, CNTR, CNTR_OP) \ +{ \ + .control.code = CODE, \ + .control.frame = FRAME, \ + .control.dst_off = DST_OFF, \ + .control.cntr = CNTR, \ + .control.cntr_op = CNTR_OP, \ +} + +static inline struct kparser_metadata_extract +__kparser_metadata_set_control(bool frame, unsigned int code, + size_t dst_off, unsigned int cntr, + unsigned int cntr_op) +{ + const struct kparser_metadata_extract mde = + __KPARSER_METADATA_MAKE_SET_CONTROL(frame, code, dst_off, + cntr, cntr_op); + return mde; +} + +struct kparser_metadata_table { + int num_ents; + struct kparser_metadata_extract *entries; +}; + +/* Extract functions */ +static inline int __kparser_metadata_bytes_extract(const __u8 *sptr, + __u8 *dptr, size_t length, bool e_bit) +{ + __u16 v16; + __u32 v32; + __u64 v64; + int i; + + if (!dptr) + return KPARSER_OKAY; + + switch (length) { + case sizeof(__u8): + *dptr = *sptr; + break; + case sizeof(__u16): + v16 = *(__u16 *)sptr; + *((__u16 *)dptr) = e_bit ? ntohs(v16) : v16; + break; + case sizeof(__u32): + v32 = *(__u32 *)sptr; + *((__u32 *)dptr) = e_bit ? ntohl(v32) : v32; + break; + case sizeof(__u64): + v64 = *(__u64 *)sptr; + *((__u64 *)dptr) = e_bit ? 
kparser_ntohll(v64) : v64; + break; + default: + if (e_bit) { + for (i = 0; i < length; i++) + dptr[i] = sptr[length - 1 - i]; + } else { + memcpy(dptr, sptr, length); + } + } + + return KPARSER_OKAY; +} + +static inline void *metadata_get_dst(size_t dst_off, void *mdata) +{ + return &((__u8 *)mdata)[dst_off]; +} + +static inline bool __metatdata_validate_counter(const struct kparser_parser *parser, + unsigned int cntr) +{ + if (!parser) { + pr_warn("Metadata counter is set for extraction but no parser is set"); + return false; + } + + if (!parser->cntrs) { + pr_warn("Metadata counter is set but no counters are configured for parser"); + return false; + } + + if (cntr >= KPARSER_CNTR_NUM_CNTRS) { + pr_warn("Metadata counter %u is greater than maximum %u", + cntr, KPARSER_CNTR_NUM_CNTRS); + return false; + } + + return true; +} + +static inline void *metadata_get_dst_cntr(const struct kparser_parser *parser, + size_t dst_off, void *mdata, + unsigned int cntr, int code) +{ + const struct kparser_cntr_conf *cntr_conf; + __u8 *dptr = &((__u8 *)mdata)[dst_off]; + size_t step; + + if (!cntr) + return dptr; + + cntr--; // Make zero based to access array + + if (!__metatdata_validate_counter(parser, cntr)) + return dptr; + + cntr_conf = &parser->cntrs_conf.cntrs[cntr]; + + if (code != KPARSER_METADATA_CTRL_COUNTER) { + if (parser->cntrs->cntr[cntr] >= cntr_conf->array_limit) { + if (!cntr_conf->array_limit || + !cntr_conf->overwrite_last) + return NULL; + step = cntr_conf->array_limit - 1; + } else { + step = parser->cntrs->cntr[cntr]; + } + + dptr += cntr_conf->el_size * step; + } + + return dptr; +} + +static inline int __metadata_cntr_operation(const struct kparser_parser *parser, + unsigned int operation, unsigned int cntr) +{ + /* cntr 0 means no counter attached, the index starts from 1 in this case + */ + if (!cntr) + return KPARSER_OKAY; + + cntr--; /* Make zero based to access array */ + + if (!__metatdata_validate_counter(parser, cntr)) + return 
KPARSER_STOP_BAD_CNTR; + + switch (operation) { + default: + case KPARSER_METADATA_CNTROP_NULL: + break; + case KPARSER_METADATA_CNTROP_INCREMENT: + /* Note: parser is const but + * parser->cntrs->cntr is writable + */ + if (parser->cntrs->cntr[cntr] < + parser->cntrs_conf.cntrs[cntr].max_value) + parser->cntrs->cntr[cntr]++; + else if (parser->cntrs_conf.cntrs[cntr].error_on_exceeded) + return KPARSER_STOP_CNTR1 - cntr; + break; + case KPARSER_METADATA_CNTROP_RESET: + parser->cntrs->cntr[cntr] = 0; + break; + } + + return KPARSER_OKAY; +} + +static inline int kparser_metadata_bytes_extract(const struct kparser_parser *parser, + struct kparser_metadata_extract mde, + const void *hdr, void *mdata) +{ + __u8 *dptr = metadata_get_dst_cntr(parser, mde.bytes.dst_off, mdata, + mde.bytes.cntr, 0); + const __u8 *sptr = &((__u8 *)hdr)[mde.bytes.src_off]; + + if (!dptr) + return KPARSER_OKAY; + + return __kparser_metadata_bytes_extract(sptr, dptr, + mde.bytes.length + 1, + mde.bytes.e_bit); +} + +static inline int kparser_metadata_nibbs_extract(const struct kparser_parser *parser, + struct kparser_metadata_extract mde, + const void *hdr, void *mdata) +{ + __u8 *dptr = metadata_get_dst_cntr(parser, mde.nibbs.dst_off, mdata, + mde.nibbs.cntr, 0); + const __u8 *sptr = &((__u8 *)hdr)[mde.nibbs.src_off / 2]; + size_t nibb_len = mde.nibbs.length + 1; + __u8 data; + int i; + + if (!dptr) + return KPARSER_OKAY; + + if (mde.nibbs.src_off % 2 == 0 && nibb_len % 2 == 0) { + /* This is effectively a byte transfer case */ + + return __kparser_metadata_bytes_extract(sptr, dptr, + mde.nibbs.length / 2, + mde.nibbs.e_bit); + } + + if (mde.nibbs.e_bit) { + /* Endianness bit is set. dlen is the number of bytes + * set for output + */ + size_t dlen = (nibb_len + 1) / 2; + + if (mde.nibbs.src_off % 2) { + /* Starting from the odd nibble */ + if (nibb_len % 2) { + /* Odd length and odd start nibble offset. 
Set + * the reverse of all the bytes after the first + * nibble, and * construct the last byte from + * the low order nibble of the first input byte + */ + for (i = 0; i < dlen - 1; i++) + dptr[i] = sptr[dlen - 1 - i]; + dptr[i] = sptr[0] & 0xf; + } else { + /* Even length and n-bit is set. Logically + * shift all the nibbles in the string left and + * then set the reversed bytes. + */ + + /* High order nibble of last byte becomes + * low order nibble of first output byte + */ + data = sptr[dlen] >> 4; + + for (i = 0; i < dlen - 1; i++) { + /* Construct intermediate bytes. data + * contains the input high order nibble + * of the next input byte shifted right. + * That value is or'ed with the shifted + * left low order nibble of the current + * byte. The result is set in the + * reversed position in the output + */ + dptr[i] = data | sptr[dlen - 1 - i] << 4; + + /* Get the next data value */ + data = sptr[dlen - 1 - i] >> 4; + } + /* Set the last byte as the or of the last + * data value and the low order nibble of the + * zeroth byte of the input shifted left + */ + dptr[i] = data | sptr[0] << 4; + } + } else { + /* Odd length (per check above) and n-bit is not + * set. Logically shift all the nibbles in the + * string right and then set the reversed bytes + */ + + /* High order nibble of last byte becomes + * low order nibble of first output byte + */ + data = sptr[dlen - 1] >> 4; + + for (i = 0; i < dlen - 1; i++) { + /* Construct intermediate bytes. data contains + * the input high order nibble of the next + * input byte shifted right. That value is + * or'ed with the shifted left low order nibble + * of the current byte. 
The result is set in the + * reversed position in the output + */ + dptr[i] = data | sptr[dlen - 2 - i] << 4; + + /* Get next data value */ + data = sptr[dlen - 2 - i] >> 4; + } + + /* Last output byte is set to high order nibble of first + * input byte shifted right + */ + dptr[i] = data; + } + } else { + /* No e-bit (no endianness) */ + + size_t byte_len; + int ind = 0; + + if (mde.nibbs.src_off % 2) { + /* Starting from the odd nibble. Set first output byte + * to masked low order nibble of first input byte + */ + dptr[0] = sptr[0] & 0xf; + ind = 1; + nibb_len--; + } + + /* Copy all the whole intermediate bytes */ + byte_len = nibb_len / 2; + memcpy(&dptr[ind], &sptr[ind], byte_len); + + if (nibb_len % 2) { + /* Have an odd nibble at the end. Set the last + * output byte to the masked high order nibble of the + * last input byte + */ + dptr[ind + byte_len] = sptr[ind + byte_len] & 0xf0; + } + } + + return KPARSER_OKAY; +} + +static inline int kparser_metadata_const_set_byte(const struct kparser_parser *parser, + struct kparser_metadata_extract mde, + void *mdata) +{ + __u8 *dptr = metadata_get_dst_cntr(parser, mde.constant_byte.dst_off, + mdata, mde.constant_byte.cntr, 0); + + if (dptr) + *dptr = mde.constant_byte.data; + + return KPARSER_OKAY; +} + +static inline int kparser_metadata_const_set_hword(const struct kparser_parser *parser, + struct kparser_metadata_extract mde, + void *mdata) +{ + __u16 *dptr = metadata_get_dst_cntr(parser, mde.constant_hword.dst_off, + mdata, mde.constant_hword.cntr, 0); + + if (dptr) + *dptr = mde.constant_hword.data; + + return KPARSER_OKAY; +} + +static inline int kparser_metadata_set_offset(const struct kparser_parser *parser, + struct kparser_metadata_extract mde, + void *mdata, size_t hdr_offset) +{ + __u16 *dptr = metadata_get_dst_cntr(parser, mde.offset.dst_off, mdata, + mde.offset.cntr, 0); + + if (dptr) { + *dptr = mde.offset.bit_offset ?
+ 8 * hdr_offset + mde.offset.add_off : + hdr_offset + mde.offset.add_off; + } + + return KPARSER_OKAY; +} + +static inline int __kparser_metadata_control_extract(const struct kparser_parser *parser, + const struct kparser_metadata_extract mde, + const void *_hdr, size_t hdr_len, + size_t hdr_offset, void *mdata, + const struct kparser_ctrl_data *ctrl) +{ + __u16 *dptr = metadata_get_dst_cntr(parser, mde.control.dst_off, mdata, + mde.control.cntr, mde.control.code); + + switch (mde.control.code) { + case KPARSER_METADATA_CTRL_HDR_LENGTH: + if (dptr) + *((__u16 *)dptr) = hdr_len; + break; + case KPARSER_METADATA_CTRL_NUM_NODES: + if (dptr) + *((__u16 *)dptr) = ctrl->node_cnt; + break; + case KPARSER_METADATA_CTRL_NUM_ENCAPS: + if (dptr) + *((__u16 *)dptr) = ctrl->encap_levels; + break; + case KPARSER_METADATA_CTRL_TIMESTAMP: + /* TODO */ + break; + case KPARSER_METADATA_CTRL_COUNTER: + if (!__metatdata_validate_counter(parser, + mde.control.cntr_for_data)) + return KPARSER_STOP_BAD_CNTR; + if (dptr) + *(__u16 *)dptr = parser->cntrs->cntr[mde.control.cntr_for_data - 1]; + break; + case KPARSER_METADATA_CTRL_RET_CODE: + if (dptr) + *((int *)dptr) = ctrl->ret; + break; + case KPARSER_METADATA_CTRL_NOOP: + break; + default: + pr_debug("Unknown extract\n"); + return KPARSER_STOP_BAD_EXTRACT; + } + + return __metadata_cntr_operation(parser, mde.control.cntr_op, mde.control.cntr); +} + +/* Front end functions to process one metadata extraction pseudo instruction + * in the context of parsing a packet + */ +static inline int kparser_metadata_extract(const struct kparser_parser *parser, + const struct kparser_metadata_extract mde, + const void *_hdr, size_t hdr_len, + size_t hdr_offset, void *_metadata, + void *_frame, + const struct kparser_ctrl_data *ctrl) +{ + void *mdata = mde.gen.frame ? 
_frame : _metadata; + int ret; + + switch (mde.gen.code) { + case KPARSER_METADATA_BYTES_EXTRACT: + ret = kparser_metadata_bytes_extract(parser, mde, + _hdr, mdata); + break; + case KPARSER_METADATA_NIBBS_EXTRACT: + ret = kparser_metadata_nibbs_extract(parser, mde, + _hdr, mdata); + break; + case KPARSER_METADATA_CONSTANT_BYTE_SET: + ret = kparser_metadata_const_set_byte(parser, mde, + mdata); + break; + case KPARSER_METADATA_CONSTANT_HWORD_SET: + ret = kparser_metadata_const_set_hword(parser, mde, + mdata); + break; + case KPARSER_METADATA_OFFSET_SET: + ret = kparser_metadata_set_offset(parser, mde, mdata, + hdr_offset); + break; + default: /* Should be a control metadata extraction */ + ret = __kparser_metadata_control_extract(parser, mde, + _hdr, + hdr_len, + hdr_offset, + mdata, ctrl); + } + + return ret; +} + +static inline bool kparser_metadata_convert(const struct kparser_conf_metadata *conf, + struct kparser_metadata_extract *mde, + int cntridx, int cntr_arr_idx) +{ + __u32 encoding_type; + + switch (conf->type) { + case KPARSER_METADATA_HDRDATA: + *mde = __kparser_metadata_make_bytes_extract(conf->frame, + conf->soff, conf->doff, conf->len, + conf->e_bit, cntridx); + return true; + + case KPARSER_METADATA_HDRDATA_NIBBS_EXTRACT: + *mde = __kparser_make_make_nibbs_extract(conf->frame, + conf->soff, + conf->doff, + conf->len, + conf->e_bit, + cntridx); + return true; + + case KPARSER_METADATA_BIT_OFFSET: + *mde = __kparser_metadata_offset_set(conf->frame, + conf->doff, + true, + conf->add_off, + cntridx); + return true; + + case KPARSER_METADATA_OFFSET: + *mde = __kparser_metadata_offset_set(conf->frame, + conf->doff, + false, + conf->add_off, + cntridx); + return true; + + case KPARSER_METADATA_CONSTANT_BYTE: + *mde = __kparser_metadata_set_const_byte(conf->frame, + conf->doff, conf->constant_value, + cntridx); + return true; + + case KPARSER_METADATA_CONSTANT_HALFWORD: + *mde = __kparser_metadata_set_const_halfword(conf->frame, + conf->doff, 
conf->constant_value, + cntridx); + return true; + + case KPARSER_METADATA_COUNTER: + *mde = __kparser_metadata_set_control_counter(conf->frame, conf->doff, + cntridx, cntr_arr_idx, + conf->cntr_op); + return true; + + case KPARSER_METADATA_HDRLEN: + encoding_type = KPARSER_METADATA_CTRL_HDR_LENGTH; + break; + + case KPARSER_METADATA_NUMENCAPS: + encoding_type = KPARSER_METADATA_CTRL_NUM_ENCAPS; + break; + + case KPARSER_METADATA_NUMNODES: + encoding_type = KPARSER_METADATA_CTRL_NUM_NODES; + break; + + case KPARSER_METADATA_TIMESTAMP: + encoding_type = KPARSER_METADATA_CTRL_TIMESTAMP; + break; + + case KPARSER_METADATA_RETURN_CODE: + encoding_type = KPARSER_METADATA_CTRL_RET_CODE; + break; + + case KPARSER_METADATA_COUNTEROP_NOOP: + encoding_type = KPARSER_METADATA_CTRL_NOOP; + break; + + default: + return false; + } + + *mde = __kparser_metadata_set_control(conf->frame, encoding_type, conf->doff, + cntridx, conf->cntr_op); + + return true; +} + +#endif /* __KPARSER_METAEXTRACT_H__ */ diff --git a/net/kparser/kparser_types.h b/net/kparser/kparser_types.h new file mode 100644 index 000000000..e957c556e --- /dev/null +++ b/net/kparser/kparser_types.h @@ -0,0 +1,586 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright (c) 2022, SiPanda Inc. 
+ * + * kparser_types.h - kParser private data types header file + * + * Authors: Tom Herbert + * Pratyush Kumar Khan + */ + +#ifndef __KPARSER_TYPES_H +#define __KPARSER_TYPES_H + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* Sign extend a returned signed value */ +#define KPARSER_EXTRACT_CODE(X) ((__s64)(short)(X)) +#define KPARSER_IS_RET_CODE(X) (KPARSER_EXTRACT_CODE(X) < 0) +#define KPARSER_IS_NOT_OK_CODE(X) (KPARSER_EXTRACT_CODE(X) <= KPARSER_STOP_FAIL) +#define KPARSER_IS_OK_CODE(X) \ + (KPARSER_IS_RET_CODE(X) && KPARSER_EXTRACT_CODE(X) > KPARSER_STOP_FAIL) + +/* A table of conditional expressions, type indicates that the expressions + * are or'ed or and'ed + */ +struct kparser_condexpr_table { + int default_fail; + enum kparser_condexpr_types type; + unsigned int num_ents; + const struct kparser_condexpr_expr __rcu **entries; +}; + +/* A table of tables of conditional expressions. This is used to create more + * complex expressions using and's and or's + */ +struct kparser_condexpr_tables { + unsigned int num_ents; + const struct kparser_condexpr_table __rcu **entries; +}; + +/* Control data describing various values produced while parsing. This is + * used as an argument to metadata extraction and handler functions + */ +struct kparser_ctrl_data { + int ret; + size_t pkt_len; + void *hdr_base; + unsigned int node_cnt; + unsigned int encap_levels; +}; + +/*****************************************************************************/ + +/* Protocol parsing operations: + * + * Operations can be specified either as a function or a parameterization + * of a parameterized function + * + * len: Return length of protocol header. If value is NULL then the length of + * the header is taken from the min_len in the protocol node. If the + * return value < 0 (a KPARSER_STOP_* return code value) this indicates an + * error and parsing is stopped. 
A return value greater than or equal + * to zero gives the protocol length. If the returned length is less + * than the minimum protocol length, indicated in min_len by the protocol + * node, then this is considered an error. + * next_proto: Return next protocol. If value is NULL then there is no + * next protocol. If return value is greater than or equal to zero + * this indicates a protocol number that is used in a table lookup + * to get the next layer protocol node. + * cond_exprs: Parameterization only. This describes a set of conditionals + * to check before proceeding. In the case of functions being used, these + * conditionals would be in the next_proto or length function + */ + +struct kparser_parse_ops { + bool flag_fields_length; + bool len_parameterized; + struct kparser_parameterized_len pflen; + struct kparser_parameterized_next_proto pfnext_proto; + bool cond_exprs_parameterized; + struct kparser_condexpr_tables cond_exprs; +}; + +/* Protocol node + * + * This structure contains the definitions to describe parsing of one type + * of protocol header. Fields are: + * + * node_type: The type of the node (plain, TLVs, flag-fields) + * encap: Indicates an encapsulation protocol (e.g. IPIP, GRE) + * overlay: Indicates an overlay protocol. This is used, for example, to + * switch on version number of a protocol header (e.g. IP version number + * or GRE version number) + * name: Text name of protocol node for debugging + * min_len: Minimum length of the protocol header + * ops: Operations to parse protocol header + */ +struct kparser_proto_node { + __u8 encap; + __u8 overlay; + size_t min_len; + struct kparser_parse_ops ops; +}; + +/* Protocol node and parse node operations ordering. 
When processing a + * layer, operations are called in the following order: + * + * protoop.len + * parseop.extract_metadata + * parseop.handle_proto + * protoop.next_proto + */ +/* One entry in a protocol table: + * value: protocol number + * node: associated parse node for the protocol number + */ +struct kparser_proto_table_entry { + int value; + bool encap; + const struct kparser_parse_node __rcu *node; +}; + +/* Protocol table + * + * Contains a protocol table that maps a protocol number to a parse + * node + */ +struct kparser_proto_table { + int num_ents; + struct kparser_proto_table_entry __rcu *entries; +}; + +/*****************************************************************************/ + +struct kparser_cntrs_conf { + struct kparser_cntr_conf cntrs[KPARSER_CNTR_NUM_CNTRS]; +}; + +struct kparser_counters { + __u16 cntr[KPARSER_CNTR_NUM_CNTRS]; +}; + +/*****************************************************************************/ + +/* Definitions for parsing TLVs + * + * Operations can be specified either as a function or a parameterization + * of a parameterized function + * + * TLVs are a common protocol header structure consisting of Type, Length, + * Value tuple (e.g. for handling TCP or IPv6 HBH options TLVs) + */ + +/* Descriptor for parsing operations of one type of TLV. Fields are: + * For struct kparser_proto_tlvs_opts: + * start_offset: Returns the offset of TLVs in a header + * len: Return length of a TLV. Must be set. If the return value < 0 (a + * KPARSER_STOP_* return code value) this indicates an error and parsing + * is stopped. A return value greater than or equal to zero + * gives the protocol length. If the returned length is less than the + * minimum TLV option length, indicated by min_len by the TLV protocol + * node, then this is considered an error. + * type: Return the type of the TLV. 
If the return value is less than zero + (KPARSER_STOP_* value) then this indicates an error and parsing stops + */ + +/* A protocol node for parsing proto with TLVs + * + * proto_node: proto node + * ops: Operations for parsing TLVs + * start_offset: Where the TLVs start relative to the encapsulating protocol + * (e.g. would be twenty for TCP) + * pad1_val: Type value indicating one byte of TLV padding (e.g. would be + * for IPv6 HBH TLVs) + * pad1_enable: Pad1 value is used to detect single byte padding + * eol_val: Type value that indicates end of TLV list + * eol_enable: End of list value in eol_val is used + * fixed_start_offset: Take start offset from start_offset + * min_len: Minimal length of a TLV option + */ +struct kparser_proto_tlvs_node { + struct kparser_proto_node proto_node; + struct kparser_proto_tlvs_opts ops; + size_t start_offset; + __u8 pad1_val; + __u8 padn_val; + __u8 eol_val; + bool pad1_enable; + bool padn_enable; + bool eol_enable; + bool fixed_start_offset; + size_t min_len; +}; + +/*****************************************************************************/ + +/* Definitions and functions for processing and parsing flag-fields */ +/* Definitions for parsing flag-fields + * + * Flag-fields is a common networking protocol construct that encodes optional + * data in a set of flags and data fields. The flags indicate whether or not a + * corresponding data field is present. The data fields are fixed length per + * each flag-field definition and ordered by the ordering of the flags + * indicating the presence of the fields (e.g. GRE and GUE employ flag-fields) + */ + +/* Flag-fields descriptors and tables + * + * A set of flag-fields is defined in a table of type struct kparser_flag_fields. + * Each entry in the table is a descriptor for one flag-field in a protocol and + * includes a flag value, mask (for the case of a multi-bit flag), and size of + * the corresponding field. 
A flag is matched if "(flags & mask) == flag" */ + +/* Descriptor for a protocol field with flag fields + * + * Defines the flags and their data fields for one instance of a flag field in + * a protocol header (e.g. GRE v0 flags): + * + * num_idx: Number of flag_field structures + * fields: List of defined flag fields + */ +struct kparser_flag_fields { + size_t num_idx; + struct kparser_flag_field __rcu *fields; +}; + +/* Structure for parsing operations for flag-fields + * For struct kparser_proto_flag_fields_ops + * Operations can be specified either as a function or a parameterization + * of a parameterized function + * + * flags_offset: Offset of flags in the protocol header + * start_fields_offset: Return the offset in the header of the start of the + * flag fields data + */ + +/* A flag-fields protocol node. Note this is a super structure for a KPARSER + * protocol node and type is KPARSER_NODE_TYPE_FLAG_FIELDS + */ +struct kparser_proto_flag_fields_node { + struct kparser_proto_node proto_node; + struct kparser_proto_flag_fields_ops ops; + const struct kparser_flag_fields __rcu *flag_fields; +}; + +/*****************************************************************************/ + +/* Parse node definition. Defines parsing and processing for one node in + * the parse graph of a parser. Contains: + * + * node_type: The type of the node (plain, TLVs, flag-fields) + * unknown_ret: Code to return for a miss on the protocol table when the + * wildcard node is not set + * proto_node: Protocol node + * ops: Parse node operations + * proto_table: Protocol table for next protocol. 
This must be non-null if + * next_proto is not NULL + * wildcard_node: Node used for a miss on next protocol lookup + * metadata_table: Table of parameterized metadata operations + * thread_funcs: Thread functions + */ +struct kparser_parse_node { + enum kparser_node_type node_type; + char name[KPARSER_MAX_NAME]; + int unknown_ret; + const struct kparser_proto_table __rcu *proto_table; + const struct kparser_parse_node __rcu *wildcard_node; + const struct kparser_metadata_table __rcu *metadata_table; + union { + struct kparser_proto_node proto_node; + struct kparser_proto_tlvs_node tlvs_proto_node; + struct kparser_proto_flag_fields_node flag_fields_proto_node; + }; +}; + +/*****************************************************************************/ + +/* TLV parse node operations + * + * Operations to process a single TLV + * + * Operations can be specified either as a function or a parameterization + * of a parameterized function + * + * extract_metadata: Extract metadata for the node. Input is the meta + * data frame which points to a parser defined metadata structure. + * If the value is NULL then no metadata is extracted + * handle_tlv: Per TLV type handler which allows arbitrary processing + * of a TLV. Input is the TLV data and a parser defined metadata + * structure for the current frame. Return value is a parser + * return code: KPARSER_OKAY indicates no errors, KPARSER_STOP* return + * values indicate to stop parsing + * check_tlv: Function to validate a TLV + * cond_exprs: Parameterization of a set of conditionals to check before + * proceeding. 
In the case of functions being used, these + * conditionals would be in the check_tlv function + */ + +/* One entry in a TLV table: + * type: TLV type + * node: associated TLV parse structure for the type + */ +struct kparser_proto_tlvs_table_entry { + int type; + const struct kparser_parse_tlv_node __rcu *node; +}; + +/* TLV table + * + * Contains a table that maps a TLV type to a TLV parse node + */ +struct kparser_proto_tlvs_table { + int num_ents; + struct kparser_proto_tlvs_table_entry __rcu *entries; +}; + +/* Parse node for parsing a protocol header that contains TLVs to be + * parsed: + * + * parse_node: Node for main protocol header (e.g. IPv6 node in case of HBH + * options). Note that node_type is set in parse_node to + * KPARSER_NODE_TYPE_TLVS and that the parse node can then be cast to a + * parse_tlv_node + * tlv_proto_table: Lookup table for TLV type + * unknown_tlv_type_ret: Code to return on a TLV type lookup miss and + * tlv_wildcard_node is NULL + * tlv_wildcard_node: Node to use on a TLV type lookup miss + * config: Loop configuration + */ +struct kparser_parse_tlvs_node { + struct kparser_parse_node parse_node; + const struct kparser_proto_tlvs_table __rcu *tlv_proto_table; + int unknown_tlv_type_ret; + const struct kparser_parse_tlv_node __rcu *tlv_wildcard_node; + struct kparser_loop_node_config config; +}; + +struct kparser_proto_tlv_node_ops { + bool overlay_type_parameterized; + struct kparser_parameterized_next_proto pfoverlay_type; + bool cond_exprs_parameterized; + struct kparser_condexpr_tables cond_exprs; +}; + +/* A protocol node for parsing proto with TLVs + * + * min_len: Minimal length of TLV + * max_len: Maximum size of a TLV option + * is_padding: Indicates padding TLV + */ +struct kparser_proto_tlv_node { + size_t min_len; + size_t max_len; + bool is_padding; + struct kparser_proto_tlv_node_ops ops; +}; + +/* Parse node for a single TLV. 
Use common parse node operations + * (extract_metadata and handle_proto) + * + * proto_tlv_node: TLV protocol node + * tlv_ops: Operations on a TLV + * overlay_table: Lookup table for an overlay TLV + * overlay_wildcard_node: Wildcard node to an overlay lookup miss + * unknown_overlay_ret: Code to return on an overlay lookup miss and + * overlay_wildcard_node is NULL + * name: Name for debugging + * metadata_table: Table of parameterized metadata operations + * thread_funcs: Thread functions + */ +struct kparser_parse_tlv_node { + struct kparser_proto_tlv_node proto_tlv_node; + struct kparser_proto_tlvs_table __rcu *overlay_table; + const struct kparser_parse_tlv_node __rcu *overlay_wildcard_node; + int unknown_overlay_ret; + char name[KPARSER_MAX_NAME]; + struct kparser_metadata_table __rcu *metadata_table; +}; + +/*****************************************************************************/ + +/* Flag-field parse node operations + * + * Operations to process a single flag-field + * + * extract_metadata: Extract metadata for the node. Input is the meta + * data frame which points to a parser defined metadata structure. + * If the value is NULL then no metadata is extracted + * handle_flag_field: Per flag-field handler which allows arbitrary processing + * of a flag-field. Input is the flag-field data and a parser defined + * metadata structure for the current frame. Return value is a parser + * return code: KPARSER_OKAY indicates no errors, KPARSER_STOP* return + * values indicate to stop parsing + * check_flag_field: Function to validate a flag-field + * cond_exprs: Parameterization of a set of conditionals to check before + * proceeding. 
In the case of functions being used, these + * conditionals would be in the check_flag_field function + */ +struct kparser_parse_flag_field_node_ops { + struct kparser_condexpr_tables cond_exprs; +}; + +/* A parse node for a single flag field + * + * name: Text name for debugging + * metadata_table: Table of parameterized metadata operations + * ops: Operations + * thread_funcs: Thread functions + */ +struct kparser_parse_flag_field_node { + char name[KPARSER_MAX_NAME]; + struct kparser_metadata_table __rcu *metadata_table; + struct kparser_parse_flag_field_node_ops ops; +}; + +/* One entry in a flag-fields protocol table: + * index: flag-field index (index in a flag-fields table) + * node: associated TLV parse structure for the type + */ +struct kparser_proto_flag_fields_table_entry { + __u32 flag; + const struct kparser_parse_flag_field_node __rcu *node; +}; + +/* Flag-fields table + * + * Contains a table that maps a flag-field index to a flag-field parse node. + * Note that the index correlates to an entry in a flag-fields table that + * describes the flag-fields of a protocol + */ +struct kparser_proto_flag_fields_table { + int num_ents; + struct kparser_proto_flag_fields_table_entry __rcu *entries; +}; + +/* A flag-fields parse node. Note this is a super structure for a KPARSER parse + * node and type is KPARSER_NODE_TYPE_FLAG_FIELDS + */ +struct kparser_parse_flag_fields_node { + struct kparser_parse_node parse_node; + const struct kparser_proto_flag_fields_table __rcu + *flag_fields_proto_table; +}; + +static inline ssize_t __kparser_flag_fields_offset(__u32 targ_idx, __u32 flags, + const struct kparser_flag_fields *flag_fields) +{ + ssize_t offset = 0; + __u32 mask, flag; + int i; + + for (i = 0; i < targ_idx; i++) { + flag = flag_fields->fields[i].flag; + if (flag_fields->fields[i].endian) + flag = ntohs(flag); + mask = flag_fields->fields[i].mask ? 
: flag; + if ((flags & mask) == flag) + offset += flag_fields->fields[i].size; + } + + return offset; +} + +/* Determine offset of a field given a set of flags */ +static inline ssize_t kparser_flag_fields_offset(__u32 targ_idx, __u32 flags, + const struct kparser_flag_fields *flag_fields) +{ + __u32 mask, flag; + + flag = flag_fields->fields[targ_idx].flag; + if (flag_fields->fields[targ_idx].endian) + flag = ntohs(flag); + mask = flag_fields->fields[targ_idx].mask ? : flag; + if ((flags & mask) != flag) { + /* Flag not set */ + return -1; + } + + return __kparser_flag_fields_offset(targ_idx, flags, flag_fields); +} + +/* Check flags are legal */ +static inline bool kparser_flag_fields_check_invalid(__u32 flags, __u32 mask) +{ + return !!(flags & ~mask); +} + +/* Retrieve a byte value from a flag field */ +static inline __u8 kparser_flag_fields_get8(const __u8 *fields, __u32 targ_idx, + __u32 flags, + const struct kparser_flag_fields + *flag_fields) +{ + ssize_t offset = kparser_flag_fields_offset(targ_idx, flags, + flag_fields); + + if (offset < 0) + return 0; + + return *(__u8 *)&fields[offset]; +} + +/* Retrieve a short value from a flag field */ +static inline __u16 kparser_flag_fields_get16(const __u8 *fields, + __u32 targ_idx, + __u32 flags, + const struct kparser_flag_fields + *flag_fields) +{ + ssize_t offset = kparser_flag_fields_offset(targ_idx, flags, flag_fields); + + if (offset < 0) + return 0; + + return *(__u16 *)&fields[offset]; +} + +/* Retrieve a 32 bit value from a flag field */ +static inline __u32 kparser_get_flag_field32(const __u8 *fields, __u32 targ_idx, + __u32 flags, + const struct kparser_flag_fields + *flag_fields) +{ + ssize_t offset = kparser_flag_fields_offset(targ_idx, flags, flag_fields); + + if (offset < 0) + return 0; + + return *(__u32 *)&fields[offset]; +} + +/* Retrieve a 64 bit value from a flag field */ +static inline __u64 kparser_get_flag_field64(const __u8 *fields, __u32 targ_idx, + __u32 flags, + const struct 
kparser_flag_fields + *flag_fields) +{ + ssize_t offset = kparser_flag_fields_offset(targ_idx, flags, + flag_fields); + + if (offset < 0) + return 0; + + return *(__u64 *)&fields[offset]; +} + +/*****************************************************************************/ + +/* Definition of a KPARSER parser. Fields are: + * + * name: Text name for the parser + * root_node: Root parse node of the parser. When the parser is invoked + * parsing commences at this parse node + * okay_node: Processed at parser exit if no error + * fail_node: Processed at parser exit if there was an error + * parser_type: e.g. KPARSER_GENERIC, KPARSER_OPTIMIZED, KPARSER_KMOD, KPARSER_XDP + * parser_entry_point: Function entry point for optimized parser + * parser_xdp_entry_point: Function entry point for XDP parser + * config: Parser configuration + */ +#define KPARSERSTARTSIGNATURE 0xabcd +#define KPARSERENDSIGNATURE 0xdcba +struct kparser_parser { + __u16 kparser_start_signature; + char name[KPARSER_MAX_NAME]; + const struct kparser_parse_node __rcu *root_node; + const struct kparser_parse_node __rcu *okay_node; + const struct kparser_parse_node __rcu *fail_node; + const struct kparser_parse_node __rcu *atencap_node; + size_t cntrs_len; + struct kparser_counters __rcu *cntrs; + struct kparser_config config; + struct kparser_cntrs_conf cntrs_conf; + __u16 kparser_end_signature; +}; + +#endif /* __KPARSER_TYPES_H */ From patchwork Tue Jan 24 17:05:01 2023 X-Patchwork-Submitter: Jamal Hadi Salim X-Patchwork-Id: 13114389 X-Patchwork-Delegate: kuba@kernel.org
From: Jamal Hadi Salim To: netdev@vger.kernel.org Cc: kernel@mojatatu.com, deb.chatterjee@intel.com, anjali.singhai@intel.com, namrata.limaye@intel.com, khalidm@nvidia.com, tom@sipanda.io, pratyush@sipanda.io, jiri@resnulli.us, xiyou.wangcong@gmail.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, vladbu@nvidia.com, simon.horman@corigine.com Subject: [PATCH net-next RFC 11/20] p4tc: add P4 data types Date: Tue, 24 Jan 2023 12:05:01 -0500 Message-Id: <20230124170510.316970-11-jhs@mojatatu.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com> References: <20230124170510.316970-1-jhs@mojatatu.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC Introduce an abstraction that represents P4 data types. Types may be little-endian, host-order or big-endian definitions. The abstraction also supports defining: a) bitstrings, using annotations in control of the form "bitX", where X is the number of bits in the type b) bitslices, so that one can define in control bit8[0-3] and bit16[0-9]: a 4-bit slice of bits 0-3 and a 10-bit slice of bits 0-9, respectively. Each type has a bitsize, a name (for debugging purposes), an ID and methods/ops.
The P4 types will be used by metadata, headers, dynamic actions and other parts of P4TC. Each type has four ops: - validate_p4t: Which validates whether a given value of a specific type meets valid boundary conditions. - create_bitops: Which, given a bitsize, bitstart and bitend, allocates and returns a mask and a shift value. For example, if we have type bit8[3-3] meaning bitstart = 3 and bitend = 3, we'll create a mask which would only give us the fourth bit of a bit8 value, that is, 0x08. Since we are interested in the fourth bit, the bit shift value will be 3. - host_read: Which reads the value of a given type and transforms it to host order - host_write: Which writes a provided host order value and transforms it to the type's native order Co-developed-by: Victor Nogueira Signed-off-by: Victor Nogueira Co-developed-by: Pedro Tammela Signed-off-by: Pedro Tammela Signed-off-by: Jamal Hadi Salim --- include/net/p4tc_types.h | 61 ++ include/uapi/linux/p4tc.h | 40 ++ net/sched/Kconfig | 8 + net/sched/Makefile | 2 + net/sched/p4tc/Makefile | 3 + net/sched/p4tc/p4tc_types.c | 1294 +++++++++++++++++++++++++++++++++++ 6 files changed, 1408 insertions(+) create mode 100644 include/net/p4tc_types.h create mode 100644 include/uapi/linux/p4tc.h create mode 100644 net/sched/p4tc/Makefile create mode 100644 net/sched/p4tc/p4tc_types.c diff --git a/include/net/p4tc_types.h b/include/net/p4tc_types.h new file mode 100644 index 000000000..038ad89e3 --- /dev/null +++ b/include/net/p4tc_types.h @@ -0,0 +1,61 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __NET_P4TYPES_H +#define __NET_P4TYPES_H + +#include +#include +#include + +#include + +#define P4T_MAX_BITSZ 128 + +struct p4tc_type_mask_shift { + void *mask; + u8 shift; +}; + +struct p4tc_type; +struct p4tc_type_ops { + int (*validate_p4t)(struct p4tc_type *container, void *value, u16 startbit, + u16 endbit, struct netlink_ext_ack *extack); + struct p4tc_type_mask_shift *(*create_bitops)(u16 bitsz, + u16 bitstart, + u16
bitend, + struct netlink_ext_ack *extack); + int (*host_read)(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval); + int (*host_write)(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval); + void (*print)(struct net *net, struct p4tc_type *container, + const char *prefix, void *val); +}; + +#define P4T_MAX_STR_SZ 32 +struct p4tc_type { + char name[P4T_MAX_STR_SZ]; + struct p4tc_type_ops *ops; + size_t container_bitsz; + size_t bitsz; + int typeid; +}; + +struct p4tc_type *p4type_find_byid(int id); +bool p4tc_type_unsigned(int typeid); + +int p4t_copy(struct p4tc_type_mask_shift *dst_mask_shift, + struct p4tc_type *dst_t, void *dstv, + struct p4tc_type_mask_shift *src_mask_shift, + struct p4tc_type *src_t, void *srcv); +int p4t_cmp(struct p4tc_type_mask_shift *dst_mask_shift, + struct p4tc_type *dst_t, void *dstv, + struct p4tc_type_mask_shift *src_mask_shift, + struct p4tc_type *src_t, void *srcv); +void p4t_release(struct p4tc_type_mask_shift *mask_shift); + +int p4tc_register_types(void); +void p4tc_unregister_types(void); + +#endif diff --git a/include/uapi/linux/p4tc.h b/include/uapi/linux/p4tc.h new file mode 100644 index 000000000..2b6f126db --- /dev/null +++ b/include/uapi/linux/p4tc.h @@ -0,0 +1,40 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef __LINUX_P4TC_H +#define __LINUX_P4TC_H + +#define P4TC_MAX_KEYSZ 512 + +enum { + P4T_UNSPEC, + P4T_U8 = 1, /* NLA_U8 */ + P4T_U16 = 2, /* NLA_U16 */ + P4T_U32 = 3, /* NLA_U32 */ + P4T_U64 = 4, /* NLA_U64 */ + P4T_STRING = 5, /* NLA_STRING */ + P4T_FLAG = 6, /* NLA_FLAG */ + P4T_MSECS = 7, /* NLA_MSECS */ + P4T_NESTED = 8, /* NLA_NESTED */ + P4T_NESTED_ARRAY = 9, /* NLA_NESTED_ARRAY */ + P4T_NUL_STRING = 10, /* NLA_NUL_STRING */ + P4T_BINARY = 11, /* NLA_BINARY */ + P4T_S8 = 12, /* NLA_S8 */ + P4T_S16 = 13, /* NLA_S16 */ + P4T_S32 = 14, /* NLA_S32 */ + P4T_S64 = 15, /* NLA_S64 */ + P4T_BITFIELD32 = 
16, /* NLA_BITFIELD32 */ + P4T_MACADDR = 17, /* NLA_REJECT */ + P4T_IPV4ADDR, + P4T_BE16, + P4T_BE32, + P4T_BE64, + P4T_U128, + P4T_S128, + P4T_PATH, + P4T_BOOL, + P4T_DEV, + P4T_KEY, + __P4T_MAX, +}; +#define P4T_MAX (__P4T_MAX - 1) + +#endif diff --git a/net/sched/Kconfig b/net/sched/Kconfig index 777d6b505..c2fbd1889 100644 --- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -750,6 +750,14 @@ config NET_EMATCH_IPT To compile this code as a module, choose M here: the module will be called em_ipt. +config NET_P4_TC + bool "P4 support" + select NET_CLS_ACT + help + Say Y here if you want to use P4 features. + The concepts of Pipelines, Tables and metadata will be enabled + with this option. + config NET_CLS_ACT bool "Actions" select NET_CLS diff --git a/net/sched/Makefile b/net/sched/Makefile index dd14ef413..465ea14cd 100644 --- a/net/sched/Makefile +++ b/net/sched/Makefile @@ -87,3 +87,5 @@ obj-$(CONFIG_NET_EMATCH_TEXT) += em_text.o obj-$(CONFIG_NET_EMATCH_CANID) += em_canid.o obj-$(CONFIG_NET_EMATCH_IPSET) += em_ipset.o obj-$(CONFIG_NET_EMATCH_IPT) += em_ipt.o + +obj-$(CONFIG_NET_P4_TC) += p4tc/ diff --git a/net/sched/p4tc/Makefile b/net/sched/p4tc/Makefile new file mode 100644 index 000000000..dd1358c9e --- /dev/null +++ b/net/sched/p4tc/Makefile @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0 + +obj-y := p4tc_types.o diff --git a/net/sched/p4tc/p4tc_types.c b/net/sched/p4tc/p4tc_types.c new file mode 100644 index 000000000..71df1b1cb --- /dev/null +++ b/net/sched/p4tc/p4tc_types.c @@ -0,0 +1,1294 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * net/sched/p4tc_types.c - P4 datatypes + * Copyright (c) 2022, Mojatatu Networks + * Copyright (c) 2022, Intel Corporation.
+ * Authors: Jamal Hadi Salim + * Victor Nogueira + * Pedro Tammela + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static DEFINE_IDR(p4tc_types_idr); + +static void p4tc_types_put(void) +{ + unsigned long tmp, typeid; + struct p4tc_type *type; + + idr_for_each_entry_ul(&p4tc_types_idr, type, tmp, typeid) { + idr_remove(&p4tc_types_idr, typeid); + kfree(type); + } +} + +struct p4tc_type *p4type_find_byid(int typeid) +{ + return idr_find(&p4tc_types_idr, typeid); +} + +static struct p4tc_type *p4type_find_byname(const char *name) +{ + struct p4tc_type *type; + unsigned long tmp, typeid; + + idr_for_each_entry_ul(&p4tc_types_idr, type, tmp, typeid) { + if (!strncmp(type->name, name, P4T_MAX_STR_SZ)) + return type; + } + + return NULL; +} + +bool p4tc_type_unsigned(int typeid) +{ + switch (typeid) { + case P4T_U8: + case P4T_U16: + case P4T_U32: + case P4T_U64: + case P4T_U128: + case P4T_BOOL: + return true; + default: + return false; + } +} + +int p4t_copy(struct p4tc_type_mask_shift *dst_mask_shift, + struct p4tc_type *dst_t, void *dstv, + struct p4tc_type_mask_shift *src_mask_shift, + struct p4tc_type *src_t, void *srcv) +{ + u64 readval[BITS_TO_U64(P4TC_MAX_KEYSZ)] = { 0 }; + struct p4tc_type_ops *srco, *dsto; + + dsto = dst_t->ops; + srco = src_t->ops; + + srco->host_read(src_t, src_mask_shift, srcv, &readval); + dsto->host_write(dst_t, dst_mask_shift, &readval, dstv); + + return 0; +} + +int p4t_cmp(struct p4tc_type_mask_shift *dst_mask_shift, + struct p4tc_type *dst_t, void *dstv, + struct p4tc_type_mask_shift *src_mask_shift, + struct p4tc_type *src_t, void *srcv) +{ + u64 a[BITS_TO_U64(P4TC_MAX_KEYSZ)] = { 0 }; + u64 b[BITS_TO_U64(P4TC_MAX_KEYSZ)] = { 0 }; + struct p4tc_type_ops *srco, *dsto; + + dsto = dst_t->ops; + srco = src_t->ops; + + dsto->host_read(dst_t, dst_mask_shift, dstv, a); + srco->host_read(src_t, src_mask_shift, srcv, b); + + return 
memcmp(a, b, sizeof(a)); +} + +void p4t_release(struct p4tc_type_mask_shift *mask_shift) +{ + kfree(mask_shift->mask); + kfree(mask_shift); +} + +static int p4t_validate_bitpos(u16 bitstart, u16 bitend, u16 maxbitstart, + u16 maxbitend, struct netlink_ext_ack *extack) +{ + if (bitstart > maxbitstart) { + NL_SET_ERR_MSG_MOD(extack, "bitstart too high"); + return -EINVAL; + } + if (bitend > maxbitend) { + NL_SET_ERR_MSG_MOD(extack, "bitend too high"); + return -EINVAL; + } + + return 0; +} + +/* XXX: Later, immedv will be 64 bits */ +static int p4t_u32_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + u32 container_maxsz = U32_MAX; + u32 *val = value; + size_t maxval; + int ret; + + ret = p4t_validate_bitpos(bitstart, bitend, 31, 31, extack); + if (ret < 0) + return ret; + + maxval = GENMASK(bitend, 0); + if (val && (*val > container_maxsz || *val > maxval)) { + NL_SET_ERR_MSG_MOD(extack, "U32 value out of range"); + return -EINVAL; + } + + return 0; +} + +static struct p4tc_type_mask_shift * +p4t_u32_bitops(u16 bitsiz, u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + u32 mask = GENMASK(bitend, bitstart); + struct p4tc_type_mask_shift *mask_shift; + u32 *cmask; + + mask_shift = kzalloc(sizeof(*mask_shift), GFP_KERNEL); + if (!mask_shift) + return ERR_PTR(-ENOMEM); + + cmask = kzalloc(sizeof(u32), GFP_KERNEL); + if (!cmask) { + kfree(mask_shift); + return ERR_PTR(-ENOMEM); + } + + *cmask = mask; + + mask_shift->mask = cmask; + mask_shift->shift = bitstart; + + return mask_shift; +} + +static int p4t_u32_write(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + u32 *dst = dval; + u32 *src = sval; + u32 maskedst = 0; + u8 shift = 0; + + if (mask_shift) { + u32 *dmask = mask_shift->mask; + + maskedst = *dst & ~*dmask; + shift = mask_shift->shift; + } + + *dst = maskedst | (*src << shift); + + return 0; +} + +static void p4t_u32_print(struct
net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + u32 *v = val; + + pr_info("%s 0x%x\n", prefix, *v); +} + +static int p4t_u32_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + u32 *dst = dval; + u32 *src = sval; + + if (mask_shift) { + u32 *smask = mask_shift->mask; + u8 shift = mask_shift->shift; + + *dst = (*src & *smask) >> shift; + } else { + *dst = *src; + } + + return 0; +} + +/*XXX: future converting immedv to 64 bits */ +static int p4t_s32_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + s32 minsz = S32_MIN, maxsz = S32_MAX; + s32 *val = value; + + if (val && (*val > maxsz || *val < minsz)) { + NL_SET_ERR_MSG_MOD(extack, "S32 value out of range"); + return -EINVAL; + } + + return 0; +} + +static int p4t_s32_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + s32 *dst = dval; + s32 *src = sval; + + *dst = *src; + + return 0; +} + +static int p4t_s32_write(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + s32 *dst = dval; + s32 *src = sval; + + *dst = *src; + + return 0; +} + +static void p4t_s32_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + s32 *v = val; + + pr_info("%s %x\n", prefix, *v); +} + +static void p4t_s64_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + s64 *v = val; + + pr_info("%s 0x%llx\n", prefix, *v); +} + +static int p4t_be32_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + size_t container_maxsz = U32_MAX; + __u32 *val_u32 = value; + __be32 val = 0; + size_t maxval; + int ret; + + ret = p4t_validate_bitpos(bitstart, bitend, 31, 31, extack); + if (ret < 0) + return ret; + + if (value) + val = (__be32)(be32_to_cpu(*val_u32)); + + 
maxval = GENMASK(bitend, 0); + if (val && (val > container_maxsz || val > maxval)) { + NL_SET_ERR_MSG_MOD(extack, "BE32 value out of range"); + return -EINVAL; + } + + return 0; +} + +static int p4t_be32_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + u32 *dst = dval; + u32 *src = sval; + u32 readval = be32_to_cpu(*src); + + if (mask_shift) { + u32 *smask = mask_shift->mask; + u8 shift = mask_shift->shift; + + readval = (readval & *smask) >> shift; + } + + *dst = readval; + + return 0; +} + +static int p4t_be32_write(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + __be32 *dst = dval; + u32 maskedst = 0; + u32 *src = sval; + u8 shift = 0; + + if (mask_shift) { + u32 *dmask = (u32 *)mask_shift->mask; + + maskedst = *dst & ~*dmask; + shift = mask_shift->shift; + } + + *dst = cpu_to_be32(maskedst | (*src << shift)); + + return 0; +} + +static void p4t_be32_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + __be32 *v = val; + + pr_info("%s 0x%x\n", prefix, *v); +} + +static int p4t_be64_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + u64 *dst = dval; + u64 *src = sval; + u64 readval = be64_to_cpu(*src); + + if (mask_shift) { + u64 *smask = mask_shift->mask; + u8 shift = mask_shift->shift; + + readval = (readval & *smask) >> shift; + } + + *dst = readval; + + return 0; +} + +static int p4t_be64_write(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + __be64 *dst = dval; + u64 maskedst = 0; + u64 *src = sval; + u8 shift = 0; + + if (mask_shift) { + u64 *dmask = (u64 *)mask_shift->mask; + + maskedst = *dst & ~*dmask; + shift = mask_shift->shift; + } + + *dst = cpu_to_be64(maskedst | (*src << shift)); + + return 0; +} + +static void p4t_be64_print(struct net *net, struct p4tc_type *container, + const char 
*prefix, void *val) +{ + __be64 *v = val; + + pr_info("%s 0x%llx\n", prefix, *v); +} + +static int p4t_u16_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + u16 container_maxsz = U16_MAX; + u16 *val = value; + u16 maxval; + int ret; + + ret = p4t_validate_bitpos(bitstart, bitend, 15, 15, extack); + if (ret < 0) + return ret; + + maxval = GENMASK(bitend, 0); + if (val && (*val > container_maxsz || *val > maxval)) { + NL_SET_ERR_MSG_MOD(extack, "U16 value out of range"); + return -EINVAL; + } + + return 0; +} + +static struct p4tc_type_mask_shift * +p4t_u16_bitops(u16 bitsiz, u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + u16 mask = GENMASK(bitend, bitstart); + struct p4tc_type_mask_shift *mask_shift; + u16 *cmask; + + mask_shift = kzalloc(sizeof(*mask_shift), GFP_KERNEL); + if (!mask_shift) + return ERR_PTR(-ENOMEM); + + cmask = kzalloc(sizeof(u16), GFP_KERNEL); + if (!cmask) { + kfree(mask_shift); + return ERR_PTR(-ENOMEM); + } + + *cmask = mask; + + mask_shift->mask = cmask; + mask_shift->shift = bitstart; + + return mask_shift; +} + +static int p4t_u16_write(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + u16 *dst = dval; + u16 *src = sval; + u16 maskedst = 0; + u8 shift = 0; + + if (mask_shift) { + u16 *dmask = mask_shift->mask; + + maskedst = *dst & ~*dmask; + shift = mask_shift->shift; + } + + *dst = maskedst | (*src << shift); + + return 0; +} + +static void p4t_u16_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + u16 *v = val; + + pr_info("%s 0x%x\n", prefix, *v); +} + +static int p4t_u16_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + u16 *dst = dval; + u16 *src = sval; + + if (mask_shift) { + u16 *smask = mask_shift->mask; + u8 shift = mask_shift->shift; + + *dst = (*src & *smask) >> shift; + } else { + *dst = *src; + 
} + + return 0; +} + +static int p4t_s16_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + s16 minsz = S16_MIN, maxsz = S16_MAX; + s16 *val = value; + + if (val && (*val > maxsz || *val < minsz)) { + NL_SET_ERR_MSG_MOD(extack, "S16 value out of range"); + return -EINVAL; + } + + return 0; +} + +static int p4t_s16_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + s16 *dst = dval; + s16 *src = sval; + + *dst = *src; + + return 0; +} + +static int p4t_s16_write(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + s16 *dst = dval; + s16 *src = sval; + + *dst = *src; + + return 0; +} + +static void p4t_s16_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + s16 *v = val; + + pr_info("%s %d\n", prefix, *v); +} + +static int p4t_be16_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + __be16 container_maxsz = U16_MAX; + __u16 *val_u16 = value; + __be16 val = 0; + size_t maxval; + int ret; + + ret = p4t_validate_bitpos(bitstart, bitend, 15, 15, extack); + if (ret < 0) + return ret; + + if (value) + val = (__be16)(be16_to_cpu(*val_u16)); + + maxval = GENMASK(bitend, 0); + if (val && (val > container_maxsz || val > maxval)) { + NL_SET_ERR_MSG_MOD(extack, "BE16 value out of range"); + return -EINVAL; + } + + return 0; +} + +static int p4t_be16_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + u16 *dst = dval; + u16 *src = sval; + u16 readval = be16_to_cpu(*src); + + if (mask_shift) { + u16 *smask = mask_shift->mask; + u8 shift = mask_shift->shift; + + readval = (readval & *smask) >> shift; + } + + *dst = readval; + + return 0; +} + +static int p4t_be16_write(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, +
void *dval) +{ + __be16 *dst = dval; + u16 maskedst = 0; + u16 *src = sval; + u8 shift = 0; + + if (mask_shift) { + u16 *dmask = (u16 *)mask_shift->mask; + + maskedst = *dst & ~*dmask; + shift = mask_shift->shift; + } + + *dst = cpu_to_be16(maskedst | (*src << shift)); + + return 0; +} + +static void p4t_be16_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + __be16 *v = val; + + pr_info("%s 0x%x\n", prefix, *v); +} + +static int p4t_u8_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + u8 *val = value; + size_t container_maxsz = U8_MAX; + u8 maxval; + int ret; + + ret = p4t_validate_bitpos(bitstart, bitend, 7, 7, extack); + if (ret < 0) + return ret; + + maxval = GENMASK(bitend, 0); + if (val && (*val > container_maxsz || *val > maxval)) { + NL_SET_ERR_MSG_MOD(extack, "U8 value out of range"); + return -EINVAL; + } + + return 0; +} + +static struct p4tc_type_mask_shift * +p4t_u8_bitops(u16 bitsiz, u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + u8 mask = GENMASK(bitend, bitstart); + struct p4tc_type_mask_shift *mask_shift; + u8 *cmask; + + mask_shift = kzalloc(sizeof(*mask_shift), GFP_KERNEL); + if (!mask_shift) + return ERR_PTR(-ENOMEM); + + cmask = kzalloc(sizeof(u8), GFP_KERNEL); + if (!cmask) { + kfree(mask_shift); + return ERR_PTR(-ENOMEM); + } + + *cmask = mask; + + mask_shift->mask = cmask; + mask_shift->shift = bitstart; + + return mask_shift; +} + +static int p4t_u8_write(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + u8 *dst = dval; + u8 *src = sval; + u8 maskedst = 0; + u8 shift = 0; + + if (mask_shift) { + u8 *dmask = (u8 *)mask_shift->mask; + + maskedst = *dst & ~*dmask; + shift = mask_shift->shift; + } + + *dst = maskedst | (*src << shift); + + return 0; +} + +static void p4t_u8_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + u8 *v = 
val; + + pr_info("%s 0x%x\n", prefix, *v); +} + +static int p4t_u8_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + u8 *dst = dval; + u8 *src = sval; + + if (mask_shift) { + u8 *smask = mask_shift->mask; + u8 shift = mask_shift->shift; + + *dst = (*src & *smask) >> shift; + } else { + *dst = *src; + } + + return 0; +} + +static int p4t_s8_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + s8 minsz = S8_MIN, maxsz = S8_MAX; + s8 *val = value; + + if (val && (*val > maxsz || *val < minsz)) { + NL_SET_ERR_MSG_MOD(extack, "S8 value out of range"); + return -EINVAL; + } + + return 0; +} + +static int p4t_s8_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + s8 *dst = dval; + s8 *src = sval; + + *dst = *src; + + return 0; +} + +static void p4t_s8_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + s8 *v = val; + + pr_info("%s %d\n", prefix, *v); +} + +static int p4t_u64_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + u64 container_maxsz = U64_MAX; + u64 *val = value; + u64 maxval; + int ret; + + ret = p4t_validate_bitpos(bitstart, bitend, 63, 63, extack); + if (ret < 0) + return ret; + + maxval = GENMASK_ULL(bitend, 0); + if (val && (*val > container_maxsz || *val > maxval)) { + NL_SET_ERR_MSG_MOD(extack, "U64 value out of range"); + return -EINVAL; + } + + return 0; +} + +static struct p4tc_type_mask_shift * +p4t_u64_bitops(u16 bitsiz, u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + u64 mask = GENMASK_ULL(bitend, bitstart); + struct p4tc_type_mask_shift *mask_shift; + u64 *cmask; + + mask_shift = kzalloc(sizeof(*mask_shift), GFP_KERNEL); + if (!mask_shift) + return ERR_PTR(-ENOMEM); + + cmask = kzalloc(sizeof(u64), GFP_KERNEL); + if (!cmask) { + kfree(mask_shift); +
return ERR_PTR(-ENOMEM); + } + + *cmask = mask; + + mask_shift->mask = cmask; + mask_shift->shift = bitstart; + + return mask_shift; +} + +static int p4t_u64_write(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + u64 *dst = dval; + u64 *src = sval; + u64 maskedst = 0; + u8 shift = 0; + + if (mask_shift) { + u64 *dmask = (u64 *)mask_shift->mask; + + maskedst = *dst & ~*dmask; + shift = mask_shift->shift; + } + + *dst = maskedst | (*src << shift); + + return 0; +} + +static void p4t_u64_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + u64 *v = val; + + pr_info("%s 0x%llx\n", prefix, *v); +} + +static int p4t_u64_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + u64 *dst = dval; + u64 *src = sval; + + if (mask_shift) { + u64 *smask = mask_shift->mask; + u8 shift = mask_shift->shift; + + *dst = (*src & *smask) >> shift; + } else { + *dst = *src; + } + + return 0; +} + +/* As of now, we are not allowing bitops for u128 */ +static int p4t_u128_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + if (bitstart != 0 || bitend != 127) { + NL_SET_ERR_MSG_MOD(extack, + "Only valid bit type larger than bit64 is bit128"); + return -EINVAL; + } + + return 0; +} + +static int p4t_u128_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + memcpy(dval, sval, sizeof(__u64) * 2); + + return 0; +} + +static int p4t_u128_write(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + memcpy(dval, sval, sizeof(__u64) * 2); + + return 0; +} + +static void p4t_u128_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + u64 *v = val; + + pr_info("%s[0-63] %16llx\n", prefix, v[0]); + pr_info("%s[64-127] %16llx\n", prefix, v[1]); +} + +static int
p4t_ipv4_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + /* Not allowing bit-slices for now */ + if (bitstart != 0 || bitend != 31) { + NL_SET_ERR_MSG_MOD(extack, "Invalid bitstart or bitend"); + return -EINVAL; + } + + return 0; +} + +static void p4t_ipv4_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + __be32 v32 = cpu_to_be32(*(u32 *)val); + u8 *v = (u8 *)&v32; + + pr_info("%s %u.%u.%u.%u\n", prefix, v[0], v[1], v[2], v[3]); +} + +static int p4t_mac_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + if (bitstart != 0 || bitend != 47) { + NL_SET_ERR_MSG_MOD(extack, "Invalid bitstart or bitend"); + return -EINVAL; + } + + return 0; +} + +static void p4t_mac_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + u8 *v = val; + + pr_info("%s %02x:%02x:%02x:%02x:%02x:%02x\n", prefix, v[0], v[1], v[2], + v[3], v[4], v[5]); +} + +static int p4t_dev_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + if (bitstart != 0 || bitend != 31) { + NL_SET_ERR_MSG_MOD(extack, "Invalid start or endbit values"); + return -EINVAL; + } + + return 0; +} + +static int p4t_dev_write(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + u32 *src = sval; + u32 *dst = dval; + + *dst = *src; + + return 0; +} + +static int p4t_dev_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + u32 *src = sval; + u32 *dst = dval; + + *dst = *src; + + return 0; +} + +static void p4t_dev_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + const u32 *ifindex = val; + struct net_device *dev = dev_get_by_index_rcu(net, *ifindex); + + if (dev) + pr_info("%s %s\n", prefix, dev->name); +} + +static int
p4t_key_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + memcpy(dval, sval, BITS_TO_BYTES(container->bitsz)); + + return 0; +} + +static int p4t_key_write(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + memcpy(dval, sval, BITS_TO_BYTES(container->bitsz)); + + return 0; +} + +static void p4t_key_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + u64 *v = val; + u16 bitstart = 0, bitend = 63; + int i; + + for (i = 0; i < BITS_TO_U64(container->bitsz); i++) { + pr_info("%s[%u-%u] %16llx\n", prefix, bitstart, bitend, v[i]); + bitstart += 64; + bitend += 64; + } +} + +static int p4t_key_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + if (p4t_validate_bitpos(bitstart, bitend, 0, P4TC_MAX_KEYSZ, extack)) + return -EINVAL; + + return 0; +} + +static int p4t_bool_validate(struct p4tc_type *container, void *value, + u16 bitstart, u16 bitend, + struct netlink_ext_ack *extack) +{ + bool *val = value; + int ret; + + ret = p4t_validate_bitpos(bitstart, bitend, 31, 31, extack); + if (ret < 0) + return ret; + + if (!val || *val == true || *val == false) + return 0; + + return -EINVAL; +} + +static int p4t_bool_hread(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + bool *dst = dval; + bool *src = sval; + + *dst = *src; + + return 0; +} + +static int p4t_bool_write(struct p4tc_type *container, + struct p4tc_type_mask_shift *mask_shift, void *sval, + void *dval) +{ + bool *dst = dval; + bool *src = sval; + + *dst = *src; + + return 0; +} + +static void p4t_bool_print(struct net *net, struct p4tc_type *container, + const char *prefix, void *val) +{ + bool *v = val; + + pr_info("%s %s\n", prefix, *v ?
"true" : "false"); +} + +static struct p4tc_type_ops u8_ops = { + .validate_p4t = p4t_u8_validate, + .create_bitops = p4t_u8_bitops, + .host_read = p4t_u8_hread, + .host_write = p4t_u8_write, + .print = p4t_u8_print, +}; + +static struct p4tc_type_ops u16_ops = { + .validate_p4t = p4t_u16_validate, + .create_bitops = p4t_u16_bitops, + .host_read = p4t_u16_hread, + .host_write = p4t_u16_write, + .print = p4t_u16_print, +}; + +static struct p4tc_type_ops u32_ops = { + .validate_p4t = p4t_u32_validate, + .create_bitops = p4t_u32_bitops, + .host_read = p4t_u32_hread, + .host_write = p4t_u32_write, + .print = p4t_u32_print, +}; + +static struct p4tc_type_ops u64_ops = { + .validate_p4t = p4t_u64_validate, + .create_bitops = p4t_u64_bitops, + .host_read = p4t_u64_hread, + .host_write = p4t_u64_write, + .print = p4t_u64_print, +}; + +static struct p4tc_type_ops u128_ops = { + .validate_p4t = p4t_u128_validate, + .host_read = p4t_u128_hread, + .host_write = p4t_u128_write, + .print = p4t_u128_print, +}; + +static struct p4tc_type_ops s8_ops = { + .validate_p4t = p4t_s8_validate, + .host_read = p4t_s8_hread, + .print = p4t_s8_print, +}; + +static struct p4tc_type_ops s16_ops = { + .validate_p4t = p4t_s16_validate, + .host_read = p4t_s16_hread, + .host_write = p4t_s16_write, + .print = p4t_s16_print, +}; + +static struct p4tc_type_ops s32_ops = { + .validate_p4t = p4t_s32_validate, + .host_read = p4t_s32_hread, + .host_write = p4t_s32_write, + .print = p4t_s32_print, +}; + +static struct p4tc_type_ops s64_ops = { + .print = p4t_s64_print, +}; + +static struct p4tc_type_ops s128_ops = {}; + +static struct p4tc_type_ops be16_ops = { + .validate_p4t = p4t_be16_validate, + .create_bitops = p4t_u16_bitops, + .host_read = p4t_be16_hread, + .host_write = p4t_be16_write, + .print = p4t_be16_print, +}; + +static struct p4tc_type_ops be32_ops = { + .validate_p4t = p4t_be32_validate, + .create_bitops = p4t_u32_bitops, + .host_read = p4t_be32_hread, + .host_write = p4t_be32_write, + 
.print = p4t_be32_print, +}; + +static struct p4tc_type_ops be64_ops = { + .validate_p4t = p4t_u64_validate, + .host_read = p4t_be64_hread, + .host_write = p4t_be64_write, + .print = p4t_be64_print, +}; + +static struct p4tc_type_ops string_ops = {}; +static struct p4tc_type_ops nullstring_ops = {}; + +static struct p4tc_type_ops flag_ops = {}; +static struct p4tc_type_ops path_ops = {}; +static struct p4tc_type_ops msecs_ops = {}; +static struct p4tc_type_ops mac_ops = { + .validate_p4t = p4t_mac_validate, + .create_bitops = p4t_u64_bitops, + .host_read = p4t_u64_hread, + .host_write = p4t_u64_write, + .print = p4t_mac_print, +}; + +static struct p4tc_type_ops ipv4_ops = { + .validate_p4t = p4t_ipv4_validate, + .host_read = p4t_be32_hread, + .host_write = p4t_be32_write, + .print = p4t_ipv4_print, +}; + +static struct p4tc_type_ops bool_ops = { + .validate_p4t = p4t_bool_validate, + .host_read = p4t_bool_hread, + .host_write = p4t_bool_write, + .print = p4t_bool_print, +}; + +static struct p4tc_type_ops dev_ops = { + .validate_p4t = p4t_dev_validate, + .host_read = p4t_dev_hread, + .host_write = p4t_dev_write, + .print = p4t_dev_print, +}; + +static struct p4tc_type_ops key_ops = { + .validate_p4t = p4t_key_validate, + .host_read = p4t_key_hread, + .host_write = p4t_key_write, + .print = p4t_key_print, +}; + +static int __p4tc_do_regtype(int typeid, size_t bitsz, size_t container_bitsz, + const char *t_name, struct p4tc_type_ops *ops) +{ + struct p4tc_type *type; + int err; + + if (typeid > P4T_MAX) + return -EINVAL; + + if (p4type_find_byid(typeid) || p4type_find_byname(t_name)) + return -EEXIST; + + if (bitsz > P4T_MAX_BITSZ) + return -E2BIG; + + if (container_bitsz > P4T_MAX_BITSZ) + return -E2BIG; + + type = kzalloc(sizeof(*type), GFP_ATOMIC); + if (!type) + return -ENOMEM; + + err = idr_alloc_u32(&p4tc_types_idr, type, &typeid, typeid, GFP_ATOMIC); + if (err < 0) { + /* Don't leak the freshly allocated type on IDR failure */ + kfree(type); + return err; + } + + strscpy(type->name, t_name, P4T_MAX_STR_SZ); + type->typeid = typeid; + 
type->bitsz = bitsz; + type->container_bitsz = container_bitsz; + type->ops = ops; + + return 0; +} + +static inline int __p4tc_register_type(int typeid, size_t bitsz, + size_t container_bitsz, + const char *t_name, + struct p4tc_type_ops *ops) +{ + if (__p4tc_do_regtype(typeid, bitsz, container_bitsz, t_name, ops) < + 0) { + pr_err("Unable to allocate p4 type %s\n", t_name); + p4tc_types_put(); + return -1; + } + + return 0; +} + +#define p4tc_register_type(...) \ + do { \ + if (__p4tc_register_type(__VA_ARGS__) < 0) \ + return -1; \ + } while (0) + +int p4tc_register_types(void) +{ + p4tc_register_type(P4T_U8, 8, 8, "u8", &u8_ops); + p4tc_register_type(P4T_U16, 16, 16, "u16", &u16_ops); + p4tc_register_type(P4T_U32, 32, 32, "u32", &u32_ops); + p4tc_register_type(P4T_U64, 64, 64, "u64", &u64_ops); + p4tc_register_type(P4T_U128, 128, 128, "u128", &u128_ops); + p4tc_register_type(P4T_S8, 8, 8, "s8", &s8_ops); + p4tc_register_type(P4T_BE16, 16, 16, "be16", &be16_ops); + p4tc_register_type(P4T_BE32, 32, 32, "be32", &be32_ops); + p4tc_register_type(P4T_BE64, 64, 64, "be64", &be64_ops); + p4tc_register_type(P4T_S16, 16, 16, "s16", &s16_ops); + p4tc_register_type(P4T_S32, 32, 32, "s32", &s32_ops); + p4tc_register_type(P4T_S64, 64, 64, "s64", &s64_ops); + p4tc_register_type(P4T_S128, 128, 128, "s128", &s128_ops); + p4tc_register_type(P4T_STRING, P4T_MAX_STR_SZ * 4, P4T_MAX_STR_SZ * 4, + "string", &string_ops); + p4tc_register_type(P4T_NUL_STRING, P4T_MAX_STR_SZ * 4, + P4T_MAX_STR_SZ * 4, "nullstr", &nullstring_ops); + p4tc_register_type(P4T_FLAG, 32, 32, "flag", &flag_ops); + p4tc_register_type(P4T_PATH, 0, 0, "path", &path_ops); + p4tc_register_type(P4T_MSECS, 0, 0, "msecs", &msecs_ops); + p4tc_register_type(P4T_MACADDR, 48, 64, "mac", &mac_ops); + p4tc_register_type(P4T_IPV4ADDR, 32, 32, "ipv4", &ipv4_ops); + p4tc_register_type(P4T_BOOL, 32, 32, "bool", &bool_ops); + p4tc_register_type(P4T_DEV, 32, 32, "dev", &dev_ops); + p4tc_register_type(P4T_KEY, P4TC_MAX_KEYSZ, 
P4TC_MAX_KEYSZ, "key", + &key_ops); + + return 0; +} + +void p4tc_unregister_types(void) +{ + p4tc_types_put(); +}

From patchwork Tue Jan 24 17:05:02 2023
X-Patchwork-Submitter: Jamal Hadi Salim
X-Patchwork-Id: 13114391
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Cc: kernel@mojatatu.com, deb.chatterjee@intel.com, anjali.singhai@intel.com, namrata.limaye@intel.com, khalidm@nvidia.com, tom@sipanda.io, pratyush@sipanda.io, jiri@resnulli.us, xiyou.wangcong@gmail.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, vladbu@nvidia.com, simon.horman@corigine.com
Subject: [PATCH net-next RFC 12/20] p4tc: add pipeline create, get, update, delete
Date: Tue, 24 Jan 2023 12:05:02 -0500
Message-Id: <20230124170510.316970-12-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>
References: <20230124170510.316970-1-jhs@mojatatu.com>
X-Patchwork-Delegate: kuba@kernel.org
X-Patchwork-State: RFC

__Introducing P4 TC Pipeline__

This commit introduces P4 TC pipelines, which emulate the semantics of a P4 program/pipeline using the TC infrastructure. One can refer to P4 programs/pipelines using their names or their specific pipeline IDs (pipeid).

CRUD (Create, Read/get, Update and Delete) commands apply to a pipeline. As an example, to create a P4 program/pipeline named aP4proggie with a single table in its pipeline, one would use the following command from user space tc:

tc p4template create pipeline/aP4proggie numtables 1

Note that, in the above command, numtables is set to 1; the default is 0 because it is feasible to have a P4 program with no tables at all. The kernel issues each pipeline a pipeline ID which can be referenced later.
The control plane can specify an ID of choice, for example:

tc p4template create pipeline/aP4proggie pipeid 1 numtables 1

Typically there is no good reason to specify the pipeid, but the choice is offered to the user.

To Read pipeline aP4proggie's attributes, one would use:

tc p4template get pipeline/[aP4proggie] [pipeid 1]

To Update the aP4proggie pipeline from 1 to 10 tables, one would use the following command:

tc p4template update pipeline/[aP4proggie] [pipeid 1] numtables 10

Note that, in the above command, one could use the P4 program/pipeline name, id or both to specify which P4 program/pipeline to update.

To Delete a P4 program/pipeline named aP4proggie with a pipeid of 1, one would use the following command:

tc p4template del pipeline/[aP4proggie] [pipeid 1]

Again, one could use the P4 program/pipeline name, id or both to specify which P4 program/pipeline to delete.

To dump all the created P4 programs/pipelines, one would use the following command:

tc p4template get pipeline/

__Pipeline Lifetime__

After Create is issued, one can Read/get, Update and Delete; however, the pipeline can only be put to use after it is "sealed". To seal a pipeline, one would issue the following command:

tc p4template update pipeline/aP4proggie state ready

Once the pipeline is sealed it cannot be updated; it can still be read and deleted. After a pipeline is sealed it can be put to use via the TC P4 classifier. For example:

tc filter add dev $DEV ingress protocol ip prio 6 p4 pname aP4proggie

instantiates aP4proggie on the ingress of $DEV. One could also attach it to a block of ports (for example tc block 22):

tc filter add block 22 ingress protocol ip prio 6 p4 pname aP4proggie

Once the pipeline is attached to a device or block it can no longer be deleted; it becomes read-only from the control plane/user space. The pipeline can be deleted once it has no remaining users.
__Packet Flow__

Pipelines have pre- and post-actions, which are defined by the template. Pipeline preactions are actions which are executed when a packet arrives at the P4TC pipeline. Postactions are tc actions which are executed at the very end of the pipeline and will usually carry out part of the verdict decided by the pipeline processing, such as redirecting, mirroring, dropping, etc.

A P4 pipeline is instantiated via the tc filter known as "p4", for example:

tc filter add dev $DEV ingress protocol ip prio 6 p4 pname myprog

When a packet arrives at the filter it will first hit the pipeline preaction. Typically the pipeline preaction will execute the "apply" stanza of the P4 program. For example, the following apply logic:

apply {
    if (meta.common.direction == ingress && hdrs.ipv4.isValid()) {
        mytable.apply();
    }
}

maps to:

tc p4template create action/myprog/PPREA \
    cmd beq metadata.kernel.direction constant.bit1.1 \
    control pipe / jump endif \
    cmd beq hdrfield.myprog.parser1.ipv4.isValid constant.bit1.1 \
    control pipe / jump endif \
    cmd tableapply table.myprog.cb/mytable \
    cmd label endif

Then bind it:

tc p4template update pipeline/myprog preactions action myprog/PPREA

A postaction is invoked after all the tables (if any) have been "applied" by the pipeline preaction.
Example of postaction: tc p4template create action/myprog/PPOA \ cmd beq metadata.myprog.global/drop constant.bit1.1 control drop / pipe \ cmd send_port_egress metadata.myprog.output_port tc p4template update pipeline/myprog postactions action myprog/PPOA Co-developed-by: Victor Nogueira Signed-off-by: Victor Nogueira Co-developed-by: Pedro Tammela Signed-off-by: Pedro Tammela Signed-off-by: Jamal Hadi Salim --- include/net/p4tc.h | 131 ++++++ include/uapi/linux/p4tc.h | 68 +++ include/uapi/linux/rtnetlink.h | 7 + net/sched/p4tc/Makefile | 2 +- net/sched/p4tc/p4tc_pipeline.c | 754 +++++++++++++++++++++++++++++++++ net/sched/p4tc/p4tc_tmpl_api.c | 586 +++++++++++++++++++++++++ security/selinux/nlmsgtab.c | 5 +- 7 files changed, 1551 insertions(+), 2 deletions(-) create mode 100644 include/net/p4tc.h create mode 100644 net/sched/p4tc/p4tc_pipeline.c create mode 100644 net/sched/p4tc/p4tc_tmpl_api.c diff --git a/include/net/p4tc.h b/include/net/p4tc.h new file mode 100644 index 000000000..178bbdf68 --- /dev/null +++ b/include/net/p4tc.h @@ -0,0 +1,131 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __NET_P4TC_H +#define __NET_P4TC_H + +#include +#include +#include +#include +#include +#include +#include + +#define P4TC_DEFAULT_NUM_TABLES P4TC_MINTABLES_COUNT +#define P4TC_DEFAULT_MAX_RULES 1 +#define P4TC_PATH_MAX 3 + +#define P4TC_KERNEL_PIPEID 0 + +#define P4TC_PID_IDX 0 + +struct p4tc_dump_ctx { + u32 ids[P4TC_PATH_MAX]; +}; + +struct p4tc_template_common; + +/* Redefine these macros to avoid -Wenum-compare warnings */ + +#define __P4T_IS_UINT_TYPE(tp) \ + (tp == P4T_U8 || tp == P4T_U16 || tp == P4T_U32 || tp == P4T_U64) + +#define P4T_ENSURE_UINT_OR_BINARY_TYPE(tp) \ + (__NLA_ENSURE(__P4T_IS_UINT_TYPE(tp) || tp == P4T_MSECS || \ + tp == P4T_BINARY) + \ + tp) + +#define P4T_POLICY_RANGE(tp, _min, _max) \ + { \ + .type = P4T_ENSURE_UINT_OR_BINARY_TYPE(tp), \ + .validation_type = NLA_VALIDATE_RANGE, .min = _min, \ + .max = _max, \ + } + +struct p4tc_nl_pname { + 
char *data; + bool passed; +}; + +struct p4tc_template_ops { + void (*init)(void); + struct p4tc_template_common *(*cu)(struct net *net, struct nlmsghdr *n, + struct nlattr *nla, + struct p4tc_nl_pname *nl_pname, + u32 *ids, + struct netlink_ext_ack *extack); + int (*put)(struct net *net, struct p4tc_template_common *tmpl, + bool unconditional_purge, struct netlink_ext_ack *extack); + int (*gd)(struct net *net, struct sk_buff *skb, struct nlmsghdr *n, + struct nlattr *nla, struct p4tc_nl_pname *nl_pname, u32 *ids, + struct netlink_ext_ack *extack); + int (*fill_nlmsg)(struct net *net, struct sk_buff *skb, + struct p4tc_template_common *tmpl, + struct netlink_ext_ack *extack); + int (*dump)(struct sk_buff *skb, struct p4tc_dump_ctx *ctx, + struct nlattr *nla, char **p_name, u32 *ids, + struct netlink_ext_ack *extack); + int (*dump_1)(struct sk_buff *skb, struct p4tc_template_common *common); +}; + +struct p4tc_template_common { + char name[TEMPLATENAMSZ]; + struct p4tc_template_ops *ops; + u32 p_id; + u32 PAD0; +}; + +extern const struct p4tc_template_ops p4tc_pipeline_ops; + +struct p4tc_pipeline { + struct p4tc_template_common common; + struct rcu_head rcu; + struct net *net; + struct tc_action **preacts; + int num_preacts; + struct tc_action **postacts; + int num_postacts; + u32 max_rules; + refcount_t p_ref; + refcount_t p_ctrl_ref; + u16 num_tables; + u16 curr_tables; + u8 p_state; +}; + +struct p4tc_pipeline_net { + struct idr pipeline_idr; +}; + +int tcf_p4_tmpl_generic_dump(struct sk_buff *skb, struct p4tc_dump_ctx *ctx, + struct idr *idr, int idx, + struct netlink_ext_ack *extack); + +struct p4tc_pipeline *tcf_pipeline_find_byany(struct net *net, + const char *p_name, + const u32 pipeid, + struct netlink_ext_ack *extack); +struct p4tc_pipeline *tcf_pipeline_find_byid(struct net *net, const u32 pipeid); +struct p4tc_pipeline *tcf_pipeline_get(struct net *net, const char *p_name, + const u32 pipeid, + struct netlink_ext_ack *extack); +void 
__tcf_pipeline_put(struct p4tc_pipeline *pipeline); +struct p4tc_pipeline * +tcf_pipeline_find_byany_unsealed(struct net *net, const char *p_name, + const u32 pipeid, + struct netlink_ext_ack *extack); + +static inline int p4tc_action_destroy(struct tc_action **acts) +{ + int ret = 0; + + if (acts) { + ret = tcf_action_destroy(acts, TCA_ACT_UNBIND); + kfree(acts); + } + + return ret; +} + +#define to_pipeline(t) ((struct p4tc_pipeline *)t) + +#endif diff --git a/include/uapi/linux/p4tc.h b/include/uapi/linux/p4tc.h index 2b6f126db..739c0fe18 100644 --- a/include/uapi/linux/p4tc.h +++ b/include/uapi/linux/p4tc.h @@ -2,8 +2,73 @@ #ifndef __LINUX_P4TC_H #define __LINUX_P4TC_H +#include +#include + +/* pipeline header */ +struct p4tcmsg { + __u32 pipeid; + __u32 obj; +}; + +#define P4TC_MAXPIPELINE_COUNT 32 +#define P4TC_MAXRULES_LIMIT 512 +#define P4TC_MAXTABLES_COUNT 32 +#define P4TC_MINTABLES_COUNT 0 +#define P4TC_MAXPARSE_KEYS 16 +#define P4TC_MAXMETA_SZ 128 +#define P4TC_MSGBATCH_SIZE 16 + #define P4TC_MAX_KEYSZ 512 +#define TEMPLATENAMSZ 256 +#define PIPELINENAMSIZ TEMPLATENAMSZ + +/* Root attributes */ +enum { + P4TC_ROOT_UNSPEC, + P4TC_ROOT, /* nested messages */ + P4TC_ROOT_PNAME, /* string */ + __P4TC_ROOT_MAX, +}; +#define P4TC_ROOT_MAX __P4TC_ROOT_MAX + +/* PIPELINE attributes */ +enum { + P4TC_PIPELINE_UNSPEC, + P4TC_PIPELINE_MAXRULES, /* u32 */ + P4TC_PIPELINE_NUMTABLES, /* u16 */ + P4TC_PIPELINE_STATE, /* u8 */ + P4TC_PIPELINE_PREACTIONS, /* nested preactions */ + P4TC_PIPELINE_POSTACTIONS, /* nested postactions */ + P4TC_PIPELINE_NAME, /* string only used for pipeline dump */ + __P4TC_PIPELINE_MAX +}; +#define P4TC_PIPELINE_MAX __P4TC_PIPELINE_MAX + +/* P4 Object types */ +enum { + P4TC_OBJ_UNSPEC, + P4TC_OBJ_PIPELINE, + __P4TC_OBJ_MAX, +}; +#define P4TC_OBJ_MAX __P4TC_OBJ_MAX + +/* P4 attributes */ +enum { + P4TC_UNSPEC, + P4TC_PATH, + P4TC_PARAMS, + __P4TC_MAX, +}; +#define P4TC_MAX __P4TC_MAX + +/* PIPELINE states */ +enum { + P4TC_STATE_NOT_READY, + 
P4TC_STATE_READY, +}; + enum { P4T_UNSPEC, P4T_U8 = 1, /* NLA_U8 */ @@ -37,4 +102,7 @@ enum { }; #define P4T_MAX (__P4T_MAX - 1) +#define P4TC_RTA(r) \ + ((struct rtattr *)(((char *)(r)) + NLMSG_ALIGN(sizeof(struct p4tcmsg)))) + #endif diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index 25a0af57d..62f0f5c90 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -194,6 +194,13 @@ enum { RTM_GETTUNNEL, #define RTM_GETTUNNEL RTM_GETTUNNEL + RTM_CREATEP4TEMPLATE = 124, +#define RTM_CREATEP4TEMPLATE RTM_CREATEP4TEMPLATE + RTM_DELP4TEMPLATE, +#define RTM_DELP4TEMPLATE RTM_DELP4TEMPLATE + RTM_GETP4TEMPLATE, +#define RTM_GETP4TEMPLATE RTM_GETP4TEMPLATE + __RTM_MAX, #define RTM_MAX (((__RTM_MAX + 3) & ~3) - 1) }; diff --git a/net/sched/p4tc/Makefile b/net/sched/p4tc/Makefile index dd1358c9e..0881a7563 100644 --- a/net/sched/p4tc/Makefile +++ b/net/sched/p4tc/Makefile @@ -1,3 +1,3 @@ # SPDX-License-Identifier: GPL-2.0 -obj-y := p4tc_types.o +obj-y := p4tc_types.o p4tc_tmpl_api.o p4tc_pipeline.o diff --git a/net/sched/p4tc/p4tc_pipeline.c b/net/sched/p4tc/p4tc_pipeline.c new file mode 100644 index 000000000..c6c49ab71 --- /dev/null +++ b/net/sched/p4tc/p4tc_pipeline.c @@ -0,0 +1,754 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * net/sched/p4tc_pipeline.c P4 TC PIPELINE + * + * Copyright (c) 2022, Mojatatu Networks + * Copyright (c) 2022, Intel Corporation. 
+ * Authors: Jamal Hadi Salim + * Victor Nogueira + * Pedro Tammela + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static unsigned int pipeline_net_id; +static struct p4tc_pipeline *root_pipeline; + +static __net_init int pipeline_init_net(struct net *net) +{ + struct p4tc_pipeline_net *pipe_net = net_generic(net, pipeline_net_id); + + idr_init(&pipe_net->pipeline_idr); + + return 0; +} + +static int tcf_pipeline_put(struct net *net, + struct p4tc_template_common *template, + bool unconditional_purgeline, + struct netlink_ext_ack *extack); + +static void __net_exit pipeline_exit_net(struct net *net) +{ + struct p4tc_pipeline_net *pipe_net; + struct p4tc_pipeline *pipeline; + unsigned long pipeid, tmp; + + rtnl_lock(); + pipe_net = net_generic(net, pipeline_net_id); + idr_for_each_entry_ul(&pipe_net->pipeline_idr, pipeline, tmp, pipeid) { + tcf_pipeline_put(net, &pipeline->common, true, NULL); + } + idr_destroy(&pipe_net->pipeline_idr); + rtnl_unlock(); +} + +static struct pernet_operations pipeline_net_ops = { + .init = pipeline_init_net, + .pre_exit = pipeline_exit_net, + .id = &pipeline_net_id, + .size = sizeof(struct p4tc_pipeline_net), +}; + +static const struct nla_policy tc_pipeline_policy[P4TC_PIPELINE_MAX + 1] = { + [P4TC_PIPELINE_MAXRULES] = + NLA_POLICY_RANGE(NLA_U32, 1, P4TC_MAXRULES_LIMIT), + [P4TC_PIPELINE_NUMTABLES] = + NLA_POLICY_RANGE(NLA_U16, P4TC_MINTABLES_COUNT, P4TC_MAXTABLES_COUNT), + [P4TC_PIPELINE_STATE] = { .type = NLA_U8 }, + [P4TC_PIPELINE_PREACTIONS] = { .type = NLA_NESTED }, + [P4TC_PIPELINE_POSTACTIONS] = { .type = NLA_NESTED }, +}; + +static void tcf_pipeline_destroy(struct p4tc_pipeline *pipeline, + bool free_pipeline) +{ + if (free_pipeline) + kfree(pipeline); +} + +static void tcf_pipeline_destroy_rcu(struct rcu_head *head) +{ + struct p4tc_pipeline *pipeline; + struct net *net; + + pipeline 
= container_of(head, struct p4tc_pipeline, rcu); + + net = pipeline->net; + tcf_pipeline_destroy(pipeline, true); + put_net(net); +} + +static int tcf_pipeline_put(struct net *net, + struct p4tc_template_common *template, + bool unconditional_purgeline, + struct netlink_ext_ack *extack) +{ + struct p4tc_pipeline_net *pipe_net = net_generic(net, pipeline_net_id); + struct p4tc_pipeline *pipeline = to_pipeline(template); + struct net *pipeline_net = maybe_get_net(net); + + if (pipeline_net && !refcount_dec_if_one(&pipeline->p_ref)) { + NL_SET_ERR_MSG(extack, "Can't delete referenced pipeline"); + return -EBUSY; + } + + idr_remove(&pipe_net->pipeline_idr, pipeline->common.p_id); + + /* XXX: The action fields are only accessed in the control path + * since they will be copied to the filter, where the data path + * will use them. So there is no need to free them in the rcu + * callback. We can just free them here + */ + p4tc_action_destroy(pipeline->preacts); + p4tc_action_destroy(pipeline->postacts); + + if (pipeline_net) + call_rcu(&pipeline->rcu, tcf_pipeline_destroy_rcu); + else + tcf_pipeline_destroy(pipeline, + refcount_read(&pipeline->p_ref) == 1); + + return 0; +} + +static inline int pipeline_try_set_state_ready(struct p4tc_pipeline *pipeline, + struct netlink_ext_ack *extack) +{ + if (pipeline->curr_tables != pipeline->num_tables) { + NL_SET_ERR_MSG(extack, + "Must have all tables defined to update state to ready"); + return -EINVAL; + } + + if (!pipeline->preacts) { + NL_SET_ERR_MSG(extack, + "Must specify pipeline preactions before sealing"); + return -EINVAL; + } + + if (!pipeline->postacts) { + NL_SET_ERR_MSG(extack, + "Must specify pipeline postactions before sealing"); + return -EINVAL; + } + + pipeline->p_state = P4TC_STATE_READY; + return 0; +} + +static inline bool pipeline_sealed(struct p4tc_pipeline *pipeline) +{ + return pipeline->p_state == P4TC_STATE_READY; +} + +static int p4tc_action_init(struct net *net, struct nlattr *nla, + struct 
tc_action *acts[], u32 pipeid, u32 flags, + struct netlink_ext_ack *extack) +{ + int init_res[TCA_ACT_MAX_PRIO]; + size_t attrs_size; + int ret; + + /* If action was already created, just bind to existing one*/ + flags = TCA_ACT_FLAGS_BIND; + ret = tcf_action_init(net, NULL, nla, NULL, acts, init_res, &attrs_size, + flags, 0, extack); + + return ret; +} + +struct p4tc_pipeline *tcf_pipeline_find_byid(struct net *net, const u32 pipeid) +{ + struct p4tc_pipeline_net *pipe_net; + + if (pipeid == P4TC_KERNEL_PIPEID) + return root_pipeline; + + pipe_net = net_generic(net, pipeline_net_id); + + return idr_find(&pipe_net->pipeline_idr, pipeid); +} + +static struct p4tc_pipeline *tcf_pipeline_find_byname(struct net *net, + const char *name) +{ + struct p4tc_pipeline_net *pipe_net = net_generic(net, pipeline_net_id); + struct p4tc_pipeline *pipeline; + unsigned long tmp, id; + + idr_for_each_entry_ul(&pipe_net->pipeline_idr, pipeline, tmp, id) { + /* Don't show kernel pipeline */ + if (id == P4TC_KERNEL_PIPEID) + continue; + if (strncmp(pipeline->common.name, name, PIPELINENAMSIZ) == 0) + return pipeline; + } + + return NULL; +} + +static struct p4tc_pipeline *tcf_pipeline_create(struct net *net, + struct nlmsghdr *n, + struct nlattr *nla, + const char *p_name, u32 pipeid, + struct netlink_ext_ack *extack) +{ + struct p4tc_pipeline_net *pipe_net = net_generic(net, pipeline_net_id); + int ret = 0; + struct nlattr *tb[P4TC_PIPELINE_MAX + 1]; + struct p4tc_pipeline *pipeline; + + ret = nla_parse_nested(tb, P4TC_PIPELINE_MAX, nla, tc_pipeline_policy, + extack); + + if (ret < 0) + return ERR_PTR(ret); + + pipeline = kmalloc(sizeof(*pipeline), GFP_KERNEL); + if (!pipeline) + return ERR_PTR(-ENOMEM); + + if (!p_name || p_name[0] == '\0') { + NL_SET_ERR_MSG(extack, "Must specify pipeline name"); + ret = -EINVAL; + goto err; + } + + if (pipeid != P4TC_KERNEL_PIPEID && + tcf_pipeline_find_byid(net, pipeid)) { + NL_SET_ERR_MSG(extack, "Pipeline was already created"); + ret = -EEXIST; 
+ goto err; + } + + if (tcf_pipeline_find_byname(net, p_name)) { + NL_SET_ERR_MSG(extack, "Pipeline was already created"); + ret = -EEXIST; + goto err; + } + + strscpy(pipeline->common.name, p_name, PIPELINENAMSIZ); + + if (pipeid) { + ret = idr_alloc_u32(&pipe_net->pipeline_idr, pipeline, &pipeid, + pipeid, GFP_KERNEL); + } else { + pipeid = 1; + ret = idr_alloc_u32(&pipe_net->pipeline_idr, pipeline, &pipeid, + UINT_MAX, GFP_KERNEL); + } + + if (ret < 0) { + NL_SET_ERR_MSG(extack, "Unable to allocate pipeline id"); + goto err; + } + + pipeline->common.p_id = pipeid; + + if (tb[P4TC_PIPELINE_MAXRULES]) + pipeline->max_rules = + *((u32 *)nla_data(tb[P4TC_PIPELINE_MAXRULES])); + else + pipeline->max_rules = P4TC_DEFAULT_MAX_RULES; + + if (tb[P4TC_PIPELINE_NUMTABLES]) + pipeline->num_tables = + *((u16 *)nla_data(tb[P4TC_PIPELINE_NUMTABLES])); + else + pipeline->num_tables = P4TC_DEFAULT_NUM_TABLES; + + if (tb[P4TC_PIPELINE_PREACTIONS]) { + pipeline->preacts = kcalloc(TCA_ACT_MAX_PRIO, + sizeof(struct tc_action *), + GFP_KERNEL); + if (!pipeline->preacts) { + ret = -ENOMEM; + goto idr_rm; + } + + ret = p4tc_action_init(net, tb[P4TC_PIPELINE_PREACTIONS], + pipeline->preacts, pipeid, 0, extack); + if (ret < 0) { + kfree(pipeline->preacts); + goto idr_rm; + } + pipeline->num_preacts = ret; + } else { + pipeline->preacts = NULL; + pipeline->num_preacts = 0; + } + + if (tb[P4TC_PIPELINE_POSTACTIONS]) { + pipeline->postacts = kcalloc(TCA_ACT_MAX_PRIO, + sizeof(struct tc_action *), + GFP_KERNEL); + if (!pipeline->postacts) { + ret = -ENOMEM; + goto preactions_destroy; + } + + ret = p4tc_action_init(net, tb[P4TC_PIPELINE_POSTACTIONS], + pipeline->postacts, pipeid, 0, extack); + if (ret < 0) { + kfree(pipeline->postacts); + goto preactions_destroy; + } + pipeline->num_postacts = ret; + } else { + pipeline->postacts = NULL; + pipeline->num_postacts = 0; + } + + pipeline->p_state = P4TC_STATE_NOT_READY; + + pipeline->net = net; + + refcount_set(&pipeline->p_ref, 1); + + 
pipeline->common.ops = (struct p4tc_template_ops *)&p4tc_pipeline_ops; + + return pipeline; + +preactions_destroy: + p4tc_action_destroy(pipeline->preacts); + +idr_rm: + idr_remove(&pipe_net->pipeline_idr, pipeid); + +err: + kfree(pipeline); + return ERR_PTR(ret); +} + +static struct p4tc_pipeline * +__tcf_pipeline_find_byany(struct net *net, const char *p_name, const u32 pipeid, + struct netlink_ext_ack *extack) +{ + struct p4tc_pipeline *pipeline = NULL; + int err; + + if (pipeid) { + pipeline = tcf_pipeline_find_byid(net, pipeid); + if (!pipeline) { + NL_SET_ERR_MSG(extack, "Unable to find pipeline by id"); + err = -EINVAL; + goto out; + } + } else { + if (p_name) { + pipeline = tcf_pipeline_find_byname(net, p_name); + if (!pipeline) { + NL_SET_ERR_MSG(extack, + "Pipeline name not found"); + err = -EINVAL; + goto out; + } + } + } + + return pipeline; + +out: + return ERR_PTR(err); +} + +struct p4tc_pipeline *tcf_pipeline_find_byany(struct net *net, + const char *p_name, + const u32 pipeid, + struct netlink_ext_ack *extack) +{ + struct p4tc_pipeline *pipeline = + __tcf_pipeline_find_byany(net, p_name, pipeid, extack); + if (!pipeline) { + NL_SET_ERR_MSG(extack, "Must specify pipeline name or id"); + return ERR_PTR(-EINVAL); + } + + return pipeline; +} + +struct p4tc_pipeline *tcf_pipeline_get(struct net *net, const char *p_name, + const u32 pipeid, + struct netlink_ext_ack *extack) +{ + struct p4tc_pipeline *pipeline = + __tcf_pipeline_find_byany(net, p_name, pipeid, extack); + if (!pipeline) { + NL_SET_ERR_MSG(extack, "Must specify pipeline name or id"); + return ERR_PTR(-EINVAL); + } else if (IS_ERR(pipeline)) { + return pipeline; + } + + /* Should never happen */ + WARN_ON(!refcount_inc_not_zero(&pipeline->p_ref)); + + return pipeline; +} + +void __tcf_pipeline_put(struct p4tc_pipeline *pipeline) +{ + struct net *net = maybe_get_net(pipeline->net); + + if (net) { + refcount_dec(&pipeline->p_ref); + put_net(net); + /* If netns is going down, we already deleted 
the pipeline objects in + * the pre_exit net op + */ + } else { + kfree(pipeline); + } +} + +struct p4tc_pipeline * +tcf_pipeline_find_byany_unsealed(struct net *net, const char *p_name, + const u32 pipeid, + struct netlink_ext_ack *extack) +{ + struct p4tc_pipeline *pipeline = + tcf_pipeline_find_byany(net, p_name, pipeid, extack); + if (IS_ERR(pipeline)) + return pipeline; + + if (pipeline_sealed(pipeline)) { + NL_SET_ERR_MSG(extack, "Pipeline is sealed"); + return ERR_PTR(-EINVAL); + } + + return pipeline; +} + +static struct p4tc_pipeline * +tcf_pipeline_update(struct net *net, struct nlmsghdr *n, struct nlattr *nla, + const char *p_name, const u32 pipeid, + struct netlink_ext_ack *extack) +{ + struct tc_action **preacts = NULL; + struct tc_action **postacts = NULL; + u16 num_tables = 0; + u16 max_rules = 0; + int ret = 0; + struct nlattr *tb[P4TC_PIPELINE_MAX + 1]; + struct p4tc_pipeline *pipeline; + int num_preacts, num_postacts; + + ret = nla_parse_nested(tb, P4TC_PIPELINE_MAX, nla, tc_pipeline_policy, + extack); + + if (ret < 0) + goto out; + + pipeline = + tcf_pipeline_find_byany_unsealed(net, p_name, pipeid, extack); + if (IS_ERR(pipeline)) + return pipeline; + + if (tb[P4TC_PIPELINE_NUMTABLES]) + num_tables = *((u16 *)nla_data(tb[P4TC_PIPELINE_NUMTABLES])); + + if (tb[P4TC_PIPELINE_MAXRULES]) + max_rules = *((u32 *)nla_data(tb[P4TC_PIPELINE_MAXRULES])); + + if (tb[P4TC_PIPELINE_PREACTIONS]) { + preacts = kcalloc(TCA_ACT_MAX_PRIO, sizeof(struct tc_action *), + GFP_KERNEL); + if (!preacts) { + ret = -ENOMEM; + goto out; + } + + ret = p4tc_action_init(net, tb[P4TC_PIPELINE_PREACTIONS], + preacts, pipeline->common.p_id, 0, + extack); + if (ret < 0) { + kfree(preacts); + goto out; + } + num_preacts = ret; + } + + if (tb[P4TC_PIPELINE_POSTACTIONS]) { + postacts = kcalloc(TCA_ACT_MAX_PRIO, sizeof(struct tc_action *), + GFP_KERNEL); + if (!postacts) { + ret = -ENOMEM; + goto preactions_destroy; + } + + ret = p4tc_action_init(net, tb[P4TC_PIPELINE_POSTACTIONS], + 
postacts, pipeline->common.p_id, 0, + extack); + if (ret < 0) { + kfree(postacts); + goto preactions_destroy; + } + num_postacts = ret; + } + + if (tb[P4TC_PIPELINE_STATE]) { + ret = pipeline_try_set_state_ready(pipeline, extack); + if (ret < 0) + goto postactions_destroy; + } + + if (max_rules) + pipeline->max_rules = max_rules; + if (num_tables) + pipeline->num_tables = num_tables; + if (preacts) { + p4tc_action_destroy(pipeline->preacts); + pipeline->preacts = preacts; + pipeline->num_preacts = num_preacts; + } + if (postacts) { + p4tc_action_destroy(pipeline->postacts); + pipeline->postacts = postacts; + pipeline->num_postacts = num_postacts; + } + + return pipeline; + +postactions_destroy: + p4tc_action_destroy(postacts); + +preactions_destroy: + p4tc_action_destroy(preacts); +out: + return ERR_PTR(ret); +} + +static struct p4tc_template_common * +tcf_pipeline_cu(struct net *net, struct nlmsghdr *n, struct nlattr *nla, + struct p4tc_nl_pname *nl_pname, u32 *ids, + struct netlink_ext_ack *extack) +{ + u32 pipeid = ids[P4TC_PID_IDX]; + struct p4tc_pipeline *pipeline; + + if (n->nlmsg_flags & NLM_F_REPLACE) + pipeline = tcf_pipeline_update(net, n, nla, nl_pname->data, + pipeid, extack); + else + pipeline = tcf_pipeline_create(net, n, nla, nl_pname->data, + pipeid, extack); + + if (IS_ERR(pipeline)) + goto out; + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + +out: + return (struct p4tc_template_common *)pipeline; +} + +static int _tcf_pipeline_fill_nlmsg(struct sk_buff *skb, + const struct p4tc_pipeline *pipeline) +{ + unsigned char *b = nlmsg_get_pos(skb); + struct nlattr *nest, *preacts, *postacts; + + nest = nla_nest_start(skb, P4TC_PARAMS); + if (!nest) + goto out_nlmsg_trim; + if (nla_put_u32(skb, P4TC_PIPELINE_MAXRULES, pipeline->max_rules)) + goto out_nlmsg_trim; + + if (nla_put_u16(skb, P4TC_PIPELINE_NUMTABLES, pipeline->num_tables)) + 
goto out_nlmsg_trim; + if (nla_put_u8(skb, P4TC_PIPELINE_STATE, pipeline->p_state)) + goto out_nlmsg_trim; + + if (pipeline->preacts) { + preacts = nla_nest_start(skb, P4TC_PIPELINE_PREACTIONS); + if (tcf_action_dump(skb, pipeline->preacts, 0, 0, false) < 0) + goto out_nlmsg_trim; + nla_nest_end(skb, preacts); + } + + if (pipeline->postacts) { + postacts = nla_nest_start(skb, P4TC_PIPELINE_POSTACTIONS); + if (tcf_action_dump(skb, pipeline->postacts, 0, 0, false) < 0) + goto out_nlmsg_trim; + nla_nest_end(skb, postacts); + } + + nla_nest_end(skb, nest); + + return skb->len; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return -1; +} + +static int tcf_pipeline_fill_nlmsg(struct net *net, struct sk_buff *skb, + struct p4tc_template_common *template, + struct netlink_ext_ack *extack) +{ + const struct p4tc_pipeline *pipeline = to_pipeline(template); + + if (_tcf_pipeline_fill_nlmsg(skb, pipeline) <= 0) { + NL_SET_ERR_MSG(extack, + "Failed to fill notification attributes for pipeline"); + return -EINVAL; + } + + return 0; +} + +static int tcf_pipeline_del_one(struct net *net, + struct p4tc_template_common *tmpl, + struct netlink_ext_ack *extack) +{ + return tcf_pipeline_put(net, tmpl, false, extack); +} + +static int tcf_pipeline_gd(struct net *net, struct sk_buff *skb, + struct nlmsghdr *n, struct nlattr *nla, + struct p4tc_nl_pname *nl_pname, u32 *ids, + struct netlink_ext_ack *extack) +{ + unsigned char *b = nlmsg_get_pos(skb); + u32 pipeid = ids[P4TC_PID_IDX]; + struct p4tc_template_common *tmpl; + struct p4tc_pipeline *pipeline; + int ret = 0; + + if (n->nlmsg_type == RTM_DELP4TEMPLATE && + (n->nlmsg_flags & NLM_F_ROOT)) { + NL_SET_ERR_MSG(extack, "Pipeline flush not supported"); + return -EOPNOTSUPP; + } + + pipeline = tcf_pipeline_find_byany(net, nl_pname->data, pipeid, extack); + if (IS_ERR(pipeline)) + return PTR_ERR(pipeline); + + tmpl = (struct p4tc_template_common *)pipeline; + if (tcf_pipeline_fill_nlmsg(net, skb, tmpl, extack) < 0) + return -1; + + if 
(!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + if (n->nlmsg_type == RTM_DELP4TEMPLATE) { + ret = tcf_pipeline_del_one(net, tmpl, extack); + if (ret < 0) + goto out_nlmsg_trim; + } + + return ret; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return ret; +} + +static int tcf_pipeline_dump(struct sk_buff *skb, struct p4tc_dump_ctx *ctx, + struct nlattr *nla, char **p_name, u32 *ids, + struct netlink_ext_ack *extack) +{ + struct net *net = sock_net(skb->sk); + struct p4tc_pipeline_net *pipe_net = net_generic(net, pipeline_net_id); + + return tcf_p4_tmpl_generic_dump(skb, ctx, &pipe_net->pipeline_idr, + P4TC_PID_IDX, extack); +} + +static int tcf_pipeline_dump_1(struct sk_buff *skb, + struct p4tc_template_common *common) +{ + struct p4tc_pipeline *pipeline = to_pipeline(common); + unsigned char *b = nlmsg_get_pos(skb); + struct nlattr *param; + + /* Don't show kernel pipeline in dump */ + if (pipeline->common.p_id == P4TC_KERNEL_PIPEID) + return 1; + + param = nla_nest_start(skb, P4TC_PARAMS); + if (!param) + goto out_nlmsg_trim; + if (nla_put_string(skb, P4TC_PIPELINE_NAME, pipeline->common.name)) + goto out_nlmsg_trim; + + nla_nest_end(skb, param); + + return 0; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return -ENOMEM; +} + +static int register_pipeline_pernet(void) +{ + return register_pernet_subsys(&pipeline_net_ops); +} + +static void __tcf_pipeline_init(void) +{ + int pipeid = P4TC_KERNEL_PIPEID; + + root_pipeline = kzalloc(sizeof(*root_pipeline), GFP_ATOMIC); + if (!root_pipeline) { + pr_err("Unable to register kernel pipeline\n"); + return; + } + + strscpy(root_pipeline->common.name, "kernel", PIPELINENAMSIZ); + + root_pipeline->common.ops = + (struct p4tc_template_ops *)&p4tc_pipeline_ops; + + root_pipeline->common.p_id = pipeid; + + root_pipeline->p_state = P4TC_STATE_READY; +} + +static void tcf_pipeline_init(void) +{ + if 
(register_pipeline_pernet() < 0) + pr_err("Failed to register per net pipeline IDR"); + + if (p4tc_register_types() < 0) + pr_err("Failed to register P4 types"); + + __tcf_pipeline_init(); +} + +const struct p4tc_template_ops p4tc_pipeline_ops = { + .init = tcf_pipeline_init, + .cu = tcf_pipeline_cu, + .fill_nlmsg = tcf_pipeline_fill_nlmsg, + .gd = tcf_pipeline_gd, + .put = tcf_pipeline_put, + .dump = tcf_pipeline_dump, + .dump_1 = tcf_pipeline_dump_1, +}; diff --git a/net/sched/p4tc/p4tc_tmpl_api.c b/net/sched/p4tc/p4tc_tmpl_api.c new file mode 100644 index 000000000..debd5f825 --- /dev/null +++ b/net/sched/p4tc/p4tc_tmpl_api.c @@ -0,0 +1,586 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * net/sched/p4tc_api.c P4 TC API + * + * Copyright (c) 2022, Mojatatu Networks + * Copyright (c) 2022, Intel Corporation. + * Authors: Jamal Hadi Salim + * Victor Nogueira + * Pedro Tammela + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +const struct nla_policy p4tc_root_policy[P4TC_ROOT_MAX + 1] = { + [P4TC_ROOT] = { .type = NLA_NESTED }, + [P4TC_ROOT_PNAME] = { .type = NLA_STRING, .len = PIPELINENAMSIZ }, +}; + +const struct nla_policy p4tc_policy[P4TC_MAX + 1] = { + [P4TC_PATH] = { .type = NLA_BINARY, + .len = P4TC_PATH_MAX * sizeof(u32) }, + [P4TC_PARAMS] = { .type = NLA_NESTED }, +}; + +static bool obj_is_valid(u32 obj) +{ + switch (obj) { + case P4TC_OBJ_PIPELINE: + return true; + default: + return false; + } +} + +static const struct p4tc_template_ops *p4tc_ops[P4TC_OBJ_MAX] = { + [P4TC_OBJ_PIPELINE] = &p4tc_pipeline_ops, +}; + +int tcf_p4_tmpl_generic_dump(struct sk_buff *skb, struct p4tc_dump_ctx *ctx, + struct idr *idr, int idx, + struct netlink_ext_ack *extack) +{ + unsigned char *b = nlmsg_get_pos(skb); + unsigned long id = 0; + int i = 0; + struct p4tc_template_common *common; + unsigned long tmp; + + id = ctx->ids[idx]; + + 
idr_for_each_entry_continue_ul(idr, common, tmp, id) { + struct nlattr *count; + int ret; + + if (i == P4TC_MSGBATCH_SIZE) + break; + + count = nla_nest_start(skb, i + 1); + if (!count) + goto out_nlmsg_trim; + ret = common->ops->dump_1(skb, common); + if (ret < 0) { + goto out_nlmsg_trim; + } else if (ret) { + nla_nest_cancel(skb, count); + continue; + } + nla_nest_end(skb, count); + + i++; + } + + if (i == 0) { + if (!ctx->ids[idx]) + NL_SET_ERR_MSG(extack, + "There are no pipeline components"); + return 0; + } + + ctx->ids[idx] = id; + + return skb->len; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return -ENOMEM; +} + +static int tc_ctl_p4_tmpl_gd_1(struct net *net, struct sk_buff *skb, + struct nlmsghdr *n, struct nlattr *arg, + struct p4tc_nl_pname *nl_pname, + struct netlink_ext_ack *extack) +{ + struct p4tcmsg *t = (struct p4tcmsg *)nlmsg_data(n); + u32 ids[P4TC_PATH_MAX] = {}; + struct nlattr *tb[P4TC_MAX + 1]; + struct p4tc_template_ops *op; + int ret; + + if (!obj_is_valid(t->obj)) { + NL_SET_ERR_MSG(extack, "Invalid object type"); + return -EINVAL; + } + + ret = nla_parse_nested(tb, P4TC_MAX, arg, p4tc_policy, extack); + if (ret < 0) + return ret; + + ids[P4TC_PID_IDX] = t->pipeid; + + if (tb[P4TC_PATH]) { + if ((nla_len(tb[P4TC_PATH])) > + (P4TC_PATH_MAX - 1) * sizeof(u32)) { + NL_SET_ERR_MSG(extack, "Path is too big"); + return -E2BIG; + } + } + + op = (struct p4tc_template_ops *)p4tc_ops[t->obj]; + + ret = op->gd(net, skb, n, tb[P4TC_PARAMS], nl_pname, ids, extack); + if (ret < 0) + return ret; + + if (!t->pipeid) + t->pipeid = ids[P4TC_PID_IDX]; + + return ret; +} + +static int tc_ctl_p4_tmpl_gd_n(struct sk_buff *skb, struct nlmsghdr *n, + char *p_name, struct nlattr *nla, int event, + struct netlink_ext_ack *extack) +{ + struct p4tcmsg *t = (struct p4tcmsg *)nlmsg_data(n); + struct net *net = sock_net(skb->sk); + u32 portid = NETLINK_CB(skb).portid; + int ret = 0; + struct nlattr *tb[P4TC_MSGBATCH_SIZE + 1]; + struct p4tc_nl_pname nl_pname; + struct 
sk_buff *new_skb; + struct p4tcmsg *t_new; + struct nlmsghdr *nlh; + struct nlattr *pnatt; + struct nlattr *root; + int i; + + ret = nla_parse_nested(tb, P4TC_MSGBATCH_SIZE, nla, NULL, extack); + if (ret < 0) + return ret; + + new_skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); + if (!new_skb) + return -ENOMEM; + + nlh = nlmsg_put(new_skb, portid, n->nlmsg_seq, event, sizeof(*t), + n->nlmsg_flags); + if (!nlh) { + ret = -ENOMEM; + goto out; + } + + t_new = nlmsg_data(nlh); + t_new->pipeid = t->pipeid; + t_new->obj = t->obj; + + pnatt = nla_reserve(new_skb, P4TC_ROOT_PNAME, PIPELINENAMSIZ); + if (!pnatt) { + ret = -ENOMEM; + goto out; + } + + nl_pname.data = nla_data(pnatt); + if (!p_name) { + /* Filled up by the operation or forced failure */ + memset(nl_pname.data, 0, PIPELINENAMSIZ); + nl_pname.passed = false; + } else { + strscpy(nl_pname.data, p_name, PIPELINENAMSIZ); + nl_pname.passed = true; + } + + root = nla_nest_start(new_skb, P4TC_ROOT); + for (i = 1; i < P4TC_MSGBATCH_SIZE + 1 && tb[i]; i++) { + struct nlattr *nest = nla_nest_start(new_skb, i); + + ret = tc_ctl_p4_tmpl_gd_1(net, new_skb, nlh, tb[i], &nl_pname, + extack); + if (n->nlmsg_flags & NLM_F_ROOT && event == RTM_DELP4TEMPLATE) { + if (ret <= 0) + goto out; + } else { + if (ret < 0) + goto out; + } + nla_nest_end(new_skb, nest); + } + nla_nest_end(new_skb, root); + + nlmsg_end(new_skb, nlh); + + if (event == RTM_GETP4TEMPLATE) + return rtnl_unicast(new_skb, net, portid); + + return rtnetlink_send(new_skb, net, portid, RTNLGRP_TC, + n->nlmsg_flags & NLM_F_ECHO); +out: + kfree_skb(new_skb); + return ret; +} + +static int tc_ctl_p4_tmpl_get(struct sk_buff *skb, struct nlmsghdr *n, + struct netlink_ext_ack *extack) +{ + char *p_name = NULL; + struct nlattr *p4tc_attr[P4TC_ROOT_MAX + 1]; + int ret; + + ret = nlmsg_parse(n, sizeof(struct p4tcmsg), p4tc_attr, P4TC_ROOT_MAX, + p4tc_root_policy, extack); + if (ret < 0) + return ret; + + if (!p4tc_attr[P4TC_ROOT]) { + NL_SET_ERR_MSG(extack, + "Netlink P4TC 
template attributes missing"); + return -EINVAL; + } + + if (p4tc_attr[P4TC_ROOT_PNAME]) + p_name = nla_data(p4tc_attr[P4TC_ROOT_PNAME]); + + return tc_ctl_p4_tmpl_gd_n(skb, n, p_name, p4tc_attr[P4TC_ROOT], + RTM_GETP4TEMPLATE, extack); +} + +static int tc_ctl_p4_tmpl_delete(struct sk_buff *skb, struct nlmsghdr *n, + struct netlink_ext_ack *extack) +{ + char *p_name = NULL; + struct nlattr *p4tc_attr[P4TC_ROOT_MAX + 1]; + int ret; + + if (!netlink_capable(skb, CAP_NET_ADMIN)) + return -EPERM; + + ret = nlmsg_parse(n, sizeof(struct p4tcmsg), p4tc_attr, P4TC_ROOT_MAX, + p4tc_root_policy, extack); + if (ret < 0) + return ret; + + if (!p4tc_attr[P4TC_ROOT]) { + NL_SET_ERR_MSG(extack, + "Netlink P4TC template attributes missing"); + return -EINVAL; + } + + if (p4tc_attr[P4TC_ROOT_PNAME]) + p_name = nla_data(p4tc_attr[P4TC_ROOT_PNAME]); + + return tc_ctl_p4_tmpl_gd_n(skb, n, p_name, p4tc_attr[P4TC_ROOT], + RTM_DELP4TEMPLATE, extack); +} + +static struct p4tc_template_common * +tcf_p4_tmpl_cu_1(struct sk_buff *skb, struct net *net, struct nlmsghdr *n, + struct p4tc_nl_pname *nl_pname, struct nlattr *nla, + struct netlink_ext_ack *extack) +{ + struct p4tcmsg *t = (struct p4tcmsg *)nlmsg_data(n); + u32 ids[P4TC_PATH_MAX] = {}; + struct nlattr *p4tc_attr[P4TC_MAX + 1]; + struct p4tc_template_common *tmpl; + struct p4tc_template_ops *op; + int ret; + + if (!obj_is_valid(t->obj)) { + NL_SET_ERR_MSG(extack, "Invalid object type"); + ret = -EINVAL; + goto out; + } + + ret = nla_parse_nested(p4tc_attr, P4TC_MAX, nla, p4tc_policy, extack); + if (ret < 0) + goto out; + + if (!p4tc_attr[P4TC_PARAMS]) { + NL_SET_ERR_MSG(extack, "Must specify object attributes"); + ret = -EINVAL; + goto out; + } + + ids[P4TC_PID_IDX] = t->pipeid; + + if (p4tc_attr[P4TC_PATH]) { + if ((nla_len(p4tc_attr[P4TC_PATH])) > + (P4TC_PATH_MAX - 1) * sizeof(u32)) { + NL_SET_ERR_MSG(extack, "Path is too big"); + ret = -E2BIG; + goto out; + } + } + + op = (struct p4tc_template_ops *)p4tc_ops[t->obj]; + tmpl = 
op->cu(net, n, p4tc_attr[P4TC_PARAMS], nl_pname, ids, extack); + if (IS_ERR(tmpl)) + return tmpl; + + ret = op->fill_nlmsg(net, skb, tmpl, extack); + if (ret < 0) + goto put; + + if (!t->pipeid) + t->pipeid = ids[P4TC_PID_IDX]; + + return tmpl; + +put: + op->put(net, tmpl, false, extack); + +out: + return ERR_PTR(ret); +} + +static int tcf_p4_tmpl_cu_n(struct sk_buff *skb, struct nlmsghdr *n, + struct nlattr *nla, char *p_name, + struct netlink_ext_ack *extack) +{ + struct p4tcmsg *t = (struct p4tcmsg *)nlmsg_data(n); + struct net *net = sock_net(skb->sk); + u32 portid = NETLINK_CB(skb).portid; + struct p4tc_template_common *tmpls[P4TC_MSGBATCH_SIZE]; + struct nlattr *tb[P4TC_MSGBATCH_SIZE + 1]; + struct p4tc_nl_pname nl_pname; + struct sk_buff *new_skb; + struct p4tcmsg *t_new; + struct nlmsghdr *nlh; + struct nlattr *pnatt; + struct nlattr *root; + int ret; + int i; + + ret = nla_parse_nested(tb, P4TC_MSGBATCH_SIZE, nla, NULL, extack); + if (ret < 0) + return ret; + + new_skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); + if (!new_skb) + return -ENOMEM; + + nlh = nlmsg_put(new_skb, portid, n->nlmsg_seq, RTM_CREATEP4TEMPLATE, + sizeof(*t), n->nlmsg_flags); + if (!nlh) { + ret = -ENOMEM; + goto out; + } + + t_new = nlmsg_data(nlh); + if (!t_new) { + NL_SET_ERR_MSG(extack, "Message header is missing"); + ret = -EINVAL; + goto out; + } + t_new->pipeid = t->pipeid; + t_new->obj = t->obj; + + pnatt = nla_reserve(new_skb, P4TC_ROOT_PNAME, PIPELINENAMSIZ); + if (!pnatt) { + ret = -ENOMEM; + goto out; + } + + nl_pname.data = nla_data(pnatt); + if (!p_name) { + /* Filled up by the operation or forced failure */ + memset(nl_pname.data, 0, PIPELINENAMSIZ); + nl_pname.passed = false; + } else { + strscpy(nl_pname.data, p_name, PIPELINENAMSIZ); + nl_pname.passed = true; + } + + root = nla_nest_start(new_skb, P4TC_ROOT); + if (!root) { + ret = -ENOMEM; + goto out; + } + + /* XXX: See if we can use NLA_NESTED_ARRAY here */ + for (i = 0; i < P4TC_MSGBATCH_SIZE && tb[i + 1]; i++) { + struct nlattr *nest =
nla_nest_start(new_skb, i + 1); + + tmpls[i] = tcf_p4_tmpl_cu_1(new_skb, net, nlh, &nl_pname, + tb[i + 1], extack); + if (IS_ERR(tmpls[i])) { + ret = PTR_ERR(tmpls[i]); + goto undo_prev; + } + + nla_nest_end(new_skb, nest); + } + nla_nest_end(new_skb, root); + + if (!t_new->pipeid) + t_new->pipeid = ret; + + nlmsg_end(new_skb, nlh); + + return rtnetlink_send(new_skb, net, portid, RTNLGRP_TC, + n->nlmsg_flags & NLM_F_ECHO); + +undo_prev: + if (!(nlh->nlmsg_flags & NLM_F_REPLACE)) { + while (i-- > 0) { + struct p4tc_template_common *tmpl = tmpls[i]; + + tmpl->ops->put(net, tmpl, false, extack); + } + } + +out: + kfree_skb(new_skb); + return ret; +} + +static int tc_ctl_p4_tmpl_cu(struct sk_buff *skb, struct nlmsghdr *n, + struct netlink_ext_ack *extack) +{ + char *p_name = NULL; + int ret = 0; + struct nlattr *p4tc_attr[P4TC_ROOT_MAX + 1]; + + if (!netlink_capable(skb, CAP_NET_ADMIN)) + return -EPERM; + + ret = nlmsg_parse(n, sizeof(struct p4tcmsg), p4tc_attr, P4TC_ROOT_MAX, + p4tc_root_policy, extack); + if (ret < 0) + return ret; + + if (!p4tc_attr[P4TC_ROOT]) { + NL_SET_ERR_MSG(extack, + "Netlink P4TC template attributes missing"); + return -EINVAL; + } + + if (p4tc_attr[P4TC_ROOT_PNAME]) + p_name = nla_data(p4tc_attr[P4TC_ROOT_PNAME]); + + return tcf_p4_tmpl_cu_n(skb, n, p4tc_attr[P4TC_ROOT], p_name, extack); +} + +static int tc_ctl_p4_tmpl_dump_1(struct sk_buff *skb, struct nlattr *arg, + char *p_name, struct netlink_callback *cb) +{ + struct p4tc_dump_ctx *ctx = (void *)cb->ctx; + struct netlink_ext_ack *extack = cb->extack; + u32 portid = NETLINK_CB(cb->skb).portid; + const struct nlmsghdr *n = cb->nlh; + u32 ids[P4TC_PATH_MAX] = {}; + struct nlattr *tb[P4TC_MAX + 1]; + struct p4tc_template_ops *op; + struct p4tcmsg *t_new; + struct nlmsghdr *nlh; + struct nlattr *root; + struct p4tcmsg *t; + int ret; + + ret = nla_parse_nested_deprecated(tb, P4TC_MAX, arg, p4tc_policy, + extack); + if (ret < 0) + return ret; + + t = (struct p4tcmsg *)nlmsg_data(n); + if
(!obj_is_valid(t->obj)) { + NL_SET_ERR_MSG(extack, "Invalid object type"); + return -EINVAL; + } + + nlh = nlmsg_put(skb, portid, n->nlmsg_seq, RTM_GETP4TEMPLATE, + sizeof(*t), n->nlmsg_flags); + if (!nlh) + return -ENOSPC; + + t_new = nlmsg_data(nlh); + t_new->pipeid = t->pipeid; + t_new->obj = t->obj; + + root = nla_nest_start(skb, P4TC_ROOT); + + ids[P4TC_PID_IDX] = t->pipeid; + if (tb[P4TC_PATH]) { + if ((nla_len(tb[P4TC_PATH])) > + (P4TC_PATH_MAX - 1) * sizeof(u32)) { + NL_SET_ERR_MSG(extack, "Path is too big"); + return -E2BIG; + } + } + + op = (struct p4tc_template_ops *)p4tc_ops[t->obj]; + ret = op->dump(skb, ctx, tb[P4TC_PARAMS], &p_name, ids, extack); + if (ret <= 0) + goto out; + nla_nest_end(skb, root); + + if (p_name) { + if (nla_put_string(skb, P4TC_ROOT_PNAME, p_name)) { + ret = -1; + goto out; + } + } + + if (!t_new->pipeid) + t_new->pipeid = ids[P4TC_PID_IDX]; + + nlmsg_end(skb, nlh); + + return ret; + +out: + nlmsg_cancel(skb, nlh); + return ret; +} + +static int tc_ctl_p4_tmpl_dump(struct sk_buff *skb, struct netlink_callback *cb) +{ + char *p_name = NULL; + struct nlattr *p4tc_attr[P4TC_ROOT_MAX + 1]; + int ret; + + ret = nlmsg_parse(cb->nlh, sizeof(struct p4tcmsg), p4tc_attr, + P4TC_ROOT_MAX, p4tc_root_policy, cb->extack); + if (ret < 0) + return ret; + + if (!p4tc_attr[P4TC_ROOT]) { + NL_SET_ERR_MSG(cb->extack, + "Netlink P4TC template attributes missing"); + return -EINVAL; + } + + if (p4tc_attr[P4TC_ROOT_PNAME]) + p_name = nla_data(p4tc_attr[P4TC_ROOT_PNAME]); + + return tc_ctl_p4_tmpl_dump_1(skb, p4tc_attr[P4TC_ROOT], p_name, cb); +} + +static int __init p4tc_template_init(void) +{ + u32 obj; + + rtnl_register(PF_UNSPEC, RTM_CREATEP4TEMPLATE, tc_ctl_p4_tmpl_cu, NULL, + 0); + rtnl_register(PF_UNSPEC, RTM_DELP4TEMPLATE, tc_ctl_p4_tmpl_delete, NULL, + 0); + rtnl_register(PF_UNSPEC, RTM_GETP4TEMPLATE, tc_ctl_p4_tmpl_get, + tc_ctl_p4_tmpl_dump, 0); + + for (obj = P4TC_OBJ_PIPELINE; obj < P4TC_OBJ_MAX; obj++) { + const struct p4tc_template_ops 
*op = p4tc_ops[obj]; + + if (!obj_is_valid(obj)) + continue; + + if (op->init) + op->init(); + } + + return 0; +} + +subsys_initcall(p4tc_template_init); diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c index 2ee7b4ed4..0a8daf2f8 100644 --- a/security/selinux/nlmsgtab.c +++ b/security/selinux/nlmsgtab.c @@ -94,6 +94,9 @@ static const struct nlmsg_perm nlmsg_route_perms[] = { { RTM_NEWTUNNEL, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_DELTUNNEL, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_GETTUNNEL, NETLINK_ROUTE_SOCKET__NLMSG_READ }, + { RTM_CREATEP4TEMPLATE, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, + { RTM_DELP4TEMPLATE, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, + { RTM_GETP4TEMPLATE, NETLINK_ROUTE_SOCKET__NLMSG_READ }, }; static const struct nlmsg_perm nlmsg_tcpdiag_perms[] = { @@ -176,7 +179,7 @@ int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm) * structures at the top of this file with the new mappings * before updating the BUILD_BUG_ON() macro! */ - BUILD_BUG_ON(RTM_MAX != (RTM_NEWTUNNEL + 3)); + BUILD_BUG_ON(RTM_MAX != (RTM_CREATEP4TEMPLATE + 3)); err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms, sizeof(nlmsg_route_perms)); break;

From patchwork Tue Jan 24 17:05:03 2023 X-Patchwork-Submitter: Jamal Hadi Salim X-Patchwork-Id: 13114390 X-Patchwork-Delegate: kuba@kernel.org From: Jamal Hadi Salim To: netdev@vger.kernel.org Cc: kernel@mojatatu.com, deb.chatterjee@intel.com, anjali.singhai@intel.com, namrata.limaye@intel.com, khalidm@nvidia.com, tom@sipanda.io, pratyush@sipanda.io, jiri@resnulli.us, xiyou.wangcong@gmail.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, vladbu@nvidia.com, simon.horman@corigine.com Subject: [PATCH net-next RFC 13/20] p4tc: add metadata create, update, delete, get, flush and dump Date: Tue, 24 Jan 2023 12:05:03 -0500 Message-Id: <20230124170510.316970-13-jhs@mojatatu.com> In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com> References: <20230124170510.316970-1-jhs@mojatatu.com> X-Patchwork-State: RFC This commit allows users to create, update, delete and get a P4 pipeline's metadatum. It also allows users to flush and dump all of the P4 pipeline's metadata. As an example, if one were to create a metadatum named mname in a pipeline named ptables with a type of 8 bits, one would use the following command: tc p4template create metadata/ptables/mname [mid 1] type bit8 Note that, in the above command, the metadatum id is optional. If one does not specify a metadatum id, the kernel will assign one.
If one were to update a metadatum named mname in a pipeline named ptables with an mid of 1, one would use the following command: tc p4template update metadata/ptables/[mname] [mid 1] type bit4 Note that, in the above command, the metadatum's id and the metadatum's name are optional. That is, one may specify only the metadatum's name, only the metadatum's id, or both. If one were to delete a metadatum named mname from a pipeline named ptables with an mid of 1, one would use the following command: tc p4template del metadata/ptables/[mname] [mid 1] Note that, in the above command, the metadatum's id and the metadatum's name are optional. That is, one may specify only the metadatum's name, only the metadatum's id, or both. If one were to flush all the metadata from a pipeline named ptables, one would use the following command: tc p4template del metadata/ptables/ If one were to get a metadatum named mname from a pipeline named ptables with an mid of 1, one would use the following command: tc p4template get metadata/ptables/[mname] [mid 1] Note that, in the above command, the metadatum's id and the metadatum's name are optional. That is, one may specify only the metadatum's name, only the metadatum's id, or both. 
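The bit4/bit8 sizing in these commands corresponds to this patch's struct p4tc_meta_size_params, which records a metadatum's start and end bit relative to its container. The following is a minimal standalone C sketch of the mask-and-shift arithmetic such a layout implies; the struct and helper names here are illustrative, not the kernel's:

```c
#include <stdint.h>

/* Illustrative mirror of struct p4tc_meta_size_params: a metadatum
 * occupies bits [startbit, endbit] of its container byte, so
 * "type bit4" would give a width of 4.
 */
struct meta_bits {
	uint16_t startbit;
	uint16_t endbit;
};

/* Width in bits of the metadatum, e.g. 4 for "type bit4". */
static inline unsigned int meta_width(const struct meta_bits *m)
{
	return m->endbit - m->startbit + 1;
}

/* Read the metadatum value out of one container byte. */
static inline uint8_t meta_get(uint8_t container, const struct meta_bits *m)
{
	uint8_t mask = (uint8_t)((1u << meta_width(m)) - 1);

	return (container >> m->startbit) & mask;
}

/* Write the metadatum value into the container byte, preserving
 * the neighbouring bits outside [startbit, endbit].
 */
static inline uint8_t meta_set(uint8_t container, const struct meta_bits *m,
			       uint8_t val)
{
	uint8_t mask = (uint8_t)((1u << meta_width(m)) - 1);

	container &= (uint8_t)~(mask << m->startbit);
	return container | (uint8_t)((val & mask) << m->startbit);
}
```

With this layout a 4-bit metadatum at bits 2..5 round-trips through a single container byte without disturbing bits 0..1 and 6..7, which is the property that lets several sub-byte metadata share one container.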
If one were to dump all the metadata from a pipeline named ptables, one would use the following command: tc p4template get metadata/ptables/ Co-developed-by: Victor Nogueira Signed-off-by: Victor Nogueira Co-developed-by: Pedro Tammela Signed-off-by: Pedro Tammela Signed-off-by: Jamal Hadi Salim --- include/linux/skbuff.h | 17 + include/net/p4tc.h | 34 ++ include/uapi/linux/p4tc.h | 51 ++ net/core/skbuff.c | 17 + net/sched/p4tc/Makefile | 2 +- net/sched/p4tc/p4tc_meta.c | 819 +++++++++++++++++++++++++++++++++ net/sched/p4tc/p4tc_pipeline.c | 20 +- net/sched/p4tc/p4tc_tmpl_api.c | 15 + 8 files changed, 969 insertions(+), 6 deletions(-) create mode 100644 net/sched/p4tc/p4tc_meta.c diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 4c8492401..0d44b26bc 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -325,6 +325,20 @@ struct tc_skb_ext { }; #endif +#if IS_ENABLED(CONFIG_NET_P4_TC) +#include + +struct __p4tc_skb_ext { + u8 key[BITS_TO_BYTES(P4TC_MAX_KEYSZ)]; + u8 hdrs[HEADER_MAX_LEN]; + u8 metadata[META_MAX_LEN]; +}; + +struct p4tc_skb_ext { + struct __p4tc_skb_ext *p4tc_ext; +}; +#endif + struct sk_buff_head { /* These two members must be first to match sk_buff. 
*/ struct_group_tagged(sk_buff_list, list, @@ -4571,6 +4585,9 @@ enum skb_ext_id { #if IS_ENABLED(CONFIG_NET_TC_SKB_EXT) TC_SKB_EXT, #endif +#if IS_ENABLED(CONFIG_NET_P4_TC) + P4TC_SKB_EXT, +#endif #if IS_ENABLED(CONFIG_MPTCP) SKB_EXT_MPTCP, #endif diff --git a/include/net/p4tc.h b/include/net/p4tc.h index 178bbdf68..748a70c85 100644 --- a/include/net/p4tc.h +++ b/include/net/p4tc.h @@ -12,11 +12,13 @@ #define P4TC_DEFAULT_NUM_TABLES P4TC_MINTABLES_COUNT #define P4TC_DEFAULT_MAX_RULES 1 +#define P4TC_MAXMETA_OFFSET 512 #define P4TC_PATH_MAX 3 #define P4TC_KERNEL_PIPEID 0 #define P4TC_PID_IDX 0 +#define P4TC_MID_IDX 1 struct p4tc_dump_ctx { u32 ids[P4TC_PATH_MAX]; @@ -78,6 +80,7 @@ extern const struct p4tc_template_ops p4tc_pipeline_ops; struct p4tc_pipeline { struct p4tc_template_common common; + struct idr p_meta_idr; struct rcu_head rcu; struct net *net; struct tc_action **preacts; @@ -85,6 +88,7 @@ struct p4tc_pipeline { struct tc_action **postacts; int num_postacts; u32 max_rules; + u32 p_meta_offset; refcount_t p_ref; refcount_t p_ctrl_ref; u16 num_tables; @@ -126,6 +130,36 @@ static inline int p4tc_action_destroy(struct tc_action **acts) return ret; } +static inline bool pipeline_sealed(struct p4tc_pipeline *pipeline) +{ + return pipeline->p_state == P4TC_STATE_READY; +} + +struct p4tc_metadata { + struct p4tc_template_common common; + struct rcu_head rcu; + u32 m_id; + u32 m_skb_off; + refcount_t m_ref; + u16 m_sz; + u16 m_startbit; /* Relative to its container */ + u16 m_endbit; /* Relative to its container */ + u8 m_datatype; /* T_XXX */ + bool m_read_only; +}; + +extern const struct p4tc_template_ops p4tc_meta_ops; + +struct p4tc_metadata *tcf_meta_find_byid(struct p4tc_pipeline *pipeline, + u32 m_id); +void tcf_meta_fill_user_offsets(struct p4tc_pipeline *pipeline); +void tcf_meta_init(struct p4tc_pipeline *root_pipe); +struct p4tc_metadata *tcf_meta_get(struct p4tc_pipeline *pipeline, + const char *mname, const u32 m_id, + struct netlink_ext_ack 
*extack); +void tcf_meta_put_ref(struct p4tc_metadata *meta); + #define to_pipeline(t) ((struct p4tc_pipeline *)t) +#define to_meta(t) ((struct p4tc_metadata *)t) #endif diff --git a/include/uapi/linux/p4tc.h b/include/uapi/linux/p4tc.h index 739c0fe18..8934c032d 100644 --- a/include/uapi/linux/p4tc.h +++ b/include/uapi/linux/p4tc.h @@ -18,11 +18,15 @@ struct p4tcmsg { #define P4TC_MAXPARSE_KEYS 16 #define P4TC_MAXMETA_SZ 128 #define P4TC_MSGBATCH_SIZE 16 +#define P4TC_MAX_KEYSZ 512 +#define HEADER_MAX_LEN 512 +#define META_MAX_LEN 512 #define P4TC_MAX_KEYSZ 512 #define TEMPLATENAMSZ 256 #define PIPELINENAMSIZ TEMPLATENAMSZ +#define METANAMSIZ TEMPLATENAMSZ /* Root attributes */ enum { @@ -50,6 +54,7 @@ enum { enum { P4TC_OBJ_UNSPEC, P4TC_OBJ_PIPELINE, + P4TC_OBJ_META, __P4TC_OBJ_MAX, }; #define P4TC_OBJ_MAX __P4TC_OBJ_MAX @@ -59,6 +64,7 @@ enum { P4TC_UNSPEC, P4TC_PATH, P4TC_PARAMS, + P4TC_COUNT, __P4TC_MAX, }; #define P4TC_MAX __P4TC_MAX @@ -102,6 +108,51 @@ enum { }; #define P4T_MAX (__P4T_MAX - 1) +/* Details all the info needed to find out metadata size and layout inside cb + * datastructure + */ +struct p4tc_meta_size_params { + __u16 startbit; + __u16 endbit; + __u8 datatype; /* T_XXX */ +}; + +/* Metadata attributes */ +enum { + P4TC_META_UNSPEC, + P4TC_META_NAME, /* string */ + P4TC_META_SIZE, /* struct p4tc_meta_size_params */ + __P4TC_META_MAX +}; +#define P4TC_META_MAX __P4TC_META_MAX + +/* Linux system metadata */ +enum { + P4TC_KERNEL_META_UNSPEC, + P4TC_KERNEL_META_PKTLEN, /* u32 */ + P4TC_KERNEL_META_DATALEN, /* u32 */ + P4TC_KERNEL_META_SKBMARK, /* u32 */ + P4TC_KERNEL_META_TCINDEX, /* u16 */ + P4TC_KERNEL_META_SKBHASH, /* u32 */ + P4TC_KERNEL_META_SKBPRIO, /* u32 */ + P4TC_KERNEL_META_IFINDEX, /* s32 */ + P4TC_KERNEL_META_SKBIIF, /* s32 */ + P4TC_KERNEL_META_PROTOCOL, /* be16 */ + P4TC_KERNEL_META_PKTYPE, /* u8:3 */ + P4TC_KERNEL_META_IDF, /* u8:1 */ + P4TC_KERNEL_META_IPSUM, /* u8:2 */ + P4TC_KERNEL_META_OOOK, /* u8:1 */ + 
P4TC_KERNEL_META_FCLONE, /* u8:2 */ + P4TC_KERNEL_META_PEEKED, /* u8:1 */ + P4TC_KERNEL_META_QMAP, /* u16 */ + P4TC_KERNEL_META_PTYPEOFF, /* u8 */ + P4TC_KERNEL_META_CLONEOFF, /* u8 */ + P4TC_KERNEL_META_PTCLNOFF, /* u16 */ + P4TC_KERNEL_META_DIRECTION, /* u8:1 */ + __P4TC_KERNEL_META_MAX +}; +#define P4TC_KERNEL_META_MAX (__P4TC_KERNEL_META_MAX - 1) + #define P4TC_RTA(r) \ ((struct rtattr *)(((char *)(r)) + NLMSG_ALIGN(sizeof(struct p4tcmsg)))) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 4e73ab348..17f4c7d96 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -4583,6 +4583,9 @@ static const u8 skb_ext_type_len[] = { #if IS_ENABLED(CONFIG_NET_TC_SKB_EXT) [TC_SKB_EXT] = SKB_EXT_CHUNKSIZEOF(struct tc_skb_ext), #endif +#if IS_ENABLED(CONFIG_NET_P4_TC) + [P4TC_SKB_EXT] = SKB_EXT_CHUNKSIZEOF(struct p4tc_skb_ext), +#endif #if IS_ENABLED(CONFIG_MPTCP) [SKB_EXT_MPTCP] = SKB_EXT_CHUNKSIZEOF(struct mptcp_ext), #endif @@ -4603,6 +4606,9 @@ static __always_inline unsigned int skb_ext_total_length(void) #if IS_ENABLED(CONFIG_NET_TC_SKB_EXT) skb_ext_type_len[TC_SKB_EXT] + #endif +#if IS_ENABLED(CONFIG_NET_P4_TC) + skb_ext_type_len[P4TC_SKB_EXT] + +#endif #if IS_ENABLED(CONFIG_MPTCP) skb_ext_type_len[SKB_EXT_MPTCP] + #endif @@ -6685,6 +6691,13 @@ static void skb_ext_put_mctp(struct mctp_flow *flow) } #endif +#ifdef CONFIG_NET_P4_TC +static void skb_ext_put_p4tc(struct p4tc_skb_ext *p4tc_skb_ext) +{ + kfree(p4tc_skb_ext->p4tc_ext); +} +#endif + void __skb_ext_del(struct sk_buff *skb, enum skb_ext_id id) { struct skb_ext *ext = skb->extensions; @@ -6724,6 +6737,10 @@ void __skb_ext_put(struct skb_ext *ext) if (__skb_ext_exist(ext, SKB_EXT_MCTP)) skb_ext_put_mctp(skb_ext_get_ptr(ext, SKB_EXT_MCTP)); #endif +#ifdef CONFIG_NET_P4_TC + if (__skb_ext_exist(ext, P4TC_SKB_EXT)) + skb_ext_put_p4tc(skb_ext_get_ptr(ext, P4TC_SKB_EXT)); +#endif kmem_cache_free(skbuff_ext_cache, ext); } diff --git a/net/sched/p4tc/Makefile b/net/sched/p4tc/Makefile index 0881a7563..d523e668c 
100644 --- a/net/sched/p4tc/Makefile +++ b/net/sched/p4tc/Makefile @@ -1,3 +1,3 @@ # SPDX-License-Identifier: GPL-2.0 -obj-y := p4tc_types.o p4tc_tmpl_api.o p4tc_pipeline.o +obj-y := p4tc_types.o p4tc_tmpl_api.o p4tc_pipeline.o p4tc_meta.o diff --git a/net/sched/p4tc/p4tc_meta.c b/net/sched/p4tc/p4tc_meta.c new file mode 100644 index 000000000..ebeb73352 --- /dev/null +++ b/net/sched/p4tc/p4tc_meta.c @@ -0,0 +1,819 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * net/sched/p4tc_meta.c P4 TC API METADATA + * + * Copyright (c) 2022, Mojatatu Networks + * Copyright (c) 2022, Intel Corporation. + * Authors: Jamal Hadi Salim + * Victor Nogueira + * Pedro Tammela + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define START_META_OFFSET 0 + +static const struct nla_policy p4tc_meta_policy[P4TC_META_MAX + 1] = { + [P4TC_META_NAME] = { .type = NLA_STRING, .len = METANAMSIZ }, + [P4TC_META_SIZE] = { .type = NLA_BINARY, + .len = sizeof(struct p4tc_meta_size_params) }, +}; + +static int _tcf_meta_put(struct p4tc_pipeline *pipeline, + struct p4tc_metadata *meta, bool unconditional_purge, + struct netlink_ext_ack *extack) +{ + if (!unconditional_purge && !refcount_dec_if_one(&meta->m_ref)) + return -EBUSY; + + pipeline->p_meta_offset -= BITS_TO_U32(meta->m_sz) * sizeof(u32); + idr_remove(&pipeline->p_meta_idr, meta->m_id); + + kfree_rcu(meta, rcu); + + return 0; +} + +static int tcf_meta_put(struct net *net, struct p4tc_template_common *template, + bool unconditional_purge, + struct netlink_ext_ack *extack) +{ + struct p4tc_pipeline *pipeline = + tcf_pipeline_find_byid(net, template->p_id); + struct p4tc_metadata *meta = to_meta(template); + int ret; + + ret = _tcf_meta_put(pipeline, meta, unconditional_purge, extack); + if (ret < 0) + NL_SET_ERR_MSG(extack, "Unable to delete referenced metadatum"); + + return ret; +} + 
+struct p4tc_metadata *tcf_meta_find_byid(struct p4tc_pipeline *pipeline, + u32 m_id) +{ + return idr_find(&pipeline->p_meta_idr, m_id); +} + +static struct p4tc_metadata * +tcf_meta_find_byname(const char *m_name, struct p4tc_pipeline *pipeline) +{ + struct p4tc_metadata *meta; + unsigned long tmp, id; + + idr_for_each_entry_ul(&pipeline->p_meta_idr, meta, tmp, id) + if (strncmp(meta->common.name, m_name, METANAMSIZ) == 0) + return meta; + + return NULL; +} + +static inline struct p4tc_metadata * +tcf_meta_find_byname_attr(struct nlattr *name_attr, + struct p4tc_pipeline *pipeline) +{ + return tcf_meta_find_byname(nla_data(name_attr), pipeline); +} + +static struct p4tc_metadata *tcf_meta_find_byany(struct p4tc_pipeline *pipeline, + const char *mname, + const u32 m_id, + struct netlink_ext_ack *extack) +{ + struct p4tc_metadata *meta; + int err; + + if (m_id) { + meta = tcf_meta_find_byid(pipeline, m_id); + if (!meta) { + NL_SET_ERR_MSG(extack, + "Unable to find metadatum by id"); + err = -EINVAL; + goto out; + } + } else { + if (mname) { + meta = tcf_meta_find_byname(mname, pipeline); + if (!meta) { + NL_SET_ERR_MSG(extack, + "Metadatum name not found"); + err = -EINVAL; + goto out; + } + } else { + NL_SET_ERR_MSG(extack, + "Must specify metadatum name or id"); + err = -EINVAL; + goto out; + } + } + + return meta; +out: + return ERR_PTR(err); +} + +struct p4tc_metadata *tcf_meta_get(struct p4tc_pipeline *pipeline, + const char *mname, const u32 m_id, + struct netlink_ext_ack *extack) +{ + struct p4tc_metadata *meta; + + meta = tcf_meta_find_byany(pipeline, mname, m_id, extack); + if (IS_ERR(meta)) + return meta; + + /* Should never be zero */ + WARN_ON(!refcount_inc_not_zero(&meta->m_ref)); + return meta; +} + +void tcf_meta_put_ref(struct p4tc_metadata *meta) +{ + WARN_ON(!refcount_dec_not_one(&meta->m_ref)); +} + +static struct p4tc_metadata * +tcf_meta_find_byanyattr(struct p4tc_pipeline *pipeline, + struct nlattr *name_attr, const u32 m_id, + struct 
netlink_ext_ack *extack) +{ + char *mname = NULL; + + if (name_attr) + mname = nla_data(name_attr); + + return tcf_meta_find_byany(pipeline, mname, m_id, extack); +} + +static int p4tc_check_meta_size(struct p4tc_meta_size_params *sz_params, + struct p4tc_type *type, + struct netlink_ext_ack *extack) +{ + int new_bitsz; + + if (sz_params->startbit > P4T_MAX_BITSZ || + sz_params->startbit > type->bitsz) { + NL_SET_ERR_MSG(extack, "Startbit value too big"); + return -EINVAL; + } + + if (sz_params->endbit > P4T_MAX_BITSZ || + sz_params->endbit > type->bitsz) { + NL_SET_ERR_MSG(extack, "Endbit value too big"); + return -EINVAL; + } + + if (sz_params->endbit < sz_params->startbit) { + NL_SET_ERR_MSG(extack, "Endbit value smaller than startbit"); + return -EINVAL; + } + + new_bitsz = (sz_params->endbit - sz_params->startbit + 1); + if (new_bitsz == 0) { + NL_SET_ERR_MSG(extack, "Bit size can't be zero"); + return -EINVAL; + } + + if (new_bitsz > P4T_MAX_BITSZ || new_bitsz > type->bitsz) { + NL_SET_ERR_MSG(extack, "Bit size too big"); + return -EINVAL; + } + + return new_bitsz; +} + +void tcf_meta_fill_user_offsets(struct p4tc_pipeline *pipeline) +{ + u32 meta_off = START_META_OFFSET; + struct p4tc_metadata *meta; + unsigned long tmp, id; + + idr_for_each_entry_ul(&pipeline->p_meta_idr, meta, tmp, id) { + /* Offsets are multiples of 4 for alignment purposes */ + meta->m_skb_off = meta_off; + meta_off += BITS_TO_U32(meta->m_sz) * sizeof(u32); + } +} + +static struct p4tc_metadata * +__tcf_meta_create(struct p4tc_pipeline *pipeline, u32 m_id, const char *m_name, + struct p4tc_meta_size_params *sz_params, gfp_t alloc_flag, + bool read_only, struct netlink_ext_ack *extack) +{ + u32 p_meta_offset = 0; + bool kmeta; + struct p4tc_metadata *meta; + struct p4tc_type *datatype; + u32 sz_bytes; + int sz_bits; + int ret; + + kmeta = pipeline->common.p_id == P4TC_KERNEL_PIPEID; + + meta = kzalloc(sizeof(*meta), alloc_flag); + if (!meta) { + if (kmeta) + pr_err("Unable to allocate 
kernel metadatum"); + else + NL_SET_ERR_MSG(extack, + "Unable to allocate user metadatum"); + ret = -ENOMEM; + goto out; + } + + meta->common.p_id = pipeline->common.p_id; + + datatype = p4type_find_byid(sz_params->datatype); + if (!datatype) { + if (kmeta) + pr_err("Invalid data type for kernel metadataum %u\n", + sz_params->datatype); + else + NL_SET_ERR_MSG(extack, + "Invalid data type for user metdatum"); + ret = -EINVAL; + goto free; + } + + sz_bits = p4tc_check_meta_size(sz_params, datatype, extack); + if (sz_bits < 0) { + ret = sz_bits; + goto free; + } + + sz_bytes = BITS_TO_U32(datatype->bitsz) * sizeof(u32); + if (!kmeta) { + p_meta_offset = pipeline->p_meta_offset + sz_bytes; + if (p_meta_offset > BITS_TO_BYTES(P4TC_MAXMETA_OFFSET)) { + NL_SET_ERR_MSG(extack, "Metadata max offset exceeded"); + ret = -EINVAL; + goto free; + } + } + + meta->m_datatype = datatype->typeid; + meta->m_startbit = sz_params->startbit; + meta->m_endbit = sz_params->endbit; + meta->m_sz = sz_bits; + meta->m_read_only = read_only; + + if (m_id) { + ret = idr_alloc_u32(&pipeline->p_meta_idr, meta, &m_id, m_id, + alloc_flag); + if (ret < 0) { + if (kmeta) + pr_err("Unable to alloc kernel metadatum id %u\n", + m_id); + else + NL_SET_ERR_MSG(extack, + "Unable to alloc user metadatum id"); + goto free; + } + + meta->m_id = m_id; + } else { + meta->m_id = 1; + + ret = idr_alloc_u32(&pipeline->p_meta_idr, meta, &meta->m_id, + UINT_MAX, alloc_flag); + if (ret < 0) { + if (kmeta) + pr_err("Unable to alloc kernel metadatum id %u\n", + meta->m_id); + else + NL_SET_ERR_MSG(extack, + "Unable to alloc metadatum id"); + goto free; + } + } + + if (!kmeta) + pipeline->p_meta_offset = p_meta_offset; + + strscpy(meta->common.name, m_name, METANAMSIZ); + meta->common.ops = (struct p4tc_template_ops *)&p4tc_meta_ops; + + refcount_set(&meta->m_ref, 1); + + return meta; + +free: + kfree(meta); +out: + return ERR_PTR(ret); +} + +struct p4tc_metadata *tcf_meta_create(struct nlmsghdr *n, struct nlattr *nla, 
+ u32 m_id, struct p4tc_pipeline *pipeline, + struct netlink_ext_ack *extack) +{ + int ret = 0; + struct p4tc_meta_size_params *sz_params; + struct nlattr *tb[P4TC_META_MAX + 1]; + char *m_name; + + ret = nla_parse_nested(tb, P4TC_META_MAX, nla, p4tc_meta_policy, + extack); + if (ret < 0) + goto out; + + if (tcf_meta_find_byname_attr(tb[P4TC_META_NAME], pipeline) || + tcf_meta_find_byid(pipeline, m_id)) { + NL_SET_ERR_MSG(extack, "Metadatum already exists"); + ret = -EEXIST; + goto out; + } + + if (tb[P4TC_META_NAME]) { + m_name = nla_data(tb[P4TC_META_NAME]); + } else { + NL_SET_ERR_MSG(extack, "Must specify metadatum name"); + ret = -ENOENT; + goto out; + } + + if (tb[P4TC_META_SIZE]) { + sz_params = nla_data(tb[P4TC_META_SIZE]); + } else { + NL_SET_ERR_MSG(extack, "Must specify metadatum size params"); + ret = -ENOENT; + goto out; + } + + return __tcf_meta_create(pipeline, m_id, m_name, sz_params, GFP_KERNEL, + false, extack); + +out: + return ERR_PTR(ret); +} + +static struct p4tc_metadata *tcf_meta_update(struct nlmsghdr *n, + struct nlattr *nla, u32 m_id, + struct p4tc_pipeline *pipeline, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[P4TC_META_MAX + 1]; + struct p4tc_metadata *meta; + int ret; + + ret = nla_parse_nested(tb, P4TC_META_MAX, nla, p4tc_meta_policy, + extack); + + if (ret < 0) + goto out; + + meta = tcf_meta_find_byanyattr(pipeline, tb[P4TC_META_NAME], m_id, + extack); + if (IS_ERR(meta)) + return meta; + + if (tb[P4TC_META_SIZE]) { + struct p4tc_type *new_datatype, *curr_datatype; + struct p4tc_meta_size_params *sz_params; + u32 new_bytesz, curr_bytesz; + int new_bitsz; + u32 p_meta_offset; + int diff; + + sz_params = nla_data(tb[P4TC_META_SIZE]); + new_datatype = p4type_find_byid(sz_params->datatype); + if (!new_datatype) { + NL_SET_ERR_MSG(extack, "Invalid data type"); + ret = -EINVAL; + goto out; + } + + new_bitsz = + p4tc_check_meta_size(sz_params, new_datatype, extack); + if (new_bitsz < 0) { + ret = new_bitsz; + goto out; + } + + 
new_bytesz = BITS_TO_U32(new_datatype->bitsz) * sizeof(u32); + + curr_datatype = p4type_find_byid(meta->m_datatype); + curr_bytesz = BITS_TO_U32(curr_datatype->bitsz) * sizeof(u32); + + diff = new_bytesz - curr_bytesz; + p_meta_offset = pipeline->p_meta_offset + diff; + if (p_meta_offset > BITS_TO_BYTES(P4TC_MAXMETA_OFFSET)) { + NL_SET_ERR_MSG(extack, "Metadata max offset exceeded"); + ret = -EINVAL; + goto out; + } + + pipeline->p_meta_offset = p_meta_offset; + + meta->m_datatype = new_datatype->typeid; + meta->m_startbit = sz_params->startbit; + meta->m_endbit = sz_params->endbit; + meta->m_sz = new_bitsz; + } + + return meta; + +out: + return ERR_PTR(ret); +} + +static struct p4tc_template_common * +tcf_meta_cu(struct net *net, struct nlmsghdr *n, struct nlattr *nla, + struct p4tc_nl_pname *nl_pname, u32 *ids, + struct netlink_ext_ack *extack) +{ + u32 pipeid = ids[P4TC_PID_IDX], m_id = ids[P4TC_MID_IDX]; + struct p4tc_pipeline *pipeline; + struct p4tc_metadata *meta; + + pipeline = tcf_pipeline_find_byany_unsealed(net, nl_pname->data, pipeid, + extack); + if (IS_ERR(pipeline)) + return (void *)pipeline; + + if (n->nlmsg_flags & NLM_F_REPLACE) + meta = tcf_meta_update(n, nla, m_id, pipeline, extack); + else + meta = tcf_meta_create(n, nla, m_id, pipeline, extack); + + if (IS_ERR(meta)) + goto out; + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + +out: + return (struct p4tc_template_common *)meta; +} + +static int _tcf_meta_fill_nlmsg(struct sk_buff *skb, + const struct p4tc_metadata *meta) +{ + unsigned char *b = nlmsg_get_pos(skb); + struct p4tc_meta_size_params sz_params; + struct nlattr *nest; + + if (nla_put_u32(skb, P4TC_PATH, meta->m_id)) + goto out_nlmsg_trim; + + nest = nla_nest_start(skb, P4TC_PARAMS); + if (!nest) + goto out_nlmsg_trim; + + sz_params.datatype = meta->m_datatype; + sz_params.startbit = meta->m_startbit; + 
sz_params.endbit = meta->m_endbit; + + if (nla_put_string(skb, P4TC_META_NAME, meta->common.name)) + goto out_nlmsg_trim; + if (nla_put(skb, P4TC_META_SIZE, sizeof(sz_params), &sz_params)) + goto out_nlmsg_trim; + + nla_nest_end(skb, nest); + + return skb->len; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return -1; +} + +static int tcf_meta_fill_nlmsg(struct net *net, struct sk_buff *skb, + struct p4tc_template_common *template, + struct netlink_ext_ack *extack) +{ + const struct p4tc_metadata *meta = to_meta(template); + + if (_tcf_meta_fill_nlmsg(skb, meta) <= 0) { + NL_SET_ERR_MSG(extack, + "Failed to fill notification attributes for metadatum"); + return -EINVAL; + } + + return 0; +} + +static int tcf_meta_flush(struct sk_buff *skb, struct p4tc_pipeline *pipeline, + struct netlink_ext_ack *extack) +{ + struct p4tc_metadata *meta; + unsigned long tmp, m_id; + unsigned char *b = nlmsg_get_pos(skb); + int ret = 0; + int i = 0; + + if (nla_put_u32(skb, P4TC_PATH, 0)) + goto out_nlmsg_trim; + + if (idr_is_empty(&pipeline->p_meta_idr)) { + NL_SET_ERR_MSG(extack, "There is no metadata to flush"); + ret = 0; + goto out_nlmsg_trim; + } + + idr_for_each_entry_ul(&pipeline->p_meta_idr, meta, tmp, m_id) { + if (_tcf_meta_put(pipeline, meta, false, extack) < 0) { + ret = -EBUSY; + continue; + } + i++; + } + + nla_put_u32(skb, P4TC_COUNT, i); + + if (ret < 0) { + if (i == 0) { + NL_SET_ERR_MSG(extack, "Unable to flush any metadata"); + goto out_nlmsg_trim; + } else { + NL_SET_ERR_MSG(extack, "Unable to flush all metadata"); + } + } + + return i; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return ret; +} + +static int tcf_meta_gd(struct net *net, struct sk_buff *skb, struct nlmsghdr *n, + struct nlattr *nla, struct p4tc_nl_pname *nl_pname, + u32 *ids, struct netlink_ext_ack *extack) +{ + u32 pipeid = ids[P4TC_PID_IDX], m_id = ids[P4TC_MID_IDX]; + struct nlattr *tb[P4TC_META_MAX + 1] = {}; + unsigned char *b = nlmsg_get_pos(skb); + int ret = 0; + struct p4tc_pipeline *pipeline;
+ struct p4tc_metadata *meta; + + if (n->nlmsg_type == RTM_DELP4TEMPLATE) + pipeline = tcf_pipeline_find_byany_unsealed(net, nl_pname->data, + pipeid, extack); + else + pipeline = tcf_pipeline_find_byany(net, nl_pname->data, pipeid, + extack); + if (IS_ERR(pipeline)) + return PTR_ERR(pipeline); + + if (nla) { + ret = nla_parse_nested(tb, P4TC_META_MAX, nla, p4tc_meta_policy, + extack); + + if (ret < 0) + return ret; + } + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (n->nlmsg_type == RTM_DELP4TEMPLATE && (n->nlmsg_flags & NLM_F_ROOT)) + return tcf_meta_flush(skb, pipeline, extack); + + meta = tcf_meta_find_byanyattr(pipeline, tb[P4TC_META_NAME], m_id, + extack); + if (IS_ERR(meta)) + return PTR_ERR(meta); + + if (_tcf_meta_fill_nlmsg(skb, meta) < 0) { + NL_SET_ERR_MSG(extack, + "Failed to fill notification attributes for metadatum"); + return -EINVAL; + } + + if (n->nlmsg_type == RTM_DELP4TEMPLATE) { + ret = _tcf_meta_put(pipeline, meta, false, extack); + if (ret < 0) { + NL_SET_ERR_MSG(extack, + "Unable to delete referenced metadatum"); + goto out_nlmsg_trim; + } + } + + return ret; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return ret; +} + +static int tcf_meta_dump(struct sk_buff *skb, struct p4tc_dump_ctx *ctx, + struct nlattr *nla, char **p_name, u32 *ids, + struct netlink_ext_ack *extack) +{ + unsigned char *b = nlmsg_get_pos(skb); + const u32 pipeid = ids[P4TC_PID_IDX]; + struct net *net = sock_net(skb->sk); + unsigned long m_id = 0; + int i = 0; + struct p4tc_pipeline *pipeline; + struct p4tc_metadata *meta; + unsigned long tmp; + + if (!ctx->ids[P4TC_PID_IDX]) { + pipeline = + tcf_pipeline_find_byany(net, *p_name, pipeid, extack); + if (IS_ERR(pipeline)) + return PTR_ERR(pipeline); + ctx->ids[P4TC_PID_IDX] = pipeline->common.p_id; + } else { + pipeline = tcf_pipeline_find_byid(net, ctx->ids[P4TC_PID_IDX]); + } + + m_id = 
ctx->ids[P4TC_MID_IDX]; + + idr_for_each_entry_continue_ul(&pipeline->p_meta_idr, meta, tmp, m_id) { + struct nlattr *count, *param; + + if (i == P4TC_MSGBATCH_SIZE) + break; + + count = nla_nest_start(skb, i + 1); + if (!count) + goto out_nlmsg_trim; + + param = nla_nest_start(skb, P4TC_PARAMS); + if (!param) + goto out_nlmsg_trim; + if (nla_put_string(skb, P4TC_META_NAME, meta->common.name)) + goto out_nlmsg_trim; + + nla_nest_end(skb, param); + nla_nest_end(skb, count); + + i++; + } + + if (i == 0) { + if (!ctx->ids[P4TC_MID_IDX]) + NL_SET_ERR_MSG(extack, "There is no metadata to dump"); + return 0; + } + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (!(*p_name)) + *p_name = pipeline->common.name; + + ctx->ids[P4TC_MID_IDX] = m_id; + + return skb->len; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return -ENOMEM; +} + +static int __p4tc_register_kmeta(struct p4tc_pipeline *pipeline, u32 m_id, + const char *m_name, u8 startbit, u8 endbit, + bool read_only, u32 datatype) +{ + struct p4tc_meta_size_params sz_params = { + .startbit = startbit, + .endbit = endbit, + .datatype = datatype, + }; + struct p4tc_metadata *meta; + + meta = __tcf_meta_create(pipeline, m_id, m_name, &sz_params, GFP_ATOMIC, + read_only, NULL); + if (IS_ERR(meta)) { + pr_err("Failed to register metadata %s %ld\n", m_name, + PTR_ERR(meta)); + return PTR_ERR(meta); + } + + pr_debug("Registered kernel metadata %s with id %u\n", m_name, m_id); + + return 0; +} + +#define p4tc_register_kmeta(...) 
\ + do { \ + if (__p4tc_register_kmeta(__VA_ARGS__) < 0) \ + return; \ + } while (0) + +void tcf_meta_init(struct p4tc_pipeline *root_pipe) +{ + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_PKTLEN, "pktlen", 0, 31, + false, P4T_U32); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_DATALEN, "datalen", 0, + 31, false, P4T_U32); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_SKBMARK, "skbmark", 0, + 31, false, P4T_U32); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_TCINDEX, "tcindex", 0, + 15, false, P4T_U16); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_SKBHASH, "skbhash", 0, + 31, false, P4T_U32); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_SKBPRIO, "skbprio", 0, + 31, false, P4T_U32); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_IFINDEX, "ifindex", 0, + 31, false, P4T_S32); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_SKBIIF, "iif", 0, 31, + true, P4T_DEV); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_PROTOCOL, "skbproto", 0, + 15, false, P4T_BE16); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_PTYPEOFF, "ptypeoff", 0, + 7, false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_CLONEOFF, "cloneoff", 0, + 7, false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_PTCLNOFF, "ptclnoff", 0, + 15, false, P4T_U16); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_QMAP, "skbqmap", 0, 15, + false, P4T_U16); + +#if defined(__LITTLE_ENDIAN_BITFIELD) + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_PKTYPE, "skbptype", 0, + 2, false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_IDF, "skbidf", 3, 3, + false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_IPSUM, "skbipsum", 5, 6, + false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_OOOK, "skboook", 7, 7, + false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_FCLONE, "fclone", 2, 3, + false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_PEEKED, "skbpeek", 4, 4, + 
false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_DIRECTION, "direction", + 7, 7, false, P4T_U8); +#elif defined(__BIG_ENDIAN_BITFIELD) + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_PKTYPE, "skbptype", 5, + 7, false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_IDF, "skbidf", 4, 4, + false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_IPSUM, "skbipsum", 1, 2, + false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_OOOK, "skboook", 0, 0, + false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_FCLONE, "fclone", 4, 5, + false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_PEEKED, "skbpeek", 3, 3, + false, P4T_U8); + + p4tc_register_kmeta(root_pipe, P4TC_KERNEL_META_DIRECTION, "direction", + 0, 0, false, P4T_U8); +#else +#error "Please fix " +#endif +} + +const struct p4tc_template_ops p4tc_meta_ops = { + .cu = tcf_meta_cu, + .fill_nlmsg = tcf_meta_fill_nlmsg, + .gd = tcf_meta_gd, + .put = tcf_meta_put, + .dump = tcf_meta_dump, +}; diff --git a/net/sched/p4tc/p4tc_pipeline.c b/net/sched/p4tc/p4tc_pipeline.c index c6c49ab71..49f0062ad 100644 --- a/net/sched/p4tc/p4tc_pipeline.c +++ b/net/sched/p4tc/p4tc_pipeline.c @@ -80,6 +80,8 @@ static const struct nla_policy tc_pipeline_policy[P4TC_PIPELINE_MAX + 1] = { static void tcf_pipeline_destroy(struct p4tc_pipeline *pipeline, bool free_pipeline) { + idr_destroy(&pipeline->p_meta_idr); + if (free_pipeline) kfree(pipeline); } @@ -104,6 +106,8 @@ static int tcf_pipeline_put(struct net *net, struct p4tc_pipeline_net *pipe_net = net_generic(net, pipeline_net_id); struct p4tc_pipeline *pipeline = to_pipeline(template); struct net *pipeline_net = maybe_get_net(net); + struct p4tc_metadata *meta; + unsigned long m_id, tmp; if (pipeline_net && !refcount_dec_if_one(&pipeline->p_ref)) { NL_SET_ERR_MSG(extack, "Can't delete referenced pipeline"); @@ -112,6 +116,9 @@ static int tcf_pipeline_put(struct net *net,
idr_remove(&pipe_net->pipeline_idr, pipeline->common.p_id); + idr_for_each_entry_ul(&pipeline->p_meta_idr, meta, tmp, m_id) + meta->common.ops->put(net, &meta->common, true, extack); + /* XXX: The action fields are only accessed in the control path * since they will be copied to the filter, where the data path * will use them. So there is no need to free them in the rcu @@ -154,11 +161,6 @@ static inline int pipeline_try_set_state_ready(struct p4tc_pipeline *pipeline, return true; } -static inline bool pipeline_sealed(struct p4tc_pipeline *pipeline) -{ - return pipeline->p_state == P4TC_STATE_READY; -} - static int p4tc_action_init(struct net *net, struct nlattr *nla, struct tc_action *acts[], u32 pipeid, u32 flags, struct netlink_ext_ack *extack) @@ -317,6 +319,9 @@ static struct p4tc_pipeline *tcf_pipeline_create(struct net *net, pipeline->num_postacts = 0; } + idr_init(&pipeline->p_meta_idr); + pipeline->p_meta_offset = 0; + pipeline->p_state = P4TC_STATE_NOT_READY; pipeline->net = net; @@ -508,6 +513,7 @@ tcf_pipeline_update(struct net *net, struct nlmsghdr *n, struct nlattr *nla, ret = pipeline_try_set_state_ready(pipeline, extack); if (ret < 0) goto postactions_destroy; + tcf_meta_fill_user_offsets(pipeline); } if (max_rules) @@ -724,12 +730,16 @@ static void __tcf_pipeline_init(void) strscpy(root_pipeline->common.name, "kernel", PIPELINENAMSIZ); + idr_init(&root_pipeline->p_meta_idr); + root_pipeline->common.ops = (struct p4tc_template_ops *)&p4tc_pipeline_ops; root_pipeline->common.p_id = pipeid; root_pipeline->p_state = P4TC_STATE_READY; + + tcf_meta_init(root_pipeline); } static void tcf_pipeline_init(void) diff --git a/net/sched/p4tc/p4tc_tmpl_api.c b/net/sched/p4tc/p4tc_tmpl_api.c index debd5f825..a13d02ce5 100644 --- a/net/sched/p4tc/p4tc_tmpl_api.c +++ b/net/sched/p4tc/p4tc_tmpl_api.c @@ -42,6 +42,7 @@ static bool obj_is_valid(u32 obj) { switch (obj) { case P4TC_OBJ_PIPELINE: + case P4TC_OBJ_META: return true; default: return false; @@ -50,6 +51,7 @@ 
static bool obj_is_valid(u32 obj) static const struct p4tc_template_ops *p4tc_ops[P4TC_OBJ_MAX] = { [P4TC_OBJ_PIPELINE] = &p4tc_pipeline_ops, + [P4TC_OBJ_META] = &p4tc_meta_ops, }; int tcf_p4_tmpl_generic_dump(struct sk_buff *skb, struct p4tc_dump_ctx *ctx, @@ -125,11 +127,15 @@ static int tc_ctl_p4_tmpl_gd_1(struct net *net, struct sk_buff *skb, ids[P4TC_PID_IDX] = t->pipeid; if (tb[P4TC_PATH]) { + const u32 *arg_ids = nla_data(tb[P4TC_PATH]); + if ((nla_len(tb[P4TC_PATH])) > (P4TC_PATH_MAX - 1) * sizeof(u32)) { NL_SET_ERR_MSG(extack, "Path is too big"); return -E2BIG; } + + memcpy(&ids[P4TC_MID_IDX], arg_ids, nla_len(tb[P4TC_PATH])); } op = (struct p4tc_template_ops *)p4tc_ops[t->obj]; @@ -309,12 +315,17 @@ tcf_p4_tmpl_cu_1(struct sk_buff *skb, struct net *net, struct nlmsghdr *n, ids[P4TC_PID_IDX] = t->pipeid; if (p4tc_attr[P4TC_PATH]) { + const u32 *arg_ids = nla_data(p4tc_attr[P4TC_PATH]); + if ((nla_len(p4tc_attr[P4TC_PATH])) > (P4TC_PATH_MAX - 1) * sizeof(u32)) { NL_SET_ERR_MSG(extack, "Path is too big"); ret = -E2BIG; goto out; } + + memcpy(&ids[P4TC_MID_IDX], arg_ids, + nla_len(p4tc_attr[P4TC_PATH])); } op = (struct p4tc_template_ops *)p4tc_ops[t->obj]; @@ -504,11 +515,15 @@ static int tc_ctl_p4_tmpl_dump_1(struct sk_buff *skb, struct nlattr *arg, ids[P4TC_PID_IDX] = t->pipeid; if (tb[P4TC_PATH]) { + const u32 *arg_ids = nla_data(tb[P4TC_PATH]); + if ((nla_len(tb[P4TC_PATH])) > (P4TC_PATH_MAX - 1) * sizeof(u32)) { NL_SET_ERR_MSG(extack, "Path is too big"); return -E2BIG; } + + memcpy(&ids[P4TC_MID_IDX], arg_ids, nla_len(tb[P4TC_PATH])); } op = (struct p4tc_template_ops *)p4tc_ops[t->obj]; From patchwork Tue Jan 24 17:05:04 2023 X-Patchwork-Submitter: Jamal Hadi Salim X-Patchwork-Id: 13114393 X-Patchwork-Delegate: kuba@kernel.org From: Jamal Hadi Salim To: netdev@vger.kernel.org Cc: kernel@mojatatu.com, deb.chatterjee@intel.com, anjali.singhai@intel.com, namrata.limaye@intel.com, khalidm@nvidia.com, tom@sipanda.io, pratyush@sipanda.io, jiri@resnulli.us, xiyou.wangcong@gmail.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, vladbu@nvidia.com, simon.horman@corigine.com Subject: [PATCH net-next RFC 14/20] p4tc: add header field create, get, delete, flush and dump Date: Tue, 24 Jan 2023 12:05:04 -0500 Message-Id: <20230124170510.316970-14-jhs@mojatatu.com> In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com> References: <20230124170510.316970-1-jhs@mojatatu.com> X-Patchwork-State: RFC This commit allows control to create, get, delete, flush and dump header field objects. The created header fields are retrieved at runtime by the parser.
From a control plane perspective, a header field can only be created once the appropriate parser is instantiated. At runtime, existing header fields can be referenced for computation from metact: metact will use header fields either to create lookup keys or to edit the header fields. Header fields belong to a pipeline and a parser instance, and can only be created in an unsealed pipeline. To create a header field, the user must issue the equivalent of the following command: tc p4template create hdrfield/myprog/myparser/ipv4/dstAddr hdrfieldid 4 \ type ipv4 where myprog is the name of a pipeline, myparser is the name of a parser instance, and ipv4/dstAddr is the name of a header field of type ipv4. To delete a header field, the user must issue the equivalent of the following command: tc p4template delete hdrfield/myprog/myparser/ipv4/dstAddr where myprog is the name of the pipeline, myparser is the name of a parser instance, and ipv4/dstAddr is the name of the header field to be deleted. To retrieve meta-information from a header field, such as its length, position and type, the user must issue the equivalent of the following command: tc p4template get hdrfield/myprog/myparser/ipv4/dstAddr where myprog is the name of the pipeline, myparser is the name of a parser instance, and ipv4/dstAddr is the name of the header field to be retrieved. The user can also dump all the header fields available in a parser instance using the equivalent of the following command: tc p4template get hdrfield/myprog/myparser/ With that, the user will get all the header field names available in a specific parser instance. The user can also flush all the header fields available in a parser instance using the equivalent of the following command: tc p4template del hdrfield/myprog/myparser/ Header fields do not support update operations.
Co-developed-by: Victor Nogueira Signed-off-by: Victor Nogueira Co-developed-by: Pedro Tammela Signed-off-by: Pedro Tammela Signed-off-by: Jamal Hadi Salim --- include/net/p4tc.h | 62 +++ include/uapi/linux/p4tc.h | 19 + net/sched/p4tc/Makefile | 3 +- net/sched/p4tc/p4tc_hdrfield.c | 625 +++++++++++++++++++++++++++++++ net/sched/p4tc/p4tc_parser_api.c | 229 +++++++++++ net/sched/p4tc/p4tc_pipeline.c | 4 + net/sched/p4tc/p4tc_tmpl_api.c | 2 + 7 files changed, 943 insertions(+), 1 deletion(-) create mode 100644 net/sched/p4tc/p4tc_hdrfield.c create mode 100644 net/sched/p4tc/p4tc_parser_api.c diff --git a/include/net/p4tc.h b/include/net/p4tc.h index 748a70c85..13cf4162e 100644 --- a/include/net/p4tc.h +++ b/include/net/p4tc.h @@ -19,6 +19,10 @@ #define P4TC_PID_IDX 0 #define P4TC_MID_IDX 1 +#define P4TC_PARSEID_IDX 1 +#define P4TC_HDRFIELDID_IDX 2 + +#define P4TC_HDRFIELD_IS_VALIDITY_BIT 0x1 struct p4tc_dump_ctx { u32 ids[P4TC_PATH_MAX]; @@ -83,6 +87,7 @@ struct p4tc_pipeline { struct idr p_meta_idr; struct rcu_head rcu; struct net *net; + struct p4tc_parser *parser; struct tc_action **preacts; int num_preacts; struct tc_action **postacts; @@ -150,6 +155,30 @@ struct p4tc_metadata { extern const struct p4tc_template_ops p4tc_meta_ops; +struct p4tc_parser { + char parser_name[PARSERNAMSIZ]; + struct idr hdr_fields_idr; +#ifdef CONFIG_KPARSER + const struct kparser_parser *kparser; +#endif + refcount_t parser_ref; + u32 parser_inst_id; +}; + +struct p4tc_hdrfield { + struct p4tc_template_common common; + struct p4tc_parser *parser; + u32 parser_inst_id; + u32 hdrfield_id; + refcount_t hdrfield_ref; + u16 startbit; + u16 endbit; + u8 datatype; /* T_XXX */ + u8 flags; /* P4TC_HDRFIELD_FLAGS_* */ +}; + +extern const struct p4tc_template_ops p4tc_hdrfield_ops; + struct p4tc_metadata *tcf_meta_find_byid(struct p4tc_pipeline *pipeline, u32 m_id); void tcf_meta_fill_user_offsets(struct p4tc_pipeline *pipeline); @@ -159,7 +188,40 @@ struct p4tc_metadata *tcf_meta_get(struct 
p4tc_pipeline *pipeline, struct netlink_ext_ack *extack); void tcf_meta_put_ref(struct p4tc_metadata *meta); +struct p4tc_parser *tcf_parser_create(struct p4tc_pipeline *pipeline, + const char *parser_name, + u32 parser_inst_id, + struct netlink_ext_ack *extack); + +struct p4tc_parser *tcf_parser_find_byid(struct p4tc_pipeline *pipeline, + const u32 parser_inst_id); +struct p4tc_parser *tcf_parser_find_byany(struct p4tc_pipeline *pipeline, + const char *parser_name, + u32 parser_inst_id, + struct netlink_ext_ack *extack); +int tcf_parser_del(struct net *net, struct p4tc_pipeline *pipeline, + struct p4tc_parser *parser, struct netlink_ext_ack *extack); +bool tcf_parser_is_callable(struct p4tc_parser *parser); +int tcf_skb_parse(struct sk_buff *skb, struct p4tc_skb_ext *p4tc_ext, + struct p4tc_parser *parser); + +struct p4tc_hdrfield *tcf_hdrfield_find_byid(struct p4tc_parser *parser, + const u32 hdrfield_id); +struct p4tc_hdrfield *tcf_hdrfield_find_byany(struct p4tc_parser *parser, + const char *hdrfield_name, + u32 hdrfield_id, + struct netlink_ext_ack *extack); +bool tcf_parser_check_hdrfields(struct p4tc_parser *parser, + struct p4tc_hdrfield *hdrfield); +void *tcf_hdrfield_fetch(struct sk_buff *skb, struct p4tc_hdrfield *hdrfield); +struct p4tc_hdrfield *tcf_hdrfield_get(struct p4tc_parser *parser, + const char *hdrfield_name, + u32 hdrfield_id, + struct netlink_ext_ack *extack); +void tcf_hdrfield_put_ref(struct p4tc_hdrfield *hdrfield); + #define to_pipeline(t) ((struct p4tc_pipeline *)t) #define to_meta(t) ((struct p4tc_metadata *)t) +#define to_hdrfield(t) ((struct p4tc_hdrfield *)t) #endif diff --git a/include/uapi/linux/p4tc.h b/include/uapi/linux/p4tc.h index 8934c032d..72714df9e 100644 --- a/include/uapi/linux/p4tc.h +++ b/include/uapi/linux/p4tc.h @@ -27,6 +27,8 @@ struct p4tcmsg { #define TEMPLATENAMSZ 256 #define PIPELINENAMSIZ TEMPLATENAMSZ #define METANAMSIZ TEMPLATENAMSZ +#define PARSERNAMSIZ TEMPLATENAMSZ +#define HDRFIELDNAMSIZ TEMPLATENAMSZ /* 
Root attributes */ enum { @@ -55,6 +57,7 @@ enum { P4TC_OBJ_UNSPEC, P4TC_OBJ_PIPELINE, P4TC_OBJ_META, + P4TC_OBJ_HDR_FIELD, __P4TC_OBJ_MAX, }; #define P4TC_OBJ_MAX __P4TC_OBJ_MAX @@ -153,6 +156,22 @@ enum { }; #define P4TC_KERNEL_META_MAX (__P4TC_KERNEL_META_MAX - 1) +struct p4tc_hdrfield_ty { + __u16 startbit; + __u16 endbit; + __u8 datatype; /* P4T_* */ +}; + +/* Header field attributes */ +enum { + P4TC_HDRFIELD_UNSPEC, + P4TC_HDRFIELD_DATA, + P4TC_HDRFIELD_NAME, + P4TC_HDRFIELD_PARSER_NAME, + __P4TC_HDRFIELD_MAX +}; +#define P4TC_HDRFIELD_MAX (__P4TC_HDRFIELD_MAX - 1) + #define P4TC_RTA(r) \ ((struct rtattr *)(((char *)(r)) + NLMSG_ALIGN(sizeof(struct p4tcmsg)))) diff --git a/net/sched/p4tc/Makefile b/net/sched/p4tc/Makefile index d523e668c..add22c909 100644 --- a/net/sched/p4tc/Makefile +++ b/net/sched/p4tc/Makefile @@ -1,3 +1,4 @@ # SPDX-License-Identifier: GPL-2.0 -obj-y := p4tc_types.o p4tc_tmpl_api.o p4tc_pipeline.o p4tc_meta.o +obj-y := p4tc_types.o p4tc_pipeline.o p4tc_tmpl_api.o p4tc_meta.o \ + p4tc_parser_api.o p4tc_hdrfield.o diff --git a/net/sched/p4tc/p4tc_hdrfield.c b/net/sched/p4tc/p4tc_hdrfield.c new file mode 100644 index 000000000..2cbb0a624 --- /dev/null +++ b/net/sched/p4tc/p4tc_hdrfield.c @@ -0,0 +1,625 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * net/sched/p4tc_hdrfield.c P4 TC HEADER FIELD + * + * Copyright (c) 2022, Mojatatu Networks + * Copyright (c) 2022, Intel Corporation. 
+ * Authors: Jamal Hadi Salim + * Victor Nogueira + * Pedro Tammela + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static const struct nla_policy tc_hdrfield_policy[P4TC_HDRFIELD_MAX + 1] = { + [P4TC_HDRFIELD_DATA] = { .type = NLA_BINARY, + .len = sizeof(struct p4tc_hdrfield_ty) }, + [P4TC_HDRFIELD_NAME] = { .type = NLA_STRING, .len = HDRFIELDNAMSIZ }, + [P4TC_HDRFIELD_PARSER_NAME] = { .type = NLA_STRING, + .len = PARSERNAMSIZ }, +}; + +static int _tcf_hdrfield_put(struct p4tc_pipeline *pipeline, + struct p4tc_parser *parser, + struct p4tc_hdrfield *hdrfield, + bool unconditional_purge, + struct netlink_ext_ack *extack) +{ + if (!refcount_dec_if_one(&hdrfield->hdrfield_ref) && + !unconditional_purge) { + NL_SET_ERR_MSG(extack, + "Unable to delete referenced header field"); + return -EBUSY; + } + idr_remove(&parser->hdr_fields_idr, hdrfield->hdrfield_id); + + WARN_ON(!refcount_dec_not_one(&parser->parser_ref)); + kfree(hdrfield); + + return 0; +} + +static int tcf_hdrfield_put(struct net *net, struct p4tc_template_common *tmpl, + bool unconditional_purge, + struct netlink_ext_ack *extack) +{ + struct p4tc_hdrfield *hdrfield; + struct p4tc_pipeline *pipeline; + struct p4tc_parser *parser; + + pipeline = tcf_pipeline_find_byid(net, tmpl->p_id); + + hdrfield = to_hdrfield(tmpl); + parser = pipeline->parser; + + return _tcf_hdrfield_put(pipeline, parser, hdrfield, + unconditional_purge, extack); +} + +static struct p4tc_hdrfield *hdrfield_find_name(struct p4tc_parser *parser, + const char *hdrfield_name) +{ + struct p4tc_hdrfield *hdrfield; + unsigned long tmp, id; + + idr_for_each_entry_ul(&parser->hdr_fields_idr, hdrfield, tmp, id) + if (hdrfield->common.name[0] && + strncmp(hdrfield->common.name, hdrfield_name, + HDRFIELDNAMSIZ) == 0) + return hdrfield; + + return NULL; +} + +struct p4tc_hdrfield *tcf_hdrfield_find_byid(struct p4tc_parser *parser, + const u32 
hdrfield_id) +{ + return idr_find(&parser->hdr_fields_idr, hdrfield_id); +} + +struct p4tc_hdrfield *tcf_hdrfield_find_byany(struct p4tc_parser *parser, + const char *hdrfield_name, + u32 hdrfield_id, + struct netlink_ext_ack *extack) +{ + struct p4tc_hdrfield *hdrfield; + int err; + + if (hdrfield_id) { + hdrfield = tcf_hdrfield_find_byid(parser, hdrfield_id); + if (!hdrfield) { + NL_SET_ERR_MSG(extack, "Unable to find hdrfield by id"); + err = -EINVAL; + goto out; + } + } else { + if (hdrfield_name) { + hdrfield = hdrfield_find_name(parser, hdrfield_name); + if (!hdrfield) { + NL_SET_ERR_MSG(extack, + "Header field name not found"); + err = -EINVAL; + goto out; + } + } else { + NL_SET_ERR_MSG(extack, + "Must specify hdrfield name or id"); + err = -EINVAL; + goto out; + } + } + + return hdrfield; + +out: + return ERR_PTR(err); +} + +struct p4tc_hdrfield *tcf_hdrfield_get(struct p4tc_parser *parser, + const char *hdrfield_name, + u32 hdrfield_id, + struct netlink_ext_ack *extack) +{ + struct p4tc_hdrfield *hdrfield; + + hdrfield = tcf_hdrfield_find_byany(parser, hdrfield_name, hdrfield_id, + extack); + if (IS_ERR(hdrfield)) + return hdrfield; + + /* Should never happen */ + WARN_ON(!refcount_inc_not_zero(&hdrfield->hdrfield_ref)); + + return hdrfield; +} + +void tcf_hdrfield_put_ref(struct p4tc_hdrfield *hdrfield) +{ + WARN_ON(!refcount_dec_not_one(&hdrfield->hdrfield_ref)); +} + +static struct p4tc_hdrfield * +tcf_hdrfield_find_byanyattr(struct p4tc_parser *parser, + struct nlattr *name_attr, u32 hdrfield_id, + struct netlink_ext_ack *extack) +{ + char *hdrfield_name = NULL; + + if (name_attr) + hdrfield_name = nla_data(name_attr); + + return tcf_hdrfield_find_byany(parser, hdrfield_name, hdrfield_id, + extack); +} + +void *tcf_hdrfield_fetch(struct sk_buff *skb, struct p4tc_hdrfield *hdrfield) +{ + size_t hdr_offset_len = sizeof(u16); + u16 *hdr_offset_bits, hdr_offset; + struct p4tc_skb_ext *p4tc_skb_ext; + u16 hdr_offset_index; + + p4tc_skb_ext = 
skb_ext_find(skb, P4TC_SKB_EXT); + if (!p4tc_skb_ext) { + pr_err("Unable to find P4TC_SKB_EXT\n"); + return NULL; + } + + hdr_offset_index = (hdrfield->hdrfield_id - 1) * hdr_offset_len; + if (hdrfield->flags & P4TC_HDRFIELD_IS_VALIDITY_BIT) + return &p4tc_skb_ext->p4tc_ext->hdrs[hdr_offset_index]; + + hdr_offset_bits = + (u16 *)&p4tc_skb_ext->p4tc_ext->hdrs[hdr_offset_index]; + hdr_offset = BITS_TO_BYTES(*hdr_offset_bits); + + return skb_mac_header(skb) + hdr_offset; +} + +static struct p4tc_hdrfield *tcf_hdrfield_create(struct nlmsghdr *n, + struct nlattr *nla, + struct p4tc_pipeline *pipeline, + u32 *ids, + struct netlink_ext_ack *extack) +{ + u32 parser_id = ids[P4TC_PARSEID_IDX]; + char *hdrfield_name = NULL; + const char *parser_name = NULL; + u32 hdrfield_id = 0; + struct nlattr *tb[P4TC_HDRFIELD_MAX + 1]; + struct p4tc_hdrfield_ty *hdr_arg; + struct p4tc_hdrfield *hdrfield; + struct p4tc_parser *parser; + char *s; + int ret; + + ret = nla_parse_nested(tb, P4TC_HDRFIELD_MAX, nla, tc_hdrfield_policy, + extack); + if (ret < 0) + return ERR_PTR(ret); + + hdrfield_id = ids[P4TC_HDRFIELDID_IDX]; + if (!hdrfield_id) { + NL_SET_ERR_MSG(extack, "Must specify header field id"); + return ERR_PTR(-EINVAL); + } + + if (!tb[P4TC_HDRFIELD_DATA]) { + NL_SET_ERR_MSG(extack, "Must supply header field data"); + return ERR_PTR(-EINVAL); + } + hdr_arg = nla_data(tb[P4TC_HDRFIELD_DATA]); + + if (tb[P4TC_HDRFIELD_PARSER_NAME]) + parser_name = nla_data(tb[P4TC_HDRFIELD_PARSER_NAME]); + + rcu_read_lock(); + parser = tcf_parser_find_byany(pipeline, parser_name, parser_id, NULL); + if (IS_ERR(parser)) { + rcu_read_unlock(); + if (!parser_name) { + NL_SET_ERR_MSG(extack, "Must supply parser name"); + return ERR_PTR(-EINVAL); + } + + /* If the parser instance wasn't created, let's create it here */ + parser = tcf_parser_create(pipeline, parser_name, parser_id, + extack); + + if (IS_ERR(parser)) + return (void *)parser; + rcu_read_lock(); + } + + if 
(!refcount_inc_not_zero(&parser->parser_ref)) { + NL_SET_ERR_MSG(extack, "Parser is stale"); + rcu_read_unlock(); + return ERR_PTR(-EBUSY); + } + rcu_read_unlock(); + + if (tb[P4TC_HDRFIELD_NAME]) + hdrfield_name = nla_data(tb[P4TC_HDRFIELD_NAME]); + + if ((hdrfield_name && hdrfield_find_name(parser, hdrfield_name)) || + tcf_hdrfield_find_byid(parser, hdrfield_id)) { + NL_SET_ERR_MSG(extack, + "Header field with same id or name was already inserted"); + ret = -EEXIST; + goto refcount_dec_parser; + } + + if (hdr_arg->startbit > hdr_arg->endbit) { + NL_SET_ERR_MSG(extack, "Header field startbit > endbit"); + ret = -EINVAL; + goto refcount_dec_parser; + } + + hdrfield = kzalloc(sizeof(*hdrfield), GFP_KERNEL); + if (!hdrfield) { + NL_SET_ERR_MSG(extack, "Failed to allocate hdrfield"); + ret = -ENOMEM; + goto refcount_dec_parser; + } + + hdrfield->hdrfield_id = hdrfield_id; + + s = strnchr(hdrfield_name, HDRFIELDNAMSIZ, '/'); + if (s++ && strncmp(s, "isValid", HDRFIELDNAMSIZ) == 0) { + if (hdr_arg->datatype != P4T_U8 || hdr_arg->startbit != 0 || + hdr_arg->endbit != 0) { + NL_SET_ERR_MSG(extack, + "isValid data type must be bit1"); + ret = -EINVAL; + goto free_hdr; + } + hdrfield->datatype = hdr_arg->datatype; + hdrfield->flags = P4TC_HDRFIELD_IS_VALIDITY_BIT; + } else { + if (!p4type_find_byid(hdr_arg->datatype)) { + NL_SET_ERR_MSG(extack, "Invalid hdrfield data type"); + ret = -EINVAL; + goto free_hdr; + } + hdrfield->datatype = hdr_arg->datatype; + } + + hdrfield->startbit = hdr_arg->startbit; + hdrfield->endbit = hdr_arg->endbit; + hdrfield->parser_inst_id = parser->parser_inst_id; + + ret = tcf_parser_check_hdrfields(parser, hdrfield); + if (ret < 0) + goto free_hdr; + + ret = idr_alloc_u32(&parser->hdr_fields_idr, hdrfield, &hdrfield_id, + hdrfield_id, GFP_KERNEL); + if (ret < 0) { + NL_SET_ERR_MSG(extack, "Unable to allocate ID for hdrfield"); + goto free_hdr; + } + + hdrfield->common.p_id = pipeline->common.p_id; + hdrfield->common.ops = (struct 
p4tc_template_ops *)&p4tc_hdrfield_ops; + hdrfield->parser = parser; + refcount_set(&hdrfield->hdrfield_ref, 1); + + if (hdrfield_name) + strscpy(hdrfield->common.name, hdrfield_name, HDRFIELDNAMSIZ); + + return hdrfield; + +free_hdr: + kfree(hdrfield); + +refcount_dec_parser: + WARN_ON(!refcount_dec_not_one(&parser->parser_ref)); + return ERR_PTR(ret); +} + +static struct p4tc_template_common * +tcf_hdrfield_cu(struct net *net, struct nlmsghdr *n, struct nlattr *nla, + struct p4tc_nl_pname *nl_pname, u32 *ids, + struct netlink_ext_ack *extack) +{ + u32 pipeid = ids[P4TC_PID_IDX]; + struct p4tc_hdrfield *hdrfield; + struct p4tc_pipeline *pipeline; + + if (n->nlmsg_flags & NLM_F_REPLACE) { + NL_SET_ERR_MSG(extack, "Header field update not supported"); + return ERR_PTR(-EOPNOTSUPP); + } + + pipeline = tcf_pipeline_find_byany_unsealed(net, nl_pname->data, pipeid, + extack); + if (IS_ERR(pipeline)) + return (void *)pipeline; + + hdrfield = tcf_hdrfield_create(n, nla, pipeline, ids, extack); + if (IS_ERR(hdrfield)) + goto out; + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + +out: + return (struct p4tc_template_common *)hdrfield; +} + +static int _tcf_hdrfield_fill_nlmsg(struct sk_buff *skb, + struct p4tc_hdrfield *hdrfield) +{ + unsigned char *b = nlmsg_get_pos(skb); + struct p4tc_hdrfield_ty hdr_arg; + struct nlattr *nest; + /* Parser instance id + header field id */ + u32 ids[2]; + + ids[0] = hdrfield->parser_inst_id; + ids[1] = hdrfield->hdrfield_id; + + if (nla_put(skb, P4TC_PATH, sizeof(ids), ids)) + goto out_nlmsg_trim; + + nest = nla_nest_start(skb, P4TC_PARAMS); + if (!nest) + goto out_nlmsg_trim; + + hdr_arg.datatype = hdrfield->datatype; + hdr_arg.startbit = hdrfield->startbit; + hdr_arg.endbit = hdrfield->endbit; + + if (hdrfield->common.name[0]) { + if (nla_put_string(skb, P4TC_HDRFIELD_NAME, + hdrfield->common.name)) + goto out_nlmsg_trim; 
+ } + + if (nla_put(skb, P4TC_HDRFIELD_DATA, sizeof(hdr_arg), &hdr_arg)) + goto out_nlmsg_trim; + + nla_nest_end(skb, nest); + + return skb->len; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return -1; +} + +static int tcf_hdrfield_fill_nlmsg(struct net *net, struct sk_buff *skb, + struct p4tc_template_common *template, + struct netlink_ext_ack *extack) +{ + struct p4tc_hdrfield *hdrfield = to_hdrfield(template); + + if (_tcf_hdrfield_fill_nlmsg(skb, hdrfield) <= 0) { + NL_SET_ERR_MSG(extack, + "Failed to fill notification attributes for pipeline"); + return -EINVAL; + } + + return 0; +} + +static int tcf_hdrfield_flush(struct sk_buff *skb, + struct p4tc_pipeline *pipeline, + struct p4tc_parser *parser, + struct netlink_ext_ack *extack) +{ + unsigned char *b = nlmsg_get_pos(skb); + int ret = 0; + int i = 0; + struct p4tc_hdrfield *hdrfield; + u32 path[2]; + unsigned long tmp, hdrfield_id; + + path[0] = parser->parser_inst_id; + path[1] = 0; + + if (nla_put(skb, P4TC_PATH, sizeof(path), path)) + goto out_nlmsg_trim; + + if (idr_is_empty(&parser->hdr_fields_idr)) { + NL_SET_ERR_MSG(extack, "There are no header fields to flush"); + goto out_nlmsg_trim; + } + + idr_for_each_entry_ul(&parser->hdr_fields_idr, hdrfield, tmp, hdrfield_id) { + if (_tcf_hdrfield_put(pipeline, parser, hdrfield, false, extack) < 0) { + ret = -EBUSY; + continue; + } + i++; + } + + nla_put_u32(skb, P4TC_COUNT, i); + + if (ret < 0) { + if (i == 0) { + NL_SET_ERR_MSG(extack, + "Unable to flush any table instance"); + goto out_nlmsg_trim; + } else { + NL_SET_ERR_MSG(extack, + "Unable to flush all table instances"); + } + } + + return i; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return 0; +} + +static int tcf_hdrfield_gd(struct net *net, struct sk_buff *skb, + struct nlmsghdr *n, struct nlattr *nla, + struct p4tc_nl_pname *nl_pname, u32 *ids, + struct netlink_ext_ack *extack) +{ + unsigned char *b = nlmsg_get_pos(skb); + u32 pipeid = ids[P4TC_PID_IDX]; + u32 parser_inst_id = ids[P4TC_PARSEID_IDX]; + 
u32 hdrfield_id = ids[P4TC_HDRFIELDID_IDX]; + struct nlattr *tb[P4TC_HDRFIELD_MAX + 1]; + struct p4tc_hdrfield *hdrfield; + struct p4tc_pipeline *pipeline; + struct p4tc_parser *parser; + char *parser_name; + int ret; + + pipeline = tcf_pipeline_find_byany(net, nl_pname->data, pipeid, extack); + if (IS_ERR(pipeline)) + return PTR_ERR(pipeline); + + ret = nla_parse_nested(tb, P4TC_HDRFIELD_MAX, nla, tc_hdrfield_policy, + extack); + if (ret < 0) + return ret; + + parser_name = tb[P4TC_HDRFIELD_PARSER_NAME] ? + nla_data(tb[P4TC_HDRFIELD_PARSER_NAME]) : NULL; + + parser = tcf_parser_find_byany(pipeline, parser_name, parser_inst_id, + extack); + if (IS_ERR(parser)) + return PTR_ERR(parser); + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + if (n->nlmsg_type == RTM_DELP4TEMPLATE && n->nlmsg_flags & NLM_F_ROOT) + return tcf_hdrfield_flush(skb, pipeline, parser, extack); + + hdrfield = tcf_hdrfield_find_byanyattr(parser, tb[P4TC_HDRFIELD_NAME], + hdrfield_id, extack); + if (IS_ERR(hdrfield)) + return PTR_ERR(hdrfield); + + ret = _tcf_hdrfield_fill_nlmsg(skb, hdrfield); + if (ret < 0) + return -ENOMEM; + + if (n->nlmsg_type == RTM_DELP4TEMPLATE) { + ret = _tcf_hdrfield_put(pipeline, parser, hdrfield, false, + extack); + if (ret < 0) + goto out_nlmsg_trim; + } + + return 0; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return ret; +} + +static int tcf_hdrfield_dump_1(struct sk_buff *skb, + struct p4tc_template_common *common) +{ + struct p4tc_hdrfield *hdrfield = to_hdrfield(common); + struct nlattr *param = nla_nest_start(skb, P4TC_PARAMS); + unsigned char *b = nlmsg_get_pos(skb); + u32 path[2]; + + if (!param) + goto out_nlmsg_trim; + + if (hdrfield->common.name[0] && + nla_put_string(skb, P4TC_HDRFIELD_NAME, hdrfield->common.name)) + goto out_nlmsg_trim; + + nla_nest_end(skb, param); + + path[0] = hdrfield->parser_inst_id; + path[1] = hdrfield->hdrfield_id; 
+ if (nla_put(skb, P4TC_PATH, sizeof(path), path)) + goto out_nlmsg_trim; + + return 0; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return -ENOMEM; +} + +static int tcf_hdrfield_dump(struct sk_buff *skb, struct p4tc_dump_ctx *ctx, + struct nlattr *nla, char **p_name, u32 *ids, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[P4TC_HDRFIELD_MAX + 1] = { NULL }; + const u32 pipeid = ids[P4TC_PID_IDX]; + struct net *net = sock_net(skb->sk); + struct p4tc_pipeline *pipeline; + struct p4tc_parser *parser; + int ret; + + if (!ctx->ids[P4TC_PID_IDX]) { + pipeline = + tcf_pipeline_find_byany(net, *p_name, pipeid, extack); + if (IS_ERR(pipeline)) + return PTR_ERR(pipeline); + ctx->ids[P4TC_PID_IDX] = pipeline->common.p_id; + } else { + pipeline = tcf_pipeline_find_byid(net, ctx->ids[P4TC_PID_IDX]); + } + + if (!ctx->ids[P4TC_PARSEID_IDX]) { + if (nla) { + ret = nla_parse_nested(tb, P4TC_HDRFIELD_MAX, nla, + tc_hdrfield_policy, extack); + if (ret < 0) + return ret; + } + + parser = tcf_parser_find_byany(pipeline, + nla_data(tb[P4TC_HDRFIELD_PARSER_NAME]), + ids[P4TC_PARSEID_IDX], extack); + if (IS_ERR(parser)) + return PTR_ERR(parser); + + ctx->ids[P4TC_PARSEID_IDX] = parser->parser_inst_id; + } else { + parser = pipeline->parser; + } + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (!(*p_name)) + *p_name = pipeline->common.name; + + return tcf_p4_tmpl_generic_dump(skb, ctx, &parser->hdr_fields_idr, + P4TC_HDRFIELDID_IDX, extack); +} + +const struct p4tc_template_ops p4tc_hdrfield_ops = { + .init = NULL, + .cu = tcf_hdrfield_cu, + .fill_nlmsg = tcf_hdrfield_fill_nlmsg, + .gd = tcf_hdrfield_gd, + .put = tcf_hdrfield_put, + .dump = tcf_hdrfield_dump, + .dump_1 = tcf_hdrfield_dump_1, +}; diff --git a/net/sched/p4tc/p4tc_parser_api.c b/net/sched/p4tc/p4tc_parser_api.c new file mode 100644 index 000000000..267a58aeb --- /dev/null +++ b/net/sched/p4tc/p4tc_parser_api.c @@ -0,0 +1,229 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * 
net/sched/p4tc_parser_api.c P4 TC PARSER API + * + * Copyright (c) 2022, Mojatatu Networks + * Copyright (c) 2022, Intel Corporation. + * Authors: Jamal Hadi Salim + * Victor Nogueira + * Pedro Tammela + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static struct p4tc_parser *parser_find_name(struct p4tc_pipeline *pipeline, + const char *parser_name) +{ + if (unlikely(!pipeline->parser)) + return NULL; + + if (!strncmp(pipeline->parser->parser_name, parser_name, PARSERNAMSIZ)) + return pipeline->parser; + + return NULL; +} + +struct p4tc_parser *tcf_parser_find_byid(struct p4tc_pipeline *pipeline, + const u32 parser_inst_id) +{ + if (unlikely(!pipeline->parser)) + return NULL; + + if (parser_inst_id == pipeline->parser->parser_inst_id) + return pipeline->parser; + + return NULL; +} + +static struct p4tc_parser *__parser_find(struct p4tc_pipeline *pipeline, + const char *parser_name, + u32 parser_inst_id, + struct netlink_ext_ack *extack) +{ + struct p4tc_parser *parser; + int err; + + if (parser_inst_id) { + parser = tcf_parser_find_byid(pipeline, parser_inst_id); + if (!parser) { + if (extack) + NL_SET_ERR_MSG(extack, + "Unable to find parser by id"); + err = -EINVAL; + goto out; + } + } else { + if (parser_name) { + parser = parser_find_name(pipeline, parser_name); + if (!parser) { + if (extack) + NL_SET_ERR_MSG(extack, + "Parser name not found"); + err = -EINVAL; + goto out; + } + } else { + if (extack) + NL_SET_ERR_MSG(extack, + "Must specify parser name or id"); + err = -EINVAL; + goto out; + } + } + + return parser; + +out: + return ERR_PTR(err); +} + +struct p4tc_parser *tcf_parser_find_byany(struct p4tc_pipeline *pipeline, + const char *parser_name, + u32 parser_inst_id, + struct netlink_ext_ack *extack) +{ + return __parser_find(pipeline, parser_name, parser_inst_id, extack); +} + +#ifdef CONFIG_KPARSER +int tcf_skb_parse(struct sk_buff *skb, struct p4tc_skb_ext 
*p4tc_skb_ext, + struct p4tc_parser *parser) +{ + void *hdr = skb_mac_header(skb); + size_t pktlen = skb_mac_header_len(skb) + skb->len; + + return __kparser_parse(parser->kparser, hdr, pktlen, + p4tc_skb_ext->p4tc_ext->hdrs, HEADER_MAX_LEN); +} + +static int __tcf_parser_fill(struct p4tc_parser *parser, + struct netlink_ext_ack *extack) +{ + struct kparser_hkey kparser_key = { 0 }; + + kparser_key.id = parser->parser_inst_id; + strscpy(kparser_key.name, parser->parser_name, KPARSER_MAX_NAME); + + parser->kparser = kparser_get_parser(&kparser_key, false); + if (!parser->kparser) { + NL_SET_ERR_MSG(extack, "Unable to get kparser instance"); + return -ENOENT; + } + + return 0; +} + +void __tcf_parser_put(struct p4tc_parser *parser) +{ + kparser_put_parser(parser->kparser, false); +} + +bool tcf_parser_is_callable(struct p4tc_parser *parser) +{ + return parser && parser->kparser; +} +#else +int tcf_skb_parse(struct sk_buff *skb, struct p4tc_skb_ext *p4tc_skb_ext, + struct p4tc_parser *parser) +{ + return 0; +} + +static int __tcf_parser_fill(struct p4tc_parser *parser, + struct netlink_ext_ack *extack) +{ + return 0; +} + +void __tcf_parser_put(struct p4tc_parser *parser) +{ +} + +bool tcf_parser_is_callable(struct p4tc_parser *parser) +{ + return !!parser; +} +#endif + +struct p4tc_parser * +tcf_parser_create(struct p4tc_pipeline *pipeline, const char *parser_name, + u32 parser_inst_id, struct netlink_ext_ack *extack) +{ + struct p4tc_parser *parser; + int ret; + + if (pipeline->parser) { + NL_SET_ERR_MSG(extack, + "Can only have one parser instance per pipeline"); + return ERR_PTR(-EEXIST); + } + + parser = kzalloc(sizeof(*parser), GFP_KERNEL); + if (!parser) + return ERR_PTR(-ENOMEM); + + if (parser_inst_id) + parser->parser_inst_id = parser_inst_id; + else + /* Assign to KPARSER_KMOD_ID_MAX + 1 if no ID was supplied */ + parser->parser_inst_id = KPARSER_KMOD_ID_MAX + 1; + + strscpy(parser->parser_name, parser_name, PARSERNAMSIZ); + + ret = 
__tcf_parser_fill(parser, extack); + if (ret < 0) + goto err; + + refcount_set(&parser->parser_ref, 1); + + idr_init(&parser->hdr_fields_idr); + + pipeline->parser = parser; + + return parser; + +err: + kfree(parser); + return ERR_PTR(ret); +} + +/* Dummy function which just returns true + * Once we have the proper parser code, this function will work properly + */ +bool tcf_parser_check_hdrfields(struct p4tc_parser *parser, + struct p4tc_hdrfield *hdrfield) +{ + return true; +} + +int tcf_parser_del(struct net *net, struct p4tc_pipeline *pipeline, + struct p4tc_parser *parser, struct netlink_ext_ack *extack) +{ + struct p4tc_hdrfield *hdrfield; + unsigned long hdr_field_id, tmp; + + __tcf_parser_put(parser); + + idr_for_each_entry_ul(&parser->hdr_fields_idr, hdrfield, tmp, hdr_field_id) + hdrfield->common.ops->put(net, &hdrfield->common, true, extack); + + idr_destroy(&parser->hdr_fields_idr); + + pipeline->parser = NULL; + + kfree(parser); + + return 0; +} diff --git a/net/sched/p4tc/p4tc_pipeline.c b/net/sched/p4tc/p4tc_pipeline.c index 49f0062ad..6fc7bd49d 100644 --- a/net/sched/p4tc/p4tc_pipeline.c +++ b/net/sched/p4tc/p4tc_pipeline.c @@ -115,6 +115,8 @@ static int tcf_pipeline_put(struct net *net, } idr_remove(&pipe_net->pipeline_idr, pipeline->common.p_id); + if (pipeline->parser) + tcf_parser_del(net, pipeline, pipeline->parser, extack); idr_for_each_entry_ul(&pipeline->p_meta_idr, meta, tmp, m_id) meta->common.ops->put(net, &meta->common, true, extack); @@ -319,6 +321,8 @@ static struct p4tc_pipeline *tcf_pipeline_create(struct net *net, pipeline->num_postacts = 0; } + pipeline->parser = NULL; + idr_init(&pipeline->p_meta_idr); pipeline->p_meta_offset = 0; diff --git a/net/sched/p4tc/p4tc_tmpl_api.c b/net/sched/p4tc/p4tc_tmpl_api.c index a13d02ce5..325b56d2e 100644 --- a/net/sched/p4tc/p4tc_tmpl_api.c +++ b/net/sched/p4tc/p4tc_tmpl_api.c @@ -43,6 +43,7 @@ static bool obj_is_valid(u32 obj) switch (obj) { case P4TC_OBJ_PIPELINE: case P4TC_OBJ_META: + case 
P4TC_OBJ_HDR_FIELD: return true; default: return false; @@ -52,6 +53,7 @@ static bool obj_is_valid(u32 obj) static const struct p4tc_template_ops *p4tc_ops[P4TC_OBJ_MAX] = { [P4TC_OBJ_PIPELINE] = &p4tc_pipeline_ops, [P4TC_OBJ_META] = &p4tc_meta_ops, + [P4TC_OBJ_HDR_FIELD] = &p4tc_hdrfield_ops, }; int tcf_p4_tmpl_generic_dump(struct sk_buff *skb, struct p4tc_dump_ctx *ctx,

From patchwork Tue Jan 24 17:05:05 2023
X-Patchwork-Submitter: Jamal Hadi Salim
X-Patchwork-Id: 13114394
X-Patchwork-Delegate: kuba@kernel.org
X-Patchwork-State: RFC
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Cc: kernel@mojatatu.com, deb.chatterjee@intel.com, anjali.singhai@intel.com, namrata.limaye@intel.com, khalidm@nvidia.com, tom@sipanda.io, pratyush@sipanda.io, jiri@resnulli.us, xiyou.wangcong@gmail.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, vladbu@nvidia.com, simon.horman@corigine.com
Subject: [PATCH net-next RFC 15/20] p4tc: add action template create, update, delete, get, flush and dump
Date: Tue, 24 Jan 2023 12:05:05 -0500
Message-Id: <20230124170510.316970-15-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>
References: <20230124170510.316970-1-jhs@mojatatu.com>

This commit allows users to create, update, delete, get, flush and dump
dynamic actions based on a P4 action definition. At the moment, dynamic
actions are tied to P4 programs only and cannot be used outside of a P4
program definition.

Visualize the following action in a P4 program:

action ipv4_forward(bit<48> dstAddr, bit<8> port) {
    standard_metadata.egress_spec = port;
    hdr.ethernet.srcAddr = hdr.ethernet.dstAddr;
    hdr.ethernet.dstAddr = dstAddr;
    hdr.ipv4.ttl = hdr.ipv4.ttl - 1;
}

which is invoked on a P4 table match as such:

table mytable {
    key = {
        hdr.ipv4.dstAddr: lpm;
    }
    actions = {
        ipv4_forward;
        drop;
        NoAction;
    }
    size = 1024;
}

We don't have an equivalent built-in "ipv4_forward" action in TC, so we
create this action dynamically. The mechanics of dynamic actions follow
the CRUD semantics.
___DYNAMIC CREATION___

In this stage we issue the creation command for the dynamic action,
which specifies the action name, its ID, its parameters and the
parameter types. For the ipv4_forward action, the creation would look
something like this:

tc p4template create action/aP4proggie/ipv4_forward \
  param dstAddr type macaddr id 1 param port type dev id 2

Note1: Although the P4 program defined dstAddr as type bit<48>, we use
our own type called macaddr (likewise dev for port) - see the commit on
P4 types for details.

Note that in the template creation op we usually just specify the
action name, the parameters and their respective types. Also see that
we specify a pipeline name during the template creation command. As an
example, the above command creates an action template that is bound to
the pipeline/program named aP4proggie.

Below is an example of how one would specify an ID for the action
template created in the above command. When the create command doesn't
specify the action template ID, the kernel assigns a new one for us. If
the action template ID specified in the command is already in use, the
kernel will reject the command.

tc p4template create action/aP4proggie/ipv4_forward actid 1 \
  param dstAddr type macaddr id 1 param port type dev id 2

Typically the compiler (for example P4C) will always define the actid.
Per the P4 specification, actions might be contained in a control
block.

___OPS_DESCRIPTION___

In the next stage (ops description), we specify which operations this
action uses.
As an example, to specify the operations for the ipv4_forward action,
we update the created action and issue the following command:

tc p4template update action/aP4proggie/ipv4_forward \
    cmd set metadata.aP4proggie.temp hdrfield.aP4proggie.parser1.ethernet.dstAddr \
    cmd set hdrfield.aP4proggie.parser1.ethernet.dstAddr hdrfield.aP4proggie.parser1.ethernet.srcAddr \
    cmd set hdrfield.aP4proggie.parser1.ethernet.srcAddr metadata.aP4proggie.temp \
    cmd set metadata.calc.egress_spec param.port \
    cmd decr hdrfield.aP4proggie.parser1.ipv4.ttl

As you can see, we refer to the argument values of the ipv4_forward
action using the "param" prefix. For example, when referring to the
argument port of ipv4_forward, we use "param.port".

Of course, the two steps could be combined when creating the action:

tc p4template create action/aP4proggie/ipv4_forward actid 1 \
    param dstAddr type macaddr id 1 param port type dev id 2 \
    cmd set metadata.aP4proggie.temp hdrfield.aP4proggie.parser1.ethernet.dstAddr \
    cmd set hdrfield.aP4proggie.parser1.ethernet.dstAddr hdrfield.aP4proggie.parser1.ethernet.srcAddr \
    cmd set hdrfield.aP4proggie.parser1.ethernet.srcAddr metadata.aP4proggie.temp \
    cmd set metadata.calc.egress_spec param.port \
    cmd decr hdrfield.aP4proggie.parser1.ipv4.ttl

___ACTION_ACTIVATION___

Once we have provided all the necessary information for the new
dynamic action, we can go to the final stage: action activation. In
this stage, we activate the dynamic action and make it available for
instantiation. To activate the action template, we issue the following
command:

tc p4template update action/aP4proggie/ipv4_forward state active

After the above command, the action is ready to be instantiated.

___RUNTIME___

This next section deals with the runtime part of action templates,
which handles action template instantiation and binding.
To instantiate a new action from a template, we use the following
command:

tc actions add action aP4proggie/ipv4_forward \
    param dstAddr AA:BB:CC:DD:EE:FF param port eth0 index 1

Observe that these are the same semantics tc already provides today,
with the caveat that we have a keyword "param" preceding the
appropriate parameters - as such, specifying the index is optional
(the kernel provides one when unspecified). As previously stated, we
refer to the action by its "full name" (pipeline_name/action_name).
Here we are creating an instance of the ipv4_forward action,
specifying as parameter values AA:BB:CC:DD:EE:FF for dstAddr and eth0
for port. We can create as many instances of an action template as we
wish.

To bind the instantiated action to a table entry, we use the same
approach used to bind ordinary actions to a filter, for example:

tc p4runtime create aP4proggie/table/mycontrol/mytable srcAddr 10.10.10.0/24 \
    action ipv4_forward index 1

The above command will bind our newly instantiated action to a table
entry, which is executed if there's a match. Of course, one could have
created the table entry as:

tc p4runtime create aP4proggie/table/mycontrol/mytable srcAddr 10.10.10.0/24 \
    action ipv4_forward param dstAddr AA:BB:CC:DD:EE:FF param port eth0

Actions from other control blocks may be referenced, since the action
index is a global ID.

___OTHER_CONTROL_COMMANDS___

The lifetime of a dynamic action is tied to its pipeline. As with all
pipeline components, write operations on action templates, such as
create, update and delete, can only be executed if the pipeline is not
sealed. Read/get can be issued even after the pipeline is sealed.
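Since the template code wires up the standard action .lookup and
.walker ops (tcf_p4_dyna_lookup and tcf_p4_dyna_walker in the diff
below), the usual tc inspection commands should also apply to
instances of a dynamic action. A hypothetical sketch - the exact
invocation syntax for dynamic kinds is an assumption here, and these
require a p4tc-enabled kernel:

```
# Fetch one instance by index (backed by .lookup)
tc actions get action aP4proggie/ipv4_forward index 1

# List all instances of the dynamic kind, with stats (backed by .walker)
tc -s actions ls action aP4proggie/ipv4_forward
```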
If, after we are done with our action template, we want to delete it,
we issue the following command:

tc p4template del action/aP4proggie/ipv4_forward

Note that we could omit the action name and use the ID instead, which
would transform the above command into the following:

tc p4template del action/aP4proggie actid 1

If we had created more action templates and wanted to flush all of the
action templates from pipeline aP4proggie, one would use the following
command:

tc p4template del action/aP4proggie/

After creating or updating a dynamic action, if one wishes to verify
that it was created correctly, one would use the following command:

tc p4template get action/aP4proggie/ipv4_forward

As with the del operation, we can also specify the action id instead
of the action name:

tc p4template get action/aP4proggie actid 1

The above command will display the relevant data for the action, such
as parameter names, types, etc. To check which action templates are
associated with a specific pipeline, one could use the following
command:

tc p4template get action/aP4proggie/

Note that this command will only display the names of these action
templates. To verify their specific details, one should use the get
command on a specific action template, as previously described.
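Putting the stages together, the full lifecycle of the ipv4_forward
template can be recapped as below. This is only a summary of the
commands introduced above (the ops list in step 2 is abbreviated), not
new syntax, and it assumes a p4tc-enabled kernel:

```
# 1. Template creation: declare the action, its parameters and types
tc p4template create action/aP4proggie/ipv4_forward actid 1 \
    param dstAddr type macaddr id 1 param port type dev id 2

# 2. Ops description: attach the operations the action executes
tc p4template update action/aP4proggie/ipv4_forward \
    cmd set metadata.calc.egress_spec param.port \
    cmd decr hdrfield.aP4proggie.parser1.ipv4.ttl

# 3. Activation: make the template available for instantiation
tc p4template update action/aP4proggie/ipv4_forward state active

# 4. Runtime: instantiate and bind to a table entry
tc actions add action aP4proggie/ipv4_forward \
    param dstAddr AA:BB:CC:DD:EE:FF param port eth0 index 1
tc p4runtime create aP4proggie/table/mycontrol/mytable srcAddr 10.10.10.0/24 \
    action ipv4_forward index 1

# 5. Teardown: delete the template (pipeline must not be sealed)
tc p4template del action/aP4proggie/ipv4_forward
```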
Tested-by: "Khan, Mohd Arif"
Tested-by: "Pottimurthy, Sathya Narayana"
Co-developed-by: Victor Nogueira
Signed-off-by: Victor Nogueira
Co-developed-by: Pedro Tammela
Signed-off-by: Pedro Tammela
Signed-off-by: Jamal Hadi Salim
---
 include/net/act_api.h          |    1 +
 include/net/p4tc.h             |  190 ++++
 include/net/sch_generic.h      |    5 +
 include/net/tc_act/p4tc.h      |   25 +
 include/uapi/linux/p4tc.h      |   46 +
 net/sched/p4tc/Makefile        |    2 +-
 net/sched/p4tc/p4tc_action.c   | 1824 ++++++++++++++++++++++++++++++++
 net/sched/p4tc/p4tc_pipeline.c |  274 ++++-
 net/sched/p4tc/p4tc_tmpl_api.c |    2 +
 9 files changed, 2343 insertions(+), 26 deletions(-)
 create mode 100644 include/net/tc_act/p4tc.h
 create mode 100644 net/sched/p4tc/p4tc_action.c

diff --git a/include/net/act_api.h b/include/net/act_api.h
index fd012270d..e4a6d7da6 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -68,6 +68,7 @@ struct tc_action {
 #define TCA_ACT_FLAGS_REPLACE	(1U << (TCA_ACT_FLAGS_USER_BITS + 2))
 #define TCA_ACT_FLAGS_NO_RTNL	(1U << (TCA_ACT_FLAGS_USER_BITS + 3))
 #define TCA_ACT_FLAGS_AT_INGRESS	(1U << (TCA_ACT_FLAGS_USER_BITS + 4))
+#define TCA_ACT_FLAGS_FROM_P4TC	(1U << (TCA_ACT_FLAGS_USER_BITS + 5))
 
 /* Update lastuse only if needed, to avoid dirtying a cache line.
  * We use a temp variable to avoid fetching jiffies twice.
diff --git a/include/net/p4tc.h b/include/net/p4tc.h index 13cf4162e..09d4d85cf 100644 --- a/include/net/p4tc.h +++ b/include/net/p4tc.h @@ -9,6 +9,8 @@ #include #include #include +#include +#include #define P4TC_DEFAULT_NUM_TABLES P4TC_MINTABLES_COUNT #define P4TC_DEFAULT_MAX_RULES 1 @@ -19,6 +21,7 @@ #define P4TC_PID_IDX 0 #define P4TC_MID_IDX 1 +#define P4TC_AID_IDX 1 #define P4TC_PARSEID_IDX 1 #define P4TC_HDRFIELDID_IDX 2 @@ -26,6 +29,7 @@ struct p4tc_dump_ctx { u32 ids[P4TC_PATH_MAX]; + struct rhashtable_iter *iter; }; struct p4tc_template_common; @@ -82,9 +86,21 @@ struct p4tc_template_common { extern const struct p4tc_template_ops p4tc_pipeline_ops; +struct p4tc_act_dep_edge_node { + struct list_head head; + u32 act_id; +}; + +struct p4tc_act_dep_node { + struct list_head incoming_egde_list; + struct list_head head; + u32 act_id; +}; + struct p4tc_pipeline { struct p4tc_template_common common; struct idr p_meta_idr; + struct idr p_act_idr; struct rcu_head rcu; struct net *net; struct p4tc_parser *parser; @@ -92,13 +108,17 @@ struct p4tc_pipeline { int num_preacts; struct tc_action **postacts; int num_postacts; + struct list_head act_dep_graph; + struct list_head act_topological_order; u32 max_rules; u32 p_meta_offset; + u32 num_created_acts; refcount_t p_ref; refcount_t p_ctrl_ref; u16 num_tables; u16 curr_tables; u8 p_state; + refcount_t p_hdrs_used; }; struct p4tc_pipeline_net { @@ -139,6 +159,18 @@ static inline bool pipeline_sealed(struct p4tc_pipeline *pipeline) { return pipeline->p_state == P4TC_STATE_READY; } +void tcf_pipeline_add_dep_edge(struct p4tc_pipeline *pipeline, + struct p4tc_act_dep_edge_node *edge_node, + u32 vertex_id); +bool tcf_pipeline_check_act_backedge(struct p4tc_pipeline *pipeline, + struct p4tc_act_dep_edge_node *edge_node, + u32 vertex_id); +int determine_act_topological_order(struct p4tc_pipeline *pipeline, + bool copy_dep_graph); + +struct p4tc_act; +void tcf_pipeline_delete_from_dep_graph(struct p4tc_pipeline *pipeline, + 
struct p4tc_act *act); struct p4tc_metadata { struct p4tc_template_common common; @@ -155,6 +187,66 @@ struct p4tc_metadata { extern const struct p4tc_template_ops p4tc_meta_ops; +struct p4tc_ipv4_param_value { + u32 value; + u32 mask; +}; + +#define P4TC_ACT_PARAM_FLAGS_ISDYN BIT(0) + +struct p4tc_act_param { + char name[ACTPARAMNAMSIZ]; + struct list_head head; + struct rcu_head rcu; + void *value; + void *mask; + u32 type; + u32 id; + u8 flags; +}; + +struct p4tc_act_param_ops { + int (*init_value)(struct net *net, struct p4tc_act_param_ops *op, + struct p4tc_act_param *nparam, struct nlattr **tb, + struct netlink_ext_ack *extack); + int (*dump_value)(struct sk_buff *skb, struct p4tc_act_param_ops *op, + struct p4tc_act_param *param); + void (*free)(struct p4tc_act_param *param); + u32 len; + u32 alloc_len; +}; + +struct p4tc_label_key { + char *label; + u32 labelsz; +}; + +struct p4tc_label_node { + struct rhash_head ht_node; + struct p4tc_label_key key; + int cmd_offset; +}; + +struct p4tc_act { + struct p4tc_template_common common; + struct tc_action_ops ops; + struct rhashtable *labels; + struct list_head cmd_operations; + struct tc_action_net *tn; + struct p4tc_pipeline *pipeline; + struct idr params_idr; + struct tcf_exts exts; + struct list_head head; + u32 a_id; + bool active; + refcount_t a_ref; +}; + +extern const struct p4tc_template_ops p4tc_act_ops; +extern const struct rhashtable_params p4tc_label_ht_params; +extern const struct rhashtable_params acts_params; +void p4tc_label_ht_destroy(void *ptr, void *arg); + struct p4tc_parser { char parser_name[PARSERNAMSIZ]; struct idr hdr_fields_idr; @@ -187,6 +279,84 @@ struct p4tc_metadata *tcf_meta_get(struct p4tc_pipeline *pipeline, const char *mname, const u32 m_id, struct netlink_ext_ack *extack); void tcf_meta_put_ref(struct p4tc_metadata *meta); +void *tcf_meta_fetch(struct sk_buff *skb, struct p4tc_metadata *meta); + +static inline int p4tc_action_init(struct net *net, struct nlattr *nla, + struct 
tc_action *acts[], u32 pipeid, + u32 flags, struct netlink_ext_ack *extack) +{ + int init_res[TCA_ACT_MAX_PRIO]; + size_t attrs_size; + int ret; + int i; + + /* If action was already created, just bind to existing one*/ + flags |= TCA_ACT_FLAGS_BIND; + flags |= TCA_ACT_FLAGS_FROM_P4TC; + ret = tcf_action_init(net, NULL, nla, NULL, acts, init_res, &attrs_size, + flags, 0, extack); + + /* Check if we are trying to bind to dynamic action from different pipe */ + for (i = 0; i < TCA_ACT_MAX_PRIO && acts[i]; i++) { + struct tc_action *a = acts[i]; + struct tcf_p4act *p; + + if (a->ops->id < TCA_ID_DYN) + continue; + + p = to_p4act(a); + if (p->p_id != pipeid) { + NL_SET_ERR_MSG(extack, + "Unable to bind to dynact from different pipeline"); + ret = -EPERM; + goto destroy_acts; + } + } + + return ret; + +destroy_acts: + tcf_action_destroy(acts, TCA_ACT_FLAGS_BIND); + return ret; +} + +static inline struct p4tc_skb_ext *p4tc_skb_ext_alloc(struct sk_buff *skb) +{ + struct p4tc_skb_ext *p4tc_skb_ext = skb_ext_add(skb, P4TC_SKB_EXT); + + if (!p4tc_skb_ext) + return NULL; + + p4tc_skb_ext->p4tc_ext = + kzalloc(sizeof(struct __p4tc_skb_ext), GFP_ATOMIC); + if (!p4tc_skb_ext->p4tc_ext) + return NULL; + + return p4tc_skb_ext; +} + +struct p4tc_act *tcf_action_find_byid(struct p4tc_pipeline *pipeline, + const u32 a_id); +struct p4tc_act *tcf_action_find_byname(const char *act_name, + struct p4tc_pipeline *pipeline); +struct p4tc_act *tcf_action_find_byany(struct p4tc_pipeline *pipeline, + const char *act_name, const u32 a_id, + struct netlink_ext_ack *extack); +struct p4tc_act *tcf_action_get(struct p4tc_pipeline *pipeline, + const char *act_name, const u32 a_id, + struct netlink_ext_ack *extack); +void tcf_action_put(struct p4tc_act *act); +int tcf_p4_dyna_template_init(struct net *net, struct tc_action **a, + struct p4tc_act *act, + struct list_head *params_list, + struct tc_act_dyna *parm, u32 flags, + struct netlink_ext_ack *extack); +struct p4tc_act_param 
*tcf_param_find_byid(struct idr *params_idr, + const u32 param_id); +struct p4tc_act_param *tcf_param_find_byany(struct p4tc_act *act, + const char *param_name, + const u32 param_id, + struct netlink_ext_ack *extack); struct p4tc_parser *tcf_parser_create(struct p4tc_pipeline *pipeline, const char *parser_name, @@ -220,8 +390,28 @@ struct p4tc_hdrfield *tcf_hdrfield_get(struct p4tc_parser *parser, struct netlink_ext_ack *extack); void tcf_hdrfield_put_ref(struct p4tc_hdrfield *hdrfield); +int p4tc_init_net_ops(struct net *net, unsigned int id); +void p4tc_exit_net_ops(struct list_head *net_list, unsigned int id); +int tcf_p4_act_init_params(struct net *net, struct tcf_p4act_params *params, + struct p4tc_act *act, struct nlattr *nla, + struct netlink_ext_ack *extack); +void tcf_p4_act_params_destroy(struct tcf_p4act_params *params); +int p4_act_init(struct p4tc_act *act, struct nlattr *nla, + struct p4tc_act_param *params[], + struct netlink_ext_ack *extack); +void p4_put_many_params(struct idr *params_idr, struct p4tc_act_param *params[], + int params_count); +void tcf_p4_act_params_destroy_rcu(struct rcu_head *head); +int p4_act_init_params(struct p4tc_act *act, struct nlattr *nla, + struct p4tc_act_param *params[], bool update, + struct netlink_ext_ack *extack); +extern const struct p4tc_act_param_ops param_ops[P4T_MAX + 1]; +int generic_dump_param_value(struct sk_buff *skb, struct p4tc_type *type, + struct p4tc_act_param *param); + #define to_pipeline(t) ((struct p4tc_pipeline *)t) #define to_meta(t) ((struct p4tc_metadata *)t) #define to_hdrfield(t) ((struct p4tc_hdrfield *)t) +#define to_act(t) ((struct p4tc_act *)t) #endif diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index af4aa66aa..9f7d3c3ea 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -326,6 +326,11 @@ struct tcf_result { }; const struct tcf_proto *goto_tp; + struct { + bool hit; + bool miss; + int action_run_id; + }; }; }; diff --git 
a/include/net/tc_act/p4tc.h b/include/net/tc_act/p4tc.h new file mode 100644 index 000000000..5a15d3da1 --- /dev/null +++ b/include/net/tc_act/p4tc.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __NET_TC_ACT_P4_H +#define __NET_TC_ACT_P4_H + +#include +#include + +struct tcf_p4act_params { + struct tcf_exts exts; + struct idr params_idr; + struct rcu_head rcu; +}; + +struct tcf_p4act { + struct tc_action common; + /* list of operations */ + struct list_head cmd_operations; + /* Params IDR reference passed during runtime */ + struct tcf_p4act_params __rcu *params; + u32 p_id; + u32 act_id; +}; +#define to_p4act(a) ((struct tcf_p4act *)a) + +#endif /* __NET_TC_ACT_P4_H */ diff --git a/include/uapi/linux/p4tc.h b/include/uapi/linux/p4tc.h index 72714df9e..15876c471 100644 --- a/include/uapi/linux/p4tc.h +++ b/include/uapi/linux/p4tc.h @@ -4,6 +4,7 @@ #include #include +#include /* pipeline header */ struct p4tcmsg { @@ -29,6 +30,9 @@ struct p4tcmsg { #define METANAMSIZ TEMPLATENAMSZ #define PARSERNAMSIZ TEMPLATENAMSZ #define HDRFIELDNAMSIZ TEMPLATENAMSZ +#define ACTPARAMNAMSIZ TEMPLATENAMSZ + +#define LABELNAMSIZ 32 /* Root attributes */ enum { @@ -58,6 +62,7 @@ enum { P4TC_OBJ_PIPELINE, P4TC_OBJ_META, P4TC_OBJ_HDR_FIELD, + P4TC_OBJ_ACT, __P4TC_OBJ_MAX, }; #define P4TC_OBJ_MAX __P4TC_OBJ_MAX @@ -172,6 +177,47 @@ enum { }; #define P4TC_HDRFIELD_MAX (__P4TC_HDRFIELD_MAX - 1) +/* Action attributes */ +enum { + P4TC_ACT_UNSPEC, + P4TC_ACT_NAME, /* string */ + P4TC_ACT_PARMS, /* nested params */ + P4TC_ACT_OPT, /* action opt */ + P4TC_ACT_TM, /* action tm */ + P4TC_ACT_CMDS_LIST, /* command list */ + P4TC_ACT_ACTIVE, /* u8 */ + P4TC_ACT_PAD, + __P4TC_ACT_MAX +}; +#define P4TC_ACT_MAX __P4TC_ACT_MAX + +#define P4TC_CMDS_LIST_MAX 32 + +/* Action params attributes */ +enum { + P4TC_ACT_PARAMS_VALUE_UNSPEC, + P4TC_ACT_PARAMS_VALUE_RAW, /* binary */ + P4TC_ACT_PARAMS_VALUE_OPND, /* struct p4tc_u_operand */ + __P4TC_ACT_PARAMS_VALUE_MAX +}; +#define 
P4TC_ACT_VALUE_PARAMS_MAX __P4TC_ACT_PARAMS_VALUE_MAX + +/* Action params attributes */ +enum { + P4TC_ACT_PARAMS_UNSPEC, + P4TC_ACT_PARAMS_NAME, /* string */ + P4TC_ACT_PARAMS_ID, /* u32 */ + P4TC_ACT_PARAMS_VALUE, /* bytes */ + P4TC_ACT_PARAMS_MASK, /* bytes */ + P4TC_ACT_PARAMS_TYPE, /* u32 */ + __P4TC_ACT_PARAMS_MAX +}; +#define P4TC_ACT_PARAMS_MAX __P4TC_ACT_PARAMS_MAX + +struct tc_act_dyna { + tc_gen; +}; + #define P4TC_RTA(r) \ ((struct rtattr *)(((char *)(r)) + NLMSG_ALIGN(sizeof(struct p4tcmsg)))) diff --git a/net/sched/p4tc/Makefile b/net/sched/p4tc/Makefile index add22c909..3f7267366 100644 --- a/net/sched/p4tc/Makefile +++ b/net/sched/p4tc/Makefile @@ -1,4 +1,4 @@ # SPDX-License-Identifier: GPL-2.0 obj-y := p4tc_types.o p4tc_pipeline.o p4tc_tmpl_api.o p4tc_meta.o \ - p4tc_parser_api.o p4tc_hdrfield.o + p4tc_parser_api.o p4tc_hdrfield.o p4tc_action.o diff --git a/net/sched/p4tc/p4tc_action.c b/net/sched/p4tc/p4tc_action.c new file mode 100644 index 000000000..f47b42bbe --- /dev/null +++ b/net/sched/p4tc/p4tc_action.c @@ -0,0 +1,1824 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * net/sched/p4tc_action.c P4 TC ACTION TEMPLATES + * + * Copyright (c) 2022, Mojatatu Networks + * Copyright (c) 2022, Intel Corporation. 
+ * Authors: Jamal Hadi Salim + * Victor Nogueira + * Pedro Tammela + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static LIST_HEAD(dynact_list); + +#define SEPARATOR "/" + +static u32 label_hash_fn(const void *data, u32 len, u32 seed) +{ + const struct p4tc_label_key *key = data; + + return jhash(key->label, key->labelsz, seed); +} + +static int label_hash_cmp(struct rhashtable_compare_arg *arg, const void *ptr) +{ + const struct p4tc_label_key *label_arg = arg->key; + const struct p4tc_label_node *node = ptr; + + return strncmp(label_arg->label, node->key.label, node->key.labelsz); +} + +static u32 label_obj_hash_fn(const void *data, u32 len, u32 seed) +{ + const struct p4tc_label_node *node = data; + + return label_hash_fn(&node->key, 0, seed); +} + +void p4tc_label_ht_destroy(void *ptr, void *arg) +{ + struct p4tc_label_node *node = ptr; + + kfree(node->key.label); + kfree(node); +} + +const struct rhashtable_params p4tc_label_ht_params = { + .obj_cmpfn = label_hash_cmp, + .obj_hashfn = label_obj_hash_fn, + .hashfn = label_hash_fn, + .head_offset = offsetof(struct p4tc_label_node, ht_node), + .key_offset = offsetof(struct p4tc_label_node, key), + .automatic_shrinking = true, +}; + +static int __tcf_p4_dyna_init(struct net *net, struct nlattr *est, + struct p4tc_act *act, struct tc_act_dyna *parm, + struct tc_action **a, struct tcf_proto *tp, + struct tc_action_ops *a_o, + struct tcf_chain **goto_ch, u32 flags, + struct netlink_ext_ack *extack) +{ + bool bind = flags & TCA_ACT_FLAGS_BIND; + bool exists = false; + int ret = 0; + struct p4tc_pipeline *pipeline; + u32 index; + int err; + + index = parm->index; + + err = tcf_idr_check_alloc(act->tn, &index, a, bind); + if (err < 0) + return err; + + exists = err; + if (!exists) { + struct tcf_p4act *p; + + ret = tcf_idr_create(act->tn, index, est, a, a_o, 
bind, false, + flags); + if (ret) { + tcf_idr_cleanup(act->tn, index); + return ret; + } + + /* dyn_ref here should never be 0, because if we are here, it + * means that a template action of this kind was created. Thus + * dyn_ref should be at least 1. Also since this operation and + * others that add or delete action templates run with + * rtnl_lock held, we cannot do this op and a deletion op in + * parallel. + */ + WARN_ON(!refcount_inc_not_zero(&a_o->dyn_ref)); + + pipeline = act->pipeline; + + p = to_p4act(*a); + p->p_id = pipeline->common.p_id; + p->act_id = act->a_id; + INIT_LIST_HEAD(&p->cmd_operations); + + ret = ACT_P_CREATED; + } else { + if (bind) /* dont override defaults */ + return 0; + if (!(flags & TCA_ACT_FLAGS_REPLACE)) { + tcf_idr_cleanup(act->tn, index); + return -EEXIST; + } + } + + err = tcf_action_check_ctrlact(parm->action, tp, goto_ch, extack); + if (err < 0) { + tcf_idr_release(*a, bind); + return err; + } + + return ret; +} + +static int __tcf_p4_dyna_init_set(struct p4tc_act *act, struct tc_action **a, + struct tcf_p4act_params *params, + struct tcf_chain *goto_ch, + struct tc_act_dyna *parm, bool exists, + struct netlink_ext_ack *extack) +{ + struct tcf_p4act_params *params_old; + struct tcf_p4act *p; + int err = 0; + + p = to_p4act(*a); + + if (exists) + spin_lock_bh(&p->tcf_lock); + + goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch); + + params_old = rcu_replace_pointer(p->params, params, 1); + if (exists) + spin_unlock_bh(&p->tcf_lock); + + if (goto_ch) + tcf_chain_put_by_act(goto_ch); + + if (params_old) + call_rcu(¶ms_old->rcu, tcf_p4_act_params_destroy_rcu); + + return err; +} + +static struct p4tc_act *tcf_p4_find_act(struct net *net, + const struct tc_action_ops *a_o) +{ + char *act_name_clone, *act_name, *p_name; + struct p4tc_pipeline *pipeline; + struct p4tc_act *act; + int err; + + act_name_clone = act_name = kstrdup(a_o->kind, GFP_KERNEL); + if (!act_name) + return ERR_PTR(-ENOMEM); + + p_name = 
strsep(&act_name, SEPARATOR); + pipeline = tcf_pipeline_find_byany(net, p_name, 0, NULL); + if (IS_ERR(pipeline)) { + err = -ENOENT; + goto free_act_name; + } + + act = tcf_action_find_byname(act_name, pipeline); + if (!act) { + err = -ENOENT; + goto free_act_name; + } + kfree(act_name_clone); + + return act; + +free_act_name: + kfree(act_name_clone); + return ERR_PTR(err); +} + +static int tcf_p4_dyna_init(struct net *net, struct nlattr *nla, + struct nlattr *est, struct tc_action **a, + struct tcf_proto *tp, struct tc_action_ops *a_o, + u32 flags, struct netlink_ext_ack *extack) +{ + bool bind = flags & TCA_ACT_FLAGS_BIND; + struct tcf_chain *goto_ch = NULL; + bool exists = false; + int ret = 0; + struct nlattr *tb[P4TC_ACT_MAX + 1]; + struct tcf_p4act_params *params; + struct tc_act_dyna *parm; + struct p4tc_act *act; + int err; + + if (flags & TCA_ACT_FLAGS_BIND && + !(flags & TCA_ACT_FLAGS_FROM_P4TC)) { + NL_SET_ERR_MSG(extack, + "Can only bind to dynamic action from P4TC objects"); + return -EPERM; + } + + if (!nla) { + NL_SET_ERR_MSG(extack, + "Must specify action netlink attributes"); + return -EINVAL; + } + + err = nla_parse_nested(tb, P4TC_ACT_MAX, nla, NULL, extack); + if (err < 0) + return err; + + if (!tb[P4TC_ACT_OPT]) { + NL_SET_ERR_MSG(extack, + "Must specify option netlink attributes"); + return -EINVAL; + } + + act = tcf_p4_find_act(net, a_o); + if (IS_ERR(act)) + return PTR_ERR(act); + + if (!act->active) { + NL_SET_ERR_MSG(extack, + "Dynamic action must be active to create instance"); + return -EINVAL; + } + + parm = nla_data(tb[P4TC_ACT_OPT]); + + ret = __tcf_p4_dyna_init(net, est, act, parm, a, tp, a_o, &goto_ch, + flags, extack); + if (ret < 0) + return ret; + if (bind && !ret) + return 0; + + err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack); + if (err < 0) + goto release_idr; + + params = kzalloc(sizeof(*params), GFP_KERNEL); + if (!params) { + err = -ENOMEM; + goto release_idr; + } + + idr_init(¶ms->params_idr); + if 
(tb[P4TC_ACT_PARMS]) { + err = tcf_p4_act_init_params(net, params, act, + tb[P4TC_ACT_PARMS], extack); + if (err < 0) + goto release_params; + } else { + if (!idr_is_empty(&act->params_idr)) { + NL_SET_ERR_MSG(extack, + "Must specify action parameters"); + err = -EINVAL; + goto release_params; + } + } + + exists = ret != ACT_P_CREATED; + err = __tcf_p4_dyna_init_set(act, a, params, goto_ch, parm, exists, + extack); + if (err < 0) + goto release_params; + + return ret; + +release_params: + tcf_p4_act_params_destroy(params); + +release_idr: + tcf_idr_release(*a, bind); + return err; +} + +static const struct nla_policy p4tc_act_params_value_policy[P4TC_ACT_VALUE_PARAMS_MAX + 1] = { + [P4TC_ACT_PARAMS_VALUE_RAW] = { .type = NLA_BINARY }, + [P4TC_ACT_PARAMS_VALUE_OPND] = { .type = NLA_NESTED }, +}; + +static int dev_init_param_value(struct net *net, struct p4tc_act_param_ops *op, + struct p4tc_act_param *nparam, + struct nlattr **tb, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb_value[P4TC_ACT_VALUE_PARAMS_MAX + 1]; + u32 value_len; + u32 *ifindex; + int err; + + if (!tb[P4TC_ACT_PARAMS_VALUE]) { + NL_SET_ERR_MSG(extack, "Must specify param value"); + return -EINVAL; + } + err = nla_parse_nested(tb_value, P4TC_ACT_VALUE_PARAMS_MAX, + tb[P4TC_ACT_PARAMS_VALUE], + p4tc_act_params_value_policy, extack); + if (err < 0) + return err; + + value_len = nla_len(tb_value[P4TC_ACT_PARAMS_VALUE_RAW]); + if (value_len != sizeof(u32)) { + NL_SET_ERR_MSG(extack, "Value length differs from template's"); + return -EINVAL; + } + + ifindex = nla_data(tb_value[P4TC_ACT_PARAMS_VALUE_RAW]); + rcu_read_lock(); + if (!dev_get_by_index_rcu(net, *ifindex)) { + NL_SET_ERR_MSG(extack, "Invalid ifindex"); + rcu_read_unlock(); + return -EINVAL; + } + rcu_read_unlock(); + + nparam->value = kzalloc(sizeof(*ifindex), GFP_KERNEL); + if (!nparam->value) + return -EINVAL; + + memcpy(nparam->value, ifindex, sizeof(*ifindex)); + + return 0; +} + +static int dev_dump_param_value(struct sk_buff 
*skb, + struct p4tc_act_param_ops *op, + struct p4tc_act_param *param) +{ + struct nlattr *nest; + int ret; + + nest = nla_nest_start(skb, P4TC_ACT_PARAMS_VALUE); + if (param->flags & P4TC_ACT_PARAM_FLAGS_ISDYN) { + struct nlattr *nla_opnd; + + nla_opnd = nla_nest_start(skb, P4TC_ACT_PARAMS_VALUE_OPND); + nla_nest_end(skb, nla_opnd); + } else { + const u32 *ifindex = param->value; + + if (nla_put_u32(skb, P4TC_ACT_PARAMS_VALUE_RAW, *ifindex)) { + ret = -EINVAL; + goto out_nla_cancel; + } + } + nla_nest_end(skb, nest); + + return 0; + +out_nla_cancel: + nla_nest_cancel(skb, nest); + return ret; +} + +static void dev_free_param_value(struct p4tc_act_param *param) +{ + if (!(param->flags & P4TC_ACT_PARAM_FLAGS_ISDYN)) + kfree(param->value); +} + +static int generic_init_param_value(struct p4tc_act_param *nparam, + struct p4tc_type *type, struct nlattr **tb, + struct netlink_ext_ack *extack) +{ + const u32 alloc_len = BITS_TO_BYTES(type->container_bitsz); + const u32 len = BITS_TO_BYTES(type->bitsz); + struct nlattr *tb_value[P4TC_ACT_VALUE_PARAMS_MAX + 1]; + void *value; + int err; + + if (!tb[P4TC_ACT_PARAMS_VALUE]) { + NL_SET_ERR_MSG(extack, "Must specify param value"); + return -EINVAL; + } + + err = nla_parse_nested(tb_value, P4TC_ACT_VALUE_PARAMS_MAX, + tb[P4TC_ACT_PARAMS_VALUE], + p4tc_act_params_value_policy, extack); + if (err < 0) + return err; + + value = nla_data(tb_value[P4TC_ACT_PARAMS_VALUE_RAW]); + if (type->ops->validate_p4t) { + err = type->ops->validate_p4t(type, value, 0, type->bitsz - 1, + extack); + if (err < 0) + return err; + } + + if (nla_len(tb_value[P4TC_ACT_PARAMS_VALUE_RAW]) != len) + return -EINVAL; + + nparam->value = kzalloc(alloc_len, GFP_KERNEL); + if (!nparam->value) + return -ENOMEM; + + memcpy(nparam->value, value, len); + + if (tb[P4TC_ACT_PARAMS_MASK]) { + const void *mask = nla_data(tb[P4TC_ACT_PARAMS_MASK]); + + if (nla_len(tb[P4TC_ACT_PARAMS_MASK]) != len) { + NL_SET_ERR_MSG(extack, + "Mask length differs from template's"); + 
err = -EINVAL; + goto free_value; + } + + nparam->mask = kzalloc(alloc_len, GFP_KERNEL); + if (!nparam->mask) { + err = -ENOMEM; + goto free_value; + } + + memcpy(nparam->mask, mask, len); + } + + return 0; + +free_value: + kfree(nparam->value); + return err; +} + +const struct p4tc_act_param_ops param_ops[P4T_MAX + 1] = { + [P4T_DEV] = { + .init_value = dev_init_param_value, + .dump_value = dev_dump_param_value, + .free = dev_free_param_value, + }, +}; + +static void generic_free_param_value(struct p4tc_act_param *param) +{ + if (!(param->flags & P4TC_ACT_PARAM_FLAGS_ISDYN)) { + kfree(param->value); + kfree(param->mask); + } +} + +int tcf_p4_act_init_params_list(struct tcf_p4act_params *params, + struct list_head *params_list) +{ + struct p4tc_act_param *nparam, *tmp; + int err; + + list_for_each_entry_safe(nparam, tmp, params_list, head) { + err = idr_alloc_u32(¶ms->params_idr, nparam, &nparam->id, + nparam->id, GFP_KERNEL); + if (err < 0) + return err; + list_del(&nparam->head); + } + + return 0; +} + +/* This is the action instantiation that is invoked from the template code, + * specifically when there is a command act with runtime parameters. + * It is assumed that the action kind that is being instantiated here was + * already created. This functions is analogous to tcf_p4_dyna_init. 
+ */
+int tcf_p4_dyna_template_init(struct net *net, struct tc_action **a,
+			      struct p4tc_act *act,
+			      struct list_head *params_list,
+			      struct tc_act_dyna *parm, u32 flags,
+			      struct netlink_ext_ack *extack)
+{
+	bool bind = flags & TCA_ACT_FLAGS_BIND;
+	struct tc_action_ops *a_o = &act->ops;
+	struct tcf_chain *goto_ch = NULL;
+	bool exists = false;
+	struct tcf_p4act_params *params;
+	int ret;
+	int err;
+
+	if (!act->active) {
+		NL_SET_ERR_MSG(extack,
+			       "Dynamic action must be active to create instance");
+		return -EINVAL;
+	}
+
+	ret = __tcf_p4_dyna_init(net, NULL, act, parm, a, NULL, a_o, &goto_ch,
+				 flags, extack);
+	if (ret < 0)
+		return ret;
+
+	err = tcf_action_check_ctrlact(parm->action, NULL, &goto_ch, extack);
+	if (err < 0)
+		goto release_idr;
+
+	params = kzalloc(sizeof(*params), GFP_KERNEL);
+	if (!params) {
+		err = -ENOMEM;
+		goto release_idr;
+	}
+
+	idr_init(&params->params_idr);
+	if (params_list) {
+		err = tcf_p4_act_init_params_list(params, params_list);
+		if (err < 0)
+			goto release_params;
+	} else {
+		if (!idr_is_empty(&act->params_idr)) {
+			NL_SET_ERR_MSG(extack,
+				       "Must specify action parameters");
+			err = -EINVAL;
+			goto release_params;
+		}
+	}
+
+	exists = ret != ACT_P_CREATED;
+	err = __tcf_p4_dyna_init_set(act, a, params, goto_ch, parm, exists,
+				     extack);
+	if (err < 0)
+		goto release_params;
+
+	return err;
+
+release_params:
+	tcf_p4_act_params_destroy(params);
+
+release_idr:
+	tcf_idr_release(*a, bind);
+	return err;
+}
+
+static int tcf_p4_dyna_act(struct sk_buff *skb, const struct tc_action *a,
+			   struct tcf_result *res)
+{
+	struct tcf_p4act *dynact = to_p4act(a);
+	int ret = 0;
+
+	tcf_lastuse_update(&dynact->tcf_tm);
+	tcf_action_update_bstats(&dynact->common, skb);
+
+	return ret;
+}
+
+static int tcf_p4_dyna_dump(struct sk_buff *skb, struct tc_action *a, int bind,
+			    int ref)
+{
+	unsigned char *b = nlmsg_get_pos(skb);
+	struct tcf_p4act *dynact = to_p4act(a);
+	struct tc_act_dyna opt = {
+		.index = dynact->tcf_index,
+		.refcnt = refcount_read(&dynact->tcf_refcnt) - ref,
+		.bindcnt = atomic_read(&dynact->tcf_bindcnt) - bind,
+	};
+	int i = 1;
+	struct tcf_p4act_params *params;
+	struct p4tc_act_param *parm;
+	struct nlattr *nest_parms;
+	struct nlattr *nest;
+	struct tcf_t t;
+	int id;
+
+	spin_lock_bh(&dynact->tcf_lock);
+
+	opt.action = dynact->tcf_action;
+	if (nla_put(skb, P4TC_ACT_OPT, sizeof(opt), &opt))
+		goto nla_put_failure;
+
+	nest = nla_nest_start(skb, P4TC_ACT_CMDS_LIST);
+	nla_nest_end(skb, nest);
+
+	if (nla_put_string(skb, P4TC_ACT_NAME, a->ops->kind))
+		goto nla_put_failure;
+
+	tcf_tm_dump(&t, &dynact->tcf_tm);
+	if (nla_put_64bit(skb, P4TC_ACT_TM, sizeof(t), &t, P4TC_ACT_PAD))
+		goto nla_put_failure;
+
+	nest_parms = nla_nest_start(skb, P4TC_ACT_PARMS);
+	if (!nest_parms)
+		goto nla_put_failure;
+
+	params = rcu_dereference_protected(dynact->params, 1);
+	if (params) {
+		idr_for_each_entry(&params->params_idr, parm, id) {
+			struct p4tc_act_param_ops *op;
+			struct nlattr *nest_count;
+
+			nest_count = nla_nest_start(skb, i);
+			if (!nest_count)
+				goto nla_put_failure;
+
+			if (nla_put_string(skb, P4TC_ACT_PARAMS_NAME,
+					   parm->name))
+				goto nla_put_failure;
+
+			if (nla_put_u32(skb, P4TC_ACT_PARAMS_ID, parm->id))
+				goto nla_put_failure;
+
+			op = (struct p4tc_act_param_ops *)&param_ops[parm->type];
+			if (op->dump_value) {
+				if (op->dump_value(skb, op, parm) < 0)
+					goto nla_put_failure;
+			} else {
+				struct p4tc_type *type;
+
+				type = p4type_find_byid(parm->type);
+				if (generic_dump_param_value(skb, type, parm))
+					goto nla_put_failure;
+			}
+
+			if (nla_put_u32(skb, P4TC_ACT_PARAMS_TYPE, parm->type))
+				goto nla_put_failure;
+
+			nla_nest_end(skb, nest_count);
+			i++;
+		}
+	}
+	nla_nest_end(skb, nest_parms);
+
+	spin_unlock_bh(&dynact->tcf_lock);
+
+	return skb->len;
+
+nla_put_failure:
+	spin_unlock_bh(&dynact->tcf_lock);
+	nlmsg_trim(skb, b);
+	return -1;
+}
+
+static int tcf_p4_dyna_lookup(struct net *net, const struct tc_action_ops *ops,
+			      struct tc_action **a, u32 index)
+{
+	struct p4tc_act *act;
+
+	act = tcf_p4_find_act(net, ops);
+	if (IS_ERR(act))
+		return PTR_ERR(act);
+
+	return tcf_idr_search(act->tn, a, index);
+}
+
+static int tcf_p4_dyna_walker(struct net *net, struct sk_buff *skb,
+			      struct netlink_callback *cb, int type,
+			      const struct tc_action_ops *ops,
+			      struct netlink_ext_ack *extack)
+{
+	struct p4tc_act *act;
+
+	act = tcf_p4_find_act(net, ops);
+	if (IS_ERR(act))
+		return PTR_ERR(act);
+
+	return tcf_generic_walker(act->tn, skb, cb, type, ops, extack);
+}
+
+static void tcf_p4_dyna_cleanup(struct tc_action *a)
+{
+	struct tc_action_ops *ops = (struct tc_action_ops *)a->ops;
+	struct tcf_p4act *m = to_p4act(a);
+	struct tcf_p4act_params *params;
+
+	params = rcu_dereference_protected(m->params, 1);
+
+	if (refcount_read(&ops->dyn_ref) > 1)
+		refcount_dec(&ops->dyn_ref);
+
+	spin_lock_bh(&m->tcf_lock);
+	if (params)
+		call_rcu(&params->rcu, tcf_p4_act_params_destroy_rcu);
+	spin_unlock_bh(&m->tcf_lock);
+}
+
+int generic_dump_param_value(struct sk_buff *skb, struct p4tc_type *type,
+			     struct p4tc_act_param *param)
+{
+	const u32 bytesz = BITS_TO_BYTES(type->container_bitsz);
+	unsigned char *b = nlmsg_get_pos(skb);
+	struct nlattr *nla_value;
+
+	nla_value = nla_nest_start(skb, P4TC_ACT_PARAMS_VALUE);
+	if (param->flags & P4TC_ACT_PARAM_FLAGS_ISDYN) {
+		struct nlattr *nla_opnd;
+
+		nla_opnd = nla_nest_start(skb, P4TC_ACT_PARAMS_VALUE_OPND);
+		nla_nest_end(skb, nla_opnd);
+	} else {
+		if (nla_put(skb, P4TC_ACT_PARAMS_VALUE_RAW, bytesz,
+			    param->value))
+			goto out_nlmsg_trim;
+	}
+	nla_nest_end(skb, nla_value);
+
+	if (param->mask &&
+	    nla_put(skb, P4TC_ACT_PARAMS_MASK, bytesz, param->mask))
+		goto out_nlmsg_trim;
+
+	return 0;
+
+out_nlmsg_trim:
+	nlmsg_trim(skb, b);
+	return -1;
+}
+
+void tcf_p4_act_params_destroy(struct tcf_p4act_params *params)
+{
+	struct p4tc_act_param *param;
+	unsigned long param_id, tmp;
+
+	idr_for_each_entry_ul(&params->params_idr, param, tmp, param_id) {
+		struct p4tc_act_param_ops *op;
+
+		idr_remove(&params->params_idr, param_id);
+		op = (struct p4tc_act_param_ops *)&param_ops[param->type];
+		if (op->free)
+			op->free(param);
+		else
+			generic_free_param_value(param);
+		kfree(param);
+	}
+
+	idr_destroy(&params->params_idr);
+
+	kfree(params);
+}
+
+void tcf_p4_act_params_destroy_rcu(struct rcu_head *head)
+{
+	struct tcf_p4act_params *params;
+
+	params = container_of(head, struct tcf_p4act_params, rcu);
+	tcf_p4_act_params_destroy(params);
+}
+
+static const struct nla_policy p4tc_act_params_policy[P4TC_ACT_PARAMS_MAX + 1] = {
+	[P4TC_ACT_PARAMS_NAME] = { .type = NLA_STRING, .len = ACTPARAMNAMSIZ },
+	[P4TC_ACT_PARAMS_ID] = { .type = NLA_U32 },
+	[P4TC_ACT_PARAMS_VALUE] = { .type = NLA_NESTED },
+	[P4TC_ACT_PARAMS_MASK] = { .type = NLA_BINARY },
+	[P4TC_ACT_PARAMS_TYPE] = { .type = NLA_U32 },
+};
+
+static struct p4tc_act_param *param_find_byname(struct idr *params_idr,
+						const char *param_name)
+{
+	struct p4tc_act_param *param;
+	unsigned long tmp, id;
+
+	idr_for_each_entry_ul(params_idr, param, tmp, id) {
+		if (param == ERR_PTR(-EBUSY))
+			continue;
+		if (strncmp(param->name, param_name, ACTPARAMNAMSIZ) == 0)
+			return param;
+	}
+
+	return NULL;
+}
+
+struct p4tc_act_param *tcf_param_find_byid(struct idr *params_idr,
+					   const u32 param_id)
+{
+	return idr_find(params_idr, param_id);
+}
+
+struct p4tc_act_param *tcf_param_find_byany(struct p4tc_act *act,
+					    const char *param_name,
+					    const u32 param_id,
+					    struct netlink_ext_ack *extack)
+{
+	struct p4tc_act_param *param;
+	int err;
+
+	if (param_id) {
+		param = tcf_param_find_byid(&act->params_idr, param_id);
+		if (!param) {
+			NL_SET_ERR_MSG(extack, "Unable to find param by id");
+			err = -EINVAL;
+			goto out;
+		}
+	} else {
+		if (param_name) {
+			param = param_find_byname(&act->params_idr, param_name);
+			if (!param) {
+				NL_SET_ERR_MSG(extack, "Param name not found");
+				err = -EINVAL;
+				goto out;
+			}
+		} else {
+			NL_SET_ERR_MSG(extack, "Must specify param name or id");
+			err = -EINVAL;
+			goto out;
+		}
+	}
+
+	return param;
+
+out:
+	return ERR_PTR(err);
+}
+
+static struct p4tc_act_param *
+tcf_param_find_byanyattr(struct p4tc_act *act, struct nlattr *name_attr,
+			 const u32 param_id, struct netlink_ext_ack *extack)
+{
+	char *param_name = NULL;
+
+	if (name_attr)
+		param_name = nla_data(name_attr);
+
+	return tcf_param_find_byany(act, param_name, param_id, extack);
+}
+
+static int tcf_p4_act_init_param(struct net *net,
+				 struct tcf_p4act_params *params,
+				 struct p4tc_act *act, struct nlattr *nla,
+				 struct netlink_ext_ack *extack)
+{
+	u32 param_id = 0;
+	struct nlattr *tb[P4TC_ACT_PARAMS_MAX + 1];
+	struct p4tc_act_param *param, *nparam;
+	struct p4tc_act_param_ops *op;
+	struct p4tc_type *type;
+	int err;
+
+	err = nla_parse_nested(tb, P4TC_ACT_PARAMS_MAX, nla,
+			       p4tc_act_params_policy, extack);
+	if (err < 0)
+		return err;
+
+	if (tb[P4TC_ACT_PARAMS_ID])
+		param_id = *((u32 *)nla_data(tb[P4TC_ACT_PARAMS_ID]));
+
+	param = tcf_param_find_byanyattr(act, tb[P4TC_ACT_PARAMS_NAME],
+					 param_id, extack);
+	if (IS_ERR(param))
+		return PTR_ERR(param);
+
+	if (tb[P4TC_ACT_PARAMS_TYPE]) {
+		u32 *type = nla_data(tb[P4TC_ACT_PARAMS_TYPE]);
+
+		if (param->type != *type) {
+			NL_SET_ERR_MSG(extack,
+				       "Param type differs from template");
+			return -EINVAL;
+		}
+	} else {
+		NL_SET_ERR_MSG(extack, "Must specify param type");
+		return -EINVAL;
+	}
+
+	nparam = kzalloc(sizeof(*nparam), GFP_KERNEL);
+	if (!nparam)
+		return -ENOMEM;
+
+	strscpy(nparam->name, param->name, ACTPARAMNAMSIZ);
+	nparam->type = param->type;
+
+	type = p4type_find_byid(param->type);
+	if (!type) {
+		NL_SET_ERR_MSG(extack, "Invalid param type");
+		err = -EINVAL;
+		goto free;
+	}
+
+	op = (struct p4tc_act_param_ops *)&param_ops[param->type];
+	if (op->init_value)
+		err = op->init_value(net, op, nparam, tb, extack);
+	else
+		err = generic_init_param_value(nparam, type, tb, extack);
+
+	if (err < 0)
+		goto free;
+
+	nparam->id = param->id;
+
+	err = idr_alloc_u32(&params->params_idr, nparam, &nparam->id,
+			    nparam->id,
+			    GFP_KERNEL);
+	if (err < 0)
+		goto free_val;
+
+	return 0;
+
+free_val:
+	if (op->free)
+		op->free(nparam);
+	else
+		generic_free_param_value(nparam);
+
+free:
+	kfree(nparam);
+	return err;
+}
+
+int tcf_p4_act_init_params(struct net *net, struct tcf_p4act_params *params,
+			   struct p4tc_act *act, struct nlattr *nla,
+			   struct netlink_ext_ack *extack)
+{
+	struct nlattr *tb[P4TC_MSGBATCH_SIZE + 1];
+	int err;
+	int i;
+
+	err = nla_parse_nested(tb, P4TC_MSGBATCH_SIZE, nla, NULL, NULL);
+	if (err < 0)
+		return err;
+
+	for (i = 1; i < P4TC_MSGBATCH_SIZE + 1 && tb[i]; i++) {
+		err = tcf_p4_act_init_param(net, params, act, tb[i], extack);
+		if (err < 0)
+			return err;
+	}
+
+	return 0;
+}
+
+struct p4tc_act *tcf_action_find_byname(const char *act_name,
+					struct p4tc_pipeline *pipeline)
+{
+	char full_act_name[ACTNAMSIZ];
+	struct p4tc_act *act;
+	unsigned long tmp, id;
+
+	snprintf(full_act_name, ACTNAMSIZ, "%s/%s", pipeline->common.name,
+		 act_name);
+	idr_for_each_entry_ul(&pipeline->p_act_idr, act, tmp, id)
+		if (strncmp(act->common.name, full_act_name, ACTNAMSIZ) == 0)
+			return act;
+
+	return NULL;
+}
+
+struct p4tc_act *tcf_action_find_byid(struct p4tc_pipeline *pipeline,
+				      const u32 a_id)
+{
+	return idr_find(&pipeline->p_act_idr, a_id);
+}
+
+struct p4tc_act *tcf_action_find_byany(struct p4tc_pipeline *pipeline,
+				       const char *act_name, const u32 a_id,
+				       struct netlink_ext_ack *extack)
+{
+	struct p4tc_act *act;
+	int err;
+
+	if (a_id) {
+		act = tcf_action_find_byid(pipeline, a_id);
+		if (!act) {
+			NL_SET_ERR_MSG(extack, "Unable to find action by id");
+			err = -ENOENT;
+			goto out;
+		}
+	} else {
+		if (act_name) {
+			act = tcf_action_find_byname(act_name, pipeline);
+			if (!act) {
+				NL_SET_ERR_MSG(extack, "Action name not found");
+				err = -ENOENT;
+				goto out;
+			}
+		} else {
+			NL_SET_ERR_MSG(extack,
+				       "Must specify action name or id");
+			err = -EINVAL;
+			goto out;
+		}
+	}
+
+	return act;
+
+out:
+	return ERR_PTR(err);
+}
+
+struct p4tc_act *tcf_action_get(struct
p4tc_pipeline *pipeline, + const char *act_name, const u32 a_id, + struct netlink_ext_ack *extack) +{ + struct p4tc_act *act; + + act = tcf_action_find_byany(pipeline, act_name, a_id, extack); + if (IS_ERR(act)) + return act; + + WARN_ON(!refcount_inc_not_zero(&act->a_ref)); + return act; +} + +void tcf_action_put(struct p4tc_act *act) +{ + WARN_ON(!refcount_dec_not_one(&act->a_ref)); +} + +static struct p4tc_act * +tcf_action_find_byanyattr(struct nlattr *act_name_attr, const u32 a_id, + struct p4tc_pipeline *pipeline, + struct netlink_ext_ack *extack) +{ + char *act_name = NULL; + + if (act_name_attr) + act_name = nla_data(act_name_attr); + + return tcf_action_find_byany(pipeline, act_name, a_id, extack); +} + +static void p4_put_param(struct idr *params_idr, struct p4tc_act_param *param) +{ + kfree(param); +} + +void p4_put_many_params(struct idr *params_idr, struct p4tc_act_param *params[], + int params_count) +{ + int i; + + for (i = 0; i < params_count; i++) + p4_put_param(params_idr, params[i]); +} + +static struct p4tc_act_param *p4_create_param(struct p4tc_act *act, + struct nlattr **tb, u32 param_id, + struct netlink_ext_ack *extack) +{ + struct p4tc_act_param *param; + char *name; + int ret; + + if (tb[P4TC_ACT_PARAMS_NAME]) { + name = nla_data(tb[P4TC_ACT_PARAMS_NAME]); + } else { + NL_SET_ERR_MSG(extack, "Must specify param name"); + ret = -EINVAL; + goto out; + } + + param = kmalloc(sizeof(*param), GFP_KERNEL); + if (!param) { + ret = -ENOMEM; + goto out; + } + + if (tcf_param_find_byid(&act->params_idr, param_id) || + param_find_byname(&act->params_idr, name)) { + NL_SET_ERR_MSG(extack, "Param already exists"); + ret = -EEXIST; + goto free; + } + + if (tb[P4TC_ACT_PARAMS_TYPE]) { + struct p4tc_type *type; + + param->type = *((u32 *)nla_data(tb[P4TC_ACT_PARAMS_TYPE])); + type = p4type_find_byid(param->type); + if (!type) { + NL_SET_ERR_MSG(extack, "Param type is invalid"); + ret = -EINVAL; + goto free; + } + } else { + NL_SET_ERR_MSG(extack, "Must 
specify param type"); + ret = -EINVAL; + goto free; + } + + if (param_id) { + ret = idr_alloc_u32(&act->params_idr, param, ¶m_id, + param_id, GFP_KERNEL); + if (ret < 0) { + NL_SET_ERR_MSG(extack, "Unable to allocate param id"); + goto free; + } + param->id = param_id; + } else { + param->id = 1; + + ret = idr_alloc_u32(&act->params_idr, param, ¶m->id, + UINT_MAX, GFP_KERNEL); + if (ret < 0) { + NL_SET_ERR_MSG(extack, "Unable to allocate param id"); + goto free; + } + } + + strscpy(param->name, name, ACTPARAMNAMSIZ); + + return param; + +free: + kfree(param); + +out: + return ERR_PTR(ret); +} + +static struct p4tc_act_param *p4_update_param(struct p4tc_act *act, + struct nlattr **tb, + const u32 param_id, + struct netlink_ext_ack *extack) +{ + struct p4tc_act_param *param_old, *param; + int ret; + + param_old = tcf_param_find_byanyattr(act, tb[P4TC_ACT_PARAMS_NAME], + param_id, extack); + if (IS_ERR(param_old)) + return param_old; + + param = kmalloc(sizeof(*param), GFP_KERNEL); + if (!param) { + ret = -ENOMEM; + goto out; + } + + strscpy(param->name, param_old->name, ACTPARAMNAMSIZ); + param->id = param_old->id; + + if (tb[P4TC_ACT_PARAMS_TYPE]) { + struct p4tc_type *type; + + param->type = *((u32 *)nla_data(tb[P4TC_ACT_PARAMS_TYPE])); + type = p4type_find_byid(param->type); + if (!type) { + NL_SET_ERR_MSG(extack, "Param type is invalid"); + ret = -EINVAL; + goto out; + } + } else { + NL_SET_ERR_MSG(extack, "Must specify param type"); + ret = -EINVAL; + goto out; + } + + return param; + +out: + return ERR_PTR(ret); +} + +static struct p4tc_act_param *p4_act_init_param(struct p4tc_act *act, + struct nlattr *nla, bool update, + struct netlink_ext_ack *extack) +{ + u32 param_id = 0; + struct nlattr *tb[P4TC_ACT_PARAMS_MAX + 1]; + int ret; + + ret = nla_parse_nested(tb, P4TC_ACT_PARAMS_MAX, nla, NULL, extack); + if (ret < 0) { + ret = -EINVAL; + goto out; + } + + if (tb[P4TC_ACT_PARAMS_ID]) + param_id = *((u32 *)nla_data(tb[P4TC_ACT_PARAMS_ID])); + + if (update) + 
return p4_update_param(act, tb, param_id, extack); + else + return p4_create_param(act, tb, param_id, extack); + +out: + return ERR_PTR(ret); +} + +int p4_act_init_params(struct p4tc_act *act, struct nlattr *nla, + struct p4tc_act_param *params[], bool update, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[P4TC_MSGBATCH_SIZE + 1]; + int ret; + int i; + + ret = nla_parse_nested(tb, P4TC_MSGBATCH_SIZE, nla, NULL, extack); + if (ret < 0) + return -EINVAL; + + for (i = 1; i < P4TC_MSGBATCH_SIZE + 1 && tb[i]; i++) { + struct p4tc_act_param *param; + + param = p4_act_init_param(act, tb[i], update, extack); + if (IS_ERR(param)) { + ret = PTR_ERR(param); + goto params_del; + } + params[i - 1] = param; + } + + return i - 1; + +params_del: + p4_put_many_params(&act->params_idr, params, i - 1); + return ret; +} + +int p4_act_init(struct p4tc_act *act, struct nlattr *nla, + struct p4tc_act_param *params[], struct netlink_ext_ack *extack) +{ + int num_params = 0; + int ret; + + idr_init(&act->params_idr); + + if (nla) { + num_params = + p4_act_init_params(act, nla, params, false, extack); + if (num_params < 0) { + ret = num_params; + goto idr_destroy; + } + } + + return num_params; + +idr_destroy: + p4_put_many_params(&act->params_idr, params, num_params); + idr_destroy(&act->params_idr); + return ret; +} + +static const struct nla_policy p4tc_act_policy[P4TC_ACT_MAX + 1] = { + [P4TC_ACT_NAME] = { .type = NLA_STRING, .len = ACTNAMSIZ }, + [P4TC_ACT_PARMS] = { .type = NLA_NESTED }, + [P4TC_ACT_OPT] = { .type = NLA_BINARY, + .len = sizeof(struct tc_act_dyna) }, + [P4TC_ACT_CMDS_LIST] = { .type = NLA_NESTED }, + [P4TC_ACT_ACTIVE] = { .type = NLA_U8 }, +}; + +static inline void p4tc_action_net_exit(struct tc_action_net *tn) +{ + tcf_idrinfo_destroy(tn->ops, tn->idrinfo); + kfree(tn->idrinfo); + kfree(tn); +} + +static int __tcf_act_put(struct net *net, struct p4tc_pipeline *pipeline, + struct p4tc_act *act, bool unconditional_purge, + struct netlink_ext_ack *extack) +{ + 
struct p4tc_act_param *act_param; + unsigned long param_id, tmp; + struct tc_action_net *tn; + struct idr *idr; + int ret; + + if (!unconditional_purge && (refcount_read(&act->ops.dyn_ref) > 1 || + refcount_read(&act->a_ref) > 1)) { + NL_SET_ERR_MSG(extack, + "Unable to delete referenced action template"); + return -EBUSY; + } + + tn = net_generic(net, act->ops.net_id); + idr = &tn->idrinfo->action_idr; + + idr_for_each_entry_ul(&act->params_idr, act_param, tmp, param_id) { + idr_remove(&act->params_idr, param_id); + kfree(act_param); + } + + ret = __tcf_unregister_action(&act->ops); + if (ret < 0) { + NL_SET_ERR_MSG(extack, + "Unable to unregister new action template"); + return ret; + } + p4tc_action_net_exit(act->tn); + + if (act->labels) { + rhashtable_free_and_destroy(act->labels, p4tc_label_ht_destroy, + NULL); + kfree(act->labels); + } + + idr_remove(&pipeline->p_act_idr, act->a_id); + + if (!unconditional_purge) + tcf_pipeline_delete_from_dep_graph(pipeline, act); + + list_del(&act->head); + + kfree(act); + + pipeline->num_created_acts--; + + return 0; +} + +static int _tcf_act_fill_nlmsg(struct net *net, struct sk_buff *skb, + struct p4tc_act *act) +{ + unsigned char *b = nlmsg_get_pos(skb); + int i = 1; + struct nlattr *nest, *parms, *cmds; + struct p4tc_act_param *param; + unsigned long param_id, tmp; + + if (nla_put_u32(skb, P4TC_PATH, act->a_id)) + goto out_nlmsg_trim; + + nest = nla_nest_start(skb, P4TC_PARAMS); + if (!nest) + goto out_nlmsg_trim; + + if (nla_put_string(skb, P4TC_ACT_NAME, act->common.name)) + goto out_nlmsg_trim; + + parms = nla_nest_start(skb, P4TC_ACT_PARMS); + if (!parms) + goto out_nlmsg_trim; + + idr_for_each_entry_ul(&act->params_idr, param, tmp, param_id) { + struct nlattr *nest_count; + + nest_count = nla_nest_start(skb, i); + if (!nest_count) + goto out_nlmsg_trim; + + if (nla_put_string(skb, P4TC_ACT_PARAMS_NAME, param->name)) + goto out_nlmsg_trim; + + if (nla_put_u32(skb, P4TC_ACT_PARAMS_ID, param->id)) + goto 
out_nlmsg_trim;
+
+		if (nla_put_u32(skb, P4TC_ACT_PARAMS_TYPE, param->type))
+			goto out_nlmsg_trim;
+
+		nla_nest_end(skb, nest_count);
+		i++;
+	}
+	nla_nest_end(skb, parms);
+
+	cmds = nla_nest_start(skb, P4TC_ACT_CMDS_LIST);
+	nla_nest_end(skb, cmds);
+
+	nla_nest_end(skb, nest);
+
+	return skb->len;
+
+out_nlmsg_trim:
+	nlmsg_trim(skb, b);
+	return -1;
+}
+
+static int tcf_act_fill_nlmsg(struct net *net, struct sk_buff *skb,
+			      struct p4tc_template_common *tmpl,
+			      struct netlink_ext_ack *extack)
+{
+	return _tcf_act_fill_nlmsg(net, skb, to_act(tmpl));
+}
+
+static int tcf_act_flush(struct sk_buff *skb, struct net *net,
+			 struct p4tc_pipeline *pipeline,
+			 struct netlink_ext_ack *extack)
+{
+	unsigned char *b = nlmsg_get_pos(skb);
+	struct p4tc_act *act;
+	unsigned long tmp, act_id;
+	int ret = 0;
+	int i = 0;
+
+	if (nla_put_u32(skb, P4TC_PATH, 0))
+		goto out_nlmsg_trim;
+
+	if (idr_is_empty(&pipeline->p_act_idr)) {
+		NL_SET_ERR_MSG(extack,
+			       "There are no action templates to flush");
+		goto out_nlmsg_trim;
+	}
+
+	idr_for_each_entry_ul(&pipeline->p_act_idr, act, tmp, act_id) {
+		if (__tcf_act_put(net, pipeline, act, false, extack) < 0) {
+			ret = -EBUSY;
+			continue;
+		}
+		i++;
+	}
+
+	nla_put_u32(skb, P4TC_COUNT, i);
+
+	if (ret < 0) {
+		if (i == 0) {
+			NL_SET_ERR_MSG(extack,
+				       "Unable to flush any action template");
+			goto out_nlmsg_trim;
+		} else {
+			NL_SET_ERR_MSG(extack,
+				       "Unable to flush all action templates");
+		}
+	}
+
+	return i;
+
+out_nlmsg_trim:
+	nlmsg_trim(skb, b);
+	return ret;
+}
+
+static int tcf_act_gd(struct net *net, struct sk_buff *skb, struct nlmsghdr *n,
+		      struct nlattr *nla, struct p4tc_nl_pname *nl_pname,
+		      u32 *ids, struct netlink_ext_ack *extack)
+{
+	const u32 pipeid = ids[P4TC_PID_IDX], a_id = ids[P4TC_AID_IDX];
+	struct nlattr *tb[P4TC_ACT_MAX + 1] = { NULL };
+	unsigned char *b = nlmsg_get_pos(skb);
+	int ret = 0;
+	struct p4tc_pipeline *pipeline;
+	struct p4tc_act *act;
+
+	if (n->nlmsg_type == RTM_DELP4TEMPLATE)
+		pipeline =
tcf_pipeline_find_byany_unsealed(net, nl_pname->data, + pipeid, extack); + else + pipeline = tcf_pipeline_find_byany(net, nl_pname->data, pipeid, + extack); + if (IS_ERR(pipeline)) + return PTR_ERR(pipeline); + + if (nla) { + ret = nla_parse_nested(tb, P4TC_ACT_MAX, nla, p4tc_act_policy, + extack); + if (ret < 0) + return ret; + } + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (n->nlmsg_type == RTM_DELP4TEMPLATE && (n->nlmsg_flags & NLM_F_ROOT)) + return tcf_act_flush(skb, net, pipeline, extack); + + act = tcf_action_find_byanyattr(tb[P4TC_ACT_NAME], a_id, pipeline, + extack); + if (IS_ERR(act)) + return PTR_ERR(act); + + if (_tcf_act_fill_nlmsg(net, skb, act) < 0) { + NL_SET_ERR_MSG(extack, + "Failed to fill notification attributes for template action"); + return -EINVAL; + } + + if (n->nlmsg_type == RTM_DELP4TEMPLATE) { + ret = __tcf_act_put(net, pipeline, act, false, extack); + if (ret < 0) + goto out_nlmsg_trim; + } + + return 0; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return ret; +} + +static int tcf_act_put(struct net *net, struct p4tc_template_common *tmpl, + bool unconditional_purge, struct netlink_ext_ack *extack) +{ + struct p4tc_act *act = to_act(tmpl); + struct p4tc_pipeline *pipeline; + + pipeline = tcf_pipeline_find_byid(net, tmpl->p_id); + + return __tcf_act_put(net, pipeline, act, unconditional_purge, extack); +} + +static void p4tc_params_replace_many(struct idr *params_idr, + struct p4tc_act_param *params[], + int params_count) +{ + int i; + + for (i = 0; i < params_count; i++) { + struct p4tc_act_param *param = params[i]; + + param = idr_replace(params_idr, param, param->id); + kfree(param); + } +} + +static struct p4tc_act *tcf_act_create(struct net *net, struct nlattr **tb, + struct p4tc_pipeline *pipeline, u32 *ids, + struct netlink_ext_ack *extack) +{ + struct p4tc_act_param *params[P4TC_MSGBATCH_SIZE] = { NULL }; + 
u32 a_id = ids[P4TC_AID_IDX]; + int num_params = 0; + int ret = 0; + struct p4tc_act_dep_node *dep_node; + struct p4tc_act *act; + char *act_name; + + if (tb[P4TC_ACT_NAME]) { + act_name = nla_data(tb[P4TC_ACT_NAME]); + } else { + NL_SET_ERR_MSG(extack, "Must supply action name"); + return ERR_PTR(-EINVAL); + } + + if ((tcf_action_find_byname(act_name, pipeline))) { + NL_SET_ERR_MSG(extack, "Action already exists with same name"); + return ERR_PTR(-EEXIST); + } + + if (tcf_action_find_byid(pipeline, a_id)) { + NL_SET_ERR_MSG(extack, "Action already exists with same id"); + return ERR_PTR(-EEXIST); + } + + act = kzalloc(sizeof(*act), GFP_KERNEL); + if (!act) + return ERR_PTR(-ENOMEM); + + act->ops.owner = THIS_MODULE; + act->ops.act = tcf_p4_dyna_act; + act->ops.dump = tcf_p4_dyna_dump; + act->ops.cleanup = tcf_p4_dyna_cleanup; + act->ops.init_ops = tcf_p4_dyna_init; + act->ops.lookup = tcf_p4_dyna_lookup; + act->ops.walk = tcf_p4_dyna_walker; + act->ops.size = sizeof(struct tcf_p4act); + INIT_LIST_HEAD(&act->head); + + act->tn = kzalloc(sizeof(*act->tn), GFP_KERNEL); + if (!act->tn) { + ret = -ENOMEM; + goto free_act_ops; + } + + ret = tc_action_net_init(net, act->tn, &act->ops); + if (ret < 0) { + kfree(act->tn); + goto free_act_ops; + } + act->tn->ops = &act->ops; + + snprintf(act->ops.kind, ACTNAMSIZ, "%s/%s", pipeline->common.name, + act_name); + + if (a_id) { + ret = idr_alloc_u32(&pipeline->p_act_idr, act, &a_id, a_id, + GFP_KERNEL); + if (ret < 0) { + NL_SET_ERR_MSG(extack, "Unable to alloc action id"); + goto free_action_net; + } + + act->a_id = a_id; + } else { + act->a_id = 1; + + ret = idr_alloc_u32(&pipeline->p_act_idr, act, &act->a_id, + UINT_MAX, GFP_KERNEL); + if (ret < 0) { + NL_SET_ERR_MSG(extack, "Unable to alloc action id"); + goto free_action_net; + } + } + + dep_node = kzalloc(sizeof(*dep_node), GFP_KERNEL); + if (!dep_node) { + ret = -ENOMEM; + goto idr_rm; + } + dep_node->act_id = act->a_id; + INIT_LIST_HEAD(&dep_node->incoming_egde_list); + 
list_add_tail(&dep_node->head, &pipeline->act_dep_graph); + + refcount_set(&act->ops.dyn_ref, 1); + ret = __tcf_register_action(&act->ops); + if (ret < 0) { + NL_SET_ERR_MSG(extack, + "Unable to register new action template"); + goto free_dep_node; + } + + num_params = p4_act_init(act, tb[P4TC_ACT_PARMS], params, extack); + if (num_params < 0) { + ret = num_params; + goto unregister; + } + + INIT_LIST_HEAD(&act->cmd_operations); + act->pipeline = pipeline; + + pipeline->num_created_acts++; + + ret = determine_act_topological_order(pipeline, true); + if (ret < 0) { + pipeline->num_created_acts--; + goto uninit; + } + + act->common.p_id = pipeline->common.p_id; + snprintf(act->common.name, ACTNAMSIZ, "%s/%s", pipeline->common.name, + act_name); + act->common.ops = (struct p4tc_template_ops *)&p4tc_act_ops; + + refcount_set(&act->a_ref, 1); + + list_add_tail(&act->head, &dynact_list); + + return act; + +uninit: + p4_put_many_params(&act->params_idr, params, num_params); + idr_destroy(&act->params_idr); + +unregister: + rtnl_unlock(); + __tcf_unregister_action(&act->ops); + rtnl_lock(); + +free_dep_node: + list_del(&dep_node->head); + kfree(dep_node); + +idr_rm: + idr_remove(&pipeline->p_act_idr, act->a_id); + +free_action_net: + p4tc_action_net_exit(act->tn); + +free_act_ops: + kfree(act); + + return ERR_PTR(ret); +} + +static struct p4tc_act *tcf_act_update(struct net *net, struct nlattr **tb, + struct p4tc_pipeline *pipeline, u32 *ids, + u32 flags, + struct netlink_ext_ack *extack) +{ + struct p4tc_act_param *params[P4TC_MSGBATCH_SIZE] = { NULL }; + const u32 a_id = ids[P4TC_AID_IDX]; + int num_params = 0; + s8 active = -1; + int ret = 0; + struct p4tc_act *act; + + act = tcf_action_find_byanyattr(tb[P4TC_ACT_NAME], a_id, pipeline, + extack); + if (IS_ERR(act)) + return act; + + if (tb[P4TC_ACT_ACTIVE]) + active = *((u8 *)nla_data(tb[P4TC_ACT_ACTIVE])); + + if (act->active) { + if (!active) { + if (refcount_read(&act->ops.dyn_ref) > 1) { + NL_SET_ERR_MSG(extack, + 
"Unable to inactivate referenced action"); + return ERR_PTR(-EINVAL); + } + act->active = false; + return act; + } + NL_SET_ERR_MSG(extack, "Unable to update active action"); + return ERR_PTR(-EINVAL); + } + + if (tb[P4TC_ACT_PARMS]) { + num_params = p4_act_init_params(act, tb[P4TC_ACT_PARMS], params, + true, extack); + if (num_params < 0) { + ret = num_params; + goto out; + } + } + + act->pipeline = pipeline; + if (active == 1) { + act->active = true; + } else if (!active) { + NL_SET_ERR_MSG(extack, "Action is already inactive"); + ret = -EINVAL; + goto params_del; + } + + if (tb[P4TC_ACT_CMDS_LIST]) { + ret = determine_act_topological_order(pipeline, true); + if (ret < 0) + goto params_del; + } + + p4tc_params_replace_many(&act->params_idr, params, num_params); + return act; + +params_del: + p4_put_many_params(&act->params_idr, params, num_params); + +out: + return ERR_PTR(ret); +} + +static struct p4tc_template_common * +tcf_act_cu(struct net *net, struct nlmsghdr *n, struct nlattr *nla, + struct p4tc_nl_pname *nl_pname, u32 *ids, + struct netlink_ext_ack *extack) +{ + const u32 pipeid = ids[P4TC_PID_IDX]; + struct nlattr *tb[P4TC_ACT_MAX + 1]; + struct p4tc_act *act; + struct p4tc_pipeline *pipeline; + int ret; + + pipeline = tcf_pipeline_find_byany_unsealed(net, nl_pname->data, pipeid, + extack); + if (IS_ERR(pipeline)) + return (void *)pipeline; + + ret = nla_parse_nested(tb, P4TC_ACT_MAX, nla, p4tc_act_policy, extack); + if (ret < 0) + return ERR_PTR(ret); + + if (n->nlmsg_flags & NLM_F_REPLACE) + act = tcf_act_update(net, tb, pipeline, ids, n->nlmsg_flags, + extack); + else + act = tcf_act_create(net, tb, pipeline, ids, extack); + if (IS_ERR(act)) + goto out; + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + +out: + return (struct p4tc_template_common *)act; +} + +static int tcf_act_dump(struct sk_buff *skb, struct p4tc_dump_ctx *ctx, + 
struct nlattr *nla, char **p_name, u32 *ids, + struct netlink_ext_ack *extack) +{ + struct net *net = sock_net(skb->sk); + struct p4tc_pipeline *pipeline; + + if (!ctx->ids[P4TC_PID_IDX]) { + pipeline = tcf_pipeline_find_byany(net, *p_name, + ids[P4TC_PID_IDX], extack); + if (IS_ERR(pipeline)) + return PTR_ERR(pipeline); + ctx->ids[P4TC_PID_IDX] = pipeline->common.p_id; + } else { + pipeline = tcf_pipeline_find_byid(net, ctx->ids[P4TC_PID_IDX]); + } + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (!(*p_name)) + *p_name = pipeline->common.name; + + return tcf_p4_tmpl_generic_dump(skb, ctx, &pipeline->p_act_idr, + P4TC_AID_IDX, extack); +} + +static int tcf_act_dump_1(struct sk_buff *skb, + struct p4tc_template_common *common) +{ + struct nlattr *param = nla_nest_start(skb, P4TC_PARAMS); + unsigned char *b = nlmsg_get_pos(skb); + struct p4tc_act *act = to_act(common); + struct nlattr *nest; + + if (!param) + goto out_nlmsg_trim; + + if (nla_put_string(skb, P4TC_ACT_NAME, act->common.name)) + goto out_nlmsg_trim; + + nest = nla_nest_start(skb, P4TC_ACT_CMDS_LIST); + nla_nest_end(skb, nest); + + if (nla_put_u8(skb, P4TC_ACT_ACTIVE, act->active)) + goto out_nlmsg_trim; + + nla_nest_end(skb, param); + + return 0; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return -ENOMEM; +} + +const struct p4tc_template_ops p4tc_act_ops = { + .init = NULL, + .cu = tcf_act_cu, + .put = tcf_act_put, + .gd = tcf_act_gd, + .fill_nlmsg = tcf_act_fill_nlmsg, + .dump = tcf_act_dump, + .dump_1 = tcf_act_dump_1, +}; diff --git a/net/sched/p4tc/p4tc_pipeline.c b/net/sched/p4tc/p4tc_pipeline.c index 6fc7bd49d..e43e120a3 100644 --- a/net/sched/p4tc/p4tc_pipeline.c +++ b/net/sched/p4tc/p4tc_pipeline.c @@ -77,10 +77,226 @@ static const struct nla_policy tc_pipeline_policy[P4TC_PIPELINE_MAX + 1] = { [P4TC_PIPELINE_POSTACTIONS] = { .type = NLA_NESTED }, }; +static void __act_dep_graph_free(struct list_head *incoming_egde_list) +{ + struct p4tc_act_dep_edge_node 
*cursor_edge, *tmp_edge; + + list_for_each_entry_safe(cursor_edge, tmp_edge, incoming_egde_list, + head) { + list_del(&cursor_edge->head); + kfree(cursor_edge); + } +} + +static void act_dep_graph_free(struct list_head *graph) +{ + struct p4tc_act_dep_node *cursor, *tmp; + + list_for_each_entry_safe(cursor, tmp, graph, head) { + __act_dep_graph_free(&cursor->incoming_egde_list); + + list_del(&cursor->head); + kfree(cursor); + } +} + +void tcf_pipeline_delete_from_dep_graph(struct p4tc_pipeline *pipeline, + struct p4tc_act *act) +{ + struct p4tc_act_dep_node *act_node, *node_tmp; + + list_for_each_entry_safe(act_node, node_tmp, &pipeline->act_dep_graph, + head) { + if (act_node->act_id == act->a_id) { + __act_dep_graph_free(&act_node->incoming_egde_list); + list_del(&act_node->head); + kfree(act_node); + } + } + + list_for_each_entry_safe(act_node, node_tmp, + &pipeline->act_topological_order, head) { + if (act_node->act_id == act->a_id) { + list_del(&act_node->head); + kfree(act_node); + } + } +} + +/* Node id indicates the callee's act id. + * edge_node->act_id indicates the caller's act id. + */ +void tcf_pipeline_add_dep_edge(struct p4tc_pipeline *pipeline, + struct p4tc_act_dep_edge_node *edge_node, + u32 node_id) +{ + struct p4tc_act_dep_node *cursor; + + list_for_each_entry(cursor, &pipeline->act_dep_graph, head) { + if (cursor->act_id == node_id) + break; + } + + list_add_tail(&edge_node->head, &cursor->incoming_egde_list); +} + +/* Find root node, that is, the node in our graph that has no incoming edges. + */ +struct p4tc_act_dep_node *find_root_node(struct list_head *act_dep_graph) +{ + struct p4tc_act_dep_node *cursor, *root_node; + + list_for_each_entry(cursor, act_dep_graph, head) { + if (list_empty(&cursor->incoming_egde_list)) { + root_node = cursor; + return root_node; + } + } + + return NULL; +} + +/* node_id indicates where the edge is directed to + * edge_node->act_id indicates where the edge comes from. 
+ */
+bool tcf_pipeline_check_act_backedge(struct p4tc_pipeline *pipeline,
+				     struct p4tc_act_dep_edge_node *edge_node,
+				     u32 node_id)
+{
+	struct p4tc_act_dep_node *root_node = NULL;
+
+	/* make sure we don't call ourselves */
+	if (edge_node->act_id == node_id)
+		return true;
+
+	/* add to the list temporarily so we can run our algorithm to
+	 * find an edgeless node and detect a cycle
+	 */
+	tcf_pipeline_add_dep_edge(pipeline, edge_node, node_id);
+
+	/* Now let's try to find a node which has no incoming edges (root node).
+	 * If we find a root node it means there is no cycle;
+	 * OTOH, if we don't find one, it means we have a circular dependency.
+	 */
+	root_node = find_root_node(&pipeline->act_dep_graph);
+
+	if (!root_node)
+		return true;
+
+	list_del(&edge_node->head);
+
+	return false;
+}
+
+static struct p4tc_act_dep_node *
+find_and_del_root_node(struct list_head *act_dep_graph)
+{
+	struct p4tc_act_dep_node *cursor, *tmp, *root_node;
+
+	root_node = find_root_node(act_dep_graph);
+	list_del(&root_node->head);
+
+	list_for_each_entry_safe(cursor, tmp, act_dep_graph, head) {
+		struct p4tc_act_dep_edge_node *cursor_edge, *tmp_edge;
+
+		list_for_each_entry_safe(cursor_edge, tmp_edge,
+					 &cursor->incoming_egde_list, head) {
+			if (cursor_edge->act_id == root_node->act_id) {
+				list_del(&cursor_edge->head);
+				kfree(cursor_edge);
+			}
+		}
+	}
+
+	return root_node;
+}
+
+static int act_dep_graph_copy(struct list_head *new_graph,
+			      struct list_head *old_graph)
+{
+	int err = -ENOMEM;
+	struct p4tc_act_dep_node *cursor, *tmp;
+
+	list_for_each_entry_safe(cursor, tmp, old_graph, head) {
+		struct p4tc_act_dep_edge_node *cursor_edge, *tmp_edge;
+		struct p4tc_act_dep_node *new_dep_node;
+
+		new_dep_node = kzalloc(sizeof(*new_dep_node), GFP_KERNEL);
+		if (!new_dep_node)
+			goto free_graph;
+
+		INIT_LIST_HEAD(&new_dep_node->incoming_egde_list);
+		list_add_tail(&new_dep_node->head, new_graph);
+		new_dep_node->act_id = cursor->act_id;
+
+		list_for_each_entry_safe(cursor_edge, tmp_edge,
&cursor->incoming_egde_list, head) { + struct p4tc_act_dep_edge_node *new_dep_edge_node; + + new_dep_edge_node = + kzalloc(sizeof(*new_dep_edge_node), GFP_KERNEL); + if (!new_dep_edge_node) + goto free_graph; + + list_add_tail(&new_dep_edge_node->head, + &new_dep_node->incoming_egde_list); + new_dep_edge_node->act_id = cursor_edge->act_id; + } + } + + return 0; + +free_graph: + act_dep_graph_free(new_graph); + return err; +} + +int determine_act_topological_order(struct p4tc_pipeline *pipeline, + bool copy_dep_graph) +{ + int i = pipeline->num_created_acts; + struct p4tc_act_dep_node *act_node, *node_tmp; + struct p4tc_act_dep_node *node; + struct list_head *dep_graph; + + if (copy_dep_graph) { + int err; + + dep_graph = kzalloc(sizeof(*dep_graph), GFP_KERNEL); + if (!dep_graph) + return -ENOMEM; + + INIT_LIST_HEAD(dep_graph); + err = act_dep_graph_copy(dep_graph, &pipeline->act_dep_graph); + if (err < 0) + return err; + } else { + dep_graph = &pipeline->act_dep_graph; + } + + /* Clear from previous calls */ + list_for_each_entry_safe(act_node, node_tmp, + &pipeline->act_topological_order, head) { + list_del(&act_node->head); + kfree(act_node); + } + + while (i--) { + node = find_and_del_root_node(dep_graph); + list_add_tail(&node->head, &pipeline->act_topological_order); + } + + if (copy_dep_graph) + kfree(dep_graph); + + return 0; +} + static void tcf_pipeline_destroy(struct p4tc_pipeline *pipeline, bool free_pipeline) { idr_destroy(&pipeline->p_meta_idr); + idr_destroy(&pipeline->p_act_idr); if (free_pipeline) kfree(pipeline); @@ -106,21 +322,15 @@ static int tcf_pipeline_put(struct net *net, struct p4tc_pipeline_net *pipe_net = net_generic(net, pipeline_net_id); struct p4tc_pipeline *pipeline = to_pipeline(template); struct net *pipeline_net = maybe_get_net(net); - struct p4tc_metadata *meta; + struct p4tc_act_dep_node *act_node, *node_tmp; unsigned long m_id, tmp; + struct p4tc_metadata *meta; if (pipeline_net && !refcount_dec_if_one(&pipeline->p_ref)) { 
NL_SET_ERR_MSG(extack, "Can't delete referenced pipeline"); return -EBUSY; } - idr_remove(&pipe_net->pipeline_idr, pipeline->common.p_id); - if (pipeline->parser) - tcf_parser_del(net, pipeline, pipeline->parser, extack); - - idr_for_each_entry_ul(&pipeline->p_meta_idr, meta, tmp, m_id) - meta->common.ops->put(net, &meta->common, true, extack); - /* XXX: The action fields are only accessed in the control path * since they will be copied to the filter, where the data path * will use them. So there is no need to free them in the rcu @@ -129,6 +339,26 @@ static int tcf_pipeline_put(struct net *net, p4tc_action_destroy(pipeline->preacts); p4tc_action_destroy(pipeline->postacts); + act_dep_graph_free(&pipeline->act_dep_graph); + + list_for_each_entry_safe(act_node, node_tmp, + &pipeline->act_topological_order, head) { + struct p4tc_act *act; + + act = tcf_action_find_byid(pipeline, act_node->act_id); + act->common.ops->put(net, &act->common, true, extack); + list_del(&act_node->head); + kfree(act_node); + } + + idr_for_each_entry_ul(&pipeline->p_meta_idr, meta, tmp, m_id) + meta->common.ops->put(net, &meta->common, true, extack); + + if (pipeline->parser) + tcf_parser_del(net, pipeline, pipeline->parser, extack); + + idr_remove(&pipe_net->pipeline_idr, pipeline->common.p_id); + if (pipeline_net) call_rcu(&pipeline->rcu, tcf_pipeline_destroy_rcu); else @@ -159,26 +389,13 @@ static inline int pipeline_try_set_state_ready(struct p4tc_pipeline *pipeline, return -EINVAL; } + /* Will never fail in this case */ + determine_act_topological_order(pipeline, false); + pipeline->p_state = P4TC_STATE_READY; return true; } -static int p4tc_action_init(struct net *net, struct nlattr *nla, - struct tc_action *acts[], u32 pipeid, u32 flags, - struct netlink_ext_ack *extack) -{ - int init_res[TCA_ACT_MAX_PRIO]; - size_t attrs_size; - int ret; - - /* If action was already created, just bind to existing one*/ - flags = TCA_ACT_FLAGS_BIND; - ret = tcf_action_init(net, NULL, nla, NULL, acts, 
init_res, &attrs_size, - flags, 0, extack); - - return ret; -} - struct p4tc_pipeline *tcf_pipeline_find_byid(struct net *net, const u32 pipeid) { struct p4tc_pipeline_net *pipe_net; @@ -323,9 +540,15 @@ static struct p4tc_pipeline *tcf_pipeline_create(struct net *net, pipeline->parser = NULL; + idr_init(&pipeline->p_act_idr); + idr_init(&pipeline->p_meta_idr); pipeline->p_meta_offset = 0; + INIT_LIST_HEAD(&pipeline->act_dep_graph); + INIT_LIST_HEAD(&pipeline->act_topological_order); + pipeline->num_created_acts = 0; + pipeline->p_state = P4TC_STATE_NOT_READY; pipeline->net = net; @@ -658,7 +881,8 @@ static int tcf_pipeline_gd(struct net *net, struct sk_buff *skb, return PTR_ERR(pipeline); tmpl = (struct p4tc_template_common *)pipeline; - if (tcf_pipeline_fill_nlmsg(net, skb, tmpl, extack) < 0) + ret = tcf_pipeline_fill_nlmsg(net, skb, tmpl, extack); + if (ret < 0) return -1; if (!ids[P4TC_PID_IDX]) diff --git a/net/sched/p4tc/p4tc_tmpl_api.c b/net/sched/p4tc/p4tc_tmpl_api.c index 325b56d2e..2296ae97b 100644 --- a/net/sched/p4tc/p4tc_tmpl_api.c +++ b/net/sched/p4tc/p4tc_tmpl_api.c @@ -44,6 +44,7 @@ static bool obj_is_valid(u32 obj) case P4TC_OBJ_PIPELINE: case P4TC_OBJ_META: case P4TC_OBJ_HDR_FIELD: + case P4TC_OBJ_ACT: return true; default: return false; @@ -54,6 +55,7 @@ static const struct p4tc_template_ops *p4tc_ops[P4TC_OBJ_MAX] = { [P4TC_OBJ_PIPELINE] = &p4tc_pipeline_ops, [P4TC_OBJ_META] = &p4tc_meta_ops, [P4TC_OBJ_HDR_FIELD] = &p4tc_hdrfield_ops, + [P4TC_OBJ_ACT] = &p4tc_act_ops, }; int tcf_p4_tmpl_generic_dump(struct sk_buff *skb, struct p4tc_dump_ctx *ctx,
From patchwork Tue Jan 24 17:05:06 2023
X-Patchwork-Submitter: Jamal Hadi Salim
X-Patchwork-Id: 13114395
X-Patchwork-Delegate: kuba@kernel.org
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Subject: [PATCH net-next RFC 16/20] p4tc: add table create, update, delete, get, flush and dump
Date: Tue, 24 Jan 2023 12:05:06 -0500
Message-Id: <20230124170510.316970-16-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>
References: <20230124170510.316970-1-jhs@mojatatu.com>
X-Patchwork-State: RFC
This commit introduces code to create and maintain P4 tables within a P4 program from user space; the next patch will have the code for maintaining entries in the table.
As with all other P4TC objects, tables conform to CRUD operations; it's important to note that write operations, such as create, update and delete, can only be made if the pipeline is not sealed. Per the P4 specification, tables prefix their name with the control block (although this could be overridden by P4 annotations). As an example, if one were to create a table named table1 in a pipeline named myprog1, on control block "mycontrol", one would use the following command:

tc p4template create table/myprog1/mycontrol/table1 tblid 1 \
	keysz 32 nummasks 8 tentries 8192

The above says that we are creating a table called table1 attached to pipeline myprog1 on control block mycontrol. Its key size is 32 bits and it can have up to 8 masks and 8192 entries. The table id for table1 is 1; the table id is typically provided by the compiler. Parameters such as nummasks (number of masks this table may have) and tentries (maximum number of entries this table may have) may also be omitted, in which case 8 masks and 256 entries will be assumed.

If one were to get table1 by name (before or after the pipeline is sealed) one would use the following command:

tc p4template get table/myprog1/mycontrol/table1

If one were to dump all the tables from a pipeline named myprog1, one would use the following command:

tc p4template get table/myprog1

If one were to update table1 (before the pipeline is sealed) one would use the following command:

tc p4template update table/myprog1/mycontrol/table1 ....

If one were to delete table1 (before the pipeline is sealed) one would use the following command:

tc p4template del table/myprog1/mycontrol/table1

If one were to flush all the tables from a pipeline named myprog1, control block "mycontrol", one would use the following command:

tc p4template del table/myprog1/mycontrol/

___Table Permissions___

Tables can have permissions which apply to all the entries in the specified table.
Permissions are defined both for what the control plane (user space) is allowed to do and for what the data path is allowed to do. The permissions field is a 16-bit value which holds CRUDX (create, read, update, delete and execute) permissions for the control and data paths. Bits 9-5 hold the CRUDX values for the control path and bits 4-0 hold the CRUDX values for the data path. By default each table has the following permissions:

CRUD--R--X

which means the control plane can perform CRUD operations whereas the data path can only Read and eXecute on the entries. The user can override these permissions when creating or updating the table. For example, the following command will create a table which will not allow the data path to create, update or delete entries, while giving the control plane create, read and delete (but not update) permissions:

$TC p4template create table/aP4proggie/cb/tname tblid 1 keysz 64 permissions 0x349 ...

Recall that these permissions come in the form CRUDXCRUDX, where the first CRUDX block is for control and the last is for the data path. So 0x349 is equivalent to CR-D--R--X.

If we were to do a get with the following command:

$TC p4template get table/aP4proggie/cb/tname

The output would be the following:

pipeline name aP4proggie
pipeline id 22
table id 1
table name cb/tname
key_sz 64
max entries 256
masks 8
table entries 0
permissions CR-D--R--X

Note, the permissions concept is more powerful than the classical const definition currently taken by P4, which makes everything in a table read-only.

___Initial Table Entries___

Templating can create initial table entries. For example:

tc p4template update table/myprog/cb/tname \
	entry srcAddr 10.10.10.10/24 dstAddr 1.1.1.0/24 prio 17

In this command we are "updating" table cb/tname with a new entry. This entry has as its keys srcAddr and dstAddr (both IPv4 addresses) and prio 17.
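Permission values like 0x349 and strings like CRUD--R--X above map to each other mechanically. Below is a minimal user-space sketch of that decoding; the helper `perms_to_str` is hypothetical (not part of this patch), but the bit layout it assumes — control-plane CRUDX in bits 9-5, data-path CRUDX in bits 4-0 — follows the uapi definitions added by this series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode a 10-bit P4TC permissions word into the |crudxcrudx| notation
 * used in this series: bits 9-5 are the control-plane CRUDX block,
 * bits 4-0 the data-path CRUDX block. Illustrative helper only. */
static void perms_to_str(uint16_t perms, char out[11])
{
	static const char letters[] = "CRUDXCRUDX";
	int bit;

	for (bit = 9; bit >= 0; bit--)
		out[9 - bit] = (perms & (1u << bit)) ? letters[9 - bit] : '-';
	out[10] = '\0';
}
```

With this, 0x349 decodes to CR-D--R--X, matching the dump shown above.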
If one were to read back the entry they would get:

pipeline id 22
table id 1
table name cb/tname
key_sz 64
max entries 256
masks 8
default key 1
table entries 1
permissions CRUD--R--X
entry:
 table id 1 entry priority 17
 key blob 101010a0a0a0a mask blob ffffff00ffffff
 create whodunnit tc permissions -RUD--R--X

___Table Actions List___

P4 tables allow certain actions but not others to be part of a match entry on a table or as default actions when there is a miss. We also allow flags for each of the actions in this list that specify if the action can be added only as a table entry (tableonly), or only as a default action (defaultonly). If no flags are specified, it is assumed that the action can be used in both contexts.

In P4TC we extend the concept of default action - which in P4 is mapped to "a default miss action". Our extension is to add a "hit action" which is executed every time there is a hit. The default miss action will be executed whenever a table lookup doesn't match any of the entries. Both default hit and default miss are optional.

An example of specifying a default miss action is as follows:

tc p4template update table/myprog/cb/mytable \
	default_miss_action permissions 0x109 action drop

The above will drop packets if the entry is not found in mytable. Note that the above makes the default action a const, meaning the control plane can neither replace it nor delete it.

tc p4template update table/myprog/mytable \
	default_hit_action permissions 0x30F action ok

Whereas the above allows a default hit action to accept the packet. The permission 0x30F in binary is (1100001111), which means we have only Create and Read permissions in the control plane and Read, Update, Delete and eXecute permissions in the data plane. This means, for example, that now we can only delete the default hit action from the data plane.

___Packet Flow___

As with the pipeline, we also have preactions and postactions for tables which can be programmed to teach the kernel how to process the packet.
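The default hit/miss semantics described above — the matched entry's action on a hit, the default hit action only when the matched entry carries no action, the default miss action on a lookup miss — can be sketched in a few lines. This is an illustrative stand-in, not the kernel data path: `struct act` and `resolve_action()` are hypothetical names, whereas the real code deals in `struct tc_action` arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tc action. */
struct act { const char *kind; };

/* Pick the action to run after a table lookup, per the semantics in
 * this commit message: a hit runs the entry's action, falling back to
 * the default hit action when the entry has none; a miss runs the
 * default miss action. Either default may be absent (NULL). */
static const struct act *resolve_action(int hit, const struct act *entry_act,
					const struct act *default_hit,
					const struct act *default_miss)
{
	if (hit)
		return entry_act ? entry_act : default_hit;
	return default_miss;
}
```

This corresponds to steps 4a and 4b of the table apply() flow described next.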
Both preactions and postactions are optional. When a table apply() cmd is invoked on a table:

1) The table preaction, if present, is invoked
2) A "key action" is invoked to construct the table key
3) A table lookup is done
4a) If there is a hit, the matched entry's action will be executed;
    if there was a match but the entry has no action and a default hit
    action has been specified, then the default hit action will be executed.
4b) If there was a miss and a default miss action was specified, it will be executed
5) If there is a table post action, it is invoked next

Example of how one would create a key action for a table:

tc p4template create action/myprog/mytable/tkey \
	cmd set key.myprog.cb/mytable \
	hdrfield.myprog.parser1.ipv4.dstAddr

and now bind the key action to the table "mytable":

$TC p4template update table/myprog/cb/mytable \
	key action myprog/mytable/tkey

Example of how one would create a table post action:

tc p4template create action/myprog/mytable/T_mytable_POA \
	cmd print prefix T_mytable_POA_res results.hit \
	cmd print prefix T_mytable_POA hdrfield.myprog.parser1.ipv4.dstAddr

Activate it:

tc p4template update action/myprog/mytable/T_mytable_POA state active

Bind it:
$TC p4template update table/myprog/cb/mytable postactions \ action myprog/mytable/T_mytable_POA Co-developed-by: Victor Nogueira Signed-off-by: Victor Nogueira Co-developed-by: Pedro Tammela Signed-off-by: Pedro Tammela Signed-off-by: Jamal Hadi Salim --- include/net/p4tc.h | 88 ++ include/net/p4tc_types.h | 2 +- include/uapi/linux/p4tc.h | 103 +++ net/sched/p4tc/Makefile | 2 +- net/sched/p4tc/p4tc_pipeline.c | 15 +- net/sched/p4tc/p4tc_table.c | 1591 ++++++++++++++++++++++++++++++++ net/sched/p4tc/p4tc_tmpl_api.c | 2 + 7 files changed, 1800 insertions(+), 3 deletions(-) create mode 100644 net/sched/p4tc/p4tc_table.c diff --git a/include/net/p4tc.h b/include/net/p4tc.h index 09d4d85cf..58be4f96f 100644 --- a/include/net/p4tc.h +++ b/include/net/p4tc.h @@ -16,11 +16,18 @@ #define P4TC_DEFAULT_MAX_RULES 1 #define P4TC_MAXMETA_OFFSET 512 #define P4TC_PATH_MAX 3 +#define P4TC_MAX_TENTRIES (2 << 23) +#define P4TC_DEFAULT_TENTRIES 256 +#define P4TC_MAX_TMASKS 128 +#define P4TC_DEFAULT_TMASKS 8 + +#define P4TC_MAX_PERMISSION (GENMASK(P4TC_PERM_MAX_BIT, 0)) #define P4TC_KERNEL_PIPEID 0 #define P4TC_PID_IDX 0 #define P4TC_MID_IDX 1 +#define P4TC_TBLID_IDX 1 #define P4TC_AID_IDX 1 #define P4TC_PARSEID_IDX 1 #define P4TC_HDRFIELDID_IDX 2 @@ -101,6 +108,7 @@ struct p4tc_pipeline { struct p4tc_template_common common; struct idr p_meta_idr; struct idr p_act_idr; + struct idr p_tbl_idr; struct rcu_head rcu; struct net *net; struct p4tc_parser *parser; @@ -187,6 +195,66 @@ struct p4tc_metadata { extern const struct p4tc_template_ops p4tc_meta_ops; +struct p4tc_table_key { + struct tc_action **key_acts; + int key_num_acts; +}; + +#define P4TC_CONTROL_PERMISSIONS (GENMASK(9, 5)) +#define P4TC_DATA_PERMISSIONS (GENMASK(4, 0)) + +#define P4TC_TABLE_PERMISSIONS \ + ((GENMASK(P4TC_CTRL_PERM_C_BIT, P4TC_CTRL_PERM_D_BIT)) | \ + P4TC_DATA_PERM_R | P4TC_DATA_PERM_X) + +#define P4TC_PERMISSIONS_UNINIT (1 << P4TC_PERM_MAX_BIT) + +struct p4tc_table_defact { + struct tc_action **default_acts; + 
/* Will have 2 5 bits blocks containing CRUDX (Create, read, update, + * delete, execute) permissions for control plane and data plane. + * The first 5 bits are for control and the next five are for data plane. + * |crudxcrudx| if we were to denote it as UNIX permission flags. + */ + __u16 permissions; + struct rcu_head rcu; +}; + +struct p4tc_table_perm { + __u16 permissions; + struct rcu_head rcu; +}; + +struct p4tc_table { + struct p4tc_template_common common; + struct list_head tbl_acts_list; + struct p4tc_table_key *tbl_key; + struct idr tbl_masks_idr; + struct idr tbl_prio_idr; + struct rhltable tbl_entries; + struct tc_action **tbl_preacts; + struct tc_action **tbl_postacts; + struct p4tc_table_defact __rcu *tbl_default_hitact; + struct p4tc_table_defact __rcu *tbl_default_missact; + struct p4tc_table_perm __rcu *tbl_permissions; + spinlock_t tbl_masks_idr_lock; + spinlock_t tbl_prio_idr_lock; + int tbl_num_postacts; + int tbl_num_preacts; + u32 tbl_count; + u32 tbl_curr_count; + u32 tbl_keysz; + u32 tbl_id; + u32 tbl_max_entries; + u32 tbl_max_masks; + u32 tbl_curr_used_entries; + refcount_t tbl_ctrl_ref; + refcount_t tbl_ref; + refcount_t tbl_entries_ref; +}; + +extern const struct p4tc_template_ops p4tc_table_ops; + struct p4tc_ipv4_param_value { u32 value; u32 mask; @@ -242,6 +310,12 @@ struct p4tc_act { refcount_t a_ref; }; +struct p4tc_table_act { + struct list_head node; + struct tc_action_ops *ops; + u8 flags; +}; + extern const struct p4tc_template_ops p4tc_act_ops; extern const struct rhashtable_params p4tc_label_ht_params; extern const struct rhashtable_params acts_params; @@ -358,6 +432,19 @@ struct p4tc_act_param *tcf_param_find_byany(struct p4tc_act *act, const u32 param_id, struct netlink_ext_ack *extack); +struct p4tc_table *tcf_table_find_byany(struct p4tc_pipeline *pipeline, + const char *tblname, const u32 tbl_id, + struct netlink_ext_ack *extack); +struct p4tc_table *tcf_table_find_byid(struct p4tc_pipeline *pipeline, + const u32 tbl_id); 
+void *tcf_table_fetch(struct sk_buff *skb, void *tbl_value_ops); +int tcf_table_try_set_state_ready(struct p4tc_pipeline *pipeline, + struct netlink_ext_ack *extack); +struct p4tc_table *tcf_table_get(struct p4tc_pipeline *pipeline, + const char *tblname, const u32 tbl_id, + struct netlink_ext_ack *extack); +void tcf_table_put_ref(struct p4tc_table *table); + struct p4tc_parser *tcf_parser_create(struct p4tc_pipeline *pipeline, const char *parser_name, u32 parser_inst_id, @@ -413,5 +500,6 @@ int generic_dump_param_value(struct sk_buff *skb, struct p4tc_type *type, #define to_meta(t) ((struct p4tc_metadata *)t) #define to_hdrfield(t) ((struct p4tc_hdrfield *)t) #define to_act(t) ((struct p4tc_act *)t) +#define to_table(t) ((struct p4tc_table *)t) #endif diff --git a/include/net/p4tc_types.h b/include/net/p4tc_types.h index 038ad89e3..275e74f93 100644 --- a/include/net/p4tc_types.h +++ b/include/net/p4tc_types.h @@ -8,7 +8,7 @@ #include -#define P4T_MAX_BITSZ 128 +#define P4T_MAX_BITSZ P4TC_MAX_KEYSZ struct p4tc_type_mask_shift { void *mask; diff --git a/include/uapi/linux/p4tc.h b/include/uapi/linux/p4tc.h index 15876c471..678ee20cd 100644 --- a/include/uapi/linux/p4tc.h +++ b/include/uapi/linux/p4tc.h @@ -31,6 +31,61 @@ struct p4tcmsg { #define PARSERNAMSIZ TEMPLATENAMSZ #define HDRFIELDNAMSIZ TEMPLATENAMSZ #define ACTPARAMNAMSIZ TEMPLATENAMSZ +#define TABLENAMSIZ TEMPLATENAMSZ + +#define P4TC_TABLE_FLAGS_KEYSZ 0x01 +#define P4TC_TABLE_FLAGS_MAX_ENTRIES 0x02 +#define P4TC_TABLE_FLAGS_MAX_MASKS 0x04 +#define P4TC_TABLE_FLAGS_DEFAULT_KEY 0x08 +#define P4TC_TABLE_FLAGS_PERMISSIONS 0x10 + +#define P4TC_CTRL_PERM_C_BIT 9 +#define P4TC_CTRL_PERM_R_BIT 8 +#define P4TC_CTRL_PERM_U_BIT 7 +#define P4TC_CTRL_PERM_D_BIT 6 +#define P4TC_CTRL_PERM_X_BIT 5 + +#define P4TC_DATA_PERM_C_BIT 4 +#define P4TC_DATA_PERM_R_BIT 3 +#define P4TC_DATA_PERM_U_BIT 2 +#define P4TC_DATA_PERM_D_BIT 1 +#define P4TC_DATA_PERM_X_BIT 0 + +#define P4TC_PERM_MAX_BIT P4TC_CTRL_PERM_C_BIT + +#define 
P4TC_CTRL_PERM_C (1 << P4TC_CTRL_PERM_C_BIT) +#define P4TC_CTRL_PERM_R (1 << P4TC_CTRL_PERM_R_BIT) +#define P4TC_CTRL_PERM_U (1 << P4TC_CTRL_PERM_U_BIT) +#define P4TC_CTRL_PERM_D (1 << P4TC_CTRL_PERM_D_BIT) +#define P4TC_CTRL_PERM_X (1 << P4TC_CTRL_PERM_X_BIT) + +#define P4TC_DATA_PERM_C (1 << P4TC_DATA_PERM_C_BIT) +#define P4TC_DATA_PERM_R (1 << P4TC_DATA_PERM_R_BIT) +#define P4TC_DATA_PERM_U (1 << P4TC_DATA_PERM_U_BIT) +#define P4TC_DATA_PERM_D (1 << P4TC_DATA_PERM_D_BIT) +#define P4TC_DATA_PERM_X (1 << P4TC_DATA_PERM_X_BIT) + +#define p4tc_ctrl_create_ok(perm) (perm & P4TC_CTRL_PERM_C) +#define p4tc_ctrl_read_ok(perm) (perm & P4TC_CTRL_PERM_R) +#define p4tc_ctrl_update_ok(perm) (perm & P4TC_CTRL_PERM_U) +#define p4tc_ctrl_delete_ok(perm) (perm & P4TC_CTRL_PERM_D) +#define p4tc_ctrl_exec_ok(perm) (perm & P4TC_CTRL_PERM_X) + +#define p4tc_data_create_ok(perm) (perm & P4TC_DATA_PERM_C) +#define p4tc_data_read_ok(perm) (perm & P4TC_DATA_PERM_R) +#define p4tc_data_update_ok(perm) (perm & P4TC_DATA_PERM_U) +#define p4tc_data_delete_ok(perm) (perm & P4TC_DATA_PERM_D) +#define p4tc_data_exec_ok(perm) (perm & P4TC_DATA_PERM_X) + +struct p4tc_table_parm { + __u32 tbl_keysz; + __u32 tbl_max_entries; + __u32 tbl_max_masks; + __u32 tbl_flags; + __u32 tbl_num_entries; + __u16 tbl_permissions; + __u16 PAD0; +}; #define LABELNAMSIZ 32 @@ -63,6 +118,7 @@ enum { P4TC_OBJ_META, P4TC_OBJ_HDR_FIELD, P4TC_OBJ_ACT, + P4TC_OBJ_TABLE, __P4TC_OBJ_MAX, }; #define P4TC_OBJ_MAX __P4TC_OBJ_MAX @@ -161,6 +217,53 @@ enum { }; #define P4TC_KERNEL_META_MAX (__P4TC_KERNEL_META_MAX - 1) +/* Table key attributes */ +enum { + P4TC_KEY_UNSPEC, + P4TC_KEY_ACT, /* nested key actions */ + __P4TC_TKEY_MAX +}; +#define P4TC_TKEY_MAX __P4TC_TKEY_MAX + +enum { + P4TC_TABLE_DEFAULT_UNSPEC, + P4TC_TABLE_DEFAULT_ACTION, + P4TC_TABLE_DEFAULT_PERMISSIONS, + __P4TC_TABLE_DEFAULT_MAX +}; +#define P4TC_TABLE_DEFAULT_MAX (__P4TC_TABLE_DEFAULT_MAX - 1) + +enum { + P4TC_TABLE_ACTS_DEFAULT_ONLY, + 
P4TC_TABLE_ACTS_TABLE_ONLY, + __P4TC_TABLE_ACTS_FLAGS_MAX, +}; +#define P4TC_TABLE_ACTS_FLAGS_MAX (__P4TC_TABLE_ACTS_FLAGS_MAX - 1) + +enum { + P4TC_TABLE_ACT_UNSPEC, + P4TC_TABLE_ACT_FLAGS, /* u8 */ + P4TC_TABLE_ACT_NAME, /* string */ + __P4TC_TABLE_ACT_MAX +}; +#define P4TC_TABLE_ACT_MAX (__P4TC_TABLE_ACT_MAX - 1) + +/* Table type attributes */ +enum { + P4TC_TABLE_UNSPEC, + P4TC_TABLE_NAME, /* string */ + P4TC_TABLE_INFO, /* struct tc_p4_table_type_parm */ + P4TC_TABLE_PREACTIONS, /* nested table preactions */ + P4TC_TABLE_KEY, /* nested table key */ + P4TC_TABLE_POSTACTIONS, /* nested table postactions */ + P4TC_TABLE_DEFAULT_HIT, /* nested default hit action attributes */ + P4TC_TABLE_DEFAULT_MISS, /* nested default miss action attributes */ + P4TC_TABLE_OPT_ENTRY, /* nested const table entry*/ + P4TC_TABLE_ACTS_LIST, /* nested table actions list */ + __P4TC_TABLE_MAX +}; +#define P4TC_TABLE_MAX __P4TC_TABLE_MAX + struct p4tc_hdrfield_ty { __u16 startbit; __u16 endbit; diff --git a/net/sched/p4tc/Makefile b/net/sched/p4tc/Makefile index 3f7267366..de3a7b833 100644 --- a/net/sched/p4tc/Makefile +++ b/net/sched/p4tc/Makefile @@ -1,4 +1,4 @@ # SPDX-License-Identifier: GPL-2.0 obj-y := p4tc_types.o p4tc_pipeline.o p4tc_tmpl_api.o p4tc_meta.o \ - p4tc_parser_api.o p4tc_hdrfield.o p4tc_action.o + p4tc_parser_api.o p4tc_hdrfield.o p4tc_action.o p4tc_table.o diff --git a/net/sched/p4tc/p4tc_pipeline.c b/net/sched/p4tc/p4tc_pipeline.c index e43e120a3..854fc5b57 100644 --- a/net/sched/p4tc/p4tc_pipeline.c +++ b/net/sched/p4tc/p4tc_pipeline.c @@ -297,6 +297,7 @@ static void tcf_pipeline_destroy(struct p4tc_pipeline *pipeline, { idr_destroy(&pipeline->p_meta_idr); idr_destroy(&pipeline->p_act_idr); + idr_destroy(&pipeline->p_tbl_idr); if (free_pipeline) kfree(pipeline); @@ -323,8 +324,9 @@ static int tcf_pipeline_put(struct net *net, struct p4tc_pipeline *pipeline = to_pipeline(template); struct net *pipeline_net = maybe_get_net(net); struct p4tc_act_dep_node *act_node, 
*node_tmp; - unsigned long m_id, tmp; + unsigned long tbl_id, m_id, tmp; struct p4tc_metadata *meta; + struct p4tc_table *table; if (pipeline_net && !refcount_dec_if_one(&pipeline->p_ref)) { NL_SET_ERR_MSG(extack, "Can't delete referenced pipeline"); @@ -339,6 +341,9 @@ static int tcf_pipeline_put(struct net *net, p4tc_action_destroy(pipeline->preacts); p4tc_action_destroy(pipeline->postacts); + idr_for_each_entry_ul(&pipeline->p_tbl_idr, table, tmp, tbl_id) + table->common.ops->put(net, &table->common, true, extack); + act_dep_graph_free(&pipeline->act_dep_graph); list_for_each_entry_safe(act_node, node_tmp, @@ -371,6 +376,8 @@ static int tcf_pipeline_put(struct net *net, static inline int pipeline_try_set_state_ready(struct p4tc_pipeline *pipeline, struct netlink_ext_ack *extack) { + int ret; + if (pipeline->curr_tables != pipeline->num_tables) { NL_SET_ERR_MSG(extack, "Must have all table defined to update state to ready"); @@ -388,6 +395,9 @@ static inline int pipeline_try_set_state_ready(struct p4tc_pipeline *pipeline, "Must specify pipeline postactions before sealing"); return -EINVAL; } + ret = tcf_table_try_set_state_ready(pipeline, extack); + if (ret < 0) + return ret; /* Will never fail in this case */ determine_act_topological_order(pipeline, false); @@ -542,6 +552,9 @@ static struct p4tc_pipeline *tcf_pipeline_create(struct net *net, idr_init(&pipeline->p_act_idr); + idr_init(&pipeline->p_tbl_idr); + pipeline->curr_tables = 0; + idr_init(&pipeline->p_meta_idr); pipeline->p_meta_offset = 0; diff --git a/net/sched/p4tc/p4tc_table.c b/net/sched/p4tc/p4tc_table.c new file mode 100644 index 000000000..f793c70bc --- /dev/null +++ b/net/sched/p4tc/p4tc_table.c @@ -0,0 +1,1591 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * net/sched/p4tc_table.c P4 TC TABLE + * + * Copyright (c) 2022, Mojatatu Networks + * Copyright (c) 2022, Intel Corporation. 
+ * Authors: Jamal Hadi Salim + * Victor Nogueira + * Pedro Tammela + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define P4TC_P_UNSPEC 0 +#define P4TC_P_CREATED 1 + +static int tcf_key_try_set_state_ready(struct p4tc_table_key *key, + struct netlink_ext_ack *extack) +{ + if (!key->key_acts) { + NL_SET_ERR_MSG(extack, + "Table key must have actions before sealing pipelline"); + return -EINVAL; + } + + return 0; +} + +static int __tcf_table_try_set_state_ready(struct p4tc_table *table, + struct netlink_ext_ack *extack) +{ + if (!table->tbl_postacts) { + NL_SET_ERR_MSG(extack, + "All tables must have postactions before sealing pipelline"); + return -EINVAL; + } + + if (!table->tbl_key) { + NL_SET_ERR_MSG(extack, + "Table must have key before sealing pipeline"); + return -EINVAL; + } + + return tcf_key_try_set_state_ready(table->tbl_key, extack); +} + +int tcf_table_try_set_state_ready(struct p4tc_pipeline *pipeline, + struct netlink_ext_ack *extack) +{ + struct p4tc_table *table; + unsigned long tmp, id; + int ret; + + idr_for_each_entry_ul(&pipeline->p_tbl_idr, table, tmp, id) { + ret = __tcf_table_try_set_state_ready(table, extack); + if (ret < 0) + return ret; + } + + return 0; +} + +static const struct nla_policy p4tc_table_policy[P4TC_TABLE_MAX + 1] = { + [P4TC_TABLE_NAME] = { .type = NLA_STRING, .len = TABLENAMSIZ }, + [P4TC_TABLE_INFO] = { .type = NLA_BINARY, + .len = sizeof(struct p4tc_table_parm) }, + [P4TC_TABLE_PREACTIONS] = { .type = NLA_NESTED }, + [P4TC_TABLE_KEY] = { .type = NLA_NESTED }, + [P4TC_TABLE_POSTACTIONS] = { .type = NLA_NESTED }, + [P4TC_TABLE_DEFAULT_HIT] = { .type = NLA_NESTED }, + [P4TC_TABLE_DEFAULT_MISS] = { .type = NLA_NESTED }, + [P4TC_TABLE_ACTS_LIST] = { .type = NLA_NESTED }, + [P4TC_TABLE_OPT_ENTRY] = { .type = NLA_NESTED }, +}; + +static const struct nla_policy 
p4tc_table_key_policy[P4TC_MAXPARSE_KEYS + 1] = { + [P4TC_KEY_ACT] = { .type = NLA_NESTED }, +}; + +static int tcf_table_key_fill_nlmsg(struct sk_buff *skb, + struct p4tc_table_key *key) +{ + int ret = 0; + struct nlattr *nest_action; + + if (key->key_acts) { + nest_action = nla_nest_start(skb, P4TC_KEY_ACT); + ret = tcf_action_dump(skb, key->key_acts, 0, 0, false); + if (ret < 0) + return ret; + nla_nest_end(skb, nest_action); + } + + return ret; +} + +static int _tcf_table_fill_nlmsg(struct sk_buff *skb, struct p4tc_table *table) +{ + unsigned char *b = nlmsg_get_pos(skb); + int i = 1; + struct p4tc_table_perm *tbl_perm; + struct p4tc_table_act *table_act; + struct nlattr *nested_tbl_acts; + struct nlattr *default_missact; + struct nlattr *default_hitact; + struct nlattr *nested_count; + struct p4tc_table_parm parm; + struct nlattr *nest_key; + struct nlattr *nest; + struct nlattr *preacts; + struct nlattr *postacts; + int err; + + if (nla_put_u32(skb, P4TC_PATH, table->tbl_id)) + goto out_nlmsg_trim; + + nest = nla_nest_start(skb, P4TC_PARAMS); + if (!nest) + goto out_nlmsg_trim; + + if (nla_put_string(skb, P4TC_TABLE_NAME, table->common.name)) + goto out_nlmsg_trim; + + parm.tbl_keysz = table->tbl_keysz; + parm.tbl_max_entries = table->tbl_max_entries; + parm.tbl_max_masks = table->tbl_max_masks; + parm.tbl_num_entries = refcount_read(&table->tbl_entries_ref) - 1; + + tbl_perm = rcu_dereference_rtnl(table->tbl_permissions); + parm.tbl_permissions = tbl_perm->permissions; + + if (table->tbl_key) { + nest_key = nla_nest_start(skb, P4TC_TABLE_KEY); + err = tcf_table_key_fill_nlmsg(skb, table->tbl_key); + if (err < 0) + goto out_nlmsg_trim; + nla_nest_end(skb, nest_key); + } + + if (table->tbl_preacts) { + preacts = nla_nest_start(skb, P4TC_TABLE_PREACTIONS); + if (tcf_action_dump(skb, table->tbl_preacts, 0, 0, false) < 0) + goto out_nlmsg_trim; + nla_nest_end(skb, preacts); + } + + if (table->tbl_postacts) { + postacts = nla_nest_start(skb, 
P4TC_TABLE_POSTACTIONS); + if (tcf_action_dump(skb, table->tbl_postacts, 0, 0, false) < 0) + goto out_nlmsg_trim; + nla_nest_end(skb, postacts); + } + + if (table->tbl_default_hitact) { + struct p4tc_table_defact *hitact; + + default_hitact = nla_nest_start(skb, P4TC_TABLE_DEFAULT_HIT); + rcu_read_lock(); + hitact = rcu_dereference_rtnl(table->tbl_default_hitact); + if (hitact->default_acts) { + struct nlattr *nest; + + nest = nla_nest_start(skb, P4TC_TABLE_DEFAULT_ACTION); + if (tcf_action_dump(skb, hitact->default_acts, 0, 0, + false) < 0) { + rcu_read_unlock(); + goto out_nlmsg_trim; + } + nla_nest_end(skb, nest); + } + if (nla_put_u16(skb, P4TC_TABLE_DEFAULT_PERMISSIONS, + hitact->permissions) < 0) { + rcu_read_unlock(); + goto out_nlmsg_trim; + } + rcu_read_unlock(); + nla_nest_end(skb, default_hitact); + } + + if (table->tbl_default_missact) { + struct p4tc_table_defact *missact; + + default_missact = nla_nest_start(skb, P4TC_TABLE_DEFAULT_MISS); + rcu_read_lock(); + missact = rcu_dereference_rtnl(table->tbl_default_missact); + if (missact->default_acts) { + struct nlattr *nest; + + nest = nla_nest_start(skb, P4TC_TABLE_DEFAULT_ACTION); + if (tcf_action_dump(skb, missact->default_acts, 0, 0, + false) < 0) { + rcu_read_unlock(); + goto out_nlmsg_trim; + } + nla_nest_end(skb, nest); + } + if (nla_put_u16(skb, P4TC_TABLE_DEFAULT_PERMISSIONS, + missact->permissions) < 0) { + rcu_read_unlock(); + goto out_nlmsg_trim; + } + rcu_read_unlock(); + nla_nest_end(skb, default_missact); + } + + nested_tbl_acts = nla_nest_start(skb, P4TC_TABLE_ACTS_LIST); + list_for_each_entry(table_act, &table->tbl_acts_list, node) { + nested_count = nla_nest_start(skb, i); + if (nla_put_string(skb, P4TC_TABLE_ACT_NAME, + table_act->ops->kind) < 0) + goto out_nlmsg_trim; + if (nla_put_u32(skb, P4TC_TABLE_ACT_FLAGS, + table_act->flags) < 0) + goto out_nlmsg_trim; + + nla_nest_end(skb, nested_count); + i++; + } + nla_nest_end(skb, nested_tbl_acts); + + if (nla_put(skb, P4TC_TABLE_INFO, 
sizeof(parm), &parm)) + goto out_nlmsg_trim; + nla_nest_end(skb, nest); + + return skb->len; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return -1; +} + +static int tcf_table_fill_nlmsg(struct net *net, struct sk_buff *skb, + struct p4tc_template_common *template, + struct netlink_ext_ack *extack) +{ + struct p4tc_table *table = to_table(template); + + if (_tcf_table_fill_nlmsg(skb, table) <= 0) { + NL_SET_ERR_MSG(extack, + "Failed to fill notification attributes for table"); + return -EINVAL; + } + + return 0; +} + +static inline void tcf_table_key_put(struct p4tc_table_key *key) +{ + p4tc_action_destroy(key->key_acts); + kfree(key); +} + +static inline void p4tc_table_defact_destroy(struct p4tc_table_defact *defact) +{ + if (defact) { + p4tc_action_destroy(defact->default_acts); + kfree(defact); + } +} + +void tcf_table_acts_list_destroy(struct list_head *acts_list) +{ + struct p4tc_table_act *table_act, *tmp; + + list_for_each_entry_safe(table_act, tmp, acts_list, node) { + struct p4tc_act *act; + + act = container_of(table_act->ops, typeof(*act), ops); + list_del(&table_act->node); + kfree(table_act); + WARN_ON(!refcount_dec_not_one(&act->a_ref)); + } +} + +static inline int _tcf_table_put(struct net *net, struct nlattr **tb, + struct p4tc_pipeline *pipeline, + struct p4tc_table *table, + bool unconditional_purge, + struct netlink_ext_ack *extack) +{ + bool default_act_del = false; + struct p4tc_table_perm *perm; + + if (tb) + default_act_del = tb[P4TC_TABLE_DEFAULT_HIT] || + tb[P4TC_TABLE_DEFAULT_MISS]; + + if (!default_act_del) { + if (!unconditional_purge && + !refcount_dec_if_one(&table->tbl_ctrl_ref)) { + NL_SET_ERR_MSG(extack, + "Unable to delete referenced table"); + return -EBUSY; + } + + if (!unconditional_purge && + !refcount_dec_if_one(&table->tbl_ref)) { + refcount_set(&table->tbl_ctrl_ref, 1); + NL_SET_ERR_MSG(extack, + "Unable to delete referenced table"); + return -EBUSY; + } + } + + if (tb && tb[P4TC_TABLE_DEFAULT_HIT]) { + struct 
p4tc_table_defact *hitact; + + rcu_read_lock(); + hitact = rcu_dereference(table->tbl_default_hitact); + if (hitact && !p4tc_ctrl_delete_ok(hitact->permissions)) { + NL_SET_ERR_MSG(extack, + "Permission denied: Unable to delete default hitact"); + rcu_read_unlock(); + return -EPERM; + } + rcu_read_unlock(); + } + + if (tb && tb[P4TC_TABLE_DEFAULT_MISS]) { + struct p4tc_table_defact *missact; + + rcu_read_lock(); + missact = rcu_dereference(table->tbl_default_missact); + if (missact && !p4tc_ctrl_delete_ok(missact->permissions)) { + NL_SET_ERR_MSG(extack, + "Permission denied: Unable to delete default missact"); + rcu_read_unlock(); + return -EPERM; + } + rcu_read_unlock(); + } + + if (!default_act_del || tb[P4TC_TABLE_DEFAULT_HIT]) { + struct p4tc_table_defact *hitact; + + hitact = rtnl_dereference(table->tbl_default_hitact); + if (hitact) { + rcu_replace_pointer_rtnl(table->tbl_default_hitact, + NULL); + synchronize_rcu(); + p4tc_table_defact_destroy(hitact); + } + } + + if (!default_act_del || tb[P4TC_TABLE_DEFAULT_MISS]) { + struct p4tc_table_defact *missact; + + missact = rtnl_dereference(table->tbl_default_missact); + if (missact) { + rcu_replace_pointer_rtnl(table->tbl_default_missact, + NULL); + synchronize_rcu(); + p4tc_table_defact_destroy(missact); + } + } + + if (default_act_del) + return 0; + + if (table->tbl_key) + tcf_table_key_put(table->tbl_key); + + p4tc_action_destroy(table->tbl_preacts); + p4tc_action_destroy(table->tbl_postacts); + + tcf_table_acts_list_destroy(&table->tbl_acts_list); + + idr_destroy(&table->tbl_masks_idr); + idr_destroy(&table->tbl_prio_idr); + + perm = rcu_replace_pointer_rtnl(table->tbl_permissions, NULL); + kfree_rcu(perm, rcu); + + idr_remove(&pipeline->p_tbl_idr, table->tbl_id); + pipeline->curr_tables -= 1; + + kfree(table); + + return 0; +} + +static int tcf_table_put(struct net *net, struct p4tc_template_common *tmpl, + bool unconditional_purge, + struct netlink_ext_ack *extack) +{ + struct p4tc_pipeline *pipeline = + 
tcf_pipeline_find_byid(net, tmpl->p_id); + struct p4tc_table *table = to_table(tmpl); + + return _tcf_table_put(net, NULL, pipeline, table, unconditional_purge, + extack); +} + +static inline struct p4tc_table_key * +tcf_table_key_add(struct net *net, struct p4tc_table *table, struct nlattr *nla, + struct netlink_ext_ack *extack) +{ + int ret = 0; + struct nlattr *tb[P4TC_TKEY_MAX + 1]; + struct p4tc_table_key *key; + + ret = nla_parse_nested(tb, P4TC_TKEY_MAX, nla, p4tc_table_key_policy, + extack); + if (ret < 0) + goto out; + + key = kzalloc(sizeof(*key), GFP_KERNEL); + if (!key) { + NL_SET_ERR_MSG(extack, "Failed to allocate table key"); + ret = -ENOMEM; + goto out; + } + + if (tb[P4TC_KEY_ACT]) { + key->key_acts = kcalloc(TCA_ACT_MAX_PRIO, + sizeof(struct tc_action *), GFP_KERNEL); + if (!key->key_acts) { + ret = -ENOMEM; + goto free_key; + } + + ret = p4tc_action_init(net, tb[P4TC_KEY_ACT], key->key_acts, + table->common.p_id, 0, extack); + if (ret < 0) { + kfree(key->key_acts); + goto free_key; + } + key->key_num_acts = ret; + } + + return key; + +free_key: + kfree(key); + +out: + return ERR_PTR(ret); +} + +struct p4tc_table *tcf_table_find_byid(struct p4tc_pipeline *pipeline, + const u32 tbl_id) +{ + return idr_find(&pipeline->p_tbl_idr, tbl_id); +} + +static struct p4tc_table *table_find_byname(const char *tblname, + struct p4tc_pipeline *pipeline) +{ + struct p4tc_table *table; + unsigned long tmp, id; + + idr_for_each_entry_ul(&pipeline->p_tbl_idr, table, tmp, id) + if (strncmp(table->common.name, tblname, TABLENAMSIZ) == 0) + return table; + + return NULL; +} + +#define SEPARATOR '/' +struct p4tc_table *tcf_table_find_byany(struct p4tc_pipeline *pipeline, + const char *tblname, const u32 tbl_id, + struct netlink_ext_ack *extack) +{ + struct p4tc_table *table; + int err; + + if (tbl_id) { + table = tcf_table_find_byid(pipeline, tbl_id); + if (!table) { + NL_SET_ERR_MSG(extack, "Unable to find table by id"); + err = -EINVAL; + goto out; + } + } else { + if 
(tblname) { + table = table_find_byname(tblname, pipeline); + if (!table) { + NL_SET_ERR_MSG(extack, "Table name not found"); + err = -EINVAL; + goto out; + } + } else { + NL_SET_ERR_MSG(extack, "Must specify table name or id"); + err = -EINVAL; + goto out; + } + } + + return table; +out: + return ERR_PTR(err); +} + +struct p4tc_table *tcf_table_get(struct p4tc_pipeline *pipeline, + const char *tblname, const u32 tbl_id, + struct netlink_ext_ack *extack) +{ + struct p4tc_table *table; + + table = tcf_table_find_byany(pipeline, tblname, tbl_id, extack); + if (IS_ERR(table)) + return table; + + /* Should never be zero */ + WARN_ON(!refcount_inc_not_zero(&table->tbl_ref)); + return table; +} + +void tcf_table_put_ref(struct p4tc_table *table) +{ + /* Should never be zero */ + WARN_ON(!refcount_dec_not_one(&table->tbl_ref)); +} + +static int tcf_table_init_default_act(struct net *net, struct nlattr **tb, + struct p4tc_table_defact **default_act, + u32 pipeid, __u16 curr_permissions, + struct netlink_ext_ack *extack) +{ + int ret; + + *default_act = kzalloc(sizeof(**default_act), GFP_KERNEL); + if (!(*default_act)) + return -ENOMEM; + + if (tb[P4TC_TABLE_DEFAULT_PERMISSIONS]) { + __u16 *permissions; + + permissions = nla_data(tb[P4TC_TABLE_DEFAULT_PERMISSIONS]); + if (*permissions > P4TC_MAX_PERMISSION) { + NL_SET_ERR_MSG(extack, + "Permission may only have 10 bits turned on"); + ret = -EINVAL; + goto default_act_free; + } + if (!p4tc_data_exec_ok(*permissions)) { + NL_SET_ERR_MSG(extack, + "Default action must have data path execute permissions"); + ret = -EINVAL; + goto default_act_free; + } + (*default_act)->permissions = *permissions; + } else { + (*default_act)->permissions = curr_permissions; + } + + if (tb[P4TC_TABLE_DEFAULT_ACTION]) { + struct tc_action **default_acts; + + if (!p4tc_ctrl_update_ok(curr_permissions)) { + NL_SET_ERR_MSG(extack, + "Permission denied: Unable to update default hit action"); + ret = -EPERM; + goto default_act_free; + } + + 
default_acts = kcalloc(TCA_ACT_MAX_PRIO, + sizeof(struct tc_action *), GFP_KERNEL); + if (!default_acts) { + ret = -ENOMEM; + goto default_act_free; + } + + ret = p4tc_action_init(net, tb[P4TC_TABLE_DEFAULT_ACTION], + default_acts, pipeid, 0, extack); + if (ret < 0) { + kfree(default_acts); + goto default_act_free; + } else if (ret > 1) { + NL_SET_ERR_MSG(extack, "Can only have one hit action"); + tcf_action_destroy(default_acts, TCA_ACT_UNBIND); + kfree(default_acts); + ret = -EINVAL; + goto default_act_free; + } + (*default_act)->default_acts = default_acts; + } + + return 0; + +default_act_free: + kfree(*default_act); + + return ret; +} + +static int tcf_table_check_defacts(struct tc_action *defact, + struct list_head *acts_list) +{ + struct p4tc_table_act *table_act; + + list_for_each_entry(table_act, acts_list, node) { + if (table_act->ops->id == defact->ops->id && + !(table_act->flags & BIT(P4TC_TABLE_ACTS_TABLE_ONLY))) + return true; + } + + return false; +} + +static int +tcf_table_init_default_acts(struct net *net, struct nlattr **tb, + struct p4tc_table *table, + struct p4tc_table_defact **default_hitact, + struct p4tc_table_defact **default_missact, + struct list_head *acts_list, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb_default[P4TC_TABLE_DEFAULT_MAX + 1]; + __u16 permissions = P4TC_CONTROL_PERMISSIONS | P4TC_DATA_PERMISSIONS; + int ret; + + *default_missact = NULL; + *default_hitact = NULL; + + if (tb[P4TC_TABLE_DEFAULT_HIT]) { + struct p4tc_table_defact *defact; + + rcu_read_lock(); + defact = rcu_dereference(table->tbl_default_hitact); + if (defact) + permissions = defact->permissions; + rcu_read_unlock(); + + ret = nla_parse_nested(tb_default, P4TC_TABLE_DEFAULT_MAX, + tb[P4TC_TABLE_DEFAULT_HIT], NULL, + extack); + if (ret < 0) + return ret; + + if (!tb_default[P4TC_TABLE_DEFAULT_ACTION] && + !tb_default[P4TC_TABLE_DEFAULT_PERMISSIONS]) + return 0; + + ret = tcf_table_init_default_act(net, tb_default, + default_hitact, + 
table->common.p_id, permissions, + extack); + if (ret < 0) + return ret; + if (!tcf_table_check_defacts((*default_hitact)->default_acts[0], + acts_list)) { + ret = -EPERM; + NL_SET_ERR_MSG(extack, + "Action is not allowed as default hit action"); + goto default_hitacts_free; + } + } + + if (tb[P4TC_TABLE_DEFAULT_MISS]) { + struct p4tc_table_defact *defact; + + rcu_read_lock(); + defact = rcu_dereference(table->tbl_default_missact); + if (defact) + permissions = defact->permissions; + rcu_read_unlock(); + + ret = nla_parse_nested(tb_default, P4TC_TABLE_DEFAULT_MAX, + tb[P4TC_TABLE_DEFAULT_MISS], NULL, + extack); + if (ret < 0) + goto default_hitacts_free; + + if (!tb_default[P4TC_TABLE_DEFAULT_ACTION] && + !tb_default[P4TC_TABLE_DEFAULT_PERMISSIONS]) + return 0; + + ret = tcf_table_init_default_act(net, tb_default, + default_missact, + table->common.p_id, permissions, + extack); + if (ret < 0) + goto default_hitacts_free; + if (!tcf_table_check_defacts((*default_missact)->default_acts[0], + acts_list)) { + ret = -EPERM; + NL_SET_ERR_MSG(extack, + "Action is not allowed as default miss action"); + goto default_missact_free; + } + } + + return 0; + +default_missact_free: + p4tc_table_defact_destroy(*default_missact); + +default_hitacts_free: + p4tc_table_defact_destroy(*default_hitact); + + return ret; +} + +static const struct nla_policy p4tc_acts_list_policy[P4TC_TABLE_MAX + 1] = { + [P4TC_TABLE_ACT_FLAGS] = + NLA_POLICY_RANGE(NLA_U8, 0, BIT(P4TC_TABLE_ACTS_FLAGS_MAX)), + [P4TC_TABLE_ACT_NAME] = { .type = NLA_STRING, .len = ACTNAMSIZ }, +}; + +static struct p4tc_table_act *tcf_table_act_init(struct nlattr *nla, + struct p4tc_pipeline *pipeline, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[P4TC_TABLE_ACT_MAX + 1]; + struct p4tc_table_act *table_act; + int ret; + + ret = nla_parse_nested(tb, P4TC_TABLE_ACT_MAX, nla, + p4tc_acts_list_policy, extack); + if (ret < 0) + return ERR_PTR(ret); + + table_act = kzalloc(sizeof(*table_act), GFP_KERNEL); + if 
(!table_act) + return ERR_PTR(-ENOMEM); + + if (tb[P4TC_TABLE_ACT_NAME]) { + const char *actname = nla_data(tb[P4TC_TABLE_ACT_NAME]); + char *act_name_clone, *act_name, *p_name; + struct p4tc_act *act; + + act_name_clone = act_name = kstrdup(actname, GFP_KERNEL); + if (!act_name) { + ret = -ENOMEM; + goto free_table_act; + } + + p_name = strsep(&act_name, "/"); + act = tcf_action_find_byname(act_name, pipeline); + if (!act) { + NL_SET_ERR_MSG_FMT(extack, + "Unable to find action %s/%s", + p_name, act_name); + ret = -ENOENT; + kfree(act_name_clone); + goto free_table_act; + } + + kfree(act_name_clone); + + table_act->ops = &act->ops; + WARN_ON(!refcount_inc_not_zero(&act->a_ref)); + } else { + NL_SET_ERR_MSG(extack, + "Must specify allowed table action name"); + ret = -EINVAL; + goto free_table_act; + } + + if (tb[P4TC_TABLE_ACT_FLAGS]) { + u8 *flags = nla_data(tb[P4TC_TABLE_ACT_FLAGS]); + + table_act->flags = *flags; + } + + return table_act; + +free_table_act: + kfree(table_act); + return ERR_PTR(ret); +} + +static int tcf_table_acts_list_init(struct nlattr *nla, + struct p4tc_pipeline *pipeline, + struct list_head *acts_list, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[P4TC_MSGBATCH_SIZE + 1]; + struct p4tc_table_act *table_act; + int ret; + int i; + + ret = nla_parse_nested(tb, P4TC_MSGBATCH_SIZE, nla, NULL, extack); + if (ret < 0) + return ret; + + for (i = 1; i < P4TC_MSGBATCH_SIZE + 1 && tb[i]; i++) { + table_act = tcf_table_act_init(tb[i], pipeline, extack); + if (IS_ERR(table_act)) { + ret = PTR_ERR(table_act); + goto free_acts_list_list; + } + list_add_tail(&table_act->node, acts_list); + } + + return 0; + +free_acts_list_list: + tcf_table_acts_list_destroy(acts_list); + + return ret; +} + +static struct p4tc_table * +tcf_table_find_byanyattr(struct p4tc_pipeline *pipeline, + struct nlattr *name_attr, const u32 tbl_id, + struct netlink_ext_ack *extack) +{ + char *tblname = NULL; + + if (name_attr) + tblname = nla_data(name_attr); + + return 
tcf_table_find_byany(pipeline, tblname, tbl_id, extack); +} + +static struct p4tc_table *tcf_table_create(struct net *net, struct nlattr **tb, + u32 tbl_id, + struct p4tc_pipeline *pipeline, + struct netlink_ext_ack *extack) +{ + struct p4tc_table_key *key = NULL; + struct p4tc_table_parm *parm; + struct p4tc_table *table; + char *tblname; + int ret; + + if (pipeline->curr_tables == pipeline->num_tables) { + NL_SET_ERR_MSG(extack, + "Table range exceeded max allowed value"); + ret = -EINVAL; + goto out; + } + + if (!tb[P4TC_TABLE_NAME]) { + NL_SET_ERR_MSG(extack, "Must specify table name"); + ret = -EINVAL; + goto out; + } + + tblname = + strnchr(nla_data(tb[P4TC_TABLE_NAME]), TABLENAMSIZ, SEPARATOR); + if (!tblname) { + NL_SET_ERR_MSG(extack, "Table name must contain control block"); + ret = -EINVAL; + goto out; + } + + tblname += 1; + if (tblname[0] == '\0') { + NL_SET_ERR_MSG(extack, "Control block name is too big"); + ret = -EINVAL; + goto out; + } + + table = tcf_table_find_byanyattr(pipeline, tb[P4TC_TABLE_NAME], tbl_id, + NULL); + if (!IS_ERR(table)) { + NL_SET_ERR_MSG(extack, "Table already exists"); + ret = -EEXIST; + goto out; + } + + table = kzalloc(sizeof(*table), GFP_KERNEL); + if (!table) { + NL_SET_ERR_MSG(extack, "Unable to create table"); + ret = -ENOMEM; + goto out; + } + + table->common.p_id = pipeline->common.p_id; + strscpy(table->common.name, nla_data(tb[P4TC_TABLE_NAME]), TABLENAMSIZ); + + if (tb[P4TC_TABLE_INFO]) { + parm = nla_data(tb[P4TC_TABLE_INFO]); + } else { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "Missing table info"); + goto free; + } + + if (parm->tbl_flags & P4TC_TABLE_FLAGS_KEYSZ) { + if (!parm->tbl_keysz) { + NL_SET_ERR_MSG(extack, "Table keysz cannot be zero"); + ret = -EINVAL; + goto free; + } + if (parm->tbl_keysz > P4TC_MAX_KEYSZ) { + NL_SET_ERR_MSG(extack, + "Table keysz exceeds maximum keysz"); + ret = -EINVAL; + goto free; + } + table->tbl_keysz = parm->tbl_keysz; + } else { + NL_SET_ERR_MSG(extack, "Must specify table 
key size"); + ret = -EINVAL; + goto free; + } + + if (parm->tbl_flags & P4TC_TABLE_FLAGS_MAX_ENTRIES) { + if (!parm->tbl_max_entries) { + NL_SET_ERR_MSG(extack, + "Table max_entries cannot be zero"); + ret = -EINVAL; + goto free; + } + if (parm->tbl_max_entries > P4TC_MAX_TENTRIES) { + NL_SET_ERR_MSG(extack, + "Table max_entries exceeds maximum value"); + ret = -EINVAL; + goto free; + } + table->tbl_max_entries = parm->tbl_max_entries; + } else { + table->tbl_max_entries = P4TC_DEFAULT_TENTRIES; + } + + if (parm->tbl_flags & P4TC_TABLE_FLAGS_MAX_MASKS) { + if (!parm->tbl_max_masks) { + NL_SET_ERR_MSG(extack, + "Table max_masks cannot be zero"); + ret = -EINVAL; + goto free; + } + if (parm->tbl_max_masks > P4TC_MAX_TMASKS) { + NL_SET_ERR_MSG(extack, + "Table max_masks exceeds maximum value"); + ret = -EINVAL; + goto free; + } + table->tbl_max_masks = parm->tbl_max_masks; + } else { + table->tbl_max_masks = P4TC_DEFAULT_TMASKS; + } + + if (parm->tbl_flags & P4TC_TABLE_FLAGS_PERMISSIONS) { + if (parm->tbl_permissions > P4TC_MAX_PERMISSION) { + NL_SET_ERR_MSG(extack, + "Permission may only have 10 bits turned on"); + ret = -EINVAL; + goto free; + } + if (!p4tc_data_exec_ok(parm->tbl_permissions)) { + NL_SET_ERR_MSG(extack, + "Table must have execute permissions"); + ret = -EINVAL; + goto free; + } + if (!p4tc_data_read_ok(parm->tbl_permissions)) { + NL_SET_ERR_MSG(extack, + "Data path read permissions must be set"); + ret = -EINVAL; + goto free; + } + table->tbl_permissions = + kzalloc(sizeof(*table->tbl_permissions), GFP_KERNEL); + if (!table->tbl_permissions) { + ret = -ENOMEM; + goto free; + } + table->tbl_permissions->permissions = parm->tbl_permissions; + } else { + table->tbl_permissions = + kzalloc(sizeof(*table->tbl_permissions), GFP_KERNEL); + if (!table->tbl_permissions) { + ret = -ENOMEM; + goto free; + } + table->tbl_permissions->permissions = P4TC_TABLE_PERMISSIONS; + } + + refcount_set(&table->tbl_ref, 1); + refcount_set(&table->tbl_ctrl_ref, 1); + + if 
(tbl_id) { + table->tbl_id = tbl_id; + ret = idr_alloc_u32(&pipeline->p_tbl_idr, table, &table->tbl_id, + table->tbl_id, GFP_KERNEL); + if (ret < 0) { + NL_SET_ERR_MSG(extack, "Unable to allocate table id"); + goto free_permissions; + } + } else { + table->tbl_id = 1; + ret = idr_alloc_u32(&pipeline->p_tbl_idr, table, &table->tbl_id, + UINT_MAX, GFP_KERNEL); + if (ret < 0) { + NL_SET_ERR_MSG(extack, "Unable to allocate table id"); + goto free_permissions; + } + } + + INIT_LIST_HEAD(&table->tbl_acts_list); + if (tb[P4TC_TABLE_ACTS_LIST]) { + ret = tcf_table_acts_list_init(tb[P4TC_TABLE_ACTS_LIST], + pipeline, &table->tbl_acts_list, + extack); + if (ret < 0) + goto idr_rm; + } + + if (tb[P4TC_TABLE_PREACTIONS]) { + table->tbl_preacts = kcalloc(TCA_ACT_MAX_PRIO, + sizeof(struct tc_action *), + GFP_KERNEL); + if (!table->tbl_preacts) { + ret = -ENOMEM; + goto table_acts_destroy; + } + + ret = p4tc_action_init(net, tb[P4TC_TABLE_PREACTIONS], + table->tbl_preacts, table->common.p_id, + 0, extack); + if (ret < 0) { + kfree(table->tbl_preacts); + goto table_acts_destroy; + } + table->tbl_num_preacts = ret; + } else { + table->tbl_preacts = NULL; + } + + if (tb[P4TC_TABLE_POSTACTIONS]) { + table->tbl_postacts = kcalloc(TCA_ACT_MAX_PRIO, + sizeof(struct tc_action *), + GFP_KERNEL); + if (!table->tbl_postacts) { + ret = -ENOMEM; + goto preactions_destroy; + } + + ret = p4tc_action_init(net, tb[P4TC_TABLE_POSTACTIONS], + table->tbl_postacts, table->common.p_id, + 0, extack); + if (ret < 0) { + kfree(table->tbl_postacts); + goto preactions_destroy; + } + table->tbl_num_postacts = ret; + } else { + table->tbl_postacts = NULL; + table->tbl_num_postacts = 0; + } + + if (tb[P4TC_TABLE_KEY]) { + key = tcf_table_key_add(net, table, tb[P4TC_TABLE_KEY], extack); + if (IS_ERR(key)) { + ret = PTR_ERR(key); + goto postacts_destroy; + } + } + + ret = tcf_table_init_default_acts(net, tb, table, + &table->tbl_default_hitact, + &table->tbl_default_missact, + &table->tbl_acts_list, extack); + 
if (ret < 0) + goto key_put; + + table->tbl_curr_used_entries = 0; + table->tbl_curr_count = 0; + + refcount_set(&table->tbl_entries_ref, 1); + + idr_init(&table->tbl_masks_idr); + idr_init(&table->tbl_prio_idr); + spin_lock_init(&table->tbl_masks_idr_lock); + spin_lock_init(&table->tbl_prio_idr_lock); + + table->tbl_key = key; + + pipeline->curr_tables += 1; + + table->common.ops = (struct p4tc_template_ops *)&p4tc_table_ops; + + return table; + +key_put: + if (key) + tcf_table_key_put(key); + +postacts_destroy: + p4tc_action_destroy(table->tbl_postacts); + +preactions_destroy: + p4tc_action_destroy(table->tbl_preacts); + +idr_rm: + idr_remove(&pipeline->p_tbl_idr, table->tbl_id); + +free_permissions: + kfree(table->tbl_permissions); + +table_acts_destroy: + tcf_table_acts_list_destroy(&table->tbl_acts_list); + +free: + kfree(table); + +out: + return ERR_PTR(ret); +} + +static struct p4tc_table *tcf_table_update(struct net *net, struct nlattr **tb, + u32 tbl_id, + struct p4tc_pipeline *pipeline, + u32 flags, + struct netlink_ext_ack *extack) +{ + struct p4tc_table_key *key = NULL; + int num_postacts = 0, num_preacts = 0; + struct p4tc_table_defact *default_hitact = NULL; + struct p4tc_table_defact *default_missact = NULL; + struct list_head *tbl_acts_list = NULL; + struct p4tc_table_perm *perm = NULL; + struct p4tc_table_parm *parm = NULL; + struct tc_action **postacts = NULL; + struct tc_action **preacts = NULL; + int ret = 0; + struct p4tc_table *table; + + table = tcf_table_find_byanyattr(pipeline, tb[P4TC_TABLE_NAME], tbl_id, + extack); + if (IS_ERR(table)) + return table; + + if (tb[P4TC_TABLE_ACTS_LIST]) { + tbl_acts_list = kzalloc(sizeof(*tbl_acts_list), GFP_KERNEL); + if (!tbl_acts_list) { + ret = -ENOMEM; + goto out; + } + INIT_LIST_HEAD(tbl_acts_list); + ret = tcf_table_acts_list_init(tb[P4TC_TABLE_ACTS_LIST], + pipeline, tbl_acts_list, extack); + if (ret < 0) + goto table_acts_destroy; + } + + if (tb[P4TC_TABLE_PREACTIONS]) { + preacts = 
kcalloc(TCA_ACT_MAX_PRIO, sizeof(struct tc_action *), + GFP_KERNEL); + if (!preacts) { + ret = -ENOMEM; + goto table_acts_destroy; + } + + ret = p4tc_action_init(net, tb[P4TC_TABLE_PREACTIONS], preacts, + table->common.p_id, 0, extack); + if (ret < 0) { + kfree(preacts); + goto table_acts_destroy; + } + num_preacts = ret; + } + + if (tb[P4TC_TABLE_POSTACTIONS]) { + postacts = kcalloc(TCA_ACT_MAX_PRIO, sizeof(struct tc_action *), + GFP_KERNEL); + if (!postacts) { + ret = -ENOMEM; + goto preactions_destroy; + } + + ret = p4tc_action_init(net, tb[P4TC_TABLE_POSTACTIONS], + postacts, table->common.p_id, 0, extack); + if (ret < 0) { + kfree(postacts); + goto preactions_destroy; + } + num_postacts = ret; + } + + if (tbl_acts_list) + ret = tcf_table_init_default_acts(net, tb, table, + &default_hitact, + &default_missact, + tbl_acts_list, extack); + else + ret = tcf_table_init_default_acts(net, tb, table, + &default_hitact, + &default_missact, + &table->tbl_acts_list, + extack); + if (ret < 0) + goto postactions_destroy; + + if (tb[P4TC_TABLE_KEY]) { + key = tcf_table_key_add(net, table, tb[P4TC_TABLE_KEY], extack); + if (IS_ERR(key)) { + ret = PTR_ERR(key); + goto defaultacts_destroy; + } + } + + if (tb[P4TC_TABLE_INFO]) { + parm = nla_data(tb[P4TC_TABLE_INFO]); + if (parm->tbl_flags & P4TC_TABLE_FLAGS_KEYSZ) { + if (!parm->tbl_keysz) { + NL_SET_ERR_MSG(extack, + "Table keysz cannot be zero"); + ret = -EINVAL; + goto key_destroy; + } + if (parm->tbl_keysz > P4TC_MAX_KEYSZ) { + NL_SET_ERR_MSG(extack, + "Table keysz exceeds maximum keysz"); + ret = -EINVAL; + goto key_destroy; + } + table->tbl_keysz = parm->tbl_keysz; + } + + if (parm->tbl_flags & P4TC_TABLE_FLAGS_MAX_ENTRIES) { + if (!parm->tbl_max_entries) { + NL_SET_ERR_MSG(extack, + "Table max_entries cannot be zero"); + ret = -EINVAL; + goto key_destroy; + } + if (parm->tbl_max_entries > P4TC_MAX_TENTRIES) { + NL_SET_ERR_MSG(extack, + "Table max_entries exceeds maximum value"); + ret = -EINVAL; + goto key_destroy; + } 
+ table->tbl_max_entries = parm->tbl_max_entries; + } + + if (parm->tbl_flags & P4TC_TABLE_FLAGS_MAX_MASKS) { + if (!parm->tbl_max_masks) { + NL_SET_ERR_MSG(extack, + "Table max_masks cannot be zero"); + ret = -EINVAL; + goto key_destroy; + } + if (parm->tbl_max_masks > P4TC_MAX_TMASKS) { + NL_SET_ERR_MSG(extack, + "Table max_masks exceeds maximum value"); + ret = -EINVAL; + goto key_destroy; + } + table->tbl_max_masks = parm->tbl_max_masks; + } + if (parm->tbl_flags & P4TC_TABLE_FLAGS_PERMISSIONS) { + if (parm->tbl_permissions > P4TC_MAX_PERMISSION) { + NL_SET_ERR_MSG(extack, + "Permission may only have 10 bits turned on"); + ret = -EINVAL; + goto key_destroy; + } + if (!p4tc_data_exec_ok(parm->tbl_permissions)) { + NL_SET_ERR_MSG(extack, + "Table must have execute permissions"); + ret = -EINVAL; + goto key_destroy; + } + if (!p4tc_data_read_ok(parm->tbl_permissions)) { + NL_SET_ERR_MSG(extack, + "Data path read permissions must be set"); + ret = -EINVAL; + goto key_destroy; + } + + perm = kzalloc(sizeof(*perm), GFP_KERNEL); + if (!perm) { + ret = -ENOMEM; + goto key_destroy; + } + perm->permissions = parm->tbl_permissions; + } + } + + if (preacts) { + p4tc_action_destroy(table->tbl_preacts); + table->tbl_preacts = preacts; + table->tbl_num_preacts = num_preacts; + } + + if (postacts) { + p4tc_action_destroy(table->tbl_postacts); + table->tbl_postacts = postacts; + table->tbl_num_postacts = num_postacts; + } + + if (default_hitact) { + struct p4tc_table_defact *hitact; + + hitact = rcu_replace_pointer_rtnl(table->tbl_default_hitact, + default_hitact); + if (hitact) { + synchronize_rcu(); + p4tc_table_defact_destroy(hitact); + } + } + + if (default_missact) { + struct p4tc_table_defact *missact; + + missact = rcu_replace_pointer_rtnl(table->tbl_default_missact, + default_missact); + if (missact) { + synchronize_rcu(); + p4tc_table_defact_destroy(missact); + } + } + + if (key) { + if (table->tbl_key) + tcf_table_key_put(table->tbl_key); + table->tbl_key = key; + } + 
+ if (perm) { + perm = rcu_replace_pointer_rtnl(table->tbl_permissions, perm); + kfree_rcu(perm, rcu); + } + + return table; + +key_destroy: + if (key) + tcf_table_key_put(key); + +defaultacts_destroy: + p4tc_table_defact_destroy(default_missact); + p4tc_table_defact_destroy(default_hitact); + +postactions_destroy: + p4tc_action_destroy(postacts); + +preactions_destroy: + p4tc_action_destroy(preacts); + +table_acts_destroy: + if (tbl_acts_list) { + tcf_table_acts_list_destroy(tbl_acts_list); + kfree(tbl_acts_list); + } + +out: + return ERR_PTR(ret); +} + +static bool tcf_table_check_runtime_update(struct nlmsghdr *n, + struct nlattr **tb) +{ + int i; + + if (n->nlmsg_type == RTM_CREATEP4TEMPLATE && + !(n->nlmsg_flags & NLM_F_REPLACE)) + return false; + + if (tb[P4TC_TABLE_INFO]) { + struct p4tc_table_parm *info; + + info = nla_data(tb[P4TC_TABLE_INFO]); + if ((info->tbl_flags & ~P4TC_TABLE_FLAGS_PERMISSIONS) || + !(info->tbl_flags & P4TC_TABLE_FLAGS_PERMISSIONS)) + return false; + } + + for (i = P4TC_TABLE_PREACTIONS; i < P4TC_TABLE_MAX; i++) { + if (i != P4TC_TABLE_DEFAULT_HIT && + i != P4TC_TABLE_DEFAULT_MISS && tb[i]) + return false; + } + + return true; +} + +static struct p4tc_template_common * +tcf_table_cu(struct net *net, struct nlmsghdr *n, struct nlattr *nla, + struct p4tc_nl_pname *nl_pname, u32 *ids, + struct netlink_ext_ack *extack) +{ + u32 pipeid = ids[P4TC_PID_IDX], tbl_id = ids[P4TC_TBLID_IDX]; + struct nlattr *tb[P4TC_TABLE_MAX + 1]; + struct p4tc_pipeline *pipeline; + struct p4tc_table *table; + int ret; + + pipeline = tcf_pipeline_find_byany(net, nl_pname->data, pipeid, extack); + if (IS_ERR(pipeline)) + return (void *)pipeline; + + ret = nla_parse_nested(tb, P4TC_TABLE_MAX, nla, p4tc_table_policy, + extack); + if (ret < 0) + return ERR_PTR(ret); + + if (pipeline_sealed(pipeline) && + !tcf_table_check_runtime_update(n, tb)) { + NL_SET_ERR_MSG(extack, + "Only default action updates are allowed in sealed pipeline"); + return ERR_PTR(-EINVAL); + } 
+ + if (n->nlmsg_flags & NLM_F_REPLACE) + table = tcf_table_update(net, tb, tbl_id, pipeline, + n->nlmsg_flags, extack); + else + table = tcf_table_create(net, tb, tbl_id, pipeline, extack); + + if (IS_ERR(table)) + goto out; + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (!ids[P4TC_TBLID_IDX]) + ids[P4TC_TBLID_IDX] = table->tbl_id; + +out: + return (struct p4tc_template_common *)table; +} + +static int tcf_table_flush(struct net *net, struct sk_buff *skb, + struct p4tc_pipeline *pipeline, + struct netlink_ext_ack *extack) +{ + unsigned char *b = nlmsg_get_pos(skb); + struct p4tc_table *table; + unsigned long tmp, tbl_id; + int ret = 0; + int i = 0; + + if (nla_put_u32(skb, P4TC_PATH, 0)) + goto out_nlmsg_trim; + + if (idr_is_empty(&pipeline->p_tbl_idr)) { + NL_SET_ERR_MSG(extack, "There are no tables to flush"); + goto out_nlmsg_trim; + } + + idr_for_each_entry_ul(&pipeline->p_tbl_idr, table, tmp, tbl_id) { + if (_tcf_table_put(net, NULL, pipeline, table, false, extack) < 0) { + ret = -EBUSY; + continue; + } + i++; + } + + nla_put_u32(skb, P4TC_COUNT, i); + + if (ret < 0) { + if (i == 0) { + NL_SET_ERR_MSG(extack, "Unable to flush any table"); + goto out_nlmsg_trim; + } else { + NL_SET_ERR_MSG(extack, "Unable to flush all tables"); + } + } + + return i; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return ret; +} + +static int tcf_table_gd(struct net *net, struct sk_buff *skb, + struct nlmsghdr *n, struct nlattr *nla, + struct p4tc_nl_pname *nl_pname, u32 *ids, + struct netlink_ext_ack *extack) +{ + u32 pipeid = ids[P4TC_PID_IDX], tbl_id = ids[P4TC_TBLID_IDX]; + struct nlattr *tb[P4TC_TABLE_MAX + 1] = {}; + unsigned char *b = nlmsg_get_pos(skb); + int ret = 0; + struct p4tc_pipeline *pipeline; + struct p4tc_table *table; + + if (nla) { + ret = nla_parse_nested(tb, P4TC_TABLE_MAX, nla, + p4tc_table_policy, extack); + + if (ret < 0) + return ret;
+ } + + if (n->nlmsg_type == RTM_GETP4TEMPLATE || + tcf_table_check_runtime_update(n, tb)) + pipeline = tcf_pipeline_find_byany(net, nl_pname->data, pipeid, + extack); + else + pipeline = tcf_pipeline_find_byany_unsealed(net, nl_pname->data, + pipeid, extack); + + if (IS_ERR(pipeline)) + return PTR_ERR(pipeline); + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (n->nlmsg_type == RTM_DELP4TEMPLATE && (n->nlmsg_flags & NLM_F_ROOT)) + return tcf_table_flush(net, skb, pipeline, extack); + + table = tcf_table_find_byanyattr(pipeline, tb[P4TC_TABLE_NAME], tbl_id, + extack); + if (IS_ERR(table)) + return PTR_ERR(table); + + if (_tcf_table_fill_nlmsg(skb, table) < 0) { + NL_SET_ERR_MSG(extack, + "Failed to fill notification attributes for table"); + return -EINVAL; + } + + if (n->nlmsg_type == RTM_DELP4TEMPLATE) { + ret = _tcf_table_put(net, tb, pipeline, table, false, extack); + if (ret < 0) + goto out_nlmsg_trim; + } + + return 0; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return ret; +} + +static int tcf_table_dump(struct sk_buff *skb, struct p4tc_dump_ctx *ctx, + struct nlattr *nla, char **p_name, u32 *ids, + struct netlink_ext_ack *extack) +{ + struct net *net = sock_net(skb->sk); + struct p4tc_pipeline *pipeline; + + if (!ctx->ids[P4TC_PID_IDX]) { + pipeline = tcf_pipeline_find_byany(net, *p_name, + ids[P4TC_PID_IDX], extack); + if (IS_ERR(pipeline)) + return PTR_ERR(pipeline); + ctx->ids[P4TC_PID_IDX] = pipeline->common.p_id; + } else { + pipeline = tcf_pipeline_find_byid(net, ctx->ids[P4TC_PID_IDX]); + } + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (!(*p_name)) + *p_name = pipeline->common.name; + + return tcf_p4_tmpl_generic_dump(skb, ctx, &pipeline->p_tbl_idr, + P4TC_TBLID_IDX, extack); +} + +static int tcf_table_dump_1(struct sk_buff *skb, + struct p4tc_template_common *common) +{ + struct p4tc_table 
*table = to_table(common); + struct nlattr *nest = nla_nest_start(skb, P4TC_PARAMS); + + if (!nest) + return -ENOMEM; + + if (nla_put_string(skb, P4TC_TABLE_NAME, table->common.name)) { + nla_nest_cancel(skb, nest); + return -ENOMEM; + } + + nla_nest_end(skb, nest); + + return 0; +} + +const struct p4tc_template_ops p4tc_table_ops = { + .init = NULL, + .cu = tcf_table_cu, + .fill_nlmsg = tcf_table_fill_nlmsg, + .gd = tcf_table_gd, + .put = tcf_table_put, + .dump = tcf_table_dump, + .dump_1 = tcf_table_dump_1, +}; diff --git a/net/sched/p4tc/p4tc_tmpl_api.c b/net/sched/p4tc/p4tc_tmpl_api.c index 2296ae97b..2963f6497 100644 --- a/net/sched/p4tc/p4tc_tmpl_api.c +++ b/net/sched/p4tc/p4tc_tmpl_api.c @@ -45,6 +45,7 @@ static bool obj_is_valid(u32 obj) case P4TC_OBJ_META: case P4TC_OBJ_HDR_FIELD: case P4TC_OBJ_ACT: + case P4TC_OBJ_TABLE: return true; default: return false; @@ -56,6 +57,7 @@ static const struct p4tc_template_ops *p4tc_ops[P4TC_OBJ_MAX] = { [P4TC_OBJ_META] = &p4tc_meta_ops, [P4TC_OBJ_HDR_FIELD] = &p4tc_hdrfield_ops, [P4TC_OBJ_ACT] = &p4tc_act_ops, + [P4TC_OBJ_TABLE] = &p4tc_table_ops, }; int tcf_p4_tmpl_generic_dump(struct sk_buff *skb, struct p4tc_dump_ctx *ctx, From patchwork Tue Jan 24 17:05:07 2023 X-Patchwork-Submitter: Jamal Hadi Salim X-Patchwork-Id: 13114398
S234714AbjAXRGZ (ORCPT ); Tue, 24 Jan 2023 12:06:25 -0500 Received: from mail-yw1-x1130.google.com (mail-yw1-x1130.google.com [IPv6:2607:f8b0:4864:20::1130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 681064DCF9 for ; Tue, 24 Jan 2023 09:05:41 -0800 (PST) Received: by mail-yw1-x1130.google.com with SMTP id 00721157ae682-4ff07dae50dso182326327b3.2 for ; Tue, 24 Jan 2023 09:05:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mojatatu-com.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=RuBHVrg+LlprbV7alk+KeiZPCcYFTXG+R4LePQFieo0=; b=YqskR9KSx44vQkKmht7papn4xyGAhaMN9Fu951/MVvjScmz++08FA9q35eQ4Ve1P7/ B1OsZA7V1YOU1t0bGb8dJBLkqXdLdKteRTOQytKWkT4NieBkjbrQ8U+oZzAvzd+QUjs+ fIxyc0Mk9Jnv5m4envyGHz9jAXmZqyO+LSrZ+5eU2o4xRvBHsV2gfKsC5+k3wMtEZv25 MeWM85RUuFdn8Bjz6uc5aov7MjKsiPUf/gB4R5smows/OGjlyWfq2Lk307F++YrP8tkb WHKucnVyZ8kTb2avivzxteVTzKUC9Rk2nXHRjvmEXa3NfzEqP42f/ancXlmMoR3oeHWI 7ftg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=RuBHVrg+LlprbV7alk+KeiZPCcYFTXG+R4LePQFieo0=; b=gPDqOAvkTJqG65Cg+bKNBTD4eqw9yCcGqnSDvOjagldUxM9u4NuhF7xUtpLDLx4ybh xKFWGii0ZzJx+VDymARsjlx/HfXgMvDT7XlMe26r/LiKMtFG9bi+CnTWTkrC5LSit6DR hVkl5poF2H9gnSRDGOTCj/WjLpKvJCd61DwQwc9CAfe94MlXfhsBIIL+oNFRShxgbySc 4ZUSfsaOeihjIYnHBSCT1LWPqygHkRmLHgfUbnLHZqsXS5kMAmqloU9wQzzJd/g17bWP RBuYcO+5MFtA0Ko49slnJ0/oWMbYc+opzLX8glJrgUkH17sp/bD8rd2NLkynwtnesiWw 9rVw== X-Gm-Message-State: AFqh2kqlfkolPLTyOjzF2lCmo7zFWnRVjDwGH5EocCzsJ57V3gicBfpM SULTNOek2kENs8380LLiITVqxAcKGviv1bvY X-Google-Smtp-Source: AMrXdXsdTXk9Rp4Bfo3eWBxqgBSharLMQfJoUO+Z9ozvvUXAS0vfkkkbPcM6pMMAO62oRjlvO90wVQ== X-Received: by 2002:a05:7500:374b:b0:f0:52c4:5de2 with SMTP id 
gx11-20020a057500374b00b000f052c45de2mr2099890gab.38.1674579933700; Tue, 24 Jan 2023 09:05:33 -0800 (PST) Received: from localhost.localdomain (bras-base-kntaon1618w-grc-10-184-145-9-64.dsl.bell.ca. [184.145.9.64]) by smtp.gmail.com with ESMTPSA id t5-20020a05620a0b0500b007063036cb03sm1700208qkg.126.2023.01.24.09.05.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 24 Jan 2023 09:05:33 -0800 (PST) From: Jamal Hadi Salim To: netdev@vger.kernel.org Cc: kernel@mojatatu.com, deb.chatterjee@intel.com, anjali.singhai@intel.com, namrata.limaye@intel.com, khalidm@nvidia.com, tom@sipanda.io, pratyush@sipanda.io, jiri@resnulli.us, xiyou.wangcong@gmail.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, vladbu@nvidia.com, simon.horman@corigine.com Subject: [PATCH net-next RFC 17/20] p4tc: add table entry create, update, get, delete, flush and dump Date: Tue, 24 Jan 2023 12:05:07 -0500 Message-Id: <20230124170510.316970-17-jhs@mojatatu.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com> References: <20230124170510.316970-1-jhs@mojatatu.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC Tables are conceptually similar to TCAMs and this implementation could be labelled as an "algorithmic" TCAM. Tables have keys of specific size, maximum number of entries and masks allowed. The basic P4 key types are supported (exact, LPM, ternary, and ranges) although the kernel side is oblivious of all that and sees only bit blobs which it masks before a lookup is performed. This commit allows users to create, update, delete, get, flush and dump table _entries_ (templates were described in earlier patch). 
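Since the kernel sees only key and mask bit blobs, a lookup reduces to masking bytes and comparing. The sketch below illustrates that idea in plain C using two IPv4 prefixes; it is not code from this patch, and `pack_prefix()`/`masked_match()` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: how userspace could flatten two IPv4 prefix
 * matches into the key and mask bit blobs the kernel receives. */
static void pack_prefix(uint8_t *key, uint8_t *mask, size_t off,
			uint32_t addr, int prefix_len)
{
	/* A /N prefix becomes N leading one bits in the mask */
	uint32_t m = prefix_len ? 0xffffffffu << (32 - prefix_len) : 0;

	for (int i = 0; i < 4; i++) {
		key[off + i] = (addr >> (24 - 8 * i)) & 0xff; /* network order */
		mask[off + i] = (m >> (24 - 8 * i)) & 0xff;
	}
}

/* The kernel side is oblivious to field types: it masks the packet's
 * key blob and compares it against the entry's (pre-masked) key. */
static int masked_match(const uint8_t *pkt_key, const uint8_t *entry_key,
			const uint8_t *mask, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if ((pkt_key[i] & mask[i]) != entry_key[i])
			return 0;
	return 1;
}
```

Packing 10.10.10.0/24 followed by 192.168.0.0/16 with these helpers yields exactly the 0x0a0a0a00c0a80000 key and 0xffffff00ffff0000 mask discussed below the command examples.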
For example, a user issuing the following command:

tc p4runtime create myprog/table/cb/tname \
  dstAddr 10.10.10.0/24 srcAddr 192.168.0.0/16 prio 16 \
  action myprog/cb/send param port type dev port1

indicates a pipeline named "myprog" with a table "tname" whose entry we are creating. User space tc will create a key with a value of 0x0a0a0a00c0a80000 (10.10.10.0 concatenated with 192.168.0.0) and a mask value of 0xffffff00ffff0000 (/24 concatenated with /16) that will be sent to the kernel. In addition, a priority field of 16 is passed to the kernel, as well as the action definition. The priority field is needed to disambiguate in case two entries match; in that case, the kernel will choose the one with the lowest priority number. Note that table entries can only be created once the pipeline template is sealed.

If the user wanted to, for example, update the action of the entry we just created, they'd issue the following command:

tc p4runtime update myprog/table/cb/tname \
  dstAddr 10.10.10.0/24 srcAddr 192.168.0.0/16 prio 16 \
  action myprog/cb/send param port type dev port5

In this case, the user needs to specify the pipeline name, the table name, the keys and the priority, so that we can locate the table entry.

If the user wanted to get the table entry that we just updated, they'd issue the following command:

tc p4runtime get myprog/table/cb/tname \
  dstAddr 10.10.10.0/24 srcAddr 192.168.0.0/16 prio 16

Note that, again, we need to specify the pipeline name, the table name, the keys and the priority, so that we can locate the table entry.

If the user wanted to delete the table entry we created, they'd issue the following command:

tc p4runtime del myprog/table/cb/tname \
  dstAddr 10.10.10.0/24 srcAddr 192.168.0.0/16 prio 16

Note that, again, we need to specify the pipeline name, the table name, the keys and the priority, so that we can locate the table entry.

We can also flush all the table entries from a specific table instance.
To flush all the entries of table cb/tname in pipeline myprog, the user would issue the following command:

tc p4runtime del myprog/table/cb/tname

We can also dump all the table entries from a specific table instance. To dump the entries of table cb/tname in pipeline myprog, the user would issue the following command:

tc p4runtime get myprog/table/cb/tname

__Table Entry Permissions__

Table entries can have permissions specified when they are being added. Caveat: we are doing a lot more than what P4 defines because we feel it is necessary. Table entry permissions build on the table permissions provided when a table is created via the template (see earlier patch).

We have two types of permissions: control path and datapath. The template definition can set either one. For example, one could allow the datapath to add table entries in case PNA add-on-miss is needed.

By default, table entries have control plane RUD permissions, meaning the control plane can Read, Update or Delete entries. By default, as well, the control plane can create new entries unless the template specifies otherwise.

Let's see an example of defining a table "tname" at template time:

$TC p4template create table/ptables/cb/tname tblid 1 keysz 64 permissions 0x3C9 ...

The above sets table tname's permissions to 0x3C9, which is equivalent to CRUD--R--X, meaning: the control plane can Create, Read, Update and Delete; the datapath can only Read and Execute table entries.

If one were to dump this table with:

$TC p4template get table/ptables/cb/tname

the output would be the following:

pipeline name ptables id 22 table id 1 table name cb/tname key_sz 64 max entries 256 masks 8 default key 1 table entries 0 permissions CRUD--R--X

The expressed permissions above are probably the most practical for most use cases.
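The 0x3C9 ↔ CRUD--R--X mapping above implies a ten-bit permission word: control-plane CRUDX in the high five bits, datapath CRUDX in the low five. The decoder below is a sketch under that inferred layout (it is not taken from the kernel headers; `perm_to_string()` is a hypothetical name):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sketch of decoding the 10-bit permission word (0x3C9 <-> CRUD--R--X).
 * Bit layout assumed from that example: control-plane CRUDX in the high
 * five bits, datapath CRUDX in the low five, most significant first. */
static void perm_to_string(uint16_t perm, char out[11])
{
	static const char letters[] = "CRUDX"; /* Create Read Update Delete eXecute */

	for (int i = 0; i < 10; i++)
		out[i] = (perm & (1u << (9 - i))) ? letters[i % 5] : '-';
	out[10] = '\0';
}
```

Under this layout, 0x3C9 = 0b1111001001 decodes to CRUD--R--X, matching the template example above.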
__Constant Tables And P4-program Defined Entries__

If one wanted to restrict the table to be the equivalent of a "const", then the permissions would be set to: -R----R--X

In such a case, typically the P4 program will have some entries defined (see the famous P4 calc example). The "initial entries" specified in the P4 program will have to be added by the template (as generated by the compiler), as such:

$TC p4template update table/ptables/cb/tname entry srcAddr 10.10.10.10/24 dstAddr 1.1.1.0/24 prio 17

This table cannot be updated at runtime. Any attempt to add an entry to a table which is read-only at runtime will get a permission-denied response back from the kernel.

Note: if one were to create an equivalent of the PNA add-on-miss feature for this table, then the template would set the table permissions to: -R---CR--X

PNA doesn't specify whether the datapath can also delete or update entries, but if it did then more appropriate permissions would be: -R---CRUDX

__Mix And Match Of RW vs Constant Entries__

Let's look at other scenarios; let's say the table has CRUD--R--X permissions as defined by the template... At runtime the user could add entries which are "const" by specifying the entry's permissions as -R----R--X, for example:

$TC p4runtime create ptables/table/cb/tname srcAddr 10.10.10.10/24 \
  dstAddr 1.1.1.0/24 prio 17 permissions 0x109 action drop

or not specify permissions at all, as such:

$TC p4runtime create ptables/table/cb/tname srcAddr 10.10.10.10/24 \
  dstAddr 1.1.1.0/24 prio 17 \
  action drop

in which case the table's permissions defined at template time (CRUD--R--X) are assumed, meaning the table entry can be deleted or updated by the control plane.

__Entry Permissions Allowed On Table Entry Creation At Runtime__

When an entry is created with explicit permissions, it may have at most the permissions expressed in the template table definition, but it can ask for fewer.
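This subset rule reduces to a single mask check. The sketch below expresses it in plain C; the hex constants encode the permission strings from the surrounding text under the inferred ten-bit layout (control-plane CRUDX high, datapath CRUDX low), which is an assumption rather than the kernel's actual encoding:

```c
#include <assert.h>
#include <stdint.h>

/* An entry's requested permissions are acceptable iff they ask for no
 * bit that the table's template permissions lack.  Encoding assumed
 * from the examples in the text, e.g. 0x3C9 == CRUD--R--X. */
static int entry_perms_allowed(uint16_t tmpl_perms, uint16_t req_perms)
{
	return (req_perms & ~tmpl_perms) == 0;
}
```

With a template permission word of 0x349 (CR-D--R--X), a request of 0x109 (-R----R--X) passes this check, while 0x1C9 (-RUD--R--X) fails because it asks for Update, which the template does not grant.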
For example, assuming a table with template-specified permissions of CR-D--R--X: an entry created at runtime with permissions of -R----R--X is allowed, but an entry with -RUD--R--X will be rejected. Co-developed-by: Victor Nogueira Signed-off-by: Victor Nogueira Co-developed-by: Pedro Tammela Signed-off-by: Pedro Tammela Signed-off-by: Jamal Hadi Salim --- include/net/p4tc.h | 60 + include/uapi/linux/p4tc.h | 32 + include/uapi/linux/rtnetlink.h | 7 + net/sched/p4tc/Makefile | 3 +- net/sched/p4tc/p4tc_pipeline.c | 12 + net/sched/p4tc/p4tc_table.c | 45 + net/sched/p4tc/p4tc_tbl_api.c | 1898 ++++++++++++++++++++++++++++++++ security/selinux/nlmsgtab.c | 5 +- 8 files changed, 2060 insertions(+), 2 deletions(-) create mode 100644 net/sched/p4tc/p4tc_tbl_api.c diff --git a/include/net/p4tc.h b/include/net/p4tc.h index 58be4f96f..9a7942992 100644 --- a/include/net/p4tc.h +++ b/include/net/p4tc.h @@ -123,6 +123,7 @@ struct p4tc_pipeline { u32 num_created_acts; refcount_t p_ref; refcount_t p_ctrl_ref; + refcount_t p_entry_deferal_ref; u16 num_tables; u16 curr_tables; u8 p_state; @@ -234,6 +235,7 @@ struct p4tc_table { struct rhltable tbl_entries; struct tc_action **tbl_preacts; struct tc_action **tbl_postacts; + struct p4tc_table_entry *tbl_const_entry; struct p4tc_table_defact __rcu *tbl_default_hitact; struct p4tc_table_defact __rcu *tbl_default_missact; struct p4tc_table_perm __rcu *tbl_permissions; @@ -321,6 +323,54 @@ extern const struct rhashtable_params p4tc_label_ht_params; extern const struct rhashtable_params acts_params; void p4tc_label_ht_destroy(void *ptr, void *arg); +extern const struct rhashtable_params entry_hlt_params; + +struct p4tc_table_entry; +struct p4tc_table_entry_work { + struct work_struct work; + struct p4tc_pipeline *pipeline; + struct p4tc_table_entry *entry; + bool defer_deletion; +}; + +struct p4tc_table_entry_key { + u8 *value; + u8 *unmasked_key; + u16 keysz; +}; + +struct p4tc_table_entry_mask { + struct rcu_head rcu; + u32 sz; + u32
mask_id; + refcount_t mask_ref; + u8 *value; +}; + +struct p4tc_table_entry { + struct p4tc_table_entry_key key; + struct work_struct work; + struct p4tc_table_entry_tm __rcu *tm; + u32 prio; + u32 mask_id; + struct tc_action **acts; + struct p4tc_table_entry_work *entry_work; + int num_acts; + struct rhlist_head ht_node; + struct list_head list; + struct rcu_head rcu; + refcount_t entries_ref; + u16 who_created; + u16 who_updated; + u16 permissions; +}; + +extern const struct nla_policy p4tc_root_policy[P4TC_ROOT_MAX + 1]; +extern const struct nla_policy p4tc_policy[P4TC_MAX + 1]; +struct p4tc_table_entry *p4tc_table_entry_lookup(struct sk_buff *skb, + struct p4tc_table *table, + u32 keysz); + struct p4tc_parser { char parser_name[PARSERNAMSIZ]; struct idr hdr_fields_idr; @@ -445,6 +495,16 @@ struct p4tc_table *tcf_table_get(struct p4tc_pipeline *pipeline, struct netlink_ext_ack *extack); void tcf_table_put_ref(struct p4tc_table *table); +void tcf_table_entry_destroy_hash(void *ptr, void *arg); + +int tcf_table_const_entry_cu(struct net *net, struct nlattr *arg, + struct p4tc_table_entry *entry, + struct p4tc_pipeline *pipeline, + struct p4tc_table *table, + struct netlink_ext_ack *extack); +int p4tca_table_get_entry_fill(struct sk_buff *skb, struct p4tc_table *table, + struct p4tc_table_entry *entry, u32 tbl_id); + struct p4tc_parser *tcf_parser_create(struct p4tc_pipeline *pipeline, const char *parser_name, u32 parser_inst_id, diff --git a/include/uapi/linux/p4tc.h b/include/uapi/linux/p4tc.h index 678ee20cd..727fdcfe5 100644 --- a/include/uapi/linux/p4tc.h +++ b/include/uapi/linux/p4tc.h @@ -119,6 +119,7 @@ enum { P4TC_OBJ_HDR_FIELD, P4TC_OBJ_ACT, P4TC_OBJ_TABLE, + P4TC_OBJ_TABLE_ENTRY, __P4TC_OBJ_MAX, }; #define P4TC_OBJ_MAX __P4TC_OBJ_MAX @@ -321,6 +322,37 @@ struct tc_act_dyna { tc_gen; }; +struct p4tc_table_entry_tm { + __u64 created; + __u64 lastused; + __u64 firstused; +}; + +/* Table entry attributes */ +enum { + P4TC_ENTRY_UNSPEC, + P4TC_ENTRY_TBLNAME, 
/* string */ + P4TC_ENTRY_KEY_BLOB, /* Key blob */ + P4TC_ENTRY_MASK_BLOB, /* Mask blob */ + P4TC_ENTRY_PRIO, /* u32 */ + P4TC_ENTRY_ACT, /* nested actions */ + P4TC_ENTRY_TM, /* entry data path timestamps */ + P4TC_ENTRY_WHODUNNIT, /* tells who's modifying the entry */ + P4TC_ENTRY_CREATE_WHODUNNIT, /* tells who created the entry */ + P4TC_ENTRY_UPDATE_WHODUNNIT, /* tells who updated the entry last */ + P4TC_ENTRY_PERMISSIONS, /* entry CRUDX permissions */ + P4TC_ENTRY_PAD, + __P4TC_ENTRY_MAX +}; +#define P4TC_ENTRY_MAX (__P4TC_ENTRY_MAX - 1) + +enum { + P4TC_ENTITY_UNSPEC, + P4TC_ENTITY_KERNEL, + P4TC_ENTITY_TC, + P4TC_ENTITY_MAX +}; + #define P4TC_RTA(r) \ ((struct rtattr *)(((char *)(r)) + NLMSG_ALIGN(sizeof(struct p4tcmsg)))) diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index 62f0f5c90..dc061ddb8 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -201,6 +201,13 @@ enum { RTM_GETP4TEMPLATE, #define RTM_GETP4TEMPLATE RTM_GETP4TEMPLATE + RTM_CREATEP4TBENT = 128, +#define RTM_CREATEP4TBENT RTM_CREATEP4TBENT + RTM_DELP4TBENT, +#define RTM_DELP4TBENT RTM_DELP4TBENT + RTM_GETP4TBENT, +#define RTM_GETP4TBENT RTM_GETP4TBENT + __RTM_MAX, #define RTM_MAX (((__RTM_MAX + 3) & ~3) - 1) }; diff --git a/net/sched/p4tc/Makefile b/net/sched/p4tc/Makefile index de3a7b833..0d2c20223 100644 --- a/net/sched/p4tc/Makefile +++ b/net/sched/p4tc/Makefile @@ -1,4 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 obj-y := p4tc_types.o p4tc_pipeline.o p4tc_tmpl_api.o p4tc_meta.o \ - p4tc_parser_api.o p4tc_hdrfield.o p4tc_action.o p4tc_table.o + p4tc_parser_api.o p4tc_hdrfield.o p4tc_action.o p4tc_table.o \ + p4tc_tbl_api.o diff --git a/net/sched/p4tc/p4tc_pipeline.c b/net/sched/p4tc/p4tc_pipeline.c index 854fc5b57..f8fcde20b 100644 --- a/net/sched/p4tc/p4tc_pipeline.c +++ b/net/sched/p4tc/p4tc_pipeline.c @@ -328,7 +328,16 @@ static int tcf_pipeline_put(struct net *net, struct p4tc_metadata *meta; struct p4tc_table *table; + if 
(!refcount_dec_if_one(&pipeline->p_ctrl_ref)) { + if (pipeline_net) { + put_net(pipeline_net); + NL_SET_ERR_MSG(extack, "Can't delete referenced pipeline"); + return -EBUSY; + } + } + if (pipeline_net && !refcount_dec_if_one(&pipeline->p_ref)) { + refcount_set(&pipeline->p_ctrl_ref, 1); NL_SET_ERR_MSG(extack, "Can't delete referenced pipeline"); return -EBUSY; } @@ -567,6 +576,9 @@ static struct p4tc_pipeline *tcf_pipeline_create(struct net *net, pipeline->net = net; refcount_set(&pipeline->p_ref, 1); + refcount_set(&pipeline->p_ctrl_ref, 1); + refcount_set(&pipeline->p_hdrs_used, 1); + refcount_set(&pipeline->p_entry_deferal_ref, 1); pipeline->common.ops = (struct p4tc_template_ops *)&p4tc_pipeline_ops; diff --git a/net/sched/p4tc/p4tc_table.c b/net/sched/p4tc/p4tc_table.c index f793c70bc..491e44396 100644 --- a/net/sched/p4tc/p4tc_table.c +++ b/net/sched/p4tc/p4tc_table.c @@ -234,6 +234,17 @@ static int _tcf_table_fill_nlmsg(struct sk_buff *skb, struct p4tc_table *table) } nla_nest_end(skb, nested_tbl_acts); + if (table->tbl_const_entry) { + struct nlattr *const_nest; + + const_nest = nla_nest_start(skb, P4TC_TABLE_OPT_ENTRY); + p4tca_table_get_entry_fill(skb, table, table->tbl_const_entry, + table->tbl_id); + nla_nest_end(skb, const_nest); + } + kfree(table->tbl_const_entry); + table->tbl_const_entry = NULL; + if (nla_put(skb, P4TC_TABLE_INFO, sizeof(parm), &parm)) goto out_nlmsg_trim; nla_nest_end(skb, nest); @@ -381,6 +392,9 @@ static inline int _tcf_table_put(struct net *net, struct nlattr **tb, tcf_table_acts_list_destroy(&table->tbl_acts_list); + rhltable_free_and_destroy(&table->tbl_entries, + tcf_table_entry_destroy_hash, table); + idr_destroy(&table->tbl_masks_idr); idr_destroy(&table->tbl_prio_idr); @@ -1075,6 +1089,11 @@ static struct p4tc_table *tcf_table_create(struct net *net, struct nlattr **tb, spin_lock_init(&table->tbl_masks_idr_lock); spin_lock_init(&table->tbl_prio_idr_lock); + if (rhltable_init(&table->tbl_entries, &entry_hlt_params) < 0) { + 
ret = -EINVAL; + goto defaultacts_destroy; + } + + table->tbl_key = key; + pipeline->curr_tables += 1; + @@ -1083,6 +1102,10 @@ static struct p4tc_table *tcf_table_create(struct net *net, struct nlattr **tb, return table; +defaultacts_destroy: + p4tc_table_defact_destroy(table->tbl_default_missact); + p4tc_table_defact_destroy(table->tbl_default_hitact); + key_put: if (key) tcf_table_key_put(key); @@ -1279,6 +1302,25 @@ static struct p4tc_table *tcf_table_update(struct net *net, struct nlattr **tb, } } + if (tb[P4TC_TABLE_OPT_ENTRY]) { + struct p4tc_table_entry *entry; + + entry = kzalloc(sizeof(*entry), GFP_KERNEL); + if (!entry) { + ret = -ENOMEM; + goto free_perm; + } + + /* Workaround to make this work */ + ret = tcf_table_const_entry_cu(net, tb[P4TC_TABLE_OPT_ENTRY], + entry, pipeline, table, extack); + if (ret < 0) { + kfree(entry); + goto free_perm; + } + table->tbl_const_entry = entry; + } + + if (preacts) { + p4tc_action_destroy(table->tbl_preacts); + table->tbl_preacts = preacts; @@ -1326,6 +1368,9 @@ static struct p4tc_table *tcf_table_update(struct net *net, struct nlattr **tb, return table; +free_perm: + kfree(perm); + key_destroy: if (key) tcf_table_key_put(key); diff --git a/net/sched/p4tc/p4tc_tbl_api.c b/net/sched/p4tc/p4tc_tbl_api.c new file mode 100644 index 000000000..4523ec09b --- /dev/null +++ b/net/sched/p4tc/p4tc_tbl_api.c @@ -0,0 +1,1898 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * net/sched/p4tc_tbl_api.c TC P4 TABLE API + * + * Copyright (c) 2022, Mojatatu Networks + * Copyright (c) 2022, Intel Corporation.
+ * Authors: Jamal Hadi Salim + * Victor Nogueira + * Pedro Tammela + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define KEY_MASK_ID_SZ (sizeof(u32)) +#define KEY_MASK_ID_SZ_BITS (KEY_MASK_ID_SZ * BITS_PER_BYTE) + +static u32 p4tc_entry_hash_fn(const void *data, u32 len, u32 seed) +{ + const struct p4tc_table_entry_key *key = data; + + return jhash(key->value, key->keysz >> 3, seed); +} + +static int p4tc_entry_hash_cmp(struct rhashtable_compare_arg *arg, + const void *ptr) +{ + const struct p4tc_table_entry_key *key = arg->key; + const struct p4tc_table_entry *entry = ptr; + + return memcmp(entry->key.value, key->value, entry->key.keysz >> 3); +} + +static u32 p4tc_entry_obj_hash_fn(const void *data, u32 len, u32 seed) +{ + const struct p4tc_table_entry *entry = data; + + return p4tc_entry_hash_fn(&entry->key, 0, seed); +} + +const struct rhashtable_params entry_hlt_params = { + .obj_cmpfn = p4tc_entry_hash_cmp, + .obj_hashfn = p4tc_entry_obj_hash_fn, + .hashfn = p4tc_entry_hash_fn, + .head_offset = offsetof(struct p4tc_table_entry, ht_node), + .key_offset = offsetof(struct p4tc_table_entry, key), + .automatic_shrinking = true, +}; + +static struct p4tc_table_entry * +p4tc_entry_lookup(struct p4tc_table *table, struct p4tc_table_entry_key *key, + u32 prio) __must_hold(RCU) +{ + struct p4tc_table_entry *entry; + struct rhlist_head *tmp, *bucket_list; + + bucket_list = + rhltable_lookup(&table->tbl_entries, key, entry_hlt_params); + if (!bucket_list) + return NULL; + + rhl_for_each_entry_rcu(entry, tmp, bucket_list, ht_node) + if (entry->prio == prio) + return entry; + + return NULL; +} + +static struct p4tc_table_entry * +__p4tc_entry_lookup(struct p4tc_table *table, struct p4tc_table_entry_key *key) + __must_hold(RCU) +{ + struct p4tc_table_entry *entry = NULL; + u32 smallest_prio = U32_MAX; + struct rhlist_head *tmp, *bucket_list; 
+ struct p4tc_table_entry *entry_curr; + + bucket_list = + rhltable_lookup(&table->tbl_entries, key, entry_hlt_params); + if (!bucket_list) + return NULL; + + rhl_for_each_entry_rcu(entry_curr, tmp, bucket_list, ht_node) { + if (entry_curr->prio <= smallest_prio) { + smallest_prio = entry_curr->prio; + entry = entry_curr; + } + } + + return entry; +} + +static void mask_key(struct p4tc_table_entry_mask *mask, u8 *masked_key, + u8 *skb_key) +{ + int i; + __u32 *mask_id; + + mask_id = (u32 *)&masked_key[0]; + *mask_id = mask->mask_id; + + for (i = KEY_MASK_ID_SZ; i < BITS_TO_BYTES(mask->sz); i++) + masked_key[i] = skb_key[i - KEY_MASK_ID_SZ] & mask->value[i]; +} + +struct p4tc_table_entry *p4tc_table_entry_lookup(struct sk_buff *skb, + struct p4tc_table *table, + u32 keysz) +{ + struct p4tc_table_entry *entry_curr = NULL; + u8 masked_key[KEY_MASK_ID_SZ + BITS_TO_BYTES(P4TC_MAX_KEYSZ)] = { 0 }; + u32 smallest_prio = U32_MAX; + struct p4tc_table_entry_mask *mask; + struct p4tc_table_entry *entry = NULL; + struct p4tc_skb_ext *p4tc_skb_ext; + unsigned long tmp, mask_id; + + p4tc_skb_ext = skb_ext_find(skb, P4TC_SKB_EXT); + if (unlikely(!p4tc_skb_ext)) + return ERR_PTR(-ENOENT); + + idr_for_each_entry_ul(&table->tbl_masks_idr, mask, tmp, mask_id) { + struct p4tc_table_entry_key key = {}; + + mask_key(mask, masked_key, p4tc_skb_ext->p4tc_ext->key); + + key.value = masked_key; + key.keysz = keysz + KEY_MASK_ID_SZ_BITS; + + entry_curr = __p4tc_entry_lookup(table, &key); + if (entry_curr) { + if (entry_curr->prio <= smallest_prio) { + smallest_prio = entry_curr->prio; + entry = entry_curr; + } + } + } + + return entry; +} + +#define tcf_table_entry_mask_find_byid(table, id) \ + (idr_find(&(table)->tbl_masks_idr, id)) + +static int p4tca_table_get_entry_keys(struct sk_buff *skb, + struct p4tc_table *table, + struct p4tc_table_entry *entry) +{ + unsigned char *b = nlmsg_get_pos(skb); + int ret = -ENOMEM; + struct p4tc_table_entry_mask *mask; + u32 key_sz_bytes; + + 
key_sz_bytes = (entry->key.keysz - KEY_MASK_ID_SZ_BITS) / BITS_PER_BYTE; + if (nla_put(skb, P4TC_ENTRY_KEY_BLOB, key_sz_bytes, + entry->key.unmasked_key + KEY_MASK_ID_SZ)) + goto out_nlmsg_trim; + + mask = tcf_table_entry_mask_find_byid(table, entry->mask_id); + if (nla_put(skb, P4TC_ENTRY_MASK_BLOB, key_sz_bytes, + mask->value + KEY_MASK_ID_SZ)) + goto out_nlmsg_trim; + + return 0; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return ret; +} + +static void p4tc_table_entry_tm_dump(struct p4tc_table_entry_tm *dtm, + struct p4tc_table_entry_tm *stm) +{ + unsigned long now = jiffies; + + dtm->created = stm->created ? + jiffies_to_clock_t(now - stm->created) : 0; + dtm->lastused = stm->lastused ? + jiffies_to_clock_t(now - stm->lastused) : 0; + dtm->firstused = stm->firstused ? + jiffies_to_clock_t(now - stm->firstused) : 0; +} + +#define P4TC_ENTRY_MAX_IDS (P4TC_PATH_MAX - 1) + +int p4tca_table_get_entry_fill(struct sk_buff *skb, struct p4tc_table *table, + struct p4tc_table_entry *entry, u32 tbl_id) +{ + unsigned char *b = nlmsg_get_pos(skb); + int ret = -ENOMEM; + struct nlattr *nest, *nest_acts; + struct p4tc_table_entry_tm dtm, *tm; + u32 ids[P4TC_ENTRY_MAX_IDS]; + + ids[P4TC_TBLID_IDX - 1] = tbl_id; + + if (nla_put(skb, P4TC_PATH, P4TC_ENTRY_MAX_IDS * sizeof(u32), ids)) + goto out_nlmsg_trim; + + nest = nla_nest_start(skb, P4TC_PARAMS); + if (!nest) + goto out_nlmsg_trim; + + if (nla_put_u32(skb, P4TC_ENTRY_PRIO, entry->prio)) + goto out_nlmsg_trim; + + if (p4tca_table_get_entry_keys(skb, table, entry) < 0) + goto out_nlmsg_trim; + + if (entry->acts) { + nest_acts = nla_nest_start(skb, P4TC_ENTRY_ACT); + if (tcf_action_dump(skb, entry->acts, 0, 0, false) < 0) + goto out_nlmsg_trim; + nla_nest_end(skb, nest_acts); + } + + if (nla_put_u8(skb, P4TC_ENTRY_CREATE_WHODUNNIT, entry->who_created)) + goto out_nlmsg_trim; + + if (entry->who_updated) { + if (nla_put_u8(skb, P4TC_ENTRY_UPDATE_WHODUNNIT, + entry->who_updated)) + goto out_nlmsg_trim; + } + + if (nla_put_u16(skb, 
P4TC_ENTRY_PERMISSIONS, entry->permissions)) + goto out_nlmsg_trim; + + tm = rtnl_dereference(entry->tm); + p4tc_table_entry_tm_dump(&dtm, tm); + if (nla_put_64bit(skb, P4TC_ENTRY_TM, sizeof(dtm), &dtm, + P4TC_ENTRY_PAD)) + goto out_nlmsg_trim; + + nla_nest_end(skb, nest); + + return skb->len; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return ret; +} + +static const struct nla_policy p4tc_entry_policy[P4TC_ENTRY_MAX + 1] = { + [P4TC_ENTRY_TBLNAME] = { .type = NLA_STRING }, + [P4TC_ENTRY_KEY_BLOB] = { .type = NLA_BINARY }, + [P4TC_ENTRY_MASK_BLOB] = { .type = NLA_BINARY }, + [P4TC_ENTRY_PRIO] = { .type = NLA_U32 }, + [P4TC_ENTRY_ACT] = { .type = NLA_NESTED }, + [P4TC_ENTRY_TM] = { .type = NLA_BINARY, + .len = sizeof(struct p4tc_table_entry_tm) }, + [P4TC_ENTRY_WHODUNNIT] = { .type = NLA_U8 }, + [P4TC_ENTRY_CREATE_WHODUNNIT] = { .type = NLA_U8 }, + [P4TC_ENTRY_UPDATE_WHODUNNIT] = { .type = NLA_U8 }, + [P4TC_ENTRY_PERMISSIONS] = { .type = NLA_U16 }, +}; + +static void __tcf_table_entry_mask_destroy(struct p4tc_table_entry_mask *mask) +{ + kfree(mask->value); + kfree(mask); +} + +static void tcf_table_entry_mask_destroy(struct rcu_head *rcu) +{ + struct p4tc_table_entry_mask *mask; + + mask = container_of(rcu, struct p4tc_table_entry_mask, rcu); + + __tcf_table_entry_mask_destroy(mask); +} + +static struct p4tc_table_entry_mask * +tcf_table_entry_mask_find_byvalue(struct p4tc_table *table, + struct p4tc_table_entry_mask *mask) +{ + struct p4tc_table_entry_mask *mask_cur; + unsigned long mask_id, tmp; + + idr_for_each_entry_ul(&table->tbl_masks_idr, mask_cur, tmp, mask_id) { + if (mask_cur->sz == mask->sz) { + u32 mask_sz_bytes = mask->sz / BITS_PER_BYTE - KEY_MASK_ID_SZ; + void *curr_mask_value = mask_cur->value + KEY_MASK_ID_SZ; + void *mask_value = mask->value + KEY_MASK_ID_SZ; + + if (memcmp(curr_mask_value, mask_value, mask_sz_bytes) == 0) + return mask_cur; + } + } + + return NULL; +} + +static void tcf_table_entry_mask_del(struct p4tc_table *table, + struct 
p4tc_table_entry *entry) +{ + const u32 mask_id = entry->mask_id; + struct p4tc_table_entry_mask *mask_found; + + /* Will always be found*/ + mask_found = tcf_table_entry_mask_find_byid(table, mask_id); + + /* Last reference, can delete*/ + if (refcount_dec_if_one(&mask_found->mask_ref)) { + spin_lock_bh(&table->tbl_masks_idr_lock); + idr_remove(&table->tbl_masks_idr, mask_found->mask_id); + spin_unlock_bh(&table->tbl_masks_idr_lock); + call_rcu(&mask_found->rcu, tcf_table_entry_mask_destroy); + } else { + if (!refcount_dec_not_one(&mask_found->mask_ref)) + pr_warn("Mask was deleted in parallel"); + } +} + +/* TODO: Ordering optimisation for LPM */ +static struct p4tc_table_entry_mask * +tcf_table_entry_mask_add(struct p4tc_table *table, + struct p4tc_table_entry *entry, + struct p4tc_table_entry_mask *mask) +{ + struct p4tc_table_entry_mask *mask_found; + int ret; + + mask_found = tcf_table_entry_mask_find_byvalue(table, mask); + /* Only add mask if it was not already added */ + if (!mask_found) { + struct p4tc_table_entry_mask *mask_allocated; + + mask_allocated = kzalloc(sizeof(*mask_allocated), GFP_ATOMIC); + if (!mask_allocated) + return ERR_PTR(-ENOMEM); + + mask_allocated->value = + kzalloc(BITS_TO_BYTES(mask->sz), GFP_ATOMIC); + if (!mask_allocated->value) { + kfree(mask_allocated); + return ERR_PTR(-ENOMEM); + } + memcpy(mask_allocated->value, mask->value, + BITS_TO_BYTES(mask->sz)); + + mask_allocated->mask_id = 1; + refcount_set(&mask_allocated->mask_ref, 1); + mask_allocated->sz = mask->sz; + + spin_lock_bh(&table->tbl_masks_idr_lock); + ret = idr_alloc_u32(&table->tbl_masks_idr, mask_allocated, + &mask_allocated->mask_id, UINT_MAX, + GFP_ATOMIC); + spin_unlock_bh(&table->tbl_masks_idr_lock); + if (ret < 0) { + kfree(mask_allocated->value); + kfree(mask_allocated); + return ERR_PTR(ret); + } + entry->mask_id = mask_allocated->mask_id; + mask_found = mask_allocated; + } else { + if (!refcount_inc_not_zero(&mask_found->mask_ref)) + return ERR_PTR(-EBUSY); 
+ entry->mask_id = mask_found->mask_id; + } + + return mask_found; +} + +static void tcf_table_entry_del_act(struct p4tc_table_entry *entry) +{ + p4tc_action_destroy(entry->acts); + kfree(entry); +} + +static void tcf_table_entry_del_act_work(struct work_struct *work) +{ + struct p4tc_table_entry_work *entry_work = + container_of(work, typeof(*entry_work), work); + struct p4tc_pipeline *pipeline = entry_work->pipeline; + + tcf_table_entry_del_act(entry_work->entry); + put_net(pipeline->net); + + refcount_dec(&entry_work->pipeline->p_entry_deferal_ref); + + kfree(entry_work); +} + +static void tcf_table_entry_put(struct p4tc_table_entry *entry) +{ + struct p4tc_table_entry_tm *tm; + + tm = rcu_dereference(entry->tm); + kfree(tm); + + kfree(entry->key.unmasked_key); + kfree(entry->key.value); + + if (entry->acts) { + struct p4tc_table_entry_work *entry_work = entry->entry_work; + struct p4tc_pipeline *pipeline = entry_work->pipeline; + struct net *net; + + if (entry_work->defer_deletion) { + net = get_net(pipeline->net); + refcount_inc(&entry_work->pipeline->p_entry_deferal_ref); + schedule_work(&entry_work->work); + } else { + kfree(entry_work); + tcf_table_entry_del_act(entry); + } + } else { + kfree(entry->entry_work); + kfree(entry); + } +} + +static void tcf_table_entry_put_rcu(struct rcu_head *rcu) +{ + struct p4tc_table_entry *entry; + + entry = container_of(rcu, struct p4tc_table_entry, rcu); + + tcf_table_entry_put(entry); +} + +static int tcf_table_entry_destroy(struct p4tc_table *table, + struct p4tc_table_entry *entry, + bool remove_from_hash) +{ + /* Entry was deleted in parallel */ + if (!refcount_dec_if_one(&entry->entries_ref)) + return -EBUSY; + + if (remove_from_hash) + rhltable_remove(&table->tbl_entries, &entry->ht_node, + entry_hlt_params); + + tcf_table_entry_mask_del(table, entry); + if (entry->entry_work->defer_deletion) { + call_rcu(&entry->rcu, tcf_table_entry_put_rcu); + } else { + synchronize_rcu(); + tcf_table_entry_put(entry); + } + + 
return 0; +} + +/* Only deletes entries when called from pipeline delete, which means + * pipeline->p_ref will already be 0, so no need to use that refcount. + */ +void tcf_table_entry_destroy_hash(void *ptr, void *arg) +{ + struct p4tc_table *table = arg; + struct p4tc_table_entry *entry = ptr; + + refcount_dec(&table->tbl_entries_ref); + + entry->entry_work->defer_deletion = false; + tcf_table_entry_destroy(table, entry, false); +} + +static void tcf_table_entry_put_table(struct p4tc_pipeline *pipeline, + struct p4tc_table *table) +{ + /* If we are here, it means that this was just incremented, so it should be > 1 */ + WARN_ON(!refcount_dec_not_one(&table->tbl_ctrl_ref)); + WARN_ON(!refcount_dec_not_one(&pipeline->p_ctrl_ref)); +} + +static int tcf_table_entry_get_table(struct net *net, + struct p4tc_pipeline **pipeline, + struct p4tc_table **table, + struct nlattr **tb, u32 *ids, char *p_name, + struct netlink_ext_ack *extack) + __must_hold(RCU) +{ + u32 pipeid, tbl_id; + char *tblname; + int ret; + + pipeid = ids[P4TC_PID_IDX]; + + *pipeline = tcf_pipeline_find_byany(net, p_name, pipeid, extack); + if (IS_ERR(*pipeline)) { + ret = PTR_ERR(*pipeline); + goto out; + } + + if (!refcount_inc_not_zero(&((*pipeline)->p_ctrl_ref))) { + NL_SET_ERR_MSG(extack, "Pipeline is stale"); + ret = -EBUSY; + goto out; + } + + tbl_id = ids[P4TC_TBLID_IDX]; + + tblname = tb[P4TC_ENTRY_TBLNAME] ? 
nla_data(tb[P4TC_ENTRY_TBLNAME]) : NULL;
+	*table = tcf_table_find_byany(*pipeline, tblname, tbl_id, extack);
+	if (IS_ERR(*table)) {
+		ret = PTR_ERR(*table);
+		goto dec_pipeline_refcount;
+	}
+	if (!refcount_inc_not_zero(&((*table)->tbl_ctrl_ref))) {
+		NL_SET_ERR_MSG(extack, "Table is marked for deletion");
+		ret = -EBUSY;
+		goto dec_pipeline_refcount;
+	}
+
+	return 0;
+
+/* p_ctrl_ref was just incremented above, so it must still be > 1 here */
+dec_pipeline_refcount:
+	WARN_ON(!refcount_dec_not_one(&((*pipeline)->p_ctrl_ref)));
+
+out:
+	return ret;
+}
+
+static void tcf_table_entry_assign_key(struct p4tc_table_entry_key *key,
+				       struct p4tc_table_entry_mask *mask,
+				       u8 *keyblob, u8 *maskblob, u32 keysz)
+{
+	/* Skip the mask_id slot in the key; the mask ID has not been
+	 * allocated yet.
+	 */
+	memcpy(key->unmasked_key + KEY_MASK_ID_SZ, keyblob, keysz);
+
+	/* Likewise for the mask value */
+	memcpy(mask->value + KEY_MASK_ID_SZ, maskblob, keysz);
+}
+
+static int tcf_table_entry_extract_key(struct nlattr **tb,
+				       struct p4tc_table_entry_key *key,
+				       struct p4tc_table_entry_mask *mask,
+				       struct netlink_ext_ack *extack)
+{
+	u32 internal_keysz;
+	u32 keysz;
+
+	if (!tb[P4TC_ENTRY_KEY_BLOB] || !tb[P4TC_ENTRY_MASK_BLOB]) {
+		NL_SET_ERR_MSG(extack, "Must specify key and mask blobs");
+		return -EINVAL;
+	}
+
+	keysz = nla_len(tb[P4TC_ENTRY_KEY_BLOB]);
+	internal_keysz = (keysz + KEY_MASK_ID_SZ) * BITS_PER_BYTE;
+	if (key->keysz != internal_keysz) {
+		NL_SET_ERR_MSG(extack,
+			       "Key blob size and table key size differ");
+		return -EINVAL;
+	}
+
+	if (keysz != nla_len(tb[P4TC_ENTRY_MASK_BLOB])) {
+		NL_SET_ERR_MSG(extack,
+			       "Key and mask blob must have the same length");
+		return -EINVAL;
+	}
+
+	tcf_table_entry_assign_key(key, mask, nla_data(tb[P4TC_ENTRY_KEY_BLOB]),
+				   nla_data(tb[P4TC_ENTRY_MASK_BLOB]), keysz);
+
+	return 0;
+}
+
+static void tcf_table_entry_build_key(struct p4tc_table_entry_key *key,
+				      struct p4tc_table_entry_mask 
*mask)
+{
+	u32 *mask_id;
+	int i;
+
+	mask_id = (u32 *)&key->unmasked_key[0];
+	*mask_id = mask->mask_id;
+
+	mask_id = (u32 *)&mask->value[0];
+	*mask_id = mask->mask_id;
+
+	for (i = 0; i < BITS_TO_BYTES(key->keysz); i++)
+		key->value[i] = key->unmasked_key[i] & mask->value[i];
+}
+
+static int ___tcf_table_entry_del(struct p4tc_pipeline *pipeline,
+				  struct p4tc_table *table,
+				  struct p4tc_table_entry *entry,
+				  bool from_control)
+	__must_hold(RCU)
+{
+	int ret = 0;
+
+	if (from_control) {
+		if (!p4tc_ctrl_delete_ok(entry->permissions))
+			return -EPERM;
+	} else {
+		if (!p4tc_data_delete_ok(entry->permissions))
+			return -EPERM;
+	}
+
+	if (!refcount_dec_not_one(&table->tbl_entries_ref))
+		return -EBUSY;
+
+	spin_lock_bh(&table->tbl_prio_idr_lock);
+	idr_remove(&table->tbl_prio_idr, entry->prio);
+	spin_unlock_bh(&table->tbl_prio_idr_lock);
+
+	if (tcf_table_entry_destroy(table, entry, true) < 0) {
+		ret = -EBUSY;
+		goto inc_entries_ref;
+	}
+
+	goto out;
+
+/* Undo the earlier decrement: destroy failed, so the entry stays live */
+inc_entries_ref:
+	refcount_inc(&table->tbl_entries_ref);
+
+out:
+	return ret;
+}
+
+/* Internal function which will be called by the data path */
+static int __tcf_table_entry_del(struct p4tc_pipeline *pipeline,
+				 struct p4tc_table *table,
+				 struct p4tc_table_entry_key *key,
+				 struct p4tc_table_entry_mask *mask, u32 prio,
+				 struct netlink_ext_ack *extack)
+{
+	struct p4tc_table_entry *entry;
+	int ret;
+
+	tcf_table_entry_build_key(key, mask);
+
+	entry = p4tc_entry_lookup(table, key, prio);
+	if (!entry) {
+		rcu_read_unlock();
+		NL_SET_ERR_MSG(extack, "Unable to find entry");
+		return -EINVAL;
+	}
+
+	entry->entry_work->defer_deletion = true;
+	ret = ___tcf_table_entry_del(pipeline, table, entry, false);
+
+	return ret;
+}
+
+static int tcf_table_entry_gd(struct net *net, struct sk_buff *skb,
+			      struct nlmsghdr *n, struct nlattr *arg, u32 *ids,
+			      struct p4tc_nl_pname *nl_pname,
+			      struct netlink_ext_ack *extack)
+{
+	struct nlattr *tb[P4TC_ENTRY_MAX + 1] = { NULL };
+	struct 
p4tc_table_entry *entry = NULL; + struct p4tc_pipeline *pipeline = NULL; + struct p4tc_table_entry_mask *mask, *new_mask; + struct p4tc_table_entry_key *key; + struct p4tc_table *table; + u32 keysz_bytes; + u32 prio; + int ret; + + if (arg) { + ret = nla_parse_nested(tb, P4TC_ENTRY_MAX, arg, + p4tc_entry_policy, extack); + + if (ret < 0) + return ret; + } + + if (!tb[P4TC_ENTRY_PRIO]) { + NL_SET_ERR_MSG(extack, "Must specify table entry priority"); + return -EINVAL; + } + prio = *((u32 *)nla_data(tb[P4TC_ENTRY_PRIO])); + + rcu_read_lock(); + ret = tcf_table_entry_get_table(net, &pipeline, &table, tb, ids, + nl_pname->data, extack); + rcu_read_unlock(); + if (ret < 0) + return ret; + + if (n->nlmsg_type == RTM_DELP4TBENT && !pipeline_sealed(pipeline)) { + NL_SET_ERR_MSG(extack, + "Unable to delete table entry in unsealed pipeline"); + ret = -EINVAL; + goto table_put; + } + + key = kzalloc(sizeof(*key), GFP_KERNEL); + if (!key) { + NL_SET_ERR_MSG(extack, "Unable to allocate key"); + ret = -ENOMEM; + goto table_put; + } + key->keysz = table->tbl_keysz + KEY_MASK_ID_SZ_BITS; + keysz_bytes = (key->keysz / BITS_PER_BYTE); + + mask = kzalloc(sizeof(*mask), GFP_KERNEL); + if (!mask) { + NL_SET_ERR_MSG(extack, "Failed to allocate mask"); + ret = -ENOMEM; + goto free_key; + } + mask->value = kzalloc(keysz_bytes, GFP_KERNEL); + if (!mask->value) { + NL_SET_ERR_MSG(extack, "Failed to allocate mask value"); + ret = -ENOMEM; + kfree(mask); + goto free_key; + } + mask->sz = key->keysz; + + key->value = kzalloc(keysz_bytes, GFP_KERNEL); + if (!key->value) { + ret = -ENOMEM; + kfree(mask->value); + kfree(mask); + goto free_key; + } + + key->unmasked_key = kzalloc(keysz_bytes, GFP_KERNEL); + if (!key->unmasked_key) { + ret = -ENOMEM; + kfree(mask->value); + kfree(mask); + goto free_key_value; + } + + ret = tcf_table_entry_extract_key(tb, key, mask, extack); + if (ret < 0) { + kfree(mask->value); + kfree(mask); + goto free_key_unmasked; + } + + new_mask = 
tcf_table_entry_mask_find_byvalue(table, mask); + kfree(mask->value); + kfree(mask); + if (!new_mask) { + NL_SET_ERR_MSG(extack, "Unable to find entry"); + ret = -ENOENT; + goto free_key_unmasked; + } else { + mask = new_mask; + } + + tcf_table_entry_build_key(key, mask); + + rcu_read_lock(); + entry = p4tc_entry_lookup(table, key, prio); + if (!entry) { + NL_SET_ERR_MSG(extack, "Unable to find entry"); + ret = -EINVAL; + goto unlock; + } + + if (n->nlmsg_type == RTM_GETP4TBENT) { + if (!p4tc_ctrl_read_ok(entry->permissions)) { + NL_SET_ERR_MSG(extack, + "Permission denied: Unable to read table entry"); + ret = -EINVAL; + goto unlock; + } + } + + if (p4tca_table_get_entry_fill(skb, table, entry, table->tbl_id) <= 0) { + NL_SET_ERR_MSG(extack, "Unable to fill table entry attributes"); + ret = -EINVAL; + goto unlock; + } + + if (n->nlmsg_type == RTM_DELP4TBENT) { + entry->entry_work->defer_deletion = true; + ret = ___tcf_table_entry_del(pipeline, table, entry, true); + if (ret < 0) + goto unlock; + } + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + ret = 0; + + goto unlock; + +unlock: + rcu_read_unlock(); + +free_key_unmasked: + kfree(key->unmasked_key); + +free_key_value: + kfree(key->value); + +free_key: + kfree(key); + +table_put: + tcf_table_entry_put_table(pipeline, table); + + return ret; +} + +static int tcf_table_entry_flush(struct net *net, struct sk_buff *skb, + struct nlmsghdr *n, struct nlattr *arg, + u32 *ids, struct p4tc_nl_pname *nl_pname, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[P4TC_ENTRY_MAX + 1] = { NULL }; + unsigned char *b = nlmsg_get_pos(skb); + int ret = 0; + int i = 0; + struct p4tc_pipeline *pipeline; + struct p4tc_table_entry *entry; + struct p4tc_table *table; + u32 arg_ids[P4TC_PATH_MAX - 1]; + struct rhashtable_iter iter; + + if (arg) { + ret = nla_parse_nested(tb, P4TC_ENTRY_MAX, arg, + 
p4tc_entry_policy, extack); + if (ret < 0) + return ret; + } + + rcu_read_lock(); + ret = tcf_table_entry_get_table(net, &pipeline, &table, tb, ids, + nl_pname->data, extack); + rcu_read_unlock(); + if (ret < 0) + return ret; + + if (!ids[P4TC_TBLID_IDX]) + arg_ids[P4TC_TBLID_IDX - 1] = table->tbl_id; + + if (nla_put(skb, P4TC_PATH, sizeof(arg_ids), arg_ids)) { + ret = -ENOMEM; + goto out_nlmsg_trim; + } + + rhltable_walk_enter(&table->tbl_entries, &iter); + do { + rhashtable_walk_start(&iter); + + while ((entry = rhashtable_walk_next(&iter)) && !IS_ERR(entry)) { + if (!p4tc_ctrl_delete_ok(entry->permissions)) { + ret = -EPERM; + continue; + } + + if (!refcount_dec_not_one(&table->tbl_entries_ref)) { + NL_SET_ERR_MSG(extack, "Table entry is stale"); + ret = -EBUSY; + rhashtable_walk_stop(&iter); + goto walk_exit; + } + + entry->entry_work->defer_deletion = true; + if (tcf_table_entry_destroy(table, entry, true) < 0) { + ret = -EBUSY; + continue; + } + i++; + } + + rhashtable_walk_stop(&iter); + } while (entry == ERR_PTR(-EAGAIN)); + +walk_exit: + rhashtable_walk_exit(&iter); + + nla_put_u32(skb, P4TC_COUNT, i); + + if (ret < 0) { + if (i == 0) { + if (!extack->_msg) + NL_SET_ERR_MSG(extack, + "Unable to flush any entries"); + goto out_nlmsg_trim; + } else { + if (!extack->_msg) + NL_SET_ERR_MSG(extack, + "Unable to flush all entries"); + } + } + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + ret = 0; + goto table_put; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + +/* If we are here, it means that this was just incremented, so it should be > 1 */ +table_put: + tcf_table_entry_put_table(pipeline, table); + + return ret; +} + +/* Invoked from both control and data path */ +static int __tcf_table_entry_create(struct p4tc_pipeline *pipeline, + struct p4tc_table *table, + struct p4tc_table_entry *entry, + struct p4tc_table_entry_mask *mask, + u16 
whodunnit, bool from_control) + __must_hold(RCU) +{ + struct p4tc_table_perm *tbl_perm; + struct p4tc_table_entry_mask *mask_found; + struct p4tc_table_entry_work *entry_work; + struct p4tc_table_entry_tm *dtm; + u16 permissions; + int ret; + + refcount_set(&entry->entries_ref, 1); + + tbl_perm = rcu_dereference(table->tbl_permissions); + permissions = tbl_perm->permissions; + if (from_control) { + if (!p4tc_ctrl_create_ok(permissions)) + return -EPERM; + } else { + if (!p4tc_data_create_ok(permissions)) + return -EPERM; + } + + mask_found = tcf_table_entry_mask_add(table, entry, mask); + if (IS_ERR(mask_found)) { + ret = PTR_ERR(mask_found); + goto out; + } + + tcf_table_entry_build_key(&entry->key, mask_found); + + if (!refcount_inc_not_zero(&table->tbl_entries_ref)) { + ret = -EBUSY; + goto rm_masks_idr; + } + + if (p4tc_entry_lookup(table, &entry->key, entry->prio)) { + ret = -EEXIST; + goto dec_entries_ref; + } + + dtm = kzalloc(sizeof(*dtm), GFP_ATOMIC); + if (!dtm) { + ret = -ENOMEM; + goto dec_entries_ref; + } + + entry->who_created = whodunnit; + + dtm->created = jiffies; + dtm->firstused = 0; + dtm->lastused = jiffies; + rcu_assign_pointer(entry->tm, dtm); + + entry_work = kzalloc(sizeof(*(entry_work)), GFP_ATOMIC); + if (!entry_work) { + ret = -ENOMEM; + goto free_tm; + } + + entry_work->pipeline = pipeline; + entry_work->entry = entry; + entry->entry_work = entry_work; + + INIT_WORK(&entry_work->work, tcf_table_entry_del_act_work); + + if (rhltable_insert(&table->tbl_entries, &entry->ht_node, + entry_hlt_params) < 0) { + ret = -EBUSY; + goto free_entry_work; + } + + return 0; + +free_entry_work: + kfree(entry_work); + +free_tm: + kfree(dtm); +/*If we are here, it means that this was just incremented, so it should be > 1 */ +dec_entries_ref: + WARN_ON(!refcount_dec_not_one(&table->tbl_entries_ref)); + +rm_masks_idr: + tcf_table_entry_mask_del(table, entry); + +out: + return ret; +} + +/* Invoked from both control and data path */ +static int 
__tcf_table_entry_update(struct p4tc_pipeline *pipeline, + struct p4tc_table *table, + struct p4tc_table_entry *entry, + struct p4tc_table_entry_mask *mask, + u16 whodunnit, bool from_control) + __must_hold(RCU) +{ + struct p4tc_table_entry_mask *mask_found; + struct p4tc_table_entry_work *entry_work; + struct p4tc_table_entry *entry_old; + struct p4tc_table_entry_tm *tm_old; + struct p4tc_table_entry_tm *tm; + int ret; + + refcount_set(&entry->entries_ref, 1); + + mask_found = tcf_table_entry_mask_add(table, entry, mask); + if (IS_ERR(mask_found)) { + ret = PTR_ERR(mask_found); + goto out; + } + + tcf_table_entry_build_key(&entry->key, mask_found); + + entry_old = p4tc_entry_lookup(table, &entry->key, entry->prio); + if (!entry_old) { + ret = -ENOENT; + goto rm_masks_idr; + } + + if (from_control) { + if (!p4tc_ctrl_update_ok(entry_old->permissions)) { + ret = -EPERM; + goto rm_masks_idr; + } + } else { + if (!p4tc_data_update_ok(entry_old->permissions)) { + ret = -EPERM; + goto rm_masks_idr; + } + } + + if (refcount_read(&entry_old->entries_ref) > 1) { + ret = -EBUSY; + goto rm_masks_idr; + } + + tm = kzalloc(sizeof(*tm), GFP_ATOMIC); + if (!tm) { + ret = -ENOMEM; + goto rm_masks_idr; + } + + tm_old = rcu_dereference_protected(entry_old->tm, 1); + tm->created = tm_old->created; + tm->firstused = tm_old->firstused; + tm->lastused = jiffies; + + entry->who_updated = whodunnit; + + entry->who_created = entry_old->who_created; + + if (entry->permissions == P4TC_PERMISSIONS_UNINIT) + entry->permissions = entry_old->permissions; + + rcu_assign_pointer(entry->tm, tm); + + entry_work = kzalloc(sizeof(*(entry_work)), GFP_ATOMIC); + if (!entry_work) { + ret = -ENOMEM; + goto free_tm; + } + + entry_work->pipeline = pipeline; + entry_work->entry = entry; + entry->entry_work = entry_work; + + INIT_WORK(&entry_work->work, tcf_table_entry_del_act_work); + + if (rhltable_insert(&table->tbl_entries, &entry->ht_node, + entry_hlt_params) < 0) { + ret = -EEXIST; + goto 
free_entry_work; + } + + entry_old->entry_work->defer_deletion = true; + if (tcf_table_entry_destroy(table, entry_old, true) < 0) { + ret = -EBUSY; + goto out; + } + + return 0; + +free_entry_work: + kfree(entry_work); + +free_tm: + kfree(tm); + +rm_masks_idr: + tcf_table_entry_mask_del(table, entry); + +out: + return ret; +} + +#define P4TC_DEFAULT_TENTRY_PERMISSIONS \ + (P4TC_CTRL_PERM_R | P4TC_CTRL_PERM_U | P4TC_CTRL_PERM_D | \ + P4TC_DATA_PERM_R | P4TC_DATA_PERM_X) + +static bool tcf_table_check_entry_acts(struct p4tc_table *table, + struct tc_action *entry_acts[], + struct list_head *allowed_acts, + int num_entry_acts) +{ + struct p4tc_table_act *table_act; + int i; + + for (i = 0; i < num_entry_acts; i++) { + const struct tc_action *entry_act = entry_acts[i]; + + list_for_each_entry(table_act, allowed_acts, node) { + if (table_act->ops->id == entry_act->ops->id && + !(table_act->flags & BIT(P4TC_TABLE_ACTS_DEFAULT_ONLY))) + return true; + } + } + + return false; +} + +static int __tcf_table_entry_cu(struct net *net, u32 flags, struct nlattr **tb, + struct p4tc_table_entry *entry_cpy, + struct p4tc_pipeline *pipeline, + struct p4tc_table *table, + struct netlink_ext_ack *extack) +{ + u8 mask_value[KEY_MASK_ID_SZ + BITS_TO_BYTES(P4TC_MAX_KEYSZ)] = { 0 }; + struct p4tc_table_entry_mask mask = { 0 }; + u8 whodunnit = P4TC_ENTITY_UNSPEC; + int ret = 0; + struct p4tc_table_entry *entry; + u32 keysz_bytes; + u32 prio; + + prio = tb[P4TC_ENTRY_PRIO] ? 
*((u32 *)nla_data(tb[P4TC_ENTRY_PRIO])) : 0; + if (flags & NLM_F_REPLACE) { + if (!prio) { + NL_SET_ERR_MSG(extack, "Must specify entry priority"); + return -EINVAL; + } + } else { + if (!prio) { + prio = 1; + spin_lock(&table->tbl_prio_idr_lock); + ret = idr_alloc_u32(&table->tbl_prio_idr, + ERR_PTR(-EBUSY), &prio, UINT_MAX, + GFP_ATOMIC); + spin_unlock(&table->tbl_prio_idr_lock); + if (ret < 0) { + NL_SET_ERR_MSG(extack, + "Unable to allocate priority"); + return ret; + } + } else { + rcu_read_lock(); + if (idr_find(&table->tbl_prio_idr, prio)) { + rcu_read_unlock(); + NL_SET_ERR_MSG(extack, + "Priority already in use"); + return -EBUSY; + } + rcu_read_unlock(); + } + + if (refcount_read(&table->tbl_entries_ref) > table->tbl_max_entries) { + NL_SET_ERR_MSG(extack, + "Table instance max entries reached"); + return -EINVAL; + } + } + if (tb[P4TC_ENTRY_WHODUNNIT]) { + whodunnit = *((u8 *)nla_data(tb[P4TC_ENTRY_WHODUNNIT])); + } else { + NL_SET_ERR_MSG(extack, "Must specify whodunnit attribute"); + ret = -EINVAL; + goto idr_rm; + } + + entry = kzalloc(sizeof(*entry), GFP_KERNEL); + if (!entry) { + NL_SET_ERR_MSG(extack, "Unable to allocate table entry"); + ret = -ENOMEM; + goto idr_rm; + } + entry->prio = prio; + + entry->key.keysz = table->tbl_keysz + KEY_MASK_ID_SZ_BITS; + keysz_bytes = entry->key.keysz / BITS_PER_BYTE; + + mask.sz = entry->key.keysz; + mask.value = mask_value; + + entry->key.value = kzalloc(keysz_bytes, GFP_KERNEL); + if (!entry->key.value) { + ret = -ENOMEM; + goto free_entry; + } + + entry->key.unmasked_key = kzalloc(keysz_bytes, GFP_KERNEL); + if (!entry->key.unmasked_key) { + ret = -ENOMEM; + goto free_key_value; + } + + ret = tcf_table_entry_extract_key(tb, &entry->key, &mask, extack); + if (ret < 0) + goto free_key_unmasked; + + if (tb[P4TC_ENTRY_PERMISSIONS]) { + const u16 tblperm = + rcu_dereference(table->tbl_permissions)->permissions; + u16 nlperm; + + nlperm = *((u16 *)nla_data(tb[P4TC_ENTRY_PERMISSIONS])); + if (nlperm > 
P4TC_MAX_PERMISSION) { + NL_SET_ERR_MSG(extack, + "Permission may only have 10 bits turned on"); + ret = -EINVAL; + goto free_key_unmasked; + } + if (p4tc_ctrl_create_ok(nlperm) || + p4tc_data_create_ok(nlperm)) { + NL_SET_ERR_MSG(extack, + "Create permission for table entry doesn't make sense"); + ret = -EINVAL; + goto free_key_unmasked; + } + if (!p4tc_data_read_ok(nlperm)) { + NL_SET_ERR_MSG(extack, + "Data path read permission must be set"); + ret = -EINVAL; + goto free_key_unmasked; + } + if (!p4tc_data_exec_ok(nlperm)) { + NL_SET_ERR_MSG(extack, + "Data path execute permissions for entry must be set"); + ret = -EINVAL; + goto free_key_unmasked; + } + + if (~tblperm & nlperm) { + NL_SET_ERR_MSG(extack, + "Trying to set permission bits which aren't allowed by table"); + ret = -EINVAL; + goto free_key_unmasked; + } + entry->permissions = nlperm; + } else { + if (flags & NLM_F_REPLACE) + entry->permissions = P4TC_PERMISSIONS_UNINIT; + else + entry->permissions = P4TC_DEFAULT_TENTRY_PERMISSIONS; + } + + if (tb[P4TC_ENTRY_ACT]) { + entry->acts = kcalloc(TCA_ACT_MAX_PRIO, + sizeof(struct tc_action *), GFP_KERNEL); + if (!entry->acts) { + ret = -ENOMEM; + goto free_key_unmasked; + } + + ret = p4tc_action_init(net, tb[P4TC_ENTRY_ACT], entry->acts, + table->common.p_id, + TCA_ACT_FLAGS_NO_RTNL, extack); + if (ret < 0) { + kfree(entry->acts); + entry->acts = NULL; + goto free_key_unmasked; + } + entry->num_acts = ret; + + if (!tcf_table_check_entry_acts(table, entry->acts, + &table->tbl_acts_list, ret)) { + ret = -EPERM; + NL_SET_ERR_MSG(extack, + "Action is not allowed as entry action"); + goto free_acts; + } + } + + rcu_read_lock(); + if (flags & NLM_F_REPLACE) + ret = __tcf_table_entry_update(pipeline, table, entry, &mask, + whodunnit, true); + else + ret = __tcf_table_entry_create(pipeline, table, entry, &mask, + whodunnit, true); + if (ret < 0) { + rcu_read_unlock(); + goto free_acts; + } + + memcpy(entry_cpy, entry, sizeof(*entry)); + + rcu_read_unlock(); + + 
return 0; + +free_acts: + p4tc_action_destroy(entry->acts); + +free_key_unmasked: + kfree(entry->key.unmasked_key); + +free_key_value: + kfree(entry->key.value); + +free_entry: + kfree(entry); + +idr_rm: + if (!(flags & NLM_F_REPLACE)) { + spin_lock(&table->tbl_prio_idr_lock); + idr_remove(&table->tbl_prio_idr, prio); + spin_unlock(&table->tbl_prio_idr_lock); + } + + return ret; +} + +static int tcf_table_entry_cu(struct sk_buff *skb, struct net *net, u32 flags, + struct nlattr *arg, u32 *ids, + struct p4tc_nl_pname *nl_pname, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[P4TC_ENTRY_MAX + 1] = { NULL }; + struct p4tc_table_entry entry = { 0 }; + struct p4tc_pipeline *pipeline; + struct p4tc_table *table; + int ret; + + ret = nla_parse_nested(tb, P4TC_ENTRY_MAX, arg, p4tc_entry_policy, + extack); + if (ret < 0) + return ret; + + rcu_read_lock(); + ret = tcf_table_entry_get_table(net, &pipeline, &table, tb, ids, + nl_pname->data, extack); + rcu_read_unlock(); + if (ret < 0) + return ret; + + if (!pipeline_sealed(pipeline)) { + NL_SET_ERR_MSG(extack, + "Need to seal pipeline before issuing runtime command"); + ret = -EINVAL; + goto table_put; + } + + ret = __tcf_table_entry_cu(net, flags, tb, &entry, pipeline, table, + extack); + if (ret < 0) + goto table_put; + + if (p4tca_table_get_entry_fill(skb, table, &entry, table->tbl_id) <= 0) + NL_SET_ERR_MSG(extack, "Unable to fill table entry attributes"); + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + +table_put: + tcf_table_entry_put_table(pipeline, table); + return ret; +} + +int tcf_table_const_entry_cu(struct net *net, struct nlattr *arg, + struct p4tc_table_entry *entry, + struct p4tc_pipeline *pipeline, + struct p4tc_table *table, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[P4TC_ENTRY_MAX + 1] = { NULL }; + int ret; + + ret = nla_parse_nested(tb, P4TC_ENTRY_MAX, arg, 
p4tc_entry_policy, + extack); + if (ret < 0) + return ret; + + return __tcf_table_entry_cu(net, 0, tb, entry, pipeline, table, extack); +} + +static int tc_ctl_p4_get_1(struct net *net, struct sk_buff *skb, + struct nlmsghdr *n, u32 *ids, struct nlattr *arg, + struct p4tc_nl_pname *nl_pname, + struct netlink_ext_ack *extack) +{ + int ret = 0; + struct nlattr *tb[P4TC_MAX + 1]; + u32 *arg_ids; + + ret = nla_parse_nested(tb, P4TC_MAX, arg, NULL, extack); + if (ret < 0) + return ret; + + if (!tb[P4TC_PATH]) { + NL_SET_ERR_MSG(extack, "Must specify object path"); + return -EINVAL; + } + + if (nla_len(tb[P4TC_PATH]) > (P4TC_PATH_MAX - 1) * sizeof(u32)) { + NL_SET_ERR_MSG(extack, "Path is too big"); + return -E2BIG; + } + + arg_ids = nla_data(tb[P4TC_PATH]); + memcpy(&ids[P4TC_TBLID_IDX], arg_ids, nla_len(tb[P4TC_PATH])); + + return tcf_table_entry_gd(net, skb, n, tb[P4TC_PARAMS], ids, nl_pname, + extack); +} + +static int tc_ctl_p4_delete_1(struct net *net, struct sk_buff *skb, + struct nlmsghdr *n, struct nlattr *arg, u32 *ids, + struct p4tc_nl_pname *nl_pname, + struct netlink_ext_ack *extack) +{ + int ret = 0; + struct nlattr *tb[P4TC_MAX + 1]; + u32 *arg_ids; + + ret = nla_parse_nested(tb, P4TC_MAX, arg, NULL, extack); + if (ret < 0) + return ret; + + if (!tb[P4TC_PATH]) { + NL_SET_ERR_MSG(extack, "Must specify object path"); + return -EINVAL; + } + + if ((nla_len(tb[P4TC_PATH])) > (P4TC_PATH_MAX - 1) * sizeof(u32)) { + NL_SET_ERR_MSG(extack, "Path is too big"); + return -E2BIG; + } + + arg_ids = nla_data(tb[P4TC_PATH]); + memcpy(&ids[P4TC_TBLID_IDX], arg_ids, nla_len(tb[P4TC_PATH])); + if (n->nlmsg_flags & NLM_F_ROOT) + ret = tcf_table_entry_flush(net, skb, n, tb[P4TC_PARAMS], ids, + nl_pname, extack); + else + ret = tcf_table_entry_gd(net, skb, n, tb[P4TC_PARAMS], ids, + nl_pname, extack); + + return ret; +} + +static int tc_ctl_p4_cu_1(struct net *net, struct sk_buff *skb, + struct nlmsghdr *n, u32 *ids, struct nlattr *nla, + struct p4tc_nl_pname *nl_pname, + 
struct netlink_ext_ack *extack) +{ + int ret = 0; + struct nlattr *p4tca[P4TC_MAX + 1]; + u32 *arg_ids; + + ret = nla_parse_nested(p4tca, P4TC_MAX, nla, NULL, extack); + if (ret < 0) + return ret; + + if (!p4tca[P4TC_PATH]) { + NL_SET_ERR_MSG(extack, "Must specify object path"); + return -EINVAL; + } + + if (nla_len(p4tca[P4TC_PATH]) > (P4TC_PATH_MAX - 1) * sizeof(u32)) { + NL_SET_ERR_MSG(extack, "Path is too big"); + return -E2BIG; + } + + if (!p4tca[P4TC_PARAMS]) { + NL_SET_ERR_MSG(extack, "Must specify object attributes"); + return -EINVAL; + } + + arg_ids = nla_data(p4tca[P4TC_PATH]); + memcpy(&ids[P4TC_TBLID_IDX], arg_ids, nla_len(p4tca[P4TC_PATH])); + + return tcf_table_entry_cu(skb, net, n->nlmsg_flags, p4tca[P4TC_PARAMS], + ids, nl_pname, extack); +} + +static int tc_ctl_p4_table_n(struct sk_buff *skb, struct nlmsghdr *n, int cmd, + char *p_name, struct nlattr *nla, + struct netlink_ext_ack *extack) +{ + struct p4tcmsg *t = (struct p4tcmsg *)nlmsg_data(n); + struct net *net = sock_net(skb->sk); + u32 portid = NETLINK_CB(skb).portid; + u32 ids[P4TC_PATH_MAX] = { 0 }; + int ret = 0, ret_send; + struct nlattr *p4tca[P4TC_MSGBATCH_SIZE + 1]; + struct p4tc_nl_pname nl_pname; + struct sk_buff *new_skb; + struct p4tcmsg *t_new; + struct nlmsghdr *nlh; + struct nlattr *pnatt; + struct nlattr *root; + int i; + + ret = nla_parse_nested(p4tca, P4TC_MSGBATCH_SIZE, nla, NULL, extack); + if (ret < 0) + return ret; + + if (!p4tca[1]) { + NL_SET_ERR_MSG(extack, "No elements in root table array"); + return -EINVAL; + } + + new_skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); + if (!new_skb) + return -ENOBUFS; + + nlh = nlmsg_put(new_skb, portid, n->nlmsg_seq, cmd, sizeof(*t), + n->nlmsg_flags); + if (!nlh) + goto out; + + t_new = nlmsg_data(nlh); + t_new->pipeid = t->pipeid; + t_new->obj = t->obj; + ids[P4TC_PID_IDX] = t_new->pipeid; + + pnatt = nla_reserve(new_skb, P4TC_ROOT_PNAME, PIPELINENAMSIZ); + if (!pnatt) { + ret = -ENOMEM; + goto out; + } + + nl_pname.data = 
nla_data(pnatt); + if (!p_name) { + /* Filled up by the operation or forced failure */ + memset(nl_pname.data, 0, PIPELINENAMSIZ); + nl_pname.passed = false; + } else { + strscpy(nl_pname.data, p_name, PIPELINENAMSIZ); + nl_pname.passed = true; + } + + net = maybe_get_net(net); + if (!net) { + NL_SET_ERR_MSG(extack, "Net namespace is going down"); + ret = -EBUSY; + goto out; + } + + root = nla_nest_start(new_skb, P4TC_ROOT); + for (i = 1; i < P4TC_MSGBATCH_SIZE + 1 && p4tca[i]; i++) { + struct nlattr *nest = nla_nest_start(new_skb, i); + + if (cmd == RTM_GETP4TBENT) + ret = tc_ctl_p4_get_1(net, new_skb, nlh, ids, p4tca[i], + &nl_pname, extack); + else if (cmd == RTM_CREATEP4TBENT) + ret = tc_ctl_p4_cu_1(net, new_skb, nlh, ids, p4tca[i], + &nl_pname, extack); + else if (cmd == RTM_DELP4TBENT) + ret = tc_ctl_p4_delete_1(net, new_skb, nlh, p4tca[i], + ids, &nl_pname, extack); + + if (ret < 0) { + if (i == 1) { + goto put_net; + } else { + nla_nest_cancel(new_skb, nest); + break; + } + } + nla_nest_end(new_skb, nest); + } + nla_nest_end(new_skb, root); + + if (!t_new->pipeid) + t_new->pipeid = ids[P4TC_PID_IDX]; + + nlmsg_end(new_skb, nlh); + + if (cmd == RTM_GETP4TBENT) + ret_send = rtnl_unicast(new_skb, net, portid); + else + ret_send = rtnetlink_send(new_skb, net, portid, RTNLGRP_TC, + n->nlmsg_flags & NLM_F_ECHO); + + put_net(net); + + return ret_send ? 
ret_send : ret; + +put_net: + put_net(net); + +out: + kfree_skb(new_skb); + return ret; +} + +static int tc_ctl_p4_root(struct sk_buff *skb, struct nlmsghdr *n, int cmd, + struct netlink_ext_ack *extack) +{ + char *p_name = NULL; + int ret = 0; + struct nlattr *p4tca[P4TC_ROOT_MAX + 1]; + + ret = nlmsg_parse(n, sizeof(struct p4tcmsg), p4tca, P4TC_ROOT_MAX, + p4tc_root_policy, extack); + if (ret < 0) + return ret; + + if (!p4tca[P4TC_ROOT]) { + NL_SET_ERR_MSG(extack, "Netlink P4TC table attributes missing"); + return -EINVAL; + } + + if (p4tca[P4TC_ROOT_PNAME]) + p_name = nla_data(p4tca[P4TC_ROOT_PNAME]); + + return tc_ctl_p4_table_n(skb, n, cmd, p_name, p4tca[P4TC_ROOT], extack); +} + +static int tc_ctl_p4_get(struct sk_buff *skb, struct nlmsghdr *n, + struct netlink_ext_ack *extack) +{ + return tc_ctl_p4_root(skb, n, RTM_GETP4TBENT, extack); +} + +static int tc_ctl_p4_delete(struct sk_buff *skb, struct nlmsghdr *n, + struct netlink_ext_ack *extack) +{ + if (!netlink_capable(skb, CAP_NET_ADMIN)) + return -EPERM; + + return tc_ctl_p4_root(skb, n, RTM_DELP4TBENT, extack); +} + +static int tc_ctl_p4_cu(struct sk_buff *skb, struct nlmsghdr *n, + struct netlink_ext_ack *extack) +{ + int ret; + + if (!netlink_capable(skb, CAP_NET_ADMIN)) + return -EPERM; + + ret = tc_ctl_p4_root(skb, n, RTM_CREATEP4TBENT, extack); + + return ret; +} + +static int tcf_table_entry_dump(struct sk_buff *skb, struct nlattr *arg, + u32 *ids, struct netlink_callback *cb, + char **p_name, struct netlink_ext_ack *extack) +{ + struct nlattr *tb[P4TC_ENTRY_MAX + 1] = { NULL }; + struct p4tc_dump_ctx *ctx = (void *)cb->ctx; + unsigned char *b = nlmsg_get_pos(skb); + struct p4tc_pipeline *pipeline = NULL; + struct p4tc_table_entry *entry = NULL; + struct net *net = sock_net(skb->sk); + int i = 0; + struct p4tc_table *table; + int ret; + + net = maybe_get_net(net); + if (!net) { + NL_SET_ERR_MSG(extack, "Net namespace is going down"); + return -EBUSY; + } + + if (arg) { + ret = nla_parse_nested(tb, 
P4TC_ENTRY_MAX, arg, + p4tc_entry_policy, extack); + if (ret < 0) { + kfree(ctx->iter); + goto net_put; + } + } + + rcu_read_lock(); + ret = tcf_table_entry_get_table(net, &pipeline, &table, tb, ids, + *p_name, extack); + rcu_read_unlock(); + if (ret < 0) { + kfree(ctx->iter); + goto net_put; + } + + if (!ctx->iter) { + ctx->iter = kzalloc(sizeof(*ctx->iter), GFP_KERNEL); + if (!ctx->iter) { + ret = -ENOMEM; + goto table_put; + } + + rhltable_walk_enter(&table->tbl_entries, ctx->iter); + } + + ret = -ENOMEM; + rhashtable_walk_start(ctx->iter); + do { + for (i = 0; i < P4TC_MSGBATCH_SIZE && + (entry = rhashtable_walk_next(ctx->iter)) && + !IS_ERR(entry); i++) { + struct nlattr *count; + + if (!p4tc_ctrl_read_ok(entry->permissions)) { + i--; + continue; + } + + count = nla_nest_start(skb, i + 1); + if (!count) { + rhashtable_walk_stop(ctx->iter); + goto table_put; + } + ret = p4tca_table_get_entry_fill(skb, table, entry, + table->tbl_id); + if (ret == 0) { + NL_SET_ERR_MSG(extack, + "Failed to fill notification attributes for table entry"); + goto walk_done; + } else if (ret == -ENOMEM) { + ret = 1; + nla_nest_cancel(skb, count); + rhashtable_walk_stop(ctx->iter); + goto table_put; + } + nla_nest_end(skb, count); + } + } while (entry == ERR_PTR(-EAGAIN)); + rhashtable_walk_stop(ctx->iter); + + if (!i) { + rhashtable_walk_exit(ctx->iter); + + ret = 0; + kfree(ctx->iter); + + goto table_put; + } + + if (!*p_name) + *p_name = pipeline->common.name; + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + ret = skb->len; + + goto table_put; + +walk_done: + rhashtable_walk_stop(ctx->iter); + rhashtable_walk_exit(ctx->iter); + kfree(ctx->iter); + + nlmsg_trim(skb, b); + +table_put: + tcf_table_entry_put_table(pipeline, table); + +net_put: + put_net(net); + + return ret; +} + +static int tc_ctl_p4_dump_1(struct sk_buff *skb, struct netlink_callback *cb, + struct nlattr *arg, char *p_name) +{ + struct netlink_ext_ack *extack = cb->extack; + u32 portid = 
NETLINK_CB(cb->skb).portid; + const struct nlmsghdr *n = cb->nlh; + u32 ids[P4TC_PATH_MAX] = { 0 }; + struct nlattr *tb[P4TC_MAX + 1]; + struct p4tcmsg *t_new; + struct nlmsghdr *nlh; + struct nlattr *root; + struct p4tcmsg *t; + u32 *arg_ids; + int ret; + + ret = nla_parse_nested(tb, P4TC_MAX, arg, p4tc_policy, extack); + if (ret < 0) + return ret; + + nlh = nlmsg_put(skb, portid, n->nlmsg_seq, RTM_GETP4TBENT, sizeof(*t), + n->nlmsg_flags); + if (!nlh) + return -ENOSPC; + + t = (struct p4tcmsg *)nlmsg_data(n); + t_new = nlmsg_data(nlh); + t_new->pipeid = t->pipeid; + t_new->obj = t->obj; + + if (!tb[P4TC_PATH]) { + NL_SET_ERR_MSG(extack, "Must specify object path"); + return -EINVAL; + } + + if ((nla_len(tb[P4TC_PATH])) > (P4TC_PATH_MAX - 1) * sizeof(u32)) { + NL_SET_ERR_MSG(extack, "Path is too big"); + return -E2BIG; + } + + ids[P4TC_PID_IDX] = t_new->pipeid; + arg_ids = nla_data(tb[P4TC_PATH]); + memcpy(&ids[P4TC_TBLID_IDX], arg_ids, nla_len(tb[P4TC_PATH])); + + root = nla_nest_start(skb, P4TC_ROOT); + ret = tcf_table_entry_dump(skb, tb[P4TC_PARAMS], ids, cb, &p_name, + extack); + if (ret <= 0) + goto out; + nla_nest_end(skb, root); + + if (p_name) { + if (nla_put_string(skb, P4TC_ROOT_PNAME, p_name)) { + ret = -1; + goto out; + } + } + + if (!t_new->pipeid) + t_new->pipeid = ids[P4TC_PID_IDX]; + + nlmsg_end(skb, nlh); + + return skb->len; + +out: + nlmsg_cancel(skb, nlh); + return ret; +} + +static int tc_ctl_p4_dump(struct sk_buff *skb, struct netlink_callback *cb) +{ + char *p_name = NULL; + int ret = 0; + struct nlattr *p4tca[P4TC_ROOT_MAX + 1]; + + ret = nlmsg_parse(cb->nlh, sizeof(struct p4tcmsg), p4tca, P4TC_ROOT_MAX, + p4tc_root_policy, cb->extack); + if (ret < 0) + return ret; + + if (!p4tca[P4TC_ROOT]) { + NL_SET_ERR_MSG(cb->extack, + "Netlink P4TC table attributes missing"); + return -EINVAL; + } + + if (p4tca[P4TC_ROOT_PNAME]) + p_name = nla_data(p4tca[P4TC_ROOT_PNAME]); + + return tc_ctl_p4_dump_1(skb, cb, p4tca[P4TC_ROOT], p_name); +} + +static 
int __init p4tc_tbl_init(void) +{ + rtnl_register(PF_UNSPEC, RTM_CREATEP4TBENT, tc_ctl_p4_cu, NULL, + RTNL_FLAG_DOIT_UNLOCKED); + rtnl_register(PF_UNSPEC, RTM_DELP4TBENT, tc_ctl_p4_delete, NULL, + RTNL_FLAG_DOIT_UNLOCKED); + rtnl_register(PF_UNSPEC, RTM_GETP4TBENT, tc_ctl_p4_get, tc_ctl_p4_dump, + RTNL_FLAG_DOIT_UNLOCKED); + + return 0; +} + +subsys_initcall(p4tc_tbl_init); diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c index 0a8daf2f8..3c26d4dc4 100644 --- a/security/selinux/nlmsgtab.c +++ b/security/selinux/nlmsgtab.c @@ -97,6 +97,9 @@ static const struct nlmsg_perm nlmsg_route_perms[] = { { RTM_CREATEP4TEMPLATE, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_DELP4TEMPLATE, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_GETP4TEMPLATE, NETLINK_ROUTE_SOCKET__NLMSG_READ }, + { RTM_CREATEP4TBENT, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, + { RTM_DELP4TBENT, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, + { RTM_GETP4TBENT, NETLINK_ROUTE_SOCKET__NLMSG_READ }, }; static const struct nlmsg_perm nlmsg_tcpdiag_perms[] = { @@ -179,7 +182,7 @@ int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm) * structures at the top of this file with the new mappings * before updating the BUILD_BUG_ON() macro! 
*/ - BUILD_BUG_ON(RTM_MAX != (RTM_CREATEP4TEMPLATE + 3)); + BUILD_BUG_ON(RTM_MAX != (RTM_CREATEP4TBENT + 3)); err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms, sizeof(nlmsg_route_perms)); break;

From patchwork Tue Jan 24 17:05:08 2023
X-Patchwork-Submitter: Jamal Hadi Salim
X-Patchwork-Id: 13114396
X-Patchwork-Delegate: kuba@kernel.org
X-Patchwork-State: RFC
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Cc: kernel@mojatatu.com, deb.chatterjee@intel.com, anjali.singhai@intel.com, namrata.limaye@intel.com, khalidm@nvidia.com, tom@sipanda.io, pratyush@sipanda.io, jiri@resnulli.us, xiyou.wangcong@gmail.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, vladbu@nvidia.com, simon.horman@corigine.com
Subject: [PATCH net-next RFC 18/20] p4tc: add register create, update, delete, get, flush and dump
Date: Tue, 24 Jan 2023 12:05:08 -0500
Message-Id: <20230124170510.316970-18-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>
References: <20230124170510.316970-1-jhs@mojatatu.com>

This commit allows users to create, update, delete, get, flush and dump P4 registers. It's important to note that write operations, such as create, update and delete, can only be made if the pipeline is not sealed.

Registers in P4 provide a way to store data that can be accessed throughout the lifetime of your P4 program, which means they are a way of storing state between the P4 program's invocations.

Let's take a look at an example register declaration in a P4 program:

Register<bit<32>>(2) register1;

This declaration corresponds to a register named register1, with 2 elements which are of type bit32. You can think of this register as an array of bit32s with 2 elements.
If one were to create this register with P4TC, one would issue the following command:

tc p4template create register/ptables/register1 type bit32 numelems 2

This will create register "register1" and give it an ID that will be assigned by the kernel. If the user also wished to specify the register id, the command would be the following:

tc p4template create register/ptables/register1 regid 1 type bit32 \
   numelems 2

Now, after creating register1, if one wished to, for example, update index 1 of register1 with value 32, one would issue the following command:

tc p4template update register/ptables/register1 index 1 \
   value constant.bit32.32

One could also change the value of a specific index using hex notation, exemplified by the following command:

tc p4template update register/ptables/ regid 1 index 1 \
   value constant.bit32.0x20

Note that we used regid here instead of the register name (register1); we can always use either the name or the id. It's important to note that all elements of a register are initialised to zero when the register is created.

Now, after updating the new register, the user could issue a get command to check that the register's parameters (type, num elems, id, ...) and the register element values are correct. To do so, the user would issue the following command:

tc p4template get register/ptables/register1

Which will output the following:

template obj type register
pipeline name ptables id 22
register name register1
register id 1
container type bit32 startbit 0 endbit 31
number of elements 2
register1[0] 0
register1[1] 32

Notice that register1[0] was unaltered, so it is 0, because zero is the default initial value. register1[1] has value 32, because it was updated by the previous command.

The user could also list all of the created registers associated with a pipeline.
For example, to list all of the registers associated with pipeline ptables, the user would issue the following command:

tc p4template get register/ptables/

Which will output the following:

template obj type register
pipeline name ptables id 22
register name register1

Another option is to check the value of a specific index inside register1, which can be done using the following command:

tc p4template get register/ptables/register1 index 1

Which will output the following:

template obj type register
pipeline name ptables id 22
register name register1
register id 1
container type bit32
register1[1] 32

To delete register1, the user would issue the following command:

tc p4template del register/ptables/register1

Now, to delete all the registers associated with pipeline ptables, the user would issue the following command:

tc p4template del register/ptables/

Co-developed-by: Victor Nogueira
Signed-off-by: Victor Nogueira
Co-developed-by: Pedro Tammela
Signed-off-by: Pedro Tammela
Signed-off-by: Jamal Hadi Salim
---
 include/net/p4tc.h             |  32 ++
 include/uapi/linux/p4tc.h      |  28 ++
 net/sched/p4tc/Makefile        |   2 +-
 net/sched/p4tc/p4tc_pipeline.c |   9 +-
 net/sched/p4tc/p4tc_register.c | 749 +++++++++++++++++++++++++++++++++
 net/sched/p4tc/p4tc_tmpl_api.c |   2 +
 6 files changed, 820 insertions(+), 2 deletions(-)
 create mode 100644 net/sched/p4tc/p4tc_register.c

diff --git a/include/net/p4tc.h b/include/net/p4tc.h
index 9a7942992..d9267b798 100644
--- a/include/net/p4tc.h
+++ b/include/net/p4tc.h
@@ -31,6 +31,7 @@
 #define P4TC_AID_IDX 1
 #define P4TC_PARSEID_IDX 1
 #define P4TC_HDRFIELDID_IDX 2
+#define P4TC_REGID_IDX 1

 #define P4TC_HDRFIELD_IS_VALIDITY_BIT 0x1

@@ -109,6 +110,7 @@ struct p4tc_pipeline {
 	struct idr p_meta_idr;
 	struct idr p_act_idr;
 	struct idr p_tbl_idr;
+	struct idr p_reg_idr;
 	struct rcu_head rcu;
 	struct net *net;
 	struct p4tc_parser *parser;
@@ -395,6 +397,21 @@ struct p4tc_hdrfield {

 extern const struct p4tc_template_ops p4tc_hdrfield_ops;

+struct p4tc_register {
+	struct
p4tc_template_common common; + spinlock_t reg_value_lock; + struct p4tc_type *reg_type; + struct p4tc_type_mask_shift *reg_mask_shift; + void *reg_value; + u32 reg_num_elems; + u32 reg_id; + refcount_t reg_ref; + u16 reg_startbit; /* Relative to its container */ + u16 reg_endbit; /* Relative to its container */ +}; + +extern const struct p4tc_template_ops p4tc_register_ops; + struct p4tc_metadata *tcf_meta_find_byid(struct p4tc_pipeline *pipeline, u32 m_id); void tcf_meta_fill_user_offsets(struct p4tc_pipeline *pipeline); @@ -556,10 +573,25 @@ extern const struct p4tc_act_param_ops param_ops[P4T_MAX + 1]; int generic_dump_param_value(struct sk_buff *skb, struct p4tc_type *type, struct p4tc_act_param *param); +struct p4tc_register *tcf_register_find_byid(struct p4tc_pipeline *pipeline, + const u32 reg_id); +struct p4tc_register *tcf_register_get(struct p4tc_pipeline *pipeline, + const char *regname, const u32 reg_id, + struct netlink_ext_ack *extack); +void tcf_register_put_ref(struct p4tc_register *reg); + +struct p4tc_register *tcf_register_find_byany(struct p4tc_pipeline *pipeline, + const char *regname, + const u32 reg_id, + struct netlink_ext_ack *extack); + +void tcf_register_put_rcu(struct rcu_head *head); + #define to_pipeline(t) ((struct p4tc_pipeline *)t) #define to_meta(t) ((struct p4tc_metadata *)t) #define to_hdrfield(t) ((struct p4tc_hdrfield *)t) #define to_act(t) ((struct p4tc_act *)t) #define to_table(t) ((struct p4tc_table *)t) +#define to_register(t) ((struct p4tc_register *)t) #endif diff --git a/include/uapi/linux/p4tc.h b/include/uapi/linux/p4tc.h index 727fdcfe5..0c5f2943e 100644 --- a/include/uapi/linux/p4tc.h +++ b/include/uapi/linux/p4tc.h @@ -22,6 +22,7 @@ struct p4tcmsg { #define P4TC_MAX_KEYSZ 512 #define HEADER_MAX_LEN 512 #define META_MAX_LEN 512 +#define P4TC_MAX_REGISTER_ELEMS 128 #define P4TC_MAX_KEYSZ 512 @@ -32,6 +33,7 @@ struct p4tcmsg { #define HDRFIELDNAMSIZ TEMPLATENAMSZ #define ACTPARAMNAMSIZ TEMPLATENAMSZ #define TABLENAMSIZ 
TEMPLATENAMSZ +#define REGISTERNAMSIZ TEMPLATENAMSZ #define P4TC_TABLE_FLAGS_KEYSZ 0x01 #define P4TC_TABLE_FLAGS_MAX_ENTRIES 0x02 @@ -120,6 +122,7 @@ enum { P4TC_OBJ_ACT, P4TC_OBJ_TABLE, P4TC_OBJ_TABLE_ENTRY, + P4TC_OBJ_REGISTER, __P4TC_OBJ_MAX, }; #define P4TC_OBJ_MAX __P4TC_OBJ_MAX @@ -353,6 +356,31 @@ enum { P4TC_ENTITY_MAX }; +#define P4TC_REGISTER_FLAGS_DATATYPE 0x1 +#define P4TC_REGISTER_FLAGS_STARTBIT 0x2 +#define P4TC_REGISTER_FLAGS_ENDBIT 0x4 +#define P4TC_REGISTER_FLAGS_NUMELEMS 0x8 +#define P4TC_REGISTER_FLAGS_INDEX 0x10 + +struct p4tc_u_register { + __u32 num_elems; + __u32 datatype; + __u32 index; + __u16 startbit; + __u16 endbit; + __u16 flags; +}; + +/* P4 Register attributes */ +enum { + P4TC_REGISTER_UNSPEC, + P4TC_REGISTER_NAME, /* string */ + P4TC_REGISTER_INFO, /* struct p4tc_u_register */ + P4TC_REGISTER_VALUE, /* value blob */ + __P4TC_REGISTER_MAX +}; +#define P4TC_REGISTER_MAX (__P4TC_REGISTER_MAX - 1) + #define P4TC_RTA(r) \ ((struct rtattr *)(((char *)(r)) + NLMSG_ALIGN(sizeof(struct p4tcmsg)))) diff --git a/net/sched/p4tc/Makefile b/net/sched/p4tc/Makefile index 0d2c20223..b35ced1e3 100644 --- a/net/sched/p4tc/Makefile +++ b/net/sched/p4tc/Makefile @@ -2,4 +2,4 @@ obj-y := p4tc_types.o p4tc_pipeline.o p4tc_tmpl_api.o p4tc_meta.o \ p4tc_parser_api.o p4tc_hdrfield.o p4tc_action.o p4tc_table.o \ - p4tc_tbl_api.o + p4tc_tbl_api.o p4tc_register.o diff --git a/net/sched/p4tc/p4tc_pipeline.c b/net/sched/p4tc/p4tc_pipeline.c index f8fcde20b..9f8433545 100644 --- a/net/sched/p4tc/p4tc_pipeline.c +++ b/net/sched/p4tc/p4tc_pipeline.c @@ -298,6 +298,7 @@ static void tcf_pipeline_destroy(struct p4tc_pipeline *pipeline, idr_destroy(&pipeline->p_meta_idr); idr_destroy(&pipeline->p_act_idr); idr_destroy(&pipeline->p_tbl_idr); + idr_destroy(&pipeline->p_reg_idr); if (free_pipeline) kfree(pipeline); @@ -324,8 +325,9 @@ static int tcf_pipeline_put(struct net *net, struct p4tc_pipeline *pipeline = to_pipeline(template); struct net *pipeline_net = 
maybe_get_net(net); struct p4tc_act_dep_node *act_node, *node_tmp; - unsigned long tbl_id, m_id, tmp; + unsigned long reg_id, tbl_id, m_id, tmp; struct p4tc_metadata *meta; + struct p4tc_register *reg; struct p4tc_table *table; if (!refcount_dec_if_one(&pipeline->p_ctrl_ref)) { @@ -371,6 +373,9 @@ static int tcf_pipeline_put(struct net *net, if (pipeline->parser) tcf_parser_del(net, pipeline, pipeline->parser, extack); + idr_for_each_entry_ul(&pipeline->p_reg_idr, reg, tmp, reg_id) + reg->common.ops->put(net, &reg->common, true, extack); + idr_remove(&pipe_net->pipeline_idr, pipeline->common.p_id); if (pipeline_net) @@ -567,6 +572,8 @@ static struct p4tc_pipeline *tcf_pipeline_create(struct net *net, idr_init(&pipeline->p_meta_idr); pipeline->p_meta_offset = 0; + idr_init(&pipeline->p_reg_idr); + INIT_LIST_HEAD(&pipeline->act_dep_graph); INIT_LIST_HEAD(&pipeline->act_topological_order); pipeline->num_created_acts = 0; diff --git a/net/sched/p4tc/p4tc_register.c b/net/sched/p4tc/p4tc_register.c new file mode 100644 index 000000000..deac38fd2 --- /dev/null +++ b/net/sched/p4tc/p4tc_register.c @@ -0,0 +1,749 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * net/sched/p4tc_register.c P4 TC REGISTER + * + * Copyright (c) 2022, Mojatatu Networks + * Copyright (c) 2022, Intel Corporation.
+ * Authors: Jamal Hadi Salim + * Victor Nogueira + * Pedro Tammela + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static const struct nla_policy p4tc_register_policy[P4TC_REGISTER_MAX + 1] = { + [P4TC_REGISTER_NAME] = { .type = NLA_STRING, .len = REGISTERNAMSIZ }, + [P4TC_REGISTER_INFO] = { + .type = NLA_BINARY, + .len = sizeof(struct p4tc_u_register), + }, + [P4TC_REGISTER_VALUE] = { .type = NLA_BINARY }, +}; + +struct p4tc_register *tcf_register_find_byid(struct p4tc_pipeline *pipeline, + const u32 reg_id) +{ + return idr_find(&pipeline->p_reg_idr, reg_id); +} + +static struct p4tc_register * +tcf_register_find_byname(const char *regname, struct p4tc_pipeline *pipeline) +{ + struct p4tc_register *reg; + unsigned long tmp, id; + + idr_for_each_entry_ul(&pipeline->p_reg_idr, reg, tmp, id) + if (strncmp(reg->common.name, regname, REGISTERNAMSIZ) == 0) + return reg; + + return NULL; +} + +struct p4tc_register *tcf_register_find_byany(struct p4tc_pipeline *pipeline, + const char *regname, + const u32 reg_id, + struct netlink_ext_ack *extack) +{ + struct p4tc_register *reg; + int err; + + if (reg_id) { + reg = tcf_register_find_byid(pipeline, reg_id); + if (!reg) { + NL_SET_ERR_MSG(extack, "Unable to find register by id"); + err = -EINVAL; + goto out; + } + } else { + if (regname) { + reg = tcf_register_find_byname(regname, pipeline); + if (!reg) { + NL_SET_ERR_MSG(extack, + "Register name not found"); + err = -EINVAL; + goto out; + } + } else { + NL_SET_ERR_MSG(extack, + "Must specify register name or id"); + err = -EINVAL; + goto out; + } + } + + return reg; +out: + return ERR_PTR(err); +} + +struct p4tc_register *tcf_register_get(struct p4tc_pipeline *pipeline, + const char *regname, const u32 reg_id, + struct netlink_ext_ack *extack) +{ + struct p4tc_register *reg; + + reg = tcf_register_find_byany(pipeline, regname, reg_id, extack); 
+ if (IS_ERR(reg)) + return reg; + + WARN_ON(!refcount_inc_not_zero(&reg->reg_ref)); + + return reg; +} + +void tcf_register_put_ref(struct p4tc_register *reg) +{ + WARN_ON(!refcount_dec_not_one(&reg->reg_ref)); +} + +static struct p4tc_register * +tcf_register_find_byanyattr(struct p4tc_pipeline *pipeline, + struct nlattr *name_attr, const u32 reg_id, + struct netlink_ext_ack *extack) +{ + char *regname = NULL; + + if (name_attr) + regname = nla_data(name_attr); + + return tcf_register_find_byany(pipeline, regname, reg_id, extack); +} + +static int _tcf_register_fill_nlmsg(struct sk_buff *skb, + struct p4tc_register *reg, + struct p4tc_u_register *parm_arg) +{ + unsigned char *b = nlmsg_get_pos(skb); + struct p4tc_u_register parm = { 0 }; + size_t value_bytesz; + struct nlattr *nest; + void *value; + + if (nla_put_u32(skb, P4TC_PATH, reg->reg_id)) + goto out_nlmsg_trim; + + nest = nla_nest_start(skb, P4TC_PARAMS); + if (!nest) + goto out_nlmsg_trim; + + if (nla_put_string(skb, P4TC_REGISTER_NAME, reg->common.name)) + goto out_nlmsg_trim; + + parm.datatype = reg->reg_type->typeid; + parm.flags |= P4TC_REGISTER_FLAGS_DATATYPE; + if (parm_arg) { + parm.index = parm_arg->index; + parm.flags |= P4TC_REGISTER_FLAGS_INDEX; + } else { + parm.startbit = reg->reg_startbit; + parm.flags |= P4TC_REGISTER_FLAGS_STARTBIT; + parm.endbit = reg->reg_endbit; + parm.flags |= P4TC_REGISTER_FLAGS_ENDBIT; + parm.num_elems = reg->reg_num_elems; + parm.flags |= P4TC_REGISTER_FLAGS_NUMELEMS; + } + + if (nla_put(skb, P4TC_REGISTER_INFO, sizeof(parm), &parm)) + goto out_nlmsg_trim; + + value_bytesz = BITS_TO_BYTES(reg->reg_type->container_bitsz); + spin_lock_bh(&reg->reg_value_lock); + if (parm.flags & P4TC_REGISTER_FLAGS_INDEX) { + value = reg->reg_value + parm.index * value_bytesz; + } else { + value = reg->reg_value; + value_bytesz *= reg->reg_num_elems; + } + + if (nla_put(skb, P4TC_REGISTER_VALUE, value_bytesz, value)) { + spin_unlock_bh(&reg->reg_value_lock); + goto out_nlmsg_trim; + } + 
spin_unlock_bh(&reg->reg_value_lock); + + nla_nest_end(skb, nest); + + return skb->len; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return -1; +} + +static int tcf_register_fill_nlmsg(struct net *net, struct sk_buff *skb, + struct p4tc_template_common *template, + struct netlink_ext_ack *extack) +{ + struct p4tc_register *reg = to_register(template); + + if (_tcf_register_fill_nlmsg(skb, reg, NULL) <= 0) { + NL_SET_ERR_MSG(extack, + "Failed to fill notification attributes for register"); + return -EINVAL; + } + + return 0; +} + +static int _tcf_register_put(struct p4tc_pipeline *pipeline, + struct p4tc_register *reg, + bool unconditional_purge, + struct netlink_ext_ack *extack) +{ + void *value; + + if (!refcount_dec_if_one(&reg->reg_ref) && !unconditional_purge) + return -EBUSY; + + idr_remove(&pipeline->p_reg_idr, reg->reg_id); + + spin_lock_bh(&reg->reg_value_lock); + value = reg->reg_value; + reg->reg_value = NULL; + spin_unlock_bh(&reg->reg_value_lock); + kfree(value); + + if (reg->reg_mask_shift) { + kfree(reg->reg_mask_shift->mask); + kfree(reg->reg_mask_shift); + } + kfree(reg); + + return 0; +} + +static int tcf_register_put(struct net *net, struct p4tc_template_common *tmpl, + bool unconditional_purge, + struct netlink_ext_ack *extack) +{ + struct p4tc_pipeline *pipeline = + tcf_pipeline_find_byid(net, tmpl->p_id); + struct p4tc_register *reg = to_register(tmpl); + int ret; + + ret = _tcf_register_put(pipeline, reg, unconditional_purge, extack); + if (ret < 0) + NL_SET_ERR_MSG(extack, "Unable to delete referenced register"); + + return ret; +} + +static struct p4tc_register *tcf_register_create(struct net *net, + struct nlmsghdr *n, + struct nlattr *nla, u32 reg_id, + struct p4tc_pipeline *pipeline, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[P4TC_REGISTER_MAX + 1]; + struct p4tc_u_register *parm; + struct p4tc_type *datatype; + struct p4tc_register *reg; + int ret; + + ret = nla_parse_nested(tb, P4TC_REGISTER_MAX, nla, p4tc_register_policy, + extack); + 
+ if (ret < 0) + return ERR_PTR(ret); + + reg = kzalloc(sizeof(*reg), GFP_KERNEL); + if (!reg) + return ERR_PTR(-ENOMEM); + + if (!tb[P4TC_REGISTER_NAME]) { + NL_SET_ERR_MSG(extack, "Must specify register name"); + ret = -EINVAL; + goto free_reg; + } + + if (tcf_register_find_byname(nla_data(tb[P4TC_REGISTER_NAME]), pipeline) || + tcf_register_find_byid(pipeline, reg_id)) { + NL_SET_ERR_MSG(extack, "Register already exists"); + ret = -EEXIST; + goto free_reg; + } + + reg->common.p_id = pipeline->common.p_id; + strscpy(reg->common.name, nla_data(tb[P4TC_REGISTER_NAME]), + REGISTERNAMSIZ); + + if (tb[P4TC_REGISTER_INFO]) { + parm = nla_data(tb[P4TC_REGISTER_INFO]); + } else { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "Missing register info"); + goto free_reg; + } + + if (tb[P4TC_REGISTER_VALUE]) { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "Value can't be passed in create"); + goto free_reg; + } + + if (parm->flags & P4TC_REGISTER_FLAGS_INDEX) { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "Index can't be passed in create"); + goto free_reg; + } + + if (parm->flags & P4TC_REGISTER_FLAGS_NUMELEMS) { + if (!parm->num_elems) { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "Num elems can't be zero"); + goto free_reg; + } + + if (parm->num_elems > P4TC_MAX_REGISTER_ELEMS) { + NL_SET_ERR_MSG(extack, + "Number of elements exceeds P4 register maximum"); + ret = -EINVAL; + goto free_reg; + } + } else { + NL_SET_ERR_MSG(extack, "Must specify num elems"); + ret = -EINVAL; + goto free_reg; + } + + if (!(parm->flags & P4TC_REGISTER_FLAGS_STARTBIT) || + !(parm->flags & P4TC_REGISTER_FLAGS_ENDBIT)) { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "Must specify start and endbit"); + goto free_reg; + } + + if (parm->startbit > parm->endbit) { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "startbit > endbit"); + goto free_reg; + } + + if (parm->flags & P4TC_REGISTER_FLAGS_DATATYPE) { + datatype = p4type_find_byid(parm->datatype); + if (!datatype) { + NL_SET_ERR_MSG(extack, + "Invalid data type for 
P4 register"); + ret = -EINVAL; + goto free_reg; + } + reg->reg_type = datatype; + } else { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "Must specify datatype"); + goto free_reg; + } + + if (parm->endbit > datatype->bitsz) { + NL_SET_ERR_MSG(extack, + "Endbit doesn't fit in container datatype"); + ret = -EINVAL; + goto free_reg; + } + reg->reg_startbit = parm->startbit; + reg->reg_endbit = parm->endbit; + + reg->reg_num_elems = parm->num_elems; + + spin_lock_init(&reg->reg_value_lock); + + reg->reg_value = kcalloc(reg->reg_num_elems, + BITS_TO_BYTES(datatype->container_bitsz), + GFP_KERNEL); + if (!reg->reg_value) { + ret = -ENOMEM; + goto free_reg; + } + + if (reg_id) { + reg->reg_id = reg_id; + ret = idr_alloc_u32(&pipeline->p_reg_idr, reg, &reg->reg_id, + reg->reg_id, GFP_KERNEL); + if (ret < 0) { + NL_SET_ERR_MSG(extack, + "Unable to allocate register id"); + goto free_reg_value; + } + } else { + reg->reg_id = 1; + ret = idr_alloc_u32(&pipeline->p_reg_idr, reg, &reg->reg_id, + UINT_MAX, GFP_KERNEL); + if (ret < 0) { + NL_SET_ERR_MSG(extack, + "Unable to allocate register id"); + goto free_reg_value; + } + } + + if (datatype->ops->create_bitops) { + size_t bitsz = reg->reg_endbit - reg->reg_startbit + 1; + struct p4tc_type_mask_shift *mask_shift; + + mask_shift = datatype->ops->create_bitops(bitsz, + reg->reg_startbit, + reg->reg_endbit, + extack); + if (IS_ERR(mask_shift)) { + ret = PTR_ERR(mask_shift); + goto idr_rm; + } + reg->reg_mask_shift = mask_shift; + } + + refcount_set(&reg->reg_ref, 1); + + reg->common.ops = (struct p4tc_template_ops *)&p4tc_register_ops; + + return reg; + +idr_rm: + idr_remove(&pipeline->p_reg_idr, reg->reg_id); + +free_reg_value: + kfree(reg->reg_value); + +free_reg: + kfree(reg); + return ERR_PTR(ret); +} + +static struct p4tc_register *tcf_register_update(struct net *net, + struct nlmsghdr *n, + struct nlattr *nla, u32 reg_id, + struct p4tc_pipeline *pipeline, + struct netlink_ext_ack *extack) +{ + void *user_value = NULL; + struct nlattr 
*tb[P4TC_REGISTER_MAX + 1]; + struct p4tc_u_register *parm; + struct p4tc_type *datatype; + struct p4tc_register *reg; + int ret; + + ret = nla_parse_nested(tb, P4TC_REGISTER_MAX, nla, p4tc_register_policy, + extack); + + if (ret < 0) + return ERR_PTR(ret); + + reg = tcf_register_find_byanyattr(pipeline, tb[P4TC_REGISTER_NAME], + reg_id, extack); + if (IS_ERR(reg)) + return reg; + + if (tb[P4TC_REGISTER_INFO]) { + parm = nla_data(tb[P4TC_REGISTER_INFO]); + } else { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "Missing register info"); + goto err; + } + + datatype = reg->reg_type; + + if (parm->flags & P4TC_REGISTER_FLAGS_NUMELEMS) { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "Can't update register num elems"); + goto err; + } + + if (!(parm->flags & P4TC_REGISTER_FLAGS_STARTBIT) || + !(parm->flags & P4TC_REGISTER_FLAGS_ENDBIT)) { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "Must specify start and endbit"); + goto err; + } + + if (parm->startbit != reg->reg_startbit || + parm->endbit != reg->reg_endbit) { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, + "Start and endbit don't match with register values"); + goto err; + } + + if (!(parm->flags & P4TC_REGISTER_FLAGS_INDEX)) { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "Must specify index"); + goto err; + } + + if (tb[P4TC_REGISTER_VALUE]) { + if (nla_len(tb[P4TC_REGISTER_VALUE]) != + BITS_TO_BYTES(datatype->container_bitsz)) { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, + "Value size differs from register type's container size"); + goto err; + } + user_value = nla_data(tb[P4TC_REGISTER_VALUE]); + } else { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "Missing register value"); + goto err; + } + + if (parm->index >= reg->reg_num_elems) { + ret = -EINVAL; + NL_SET_ERR_MSG(extack, "Register index out of bounds"); + goto err; + } + + if (user_value) { + u64 read_user_value[2] = { 0 }; + size_t type_bytesz; + void *value; + + type_bytesz = BITS_TO_BYTES(datatype->container_bitsz); + + datatype->ops->host_read(datatype, reg->reg_mask_shift, 
+ user_value, read_user_value); + + spin_lock_bh(&reg->reg_value_lock); + value = reg->reg_value + parm->index * type_bytesz; + datatype->ops->host_write(datatype, reg->reg_mask_shift, + read_user_value, value); + spin_unlock_bh(&reg->reg_value_lock); + } + + return reg; + +err: + return ERR_PTR(ret); +} + +static struct p4tc_template_common * +tcf_register_cu(struct net *net, struct nlmsghdr *n, struct nlattr *nla, + struct p4tc_nl_pname *nl_pname, u32 *ids, + struct netlink_ext_ack *extack) +{ + u32 pipeid = ids[P4TC_PID_IDX], reg_id = ids[P4TC_REGID_IDX]; + struct p4tc_pipeline *pipeline; + struct p4tc_register *reg; + + pipeline = tcf_pipeline_find_byany_unsealed(net, nl_pname->data, pipeid, + extack); + if (IS_ERR(pipeline)) + return (void *)pipeline; + + if (n->nlmsg_flags & NLM_F_REPLACE) + reg = tcf_register_update(net, n, nla, reg_id, pipeline, + extack); + else + reg = tcf_register_create(net, n, nla, reg_id, pipeline, + extack); + + if (IS_ERR(reg)) + goto out; + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = reg->common.p_id; + +out: + return (struct p4tc_template_common *)reg; +} + +static int tcf_register_flush(struct sk_buff *skb, + struct p4tc_pipeline *pipeline, + struct netlink_ext_ack *extack) +{ + unsigned char *b = nlmsg_get_pos(skb); + struct p4tc_register *reg; + unsigned long tmp, reg_id; + int ret = 0; + int i = 0; + + if (nla_put_u32(skb, P4TC_PATH, 0)) + goto out_nlmsg_trim; + + if (idr_is_empty(&pipeline->p_reg_idr)) { + NL_SET_ERR_MSG(extack, "There are no registers to flush"); + goto out_nlmsg_trim; + } + + idr_for_each_entry_ul(&pipeline->p_reg_idr, reg, tmp, reg_id) { + if (_tcf_register_put(pipeline, reg, false, extack) < 0) { + ret = -EBUSY; + continue; + } + i++; + } + + nla_put_u32(skb, P4TC_COUNT, i); + + if (ret < 0) { + if (i == 0) { + NL_SET_ERR_MSG(extack, "Unable to flush any register"); + goto out_nlmsg_trim; + } else { + 
NL_SET_ERR_MSG(extack, "Unable to flush all registers"); + } + } + + return i; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return ret; +} + +static int tcf_register_gd(struct net *net, struct sk_buff *skb, + struct nlmsghdr *n, struct nlattr *nla, + struct p4tc_nl_pname *nl_pname, u32 *ids, + struct netlink_ext_ack *extack) +{ + u32 pipeid = ids[P4TC_PID_IDX], reg_id = ids[P4TC_REGID_IDX]; + struct nlattr *tb[P4TC_REGISTER_MAX + 1] = {}; + unsigned char *b = nlmsg_get_pos(skb); + struct p4tc_u_register *parm_arg = NULL; + int ret = 0; + struct p4tc_pipeline *pipeline; + struct p4tc_register *reg; + struct nlattr *attr_info; + + if (n->nlmsg_type == RTM_DELP4TEMPLATE) + pipeline = tcf_pipeline_find_byany_unsealed(net, nl_pname->data, + pipeid, extack); + else + pipeline = tcf_pipeline_find_byany(net, nl_pname->data, pipeid, + extack); + + if (IS_ERR(pipeline)) + return PTR_ERR(pipeline); + + if (nla) { + ret = nla_parse_nested(tb, P4TC_REGISTER_MAX, nla, + p4tc_register_policy, extack); + + if (ret < 0) + return ret; + } + + if (!nl_pname->passed) + strscpy(nl_pname->data, pipeline->common.name, PIPELINENAMSIZ); + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (n->nlmsg_type == RTM_DELP4TEMPLATE && (n->nlmsg_flags & NLM_F_ROOT)) + return tcf_register_flush(skb, pipeline, extack); + + reg = tcf_register_find_byanyattr(pipeline, tb[P4TC_REGISTER_NAME], + reg_id, extack); + if (IS_ERR(reg)) + return PTR_ERR(reg); + + attr_info = tb[P4TC_REGISTER_INFO]; + if (attr_info) { + if (n->nlmsg_type == RTM_DELP4TEMPLATE) { + NL_SET_ERR_MSG(extack, + "Can't pass info attribute in delete"); + return -EINVAL; + } + parm_arg = nla_data(attr_info); + if (!(parm_arg->flags & P4TC_REGISTER_FLAGS_INDEX) || + (parm_arg->flags & ~P4TC_REGISTER_FLAGS_INDEX)) { + NL_SET_ERR_MSG(extack, + "Must specify param index and only param index"); + return -EINVAL; + } + if (parm_arg->index >= reg->reg_num_elems) { + NL_SET_ERR_MSG(extack, "Register index out of 
bounds"); + return -EINVAL; + } + } + if (_tcf_register_fill_nlmsg(skb, reg, parm_arg) < 0) { + NL_SET_ERR_MSG(extack, + "Failed to fill notification attributes for register"); + return -EINVAL; + } + + if (n->nlmsg_type == RTM_DELP4TEMPLATE) { + ret = _tcf_register_put(pipeline, reg, false, extack); + if (ret < 0) { + NL_SET_ERR_MSG(extack, + "Unable to delete referenced register"); + goto out_nlmsg_trim; + } + } + + return 0; + +out_nlmsg_trim: + nlmsg_trim(skb, b); + return ret; +} + +static int tcf_register_dump(struct sk_buff *skb, struct p4tc_dump_ctx *ctx, + struct nlattr *nla, char **p_name, u32 *ids, + struct netlink_ext_ack *extack) +{ + struct net *net = sock_net(skb->sk); + struct p4tc_pipeline *pipeline; + + if (!ctx->ids[P4TC_PID_IDX]) { + pipeline = tcf_pipeline_find_byany(net, *p_name, + ids[P4TC_PID_IDX], extack); + if (IS_ERR(pipeline)) + return PTR_ERR(pipeline); + ctx->ids[P4TC_PID_IDX] = pipeline->common.p_id; + } else { + pipeline = tcf_pipeline_find_byid(net, ctx->ids[P4TC_PID_IDX]); + } + + if (!ids[P4TC_PID_IDX]) + ids[P4TC_PID_IDX] = pipeline->common.p_id; + + if (!(*p_name)) + *p_name = pipeline->common.name; + + return tcf_p4_tmpl_generic_dump(skb, ctx, &pipeline->p_reg_idr, + P4TC_REGID_IDX, extack); +} + +static int tcf_register_dump_1(struct sk_buff *skb, + struct p4tc_template_common *common) +{ + struct nlattr *nest = nla_nest_start(skb, P4TC_PARAMS); + struct p4tc_register *reg = to_register(common); + + if (!nest) + return -ENOMEM; + + if (nla_put_string(skb, P4TC_REGISTER_NAME, reg->common.name)) { + nla_nest_cancel(skb, nest); + return -ENOMEM; + } + + nla_nest_end(skb, nest); + + return 0; +} + +const struct p4tc_template_ops p4tc_register_ops = { + .cu = tcf_register_cu, + .fill_nlmsg = tcf_register_fill_nlmsg, + .gd = tcf_register_gd, + .put = tcf_register_put, + .dump = tcf_register_dump, + .dump_1 = tcf_register_dump_1, +}; diff --git a/net/sched/p4tc/p4tc_tmpl_api.c b/net/sched/p4tc/p4tc_tmpl_api.c index 
2963f6497..5712cfaf8 100644 --- a/net/sched/p4tc/p4tc_tmpl_api.c +++ b/net/sched/p4tc/p4tc_tmpl_api.c @@ -46,6 +46,7 @@ static bool obj_is_valid(u32 obj) case P4TC_OBJ_HDR_FIELD: case P4TC_OBJ_ACT: case P4TC_OBJ_TABLE: + case P4TC_OBJ_REGISTER: return true; default: return false; @@ -58,6 +59,7 @@ static const struct p4tc_template_ops *p4tc_ops[P4TC_OBJ_MAX] = { [P4TC_OBJ_HDR_FIELD] = &p4tc_hdrfield_ops, [P4TC_OBJ_ACT] = &p4tc_act_ops, [P4TC_OBJ_TABLE] = &p4tc_table_ops, + [P4TC_OBJ_REGISTER] = &p4tc_register_ops, }; int tcf_p4_tmpl_generic_dump(struct sk_buff *skb, struct p4tc_dump_ctx *ctx,

From patchwork Tue Jan 24 17:05:09 2023
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Cc: kernel@mojatatu.com, deb.chatterjee@intel.com, anjali.singhai@intel.com, namrata.limaye@intel.com, khalidm@nvidia.com, tom@sipanda.io, pratyush@sipanda.io, jiri@resnulli.us, xiyou.wangcong@gmail.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, vladbu@nvidia.com, simon.horman@corigine.com
Subject: [PATCH net-next RFC 19/20] p4tc: add dynamic action commands
Date: Tue, 24 Jan 2023 12:05:09 -0500
Message-Id: <20230124170510.316970-19-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>
References: <20230124170510.316970-1-jhs@mojatatu.com>
X-Patchwork-Delegate: kuba@kernel.org
X-Patchwork-State: RFC

In this initial patch, we introduce dynamic action commands, which will be used by dynamic actions in P4TC. The operations introduced are: set, act, print, branching (beq, bne, bgt, blt, bge, ble), plus, sub, concat, band, bor, bxor, send_port_egress and mirror_port_egress.

================================SET================================

The set operation allows us to assign values to objects. The assignee operand ("A") can be metadata, a header field, a table key, a dev or a register, whilst the assignor operand ("B") can be metadata, a header field, a table key, a register, a constant, a dev, a param or a result. We'll describe each of these operand types further down the commit message.

The set command has the following syntax:

set A B

Operand A's size must be greater than or equal to operand B's size.
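The size rule above can be sketched in plain C. This is a userspace illustration only, under the assumption that a narrower source is zero-extended into the destination; the function name and types are hypothetical and not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch (not kernel code) of the "set A B" size rule:
 * the assignment is rejected unless A's bit size is greater than or
 * equal to B's, and a narrower B is zero-extended into A.
 */
static int p4_sketch_set(uint64_t *a, unsigned int a_bits,
			 uint64_t b, unsigned int b_bits)
{
	uint64_t mask;

	if (b_bits > a_bits)
		return -1;	/* would be rejected at template-create time */

	/* mask B down to its declared width, then zero-extend into A */
	mask = (b_bits >= 64) ? ~0ULL : ((1ULL << b_bits) - 1);
	*a = b & mask;
	return 0;
}
```

So `set metadata.kernel.skbmark constant.bit16.1234` into a 32-bit destination would succeed, while the reverse direction would be rejected.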
Here are some examples of setting metadata to constants:

Create an action that sets kernel skbmark to decimal 1234:

tc p4template create action/myprog/test actid 1 \
  cmd set metadata.kernel.skbmark constant.bit32.1234

Create an action that sets kernel tcindex to 0x5678:

tc p4template create action/myprog/test actid 1 \
  cmd set metadata.kernel.tcindex constant.bit32.0x5678

Note that we may specify constants in decimal or hexadecimal format.

Here are some examples of setting metadata to metadata:

Create an action that sets skb->hash to skb->mark:

tc p4template create action/myprog/test actid 1 \
  cmd set metadata.kernel.skbhash metadata.kernel.skbmark

Create an action that sets skb->ifindex to skb->iif:

tc p4template create action/myprog/test actid 1 \
  cmd set metadata.kernel.ifindex metadata.kernel.iif

We can also use user-defined metadata in set operations. For example, if we define the following user metadata:

tc p4template create metadata/myprog/mymd type bit32

We could create an action to set its value to skbmark, for example:

tc p4template create action/myprog/test actid 1 \
  cmd set metadata.myprog.mymd metadata.kernel.skbmark

Note that the way to reference user metadata (from the iproute2 perspective) is equivalent to the way we reference kernel metadata. That is:

METADATA.PIPELINE_NAME.METADATA_NAME

All kernel metadata is stored inside a special pipeline called "kernel".

We can also use bit slices in set operations.
For example, if one wanted to create an action to assign the first 16 bits of the user metadata known as "mymd" to kernel metadata tcindex, one would write the following:

tc p4template create action/myprog/test actid 1 \
  cmd set metadata.kernel.tcindex metadata.myprog.mymd[0-15]

If we wanted to write the last 16 bits of user metadata "mymd" to kernel metadata tcindex, we'd issue the following command:

tc p4template create action/myprog/test actid 1 \
  cmd set metadata.kernel.tcindex metadata.myprog.mymd[16-31]

Of course, one could have multiple sets in one action, as such:

tc p4template create action/myprog/swap_ether actid 1 \
  cmd set metadata.myprog.temp hdrfield.myprog.parser1.ethernet.dstAddr \
  cmd set hdrfield.myprog.parser1.ethernet.dstAddr hdrfield.myprog.parser1.ethernet.srcAddr \
  cmd set hdrfield.myprog.parser1.ethernet.srcAddr metadata.myprog.temp

================================ACT================================

The act operation is used to call other actions from dynamic action commands. Note: we can invoke either native kernel actions, such as gact, mirred, etc., or pipeline-defined dynamic actions.

There are two ways to use the act command:
- Create an instance of an action and then call that specific instance
- Specify the action parameters directly in the act command

__Method One__

The basic syntax for the first option is:

act PIPELINE_NAME.ACTION_NAME.INDEX

Where PIPELINE_NAME could be a user-created pipeline or the native "kernel" pipeline. For example, if we wanted to call an instance of a mirred action that mirrors a packet to egress on a specific interface (eth0), then first we create an instance of the action kind and assign it an index as follows:

tc actions add action mirred egress mirror dev eth0 index 1

After that, we can then use it in a command by indicating the appropriate action name and index:

tc p4template create action/myprog/test actid 1 \
  cmd act kernel.mirred.1

Note that we use "kernel" as the pipeline name.
That's because mirred is a native kernel action. We can do the same thing with user-created actions. For example, if we created the following action template:

tc p4template create action/myprog/test actid 1 param param1 type bit32

Add an instance of it:

tc actions add action myprog/test param param1 type bit32 22 index 1

We could call it using the following command:

tc p4template create action/myprog/test actid 12 \
  cmd act myprog.test.1

__Method Two__

The syntax for the second method is:

act ACTION_NAME PARAMS

The second method can only be applied to user-defined actions, and allows us to invoke an action passing its parameters directly in the invocation. So the above example from method one would turn into the following:

tc p4template create action/myprog/test actid 12 \
  cmd act myprog.test constant.bit32.22

================================BRANCHING================================

We have several branch commands: beq (branch-equal), bne (branch-not-equal), bgt (branch-greater-than), blt (branch-less-than), bge (branch-greater-equal), ble (branch-less-equal).

The basic syntax for branching instructions is:

COMPARE-OPERATION A B control CONTROL_IF_TRUE / CONTROL_IF_FALSE

Where COMPARE-OPERATION could be beq, bne, bgt, blt, bge or ble. A is one of: header field, metadata, key or result field (like result.hit or result.miss). B is one of: a constant, header field or metadata. A and B don't need to be the same size and type as long as B's size is smaller than or equal to A's size. Note, inherently this means A and B can't both be constants.
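As a plain-C illustration of these comparisons (hypothetical userspace sketch; none of these names exist in the patch, and operand B is assumed to be zero-extended to A's width before comparing), the taken/not-taken decision reduces to:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the six branch comparisons described above. */
enum sketch_branch_op { SK_BEQ, SK_BNE, SK_BGT, SK_BLT, SK_BGE, SK_BLE };

static int sketch_branch_taken(enum sketch_branch_op op,
			       uint64_t a, uint64_t b)
{
	switch (op) {
	case SK_BEQ: return a == b;	/* branch-equal */
	case SK_BNE: return a != b;	/* branch-not-equal */
	case SK_BGT: return a > b;	/* branch-greater-than */
	case SK_BLT: return a < b;	/* branch-less-than */
	case SK_BGE: return a >= b;	/* branch-greater-equal */
	case SK_BLE: return a <= b;	/* branch-less-equal */
	}
	return 0;
}
```

The first control path is taken when the comparison holds, the second when it does not.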
Let's take a look at some examples:

tc p4template create action/myprog/test actid 1 \
  cmd beq metadata.kernel.skbmark constant.u32.4 control pipe / jump 1 \
  cmd set metadata.kernel.skbmark constant.u32.123 control ok \
  cmd set metadata.kernel.skbidf constant.bit1.0

The above action executes the equivalent of the following pseudo code:

if (metadata.kernel.skbmark == 4)
then
   metadata.kernel.skbmark = 123
else
   metadata.kernel.skbidf = 0
endif

Here is another example, now with bne:

tc p4template create action/myprog/test actid 1 \
  cmd bne metadata.kernel.skbmark constant.u32.4 control pipe / jump else \
  cmd set metadata.kernel.skbmark constant.u32.123 \
  cmd jump endif \
  cmd label else \
  cmd set metadata.kernel.skbidf constant.bit1.0 \
  cmd label endif

Note that in this example we use "labels". These are a more user-friendly alternative to jumps with numbers. What the example action above does is the equivalent of the following pseudo code:

if (metadata.kernel.skbmark != 4)
then
   metadata.kernel.skbmark = 123
else
   metadata.kernel.skbidf = 0
endif

This example is basically the logical opposite of the previous one.

================================PRINT================================

The print operation allows us to print the value of operands for debugging purposes.

The syntax for the print instruction is the following:

print [prefix ACTUAL_PREFIX] operA

Where operA could be a header field, metadata, key, result, register or action param. The "prefix" keyword and the ACTUAL_PREFIX field are optional; ACTUAL_PREFIX is a prefix string that will be printed before operA's value.
Let's first see an example that doesn't use a prefix:

sudo tc p4template create action/myprog/test actid 1 \
  cmd print metadata.kernel.skbmark \
  cmd set metadata.kernel.skbmark constant.u32.123 \
  cmd print metadata.kernel.skbmark

Assuming skb->mark was initially 0, this will print:

kernel.skbmark 0
kernel.skbmark 123

If we wanted to add prefixes to those commands, we could do the following:

sudo tc p4template create action/myprog/test actid 1 \
  cmd print prefix before metadata.kernel.skbmark \
  cmd set metadata.kernel.skbmark constant.u32.123 \
  cmd print prefix after metadata.kernel.skbmark

This will print:

before kernel.skbmark 0
after kernel.skbmark 123

================================PLUS================================

The plus command is used to add two operands.

The basic syntax for the plus command is:

cmd plus operA operB operC

The command will add operands operB and operC and store the result in operA. That is:

operA = operB + operC

operA can be one of: metadatum, header field.
operB and operC can be one of: constant, metadatum, key, header field or param.

The following example will add metadatum mymd from pipeline myprog and constant 16 and store the result in metadatum mymd2 of pipeline myprog:

tc p4template create action/myprog/myfunc \
  cmd plus metadata.myprog.mymd2 metadata.myprog.mymd constant.bit32.16

================================SUB================================

The sub command is used to subtract two operands.

The basic syntax for the sub command is:

cmd sub operA operB operC

The command will subtract operand operC from operand operB and store the result in operA. That is:

operA = operB - operC

operA can be one of: metadatum, header field.
operB and operC can be one of: constant, metadatum, key, header field or param.
The following example will subtract constant 16 from metadatum mymd of pipeline myprog and store the result in metadatum mymd2 of pipeline myprog:

tc p4template create action/myprog/myfunc \
  cmd sub metadata.myprog.mymd2 metadata.myprog.mymd constant.bit32.16

================================CONCAT================================

The concat command is used to concatenate up to 8 operands and save the result to an lvalue.

The basic syntax for the concat command is:

cmd concat operA operB operC [..]

The command will concatenate operands operB and operC (and optionally up to six more) and store the result in operA. It goes without saying that operA's size must be greater than or equal to the sum of (operB's size + operC's size + ... + operI's size).

operA can be one of: metadatum, a key, a header field.
operB .. operI can only be a constant, a metadatum, a key, a header field or a param.

The following example will concatenate metadatum mymd from pipeline myprog with header field tcp.dport and store the result in metadatum mymd2 of pipeline myprog:

tc p4template create action/myprog/myfunc \
  cmd concat \
  metadata.myprog.mymd2 metadata.myprog.mymd hdrfield.myprog.myparser.tcp.dport

================================BAND================================

The band command is used to perform a binary AND operation between two operands.

The basic syntax for the band command is:

cmd band operA operB operC

The command will perform "operB AND operC" and store the result in operA. That is:

operA = operB & operC

operA can be one of: metadatum, header field.
operB and operC can be one of: constant, metadatum, key, header field or param.
The following example will perform an AND operation of constant 16 and metadatum mymd and store the result in metadatum mymd2 of pipeline myprog:

tc p4template create action/myprog/myfunc \
  cmd band metadata.myprog.mymd2 metadata.myprog.mymd constant.bit32.16

================================BOR================================

The bor command is used to perform a binary OR operation between two operands.

The basic syntax for the bor command is:

cmd bor operA operB operC

The command will perform "operB OR operC" and store the result in operA. That is:

operA = operB | operC

operA can be one of: metadatum, header field.
operB and operC can be one of: constant, metadatum, key, header field or param.

The following example will perform an OR operation of constant 16 and metadatum mymd and store the result in metadatum mymd2 of pipeline myprog:

tc p4template create action/myprog/myfunc \
  cmd bor metadata.myprog.mymd2 metadata.myprog.mymd constant.bit32.16

================================BXOR================================

The bxor command is used to perform a binary XOR operation between two operands.

The basic syntax for the bxor command is:

cmd bxor operA operB operC

The command will perform "operB XOR operC" and store the result in operA. That is:

operA = operB ^ operC

operA can be one of: metadatum, header field.
operB and operC can be one of: constant, metadatum, key, header field or param.

The following example will perform a XOR operation of constant 16 and metadatum mymd and store the result in metadatum mymd2 of pipeline myprog:

tc p4template create action/myprog/myfunc \
  cmd bxor metadata.myprog.mymd2 metadata.myprog.mymd constant.bit32.16

===============================SND PORT EGRESS===============================

The send_port_egress command sends the received packet to a specific network interface device.

The syntax of the command is:

cmd send_port_egress operA

operA must be of type dev, that is, a network interface device, which exists and is up.
The following example uses send_port_egress to send a packet to port eth0. Note that no other action can run after send_port_egress.

tc p4template create action/myprog/myfunc \
  cmd send_port_egress dev.eth0

===============================MIRPORTEGRESS===============================

The mirror_port_egress command mirrors the received packet to a specific network interface device.

The syntax of the command is:

cmd mirror_port_egress operA

operA must be of type dev, that is, a network interface device, which exists and is up.

The following example uses mirror_port_egress to mirror a packet to port eth0. Note that the semantics of mirror here mean that we are cloning the packet and sending the clone to the specified network interface; this command won't edit or change the course of the original packet.

tc p4template create action/myprog/myfunc \
  cmd mirror_port_egress dev.eth0

Co-developed-by: Victor Nogueira
Signed-off-by: Victor Nogueira
Co-developed-by: Pedro Tammela
Signed-off-by: Pedro Tammela
Co-developed-by: Evangelos Haleplidis
Signed-off-by: Evangelos Haleplidis
Signed-off-by: Jamal Hadi Salim
---
 include/net/p4tc.h           |   68 +
 include/uapi/linux/p4tc.h    |  123 ++
 net/sched/p4tc/Makefile      |    2 +-
 net/sched/p4tc/p4tc_action.c |   89 +-
 net/sched/p4tc/p4tc_cmds.c   | 3492 ++++++++++++++++++++++++++++++++++
 net/sched/p4tc/p4tc_meta.c   |   65 +
 6 files changed, 3835 insertions(+), 4 deletions(-)
 create mode 100644 net/sched/p4tc/p4tc_cmds.c

diff --git a/include/net/p4tc.h b/include/net/p4tc.h index d9267b798..164cb3c5d 100644 --- a/include/net/p4tc.h +++ b/include/net/p4tc.h @@ -594,4 +594,72 @@ void tcf_register_put_rcu(struct rcu_head *head); #define to_table(t) ((struct p4tc_table *)t) #define to_register(t) ((struct p4tc_register *)t) +/* P4TC COMMANDS */ +int p4tc_cmds_parse(struct net *net, struct p4tc_act *act, struct nlattr *nla, + bool ovr, struct netlink_ext_ack *extack); +int p4tc_cmds_copy(struct p4tc_act *act, struct list_head *new_cmd_operations, + bool
delete_old, struct netlink_ext_ack *extack); + +int p4tc_cmds_fillup(struct sk_buff *skb, struct list_head *meta_ops); +void p4tc_cmds_release_ope_list(struct net *net, struct list_head *entries, + bool called_from_template); +struct p4tc_cmd_operand; +int p4tc_cmds_fill_operand(struct sk_buff *skb, struct p4tc_cmd_operand *kopnd); + +struct p4tc_cmd_operate { + struct list_head cmd_operations; + struct list_head operands_list; + struct p4tc_cmd_s *cmd; + char *label1; + char *label2; + u32 num_opnds; + u32 ctl1; + u32 ctl2; + u16 op_id; /* P4TC_CMD_OP_XXX */ + u32 cmd_offset; + u8 op_flags; + u8 op_cnt; +}; + +struct tcf_p4act; +struct p4tc_cmd_operand { + struct list_head oper_list_node; + void *(*fetch)(struct sk_buff *skb, struct p4tc_cmd_operand *op, + struct tcf_p4act *cmd, struct tcf_result *res); + struct p4tc_type *oper_datatype; /* what is stored in path_or_value - P4T_XXX */ + struct p4tc_type_mask_shift *oper_mask_shift; + struct tc_action *action; + void *path_or_value; + void *path_or_value_extra; + void *print_prefix; + void *priv; + u64 immedv_large[BITS_TO_U64(P4T_MAX_BITSZ)]; + u32 immedv; /* one of: immediate value, metadata id, action id */ + u32 immedv2; /* one of: action instance */ + u32 path_or_value_sz; + u32 path_or_value_extra_sz; + u32 print_prefix_sz; + u32 immedv_large_sz; + u32 pipeid; /* 0 for kernel */ + u8 oper_type; /* P4TC_CMD_OPER_XXX */ + u8 oper_cbitsize; /* based on P4T_XXX container size */ + u8 oper_bitsize; /* diff between oper_bitend - oper_bitstart */ + u8 oper_bitstart; + u8 oper_bitend; + u8 oper_flags; /* TBA: DATA_IS_IMMEDIATE */ +}; + +struct p4tc_cmd_s { + int cmdid; + u32 num_opnds; + int (*validate_operands)(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, u32 cmd_num_opns, + struct netlink_ext_ack *extack); + void (*free_operation)(struct net *net, struct p4tc_cmd_operate *op, + bool called_for_instance, + struct netlink_ext_ack *extack); + int (*run)(struct sk_buff *skb, struct p4tc_cmd_operate
*op, + struct tcf_p4act *cmd, struct tcf_result *res); +}; + #endif diff --git a/include/uapi/linux/p4tc.h b/include/uapi/linux/p4tc.h index 0c5f2943e..e80f93276 100644 --- a/include/uapi/linux/p4tc.h +++ b/include/uapi/linux/p4tc.h @@ -384,4 +384,127 @@ enum { #define P4TC_RTA(r) \ ((struct rtattr *)(((char *)(r)) + NLMSG_ALIGN(sizeof(struct p4tcmsg)))) +/* P4TC COMMANDS */ + +/* Operations */ +enum { + P4TC_CMD_OP_UNSPEC, + P4TC_CMD_OP_SET, + P4TC_CMD_OP_ACT, + P4TC_CMD_OP_BEQ, + P4TC_CMD_OP_BNE, + P4TC_CMD_OP_BLT, + P4TC_CMD_OP_BLE, + P4TC_CMD_OP_BGT, + P4TC_CMD_OP_BGE, + P4TC_CMD_OP_PLUS, + P4TC_CMD_OP_PRINT, + P4TC_CMD_OP_TBLAPP, + P4TC_CMD_OP_SNDPORTEGR, + P4TC_CMD_OP_MIRPORTEGR, + P4TC_CMD_OP_SUB, + P4TC_CMD_OP_CONCAT, + P4TC_CMD_OP_BAND, + P4TC_CMD_OP_BOR, + P4TC_CMD_OP_BXOR, + P4TC_CMD_OP_LABEL, + P4TC_CMD_OP_JUMP, + __P4TC_CMD_OP_MAX +}; +#define P4TC_CMD_OP_MAX (__P4TC_CMD_OP_MAX - 1) + +#define P4TC_CMD_OPERS_MAX 9 + +/* single operation within P4TC_ACT_CMDS_LIST */ +enum { + P4TC_CMD_UNSPEC, + P4TC_CMD_OPERATION, /*struct p4tc_u_operate */ + P4TC_CMD_OPER_LIST, /*nested P4TC_CMD_OPER_XXX list */ + P4TC_CMD_OPER_LABEL1, + P4TC_CMD_OPER_LABEL2, + __P4TC_CMD_OPER_MAX +}; +#define P4TC_CMD_OPER_MAX (__P4TC_CMD_OPER_MAX - 1) + +enum { + P4TC_CMD_OPER_A, + P4TC_CMD_OPER_B, + P4TC_CMD_OPER_C, + P4TC_CMD_OPER_D, + P4TC_CMD_OPER_E, + P4TC_CMD_OPER_F, + P4TC_CMD_OPER_G, + P4TC_CMD_OPER_H, + P4TC_CMD_OPER_I, +}; + +#define P4TC_CMDS_RESULTS_HIT 1 +#define P4TC_CMDS_RESULTS_MISS 2 + +/* P4TC_CMD_OPERATION */ +struct p4tc_u_operate { + __u16 op_type; /* P4TC_CMD_OP_XXX */ + __u8 op_flags; + __u8 op_UNUSED; + __u32 op_ctl1; + __u32 op_ctl2; +}; + +/* Nested P4TC_CMD_OPER_XXX */ +enum { + P4TC_CMD_OPND_UNSPEC, + P4TC_CMD_OPND_INFO, + P4TC_CMD_OPND_PATH, + P4TC_CMD_OPND_PATH_EXTRA, + P4TC_CMD_OPND_LARGE_CONSTANT, + P4TC_CMD_OPND_PREFIX, + __P4TC_CMD_OPND_MAX +}; +#define P4TC_CMD_OPND_MAX (__P4TC_CMD_OPND_MAX - 1) + +/* operand types */ +enum { + P4TC_OPER_UNSPEC, + 
P4TC_OPER_CONST, + P4TC_OPER_META, + P4TC_OPER_ACTID, + P4TC_OPER_TBL, + P4TC_OPER_KEY, + P4TC_OPER_RES, + P4TC_OPER_HDRFIELD, + P4TC_OPER_PARAM, + P4TC_OPER_DEV, + P4TC_OPER_REG, + P4TC_OPER_LABEL, + __P4TC_OPER_MAX }; +#define P4TC_OPER_MAX (__P4TC_OPER_MAX - 1) + +#define P4TC_CMD_MAX_OPER_PATH_LEN 32 + +/* P4TC_CMD_OPER_INFO operand */ +struct p4tc_u_operand { + __u32 immedv; /* immediate value */ + __u32 immedv2; + __u32 pipeid; /* 0 for kernel-global */ + __u8 oper_type; /* P4TC_OPER_XXX */ + __u8 oper_datatype; /* T_XXX */ + __u8 oper_cbitsize; /* Size of container, u8 = 8, etc + * Useful for a type that is not atomic + */ + __u8 oper_startbit; + __u8 oper_endbit; + __u8 oper_flags; +}; + +/* operand flags */ +#define DATA_IS_IMMEDIATE (BIT(0)) /* data is held as immediate value */ +#define DATA_IS_RAW (BIT(1)) /* bitXX datatype, not interpreted by kernel */ +#define DATA_IS_SLICE (BIT(2)) /* bitslice in a container, not interpreted + * by kernel + */ +#define DATA_USES_ROOT_PIPE (BIT(3)) +#define DATA_HAS_TYPE_INFO (BIT(4)) +#define DATA_IS_READ_ONLY (BIT(5)) + #endif diff --git a/net/sched/p4tc/Makefile b/net/sched/p4tc/Makefile index b35ced1e3..396fcd249 100644 --- a/net/sched/p4tc/Makefile +++ b/net/sched/p4tc/Makefile @@ -2,4 +2,4 @@ obj-y := p4tc_types.o p4tc_pipeline.o p4tc_tmpl_api.o p4tc_meta.o \ p4tc_parser_api.o p4tc_hdrfield.o p4tc_action.o p4tc_table.o \ - p4tc_tbl_api.o p4tc_register.o + p4tc_tbl_api.o p4tc_register.o p4tc_cmds.o diff --git a/net/sched/p4tc/p4tc_action.c b/net/sched/p4tc/p4tc_action.c index f47b42bbe..f40acdc5a 100644 --- a/net/sched/p4tc/p4tc_action.c +++ b/net/sched/p4tc/p4tc_action.c @@ -147,7 +147,7 @@ static int __tcf_p4_dyna_init_set(struct p4tc_act *act, struct tc_action **a, { struct tcf_p4act_params *params_old; struct tcf_p4act *p; - int err = 0; + int err; p = to_p4act(*a); @@ -156,6 +156,14 @@ static int __tcf_p4_dyna_init_set(struct p4tc_act *act, struct tc_action **a, goto_ch = tcf_action_set_ctrlact(*a,
parm->action, goto_ch); + err = p4tc_cmds_copy(act, &p->cmd_operations, exists, extack); + if (err < 0) { + if (exists) + spin_unlock_bh(&p->tcf_lock); + + return err; + } + params_old = rcu_replace_pointer(p->params, params, 1); if (exists) spin_unlock_bh(&p->tcf_lock); @@ -358,9 +366,15 @@ static int dev_dump_param_value(struct sk_buff *skb, nest = nla_nest_start(skb, P4TC_ACT_PARAMS_VALUE); if (param->flags & P4TC_ACT_PARAM_FLAGS_ISDYN) { + struct p4tc_cmd_operand *kopnd; struct nlattr *nla_opnd; nla_opnd = nla_nest_start(skb, P4TC_ACT_PARAMS_VALUE_OPND); + kopnd = param->value; + if (p4tc_cmds_fill_operand(skb, kopnd) < 0) { + ret = -1; + goto out_nla_cancel; + } nla_nest_end(skb, nla_opnd); } else { const u32 *ifindex = param->value; @@ -557,10 +571,48 @@ static int tcf_p4_dyna_act(struct sk_buff *skb, const struct tc_action *a, { struct tcf_p4act *dynact = to_p4act(a); int ret = 0; + int jmp_cnt = 0; + struct p4tc_cmd_operate *op; tcf_lastuse_update(&dynact->tcf_tm); tcf_action_update_bstats(&dynact->common, skb); + /* We only need this lock because the operand's that are action + * parameters will be assigned at run-time, and thus will cause a write + * operation in the data path. If we had this structure as per-cpu, we'd + * possibly be able to get rid of this lock. 
+ */ + lockdep_off(); + spin_lock(&dynact->tcf_lock); + list_for_each_entry(op, &dynact->cmd_operations, cmd_operations) { + if (jmp_cnt-- > 0) + continue; + + if (op->op_id == P4TC_CMD_OP_LABEL) { + ret = TC_ACT_PIPE; + continue; + } + + ret = op->cmd->run(skb, op, dynact, res); + if (TC_ACT_EXT_CMP(ret, TC_ACT_JUMP)) { + jmp_cnt = ret & TC_ACT_EXT_VAL_MASK; + continue; + } else if (ret != TC_ACT_PIPE) { + break; + } + } + spin_unlock(&dynact->tcf_lock); + lockdep_on(); + + if (ret == TC_ACT_SHOT) + tcf_action_inc_drop_qstats(&dynact->common); + + if (ret == TC_ACT_STOLEN || ret == TC_ACT_TRAP) + ret = TC_ACT_CONSUMED; + + if (ret == TC_ACT_OK) + ret = dynact->tcf_action; + return ret; } @@ -589,6 +641,8 @@ static int tcf_p4_dyna_dump(struct sk_buff *skb, struct tc_action *a, int bind, goto nla_put_failure; nest = nla_nest_start(skb, P4TC_ACT_CMDS_LIST); + if (p4tc_cmds_fillup(skb, &dynact->cmd_operations)) + goto nla_put_failure; nla_nest_end(skb, nest); if (nla_put_string(skb, P4TC_ACT_NAME, a->ops->kind)) @@ -688,6 +742,7 @@ static void tcf_p4_dyna_cleanup(struct tc_action *a) refcount_dec(&ops->dyn_ref); spin_lock_bh(&m->tcf_lock); + p4tc_cmds_release_ope_list(NULL, &m->cmd_operations, false); if (params) call_rcu(&params->rcu, tcf_p4_act_params_destroy_rcu); spin_unlock_bh(&m->tcf_lock); @@ -702,9 +757,13 @@ int generic_dump_param_value(struct sk_buff *skb, struct p4tc_type *type, nla_value = nla_nest_start(skb, P4TC_ACT_PARAMS_VALUE); if (param->flags & P4TC_ACT_PARAM_FLAGS_ISDYN) { + struct p4tc_cmd_operand *kopnd; struct nlattr *nla_opnd; nla_opnd = nla_nest_start(skb, P4TC_ACT_PARAMS_VALUE_OPND); + kopnd = param->value; + if (p4tc_cmds_fill_operand(skb, kopnd) < 0) + goto out_nlmsg_trim; nla_nest_end(skb, nla_opnd); } else { if (nla_put(skb, P4TC_ACT_PARAMS_VALUE_RAW, bytesz, @@ -1279,6 +1338,8 @@ static int __tcf_act_put(struct net *net, struct p4tc_pipeline *pipeline, kfree(act_param); } + p4tc_cmds_release_ope_list(net, &act->cmd_operations, true); + ret =
__tcf_unregister_action(&act->ops); if (ret < 0) { NL_SET_ERR_MSG(extack, @@ -1352,6 +1413,8 @@ static int _tcf_act_fill_nlmsg(struct net *net, struct sk_buff *skb, nla_nest_end(skb, parms); cmds = nla_nest_start(skb, P4TC_ACT_CMDS_LIST); + if (p4tc_cmds_fillup(skb, &act->cmd_operations)) + goto out_nlmsg_trim; nla_nest_end(skb, cmds); nla_nest_end(skb, nest); @@ -1606,13 +1669,19 @@ static struct p4tc_act *tcf_act_create(struct net *net, struct nlattr **tb, INIT_LIST_HEAD(&act->cmd_operations); act->pipeline = pipeline; + if (tb[P4TC_ACT_CMDS_LIST]) { + ret = p4tc_cmds_parse(net, act, tb[P4TC_ACT_CMDS_LIST], false, + extack); + if (ret < 0) + goto uninit; + } pipeline->num_created_acts++; ret = determine_act_topological_order(pipeline, true); if (ret < 0) { pipeline->num_created_acts--; - goto uninit; + goto release_cmds; } act->common.p_id = pipeline->common.p_id; @@ -1626,6 +1695,10 @@ static struct p4tc_act *tcf_act_create(struct net *net, struct nlattr **tb, return act; +release_cmds: + if (tb[P4TC_ACT_CMDS_LIST]) + p4tc_cmds_release_ope_list(net, &act->cmd_operations, false); + uninit: p4_put_many_params(&act->params_idr, params, num_params); idr_destroy(&act->params_idr); @@ -1704,14 +1777,22 @@ static struct p4tc_act *tcf_act_update(struct net *net, struct nlattr **tb, } if (tb[P4TC_ACT_CMDS_LIST]) { - ret = determine_act_topological_order(pipeline, true); + ret = p4tc_cmds_parse(net, act, tb[P4TC_ACT_CMDS_LIST], true, + extack); if (ret < 0) goto params_del; + + ret = determine_act_topological_order(pipeline, true); + if (ret < 0) + goto release_cmds; } p4tc_params_replace_many(&act->params_idr, params, num_params); return act; +release_cmds: + p4tc_cmds_release_ope_list(net, &act->cmd_operations, false); + params_del: p4_put_many_params(&act->params_idr, params, num_params); @@ -1799,6 +1880,8 @@ static int tcf_act_dump_1(struct sk_buff *skb, goto out_nlmsg_trim; nest = nla_nest_start(skb, P4TC_ACT_CMDS_LIST); + if (p4tc_cmds_fillup(skb, 
&act->cmd_operations)) + goto out_nlmsg_trim; nla_nest_end(skb, nest); if (nla_put_u8(skb, P4TC_ACT_ACTIVE, act->active)) diff --git a/net/sched/p4tc/p4tc_cmds.c b/net/sched/p4tc/p4tc_cmds.c new file mode 100644 index 000000000..85496ee75 --- /dev/null +++ b/net/sched/p4tc/p4tc_cmds.c @@ -0,0 +1,3492 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * net/sched/p4tc_cmds.c - P4 TC cmds + * Copyright (c) 2022, Mojatatu Networks + * Copyright (c) 2022, Intel Corporation. + * Authors: Jamal Hadi Salim + * Victor Nogueira + * Pedro Tammela + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +#include + +#define GET_OPA(operands_list) \ + (list_first_entry(operands_list, struct p4tc_cmd_operand, \ + oper_list_node)) + +#define GET_OPB(operands_list) \ + (list_next_entry(GET_OPA(operands_list), oper_list_node)) + +#define GET_OPC(operands_list) \ + (list_next_entry(GET_OPB(operands_list), oper_list_node)) + +#define P4TC_FETCH_DECLARE(fname) \ + static void *(fname)(struct sk_buff *skb, struct p4tc_cmd_operand *op, \ + struct tcf_p4act *cmd, struct tcf_result *res) + +P4TC_FETCH_DECLARE(p4tc_fetch_metadata); +P4TC_FETCH_DECLARE(p4tc_fetch_constant); +P4TC_FETCH_DECLARE(p4tc_fetch_key); +P4TC_FETCH_DECLARE(p4tc_fetch_table); +P4TC_FETCH_DECLARE(p4tc_fetch_result); +P4TC_FETCH_DECLARE(p4tc_fetch_hdrfield); +P4TC_FETCH_DECLARE(p4tc_fetch_param); +P4TC_FETCH_DECLARE(p4tc_fetch_dev); +P4TC_FETCH_DECLARE(p4tc_fetch_reg); + +#define P4TC_CMD_DECLARE(fname) \ + static int fname(struct sk_buff *skb, struct p4tc_cmd_operate *op, \ + struct tcf_p4act *cmd, struct tcf_result *res) + +P4TC_CMD_DECLARE(p4tc_cmd_SET); +P4TC_CMD_DECLARE(p4tc_cmd_ACT); +P4TC_CMD_DECLARE(p4tc_cmd_PRINT); +P4TC_CMD_DECLARE(p4tc_cmd_TBLAPP); +P4TC_CMD_DECLARE(p4tc_cmd_SNDPORTEGR); +P4TC_CMD_DECLARE(p4tc_cmd_MIRPORTEGR); +P4TC_CMD_DECLARE(p4tc_cmd_PLUS); 
+P4TC_CMD_DECLARE(p4tc_cmd_SUB);
+P4TC_CMD_DECLARE(p4tc_cmd_CONCAT);
+P4TC_CMD_DECLARE(p4tc_cmd_BAND);
+P4TC_CMD_DECLARE(p4tc_cmd_BOR);
+P4TC_CMD_DECLARE(p4tc_cmd_BXOR);
+P4TC_CMD_DECLARE(p4tc_cmd_JUMP);
+
+static void kfree_opentry(struct net *net, struct p4tc_cmd_operate *ope,
+			  bool called_from_template)
+{
+	if (!ope)
+		return;
+
+	ope->cmd->free_operation(net, ope, called_from_template, NULL);
+}
+
+static void copy_k2u_operand(struct p4tc_cmd_operand *k,
+			     struct p4tc_u_operand *u)
+{
+	u->pipeid = k->pipeid;
+	u->immedv = k->immedv;
+	u->immedv2 = k->immedv2;
+	u->oper_type = k->oper_type;
+	u->oper_datatype = k->oper_datatype->typeid;
+	u->oper_cbitsize = k->oper_cbitsize;
+	u->oper_startbit = k->oper_bitstart;
+	u->oper_endbit = k->oper_bitend;
+	u->oper_flags = k->oper_flags;
+}
+
+static int copy_u2k_operand(struct p4tc_u_operand *uopnd,
+			    struct p4tc_cmd_operand *kopnd,
+			    struct netlink_ext_ack *extack)
+{
+	struct p4tc_type *type;
+
+	type = p4type_find_byid(uopnd->oper_datatype);
+	if (uopnd->oper_flags & DATA_HAS_TYPE_INFO && !type) {
+		NL_SET_ERR_MSG_MOD(extack, "Invalid operand type");
+		return -EINVAL;
+	}
+
+	kopnd->pipeid = uopnd->pipeid;
+	kopnd->immedv = uopnd->immedv;
+	kopnd->immedv2 = uopnd->immedv2;
+	kopnd->oper_type = uopnd->oper_type;
+	kopnd->oper_datatype = type;
+	kopnd->oper_cbitsize = uopnd->oper_cbitsize;
+	kopnd->oper_bitstart = uopnd->oper_startbit;
+	kopnd->oper_bitend = uopnd->oper_endbit;
+	kopnd->oper_bitsize = 1 + kopnd->oper_bitend - kopnd->oper_bitstart;
+	kopnd->oper_flags = uopnd->oper_flags;
+
+	return 0;
+}
+
+int p4tc_cmds_fill_operand(struct sk_buff *skb, struct p4tc_cmd_operand *kopnd)
+{
+	unsigned char *b = nlmsg_get_pos(skb);
+	struct p4tc_u_operand oper = { 0 };
+	u32 plen;
+
+	copy_k2u_operand(kopnd, &oper);
+	if (nla_put(skb, P4TC_CMD_OPND_INFO, sizeof(struct p4tc_u_operand),
+		    &oper))
+		goto nla_put_failure;
+
+	if (kopnd->path_or_value &&
+	    nla_put_string(skb, P4TC_CMD_OPND_PATH, kopnd->path_or_value))
+
goto nla_put_failure; + + if (kopnd->path_or_value_extra && + nla_put_string(skb, P4TC_CMD_OPND_PATH_EXTRA, + kopnd->path_or_value_extra)) + goto nla_put_failure; + + if (kopnd->print_prefix && + nla_put_string(skb, P4TC_CMD_OPND_PREFIX, kopnd->print_prefix)) + goto nla_put_failure; + + plen = kopnd->immedv_large_sz; + + if (plen && nla_put(skb, P4TC_CMD_OPND_LARGE_CONSTANT, plen, + kopnd->immedv_large)) + goto nla_put_failure; + + return skb->len; + +nla_put_failure: + nlmsg_trim(skb, b); + return -1; +} + +static int p4tc_cmds_fill_operands_list(struct sk_buff *skb, + struct list_head *operands_list) +{ + unsigned char *b = nlmsg_get_pos(skb); + int i = 1; + struct p4tc_cmd_operand *cursor; + struct nlattr *nest_count; + + list_for_each_entry(cursor, operands_list, oper_list_node) { + nest_count = nla_nest_start(skb, i); + + if (p4tc_cmds_fill_operand(skb, cursor) < 0) + goto nla_put_failure; + + nla_nest_end(skb, nest_count); + i++; + } + + return skb->len; + +nla_put_failure: + nlmsg_trim(skb, b); + return -1; +} + +/* under spin lock */ +int p4tc_cmds_fillup(struct sk_buff *skb, struct list_head *cmd_operations) +{ + unsigned char *b = nlmsg_get_pos(skb); + struct p4tc_u_operate op = {}; + int i = 1; + struct nlattr *nest_op, *nest_opnds; + struct p4tc_cmd_operate *entry; + int err; + + list_for_each_entry(entry, cmd_operations, cmd_operations) { + nest_op = nla_nest_start(skb, i); + + op.op_type = entry->op_id; + op.op_flags = entry->op_flags; + op.op_ctl1 = entry->ctl1; + op.op_ctl2 = entry->ctl2; + if (nla_put(skb, P4TC_CMD_OPERATION, + sizeof(struct p4tc_u_operate), &op)) + goto nla_put_failure; + + if (!list_empty(&entry->operands_list)) { + nest_opnds = nla_nest_start(skb, P4TC_CMD_OPER_LIST); + err = p4tc_cmds_fill_operands_list(skb, + &entry->operands_list); + if (err < 0) + goto nla_put_failure; + nla_nest_end(skb, nest_opnds); + } + + nla_nest_end(skb, nest_op); + i++; + } + + return 0; + +nla_put_failure: + nlmsg_trim(skb, b); + return -1; +} + 
+void p4tc_cmds_release_ope_list(struct net *net, struct list_head *entries,
+				bool called_from_template)
+{
+	struct p4tc_cmd_operate *entry, *e;
+
+	list_for_each_entry_safe(entry, e, entries, cmd_operations) {
+		list_del(&entry->cmd_operations);
+		kfree_opentry(net, entry, called_from_template);
+	}
+}
+
+static void kfree_tmp_oplist(struct net *net, struct p4tc_cmd_operate *oplist[],
+			     bool called_from_template)
+{
+	int i = 0;
+	struct p4tc_cmd_operate *ope;
+
+	for (i = 0; i < P4TC_CMDS_LIST_MAX; i++) {
+		ope = oplist[i];
+		if (!ope)
+			continue;
+
+		kfree_opentry(net, ope, called_from_template);
+	}
+}
+
+static int validate_metadata_operand(struct p4tc_cmd_operand *kopnd,
+				     struct p4tc_type *container_type,
+				     struct netlink_ext_ack *extack)
+{
+	struct p4tc_type_ops *type_ops = container_type->ops;
+	int err;
+
+	if (kopnd->oper_cbitsize < kopnd->oper_bitsize) {
+		NL_SET_ERR_MSG_MOD(extack, "bitsize has to be <= cbitsize");
+		return -EINVAL;
+	}
+
+	if (type_ops->validate_p4t) {
+		if (kopnd->oper_type == P4TC_OPER_CONST) {
+			if (kopnd->oper_flags & DATA_IS_IMMEDIATE)
+				err = type_ops->validate_p4t(container_type,
+							     &kopnd->immedv,
+							     kopnd->oper_bitstart,
+							     kopnd->oper_bitend,
+							     extack);
+			else
+				err = type_ops->validate_p4t(container_type,
+							     kopnd->immedv_large,
+							     kopnd->oper_bitstart,
+							     kopnd->oper_bitend,
+							     extack);
+		} else {
+			err = type_ops->validate_p4t(container_type, NULL,
+						     kopnd->oper_bitstart,
+						     kopnd->oper_bitend,
+						     extack);
+		}
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int validate_table_operand(struct p4tc_act *act,
+				  struct p4tc_cmd_operand *kopnd,
+				  struct netlink_ext_ack *extack)
+{
+	struct p4tc_table *table;
+
+	table = tcf_table_get(act->pipeline, (const char *)kopnd->path_or_value,
+			      kopnd->immedv, extack);
+	if (IS_ERR(table))
+		return PTR_ERR(table);
+
+	kopnd->priv = table;
+
+	return 0;
+}
+
+static int validate_key_operand(struct p4tc_act *act,
+				struct p4tc_cmd_operand *kopnd,
+				struct netlink_ext_ack *extack)
+{
+
struct p4tc_type *t = kopnd->oper_datatype; + struct p4tc_table *table; + + kopnd->pipeid = act->pipeline->common.p_id; + + table = tcf_table_get(act->pipeline, (const char *)kopnd->path_or_value, + kopnd->immedv, extack); + if (IS_ERR(table)) + return PTR_ERR(table); + kopnd->immedv = table->tbl_id; + + if (kopnd->oper_flags & DATA_HAS_TYPE_INFO) { + if (kopnd->oper_bitstart != 0) { + NL_SET_ERR_MSG_MOD(extack, "Key bitstart must be zero"); + return -EINVAL; + } + + if (t->typeid != P4T_KEY) { + NL_SET_ERR_MSG_MOD(extack, "Key type must be key"); + return -EINVAL; + } + + if (table->tbl_keysz != kopnd->oper_bitsize) { + NL_SET_ERR_MSG_MOD(extack, + "Type size doesn't match table keysz"); + return -EINVAL; + } + + t->bitsz = kopnd->oper_bitsize; + } else { + t = p4type_find_byid(P4T_KEY); + if (!t) + return -EINVAL; + + kopnd->oper_bitstart = 0; + kopnd->oper_bitend = table->tbl_keysz - 1; + kopnd->oper_bitsize = table->tbl_keysz; + kopnd->oper_datatype = t; + } + + return 0; +} + +static int validate_hdrfield_operand_type(struct p4tc_cmd_operand *kopnd, + struct p4tc_hdrfield *hdrfield, + struct netlink_ext_ack *extack) +{ + if (hdrfield->startbit != kopnd->oper_bitstart || + hdrfield->endbit != kopnd->oper_bitend || + hdrfield->datatype != kopnd->oper_datatype->typeid) { + NL_SET_ERR_MSG_MOD(extack, "Header field type mismatch"); + return -EINVAL; + } + + return 0; +} + +static int validate_hdrfield_operand(struct p4tc_act *act, + struct p4tc_cmd_operand *kopnd, + struct netlink_ext_ack *extack) +{ + struct p4tc_hdrfield *hdrfield; + struct p4tc_parser *parser; + struct p4tc_type *typ; + + kopnd->pipeid = act->pipeline->common.p_id; + + parser = tcf_parser_find_byany(act->pipeline, + (const char *)kopnd->path_or_value, + kopnd->immedv, extack); + if (IS_ERR(parser)) + return PTR_ERR(parser); + kopnd->immedv = parser->parser_inst_id; + + hdrfield = tcf_hdrfield_get(parser, + (const char *)kopnd->path_or_value_extra, + kopnd->immedv2, extack); + if 
(IS_ERR(hdrfield)) + return PTR_ERR(hdrfield); + kopnd->immedv2 = hdrfield->hdrfield_id; + + if (kopnd->oper_flags & DATA_HAS_TYPE_INFO) { + if (validate_hdrfield_operand_type(kopnd, hdrfield, extack) < 0) + return -EINVAL; + } else { + kopnd->oper_bitstart = hdrfield->startbit; + kopnd->oper_bitend = hdrfield->endbit; + kopnd->oper_datatype = p4type_find_byid(hdrfield->datatype); + kopnd->oper_bitsize = hdrfield->endbit - hdrfield->startbit + 1; + kopnd->oper_cbitsize = kopnd->oper_datatype->container_bitsz; + } + typ = kopnd->oper_datatype; + if (typ->ops->create_bitops) { + struct p4tc_type_mask_shift *mask_shift; + + mask_shift = typ->ops->create_bitops(kopnd->oper_bitsize, + kopnd->oper_bitstart, + kopnd->oper_bitend, + extack); + if (IS_ERR(mask_shift)) + return PTR_ERR(mask_shift); + + kopnd->oper_mask_shift = mask_shift; + } + + kopnd->priv = hdrfield; + + refcount_inc(&act->pipeline->p_hdrs_used); + + return 0; +} + +struct p4tc_cmd_opnd_priv_dev { + struct net_device *dev; + netdevice_tracker *tracker; +}; + +int validate_dev_operand(struct net *net, struct p4tc_cmd_operand *kopnd, + struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_opnd_priv_dev *priv_dev; + struct net_device *dev; + + if (kopnd->oper_datatype->typeid != P4T_DEV) { + NL_SET_ERR_MSG_MOD(extack, "dev parameter must be dev"); + return -EINVAL; + } + + if (kopnd->oper_datatype->ops->validate_p4t(kopnd->oper_datatype, + &kopnd->immedv, + kopnd->oper_bitstart, + kopnd->oper_bitend, + extack) < 0) { + return -EINVAL; + } + + priv_dev = kzalloc(sizeof(*priv_dev), GFP_KERNEL); + if (!priv_dev) + return -ENOMEM; + kopnd->priv = priv_dev; + + dev = dev_get_by_index(net, kopnd->immedv); + if (!dev) { + NL_SET_ERR_MSG_MOD(extack, "Invalid ifindex"); + return -ENODEV; + } + priv_dev->dev = dev; + netdev_tracker_alloc(dev, priv_dev->tracker, GFP_KERNEL); + + return 0; +} + +static int validate_param_operand(struct p4tc_act *act, + struct p4tc_cmd_operand *kopnd, + struct netlink_ext_ack *extack) +{ 
+	struct p4tc_act_param *param;
+	struct p4tc_type *t;
+
+	param = tcf_param_find_byany(act, (const char *)kopnd->path_or_value,
+				     kopnd->immedv2, extack);
+
+	if (IS_ERR(param))
+		return PTR_ERR(param);
+
+	kopnd->pipeid = act->pipeline->common.p_id;
+	kopnd->immedv = act->a_id;
+	kopnd->immedv2 = param->id;
+
+	t = p4type_find_byid(param->type);
+	if (kopnd->oper_flags & DATA_HAS_TYPE_INFO) {
+		if (t->typeid != kopnd->oper_datatype->typeid) {
+			NL_SET_ERR_MSG_MOD(extack, "Param type mismatch");
+			return -EINVAL;
+		}
+
+		if (t->bitsz != kopnd->oper_datatype->bitsz) {
+			NL_SET_ERR_MSG_MOD(extack, "Param size mismatch");
+			return -EINVAL;
+		}
+	} else {
+		kopnd->oper_datatype = t;
+		kopnd->oper_bitstart = 0;
+		kopnd->oper_bitend = t->bitsz - 1;
+		kopnd->oper_bitsize = t->bitsz;
+	}
+	kopnd->oper_flags |= DATA_IS_READ_ONLY;
+
+	if (kopnd->oper_bitstart != 0) {
+		NL_SET_ERR_MSG_MOD(extack, "Param startbit must be zero");
+		return -EINVAL;
+	}
+
+	if (kopnd->oper_bitstart > kopnd->oper_bitend) {
+		NL_SET_ERR_MSG_MOD(extack, "Param startbit > endbit");
+		return -EINVAL;
+	}
+
+	if (t->ops->create_bitops) {
+		struct p4tc_type_mask_shift *mask_shift;
+
+		mask_shift = t->ops->create_bitops(kopnd->oper_bitsize,
+						   kopnd->oper_bitstart,
+						   kopnd->oper_bitend, extack);
+		if (IS_ERR(mask_shift))
+			return PTR_ERR(mask_shift);
+
+		kopnd->oper_mask_shift = mask_shift;
+	}
+
+	return 0;
+}
+
+static int validate_res_operand(struct p4tc_cmd_operand *kopnd,
+				struct netlink_ext_ack *extack)
+{
+	if (kopnd->immedv == P4TC_CMDS_RESULTS_HIT ||
+	    kopnd->immedv == P4TC_CMDS_RESULTS_MISS) {
+		kopnd->oper_flags |= DATA_IS_READ_ONLY;
+		return 0;
+	}
+
+	NL_SET_ERR_MSG_MOD(extack, "Invalid result field");
+	return -EINVAL;
+}
+
+static int register_label(struct p4tc_act *act, const char *label,
+			  int cmd_offset, struct netlink_ext_ack *extack)
+{
+	const size_t labelsz = strnlen(label, 
LABELNAMSIZ) + 1; + struct p4tc_label_node *node; + void *ptr; + int err; + + node = kzalloc(sizeof(*node), GFP_KERNEL); + if (!node) + return -ENOMEM; + + node->key.label = kzalloc(labelsz, GFP_KERNEL); + if (!(node->key.label)) { + err = -ENOMEM; + goto free_node; + } + + strscpy(node->key.label, label, labelsz); + node->key.labelsz = labelsz; + + node->cmd_offset = cmd_offset; + + ptr = rhashtable_insert_slow(act->labels, &node->key, &node->ht_node); + if (IS_ERR(ptr)) { + NL_SET_ERR_MSG_MOD(extack, + "Unable to insert in labels hashtable"); + err = PTR_ERR(ptr); + goto free_label; + } + + return 0; + +free_label: + kfree(node->key.label); + +free_node: + kfree(node); + + return err; +} + +static int validate_label_operand(struct p4tc_act *act, + struct p4tc_cmd_operate *ope, + struct p4tc_cmd_operand *kopnd, + struct netlink_ext_ack *extack) +{ + kopnd->oper_datatype = p4type_find_byid(P4T_U32); + return register_label(act, (const char *)kopnd->path_or_value, + ope->cmd_offset, extack); +} + +static int cmd_find_label_offset(struct p4tc_act *act, const char *label, + struct netlink_ext_ack *extack) +{ + struct p4tc_label_node *node; + struct p4tc_label_key label_key; + + label_key.label = (char *)label; + label_key.labelsz = strnlen(label, LABELNAMSIZ) + 1; + + node = rhashtable_lookup(act->labels, &label_key, p4tc_label_ht_params); + if (!node) { + NL_SET_ERR_MSG_MOD(extack, "Unable to find label"); + return -ENOENT; + } + + return node->cmd_offset; +} + +static int validate_reg_operand(struct p4tc_act *act, + struct p4tc_cmd_operand *kopnd, + struct netlink_ext_ack *extack) +{ + struct p4tc_register *reg; + struct p4tc_type *t; + + reg = tcf_register_get(act->pipeline, + (const char *)kopnd->path_or_value, + kopnd->immedv, extack); + if (IS_ERR(reg)) + return PTR_ERR(reg); + + kopnd->pipeid = act->pipeline->common.p_id; + kopnd->immedv = reg->reg_id; + + if (kopnd->immedv2 >= reg->reg_num_elems) { + NL_SET_ERR_MSG_MOD(extack, "Register index out of bounds"); 
+ return -EINVAL; + } + + t = reg->reg_type; + kopnd->oper_datatype = t; + + if (kopnd->oper_flags & DATA_HAS_TYPE_INFO) { + if (reg->reg_type->typeid != kopnd->oper_datatype->typeid) { + NL_SET_ERR_MSG_MOD(extack, + "Invalid register data type"); + return -EINVAL; + } + + if (kopnd->oper_bitstart > kopnd->oper_bitend) { + NL_SET_ERR_MSG_MOD(extack, + "Register startbit > endbit"); + return -EINVAL; + } + } else { + kopnd->oper_bitstart = 0; + kopnd->oper_bitend = t->bitsz - 1; + kopnd->oper_bitsize = t->bitsz; + } + + if (t->ops->create_bitops) { + struct p4tc_type_mask_shift *mask_shift; + + mask_shift = t->ops->create_bitops(kopnd->oper_bitsize, + kopnd->oper_bitstart, + kopnd->oper_bitend, extack); + if (IS_ERR(mask_shift)) + return PTR_ERR(mask_shift); + + kopnd->oper_mask_shift = mask_shift; + } + + /* Should never fail */ + WARN_ON(!refcount_inc_not_zero(®->reg_ref)); + + kopnd->priv = reg; + + return 0; +} + +static struct p4tc_type_mask_shift * +create_metadata_bitops(struct p4tc_cmd_operand *kopnd, + struct p4tc_metadata *meta, struct p4tc_type *t, + struct netlink_ext_ack *extack) +{ + struct p4tc_type_mask_shift *mask_shift; + u8 bitstart, bitend; + u32 bitsz; + + if (kopnd->oper_flags & DATA_IS_SLICE) { + bitstart = meta->m_startbit + kopnd->oper_bitstart; + bitend = meta->m_startbit + kopnd->oper_bitend; + } else { + bitstart = meta->m_startbit; + bitend = meta->m_endbit; + } + bitsz = bitend - bitstart + 1; + mask_shift = t->ops->create_bitops(bitsz, bitstart, bitend, extack); + return mask_shift; +} + +static int __validate_metadata_operand(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operand *kopnd, + struct netlink_ext_ack *extack) +{ + struct p4tc_type *container_type; + struct p4tc_pipeline *pipeline; + struct p4tc_metadata *meta; + u32 bitsz; + int err; + + if (kopnd->oper_flags & DATA_USES_ROOT_PIPE) + pipeline = tcf_pipeline_find_byid(net, 0); + else + pipeline = act->pipeline; + + kopnd->pipeid = pipeline->common.p_id; + + meta = 
tcf_meta_get(pipeline, (const char *)kopnd->path_or_value, + kopnd->immedv, extack); + if (IS_ERR(meta)) + return PTR_ERR(meta); + kopnd->immedv = meta->m_id; + + if (!(kopnd->oper_flags & DATA_IS_SLICE)) { + kopnd->oper_bitstart = meta->m_startbit; + kopnd->oper_bitend = meta->m_endbit; + + bitsz = meta->m_endbit - meta->m_startbit + 1; + kopnd->oper_bitsize = bitsz; + } else { + bitsz = kopnd->oper_bitend - kopnd->oper_bitstart + 1; + } + + if (kopnd->oper_flags & DATA_HAS_TYPE_INFO) { + if (meta->m_datatype != kopnd->oper_datatype->typeid) { + NL_SET_ERR_MSG_MOD(extack, + "Invalid metadata data type"); + return -EINVAL; + } + + if (bitsz < kopnd->oper_bitsize) { + NL_SET_ERR_MSG_MOD(extack, "Invalid metadata bit size"); + return -EINVAL; + } + + if (kopnd->oper_bitstart > meta->m_endbit) { + NL_SET_ERR_MSG_MOD(extack, + "Invalid metadata slice start bit"); + return -EINVAL; + } + + if (kopnd->oper_bitend > meta->m_endbit) { + NL_SET_ERR_MSG_MOD(extack, + "Invalid metadata slice end bit"); + return -EINVAL; + } + } else { + kopnd->oper_datatype = p4type_find_byid(meta->m_datatype); + kopnd->oper_bitsize = bitsz; + kopnd->oper_cbitsize = bitsz; + } + + container_type = p4type_find_byid(meta->m_datatype); + if (!container_type) { + NL_SET_ERR_MSG_MOD(extack, "Invalid metadata type"); + return -EINVAL; + } + + err = validate_metadata_operand(kopnd, container_type, extack); + if (err < 0) + return err; + + if (meta->m_read_only) + kopnd->oper_flags |= DATA_IS_READ_ONLY; + + if (container_type->ops->create_bitops) { + struct p4tc_type_mask_shift *mask_shift; + + mask_shift = create_metadata_bitops(kopnd, meta, container_type, + extack); + if (IS_ERR(mask_shift)) + return -EINVAL; + + kopnd->oper_mask_shift = mask_shift; + } + + kopnd->priv = meta; + + return 0; +} + +static struct p4tc_type_mask_shift * +create_constant_bitops(struct p4tc_cmd_operand *kopnd, struct p4tc_type *t, + struct netlink_ext_ack *extack) +{ + struct p4tc_type_mask_shift *mask_shift; + + 
mask_shift = t->ops->create_bitops(t->bitsz, kopnd->oper_bitstart, + kopnd->oper_bitend, extack); + return mask_shift; +} + +static int validate_large_operand(struct p4tc_cmd_operand *kopnd, + struct netlink_ext_ack *extack) +{ + struct p4tc_type *t = kopnd->oper_datatype; + int err = 0; + + err = validate_metadata_operand(kopnd, t, extack); + if (err) + return err; + if (t->ops->create_bitops) { + struct p4tc_type_mask_shift *mask_shift; + + mask_shift = create_constant_bitops(kopnd, t, extack); + if (IS_ERR(mask_shift)) + return -EINVAL; + + kopnd->oper_mask_shift = mask_shift; + } + + return 0; +} + +/*Data is constant <=32 bits */ +static int validate_immediate_operand(struct p4tc_cmd_operand *kopnd, + struct netlink_ext_ack *extack) +{ + struct p4tc_type *t = kopnd->oper_datatype; + int err = 0; + + err = validate_metadata_operand(kopnd, t, extack); + if (err) + return err; + if (t->ops->create_bitops) { + struct p4tc_type_mask_shift *mask_shift; + + mask_shift = create_constant_bitops(kopnd, t, extack); + if (IS_ERR(mask_shift)) + return -EINVAL; + + kopnd->oper_mask_shift = mask_shift; + } + + return 0; +} + +static int validate_operand(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, + struct p4tc_cmd_operand *kopnd, + struct netlink_ext_ack *extack) +{ + int err = 0; + + if (!kopnd) + return err; + + switch (kopnd->oper_type) { + case P4TC_OPER_CONST: + if (kopnd->oper_flags & DATA_IS_IMMEDIATE) + err = validate_immediate_operand(kopnd, extack); + else + err = validate_large_operand(kopnd, extack); + kopnd->oper_flags |= DATA_IS_READ_ONLY; + break; + case P4TC_OPER_META: + err = __validate_metadata_operand(net, act, kopnd, extack); + break; + case P4TC_OPER_ACTID: + err = 0; + break; + case P4TC_OPER_TBL: + err = validate_table_operand(act, kopnd, extack); + break; + case P4TC_OPER_KEY: + err = validate_key_operand(act, kopnd, extack); + break; + case P4TC_OPER_RES: + err = validate_res_operand(kopnd, extack); + break; + case 
P4TC_OPER_HDRFIELD: + err = validate_hdrfield_operand(act, kopnd, extack); + break; + case P4TC_OPER_PARAM: + err = validate_param_operand(act, kopnd, extack); + break; + case P4TC_OPER_DEV: + err = validate_dev_operand(net, kopnd, extack); + break; + case P4TC_OPER_REG: + err = validate_reg_operand(act, kopnd, extack); + break; + case P4TC_OPER_LABEL: + err = validate_label_operand(act, ope, kopnd, extack); + break; + default: + NL_SET_ERR_MSG_MOD(extack, "Unknown operand type"); + err = -EINVAL; + } + + return err; +} + +static void __free_operand(struct p4tc_cmd_operand *op) +{ + if (op->oper_mask_shift) + p4t_release(op->oper_mask_shift); + kfree(op->path_or_value); + kfree(op->path_or_value_extra); + kfree(op->print_prefix); + kfree(op); +} + +static void _free_operand_template(struct net *net, struct p4tc_cmd_operand *op) +{ + switch (op->oper_type) { + case P4TC_OPER_META: { + struct p4tc_pipeline *pipeline; + struct p4tc_metadata *meta; + + pipeline = tcf_pipeline_find_byid(net, op->pipeid); + if (pipeline) { + meta = tcf_meta_find_byid(pipeline, op->immedv); + if (meta) + tcf_meta_put_ref(meta); + } + break; + } + case P4TC_OPER_ACTID: { + struct p4tc_pipeline *pipeline; + struct p4tc_act *act; + + if (!(op->oper_flags & DATA_USES_ROOT_PIPE)) { + pipeline = tcf_pipeline_find_byid(net, op->pipeid); + if (pipeline) { + act = tcf_action_find_byid(pipeline, + op->immedv); + if (act) + tcf_action_put(act); + } + } + kfree(op->priv); + break; + } + case P4TC_OPER_TBL: { + struct p4tc_pipeline *pipeline; + struct p4tc_table *table; + + pipeline = tcf_pipeline_find_byid(net, op->pipeid); + if (pipeline) { + table = tcf_table_find_byid(pipeline, op->immedv); + if (table) + tcf_table_put_ref(table); + } + break; + } + case P4TC_OPER_KEY: { + struct p4tc_pipeline *pipeline; + struct p4tc_table *table; + + pipeline = tcf_pipeline_find_byid(net, op->pipeid); + if (pipeline) { + table = tcf_table_find_byid(pipeline, op->immedv); + if (table) + tcf_table_put_ref(table); 
+ } + break; + } + case P4TC_OPER_HDRFIELD: { + struct p4tc_pipeline *pipeline; + + pipeline = tcf_pipeline_find_byid(net, op->pipeid); + /* Should never be NULL */ + if (pipeline) { + struct p4tc_hdrfield *hdrfield; + struct p4tc_parser *parser; + + if (refcount_read(&pipeline->p_hdrs_used) > 1) + refcount_dec(&pipeline->p_hdrs_used); + + parser = tcf_parser_find_byid(pipeline, op->immedv); + if (parser) { + hdrfield = tcf_hdrfield_find_byid(parser, + op->immedv2); + + if (hdrfield) + if (refcount_read(&hdrfield->hdrfield_ref) > 1) + tcf_hdrfield_put_ref(hdrfield); + } + } + break; + } + case P4TC_OPER_DEV: { + struct p4tc_cmd_opnd_priv_dev *priv = op->priv; + + if (priv && priv->dev) + netdev_put(priv->dev, priv->tracker); + kfree(priv); + break; + } + case P4TC_OPER_REG: { + struct p4tc_pipeline *pipeline; + + pipeline = tcf_pipeline_find_byid(net, op->pipeid); + /* Should never be NULL */ + if (pipeline) { + struct p4tc_register *reg; + + reg = tcf_register_find_byid(pipeline, op->immedv); + if (reg) + tcf_register_put_ref(reg); + } + break; + } + } + + __free_operand(op); +} + +static void _free_operand_list_instance(struct list_head *operands_list) +{ + struct p4tc_cmd_operand *op, *tmp; + + list_for_each_entry_safe(op, tmp, operands_list, oper_list_node) { + list_del(&op->oper_list_node); + __free_operand(op); + } +} + +static void _free_operand_list_template(struct net *net, + struct list_head *operands_list) +{ + struct p4tc_cmd_operand *op, *tmp; + + list_for_each_entry_safe(op, tmp, operands_list, oper_list_node) { + list_del(&op->oper_list_node); + _free_operand_template(net, op); + } +} + +static void _free_operation(struct net *net, struct p4tc_cmd_operate *ope, + bool called_from_template, + struct netlink_ext_ack *extack) +{ + if (called_from_template) + _free_operand_list_template(net, &ope->operands_list); + else + _free_operand_list_instance(&ope->operands_list); + + kfree(ope->label1); + kfree(ope->label2); + kfree(ope); +} + +/* XXX: copied 
from act_api::tcf_free_cookie_rcu - at some point share the code */ +static void _tcf_free_cookie_rcu(struct rcu_head *p) +{ + struct tc_cookie *cookie = container_of(p, struct tc_cookie, rcu); + + kfree(cookie->data); + kfree(cookie); +} + +/* XXX: copied from act_api::tcf_set_action_cookie - at some point share the code */ +static void _tcf_set_action_cookie(struct tc_cookie __rcu **old_cookie, + struct tc_cookie *new_cookie) +{ + struct tc_cookie *old; + + old = xchg((__force struct tc_cookie **)old_cookie, new_cookie); + if (old) + call_rcu(&old->rcu, _tcf_free_cookie_rcu); +} + +/* XXX: copied from act_api::free_tcf - at some point share the code */ +static void _free_tcf(struct tc_action *p) +{ + struct tcf_chain *chain = rcu_dereference_protected(p->goto_chain, 1); + + free_percpu(p->cpu_bstats); + free_percpu(p->cpu_bstats_hw); + free_percpu(p->cpu_qstats); + + _tcf_set_action_cookie(&p->act_cookie, NULL); + if (chain) + tcf_chain_put_by_act(chain); + + kfree(p); +} + +#define P4TC_CMD_OPER_ACT_RUNTIME (BIT(0)) + +static void free_op_ACT(struct net *net, struct p4tc_cmd_operate *ope, + bool dec_act_refs, struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operand *A; + struct tc_action *p = NULL; + + A = GET_OPA(&ope->operands_list); + if (A) + p = A->action; + + if (p) { + if (dec_act_refs) { + struct tcf_idrinfo *idrinfo = p->idrinfo; + + atomic_dec(&p->tcfa_bindcnt); + + if (refcount_dec_and_mutex_lock(&p->tcfa_refcnt, + &idrinfo->lock)) { + idr_remove(&idrinfo->action_idr, p->tcfa_index); + mutex_unlock(&idrinfo->lock); + + if (p->ops->cleanup) + p->ops->cleanup(p); + + gen_kill_estimator(&p->tcfa_rate_est); + _free_tcf(p); + } + } + } + + return _free_operation(net, ope, dec_act_refs, extack); +} + +static inline int opnd_is_assignable(struct p4tc_cmd_operand *kopnd) +{ + return !(kopnd->oper_flags & DATA_IS_READ_ONLY); +} + +static int validate_multiple_rvals(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, + const size_t 
max_operands, + const size_t max_size, + struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operand *cursor; + int rvalue_tot_sz = 0; + int i = 0; + int err; + + cursor = GET_OPA(&ope->operands_list); + list_for_each_entry_continue(cursor, &ope->operands_list, oper_list_node) { + struct p4tc_type *cursor_type; + + if (i == max_operands - 1) { + NL_SET_ERR_MSG_MOD(extack, + "Operands list exceeds maximum allowed value"); + return -EINVAL; + } + + switch (cursor->oper_type) { + case P4TC_OPER_KEY: + case P4TC_OPER_META: + case P4TC_OPER_CONST: + case P4TC_OPER_HDRFIELD: + case P4TC_OPER_PARAM: + break; + default: + NL_SET_ERR_MSG_MOD(extack, + "Rvalue operand must be key, metadata, const, hdrfield or param"); + return -EINVAL; + } + + err = validate_operand(net, act, ope, cursor, extack); + if (err < 0) + return err; + + cursor_type = cursor->oper_datatype; + if (!cursor_type->ops->host_read) { + NL_SET_ERR_MSG_MOD(extack, + "Rvalue operand's types must have host_read op"); + return -EINVAL; + } + + if (cursor_type->container_bitsz > max_size) { + NL_SET_ERR_MSG_MOD(extack, + "Rvalue operand's types must be <= 64 bits"); + return -EINVAL; + } + if (cursor->oper_bitsize % 8 != 0) { + NL_SET_ERR_MSG_MOD(extack, + "All Rvalues must have bitsize multiple of 8"); + return -EINVAL; + } + rvalue_tot_sz += cursor->oper_bitsize; + i++; + } + + if (i < 2) { + NL_SET_ERR_MSG_MOD(extack, + "Operation must have at least two operands"); + return -EINVAL; + } + + return rvalue_tot_sz; +} + +static int __validate_CONCAT(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, + const size_t max_operands, + struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operand *A; + int err; + + A = GET_OPA(&ope->operands_list); + err = validate_operand(net, act, ope, A, extack); + if (err) /*a better NL_SET_ERR_MSG_MOD done by validate_operand() */ + return err; + + if (!opnd_is_assignable(A)) { + NL_SET_ERR_MSG_MOD(extack, + "Unable to store op result in read-only operand"); 
+ return -EPERM; + } + + return validate_multiple_rvals(net, act, ope, max_operands, + P4T_MAX_BITSZ, extack); +} + +static int __validate_BINARITH(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, + const size_t max_operands, + struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operand *A; + struct p4tc_type *A_type; + int err; + + A = GET_OPA(&ope->operands_list); + err = validate_operand(net, act, ope, A, extack); + if (err) /*a better NL_SET_ERR_MSG_MOD done by validate_operand() */ + return err > 0 ? -err : err; + + if (!opnd_is_assignable(A)) { + NL_SET_ERR_MSG_MOD(extack, + "Unable to store op result in read-only operand"); + return -EPERM; + } + + switch (A->oper_type) { + case P4TC_OPER_META: + case P4TC_OPER_HDRFIELD: + break; + default: + NL_SET_ERR_MSG_MOD(extack, + "Operand A must be metadata or hdrfield"); + return -EINVAL; + } + + A_type = A->oper_datatype; + if (!A_type->ops->host_write) { + NL_SET_ERR_MSG_MOD(extack, + "Operand A's type must have host_write op"); + return -EINVAL; + } + + if (A_type->container_bitsz > 64) { + NL_SET_ERR_MSG_MOD(extack, + "Operand A's container type must be <= 64 bits"); + return -EINVAL; + } + + return validate_multiple_rvals(net, act, ope, max_operands, 64, extack); +} + +static int validate_num_opnds(struct p4tc_cmd_operate *ope, u32 cmd_num_opnds) +{ + if (ope->num_opnds != cmd_num_opnds) + return -EINVAL; + + return 0; +} + +static struct p4tc_act_param *validate_act_param(struct p4tc_act *act, + struct p4tc_cmd_operand *op, + unsigned long *param_id, + struct netlink_ext_ack *extack) +{ + struct p4tc_act_param *nparam; + struct p4tc_act_param *param; + + param = idr_get_next_ul(&act->params_idr, param_id); + if (!param) { + NL_SET_ERR_MSG_MOD(extack, + "Act has less runtime parameters than passed in call"); + return ERR_PTR(-EINVAL); + } + + if (op->oper_datatype->typeid != param->type) { + NL_SET_ERR_MSG_MOD(extack, "Operand type differs from params"); + return ERR_PTR(-EINVAL); + } + 
nparam = kzalloc(sizeof(*nparam), GFP_KERNEL);
+	if (!nparam)
+		return ERR_PTR(-ENOMEM);
+	strscpy(nparam->name, param->name, ACTPARAMNAMSIZ);
+	nparam->id = *param_id;
+	nparam->value = op;
+	nparam->type = param->type;
+	nparam->flags |= P4TC_ACT_PARAM_FLAGS_ISDYN;
+
+	return nparam;
+}
+
+static int validate_act_params(struct net *net, struct p4tc_act *act,
+			       struct p4tc_cmd_operate *ope,
+			       struct p4tc_cmd_operand *A,
+			       struct list_head *params_lst,
+			       struct netlink_ext_ack *extack)
+{
+	struct p4tc_act_param *params[P4TC_MSGBATCH_SIZE] = { NULL };
+	unsigned long param_id = 0;
+	int i = 0;
+	struct p4tc_cmd_operand *kopnd;
+	int err;
+
+	kopnd = A;
+	list_for_each_entry_continue(kopnd, &ope->operands_list, oper_list_node) {
+		struct p4tc_act_param *nparam;
+
+		err = validate_operand(net, act, ope, kopnd, extack);
+		if (err)
+			goto free_params;
+
+		nparam = validate_act_param(act, kopnd, &param_id, extack);
+		if (IS_ERR(nparam)) {
+			err = PTR_ERR(nparam);
+			goto free_params;
+		}
+
+		params[i] = nparam;
+		list_add_tail(&nparam->head, params_lst);
+		i++;
+		param_id++;
+	}
+
+	if (idr_get_next_ul(&act->params_idr, &param_id)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Act has more runtime params than passed in call");
+		err = -EINVAL;
+		goto free_params;
+	}
+
+	return 0;
+
+free_params:
+	while (i--)
+		kfree(params[i]);
+
+	return err;
+}
+
+static void free_intermediate_params_list(struct list_head *params_list)
+{
+	struct p4tc_act_param *nparam, *p;
+
+	list_for_each_entry_safe(nparam, p, params_list, head)
+		kfree(nparam);
+}
+
+/* Actions with runtime parameters don't have instance ids (found in immedv2)
+ * because the action is not created a priori. Example:
+ * cmd act myprog.myact param1 param2 ... doesn't specify instance.
+ * As noted, it is equivalent to treating an action like a function call with
+ * action attributes derived at runtime. If these actions were already
+ * instantiated then immedv2 will have a non-zero value equal to the action index.
+ */
+static int check_runtime_params(struct p4tc_cmd_operate *ope,
+				struct p4tc_cmd_operand *A,
+				bool *is_runtime_act,
+				struct netlink_ext_ack *extack)
+{
+	if (A->immedv2 && ope->num_opnds > 1) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Can't specify runtime params together with instance id");
+		return -EINVAL;
+	}
+
+	if (A->oper_flags & DATA_USES_ROOT_PIPE && !A->immedv2) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Must specify instance id for kernel act calls");
+		return -EINVAL;
+	}
+
+	*is_runtime_act = !A->immedv2;
+
+	return 0;
+}
+
+/* Syntax: act ACTION_ID ACTION_INDEX | act ACTION_ID/ACTION_NAME PARAMS
+ * Operation: The tc action instance of kind ID ACTION_ID and optional index
+ * ACTION_INDEX is executed.
+ */
+int validate_ACT(struct net *net, struct p4tc_act *act,
+		 struct p4tc_cmd_operate *ope, u32 cmd_num_opnds,
+		 struct netlink_ext_ack *extack)
+{
+	struct tc_action_ops *action_ops;
+	struct list_head params_list;
+	struct p4tc_cmd_operand *A, *B;
+	struct tc_action *action;
+	bool is_runtime_act;
+	int err;
+
+	INIT_LIST_HEAD(&params_list);
+
+	A = GET_OPA(&ope->operands_list);
+	err = validate_operand(net, act, ope, A, extack);
+	if (err < 0)
+		return err;
+
+	if (A->oper_type != P4TC_OPER_ACTID) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "ACT: Operand type MUST be P4TC_OPER_ACTID");
+		return -EINVAL;
+	}
+
+	err = check_runtime_params(ope, A, &is_runtime_act, extack);
+	if (err < 0)
+		return err;
+
+	B = GET_OPB(&ope->operands_list);
+	A->oper_datatype = p4type_find_byid(P4T_U32);
+
+	if (A->oper_flags & DATA_USES_ROOT_PIPE) {
+		action_ops = tc_lookup_action_byid(A->immedv);
+		if (!action_ops) {
+			NL_SET_ERR_MSG_MOD(extack, "ACT: unknown Action Kind");
+			return -EINVAL;
+		}
+		A->pipeid = 0;
+	} else {
+		struct p4tc_pipeline *pipeline = act->pipeline;
+		struct p4tc_act_dep_edge_node *edge_node;
+		struct p4tc_act *callee_act;
+		bool has_back_edge;
+
+		/* let's check if we have cycles where we are calling an
+		 * action that might end up calling us
+		 */
+		callee_act =
tcf_action_get(pipeline, + (const char *)A->path_or_value, + A->immedv, extack); + if (IS_ERR(callee_act)) + return PTR_ERR(callee_act); + + A->pipeid = act->pipeline->common.p_id; + A->immedv = callee_act->a_id; + + edge_node = kzalloc(sizeof(*edge_node), GFP_KERNEL); + if (!edge_node) { + err = -ENOMEM; + goto free_params_list; + } + edge_node->act_id = act->a_id; + + has_back_edge = tcf_pipeline_check_act_backedge(pipeline, + edge_node, + callee_act->a_id); + if (has_back_edge) { + NL_SET_ERR_MSG_FMT_MOD(extack, + "Call creates a back edge: %s -> %s", + act->common.name, + callee_act->common.name); + err = -EINVAL; + kfree(edge_node); + goto free_params_list; + } + + A->priv = edge_node; + if (is_runtime_act) { + u32 flags = TCA_ACT_FLAGS_BIND; + struct tc_act_dyna parm = { 0 }; + + err = validate_act_params(net, callee_act, ope, A, + &params_list, extack); + if (err < 0) + return err; + + parm.action = TC_ACT_PIPE; + err = tcf_p4_dyna_template_init(net, &action, + callee_act, + &params_list, &parm, + flags, extack); + if (err < 0) + goto free_params_list; + + ope->op_flags |= P4TC_CMD_OPER_ACT_RUNTIME; + } + + action_ops = &callee_act->ops; + } + + if (!is_runtime_act) { + if (__tcf_idr_search(net, action_ops, &action, A->immedv2) == false) { + NL_SET_ERR_MSG_MOD(extack, "ACT: unknown Action index\n"); + module_put(action_ops->owner); + err = -EINVAL; + goto free_params_list; + } + + atomic_inc(&action->tcfa_bindcnt); + } + + A->immedv2 = action->tcfa_index; + A->action = action; + + return 0; + +free_params_list: + free_intermediate_params_list(&params_list); + return err; +} + +/* Syntax: set A B + * Operation: B is written to A.
+ * A could be a header, metadata, or key + * B could be a constant, header, or metadata + * Restriction: A and B don't have to be of the same size and type + * as long as B's value fits in A's bit width + * (for example, a U16 set into a U32) + */ +int validate_SET(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, u32 cmd_num_opnds, + struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operand *A, *B; + struct p4tc_type *A_type; + struct p4tc_type *B_type; + int err = 0; + + err = validate_num_opnds(ope, cmd_num_opnds); + if (err < 0) { + NL_SET_ERR_MSG_MOD(extack, "SET must have only 2 operands"); + return err; + } + + A = GET_OPA(&ope->operands_list); + err = validate_operand(net, act, ope, A, extack); + if (err) /* a better NL_SET_ERR_MSG_MOD is done by validate_operand() */ + return err; + + if (!opnd_is_assignable(A)) { + NL_SET_ERR_MSG_MOD(extack, "Unable to set read-only operand"); + return -EPERM; + } + + B = GET_OPB(&ope->operands_list); + if (B->oper_type == P4TC_OPER_KEY) { + NL_SET_ERR_MSG_MOD(extack, "Operand B cannot be key\n"); + return -EINVAL; + } + + err = validate_operand(net, act, ope, B, extack); + if (err) + return err; + + A_type = A->oper_datatype; + B_type = B->oper_datatype; + if (A->oper_type == P4TC_OPER_KEY) + A->oper_datatype = B_type; + + if ((A_type->typeid == P4T_DEV && B_type->typeid != P4T_DEV) || + (A_type->typeid != P4T_DEV && B_type->typeid == P4T_DEV)) { + NL_SET_ERR_MSG_MOD(extack, "Can only set dev to other dev"); + return -EINVAL; + } + + if (!A_type->ops->host_read || !B_type->ops->host_read) { + NL_SET_ERR_MSG_MOD(extack, + "Types of A and B must have host_read op"); + return -EINVAL; + } + + if (!A_type->ops->host_write || !B_type->ops->host_write) { + NL_SET_ERR_MSG_MOD(extack, + "Types of A and B must have host_write op"); + return -EINVAL; + } + + if (A->oper_bitsize < B->oper_bitsize) { + NL_SET_ERR_MSG_MOD(extack, + "set: B.bitsize has to be <= A.bitsize\n"); + return -EINVAL; + } + + if
(A->oper_bitsize != B->oper_bitsize) { + /* We allow them as long as the value of B can fit in A + * which has already been verified at this point + */ + u64 Amaxval; + u64 Bmaxval; + + /* Anything can be assigned to P4T_U128 */ + if (A->oper_datatype->typeid == P4T_U128) + return 0; + + Amaxval = GENMASK_ULL(A->oper_bitend, A->oper_bitstart); + + if (B->oper_type == P4TC_OPER_CONST) + Bmaxval = B->immedv; + else + Bmaxval = GENMASK_ULL(B->oper_bitend, B->oper_bitstart); + + if (Bmaxval > Amaxval) { + NL_SET_ERR_MSG_MOD(extack, + "set: B bits has to fit in A\n"); + return -EINVAL; + } + } + + return 0; +} + +int validate_PRINT(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, u32 cmd_num_opnds, + struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operand *A; + int err; + + err = validate_num_opnds(ope, cmd_num_opnds); + if (err < 0) { + NL_SET_ERR_MSG_MOD(extack, "print must have only 1 operands"); + return err; + } + + A = GET_OPA(&ope->operands_list); + + if (A->oper_type == P4TC_OPER_CONST) { + NL_SET_ERR_MSG_MOD(extack, "Operand A cannot be constant\n"); + return -EINVAL; + } + + return validate_operand(net, act, ope, A, extack); +} + +int validate_TBLAPP(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, u32 cmd_num_opnds, + struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operand *A; + int err; + + err = validate_num_opnds(ope, cmd_num_opnds); + if (err < 0) { + NL_SET_ERR_MSG_MOD(extack, + "tableapply must have only 1 operands"); + return err; + } + + A = GET_OPA(&ope->operands_list); + if (A->oper_type != P4TC_OPER_TBL) { + NL_SET_ERR_MSG_MOD(extack, "Operand A must be a table\n"); + return -EINVAL; + } + + err = validate_operand(net, act, ope, A, extack); + if (err) /*a better NL_SET_ERR_MSG_MOD done by validate_operand() */ + return err; + + return 0; +} + +int validate_SNDPORTEGR(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, u32 cmd_num_opnds, + struct netlink_ext_ack *extack) +{ + 
struct p4tc_cmd_operand *A; + int err; + + err = validate_num_opnds(ope, cmd_num_opnds); + if (err < 0) { + NL_SET_ERR_MSG_MOD(extack, + "send_port_egress must have only 1 operands"); + return err; + } + + A = GET_OPA(&ope->operands_list); + + err = validate_operand(net, act, ope, A, extack); + if (err) /*a better NL_SET_ERR_MSG_MOD done by validate_operand() */ + return err; + + return 0; +} + +int validate_BINARITH(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, u32 cmd_num_opnds, + struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operand *A, *B, *C; + struct p4tc_type *A_type; + struct p4tc_type *B_type; + struct p4tc_type *C_type; + int err; + + err = __validate_BINARITH(net, act, ope, cmd_num_opnds, extack); + if (err < 0) + return err; + + A = GET_OPA(&ope->operands_list); + B = GET_OPB(&ope->operands_list); + C = GET_OPC(&ope->operands_list); + + A_type = A->oper_datatype; + B_type = B->oper_datatype; + C_type = C->oper_datatype; + + /* For now, they must be the same. + * Will change that very soon. 
+ */ + if (A_type != B_type || A_type != C_type) { + NL_SET_ERR_MSG_MOD(extack, + "Type of A, B and C must be the same"); + return -EINVAL; + } + + return 0; +} + +int validate_CONCAT(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, u32 cmd_num_opnds, + struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operand *A; + int rvalue_tot_sz; + + A = GET_OPA(&ope->operands_list); + + rvalue_tot_sz = __validate_CONCAT(net, act, ope, cmd_num_opnds, extack); + if (rvalue_tot_sz < 0) + return rvalue_tot_sz; + + if (A->oper_bitsize < rvalue_tot_sz) { + NL_SET_ERR_MSG_MOD(extack, + "Rvalue operands concatenated must fit inside operand A"); + return -EINVAL; + } + + return 0; +} + +/* We'll validate jump to labels later once we have all labels processed */ +int validate_JUMP(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, u32 cmd_num_opnds, + struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operand *A; + int err; + + err = validate_num_opnds(ope, cmd_num_opnds); + if (err < 0) { + NL_SET_ERR_MSG_MOD(extack, "jump must have only 1 operands"); + return err; + } + + A = GET_OPA(&ope->operands_list); + if (A->oper_type != P4TC_OPER_LABEL) { + NL_SET_ERR_MSG_MOD(extack, "Operand A must be a label\n"); + return -EINVAL; + } + + if (A->immedv) { + int jmp_num; + + jmp_num = A->immedv & TC_ACT_EXT_VAL_MASK; + + if (jmp_num <= 0) { + NL_SET_ERR_MSG_MOD(extack, + "Backward jumps are not allowed"); + return -EINVAL; + } + } + + A->oper_datatype = p4type_find_byid(P4T_U32); + + return 0; +} + +int validate_LABEL(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, u32 cmd_num_opnds, + struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operand *A; + int err; + + err = validate_num_opnds(ope, cmd_num_opnds); + if (err < 0) { + NL_SET_ERR_MSG_MOD(extack, "label must have only 1 operands"); + return err; + } + + A = GET_OPA(&ope->operands_list); + if (A->oper_type != P4TC_OPER_LABEL) { + NL_SET_ERR_MSG_MOD(extack, "Operand A 
must be a label\n"); + return -EINVAL; + } + + err = validate_operand(net, act, ope, A, extack); + if (err) + return err; + + return 0; +} + +static void p4tc_reg_lock(struct p4tc_cmd_operand *A, + struct p4tc_cmd_operand *B, + struct p4tc_cmd_operand *C) +{ + struct p4tc_register *reg_A, *reg_B, *reg_C; + + if (A->oper_type == P4TC_OPER_REG) { + reg_A = A->priv; + spin_lock_bh(&reg_A->reg_value_lock); + } + + if (B && B->oper_type == P4TC_OPER_REG) { + reg_B = B->priv; + spin_lock_bh(&reg_B->reg_value_lock); + } + + if (C && C->oper_type == P4TC_OPER_REG) { + reg_C = C->priv; + spin_lock_bh(&reg_C->reg_value_lock); + } +} + +static void p4tc_reg_unlock(struct p4tc_cmd_operand *A, + struct p4tc_cmd_operand *B, + struct p4tc_cmd_operand *C) +{ + struct p4tc_register *reg_A, *reg_B, *reg_C; + + if (C && C->oper_type == P4TC_OPER_REG) { + reg_C = C->priv; + spin_unlock_bh(&reg_C->reg_value_lock); + } + + if (B && B->oper_type == P4TC_OPER_REG) { + reg_B = B->priv; + spin_unlock_bh(&reg_B->reg_value_lock); + } + + if (A->oper_type == P4TC_OPER_REG) { + reg_A = A->priv; + spin_unlock_bh(&reg_A->reg_value_lock); + } +} + +static int p4tc_cmp_op(struct p4tc_cmd_operand *A, struct p4tc_cmd_operand *B, + void *A_val, void *B_val) +{ + int res; + + p4tc_reg_lock(A, B, NULL); + + res = p4t_cmp(A->oper_mask_shift, A->oper_datatype, A_val, + B->oper_mask_shift, B->oper_datatype, B_val); + + p4tc_reg_unlock(A, B, NULL); + + return res; +} + +static int p4tc_copy_op(struct p4tc_cmd_operand *A, struct p4tc_cmd_operand *B, + void *A_val, void *B_val) +{ + int res; + + p4tc_reg_lock(A, B, NULL); + + res = p4t_copy(A->oper_mask_shift, A->oper_datatype, A_val, + B->oper_mask_shift, B->oper_datatype, B_val); + + p4tc_reg_unlock(A, B, NULL); + + return res; +} + +/* Syntax: BRANCHOP A B + * BRANCHOP := BEQ, BNE, etc + * Operation: B's value is compared to A's value.
+ * XXX: In the future we will take expressions instead of values + * A could be a constant, header, metadata, or key + * B could be a constant, header, metadata, or key + * Restriction: A and B cannot both be constants + */ + +/* if A == B else */ +static int p4tc_cmd_BEQ(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct p4tc_cmd_operand *A, *B; + int res_cmp; + void *B_val; + void *A_val; + + A = GET_OPA(&op->operands_list); + B = GET_OPB(&op->operands_list); + + A_val = A->fetch(skb, A, cmd, res); + B_val = B->fetch(skb, B, cmd, res); + + if (!A_val || !B_val) + return TC_ACT_OK; + + res_cmp = p4tc_cmp_op(A, B, A_val, B_val); + if (!res_cmp) + return op->ctl1; + + return op->ctl2; +} + +/* if A != B else */ +static int p4tc_cmd_BNE(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct p4tc_cmd_operand *A, *B; + int res_cmp; + void *B_val; + void *A_val; + + A = GET_OPA(&op->operands_list); + B = GET_OPB(&op->operands_list); + + A_val = A->fetch(skb, A, cmd, res); + B_val = B->fetch(skb, B, cmd, res); + + if (!A_val || !B_val) + return TC_ACT_OK; + + res_cmp = p4tc_cmp_op(A, B, A_val, B_val); + if (res_cmp) + return op->ctl1; + + return op->ctl2; +} + +/* if A < B else */ +static int p4tc_cmd_BLT(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct p4tc_cmd_operand *A, *B; + int res_cmp; + void *B_val; + void *A_val; + + A = GET_OPA(&op->operands_list); + B = GET_OPB(&op->operands_list); + + A_val = A->fetch(skb, A, cmd, res); + B_val = B->fetch(skb, B, cmd, res); + + if (!A_val || !B_val) + return TC_ACT_OK; + + res_cmp = p4tc_cmp_op(A, B, A_val, B_val); + if (res_cmp < 0) + return op->ctl1; + + return op->ctl2; +} + +/* if A <= B else */ +static int p4tc_cmd_BLE(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct p4tc_cmd_operand
*A, *B; + int res_cmp; + void *B_val; + void *A_val; + + A = GET_OPA(&op->operands_list); + B = GET_OPB(&op->operands_list); + + A_val = A->fetch(skb, A, cmd, res); + B_val = B->fetch(skb, B, cmd, res); + + if (!A_val || !B_val) + return TC_ACT_OK; + + res_cmp = p4tc_cmp_op(A, B, A_val, B_val); + if (!res_cmp || res_cmp < 0) + return op->ctl1; + + return op->ctl2; +} + +/* if A > B else */ +static int p4tc_cmd_BGT(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct p4tc_cmd_operand *A, *B; + int res_cmp; + void *B_val; + void *A_val; + + A = GET_OPA(&op->operands_list); + B = GET_OPB(&op->operands_list); + + A_val = A->fetch(skb, A, cmd, res); + B_val = B->fetch(skb, B, cmd, res); + + if (!A_val || !B_val) + return TC_ACT_OK; + + res_cmp = p4tc_cmp_op(A, B, A_val, B_val); + if (res_cmp > 0) + return op->ctl1; + + return op->ctl2; +} + +/* if A >= B else */ +static int p4tc_cmd_BGE(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct p4tc_cmd_operand *A, *B; + int res_cmp; + void *B_val; + void *A_val; + + A = GET_OPA(&op->operands_list); + B = GET_OPB(&op->operands_list); + + A_val = A->fetch(skb, A, cmd, res); + B_val = B->fetch(skb, B, cmd, res); + + if (!A_val || !B_val) + return TC_ACT_OK; + + res_cmp = p4tc_cmp_op(A, B, A_val, B_val); + if (!res_cmp || res_cmp > 0) + return op->ctl1; + + return op->ctl2; +} + +int validate_BRN(struct net *net, struct p4tc_act *act, + struct p4tc_cmd_operate *ope, u32 cmd_num_opnds, + struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operand *A, *B; + int err = 0; + + if (validate_num_opnds(ope, cmd_num_opnds) < 0) { + NL_SET_ERR_MSG_MOD(extack, + "Branch: branch must have only 2 operands"); + return -EINVAL; + } + + A = GET_OPA(&ope->operands_list); + B = GET_OPB(&ope->operands_list); + + err = validate_operand(net, act, ope, A, extack); + if (err) + return err; + + err = validate_operand(net, act, ope, B, 
extack); + if (err) + return err; + + if (A->oper_type == P4TC_OPER_CONST && + B->oper_type == P4TC_OPER_CONST) { + NL_SET_ERR_MSG_MOD(extack, + "Branch: A and B can't both be constant\n"); + return -EINVAL; + } + + if (!p4tc_type_unsigned(A->oper_datatype->typeid) || + !p4tc_type_unsigned(B->oper_datatype->typeid)) { + NL_SET_ERR_MSG_MOD(extack, + "Operands A and B must be unsigned\n"); + return -EINVAL; + } + + return 0; +} + +static void generic_free_op(struct net *net, struct p4tc_cmd_operate *ope, + bool called_from_template, + struct netlink_ext_ack *extack) +{ + return _free_operation(net, ope, called_from_template, extack); +} + +static struct p4tc_cmd_s cmds[] = { + { P4TC_CMD_OP_SET, 2, validate_SET, generic_free_op, p4tc_cmd_SET }, + { P4TC_CMD_OP_ACT, 1, validate_ACT, free_op_ACT, p4tc_cmd_ACT }, + { P4TC_CMD_OP_BEQ, 2, validate_BRN, generic_free_op, p4tc_cmd_BEQ }, + { P4TC_CMD_OP_BNE, 2, validate_BRN, generic_free_op, p4tc_cmd_BNE }, + { P4TC_CMD_OP_BGT, 2, validate_BRN, generic_free_op, p4tc_cmd_BGT }, + { P4TC_CMD_OP_BLT, 2, validate_BRN, generic_free_op, p4tc_cmd_BLT }, + { P4TC_CMD_OP_BGE, 2, validate_BRN, generic_free_op, p4tc_cmd_BGE }, + { P4TC_CMD_OP_BLE, 2, validate_BRN, generic_free_op, p4tc_cmd_BLE }, + { P4TC_CMD_OP_PRINT, 1, validate_PRINT, generic_free_op, + p4tc_cmd_PRINT }, + { P4TC_CMD_OP_TBLAPP, 1, validate_TBLAPP, generic_free_op, + p4tc_cmd_TBLAPP }, + { P4TC_CMD_OP_SNDPORTEGR, 1, validate_SNDPORTEGR, generic_free_op, + p4tc_cmd_SNDPORTEGR }, + { P4TC_CMD_OP_MIRPORTEGR, 1, validate_SNDPORTEGR, generic_free_op, + p4tc_cmd_MIRPORTEGR }, + { P4TC_CMD_OP_PLUS, 3, validate_BINARITH, generic_free_op, + p4tc_cmd_PLUS }, + { P4TC_CMD_OP_SUB, 3, validate_BINARITH, generic_free_op, + p4tc_cmd_SUB }, + { P4TC_CMD_OP_CONCAT, P4TC_CMD_OPERS_MAX, validate_CONCAT, + generic_free_op, p4tc_cmd_CONCAT }, + { P4TC_CMD_OP_BAND, 3, validate_BINARITH, generic_free_op, + p4tc_cmd_BAND }, + { P4TC_CMD_OP_BOR, 3, validate_BINARITH, generic_free_op, + 
p4tc_cmd_BOR }, + { P4TC_CMD_OP_BXOR, 3, validate_BINARITH, generic_free_op, + p4tc_cmd_BXOR }, + { P4TC_CMD_OP_JUMP, 1, validate_JUMP, generic_free_op, p4tc_cmd_JUMP }, + { P4TC_CMD_OP_LABEL, 1, validate_LABEL, generic_free_op, NULL }, +}; + +static struct p4tc_cmd_s *p4tc_get_cmd_byid(u16 cmdid) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(cmds); i++) { + if (cmdid == cmds[i].cmdid) + return &cmds[i]; + } + + return NULL; +} + +/* Operands */ +static const struct nla_policy p4tc_cmd_policy_oper[P4TC_CMD_OPND_MAX + 1] = { + [P4TC_CMD_OPND_INFO] = { .type = NLA_BINARY, + .len = sizeof(struct p4tc_u_operand) }, + [P4TC_CMD_OPND_PATH] = { .type = NLA_STRING, .len = TEMPLATENAMSZ }, + [P4TC_CMD_OPND_PATH_EXTRA] = { .type = NLA_STRING, .len = TEMPLATENAMSZ }, + [P4TC_CMD_OPND_LARGE_CONSTANT] = { + .type = NLA_BINARY, + .len = BITS_TO_BYTES(P4T_MAX_BITSZ), + }, + [P4TC_CMD_OPND_PREFIX] = { .type = NLA_STRING, .len = TEMPLATENAMSZ }, +}; + +/* XXX: P4TC_CMD_POLICY is used to disable overwriting extacks downstream + * Could we use error pointers instead of this P4TC_CMD_POLICY trickery? 
+ */ +#define P4TC_CMD_POLICY 12345 +static int p4tc_cmds_process_opnd(struct nlattr *nla, + struct p4tc_cmd_operand *kopnd, + struct netlink_ext_ack *extack) +{ + int oper_extra_sz = 0; + int oper_prefix_sz = 0; + u32 wantbits = 0; + int oper_sz = 0; + int err = 0; + struct nlattr *tb[P4TC_CMD_OPND_MAX + 1]; + struct p4tc_u_operand *uopnd; + + err = nla_parse_nested(tb, P4TC_CMD_OPND_MAX, nla, p4tc_cmd_policy_oper, + extack); + if (err < 0) { + NL_SET_ERR_MSG_MOD(extack, "parse error: P4TC_CMD_OPND_\n"); + return -EINVAL; + } + + if (!tb[P4TC_CMD_OPND_INFO]) { + NL_SET_ERR_MSG_MOD(extack, "operand information is mandatory"); + return -EINVAL; + } + + uopnd = nla_data(tb[P4TC_CMD_OPND_INFO]); + + if (uopnd->oper_type == P4TC_OPER_META) { + kopnd->fetch = p4tc_fetch_metadata; + } else if (uopnd->oper_type == P4TC_OPER_CONST) { + kopnd->fetch = p4tc_fetch_constant; + } else if (uopnd->oper_type == P4TC_OPER_ACTID) { + kopnd->fetch = NULL; + } else if (uopnd->oper_type == P4TC_OPER_TBL) { + kopnd->fetch = p4tc_fetch_table; + } else if (uopnd->oper_type == P4TC_OPER_KEY) { + kopnd->fetch = p4tc_fetch_key; + } else if (uopnd->oper_type == P4TC_OPER_RES) { + kopnd->fetch = p4tc_fetch_result; + } else if (uopnd->oper_type == P4TC_OPER_HDRFIELD) { + kopnd->fetch = p4tc_fetch_hdrfield; + } else if (uopnd->oper_type == P4TC_OPER_PARAM) { + kopnd->fetch = p4tc_fetch_param; + } else if (uopnd->oper_type == P4TC_OPER_DEV) { + kopnd->fetch = p4tc_fetch_dev; + } else if (uopnd->oper_type == P4TC_OPER_REG) { + kopnd->fetch = p4tc_fetch_reg; + } else if (uopnd->oper_type == P4TC_OPER_LABEL) { + kopnd->fetch = NULL; + } else { + NL_SET_ERR_MSG_MOD(extack, "Unknown operand type"); + return -EINVAL; + } + + wantbits = 1 + uopnd->oper_endbit - uopnd->oper_startbit; + if (uopnd->oper_flags & DATA_HAS_TYPE_INFO && + uopnd->oper_type != P4TC_OPER_ACTID && + uopnd->oper_type != P4TC_OPER_TBL && + uopnd->oper_type != P4TC_OPER_REG && + uopnd->oper_cbitsize < wantbits) { + 
NL_SET_ERR_MSG_MOD(extack, + "Start and end bit dont fit in space"); + return -EINVAL; + } + + err = copy_u2k_operand(uopnd, kopnd, extack); + if (err < 0) + return err; + + if (tb[P4TC_CMD_OPND_LARGE_CONSTANT]) { + int const_sz; + + const_sz = nla_len(tb[P4TC_CMD_OPND_LARGE_CONSTANT]); + if (const_sz) + memcpy(kopnd->immedv_large, + nla_data(tb[P4TC_CMD_OPND_LARGE_CONSTANT]), + const_sz); + else + kopnd->oper_flags |= DATA_IS_IMMEDIATE; + + kopnd->immedv_large_sz = const_sz; + } + + if (tb[P4TC_CMD_OPND_PATH]) + oper_sz = nla_len(tb[P4TC_CMD_OPND_PATH]); + + kopnd->path_or_value_sz = oper_sz; + + if (oper_sz) { + kopnd->path_or_value = kzalloc(oper_sz, GFP_KERNEL); + if (!kopnd->path_or_value) { + NL_SET_ERR_MSG_MOD(extack, + "Failed to alloc operand path data"); + return -ENOMEM; + } + + nla_memcpy(kopnd->path_or_value, tb[P4TC_CMD_OPND_PATH], + oper_sz); + } + + if (tb[P4TC_CMD_OPND_PATH_EXTRA]) + oper_extra_sz = nla_len(tb[P4TC_CMD_OPND_PATH_EXTRA]); + + kopnd->path_or_value_extra_sz = oper_extra_sz; + + if (oper_extra_sz) { + kopnd->path_or_value_extra = kzalloc(oper_extra_sz, GFP_KERNEL); + if (!kopnd->path_or_value_extra) { + kfree(kopnd->path_or_value); + NL_SET_ERR_MSG_MOD(extack, + "Failed to alloc extra operand path data"); + return -ENOMEM; + } + + nla_memcpy(kopnd->path_or_value_extra, + tb[P4TC_CMD_OPND_PATH_EXTRA], oper_extra_sz); + } + + if (tb[P4TC_CMD_OPND_PREFIX]) + oper_prefix_sz = nla_len(tb[P4TC_CMD_OPND_PREFIX]); + + if (!oper_prefix_sz) + return 0; + + kopnd->print_prefix_sz = oper_prefix_sz; + + kopnd->print_prefix = kzalloc(oper_prefix_sz, GFP_KERNEL); + if (!kopnd->print_prefix) { + kfree(kopnd->path_or_value); + kfree(kopnd->path_or_value_extra); + NL_SET_ERR_MSG_MOD(extack, + "Failed to alloc operand print prefix"); + return -ENOMEM; + } + + nla_memcpy(kopnd->print_prefix, tb[P4TC_CMD_OPND_PREFIX], + oper_prefix_sz); + return 0; +} + +/* Operation */ +static const struct nla_policy cmd_ops_policy[P4TC_CMD_OPER_MAX + 1] = { + 
[P4TC_CMD_OPERATION] = { .type = NLA_BINARY, + .len = sizeof(struct p4tc_u_operate) }, + [P4TC_CMD_OPER_LIST] = { .type = NLA_NESTED }, + [P4TC_CMD_OPER_LABEL1] = { .type = NLA_STRING, .len = LABELNAMSIZ }, + [P4TC_CMD_OPER_LABEL2] = { .type = NLA_STRING, .len = LABELNAMSIZ }, +}; + +static struct p4tc_cmd_operate *uope_to_kope(struct p4tc_u_operate *uope) +{ + struct p4tc_cmd_operate *ope; + + if (!uope) + return NULL; + + ope = kzalloc(sizeof(*ope), GFP_KERNEL); + if (!ope) + return NULL; + + ope->op_id = uope->op_type; + ope->op_flags = uope->op_flags; + ope->op_cnt = 0; + + ope->ctl1 = uope->op_ctl1; + ope->ctl2 = uope->op_ctl2; + + INIT_LIST_HEAD(&ope->operands_list); + + return ope; +} + +static int p4tc_cmd_process_operands_list(struct nlattr *nla, + struct p4tc_cmd_operate *ope, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[P4TC_CMD_OPERS_MAX + 1]; + struct p4tc_cmd_operand *opnd; + int err; + int i; + + err = nla_parse_nested(tb, P4TC_CMD_OPERS_MAX, nla, NULL, NULL); + if (err < 0) + return err; + + for (i = 1; i < P4TC_CMD_OPERS_MAX + 1 && tb[i]; i++) { + opnd = kzalloc(sizeof(*opnd), GFP_KERNEL); + if (!opnd) + return -ENOMEM; + err = p4tc_cmds_process_opnd(tb[i], opnd, extack); + /* Will add to list because p4tc_cmd_process_opnd may have + * allocated memory inside opnd even in case of failure, + * and this memory must be freed + */ + list_add_tail(&opnd->oper_list_node, &ope->operands_list); + if (err < 0) + return P4TC_CMD_POLICY; + ope->num_opnds++; + } + + return 0; +} + +static int p4tc_cmd_process_ops(struct net *net, struct p4tc_act *act, + struct nlattr *nla, + struct p4tc_cmd_operate **op_entry, + int cmd_offset, struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operate *ope = NULL; + int err = 0; + struct nlattr *tb[P4TC_CMD_OPER_MAX + 1]; + struct p4tc_cmd_s *cmd_t; + + err = nla_parse_nested(tb, P4TC_CMD_OPER_MAX, nla, cmd_ops_policy, + extack); + if (err < 0) { + NL_SET_ERR_MSG_MOD(extack, "parse error: P4TC_CMD_OPER_\n"); + 
return P4TC_CMD_POLICY; + } + + ope = uope_to_kope(nla_data(tb[P4TC_CMD_OPERATION])); + if (!ope) + return -ENOMEM; + + ope->cmd_offset = cmd_offset; + + cmd_t = p4tc_get_cmd_byid(ope->op_id); + if (!cmd_t) { + NL_SET_ERR_MSG_MOD(extack, "Unknown operation ID\n"); + kfree(ope); + return -EINVAL; + } + + if (tb[P4TC_CMD_OPER_LABEL1]) { + const char *label1 = nla_data(tb[P4TC_CMD_OPER_LABEL1]); + const u32 label1_sz = nla_len(tb[P4TC_CMD_OPER_LABEL1]); + + ope->label1 = kzalloc(label1_sz, GFP_KERNEL); + if (!ope->label1) + return P4TC_CMD_POLICY; + + strscpy(ope->label1, label1, label1_sz); + } + + if (tb[P4TC_CMD_OPER_LABEL2]) { + const char *label2 = nla_data(tb[P4TC_CMD_OPER_LABEL2]); + const u32 label2_sz = nla_len(tb[P4TC_CMD_OPER_LABEL2]); + + ope->label2 = kzalloc(label2_sz, GFP_KERNEL); + if (!ope->label2) + return P4TC_CMD_POLICY; + + strscpy(ope->label2, label2, label2_sz); + } + + if (tb[P4TC_CMD_OPER_LIST]) { + err = p4tc_cmd_process_operands_list(tb[P4TC_CMD_OPER_LIST], + ope, extack); + if (err) { + err = P4TC_CMD_POLICY; + goto set_results; + } + } + + err = cmd_t->validate_operands(net, act, ope, cmd_t->num_opnds, extack); + if (err) { + //XXX: think about getting rid of this P4TC_CMD_POLICY + err = P4TC_CMD_POLICY; + goto set_results; + } + +set_results: + ope->cmd = cmd_t; + *op_entry = ope; + + return err; +} + +static inline int cmd_is_branch(u32 cmdid) +{ + if (cmdid == P4TC_CMD_OP_BEQ || cmdid == P4TC_CMD_OP_BNE || + cmdid == P4TC_CMD_OP_BLT || cmdid == P4TC_CMD_OP_BLE || + cmdid == P4TC_CMD_OP_BGT || cmdid == P4TC_CMD_OP_BGE) + return 1; + + return 0; +} + +static int cmd_jump_operand_validate(struct p4tc_act *act, + struct p4tc_cmd_operate *ope, + struct p4tc_cmd_operand *kopnd, int cmdcnt, + struct netlink_ext_ack *extack) +{ + int jmp_cnt, cmd_offset; + + cmd_offset = cmd_find_label_offset(act, + (const char *)kopnd->path_or_value, + extack); + if (cmd_offset < 0) + return cmd_offset; + + if (cmd_offset >= cmdcnt) { + NL_SET_ERR_MSG(extack, 
"Jump excessive branch"); + return -EINVAL; + } + + jmp_cnt = cmd_offset - ope->cmd_offset - 1; + if (jmp_cnt <= 0) { + NL_SET_ERR_MSG_MOD(extack, "Backward jumps are not allowed"); + return -EINVAL; + } + + kopnd->immedv = TC_ACT_JUMP | jmp_cnt; + + return 0; +} + +static int cmd_brn_validate(struct p4tc_act *act, + struct p4tc_cmd_operate *oplist[], int cnt, + struct netlink_ext_ack *extack) +{ + int cmdcnt = cnt - 1; + int i; + + for (i = 1; i < cmdcnt; i++) { + struct p4tc_cmd_operate *ope = oplist[i - 1]; + int jmp_cnt = 0; + struct p4tc_cmd_operand *kopnd; + + if (ope->op_id == P4TC_CMD_OP_JUMP) { + list_for_each_entry(kopnd, &ope->operands_list, oper_list_node) { + int ret; + + if (kopnd->immedv) { + jmp_cnt = kopnd->immedv & TC_ACT_EXT_VAL_MASK; + if (jmp_cnt + i >= cmdcnt) { + NL_SET_ERR_MSG(extack, + "jump excessive branch"); + return -EINVAL; + } + } else { + ret = cmd_jump_operand_validate(act, ope, + kopnd, + cmdcnt, extack); + if (ret < 0) + return ret; + } + } + } + + if (!cmd_is_branch(ope->op_id)) + continue; + + if (TC_ACT_EXT_CMP(ope->ctl1, TC_ACT_JUMP)) { + if (ope->label1) { + int cmd_offset; + + cmd_offset = cmd_find_label_offset(act, + ope->label1, + extack); + if (cmd_offset < 0) + return -EINVAL; + jmp_cnt = cmd_offset - ope->cmd_offset; + + if (jmp_cnt <= 0) { + NL_SET_ERR_MSG_MOD(extack, + "Backward jumps are not allowed"); + return -EINVAL; + } + ope->ctl1 |= jmp_cnt; + } else { + jmp_cnt = ope->ctl1 & TC_ACT_EXT_VAL_MASK; + if (jmp_cnt + i >= cmdcnt) { + NL_SET_ERR_MSG(extack, + "ctl1 excessive branch"); + return -EINVAL; + } + } + } + + if (TC_ACT_EXT_CMP(ope->ctl2, TC_ACT_JUMP)) { + if (ope->label2) { + int cmd_offset; + + cmd_offset = cmd_find_label_offset(act, + ope->label2, + extack); + if (cmd_offset < 0) + return -EINVAL; + jmp_cnt = cmd_offset - ope->cmd_offset; + + if (jmp_cnt <= 0) { + NL_SET_ERR_MSG_MOD(extack, + "Backward jumps are not allowed"); + return -EINVAL; + } + ope->ctl2 |= jmp_cnt; + } else { + jmp_cnt = ope->ctl2 
& TC_ACT_EXT_VAL_MASK; + if (jmp_cnt + i >= cmdcnt) { + NL_SET_ERR_MSG(extack, + "ctl2 excessive branch"); + return -EINVAL; + } + } + } + } + + return 0; +} + +static void p4tc_cmds_insert_acts(struct p4tc_act *act, + struct p4tc_cmd_operate *ope) +{ + struct tc_action *actions[TCA_ACT_MAX_PRIO] = { NULL }; + int i = 0; + struct p4tc_cmd_operand *kopnd; + + list_for_each_entry(kopnd, &ope->operands_list, oper_list_node) { + if (kopnd->oper_type == P4TC_OPER_ACTID && + !(kopnd->oper_flags & DATA_USES_ROOT_PIPE)) { + struct p4tc_act_dep_edge_node *edge_node = kopnd->priv; + struct tcf_p4act *p = to_p4act(kopnd->action); + + /* Add to the dependency graph so we can detect + * circular references + */ + tcf_pipeline_add_dep_edge(act->pipeline, edge_node, + p->act_id); + kopnd->priv = NULL; + + actions[i] = kopnd->action; + i++; + } + } + + tcf_idr_insert_many(actions); +} + +static void p4tc_cmds_ops_pass_to_list(struct p4tc_act *act, + struct p4tc_cmd_operate **oplist, + struct list_head *cmd_operations, + bool called_from_instance) +{ + int i; + + for (i = 0; i < P4TC_CMDS_LIST_MAX && oplist[i]; i++) { + struct p4tc_cmd_operate *ope = oplist[i]; + + if (!called_from_instance) + p4tc_cmds_insert_acts(act, ope); + + list_add_tail(&ope->cmd_operations, cmd_operations); + } +} + +static void p4tc_cmd_ops_del_list(struct net *net, + struct list_head *cmd_operations) +{ + struct p4tc_cmd_operate *ope, *tmp; + + list_for_each_entry_safe(ope, tmp, cmd_operations, cmd_operations) { + list_del(&ope->cmd_operations); + kfree_opentry(net, ope, false); + } +} + +static int p4tc_cmds_copy_opnd(struct p4tc_act *act, + struct p4tc_cmd_operand **new_kopnd, + struct p4tc_cmd_operand *kopnd, + struct netlink_ext_ack *extack) +{ + struct p4tc_type_mask_shift *mask_shift = NULL; + struct p4tc_cmd_operand *_new_kopnd; + int err = 0; + + _new_kopnd = kzalloc(sizeof(*_new_kopnd), GFP_KERNEL); + if (!_new_kopnd) + return -ENOMEM; + + memcpy(_new_kopnd, kopnd, sizeof(*_new_kopnd)); + 
memset(&_new_kopnd->oper_list_node, 0, sizeof(struct list_head)); + + if (kopnd->oper_type == P4TC_OPER_CONST && + kopnd->oper_datatype->ops->create_bitops) { + mask_shift = create_constant_bitops(kopnd, kopnd->oper_datatype, + extack); + if (IS_ERR(mask_shift)) { + err = -EINVAL; + goto err; + } + } else if (kopnd->oper_type == P4TC_OPER_META && + kopnd->oper_datatype->ops->create_bitops) { + struct p4tc_pipeline *pipeline; + struct p4tc_metadata *meta; + + if (kopnd->pipeid == P4TC_KERNEL_PIPEID) + pipeline = tcf_pipeline_find_byid(NULL, kopnd->pipeid); + else + pipeline = act->pipeline; + + meta = tcf_meta_find_byid(pipeline, kopnd->immedv); + if (!meta) { + err = -EINVAL; + goto err; + } + + mask_shift = create_metadata_bitops(kopnd, meta, + kopnd->oper_datatype, + extack); + if (IS_ERR(mask_shift)) { + err = -EINVAL; + goto err; + } + } else if (kopnd->oper_type == P4TC_OPER_HDRFIELD || + kopnd->oper_type == P4TC_OPER_PARAM || + kopnd->oper_type == P4TC_OPER_REG) { + if (kopnd->oper_datatype->ops->create_bitops) { + struct p4tc_type_ops *ops = kopnd->oper_datatype->ops; + + mask_shift = ops->create_bitops(kopnd->oper_bitsize, + kopnd->oper_bitstart, + kopnd->oper_bitend, + extack); + if (IS_ERR(mask_shift)) { + err = -EINVAL; + goto err; + } + } + } + + _new_kopnd->oper_mask_shift = mask_shift; + + if (kopnd->path_or_value_sz) { + _new_kopnd->path_or_value = + kzalloc(kopnd->path_or_value_sz, GFP_KERNEL); + if (!_new_kopnd->path_or_value) { + err = -ENOMEM; + goto err; + } + + memcpy(_new_kopnd->path_or_value, kopnd->path_or_value, + kopnd->path_or_value_sz); + } + + if (kopnd->path_or_value_extra_sz) { + _new_kopnd->path_or_value_extra = + kzalloc(kopnd->path_or_value_extra_sz, GFP_KERNEL); + if (!_new_kopnd->path_or_value_extra) { + err = -ENOMEM; + goto err; + } + + memcpy(_new_kopnd->path_or_value_extra, + kopnd->path_or_value_extra, + kopnd->path_or_value_extra_sz); + } + + if (kopnd->print_prefix_sz) { + _new_kopnd->print_prefix = + 
kzalloc(kopnd->print_prefix_sz, GFP_KERNEL); + if (!_new_kopnd->print_prefix) { + err = -ENOMEM; + goto err; + } + memcpy(_new_kopnd->print_prefix, kopnd->print_prefix, + kopnd->print_prefix_sz); + } + + memcpy(_new_kopnd->immedv_large, kopnd->immedv_large, + kopnd->immedv_large_sz); + + *new_kopnd = _new_kopnd; + + return 0; + +err: + kfree(_new_kopnd->path_or_value); + kfree(_new_kopnd->path_or_value_extra); + kfree(_new_kopnd); + + return err; +} + +static int p4tc_cmds_copy_ops(struct p4tc_act *act, + struct p4tc_cmd_operate **new_op_entry, + struct p4tc_cmd_operate *op_entry, + struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operate *_new_op_entry; + struct p4tc_cmd_operand *cursor; + int err = 0; + + _new_op_entry = kzalloc(sizeof(*_new_op_entry), GFP_KERNEL); + if (!_new_op_entry) + return -ENOMEM; + + INIT_LIST_HEAD(&_new_op_entry->operands_list); + list_for_each_entry(cursor, &op_entry->operands_list, oper_list_node) { + struct p4tc_cmd_operand *new_opnd = NULL; + + err = p4tc_cmds_copy_opnd(act, &new_opnd, cursor, extack); + if (new_opnd) { + struct list_head *head; + + head = &new_opnd->oper_list_node; + list_add_tail(&new_opnd->oper_list_node, + &_new_op_entry->operands_list); + } + if (err < 0) + goto set_results; + } + + _new_op_entry->op_id = op_entry->op_id; + _new_op_entry->op_flags = op_entry->op_flags; + _new_op_entry->op_cnt = op_entry->op_cnt; + _new_op_entry->cmd_offset = op_entry->cmd_offset; + + _new_op_entry->ctl1 = op_entry->ctl1; + _new_op_entry->ctl2 = op_entry->ctl2; + _new_op_entry->cmd = op_entry->cmd; + +set_results: + *new_op_entry = _new_op_entry; + + return err; +} + +int p4tc_cmds_copy(struct p4tc_act *act, struct list_head *new_cmd_operations, + bool delete_old, struct netlink_ext_ack *extack) +{ + struct p4tc_cmd_operate *oplist[P4TC_CMDS_LIST_MAX] = { NULL }; + int i = 0; + struct p4tc_cmd_operate *op; + int err; + + if (delete_old) + p4tc_cmd_ops_del_list(NULL, new_cmd_operations); + + list_for_each_entry(op, 
&act->cmd_operations, cmd_operations) { + err = p4tc_cmds_copy_ops(act, &oplist[i], op, extack); + if (err < 0) + goto free_oplist; + + i++; + } + + p4tc_cmds_ops_pass_to_list(act, oplist, new_cmd_operations, true); + + return 0; + +free_oplist: + kfree_tmp_oplist(NULL, oplist, false); + return err; +} + +#define SEPARATOR "/" + +int p4tc_cmds_parse(struct net *net, struct p4tc_act *act, struct nlattr *nla, + bool ovr, struct netlink_ext_ack *extack) +{ + /* XXX: oplist and oplist_attr + * could bloat the stack depending on P4TC_CMDS_LIST_MAX + */ + struct p4tc_cmd_operate *oplist[P4TC_CMDS_LIST_MAX] = { NULL }; + struct nlattr *oplist_attr[P4TC_CMDS_LIST_MAX + 1]; + struct rhashtable *labels = act->labels; + int err; + int i; + + err = nla_parse_nested(oplist_attr, P4TC_CMDS_LIST_MAX, nla, NULL, + extack); + if (err < 0) + return err; + + act->labels = kzalloc(sizeof(*labels), GFP_KERNEL); + if (!act->labels) + return -ENOMEM; + + err = rhashtable_init(act->labels, &p4tc_label_ht_params); + if (err < 0) { + kfree(act->labels); + act->labels = labels; + return err; + } + + for (i = 1; i < P4TC_CMDS_LIST_MAX + 1 && oplist_attr[i]; i++) { + if (!oplist_attr[i]) + break; + err = p4tc_cmd_process_ops(net, act, oplist_attr[i], + &oplist[i - 1], i - 1, extack); + if (err) { + kfree_tmp_oplist(net, oplist, true); + + if (err == P4TC_CMD_POLICY) + err = -EINVAL; + + goto free_labels; + } + } + + err = cmd_brn_validate(act, oplist, i, extack); + if (err < 0) { + kfree_tmp_oplist(net, oplist, true); + goto free_labels; + } + + if (ovr) { + p4tc_cmd_ops_del_list(net, &act->cmd_operations); + if (labels) { + rhashtable_free_and_destroy(labels, p4tc_label_ht_destroy, + NULL); + kfree(labels); + } + } + + /*XXX: At this point we have all the cmds and they are valid */ + p4tc_cmds_ops_pass_to_list(act, oplist, &act->cmd_operations, false); + + return 0; + +free_labels: + rhashtable_destroy(act->labels); + kfree(act->labels); + if (ovr) + act->labels = labels; + else + act->labels 
= NULL; + + return err; +} + +static void *p4tc_fetch_constant(struct sk_buff *skb, + struct p4tc_cmd_operand *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + if (op->oper_flags & DATA_IS_IMMEDIATE) + return &op->immedv; + + return op->immedv_large; +} + +static void *p4tc_fetch_table(struct sk_buff *skb, struct p4tc_cmd_operand *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + return op->priv; +} + +static void *p4tc_fetch_result(struct sk_buff *skb, struct p4tc_cmd_operand *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + if (op->immedv == P4TC_CMDS_RESULTS_HIT) + return &res->hit; + else + return &res->miss; +} + +static void *p4tc_fetch_hdrfield(struct sk_buff *skb, + struct p4tc_cmd_operand *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + return tcf_hdrfield_fetch(skb, op->priv); +} + +static void *p4tc_fetch_param(struct sk_buff *skb, struct p4tc_cmd_operand *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct tcf_p4act_params *params; + struct p4tc_act_param *param; + + params = rcu_dereference(cmd->params); + param = idr_find(¶ms->params_idr, op->immedv2); + + if (param->flags & P4TC_ACT_PARAM_FLAGS_ISDYN) { + struct p4tc_cmd_operand *intern_op = param->value; + + return intern_op->fetch(skb, intern_op, cmd, res); + } + + return param->value; +} + +static void *p4tc_fetch_key(struct sk_buff *skb, struct p4tc_cmd_operand *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct p4tc_skb_ext *p4tc_skb_ext; + + p4tc_skb_ext = skb_ext_find(skb, P4TC_SKB_EXT); + if (unlikely(!p4tc_skb_ext)) + return NULL; + + return p4tc_skb_ext->p4tc_ext->key; +} + +static void *p4tc_fetch_dev(struct sk_buff *skb, struct p4tc_cmd_operand *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + return &op->immedv; +} + +static void *p4tc_fetch_metadata(struct sk_buff *skb, + struct p4tc_cmd_operand *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + return tcf_meta_fetch(skb, op->priv); +} + +static void 
*p4tc_fetch_reg(struct sk_buff *skb, struct p4tc_cmd_operand *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct p4tc_register *reg = op->priv; + size_t bytesz; + + bytesz = BITS_TO_BYTES(reg->reg_type->container_bitsz); + + return reg->reg_value + bytesz * op->immedv2; +} + +/* SET A B - A is set from B + * + * Assumes everything has been vetted - meaning no checks here + * + */ +static int p4tc_cmd_SET(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct p4tc_cmd_operand *A, *B; + void *src; + void *dst; + int err; + + A = GET_OPA(&op->operands_list); + B = GET_OPB(&op->operands_list); + + src = B->fetch(skb, B, cmd, res); + dst = A->fetch(skb, A, cmd, res); + + if (!src || !dst) + return TC_ACT_SHOT; + + err = p4tc_copy_op(A, B, dst, src); + + if (err) + return TC_ACT_SHOT; + + return op->ctl1; +} + +/* ACT A - execute action A + * + * Assumes everything has been vetted - meaning no checks here + * + */ +static int p4tc_cmd_ACT(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct p4tc_cmd_operand *A = GET_OPA(&op->operands_list); + const struct tc_action *action = A->action; + + return action->ops->act(skb, action, res); +} + +static int p4tc_cmd_PRINT(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct p4tc_cmd_operand *A = GET_OPA(&op->operands_list); + u64 readval[BITS_TO_U64(P4T_MAX_BITSZ)] = { 0 }; + struct net *net = dev_net(skb->dev); + char name[(TEMPLATENAMSZ * 4)]; + struct p4tc_type *val_t; + void *val; + + A = GET_OPA(&op->operands_list); + val = A->fetch(skb, A, cmd, res); + val_t = A->oper_datatype; + + if (!val) + return TC_ACT_OK; + + p4tc_reg_lock(A, NULL, NULL); + if (val_t->ops->host_read) + val_t->ops->host_read(val_t, A->oper_mask_shift, val, &readval); + else + memcpy(&readval, val, BITS_TO_BYTES(A->oper_bitsize)); + /* This is a debug function, so 
performance is not a priority */ + if (A->oper_type == P4TC_OPER_META) { + struct p4tc_pipeline *pipeline = NULL; + char *path = (char *)A->print_prefix; + struct p4tc_metadata *meta; + + pipeline = tcf_pipeline_find_byid(net, A->pipeid); + meta = tcf_meta_find_byid(pipeline, A->immedv); + + if (path) + snprintf(name, + (TEMPLATENAMSZ << 1) + + P4TC_CMD_MAX_OPER_PATH_LEN, + "%s %s.%s", path, pipeline->common.name, + meta->common.name); + else + snprintf(name, TEMPLATENAMSZ << 1, "%s.%s", + pipeline->common.name, meta->common.name); + + val_t->ops->print(net, val_t, name, &readval); + } else if (A->oper_type == P4TC_OPER_HDRFIELD) { + char *path = (char *)A->print_prefix; + struct p4tc_hdrfield *hdrfield; + struct p4tc_pipeline *pipeline; + struct p4tc_parser *parser; + + pipeline = tcf_pipeline_find_byid(net, A->pipeid); + parser = tcf_parser_find_byid(pipeline, A->immedv); + hdrfield = tcf_hdrfield_find_byid(parser, A->immedv2); + + if (path) + snprintf(name, TEMPLATENAMSZ * 4, + "%s hdrfield.%s.%s.%s", path, + pipeline->common.name, parser->parser_name, + hdrfield->common.name); + else + snprintf(name, TEMPLATENAMSZ * 4, "hdrfield.%s.%s.%s", + pipeline->common.name, parser->parser_name, + hdrfield->common.name); + + val_t->ops->print(net, val_t, name, &readval); + } else if (A->oper_type == P4TC_OPER_KEY) { + char *path = (char *)A->print_prefix; + struct p4tc_table *table; + struct p4tc_pipeline *pipeline; + + pipeline = tcf_pipeline_find_byid(net, A->pipeid); + table = tcf_table_find_byid(pipeline, A->immedv); + if (path) + snprintf(name, TEMPLATENAMSZ * 3, "%s key.%s.%s.%u", + path, pipeline->common.name, + table->common.name, A->immedv2); + else + snprintf(name, TEMPLATENAMSZ * 3, "key.%s.%s.%u", + pipeline->common.name, table->common.name, + A->immedv2); + val_t->ops->print(net, val_t, name, &readval); + } else if (A->oper_type == P4TC_OPER_PARAM) { + char *path = (char *)A->print_prefix; + + if (path) + snprintf(name, TEMPLATENAMSZ * 2, "%s param", path); + 
else + strcpy(name, "param"); + + val_t->ops->print(net, val_t, "param", &readval); + } else if (A->oper_type == P4TC_OPER_RES) { + char *path = (char *)A->print_prefix; + + if (A->immedv == P4TC_CMDS_RESULTS_HIT) { + if (path) + snprintf(name, TEMPLATENAMSZ * 2, "%s res.hit", + path); + else + strcpy(name, "res.hit"); + + } else if (A->immedv == P4TC_CMDS_RESULTS_MISS) { + if (path) + snprintf(name, TEMPLATENAMSZ * 2, "%s res.miss", + path); + else + strcpy(name, "res.miss"); + } + + val_t->ops->print(net, val_t, name, &readval); + } else if (A->oper_type == P4TC_OPER_REG) { + char *path = (char *)A->print_prefix; + struct p4tc_pipeline *pipeline; + struct p4tc_register *reg; + + pipeline = tcf_pipeline_find_byid(net, A->pipeid); + reg = tcf_register_find_byid(pipeline, A->immedv); + if (path) + snprintf(name, TEMPLATENAMSZ * 2, + "%s register.%s.%s[%u]", path, + pipeline->common.name, reg->common.name, + A->immedv2); + else + snprintf(name, TEMPLATENAMSZ * 2, "register.%s.%s[%u]", + pipeline->common.name, reg->common.name, + A->immedv2); + + val_t->ops->print(net, val_t, name, &readval); + } else { + pr_info("Unsupported operand for print\n"); + } + p4tc_reg_unlock(A, NULL, NULL); + + return op->ctl1; +} + +#define REDIRECT_RECURSION_LIMIT 4 +static DEFINE_PER_CPU(unsigned int, redirect_rec_level); + +static int p4tc_cmd_SNDPORTEGR(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct sk_buff *skb2 = skb; + int retval = TC_ACT_STOLEN; + struct p4tc_cmd_operand *A; + struct net_device *dev; + unsigned int rec_level; + bool expects_nh; + u32 *ifindex; + int mac_len; + bool at_nh; + int err; + + A = GET_OPA(&op->operands_list); + ifindex = A->fetch(skb, A, cmd, res); + + rec_level = __this_cpu_inc_return(redirect_rec_level); + if (unlikely(rec_level > REDIRECT_RECURSION_LIMIT)) { + net_warn_ratelimited("SNDPORTEGR: exceeded redirect recursion limit on dev %s\n", + netdev_name(skb->dev)); + 
__this_cpu_dec(redirect_rec_level); + return TC_ACT_SHOT; + } + + dev = dev_get_by_index_rcu(dev_net(skb->dev), *ifindex); + if (unlikely(!dev)) { + pr_notice_once("SNDPORTEGR: target device is gone\n"); + __this_cpu_dec(redirect_rec_level); + return TC_ACT_SHOT; + } + + if (unlikely(!(dev->flags & IFF_UP))) { + net_notice_ratelimited("SNDPORTEGR: device %s is down\n", + dev->name); + __this_cpu_dec(redirect_rec_level); + return TC_ACT_SHOT; + } + + nf_reset_ct(skb2); + + expects_nh = !dev_is_mac_header_xmit(dev); + at_nh = skb->data == skb_network_header(skb); + if (at_nh != expects_nh) { + mac_len = skb_at_tc_ingress(skb) ? + skb->mac_len : + skb_network_header(skb) - skb_mac_header(skb); + if (expects_nh) { + /* target device/action expect data at nh */ + skb_pull_rcsum(skb2, mac_len); + } else { + /* target device/action expect data at mac */ + skb_push_rcsum(skb2, mac_len); + } + } + + skb_set_redirected(skb2, skb2->tc_at_ingress); + skb2->skb_iif = skb->dev->ifindex; + skb2->dev = dev; + + err = dev_queue_xmit(skb2); + if (err) + retval = TC_ACT_SHOT; + + __this_cpu_dec(redirect_rec_level); + + return retval; +} + +static int p4tc_cmd_MIRPORTEGR(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct sk_buff *skb2 = skb; + int retval = TC_ACT_PIPE; + struct p4tc_cmd_operand *A; + struct net_device *dev; + unsigned int rec_level; + bool expects_nh; + u32 *ifindex; + int mac_len; + bool at_nh; + int err; + + A = GET_OPA(&op->operands_list); + ifindex = A->fetch(skb, A, cmd, res); + + rec_level = __this_cpu_inc_return(redirect_rec_level); + if (unlikely(rec_level > REDIRECT_RECURSION_LIMIT)) { + net_warn_ratelimited("MIRPORTEGR: exceeded redirect recursion limit on dev %s\n", + netdev_name(skb->dev)); + __this_cpu_dec(redirect_rec_level); + return TC_ACT_SHOT; + } + + dev = dev_get_by_index_rcu(dev_net(skb->dev), *ifindex); + if (unlikely(!dev)) { + pr_notice_once("MIRPORTEGR: target device is gone\n"); + 
__this_cpu_dec(redirect_rec_level); + return TC_ACT_SHOT; + } + + if (unlikely(!(dev->flags & IFF_UP))) { + net_notice_ratelimited("MIRPORTEGR: device %s is down\n", + dev->name); + __this_cpu_dec(redirect_rec_level); + return TC_ACT_SHOT; + } + + skb2 = skb_clone(skb, GFP_ATOMIC); + if (!skb2) { + __this_cpu_dec(redirect_rec_level); + return retval; + } + + nf_reset_ct(skb2); + + expects_nh = !dev_is_mac_header_xmit(dev); + at_nh = skb->data == skb_network_header(skb); + if (at_nh != expects_nh) { + mac_len = skb_at_tc_ingress(skb) ? + skb->mac_len : + skb_network_header(skb) - skb_mac_header(skb); + if (expects_nh) { + /* target device/action expect data at nh */ + skb_pull_rcsum(skb2, mac_len); + } else { + /* target device/action expect data at mac */ + skb_push_rcsum(skb2, mac_len); + } + } + + skb2->skb_iif = skb->dev->ifindex; + skb2->dev = dev; + + err = dev_queue_xmit(skb2); + if (err) + retval = TC_ACT_SHOT; + + __this_cpu_dec(redirect_rec_level); + + return retval; +} + +static int p4tc_cmd_TBLAPP(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct p4tc_cmd_operand *A = GET_OPA(&op->operands_list); + struct p4tc_table *table = A->fetch(skb, A, cmd, res); + struct p4tc_table_entry *entry; + struct p4tc_table_key *key; + int ret; + + A = GET_OPA(&op->operands_list); + table = A->fetch(skb, A, cmd, res); + if (unlikely(!table)) + return TC_ACT_SHOT; + + if (table->tbl_preacts) { + ret = tcf_action_exec(skb, table->tbl_preacts, + table->tbl_num_preacts, res); + /* Should check what return code should cause return */ + if (ret == TC_ACT_SHOT) + return ret; + } + + /* Sets key */ + key = table->tbl_key; + ret = tcf_action_exec(skb, key->key_acts, key->key_num_acts, res); + if (ret != TC_ACT_PIPE) + return ret; + + entry = p4tc_table_entry_lookup(skb, table, table->tbl_keysz); + if (IS_ERR(entry)) + entry = NULL; + + res->hit = entry ? 
true : false; + res->miss = !res->hit; + + ret = TC_ACT_PIPE; + if (res->hit) { + struct p4tc_table_defact *hitact; + + hitact = rcu_dereference(table->tbl_default_hitact); + if (entry->acts) + ret = tcf_action_exec(skb, entry->acts, entry->num_acts, + res); + else if (hitact) + ret = tcf_action_exec(skb, hitact->default_acts, 1, + res); + } else { + struct p4tc_table_defact *missact; + + missact = rcu_dereference(table->tbl_default_missact); + if (missact) + ret = tcf_action_exec(skb, missact->default_acts, 1, + res); + } + if (ret != TC_ACT_PIPE) + return ret; + + return tcf_action_exec(skb, table->tbl_postacts, + table->tbl_num_postacts, res); +} + +static int p4tc_cmd_BINARITH(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res, + void (*p4tc_arith_op)(u64 *res, u64 opB, u64 opC)) +{ + u64 result = 0; + u64 B_val = 0; + u64 C_val = 0; + struct p4tc_cmd_operand *A, *B, *C; + struct p4tc_type_ops *src_C_ops; + struct p4tc_type_ops *src_B_ops; + struct p4tc_type_ops *dst_ops; + void *src_B; + void *src_C; + void *dst; + + A = GET_OPA(&op->operands_list); + B = GET_OPB(&op->operands_list); + C = GET_OPC(&op->operands_list); + + dst = A->fetch(skb, A, cmd, res); + src_B = B->fetch(skb, B, cmd, res); + src_C = C->fetch(skb, C, cmd, res); + + if (!src_B || !src_C || !dst) + return TC_ACT_SHOT; + + dst_ops = A->oper_datatype->ops; + src_B_ops = B->oper_datatype->ops; + src_C_ops = C->oper_datatype->ops; + + p4tc_reg_lock(A, B, C); + + src_B_ops->host_read(B->oper_datatype, B->oper_mask_shift, src_B, + &B_val); + src_C_ops->host_read(C->oper_datatype, C->oper_mask_shift, src_C, + &C_val); + + p4tc_arith_op(&result, B_val, C_val); + + dst_ops->host_write(A->oper_datatype, A->oper_mask_shift, &result, dst); + + p4tc_reg_unlock(A, B, C); + + return op->ctl1; +} + +/* Overflow semantic is the same as C's for u64 */ +static void plus_op(u64 *res, u64 opB, u64 opC) +{ + *res = opB + opC; +} + +static int p4tc_cmd_PLUS(struct 
sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + return p4tc_cmd_BINARITH(skb, op, cmd, res, plus_op); +} + +/* Underflow semantic is the same as C's for u64 */ +static void sub_op(u64 *res, u64 opB, u64 opC) +{ + *res = opB - opC; +} + +static int p4tc_cmd_SUB(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + return p4tc_cmd_BINARITH(skb, op, cmd, res, sub_op); +} + +static void band_op(u64 *res, u64 opB, u64 opC) +{ + *res = opB & opC; +} + +static int p4tc_cmd_BAND(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + return p4tc_cmd_BINARITH(skb, op, cmd, res, band_op); +} + +static void bor_op(u64 *res, u64 opB, u64 opC) +{ + *res = opB | opC; +} + +static int p4tc_cmd_BOR(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + return p4tc_cmd_BINARITH(skb, op, cmd, res, bor_op); +} + +static void bxor_op(u64 *res, u64 opB, u64 opC) +{ + *res = opB ^ opC; +} + +static int p4tc_cmd_BXOR(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + return p4tc_cmd_BINARITH(skb, op, cmd, res, bxor_op); +} + +static int p4tc_cmd_CONCAT(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + u64 RvalAcc[BITS_TO_U64(P4T_MAX_BITSZ)] = { 0 }; + size_t rvalue_tot_sz = 0; + struct p4tc_cmd_operand *cursor; + struct p4tc_type_ops *dst_ops; + struct p4tc_cmd_operand *A; + void *dst; + + A = GET_OPA(&op->operands_list); + + cursor = A; + list_for_each_entry_continue(cursor, &op->operands_list, oper_list_node) { + size_t cursor_bytesz = BITS_TO_BYTES(cursor->oper_bitsize); + struct p4tc_type *cursor_type = cursor->oper_datatype; + struct p4tc_type_ops *cursor_type_ops = cursor_type->ops; + void *srcR = cursor->fetch(skb, cursor, cmd, res); + u64 Rval[BITS_TO_U64(P4T_MAX_BITSZ)] = { 0 }; + 
+ cursor_type_ops->host_read(cursor->oper_datatype, + cursor->oper_mask_shift, srcR, + &Rval); + cursor_type_ops->host_write(cursor->oper_datatype, + cursor->oper_mask_shift, &Rval, + (char *)RvalAcc + rvalue_tot_sz); + rvalue_tot_sz += cursor_bytesz; + } + + dst = A->fetch(skb, A, cmd, res); + dst_ops = A->oper_datatype->ops; + dst_ops->host_write(A->oper_datatype, A->oper_mask_shift, RvalAcc, dst); + + return op->ctl1; +} + +static int p4tc_cmd_JUMP(struct sk_buff *skb, struct p4tc_cmd_operate *op, + struct tcf_p4act *cmd, struct tcf_result *res) +{ + struct p4tc_cmd_operand *A; + + A = GET_OPA(&op->operands_list); + + return A->immedv; +} diff --git a/net/sched/p4tc/p4tc_meta.c b/net/sched/p4tc/p4tc_meta.c index ebeb73352..d4c340473 100644 --- a/net/sched/p4tc/p4tc_meta.c +++ b/net/sched/p4tc/p4tc_meta.c @@ -202,6 +202,71 @@ static int p4tc_check_meta_size(struct p4tc_meta_size_params *sz_params, return new_bitsz; } +static inline void *tcf_meta_fetch_kernel(struct sk_buff *skb, + const u32 kernel_meta_id) +{ + switch (kernel_meta_id) { + case P4TC_KERNEL_META_QMAP: + return &skb->queue_mapping; + case P4TC_KERNEL_META_PKTLEN: + return &skb->len; + case P4TC_KERNEL_META_DATALEN: + return &skb->data_len; + case P4TC_KERNEL_META_SKBMARK: + return &skb->mark; + case P4TC_KERNEL_META_TCINDEX: + return &skb->tc_index; + case P4TC_KERNEL_META_SKBHASH: + return &skb->hash; + case P4TC_KERNEL_META_SKBPRIO: + return &skb->priority; + case P4TC_KERNEL_META_IFINDEX: + return &skb->dev->ifindex; + case P4TC_KERNEL_META_SKBIIF: + return &skb->skb_iif; + case P4TC_KERNEL_META_PROTOCOL: + return &skb->protocol; + case P4TC_KERNEL_META_PKTYPE: + case P4TC_KERNEL_META_IDF: + case P4TC_KERNEL_META_IPSUM: + case P4TC_KERNEL_META_OOOK: + case P4TC_KERNEL_META_PTYPEOFF: + case P4TC_KERNEL_META_PTCLNOFF: + return &skb->__pkt_type_offset; + case P4TC_KERNEL_META_FCLONE: + case P4TC_KERNEL_META_PEEKED: + case P4TC_KERNEL_META_CLONEOFF: + return &skb->__cloned_offset; + case 
P4TC_KERNEL_META_DIRECTION: + return &skb->__pkt_vlan_present_offset; + default: + return NULL; + } + + return NULL; +} + +static inline void *tcf_meta_fetch_user(struct sk_buff *skb, const u32 skb_off) +{ + struct p4tc_skb_ext *p4tc_skb_ext; + + p4tc_skb_ext = skb_ext_find(skb, P4TC_SKB_EXT); + if (!p4tc_skb_ext) { + pr_err("Unable to find P4TC_SKB_EXT\n"); + return NULL; + } + + return &p4tc_skb_ext->p4tc_ext->metadata[skb_off]; +} + +void *tcf_meta_fetch(struct sk_buff *skb, struct p4tc_metadata *meta) +{ + if (meta->common.p_id != P4TC_KERNEL_PIPEID) + return tcf_meta_fetch_user(skb, meta->m_skb_off); + + return tcf_meta_fetch_kernel(skb, meta->m_id); +} + void tcf_meta_fill_user_offsets(struct p4tc_pipeline *pipeline) { u32 meta_off = START_META_OFFSET; From patchwork Tue Jan 24 17:05:10 2023
X-Patchwork-Submitter: Jamal Hadi Salim
X-Patchwork-Id: 13114397
X-Patchwork-Delegate: kuba@kernel.org
From: Jamal Hadi Salim
To: netdev@vger.kernel.org
Subject: [PATCH net-next RFC 20/20] p4tc: add P4 classifier
Date: Tue, 24 Jan 2023 12:05:10 -0500
Message-Id: <20230124170510.316970-20-jhs@mojatatu.com>
In-Reply-To: <20230124170510.316970-1-jhs@mojatatu.com>
References: <20230124170510.316970-1-jhs@mojatatu.com>
X-Patchwork-State: RFC

Introduce the P4 classifier, which we'll use to execute our P4 pipelines in the kernel.

To use the P4 classifier you must specify a pipeline name that will be associated with this filter. That pipeline must already have been created via a template. For example, to add a filter to the ingress of network device lo and associate it with the P4 pipeline simple_l3, we'd issue the following command:

tc filter add dev lo parent ffff: protocol ip prio 6 p4 pname simple_l3

We could also associate an action with this filter, which would look something like this:

tc filter add dev lo parent ffff: protocol ip prio 6 p4 \
	pname simple_l3 action ok

The classifier itself has the following steps:

================================PARSING================================

If the P4 pipeline has an associated parser, the first thing the classifier does is invoke that parser. Note that the parser is an optional component.
There are P4 programs which may not need to parse headers. Assuming a parser is present, in this first step the P4 classifier will execute the parser and retrieve all the header fields that were specified in the templating phase. Also remember that a P4 program/pipeline can have at most one parser.

================================PREACTIONS================================

After parsing, the classifier will execute the pipeline preactions. Most of the time, the pipeline preactions will consist of a dynamic action table apply command, which starts the match-action chain common to P4 programs. The preactions return a standard action code (TC_ACT_OK, TC_ACT_SHOT, etc.). If the preactions return TC_ACT_PIPE, we continue to the next step of the filter execution; otherwise the filter stops executing and returns that op code.

================================POSTACTIONS================================

After the pipeline preactions have executed and returned TC_ACT_PIPE, the filter will execute the pipeline postactions. Like the preactions, the postactions return a standard action code. If the postactions return TC_ACT_PIPE, we continue to the next step of the filter execution; otherwise the filter stops executing and returns that op code.

==============================FILTER ACTIONS==============================

After the pipeline postactions have executed and returned TC_ACT_PIPE, the filter will execute the filter actions, if any were associated with it. Filter actions are the ones defined outside the P4 program, for example:

tc filter add dev lo parent ffff: protocol ip prio 6 p4 \
	pname simple_l3 action ok

The action "ok" is the classical Linux gact action. The filter will return the op code returned by this action.
Co-developed-by: Victor Nogueira Signed-off-by: Victor Nogueira Co-developed-by: Pedro Tammela Signed-off-by: Pedro Tammela Signed-off-by: Jamal Hadi Salim --- include/uapi/linux/pkt_cls.h | 13 ++ net/sched/Kconfig | 12 ++ net/sched/Makefile | 1 + net/sched/cls_p4.c | 339 +++++++++++++++++++++++++++++++++++ net/sched/p4tc/Makefile | 4 +- net/sched/p4tc/trace.c | 10 ++ net/sched/p4tc/trace.h | 45 +++++ 7 files changed, 423 insertions(+), 1 deletion(-) create mode 100644 net/sched/cls_p4.c create mode 100644 net/sched/p4tc/trace.c create mode 100644 net/sched/p4tc/trace.h diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h index 5d6e22f2a..614d013bb 100644 --- a/include/uapi/linux/pkt_cls.h +++ b/include/uapi/linux/pkt_cls.h @@ -724,6 +724,19 @@ enum { #define TCA_MATCHALL_MAX (__TCA_MATCHALL_MAX - 1) +/* P4 classifier */ + +enum { + TCA_P4_UNSPEC, + TCA_P4_CLASSID, + TCA_P4_ACT, + TCA_P4_PNAME, + TCA_P4_PAD, + __TCA_P4_MAX, +}; + +#define TCA_P4_MAX (__TCA_P4_MAX - 1) + /* Extended Matches */ struct tcf_ematch_tree_hdr { diff --git a/net/sched/Kconfig b/net/sched/Kconfig index c2fbd1889..ba84edc1a 100644 --- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -640,6 +640,18 @@ config NET_CLS_MATCHALL To compile this code as a module, choose M here: the module will be called cls_matchall. +config NET_CLS_P4 + tristate "P4 classifier" + select NET_CLS + select NET_P4_TC + help + If you say Y here, you will be able to classify packets based on + P4 pipeline programs. You will need to install P4 templates scripts + successfully to use this feature. + + To compile this code as a module, choose M here: the module will + be called cls_p4. 
+ config NET_EMATCH bool "Extended Matches" select NET_CLS diff --git a/net/sched/Makefile b/net/sched/Makefile index 465ea14cd..174230e92 100644 --- a/net/sched/Makefile +++ b/net/sched/Makefile @@ -78,6 +78,7 @@ obj-$(CONFIG_NET_CLS_CGROUP) += cls_cgroup.o obj-$(CONFIG_NET_CLS_BPF) += cls_bpf.o obj-$(CONFIG_NET_CLS_FLOWER) += cls_flower.o obj-$(CONFIG_NET_CLS_MATCHALL) += cls_matchall.o +obj-$(CONFIG_NET_CLS_P4) += cls_p4.o obj-$(CONFIG_NET_EMATCH) += ematch.o obj-$(CONFIG_NET_EMATCH_CMP) += em_cmp.o obj-$(CONFIG_NET_EMATCH_NBYTE) += em_nbyte.o diff --git a/net/sched/cls_p4.c b/net/sched/cls_p4.c new file mode 100644 index 000000000..35b21b3c0 --- /dev/null +++ b/net/sched/cls_p4.c @@ -0,0 +1,339 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * net/sched/cls_p4.c - P4 Classifier + * Copyright (c) 2022, Mojatatu Networks + * Copyright (c) 2022, Intel Corporation. + * Authors: Jamal Hadi Salim + * Victor Nogueira + * Pedro Tammela + */ + +#include +#include +#include +#include + +#include +#include + +#include + +#include "p4tc/trace.h" + +struct cls_p4_head { + struct tcf_exts exts; + struct tcf_result res; + struct rcu_work rwork; + struct p4tc_pipeline *pipeline; + u32 handle; +}; + +static int p4_classify(struct sk_buff *skb, const struct tcf_proto *tp, + struct tcf_result *res) +{ + struct cls_p4_head *head = rcu_dereference_bh(tp->root); + struct tcf_result p4res = {}; + int rc = 0; + struct p4tc_pipeline *pipeline; + struct p4tc_skb_ext *p4tc_ext; + + if (unlikely(!head)) { + pr_err("P4 classifier not found\n"); + return -1; + } + + pipeline = head->pipeline; + trace_p4_classify(skb, pipeline); + + p4tc_ext = skb_ext_find(skb, P4TC_SKB_EXT); + if (!p4tc_ext) { + p4tc_ext = p4tc_skb_ext_alloc(skb); + if (WARN_ON_ONCE(!p4tc_ext)) + return TC_ACT_SHOT; + } + + if (refcount_read(&pipeline->p_hdrs_used) > 1) + rc = tcf_skb_parse(skb, p4tc_ext, pipeline->parser); + + if (rc > 0) { + pr_warn("P4 parser error %d\n", rc); + return TC_ACT_SHOT; + } + + rc = 
tcf_action_exec(skb, pipeline->preacts, pipeline->num_preacts,
+			     &p4res);
+	if (rc != TC_ACT_PIPE)
+		return rc;
+
+	rc = tcf_action_exec(skb, pipeline->postacts, pipeline->num_postacts,
+			     &p4res);
+	if (rc != TC_ACT_PIPE)
+		return rc;
+
+	*res = head->res;
+
+	return tcf_exts_exec(skb, &head->exts, res);
+}
+
+static int p4_init(struct tcf_proto *tp)
+{
+	return 0;
+}
+
+static void __p4_destroy(struct cls_p4_head *head)
+{
+	tcf_exts_destroy(&head->exts);
+	tcf_exts_put_net(&head->exts);
+	__tcf_pipeline_put(head->pipeline);
+	kfree(head);
+}
+
+static void p4_destroy_work(struct work_struct *work)
+{
+	struct cls_p4_head *head =
+		container_of(to_rcu_work(work), struct cls_p4_head, rwork);
+
+	rtnl_lock();
+	__p4_destroy(head);
+	rtnl_unlock();
+}
+
+static void p4_destroy(struct tcf_proto *tp, bool rtnl_held,
+		       struct netlink_ext_ack *extack)
+{
+	struct cls_p4_head *head = rtnl_dereference(tp->root);
+
+	if (!head)
+		return;
+
+	tcf_unbind_filter(tp, &head->res);
+
+	if (tcf_exts_get_net(&head->exts))
+		tcf_queue_work(&head->rwork, p4_destroy_work);
+	else
+		__p4_destroy(head);
+}
+
+static void *p4_get(struct tcf_proto *tp, u32 handle)
+{
+	struct cls_p4_head *head = rtnl_dereference(tp->root);
+
+	if (head && head->handle == handle)
+		return head;
+
+	return NULL;
+}
+
+static const struct nla_policy p4_policy[TCA_P4_MAX + 1] = {
+	[TCA_P4_UNSPEC] = { .type = NLA_UNSPEC },
+	[TCA_P4_CLASSID] = { .type = NLA_U32 },
+	[TCA_P4_PNAME] = { .type = NLA_STRING },
+};
+
+static int p4_set_parms(struct net *net, struct tcf_proto *tp,
+			struct cls_p4_head *head, unsigned long base,
+			struct nlattr **tb, struct nlattr *est, u32 flags,
+			struct netlink_ext_ack *extack)
+{
+	int err;
+
+	err = tcf_exts_validate_ex(net, tp, tb, est, &head->exts, flags, 0,
+				   extack);
+	if (err < 0)
+		return err;
+
+	if (tb[TCA_P4_CLASSID]) {
+		head->res.classid = nla_get_u32(tb[TCA_P4_CLASSID]);
+		tcf_bind_filter(tp, &head->res, base);
+	}
+
+	return 0;
+}
+
+static int p4_change(struct net *net, struct sk_buff *in_skb,
+		     struct tcf_proto *tp, unsigned long base, u32 handle,
+		     struct nlattr **tca, void **arg, u32 flags,
+		     struct netlink_ext_ack *extack)
+{
+	struct cls_p4_head *head = rtnl_dereference(tp->root);
+	struct p4tc_pipeline *pipeline = NULL;
+	char *pname = NULL;
+	struct nlattr *tb[TCA_P4_MAX + 1];
+	struct cls_p4_head *new;
+	int err;
+
+	if (!tca[TCA_OPTIONS]) {
+		NL_SET_ERR_MSG(extack, "Must provide pipeline options");
+		return -EINVAL;
+	}
+
+	if (head)
+		return -EEXIST;
+
+	err = nla_parse_nested_deprecated(tb, TCA_P4_MAX, tca[TCA_OPTIONS],
+					  p4_policy, NULL);
+	if (err < 0)
+		return err;
+
+	if (tb[TCA_P4_PNAME])
+		pname = nla_data(tb[TCA_P4_PNAME]);
+
+	if (pname) {
+		pipeline = tcf_pipeline_get(net, pname, 0, extack);
+		if (IS_ERR(pipeline))
+			return PTR_ERR(pipeline);
+	} else {
+		NL_SET_ERR_MSG(extack, "MUST provide pipeline name");
+		return -EINVAL;
+	}
+
+	if (!pipeline_sealed(pipeline)) {
+		err = -EINVAL;
+		NL_SET_ERR_MSG(extack, "Pipeline must be sealed before use");
+		goto pipeline_put;
+	}
+
+	if (refcount_read(&pipeline->p_hdrs_used) > 1 &&
+	    !tcf_parser_is_callable(pipeline->parser)) {
+		err = -EINVAL;
+		NL_SET_ERR_MSG(extack, "Pipeline doesn't have callable parser");
+		goto pipeline_put;
+	}
+
+	new = kzalloc(sizeof(*new), GFP_KERNEL);
+	if (!new) {
+		err = -ENOMEM;
+		goto pipeline_put;
+	}
+
+	err = tcf_exts_init(&new->exts, net, TCA_P4_ACT, 0);
+	if (err)
+		goto err_exts_init;
+
+	if (!handle)
+		handle = 1;
+
+	new->handle = handle;
+
+	err = p4_set_parms(net, tp, new, base, tb, tca[TCA_RATE], flags,
+			   extack);
+	if (err)
+		goto err_set_parms;
+
+	new->pipeline = pipeline;
+	*arg = head;
+	rcu_assign_pointer(tp->root, new);
+	return 0;
+
+err_set_parms:
+	tcf_exts_destroy(&new->exts);
+err_exts_init:
+	kfree(new);
+pipeline_put:
+	__tcf_pipeline_put(pipeline);
+	return err;
+}
+
+static int p4_delete(struct tcf_proto *tp, void *arg, bool *last,
+		     bool rtnl_held, struct netlink_ext_ack *extack)
+{
+	*last = true;
+	return 0;
+}
+
+static void p4_walk(struct tcf_proto *tp, struct tcf_walker *arg,
+		    bool rtnl_held)
+{
+	struct cls_p4_head *head = rtnl_dereference(tp->root);
+
+	if (arg->count < arg->skip)
+		goto skip;
+
+	if (!head)
+		return;
+	if (arg->fn(tp, head, arg) < 0)
+		arg->stop = 1;
+skip:
+	arg->count++;
+}
+
+static int p4_dump(struct net *net, struct tcf_proto *tp, void *fh,
+		   struct sk_buff *skb, struct tcmsg *t, bool rtnl_held)
+{
+	struct cls_p4_head *head = fh;
+	struct nlattr *nest;
+
+	if (!head)
+		return skb->len;
+
+	t->tcm_handle = head->handle;
+
+	nest = nla_nest_start(skb, TCA_OPTIONS);
+	if (!nest)
+		goto nla_put_failure;
+
+	if (nla_put_string(skb, TCA_P4_PNAME, head->pipeline->common.name))
+		goto nla_put_failure;
+
+	if (head->res.classid &&
+	    nla_put_u32(skb, TCA_P4_CLASSID, head->res.classid))
+		goto nla_put_failure;
+
+	if (tcf_exts_dump(skb, &head->exts))
+		goto nla_put_failure;
+
+	nla_nest_end(skb, nest);
+
+	if (tcf_exts_dump_stats(skb, &head->exts) < 0)
+		goto nla_put_failure;
+
+	return skb->len;
+
+nla_put_failure:
+	nla_nest_cancel(skb, nest);
+	return -1;
+}
+
+static void p4_bind_class(void *fh, u32 classid, unsigned long cl, void *q,
+			  unsigned long base)
+{
+	struct cls_p4_head *head = fh;
+
+	if (head && head->res.classid == classid) {
+		if (cl)
+			__tcf_bind_filter(q, &head->res, base);
+		else
+			__tcf_unbind_filter(q, &head->res);
+	}
+}
+
+static struct tcf_proto_ops cls_p4_ops __read_mostly = {
+	.kind		= "p4",
+	.classify	= p4_classify,
+	.init		= p4_init,
+	.destroy	= p4_destroy,
+	.get		= p4_get,
+	.change		= p4_change,
+	.delete		= p4_delete,
+	.walk		= p4_walk,
+	.dump		= p4_dump,
+	.bind_class	= p4_bind_class,
+	.owner		= THIS_MODULE,
+};
+
+static int __init cls_p4_init(void)
+{
+	return register_tcf_proto_ops(&cls_p4_ops);
+}
+
+static void __exit cls_p4_exit(void)
+{
+	unregister_tcf_proto_ops(&cls_p4_ops);
+}
+
+module_init(cls_p4_init);
+module_exit(cls_p4_exit);
+
+MODULE_AUTHOR("Mojatatu Networks");
+MODULE_DESCRIPTION("P4 Classifier");
+MODULE_LICENSE("GPL");
diff --git a/net/sched/p4tc/Makefile b/net/sched/p4tc/Makefile
index 396fcd249..ac118a79c 100644
--- a/net/sched/p4tc/Makefile
+++ b/net/sched/p4tc/Makefile
@@ -1,5 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 
+CFLAGS_trace.o := -I$(src)
+
 obj-y := p4tc_types.o p4tc_pipeline.o p4tc_tmpl_api.o p4tc_meta.o \
 	p4tc_parser_api.o p4tc_hdrfield.o p4tc_action.o p4tc_table.o \
-	p4tc_tbl_api.o p4tc_register.o p4tc_cmds.o
+	p4tc_tbl_api.o p4tc_register.o p4tc_cmds.o trace.o
diff --git a/net/sched/p4tc/trace.c b/net/sched/p4tc/trace.c
new file mode 100644
index 000000000..9ce2e0c01
--- /dev/null
+++ b/net/sched/p4tc/trace.c
@@ -0,0 +1,10 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+
+#include
+
+#ifndef __CHECKER__
+
+#define CREATE_TRACE_POINTS
+#include "trace.h"
+
+#endif
diff --git a/net/sched/p4tc/trace.h b/net/sched/p4tc/trace.h
new file mode 100644
index 000000000..8aecd5562
--- /dev/null
+++ b/net/sched/p4tc/trace.h
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM p4tc
+
+#if !defined(__P4TC_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define __P4TC_TRACE_H
+
+#include <linux/tracepoint.h>
+
+struct p4tc_pipeline;
+
+TRACE_EVENT(p4_classify,
+
+	    TP_PROTO(struct sk_buff *skb, struct p4tc_pipeline *pipeline),
+
+	    TP_ARGS(skb, pipeline),
+
+	    TP_STRUCT__entry(__string(pname, pipeline->common.name)
+			     __field(u32, p_id)
+			     __field(u32, ifindex)
+			     __field(u32, ingress)
+			     ),
+
+	    TP_fast_assign(__assign_str(pname, pipeline->common.name);
+			   __entry->p_id = pipeline->common.p_id;
+			   __entry->ifindex = skb->dev->ifindex;
+			   __entry->ingress = skb_at_tc_ingress(skb);
+			   ),
+
+	    TP_printk("dev=%u dir=%s pipeline=%s p_id=%u",
+		      __entry->ifindex,
+		      __entry->ingress ? "ingress" : "egress",
+		      __get_str(pname),
+		      __entry->p_id
+		      )
+);
+
+#endif
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE trace
+
+#include <trace/define_trace.h>
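
For reviewers wanting to see the user-facing side: with a matching iproute2 patch (not part of this series), the classifier above would be attached roughly as follows. This is a sketch only; the `p4` filter kind and its `pname` option follow from the TCA_P4_PNAME handling in p4_change(), but the exact iproute2 syntax and the pipeline name "myprog" are illustrative assumptions.

```shell
# Hypothetical usage sketch: bind the p4 classifier to a previously
# created and *sealed* pipeline named "myprog". p4_change() rejects
# unsealed pipelines and missing pipeline names with -EINVAL.
tc filter add dev eth0 ingress protocol all prio 10 p4 pname myprog

# The p4_classify tracepoint added in trace.h can then be observed via:
#   echo 1 > /sys/kernel/debug/tracing/events/p4tc/p4_classify/enable
```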