From patchwork Thu Mar 16 02:36:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kui-Feng Lee X-Patchwork-Id: 13176917 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0DC56C6FD1D for ; Thu, 16 Mar 2023 02:37:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229686AbjCPChq (ORCPT ); Wed, 15 Mar 2023 22:37:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33110 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229513AbjCPChp (ORCPT ); Wed, 15 Mar 2023 22:37:45 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A6D7D9CFDA for ; Wed, 15 Mar 2023 19:37:44 -0700 (PDT) Received: from pps.filterd (m0044010.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 32G1mfLH026260 for ; Wed, 15 Mar 2023 19:37:44 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=s2048-2021-q4; bh=Gnps8V1BFss1aNGWOStveVPW9JmoN/bLZz1eRN6EUrs=; b=MTSF3mlPn2lM3OcnCoHufqWhsgHgyVuu8GDiPjIODZEuZflPQNVa8KZYRIl4q7Ywg4Dn JeBhgo8Eg6sIk640k35qjettLOxIbKYLIto7dMMk1xf+zXc9VCOEvnJX5ie+T4ee4E6+ fSolIOY6UkA2W4P+lzM0LZ5HaWB3QdxJAUZMuxdCX1YifYmWN2/UKJL+HHgK1YS0E9VJ eCwagI6yjZN95fYW0cNzATnSbrYcxCdCqR+7i5ozP340LJPnUuawuBN8RIkAxPOpVivB pJ+zUrg+gxI5rwEDVMJ/jaPL35Rgq7ViNMkmtihNsbCnj55OdIHYvYL8R0mOWD+NJruX wQ== Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3pbpw6120r-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Wed, 15 Mar 2023 19:37:44 -0700 Received: from twshared24004.14.frc2.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:82::c) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.17; Wed, 15 Mar 2023 19:37:42 -0700 Received: by devbig931.frc1.facebook.com (Postfix, from userid 460691) id 4CA557700226; Wed, 15 Mar 2023 19:37:35 -0700 (PDT) From: Kui-Feng Lee To: , , , , , , CC: Kui-Feng Lee , , Eric Dumazet Subject: [PATCH bpf-next v7 2/8] net: Update an existing TCP congestion control algorithm. Date: Wed, 15 Mar 2023 19:36:35 -0700 Message-ID: <20230316023641.2092778-3-kuifeng@meta.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230316023641.2092778-1-kuifeng@meta.com> References: <20230316023641.2092778-1-kuifeng@meta.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: kox6w0TNscWUk2oxHsxZWlCJ5IqVmavd X-Proofpoint-ORIG-GUID: kox6w0TNscWUk2oxHsxZWlCJ5IqVmavd X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-03-16_02,2023-03-15_01,2023-02-09_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net This feature lets you immediately transition to another congestion control algorithm or implementation with the same name. Once a name is updated, new connections will apply this new algorithm. The purpose is to update a customized algorithm implemented in BPF struct_ops with a new version on the flight. The following is an example of using the userspace API implemented in later BPF patches. link = bpf_map__attach_struct_ops(skel->maps.ca_update_1); ....... err = bpf_link__update_map(link, skel->maps.ca_update_2); We first load and register an algorithm implemented in BPF struct_ops, then swap it out with a new one using the same name. After that, newly created connections will apply the updated algorithm, while older ones retain the previous version already applied. This patch also takes this chance to refactor the ca validation into the new tcp_validate_congestion_control() function. Cc: netdev@vger.kernel.org, Eric Dumazet Signed-off-by: Kui-Feng Lee --- include/net/tcp.h | 3 +++ net/ipv4/tcp_cong.c | 60 +++++++++++++++++++++++++++++++++++++++------ 2 files changed, 56 insertions(+), 7 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index db9f828e9d1e..2abb755e6a3a 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1117,6 +1117,9 @@ struct tcp_congestion_ops { int tcp_register_congestion_control(struct tcp_congestion_ops *type); void tcp_unregister_congestion_control(struct tcp_congestion_ops *type); +int tcp_update_congestion_control(struct tcp_congestion_ops *type, + struct tcp_congestion_ops *old_type); +int tcp_validate_congestion_control(struct tcp_congestion_ops *ca); void tcp_assign_congestion_control(struct sock *sk); void tcp_init_congestion_control(struct sock *sk); diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c index db8b4b488c31..c90791ae8389 100644 --- a/net/ipv4/tcp_cong.c +++ b/net/ipv4/tcp_cong.c @@ -75,14 +75,8 @@ struct tcp_congestion_ops *tcp_ca_find_key(u32 key) return NULL; } -/* - * Attach new congestion control algorithm to the list - * of available options. - */ -int tcp_register_congestion_control(struct tcp_congestion_ops *ca) +int tcp_validate_congestion_control(struct tcp_congestion_ops *ca) { - int ret = 0; - /* all algorithms must implement these */ if (!ca->ssthresh || !ca->undo_cwnd || !(ca->cong_avoid || ca->cong_control)) { @@ -90,6 +84,20 @@ int tcp_register_congestion_control(struct tcp_congestion_ops *ca) return -EINVAL; } + return 0; +} + +/* Attach new congestion control algorithm to the list + * of available options. + */ +int tcp_register_congestion_control(struct tcp_congestion_ops *ca) +{ + int ret; + + ret = tcp_validate_congestion_control(ca); + if (ret) + return ret; + ca->key = jhash(ca->name, sizeof(ca->name), strlen(ca->name)); spin_lock(&tcp_cong_list_lock); @@ -130,6 +138,44 @@ void tcp_unregister_congestion_control(struct tcp_congestion_ops *ca) } EXPORT_SYMBOL_GPL(tcp_unregister_congestion_control); +/* Replace a registered old ca with a new one. + * + * The new ca must have the same name as the old one, that has been + * registered. + */ +int tcp_update_congestion_control(struct tcp_congestion_ops *ca, struct tcp_congestion_ops *old_ca) +{ + struct tcp_congestion_ops *existing; + int ret; + + ret = tcp_validate_congestion_control(ca); + if (ret) + return ret; + + ca->key = jhash(ca->name, sizeof(ca->name), strlen(ca->name)); + + spin_lock(&tcp_cong_list_lock); + existing = tcp_ca_find_key(old_ca->key); + if (ca->key == TCP_CA_UNSPEC || !existing || strcmp(existing->name, ca->name)) { + pr_notice("%s not registered or non-unique key\n", + ca->name); + ret = -EINVAL; + } else if (existing != old_ca) { + pr_notice("invalid old congestion control algorithm to replace\n"); + ret = -EINVAL; + } else { + /* Add the new one before removing the old one to keep + * one implementation available all the time. + */ + list_add_tail_rcu(&ca->list, &tcp_cong_list); + list_del_rcu(&existing->list); + pr_debug("%s updated\n", ca->name); + } + spin_unlock(&tcp_cong_list_lock); + + return ret; +} + u32 tcp_ca_get_key_by_name(struct net *net, const char *name, bool *ecn_ca) { const struct tcp_congestion_ops *ca;