From patchwork Mon Mar 17 10:57:55 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Akihiko Odaki
X-Patchwork-Id: 14019003
X-Patchwork-Delegate: kuba@kernel.org
From: Akihiko Odaki
Date: Mon, 17 Mar 2025 19:57:55 +0900
Subject: [PATCH net-next v11 05/10] tun: Introduce virtio-net hash feature
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
Message-Id: <20250317-rss-v11-5-4cacca92f31f@daynix.com>
References: <20250317-rss-v11-0-4cacca92f31f@daynix.com>
In-Reply-To: <20250317-rss-v11-0-4cacca92f31f@daynix.com>
To: Jonathan Corbet, Willem de Bruijn, Jason Wang, "David S. Miller",
 Eric Dumazet, Jakub Kicinski, Paolo Abeni, "Michael S. Tsirkin",
 Xuan Zhuo, Shuah Khan, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, netdev@vger.kernel.org, kvm@vger.kernel.org,
 virtualization@lists.linux-foundation.org,
 linux-kselftest@vger.kernel.org, Yuri Benditovich, Andrew Melnychenko,
 Stephen Hemminger, gur.stavi@huawei.com, Lei Yang, Simon Horman,
 Akihiko Odaki
X-Mailer: b4 0.15-dev-edae6

Add ioctls and storage required for the virtio-net hash feature to TUN.

Signed-off-by: Akihiko Odaki
---
 drivers/net/Kconfig    |  1 +
 drivers/net/tun.c      | 54 ++++++++++++++++++++++++++++++++++++++++++--------
 include/linux/skbuff.h |  3 +++
 net/core/skbuff.c      |  4 ++++
 4 files changed, 54 insertions(+), 8 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 1fd5acdc73c6..aecfd244dd83 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -395,6 +395,7 @@ config TUN
 	tristate "Universal TUN/TAP device driver support"
 	depends on INET
 	select CRC32
+	select SKB_EXTENSIONS
 	help
 	  TUN/TAP provides packet reception and transmission for user space
 	  programs. It can be viewed as a simple Point-to-Point or Ethernet
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 03d47799e9bd..b2d74e0ec932 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -209,6 +209,7 @@ struct tun_struct {
 	struct bpf_prog __rcu *xdp_prog;
 	struct tun_prog __rcu *steering_prog;
 	struct tun_prog __rcu *filter_prog;
+	struct tun_vnet_hash_container __rcu *vnet_hash;
 	struct ethtool_link_ksettings link_ksettings;
 	/* init args */
 	struct file *file;
@@ -451,9 +452,14 @@ static inline void tun_flow_save_rps_rxhash(struct tun_flow_entry *e, u32 hash)
 		e->rps_rxhash = hash;
 }
 
+static struct virtio_net_hash *tun_add_hash(struct sk_buff *skb)
+{
+	return skb_ext_add(skb, SKB_EXT_TUN_VNET_HASH);
+}
+
 static const struct virtio_net_hash *tun_find_hash(const struct sk_buff *skb)
 {
-	return NULL;
+	return skb_ext_find(skb, SKB_EXT_TUN_VNET_HASH);
 }
 
 /* We try to identify a flow through its rxhash. The reason that
@@ -462,14 +468,21 @@ static const struct virtio_net_hash *tun_find_hash(const struct sk_buff *skb)
  * the userspace application move between processors, we may get a
  * different rxq no. here.
  */
-static u16 tun_automq_select_queue(struct tun_struct *tun, struct sk_buff *skb)
+static u16 tun_automq_select_queue(struct tun_struct *tun,
+				   const struct tun_vnet_hash_container *vnet_hash,
+				   struct sk_buff *skb)
 {
+	struct flow_keys keys;
+	struct flow_keys_basic keys_basic;
 	struct tun_flow_entry *e;
 	u32 txq, numqueues;
 
 	numqueues = READ_ONCE(tun->numqueues);
 
-	txq = __skb_get_hash_symmetric(skb);
+	memset(&keys, 0, sizeof(keys));
+	skb_flow_dissect(skb, &flow_keys_dissector_symmetric, &keys, 0);
+
+	txq = flow_hash_from_keys(&keys);
 	e = tun_flow_find(&tun->flows[tun_hashfn(txq)], txq);
 	if (e) {
 		tun_flow_save_rps_rxhash(e, txq);
@@ -478,6 +491,13 @@ static u16 tun_automq_select_queue(struct tun_struct *tun, struct sk_buff *skb)
 		txq = reciprocal_scale(txq, numqueues);
 	}
 
+	keys_basic = (struct flow_keys_basic) {
+		.control = keys.control,
+		.basic = keys.basic
+	};
+	tun_vnet_hash_report(vnet_hash, skb, &keys_basic, skb->l4_hash ? skb->hash : txq,
+			     tun_add_hash);
+
 	return txq;
 }
 
@@ -513,8 +533,15 @@ static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb,
 	u16 ret;
 
 	rcu_read_lock();
-	if (!tun_ebpf_select_queue(tun, skb, &ret))
-		ret = tun_automq_select_queue(tun, skb);
+	if (!tun_ebpf_select_queue(tun, skb, &ret)) {
+		struct tun_vnet_hash_container *vnet_hash = rcu_dereference(tun->vnet_hash);
+
+		if (vnet_hash && (vnet_hash->common.flags & TUN_VNET_HASH_RSS))
+			ret = tun_vnet_rss_select_queue(READ_ONCE(tun->numqueues), vnet_hash,
+							skb, tun_add_hash);
+		else
+			ret = tun_automq_select_queue(tun, vnet_hash, skb);
+	}
 	rcu_read_unlock();
 
 	return ret;
@@ -2235,6 +2262,7 @@ static void tun_free_netdev(struct net_device *dev)
 	security_tun_dev_free_security(tun->security);
 	__tun_set_ebpf(tun, &tun->steering_prog, NULL);
 	__tun_set_ebpf(tun, &tun->filter_prog, NULL);
+	kfree_rcu_mightsleep(rcu_access_pointer(tun->vnet_hash));
 }
 
 static void tun_setup(struct net_device *dev)
@@ -3014,16 +3042,22 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 	} else {
 		memset(&ifr, 0, sizeof(ifr));
 	}
-	if (cmd == TUNGETFEATURES) {
+	switch (cmd) {
+	case TUNGETFEATURES:
 		/* Currently this just means: "what IFF flags are valid?".
 		 * This is needed because we never checked for invalid flags on
 		 * TUNSETIFF.
 		 */
 		return put_user(IFF_TUN | IFF_TAP | IFF_NO_CARRIER |
 				TUN_FEATURES, (unsigned int __user*)argp);
-	} else if (cmd == TUNSETQUEUE) {
+
+	case TUNSETQUEUE:
 		return tun_set_queue(file, &ifr);
-	} else if (cmd == SIOCGSKNS) {
+
+	case TUNGETVNETHASHCAP:
+		return tun_vnet_ioctl_gethashcap(argp);
+
+	case SIOCGSKNS:
 		if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
 			return -EPERM;
 		return open_related_ns(&net->ns, get_net_ns);
@@ -3264,6 +3298,10 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 		ret = open_related_ns(&net->ns, get_net_ns);
 		break;
 
+	case TUNSETVNETHASH:
+		ret = tun_vnet_ioctl_sethash(&tun->vnet_hash, argp);
+		break;
+
 	default:
 		ret = tun_vnet_ioctl(&tun->vnet_hdr_sz, &tun->flags, cmd, argp);
 		break;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index bb2b751d274a..cdd793f1c360 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4842,6 +4842,9 @@ enum skb_ext_id {
 #endif
 #if IS_ENABLED(CONFIG_MCTP_FLOWS)
 	SKB_EXT_MCTP,
+#endif
+#if IS_ENABLED(CONFIG_TUN)
+	SKB_EXT_TUN_VNET_HASH,
 #endif
 	SKB_EXT_NUM, /* must be last */
 };
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7b03b64fdcb2..aa2a091b649f 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -64,6 +64,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -4969,6 +4970,9 @@ static const u8 skb_ext_type_len[] = {
 #if IS_ENABLED(CONFIG_MCTP_FLOWS)
 	[SKB_EXT_MCTP] = SKB_EXT_CHUNKSIZEOF(struct mctp_flow),
 #endif
+#if IS_ENABLED(CONFIG_TUN)
+	[SKB_EXT_TUN_VNET_HASH] = SKB_EXT_CHUNKSIZEOF(struct virtio_net_hash),
+#endif
 };
 
 static __always_inline unsigned int skb_ext_total_length(void)
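
Note on the automq fallback path: when tun_automq_select_queue() finds no matching flow entry, it maps the flow hash onto a queue index with reciprocal_scale(txq, numqueues). The following is a minimal userspace sketch of that mapping and is not part of the patch. It assumes reciprocal_scale() is the kernel's usual multiply-and-shift helper, reimplemented here with stdint types; pick_queue() is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace copy of the arithmetic behind the kernel's reciprocal_scale():
 * maps a 32-bit value into [0, ep_ro) with a multiply and a shift instead
 * of a modulo. (Assumption: same formula as the in-kernel helper.)
 */
static uint32_t reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
	return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

/*
 * Hypothetical helper, not in the patch: pick a queue index for a flow
 * hash the way the fallback branch does when no flow entry is found.
 */
static uint16_t pick_queue(uint32_t hash, uint32_t numqueues)
{
	assert(numqueues > 0 && numqueues <= UINT16_MAX);
	return (uint16_t)reciprocal_scale(hash, numqueues);
}
```

With numqueues = 4, each quarter of the 32-bit hash range maps to one queue, so pick_queue(0x80000000, 4) is 2 and pick_queue(0xffffffff, 4) is 3. The result is always below numqueues, so no extra bounds check is needed on the returned index.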