From patchwork Fri Nov 25 16:56:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sebastian Andrzej Siewior X-Patchwork-Id: 13056161 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 56492C4167B for ; Fri, 25 Nov 2022 16:56:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229997AbiKYQ40 (ORCPT ); Fri, 25 Nov 2022 11:56:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35154 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229903AbiKYQ4S (ORCPT ); Fri, 25 Nov 2022 11:56:18 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D0A2C2B279 for ; Fri, 25 Nov 2022 08:56:15 -0800 (PST) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1669395374; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0FtgL0rFSjRrN+VK5M60a6mQGctH+j2IkY4op/B+D0E=; b=hWi4aZJvfcBffs9YH7QRcFrWYz6Alkn2b4qjzUhjGep9xw1vQSwwzTzpBqCjedRGA/O/lX Zmfqs5X4t8t2DyxYjGUtAymnW7L1RLeH/lfMaGrXpS0OiSOss9Zp5hFvzuF3are3BaD7tu jg2pcpPw0NszLCZi3QW1hn4bVJBj7WeKSOV77CHgeG2QtdJ0a4opbw8unbxvbXiM85MCxD HA6V7noKiblyxFFHK/LcQ3a0qt9u7C5MCJTfL2tSAoouuPUHU7ktg4GzZu8nJcTuFM7jtN caQVxJUEpDlyiZkxpg44ZGPl/Ki63cZAN4pluEY/liN50T6f4ND6IZghlfwdPw== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1669395374; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: 
 to:to:cc:cc:mime-version:mime-version:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=0FtgL0rFSjRrN+VK5M60a6mQGctH+j2IkY4op/B+D0E=;
 b=lWJn7ef/e94k34i1Vv9I0Lk+Qsb5qyWwVPn9a99KQXwgwOYXjquAdf+DsFh/92XPZ+8P2s
 UdLfTcc67bzzX8Bw==
To: netdev@vger.kernel.org
Cc: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Thomas Gleixner , Kurt Kanzenbach , Sebastian Andrzej Siewior , Juhee Kang
Subject: [PATCH v4 net-next 1/8] Revert "net: hsr: use hlist_head instead of list_head for mac addresses"
Date: Fri, 25 Nov 2022 17:56:03 +0100
Message-Id: <20221125165610.3802446-2-bigeasy@linutronix.de>
In-Reply-To: <20221125165610.3802446-1-bigeasy@linutronix.de>
References: <20221125165610.3802446-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

The hlist optimisation (which not only uses hlist_head instead of
list_head but also splits hsr_priv::node_db into an array of 256 slots)
does not consider the "node merge":

Upon starting the hsr network (with three nodes) a packet that is sent
from node1 to node3 will also be sent from node1 to node2 and then
forwarded to node3. As a result node3 will receive 2 packets because it
is not able to filter out the duplicate. Each packet received will
create a new struct hsr_node with only macaddress_A set, to the MAC
address the packet was received from (the two MAC addresses of node1).

At some point (early in the process) two supervision frames will be
received from node1. They will be processed by hsr_handle_sup_frame().
One frame will leave early ("Node has already been merged") and do
nothing. The other frame will be merged as portB: its MAC address is
written to macaddress_B and the hsr_node that was created for it (with
that address as macaddress_A) is removed.

After the merge, HSR is able to identify a duplicate because both
packets sent from one node resolve to the same struct hsr_node:
hsr_get_node() will find the MAC address either in macaddress_A or in
macaddress_B.

Things get tricky with the optimisation: if the sender's MAC address is
saved as macaddress_A then the lookup will work as usual. If the MAC
address has been merged into macaddress_B of another hsr_node then the
lookup won't work, because that hsr_node is most likely linked into
another bucket. This results in creating a new struct hsr_node and not
recognising a possible duplicate.

A way around it would be to add another hsr_node::mac_list_B and attach
it to the other bucket, to ensure that this hsr_node will be found via
either macaddress_A _or_ macaddress_B.

I however prefer to revert it, because the optimisation sounds like it
solves an academic problem rather than a real-life workload, and it adds
complexity. I'm not an HSR expert and don't know the usual size of a
network, but I would guess 40 to 60 nodes. With 10,000 nodes and
assuming 60us for pass-through (from node to node), it would take
almost 600ms for a packet to almost wrap around the ring, which sounds
like a lot.

Revert the hashed MAC addresses optimisation.
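
The bucket mismatch described above can be sketched in user space. This
is only an illustration under stated assumptions, not the kernel code:
struct toy_node, toy_hash() and toy_lookup() are hypothetical names, and
the toy hash (last address byte) merely stands in for the jhash-based
hsr_mac_hash(). The node is linked only into the bucket selected by its
address A, so a lookup keyed on address B searches the wrong chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ETH_ALEN 6
#define TOY_BUCKETS 256

/* Hypothetical, simplified stand-in for struct hsr_node. */
struct toy_node {
	unsigned char mac_a[ETH_ALEN];
	unsigned char mac_b[ETH_ALEN];
	struct toy_node *next;
};

/* Toy hash (NOT the kernel's jhash): the bucket is the last address byte. */
static unsigned int toy_hash(const unsigned char *addr)
{
	return addr[ETH_ALEN - 1] % TOY_BUCKETS;
}

/* Walk one chain and match on either address A or address B. */
static struct toy_node *toy_lookup(struct toy_node *head,
				   const unsigned char *addr)
{
	struct toy_node *n;

	for (n = head; n; n = n->next) {
		if (!memcmp(n->mac_a, addr, ETH_ALEN) ||
		    !memcmp(n->mac_b, addr, ETH_ALEN))
			return n;
	}
	return NULL;
}

/* One merged node: address A ends in 0x01, address B ends in 0x02. */
static const struct toy_node merged_template = {
	.mac_a = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 },
	.mac_b = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x02 },
};

/* Single list: a frame sent from address B finds the merged node. */
bool toy_flat_finds_mac_b(void)
{
	struct toy_node node = merged_template;

	return toy_lookup(&node, node.mac_b) == &node;
}

/* Bucketed: the node is only linked into the bucket chosen by address A. */
bool toy_hashed_finds_mac_b(void)
{
	struct toy_node node = merged_template;
	struct toy_node *buckets[TOY_BUCKETS] = { NULL };

	buckets[toy_hash(node.mac_a)] = &node;
	assert(toy_hash(node.mac_a) != toy_hash(node.mac_b));

	/* The lookup is keyed on address B, so it searches B's bucket. */
	return toy_lookup(buckets[toy_hash(node.mac_b)], node.mac_b) != NULL;
}
```

With a single list the merged node is found by its address B; with the
bucketed table it is not, which is the case where a new hsr_node would
be created and the duplicate missed.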
Fixes: 4acc45db71158 ("net: hsr: use hlist_head instead of list_head for mac addresses") Cc: Juhee Kang Signed-off-by: Sebastian Andrzej Siewior --- net/hsr/hsr_debugfs.c | 38 +++----- net/hsr/hsr_device.c | 10 +- net/hsr/hsr_forward.c | 7 +- net/hsr/hsr_framereg.c | 201 +++++++++++++++-------------------------- net/hsr/hsr_framereg.h | 14 +-- net/hsr/hsr_main.h | 9 +- net/hsr/hsr_netlink.c | 4 +- 7 files changed, 101 insertions(+), 182 deletions(-) diff --git a/net/hsr/hsr_debugfs.c b/net/hsr/hsr_debugfs.c index de476a4176314..1a195efc79cd1 100644 --- a/net/hsr/hsr_debugfs.c +++ b/net/hsr/hsr_debugfs.c @@ -9,7 +9,6 @@ #include #include #include -#include #include "hsr_main.h" #include "hsr_framereg.h" @@ -21,7 +20,6 @@ hsr_node_table_show(struct seq_file *sfp, void *data) { struct hsr_priv *priv = (struct hsr_priv *)sfp->private; struct hsr_node *node; - int i; seq_printf(sfp, "Node Table entries for (%s) device\n", (priv->prot_version == PRP_V1 ? "PRP" : "HSR")); @@ -33,28 +31,22 @@ hsr_node_table_show(struct seq_file *sfp, void *data) seq_puts(sfp, "DAN-H\n"); rcu_read_lock(); + list_for_each_entry_rcu(node, &priv->node_db, mac_list) { + /* skip self node */ + if (hsr_addr_is_self(priv, node->macaddress_A)) + continue; + seq_printf(sfp, "%pM ", &node->macaddress_A[0]); + seq_printf(sfp, "%pM ", &node->macaddress_B[0]); + seq_printf(sfp, "%10lx, ", node->time_in[HSR_PT_SLAVE_A]); + seq_printf(sfp, "%10lx, ", node->time_in[HSR_PT_SLAVE_B]); + seq_printf(sfp, "%14x, ", node->addr_B_port); - for (i = 0 ; i < priv->hash_buckets; i++) { - hlist_for_each_entry_rcu(node, &priv->node_db[i], mac_list) { - /* skip self node */ - if (hsr_addr_is_self(priv, node->macaddress_A)) - continue; - seq_printf(sfp, "%pM ", &node->macaddress_A[0]); - seq_printf(sfp, "%pM ", &node->macaddress_B[0]); - seq_printf(sfp, "%10lx, ", - node->time_in[HSR_PT_SLAVE_A]); - seq_printf(sfp, "%10lx, ", - node->time_in[HSR_PT_SLAVE_B]); - seq_printf(sfp, "%14x, ", node->addr_B_port); - - if 
(priv->prot_version == PRP_V1) - seq_printf(sfp, "%5x, %5x, %5x\n", - node->san_a, node->san_b, - (node->san_a == 0 && - node->san_b == 0)); - else - seq_printf(sfp, "%5x\n", 1); - } + if (priv->prot_version == PRP_V1) + seq_printf(sfp, "%5x, %5x, %5x\n", + node->san_a, node->san_b, + (node->san_a == 0 && node->san_b == 0)); + else + seq_printf(sfp, "%5x\n", 1); } rcu_read_unlock(); return 0; diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c index 6ffef47e9be55..7518f7e930431 100644 --- a/net/hsr/hsr_device.c +++ b/net/hsr/hsr_device.c @@ -485,16 +485,12 @@ int hsr_dev_finalize(struct net_device *hsr_dev, struct net_device *slave[2], { bool unregister = false; struct hsr_priv *hsr; - int res, i; + int res; hsr = netdev_priv(hsr_dev); INIT_LIST_HEAD(&hsr->ports); - INIT_HLIST_HEAD(&hsr->self_node_db); - hsr->hash_buckets = HSR_HSIZE; - get_random_bytes(&hsr->hash_seed, sizeof(hsr->hash_seed)); - for (i = 0; i < hsr->hash_buckets; i++) - INIT_HLIST_HEAD(&hsr->node_db[i]); - + INIT_LIST_HEAD(&hsr->node_db); + INIT_LIST_HEAD(&hsr->self_node_db); spin_lock_init(&hsr->list_lock); eth_hw_addr_set(hsr_dev, slave[0]->dev_addr); diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c index a50429a62f744..9894962847d97 100644 --- a/net/hsr/hsr_forward.c +++ b/net/hsr/hsr_forward.c @@ -570,23 +570,20 @@ static int fill_frame_info(struct hsr_frame_info *frame, struct ethhdr *ethhdr; __be16 proto; int ret; - u32 hash; /* Check if skb contains ethhdr */ if (skb->mac_len < sizeof(struct ethhdr)) return -EINVAL; memset(frame, 0, sizeof(*frame)); - - ethhdr = (struct ethhdr *)skb_mac_header(skb); - hash = hsr_mac_hash(port->hsr, ethhdr->h_source); frame->is_supervision = is_supervision_frame(port->hsr, skb); - frame->node_src = hsr_get_node(port, &hsr->node_db[hash], skb, + frame->node_src = hsr_get_node(port, &hsr->node_db, skb, frame->is_supervision, port->type); if (!frame->node_src) return -1; /* Unknown node and !is_supervision, or no mem */ + ethhdr = (struct ethhdr 
*)skb_mac_header(skb); frame->is_vlan = false; proto = ethhdr->h_proto; diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c index 584e217887997..9b8eaebce2549 100644 --- a/net/hsr/hsr_framereg.c +++ b/net/hsr/hsr_framereg.c @@ -15,37 +15,10 @@ #include #include #include -#include #include "hsr_main.h" #include "hsr_framereg.h" #include "hsr_netlink.h" -#ifdef CONFIG_LOCKDEP -int lockdep_hsr_is_held(spinlock_t *lock) -{ - return lockdep_is_held(lock); -} -#endif - -u32 hsr_mac_hash(struct hsr_priv *hsr, const unsigned char *addr) -{ - u32 hash = jhash(addr, ETH_ALEN, hsr->hash_seed); - - return reciprocal_scale(hash, hsr->hash_buckets); -} - -struct hsr_node *hsr_node_get_first(struct hlist_head *head, spinlock_t *lock) -{ - struct hlist_node *first; - - first = rcu_dereference_bh_check(hlist_first_rcu(head), - lockdep_hsr_is_held(lock)); - if (first) - return hlist_entry(first, struct hsr_node, mac_list); - - return NULL; -} - /* seq_nr_after(a, b) - return true if a is after (higher in sequence than) b, * false otherwise. */ @@ -67,7 +40,8 @@ bool hsr_addr_is_self(struct hsr_priv *hsr, unsigned char *addr) { struct hsr_node *node; - node = hsr_node_get_first(&hsr->self_node_db, &hsr->list_lock); + node = list_first_or_null_rcu(&hsr->self_node_db, struct hsr_node, + mac_list); if (!node) { WARN_ONCE(1, "HSR: No self node\n"); return false; @@ -83,12 +57,12 @@ bool hsr_addr_is_self(struct hsr_priv *hsr, unsigned char *addr) /* Search for mac entry. Caller must hold rcu read lock. 
*/ -static struct hsr_node *find_node_by_addr_A(struct hlist_head *node_db, +static struct hsr_node *find_node_by_addr_A(struct list_head *node_db, const unsigned char addr[ETH_ALEN]) { struct hsr_node *node; - hlist_for_each_entry_rcu(node, node_db, mac_list) { + list_for_each_entry_rcu(node, node_db, mac_list) { if (ether_addr_equal(node->macaddress_A, addr)) return node; } @@ -103,7 +77,7 @@ int hsr_create_self_node(struct hsr_priv *hsr, const unsigned char addr_a[ETH_ALEN], const unsigned char addr_b[ETH_ALEN]) { - struct hlist_head *self_node_db = &hsr->self_node_db; + struct list_head *self_node_db = &hsr->self_node_db; struct hsr_node *node, *oldnode; node = kmalloc(sizeof(*node), GFP_KERNEL); @@ -114,13 +88,14 @@ int hsr_create_self_node(struct hsr_priv *hsr, ether_addr_copy(node->macaddress_B, addr_b); spin_lock_bh(&hsr->list_lock); - oldnode = hsr_node_get_first(self_node_db, &hsr->list_lock); + oldnode = list_first_or_null_rcu(self_node_db, + struct hsr_node, mac_list); if (oldnode) { - hlist_replace_rcu(&oldnode->mac_list, &node->mac_list); + list_replace_rcu(&oldnode->mac_list, &node->mac_list); spin_unlock_bh(&hsr->list_lock); kfree_rcu(oldnode, rcu_head); } else { - hlist_add_tail_rcu(&node->mac_list, self_node_db); + list_add_tail_rcu(&node->mac_list, self_node_db); spin_unlock_bh(&hsr->list_lock); } @@ -129,25 +104,25 @@ int hsr_create_self_node(struct hsr_priv *hsr, void hsr_del_self_node(struct hsr_priv *hsr) { - struct hlist_head *self_node_db = &hsr->self_node_db; + struct list_head *self_node_db = &hsr->self_node_db; struct hsr_node *node; spin_lock_bh(&hsr->list_lock); - node = hsr_node_get_first(self_node_db, &hsr->list_lock); + node = list_first_or_null_rcu(self_node_db, struct hsr_node, mac_list); if (node) { - hlist_del_rcu(&node->mac_list); + list_del_rcu(&node->mac_list); kfree_rcu(node, rcu_head); } spin_unlock_bh(&hsr->list_lock); } -void hsr_del_nodes(struct hlist_head *node_db) +void hsr_del_nodes(struct list_head *node_db) { struct 
hsr_node *node; - struct hlist_node *tmp; + struct hsr_node *tmp; - hlist_for_each_entry_safe(node, tmp, node_db, mac_list) - kfree_rcu(node, rcu_head); + list_for_each_entry_safe(node, tmp, node_db, mac_list) + kfree(node); } void prp_handle_san_frame(bool san, enum hsr_port_type port, @@ -168,7 +143,7 @@ void prp_handle_san_frame(bool san, enum hsr_port_type port, * originating from the newly added node. */ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr, - struct hlist_head *node_db, + struct list_head *node_db, unsigned char addr[], u16 seq_out, bool san, enum hsr_port_type rx_port) @@ -198,14 +173,14 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr, hsr->proto_ops->handle_san_frame(san, rx_port, new_node); spin_lock_bh(&hsr->list_lock); - hlist_for_each_entry_rcu(node, node_db, mac_list, - lockdep_hsr_is_held(&hsr->list_lock)) { + list_for_each_entry_rcu(node, node_db, mac_list, + lockdep_is_held(&hsr->list_lock)) { if (ether_addr_equal(node->macaddress_A, addr)) goto out; if (ether_addr_equal(node->macaddress_B, addr)) goto out; } - hlist_add_tail_rcu(&new_node->mac_list, node_db); + list_add_tail_rcu(&new_node->mac_list, node_db); spin_unlock_bh(&hsr->list_lock); return new_node; out: @@ -225,7 +200,7 @@ void prp_update_san_info(struct hsr_node *node, bool is_sup) /* Get the hsr_node from which 'skb' was sent. 
*/ -struct hsr_node *hsr_get_node(struct hsr_port *port, struct hlist_head *node_db, +struct hsr_node *hsr_get_node(struct hsr_port *port, struct list_head *node_db, struct sk_buff *skb, bool is_sup, enum hsr_port_type rx_port) { @@ -241,7 +216,7 @@ struct hsr_node *hsr_get_node(struct hsr_port *port, struct hlist_head *node_db, ethhdr = (struct ethhdr *)skb_mac_header(skb); - hlist_for_each_entry_rcu(node, node_db, mac_list) { + list_for_each_entry_rcu(node, node_db, mac_list) { if (ether_addr_equal(node->macaddress_A, ethhdr->h_source)) { if (hsr->proto_ops->update_san_info) hsr->proto_ops->update_san_info(node, is_sup); @@ -291,12 +266,11 @@ void hsr_handle_sup_frame(struct hsr_frame_info *frame) struct hsr_sup_tlv *hsr_sup_tlv; struct hsr_node *node_real; struct sk_buff *skb = NULL; - struct hlist_head *node_db; + struct list_head *node_db; struct ethhdr *ethhdr; int i; unsigned int pull_size = 0; unsigned int total_pull_size = 0; - u32 hash; /* Here either frame->skb_hsr or frame->skb_prp should be * valid as supervision frame always will have protocol @@ -334,13 +308,11 @@ void hsr_handle_sup_frame(struct hsr_frame_info *frame) hsr_sp = (struct hsr_sup_payload *)skb->data; /* Merge node_curr (registered on macaddress_B) into node_real */ - node_db = port_rcv->hsr->node_db; - hash = hsr_mac_hash(hsr, hsr_sp->macaddress_A); - node_real = find_node_by_addr_A(&node_db[hash], hsr_sp->macaddress_A); + node_db = &port_rcv->hsr->node_db; + node_real = find_node_by_addr_A(node_db, hsr_sp->macaddress_A); if (!node_real) /* No frame received from AddrA of this node yet */ - node_real = hsr_add_node(hsr, &node_db[hash], - hsr_sp->macaddress_A, + node_real = hsr_add_node(hsr, node_db, hsr_sp->macaddress_A, HSR_SEQNR_START - 1, true, port_rcv->type); if (!node_real) @@ -374,8 +346,7 @@ void hsr_handle_sup_frame(struct hsr_frame_info *frame) hsr_sp = (struct hsr_sup_payload *)skb->data; /* Check if redbox mac and node mac are equal. 
*/ - if (!ether_addr_equal(node_real->macaddress_A, - hsr_sp->macaddress_A)) { + if (!ether_addr_equal(node_real->macaddress_A, hsr_sp->macaddress_A)) { /* This is a redbox supervision frame for a VDAN! */ goto done; } @@ -395,7 +366,7 @@ void hsr_handle_sup_frame(struct hsr_frame_info *frame) node_real->addr_B_port = port_rcv->type; spin_lock_bh(&hsr->list_lock); - hlist_del_rcu(&node_curr->mac_list); + list_del_rcu(&node_curr->mac_list); spin_unlock_bh(&hsr->list_lock); kfree_rcu(node_curr, rcu_head); @@ -433,7 +404,6 @@ void hsr_addr_subst_dest(struct hsr_node *node_src, struct sk_buff *skb, struct hsr_port *port) { struct hsr_node *node_dst; - u32 hash; if (!skb_mac_header_was_set(skb)) { WARN_ONCE(1, "%s: Mac header not set\n", __func__); @@ -443,8 +413,7 @@ void hsr_addr_subst_dest(struct hsr_node *node_src, struct sk_buff *skb, if (!is_unicast_ether_addr(eth_hdr(skb)->h_dest)) return; - hash = hsr_mac_hash(port->hsr, eth_hdr(skb)->h_dest); - node_dst = find_node_by_addr_A(&port->hsr->node_db[hash], + node_dst = find_node_by_addr_A(&port->hsr->node_db, eth_hdr(skb)->h_dest); if (!node_dst) { if (net_ratelimit()) @@ -520,73 +489,59 @@ static struct hsr_port *get_late_port(struct hsr_priv *hsr, void hsr_prune_nodes(struct timer_list *t) { struct hsr_priv *hsr = from_timer(hsr, t, prune_timer); - struct hlist_node *tmp; struct hsr_node *node; + struct hsr_node *tmp; struct hsr_port *port; unsigned long timestamp; unsigned long time_a, time_b; - int i; spin_lock_bh(&hsr->list_lock); + list_for_each_entry_safe(node, tmp, &hsr->node_db, mac_list) { + /* Don't prune own node. Neither time_in[HSR_PT_SLAVE_A] + * nor time_in[HSR_PT_SLAVE_B], will ever be updated for + * the master port. Thus the master node will be repeatedly + * pruned leading to packet loss. + */ + if (hsr_addr_is_self(hsr, node->macaddress_A)) + continue; - for (i = 0; i < hsr->hash_buckets; i++) { - hlist_for_each_entry_safe(node, tmp, &hsr->node_db[i], - mac_list) { - /* Don't prune own node. 
- * Neither time_in[HSR_PT_SLAVE_A] - * nor time_in[HSR_PT_SLAVE_B], will ever be updated - * for the master port. Thus the master node will be - * repeatedly pruned leading to packet loss. - */ - if (hsr_addr_is_self(hsr, node->macaddress_A)) - continue; + /* Shorthand */ + time_a = node->time_in[HSR_PT_SLAVE_A]; + time_b = node->time_in[HSR_PT_SLAVE_B]; - /* Shorthand */ - time_a = node->time_in[HSR_PT_SLAVE_A]; - time_b = node->time_in[HSR_PT_SLAVE_B]; + /* Check for timestamps old enough to risk wrap-around */ + if (time_after(jiffies, time_a + MAX_JIFFY_OFFSET / 2)) + node->time_in_stale[HSR_PT_SLAVE_A] = true; + if (time_after(jiffies, time_b + MAX_JIFFY_OFFSET / 2)) + node->time_in_stale[HSR_PT_SLAVE_B] = true; - /* Check for timestamps old enough to - * risk wrap-around - */ - if (time_after(jiffies, time_a + MAX_JIFFY_OFFSET / 2)) - node->time_in_stale[HSR_PT_SLAVE_A] = true; - if (time_after(jiffies, time_b + MAX_JIFFY_OFFSET / 2)) - node->time_in_stale[HSR_PT_SLAVE_B] = true; + /* Get age of newest frame from node. + * At least one time_in is OK here; nodes get pruned long + * before both time_ins can get stale + */ + timestamp = time_a; + if (node->time_in_stale[HSR_PT_SLAVE_A] || + (!node->time_in_stale[HSR_PT_SLAVE_B] && + time_after(time_b, time_a))) + timestamp = time_b; - /* Get age of newest frame from node. 
- * At least one time_in is OK here; nodes get pruned - * long before both time_ins can get stale - */ - timestamp = time_a; - if (node->time_in_stale[HSR_PT_SLAVE_A] || - (!node->time_in_stale[HSR_PT_SLAVE_B] && - time_after(time_b, time_a))) - timestamp = time_b; + /* Warn of ring error only as long as we get frames at all */ + if (time_is_after_jiffies(timestamp + + msecs_to_jiffies(1.5 * MAX_SLAVE_DIFF))) { + rcu_read_lock(); + port = get_late_port(hsr, node); + if (port) + hsr_nl_ringerror(hsr, node->macaddress_A, port); + rcu_read_unlock(); + } - /* Warn of ring error only as long as we get - * frames at all - */ - if (time_is_after_jiffies(timestamp + - msecs_to_jiffies(1.5 * MAX_SLAVE_DIFF))) { - rcu_read_lock(); - port = get_late_port(hsr, node); - if (port) - hsr_nl_ringerror(hsr, - node->macaddress_A, - port); - rcu_read_unlock(); - } - - /* Prune old entries */ - if (time_is_before_jiffies(timestamp + - msecs_to_jiffies(HSR_NODE_FORGET_TIME))) { - hsr_nl_nodedown(hsr, node->macaddress_A); - hlist_del_rcu(&node->mac_list); - /* Note that we need to free this - * entry later: - */ - kfree_rcu(node, rcu_head); - } + /* Prune old entries */ + if (time_is_before_jiffies(timestamp + + msecs_to_jiffies(HSR_NODE_FORGET_TIME))) { + hsr_nl_nodedown(hsr, node->macaddress_A); + list_del_rcu(&node->mac_list); + /* Note that we need to free this entry later: */ + kfree_rcu(node, rcu_head); } } spin_unlock_bh(&hsr->list_lock); @@ -600,20 +555,17 @@ void *hsr_get_next_node(struct hsr_priv *hsr, void *_pos, unsigned char addr[ETH_ALEN]) { struct hsr_node *node; - u32 hash; - - hash = hsr_mac_hash(hsr, addr); if (!_pos) { - node = hsr_node_get_first(&hsr->node_db[hash], - &hsr->list_lock); + node = list_first_or_null_rcu(&hsr->node_db, + struct hsr_node, mac_list); if (node) ether_addr_copy(addr, node->macaddress_A); return node; } node = _pos; - hlist_for_each_entry_continue_rcu(node, mac_list) { + list_for_each_entry_continue_rcu(node, &hsr->node_db, mac_list) { 
ether_addr_copy(addr, node->macaddress_A); return node; } @@ -633,11 +585,8 @@ int hsr_get_node_data(struct hsr_priv *hsr, struct hsr_node *node; struct hsr_port *port; unsigned long tdiff; - u32 hash; - hash = hsr_mac_hash(hsr, addr); - - node = find_node_by_addr_A(&hsr->node_db[hash], addr); + node = find_node_by_addr_A(&hsr->node_db, addr); if (!node) return -ENOENT; diff --git a/net/hsr/hsr_framereg.h b/net/hsr/hsr_framereg.h index f3762e9e42b54..bdbb8c822ba1a 100644 --- a/net/hsr/hsr_framereg.h +++ b/net/hsr/hsr_framereg.h @@ -28,17 +28,9 @@ struct hsr_frame_info { bool is_from_san; }; -#ifdef CONFIG_LOCKDEP -int lockdep_hsr_is_held(spinlock_t *lock); -#else -#define lockdep_hsr_is_held(lock) 1 -#endif - -u32 hsr_mac_hash(struct hsr_priv *hsr, const unsigned char *addr); -struct hsr_node *hsr_node_get_first(struct hlist_head *head, spinlock_t *lock); void hsr_del_self_node(struct hsr_priv *hsr); -void hsr_del_nodes(struct hlist_head *node_db); -struct hsr_node *hsr_get_node(struct hsr_port *port, struct hlist_head *node_db, +void hsr_del_nodes(struct list_head *node_db); +struct hsr_node *hsr_get_node(struct hsr_port *port, struct list_head *node_db, struct sk_buff *skb, bool is_sup, enum hsr_port_type rx_port); void hsr_handle_sup_frame(struct hsr_frame_info *frame); @@ -76,7 +68,7 @@ void prp_handle_san_frame(bool san, enum hsr_port_type port, void prp_update_san_info(struct hsr_node *node, bool is_sup); struct hsr_node { - struct hlist_node mac_list; + struct list_head mac_list; unsigned char macaddress_A[ETH_ALEN]; unsigned char macaddress_B[ETH_ALEN]; /* Local slave through which AddrB frames are received from this node */ diff --git a/net/hsr/hsr_main.h b/net/hsr/hsr_main.h index b158ba409f9a4..16ae9fb09ccd2 100644 --- a/net/hsr/hsr_main.h +++ b/net/hsr/hsr_main.h @@ -47,9 +47,6 @@ #define HSR_V1_SUP_LSDUSIZE 52 -#define HSR_HSIZE_SHIFT 8 -#define HSR_HSIZE BIT(HSR_HSIZE_SHIFT) - /* The helper functions below assumes that 'path' occupies the 4 most * 
significant bits of the 16-bit field shared by 'path' and 'LSDU_size' (or * equivalently, the 4 most significant bits of HSR tag byte 14). @@ -188,8 +185,8 @@ struct hsr_proto_ops { struct hsr_priv { struct rcu_head rcu_head; struct list_head ports; - struct hlist_head node_db[HSR_HSIZE]; /* Known HSR nodes */ - struct hlist_head self_node_db; /* MACs of slaves */ + struct list_head node_db; /* Known HSR nodes */ + struct list_head self_node_db; /* MACs of slaves */ struct timer_list announce_timer; /* Supervision frame dispatch */ struct timer_list prune_timer; int announce_count; @@ -199,8 +196,6 @@ struct hsr_priv { spinlock_t seqnr_lock; /* locking for sequence_nr */ spinlock_t list_lock; /* locking for node list */ struct hsr_proto_ops *proto_ops; - u32 hash_buckets; - u32 hash_seed; #define PRP_LAN_ID 0x5 /* 0x1010 for A and 0x1011 for B. Bit 0 is set * based on SLAVE_A or SLAVE_B */ diff --git a/net/hsr/hsr_netlink.c b/net/hsr/hsr_netlink.c index 7174a90929002..78fe40eb9f012 100644 --- a/net/hsr/hsr_netlink.c +++ b/net/hsr/hsr_netlink.c @@ -105,7 +105,6 @@ static int hsr_newlink(struct net *src_net, struct net_device *dev, static void hsr_dellink(struct net_device *dev, struct list_head *head) { struct hsr_priv *hsr = netdev_priv(dev); - int i; del_timer_sync(&hsr->prune_timer); del_timer_sync(&hsr->announce_timer); @@ -114,8 +113,7 @@ static void hsr_dellink(struct net_device *dev, struct list_head *head) hsr_del_ports(hsr); hsr_del_self_node(hsr); - for (i = 0; i < hsr->hash_buckets; i++) - hsr_del_nodes(&hsr->node_db[i]); + hsr_del_nodes(&hsr->node_db); unregister_netdevice_queue(dev, head); } From patchwork Fri Nov 25 16:56:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sebastian Andrzej Siewior X-Patchwork-Id: 13056157 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org 
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30EEBC4167B for ; Fri, 25 Nov 2022 16:56:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229965AbiKYQ4T (ORCPT ); Fri, 25 Nov 2022 11:56:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35148 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229763AbiKYQ4R (ORCPT ); Fri, 25 Nov 2022 11:56:17 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D75B2D1C7 for ; Fri, 25 Nov 2022 08:56:16 -0800 (PST) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1669395374; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VNq+vGNG0CTlw3VW3ZdooZgvUjXykmOxQ/6FpJtmkoU=; b=hl5fWkOyKU710zBmqopqQhTheuicxnpjOTsGC3Nd3f91XfyKpAIKQMhGDtH4B0RxsxXIOD mUgjkSWvf0h8zfsqJ9OzgRc0xy1u2kLAKzHtu97j++EUwFctPrpFhYGYc0vraU0vfa3wJd Ncnvh9NSI2IuN/PWgs2z2rdDGONdQxCtZTL8tGl/hHN2/OgCdFfZBNoHWsJzjPK7C2Ko34 /9F81pWSZY4jQ8scSo3ABDO62C+Fry3XZXeuVTthct7DtcAkDRsDdlGG2O8+Om8usKkFhv hmtmTceeDWJyZD4fa70XUswJedHdJ+js4vX9jp4LkEoaVMWRGbO2F12NUtYT+A== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1669395374; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VNq+vGNG0CTlw3VW3ZdooZgvUjXykmOxQ/6FpJtmkoU=; b=A17CVa0zgMc7GYGalzHbKnbSuwTAveCzkF7tYs6MV+Oe8B/jZiw4ihGMo8qq3IO3eIGl7V dOAg3RaLk4gzraBw== To: netdev@vger.kernel.org Cc: "David S. 
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Thomas Gleixner , Kurt Kanzenbach , Sebastian Andrzej Siewior
Subject: [PATCH v4 net-next 2/8] hsr: Add a rcu-read lock to hsr_forward_skb().
Date: Fri, 25 Nov 2022 17:56:04 +0100
Message-Id: <20221125165610.3802446-3-bigeasy@linutronix.de>
In-Reply-To: <20221125165610.3802446-1-bigeasy@linutronix.de>
References: <20221125165610.3802446-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

hsr_forward_skb() forwards a skb and keeps information in an on-stack
hsr_frame_info. hsr_get_node() assigns hsr_frame_info::node_src, which
is taken from an RCU list. This pointer is used later in
hsr_forward_do(). I don't see a reason why this pointer can't vanish
midway since there is no guarantee that hsr_forward_skb() is invoked
from an RCU read section.

Use rcu_read_lock() to protect hsr_frame_info::node_src from its
assignment until it is no longer used.

Fixes: f266a683a4804 ("net/hsr: Better frame dispatch")
Signed-off-by: Sebastian Andrzej Siewior
---
 net/hsr/hsr_forward.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c
index 9894962847d97..3a97b00b6d978 100644
--- a/net/hsr/hsr_forward.c
+++ b/net/hsr/hsr_forward.c
@@ -613,11 +613,13 @@ void hsr_forward_skb(struct sk_buff *skb, struct hsr_port *port)
 {
 	struct hsr_frame_info frame;

+	rcu_read_lock();
 	if (fill_frame_info(&frame, skb, port) < 0)
 		goto out_drop;

 	hsr_register_frame_in(frame.node_src, port, frame.sequence_nr);
 	hsr_forward_do(&frame);
+	rcu_read_unlock();
 	/* Gets called for ingress frames as well as egress from master port.
 	 * So check and increment stats for master port only here.
 	 */
@@ -632,6 +634,7 @@ void hsr_forward_skb(struct sk_buff *skb, struct hsr_port *port)
 	return;

 out_drop:
+	rcu_read_unlock();
 	port->dev->stats.tx_dropped++;
 	kfree_skb(skb);
 }

From patchwork Fri Nov 25 16:56:05 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sebastian Andrzej Siewior
X-Patchwork-Id: 13056156
X-Patchwork-Delegate: kuba@kernel.org
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 6FF8DC4332F
 for ; Fri, 25 Nov 2022 16:56:20 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S229930AbiKYQ4S (ORCPT ); Fri, 25 Nov 2022 11:56:18 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35150 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S229493AbiKYQ4R (ORCPT );
 Fri, 25 Nov 2022 11:56:17 -0500
Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4CFD32D741
 for ; Fri, 25 Nov 2022 08:56:16 -0800 (PST)
From: Sebastian Andrzej Siewior
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1669395374;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=x7EazyHGHCpeTiiHucwroXGYdDx6pSWXMjFRJN0efiU=;
 b=knOES9mo1PyxLFDtRo/deu7QHMV40F84XJqJNEvMAg9E6mB4sZSB2L2K/huiBtYVMprP0z
 zPoclEj+CrQDLfVKU8RQMJuNWB20Q+zzr/F2EbOCcLgW/LElJ1iuB9eCS9gQd+KQwHynt0
 jrEj8g9fSwIeSSsi0VBXvKJIiMqkbMwv6ZrX/JxD5WGEilvhboyhcUoWvqYf0Qyu7XagsO
 2/wQ3nmIsKCTq6Cbkfd+eMMy7UoWNb+S3AEDiJ0HwbaZVONsVyXAkZBJFWRkQNg7aiWv4a
 fRP//h+sPONLl0Twz9Z+vj0GG2n80x6Pg+4ksJjsceV6SU4uKvEwKsAPpg2JbQ==
DKIM-Signature: v=1;
 a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1669395374;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=x7EazyHGHCpeTiiHucwroXGYdDx6pSWXMjFRJN0efiU=;
 b=JJ6NPGPgyHXIYXfnqe0mIDM5PuO+Dd9CqXHK1128mjv6NAUKeMOyVRgP82W8vaNjZZnwBB
 +XTMKJx35mY846Cg==
To: netdev@vger.kernel.org
Cc: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Thomas Gleixner , Kurt Kanzenbach , Sebastian Andrzej Siewior
Subject: [PATCH v4 net-next 3/8] hsr: Avoid double remove of a node.
Date: Fri, 25 Nov 2022 17:56:05 +0100
Message-Id: <20221125165610.3802446-4-bigeasy@linutronix.de>
In-Reply-To: <20221125165610.3802446-1-bigeasy@linutronix.de>
References: <20221125165610.3802446-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

Due to the hashed-MAC optimisation one problem became visible:
hsr_handle_sup_frame() walks over the list of available nodes and merges
two node entries into one if, based on the information in the
supervision frame, both MAC addresses belong to one node. The list walk
happens on an RCU-protected list and the delete operation happens under
a lock.

If the supervision frame arrives on both slave interfaces at the same
time then this delete operation can occur simultaneously on two CPUs.
The result is that the first CPU deletes the node from the list and the
second CPU BUGs while attempting to dereference a poisoned list entry.
This is more likely to happen with the optimisation because a new node
for the mac_B entry is created once a packet has been received and
removed (merged) once the supervision frame has been received.

Avoid removing/cleaning up a hsr_node twice by adding a `removed' field
which is set to true after the removal and checked before the removal.
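
The check-then-set pattern described above can be sketched in user
space. This is a simplified illustration, not the kernel code: struct
guarded_node, guarded_remove() and the unlink_count counter are
hypothetical names. In the patch itself the check and the update both
happen while hsr->list_lock is held; the sketch is single-threaded and
leaves the lock out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct hsr_node. */
struct guarded_node {
	bool removed;      /* mirrors the new hsr_node::removed field */
	int unlink_count;  /* counts simulated list_del_rcu() calls */
};

/*
 * Remove the node at most once. In the patch this check-and-set runs
 * under spin_lock_bh(&hsr->list_lock); the lock is left out here because
 * the sketch is single-threaded.
 */
bool guarded_remove(struct guarded_node *node)
{
	assert(node != NULL);

	if (node->removed)
		return false;	/* another path already unlinked it */

	node->unlink_count++;	/* stands in for list_del_rcu() + kfree_rcu() */
	node->removed = true;
	return true;
}

/* Two removal attempts on the same node, as with two supervision frames. */
int guarded_double_remove(void)
{
	struct guarded_node node = { .removed = false, .unlink_count = 0 };

	guarded_remove(&node);
	guarded_remove(&node);
	return node.unlink_count;
}
```

The second attempt sees the flag and returns without touching the list,
so the node is unlinked and queued for freeing only once.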
Fixes: f266a683a4804 ("net/hsr: Better frame dispatch") Signed-off-by: Sebastian Andrzej Siewior --- net/hsr/hsr_framereg.c | 16 +++++++++++----- net/hsr/hsr_framereg.h | 1 + 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c index 9b8eaebce2549..f2dd846ff9038 100644 --- a/net/hsr/hsr_framereg.c +++ b/net/hsr/hsr_framereg.c @@ -366,9 +366,12 @@ void hsr_handle_sup_frame(struct hsr_frame_info *frame) node_real->addr_B_port = port_rcv->type; spin_lock_bh(&hsr->list_lock); - list_del_rcu(&node_curr->mac_list); + if (!node_curr->removed) { + list_del_rcu(&node_curr->mac_list); + node_curr->removed = true; + kfree_rcu(node_curr, rcu_head); + } spin_unlock_bh(&hsr->list_lock); - kfree_rcu(node_curr, rcu_head); done: /* Push back here */ @@ -539,9 +542,12 @@ void hsr_prune_nodes(struct timer_list *t) if (time_is_before_jiffies(timestamp + msecs_to_jiffies(HSR_NODE_FORGET_TIME))) { hsr_nl_nodedown(hsr, node->macaddress_A); - list_del_rcu(&node->mac_list); - /* Note that we need to free this entry later: */ - kfree_rcu(node, rcu_head); + if (!node->removed) { + list_del_rcu(&node->mac_list); + node->removed = true; + /* Note that we need to free this entry later: */ + kfree_rcu(node, rcu_head); + } } } spin_unlock_bh(&hsr->list_lock); diff --git a/net/hsr/hsr_framereg.h b/net/hsr/hsr_framereg.h index bdbb8c822ba1a..b5f902397bf1a 100644 --- a/net/hsr/hsr_framereg.h +++ b/net/hsr/hsr_framereg.h @@ -80,6 +80,7 @@ struct hsr_node { bool san_a; bool san_b; u16 seq_out[HSR_PT_PORTS]; + bool removed; struct rcu_head rcu_head; }; From patchwork Fri Nov 25 16:56:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Sebastian Andrzej Siewior X-Patchwork-Id: 13056159 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org 
(vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDA63C4332F for ; Fri, 25 Nov 2022 16:56:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229980AbiKYQ4X (ORCPT ); Fri, 25 Nov 2022 11:56:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35152 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229863AbiKYQ4R (ORCPT ); Fri, 25 Nov 2022 11:56:17 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7C6D024F0D for ; Fri, 25 Nov 2022 08:56:16 -0800 (PST) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1669395375; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sp8CnyT8+d8v5Q3zWHZkcsCanTyrIjAXUbUopvzz0Q0=; b=DU8Ck/Df+rhNx2CBXRFdaZbg0x1NN3A7l84AEeAJYgiEbDMc0PsN+Ka4mjhAmE2Q1RIhoY XaAdGIy3wPSOHPAIDXCx6D1or9BZ9dRy7uPcnSgn8WbLHPwjljf95mDQVL5dflqUSvXw5w nCIroLf+nmhWH0kd4PBVVHi+k9nXckh05sW/KAmIJZ5qQFFbCc/shOgUx5VBw7FzspPmq2 /LqrPBvugmGmytISYG8hLC8d94im76YwheYnCGmpArxoOQFw3lFiCX13XR3w/LCarLABM4 jA8H24IaCJM32rDDGIjg035fRg7oSm4MSWASUdBwZ8BwKNhHxcfOffBK4PYW0Q== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1669395375; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sp8CnyT8+d8v5Q3zWHZkcsCanTyrIjAXUbUopvzz0Q0=; b=ZfFo14C/Mu9yWtJADekausM8yKiv5JtAm4p5FV1FU6B/FeVJsfT4mZJXiMrTMzp81iCwjb j/TpOw1FNV6qYnAQ== To: netdev@vger.kernel.org Cc: "David S. 
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Thomas Gleixner , Kurt Kanzenbach , Sebastian Andrzej Siewior Subject: [PATCH v4 net-next 4/8] hsr: Disable netpoll. Date: Fri, 25 Nov 2022 17:56:06 +0100 Message-Id: <20221125165610.3802446-5-bigeasy@linutronix.de> In-Reply-To: <20221125165610.3802446-1-bigeasy@linutronix.de> References: <20221125165610.3802446-1-bigeasy@linutronix.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org The hsr device is a software device. Its net_device_ops::ndo_start_xmit() routine will process the packet and then pass the resulting skb to dev_queue_xmit(). During processing, hsr acquires a lock with spin_lock_bh() (hsr_add_node()) which needs to be promoted to the _irq() variant in order to avoid a potential deadlock. Then the warnings in dev_queue_xmit() (due to local_bh_disable() with disabled interrupts) are still left. Instead of trying to address those (there is qdisc and…) for netpoll's sake, just disable netpoll on hsr. Disable netpoll on hsr and replace the _irqsave() locking with _bh().
Fixes: f421436a591d3 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Sebastian Andrzej Siewior --- net/hsr/hsr_device.c | 14 ++++++-------- net/hsr/hsr_forward.c | 5 ++--- 2 files changed, 8 insertions(+), 11 deletions(-) diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c index 7518f7e930431..84fba2a402a5b 100644 --- a/net/hsr/hsr_device.c +++ b/net/hsr/hsr_device.c @@ -278,7 +278,6 @@ static void send_hsr_supervision_frame(struct hsr_port *master, __u8 type = HSR_TLV_LIFE_CHECK; struct hsr_sup_payload *hsr_sp; struct hsr_sup_tag *hsr_stag; - unsigned long irqflags; struct sk_buff *skb; *interval = msecs_to_jiffies(HSR_LIFE_CHECK_INTERVAL); @@ -299,7 +298,7 @@ static void send_hsr_supervision_frame(struct hsr_port *master, set_hsr_stag_HSR_ver(hsr_stag, hsr->prot_version); /* From HSRv1 on we have separate supervision sequence numbers. */ - spin_lock_irqsave(&master->hsr->seqnr_lock, irqflags); + spin_lock_bh(&hsr->seqnr_lock); if (hsr->prot_version > 0) { hsr_stag->sequence_nr = htons(hsr->sup_sequence_nr); hsr->sup_sequence_nr++; @@ -307,7 +306,7 @@ static void send_hsr_supervision_frame(struct hsr_port *master, hsr_stag->sequence_nr = htons(hsr->sequence_nr); hsr->sequence_nr++; } - spin_unlock_irqrestore(&master->hsr->seqnr_lock, irqflags); + spin_unlock_bh(&hsr->seqnr_lock); hsr_stag->tlv.HSR_TLV_type = type; /* TODO: Why 12 in HSRv0? */ @@ -332,7 +331,6 @@ static void send_prp_supervision_frame(struct hsr_port *master, struct hsr_priv *hsr = master->hsr; struct hsr_sup_payload *hsr_sp; struct hsr_sup_tag *hsr_stag; - unsigned long irqflags; struct sk_buff *skb; skb = hsr_init_skb(master); @@ -347,7 +345,7 @@ static void send_prp_supervision_frame(struct hsr_port *master, set_hsr_stag_HSR_ver(hsr_stag, (hsr->prot_version ? 1 : 0)); /* From HSRv1 on we have separate supervision sequence numbers. 
*/ - spin_lock_irqsave(&master->hsr->seqnr_lock, irqflags); + spin_lock_bh(&hsr->seqnr_lock); hsr_stag->sequence_nr = htons(hsr->sup_sequence_nr); hsr->sup_sequence_nr++; hsr_stag->tlv.HSR_TLV_type = PRP_TLV_LIFE_CHECK_DD; @@ -358,11 +356,11 @@ static void send_prp_supervision_frame(struct hsr_port *master, ether_addr_copy(hsr_sp->macaddress_A, master->dev->dev_addr); if (skb_put_padto(skb, ETH_ZLEN)) { - spin_unlock_irqrestore(&master->hsr->seqnr_lock, irqflags); + spin_unlock_bh(&hsr->seqnr_lock); return; } - spin_unlock_irqrestore(&master->hsr->seqnr_lock, irqflags); + spin_unlock_bh(&hsr->seqnr_lock); hsr_forward_skb(skb, master); } @@ -444,7 +442,7 @@ void hsr_dev_setup(struct net_device *dev) dev->header_ops = &hsr_header_ops; dev->netdev_ops = &hsr_device_ops; SET_NETDEV_DEVTYPE(dev, &hsr_type); - dev->priv_flags |= IFF_NO_QUEUE; + dev->priv_flags |= IFF_NO_QUEUE | IFF_DISABLE_NETPOLL; dev->needs_free_netdev = true; diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c index 3a97b00b6d978..0cb8f4040bfd1 100644 --- a/net/hsr/hsr_forward.c +++ b/net/hsr/hsr_forward.c @@ -499,7 +499,6 @@ static void handle_std_frame(struct sk_buff *skb, { struct hsr_port *port = frame->port_rcv; struct hsr_priv *hsr = port->hsr; - unsigned long irqflags; frame->skb_hsr = NULL; frame->skb_prp = NULL; @@ -509,10 +508,10 @@ static void handle_std_frame(struct sk_buff *skb, frame->is_from_san = true; } else { /* Sequence nr for the master node */ - spin_lock_irqsave(&hsr->seqnr_lock, irqflags); + spin_lock_bh(&hsr->seqnr_lock); frame->sequence_nr = hsr->sequence_nr; hsr->sequence_nr++; - spin_unlock_irqrestore(&hsr->seqnr_lock, irqflags); + spin_unlock_bh(&hsr->seqnr_lock); } } From patchwork Fri Nov 25 16:56:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sebastian Andrzej Siewior X-Patchwork-Id: 13056164 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 
(2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 84C16C4321E for ; Fri, 25 Nov 2022 16:56:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229995AbiKYQ4a (ORCPT ); Fri, 25 Nov 2022 11:56:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35178 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229956AbiKYQ4T (ORCPT ); Fri, 25 Nov 2022 11:56:19 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 92C2B2B24E for ; Fri, 25 Nov 2022 08:56:18 -0800 (PST) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1669395375; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Se3VUV4VgV2Dm3+vYvjmbnzZL1Rh4a37GClYgreafSU=; b=s6bFWShGIZZT/RS4BW9kzs2p3p62+EtVvKuPMrnH5IXlwUzm9kmiP4CIyKMVZHI4d5ASOU NYBrGGMyCxW2yHN0oETLu/Fb34xouO9DpKx+jemgiyUlHNsu0FvxKzKzmvypQlapicvoWW Zcw8wTp+vrzRTs6PUfwLZ8RVmFiOhA6wKNf4oZVlxEN9uZpKMP6eUNONdLJzp53BVY/pJ+ 0NcMHRmyytQNe30XeSA3of/CxD9U+AUgeZYrcCtQy9StJQT1RRc9nmhYoa+FfpKV+nYHAs /cibzROy/RBZElwEXPut8F9aaOJYYE9M+vDKRvtsf+LfB/xl/GhPonzYC+hSIw== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1669395375; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Se3VUV4VgV2Dm3+vYvjmbnzZL1Rh4a37GClYgreafSU=; b=8dFundoQ3qt8OtScRBXN3ukY47mPtJeYPlAmHuuKfincfBo5vHDy87FBXQceevSa91x4B+ YRXDrXsz4lSZjNCQ== To: netdev@vger.kernel.org Cc: "David S. 
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Thomas Gleixner , Kurt Kanzenbach , Sebastian Andrzej Siewior Subject: [PATCH v4 net-next 5/8] hsr: Synchronize sending frames to have always incremented outgoing seq nr. Date: Fri, 25 Nov 2022 17:56:07 +0100 Message-Id: <20221125165610.3802446-6-bigeasy@linutronix.de> In-Reply-To: <20221125165610.3802446-1-bigeasy@linutronix.de> References: <20221125165610.3802446-1-bigeasy@linutronix.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org Sending frames via the hsr (master) device requires a sequence number which is tracked in hsr_priv::sequence_nr and protected by hsr_priv::seqnr_lock. Each time a new frame is sent, it will obtain a new id and then send it via the slave devices. Each time a packet is sent (via hsr_forward_do()) the sequence number is checked via hsr_register_frame_out() to ensure that a frame is not handled twice. This makes sense for the receiving side to ensure that the frame is not injected into the stack twice after it has been received from both slave ports. There is no locking to cover the sending path which means the following scenario is possible:

CPU0                                CPU1
hsr_dev_xmit(skb1)                  hsr_dev_xmit(skb2)
 fill_frame_info()                   fill_frame_info()
  hsr_fill_frame_info()               hsr_fill_frame_info()
   handle_std_frame()                  handle_std_frame()
    skb1's sequence_nr = 1
                                        skb2's sequence_nr = 2
 hsr_forward_do()                    hsr_forward_do()
                                      hsr_register_frame_out(, 2) // okay, send
  hsr_register_frame_out(, 1) // stop, lower seq duplicate

Both skbs (or their struct hsr_frame_info) received a unique id. However, since skb2 was sent before skb1, the higher sequence number was recorded in hsr_register_frame_out() and the late-arriving skb1 was dropped and never sent. This scenario has been observed in a three-node HSR setup, with node1 + node2 having ping and iperf running in parallel. From time to time ping reported a missing packet.
Based on tracing, that missing ping packet did not leave the system. It might be possible (didn't check) to drop the sequence number check on the sending side. But if the higher sequence number leaves on the wire before the lower one does and the destination receives them in that order, then it will drop the packet with the lower sequence number and never inject it into the stack. Therefore it seems the only way is to lock the whole path from obtaining the sequence number to sending via dev_queue_xmit(), assuming the packets leave on the wire in the same order (and don't get reordered by the NIC). Cover the whole path for the master interface from obtaining the ID until after it has been forwarded via hsr_forward_skb() to ensure the skbs are sent to the NIC in the order of the assigned sequence numbers. Fixes: f421436a591d3 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Sebastian Andrzej Siewior --- net/hsr/hsr_device.c | 12 +++++++----- net/hsr/hsr_forward.c | 3 +-- 2 files changed, 8 insertions(+), 7 deletions(-) diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c index 84fba2a402a5b..b1e86a7265b32 100644 --- a/net/hsr/hsr_device.c +++ b/net/hsr/hsr_device.c @@ -219,7 +219,9 @@ static netdev_tx_t hsr_dev_xmit(struct sk_buff *skb, struct net_device *dev) skb->dev = master->dev; skb_reset_mac_header(skb); skb_reset_mac_len(skb); + spin_lock_bh(&hsr->seqnr_lock); hsr_forward_skb(skb, master); + spin_unlock_bh(&hsr->seqnr_lock); } else { dev_core_stats_tx_dropped_inc(dev); dev_kfree_skb_any(skb); @@ -306,7 +308,6 @@ static void send_hsr_supervision_frame(struct hsr_port *master, hsr_stag->sequence_nr = htons(hsr->sequence_nr); hsr->sequence_nr++; } - spin_unlock_bh(&hsr->seqnr_lock); hsr_stag->tlv.HSR_TLV_type = type; /* TODO: Why 12 in HSRv0?
*/ @@ -317,11 +318,13 @@ static void send_hsr_supervision_frame(struct hsr_port *master, hsr_sp = skb_put(skb, sizeof(struct hsr_sup_payload)); ether_addr_copy(hsr_sp->macaddress_A, master->dev->dev_addr); - if (skb_put_padto(skb, ETH_ZLEN)) + if (skb_put_padto(skb, ETH_ZLEN)) { + spin_unlock_bh(&hsr->seqnr_lock); return; + } hsr_forward_skb(skb, master); - + spin_unlock_bh(&hsr->seqnr_lock); return; } @@ -360,9 +363,8 @@ static void send_prp_supervision_frame(struct hsr_port *master, return; } - spin_unlock_bh(&hsr->seqnr_lock); - hsr_forward_skb(skb, master); + spin_unlock_bh(&hsr->seqnr_lock); } /* Announce (supervision frame) timer function diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c index 0cb8f4040bfd1..b67e52af8967f 100644 --- a/net/hsr/hsr_forward.c +++ b/net/hsr/hsr_forward.c @@ -508,10 +508,9 @@ static void handle_std_frame(struct sk_buff *skb, frame->is_from_san = true; } else { /* Sequence nr for the master node */ - spin_lock_bh(&hsr->seqnr_lock); + lockdep_assert_held(&hsr->seqnr_lock); frame->sequence_nr = hsr->sequence_nr; hsr->sequence_nr++; - spin_unlock_bh(&hsr->seqnr_lock); } } From patchwork Fri Nov 25 16:56:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sebastian Andrzej Siewior X-Patchwork-Id: 13056160 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 38599C4332F for ; Fri, 25 Nov 2022 16:56:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230006AbiKYQ40 (ORCPT ); Fri, 25 Nov 2022 11:56:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35176 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229936AbiKYQ4T (ORCPT ); Fri, 25 Nov 2022 11:56:19 
-0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8DF3524F0D for ; Fri, 25 Nov 2022 08:56:18 -0800 (PST) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1669395375; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6TYM9OumVgRPKq+eX//pCt69s2IzXDIeTR3GIzFelbk=; b=PCU2QGM+/SpXEPyMiZWhXDsOMABMhTS9ScUiKMjqDdzr/Srw2Ss+Ib+BAbff9Hbk9j9HNa MopEwQENU9WxKYlINKaMZ9B3t/TkXNTQ3S6WaLuO0NT09AMhA+u6pxINjZnVPxwglbwkqu s4kC/NYB6xlRiLSe3sgZPKUcuohYd5yj+V1l7rkocPxA/ZOoUp+2lFWYq4beZbFNBWn94P LwoQEW3H5j3mTCiTj3RUKRCyhmEeEmtq6mS8LO979haAWdbbIT+yBvGF7+4gb4L1ntJ01F DrGgHOOmpOA5Z7GeHrEYpevLLyJdbxhtwYWdDPEioDQXICcjEfz9cI5VLmlV5w== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1669395375; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6TYM9OumVgRPKq+eX//pCt69s2IzXDIeTR3GIzFelbk=; b=q+MM1ullnUPQVfIFBMvq4e4UfR7rFkMBqDSsmybOGgmEo63pa21lLCDrlMX9K5Fy1JeuxS WVttPW4KdvZ0QKCA== To: netdev@vger.kernel.org Cc: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Thomas Gleixner , Kurt Kanzenbach , Sebastian Andrzej Siewior Subject: [PATCH v4 net-next 6/8] hsr: Synchronize sequence number updates. 
Date: Fri, 25 Nov 2022 17:56:08 +0100 Message-Id: <20221125165610.3802446-7-bigeasy@linutronix.de> In-Reply-To: <20221125165610.3802446-1-bigeasy@linutronix.de> References: <20221125165610.3802446-1-bigeasy@linutronix.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org hsr_register_frame_out() compares the new sequence_nr with the old one recorded in hsr_node::seq_out and if the new sequence_nr is higher then it will be written to hsr_node::seq_out as the new value. This operation isn't locked so it is possible that two frames with the same sequence number arrive (via the two slave devices) and are fed to hsr_register_frame_out() at the same time. Both will pass the check and later update the sequence counter to the same value. As a result the content of the same packet is fed into the stack twice. This was noticed by running ping and observing DUP being reported from time to time. Instead of using the hsr_priv::seqnr_lock for the whole receive path (as it is for sending in the master node) add an additional lock that is only used for sequence number checks and updates. Add a per-node lock that is used during sequence number reads and updates. Fixes: f421436a591d3 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Sebastian Andrzej Siewior --- net/hsr/hsr_framereg.c | 9 ++++++++- net/hsr/hsr_framereg.h | 1 + 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c index f2dd846ff9038..39a6088080e93 100644 --- a/net/hsr/hsr_framereg.c +++ b/net/hsr/hsr_framereg.c @@ -157,6 +157,7 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr, return NULL; ether_addr_copy(new_node->macaddress_A, addr); + spin_lock_init(&new_node->seq_out_lock); /* We are only interested in time diffs here, so use current jiffies * as initialization. (0 could trigger an spurious ring error warning).
@@ -353,6 +354,7 @@ void hsr_handle_sup_frame(struct hsr_frame_info *frame) } ether_addr_copy(node_real->macaddress_B, ethhdr->h_source); + spin_lock_bh(&node_real->seq_out_lock); for (i = 0; i < HSR_PT_PORTS; i++) { if (!node_curr->time_in_stale[i] && time_after(node_curr->time_in[i], node_real->time_in[i])) { @@ -363,6 +365,7 @@ void hsr_handle_sup_frame(struct hsr_frame_info *frame) if (seq_nr_after(node_curr->seq_out[i], node_real->seq_out[i])) node_real->seq_out[i] = node_curr->seq_out[i]; } + spin_unlock_bh(&node_real->seq_out_lock); node_real->addr_B_port = port_rcv->type; spin_lock_bh(&hsr->list_lock); @@ -456,13 +459,17 @@ void hsr_register_frame_in(struct hsr_node *node, struct hsr_port *port, int hsr_register_frame_out(struct hsr_port *port, struct hsr_node *node, u16 sequence_nr) { + spin_lock_bh(&node->seq_out_lock); if (seq_nr_before_or_eq(sequence_nr, node->seq_out[port->type]) && time_is_after_jiffies(node->time_out[port->type] + - msecs_to_jiffies(HSR_ENTRY_FORGET_TIME))) + msecs_to_jiffies(HSR_ENTRY_FORGET_TIME))) { + spin_unlock_bh(&node->seq_out_lock); return 1; + } node->time_out[port->type] = jiffies; node->seq_out[port->type] = sequence_nr; + spin_unlock_bh(&node->seq_out_lock); return 0; } diff --git a/net/hsr/hsr_framereg.h b/net/hsr/hsr_framereg.h index b5f902397bf1a..9a047ac5991d0 100644 --- a/net/hsr/hsr_framereg.h +++ b/net/hsr/hsr_framereg.h @@ -69,6 +69,7 @@ void prp_update_san_info(struct hsr_node *node, bool is_sup); struct hsr_node { struct list_head mac_list; + spinlock_t seq_out_lock; unsigned char macaddress_A[ETH_ALEN]; unsigned char macaddress_B[ETH_ALEN]; /* Local slave through which AddrB frames are received from this node */ From patchwork Fri Nov 25 16:56:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sebastian Andrzej Siewior X-Patchwork-Id: 13056163 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 
(2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 529BFC4332F for ; Fri, 25 Nov 2022 16:56:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230022AbiKYQ42 (ORCPT ); Fri, 25 Nov 2022 11:56:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35154 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229969AbiKYQ4U (ORCPT ); Fri, 25 Nov 2022 11:56:20 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B6E5E2D741 for ; Fri, 25 Nov 2022 08:56:18 -0800 (PST) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1669395375; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=y9IkcfDe4Q8jdMl4A4jc4Ym2T3Ysx+WDfbYKL6Skx1s=; b=4ExhQ76yMFZTbTyR8Yhz3QYqWpdPbdIXstPrzPIN4S8wBHsRhrhgA04vAOFxZxQJYFPUqA 561vSQC8srZnHEZrD2MrA/p7x/o04t9uiwj/rb8gMVG8FMWc2NXxT64LhcRU3ErODaGng1 DXQLuIAIVicE5xB6Xo/zDbWFrbDW6/dqFFObKStG38eC8nqv2xe6ozrpmOc2GkPnoqWS6s U/TOXL6PMHurRrRibn62ElmOyI6boUYNnYMz+l4f3kZ/C/6F16E7c7pM/VkUyRCqh7KTH8 /KmWQUsE7NopbfGlYrWWoNCBWdpes6aShSuEfeBQes2LcUxtbopDW8LxKdx98Q== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1669395375; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=y9IkcfDe4Q8jdMl4A4jc4Ym2T3Ysx+WDfbYKL6Skx1s=; b=ietAzOQLW75hTSKLXEImFNLvjhvUf8aMIjzVnMK1kM/2h6jHkIZFSZkL/u6X1KbDqFe2s1 7hQ2c2Wp6Zo5edBg== To: netdev@vger.kernel.org Cc: "David S. 
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Thomas Gleixner , Kurt Kanzenbach , Sebastian Andrzej Siewior Subject: [PATCH v4 net-next 7/8] hsr: Use a single struct for self_node. Date: Fri, 25 Nov 2022 17:56:09 +0100 Message-Id: <20221125165610.3802446-8-bigeasy@linutronix.de> In-Reply-To: <20221125165610.3802446-1-bigeasy@linutronix.de> References: <20221125165610.3802446-1-bigeasy@linutronix.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org self_node_db is a list_head with one entry of struct hsr_node. The purpose is to hold the two MAC addresses of the node itself. It is convenient to recycle the structure. However, having a list_head and always fetching the first entry is not really optimal. Create a new data structure named hsr_self_node containing the two MAC addresses. Access that structure via an RCU-protected pointer so it can be replaced on the fly without blocking the reader. Signed-off-by: Sebastian Andrzej Siewior Reviewed-by: Kurt Kanzenbach --- net/hsr/hsr_device.c | 1 - net/hsr/hsr_framereg.c | 63 +++++++++++++++++++----------------------- net/hsr/hsr_main.h | 8 +++++- 3 files changed, 35 insertions(+), 37 deletions(-) diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c index b1e86a7265b32..5a236aae2366f 100644 --- a/net/hsr/hsr_device.c +++ b/net/hsr/hsr_device.c @@ -490,7 +490,6 @@ int hsr_dev_finalize(struct net_device *hsr_dev, struct net_device *slave[2], hsr = netdev_priv(hsr_dev); INIT_LIST_HEAD(&hsr->ports); INIT_LIST_HEAD(&hsr->node_db); - INIT_LIST_HEAD(&hsr->self_node_db); spin_lock_init(&hsr->list_lock); eth_hw_addr_set(hsr_dev, slave[0]->dev_addr); diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c index 39a6088080e93..00db74d96583d 100644 --- a/net/hsr/hsr_framereg.c +++ b/net/hsr/hsr_framereg.c @@ -38,21 +38,22 @@ static bool seq_nr_after(u16 a, u16 b) bool hsr_addr_is_self(struct hsr_priv *hsr, unsigned char *addr) { - struct hsr_node
*node; + struct hsr_self_node *sn; + bool ret = false; - node = list_first_or_null_rcu(&hsr->self_node_db, struct hsr_node, - mac_list); - if (!node) { + rcu_read_lock(); + sn = rcu_dereference(hsr->self_node); + if (!sn) { WARN_ONCE(1, "HSR: No self node\n"); - return false; + goto out; } - if (ether_addr_equal(addr, node->macaddress_A)) - return true; - if (ether_addr_equal(addr, node->macaddress_B)) - return true; - - return false; + if (ether_addr_equal(addr, sn->macaddress_A) || + ether_addr_equal(addr, sn->macaddress_B)) + ret = true; +out: + rcu_read_unlock(); + return ret; } /* Search for mac entry. Caller must hold rcu read lock. @@ -70,50 +71,42 @@ static struct hsr_node *find_node_by_addr_A(struct list_head *node_db, return NULL; } -/* Helper for device init; the self_node_db is used in hsr_rcv() to recognize +/* Helper for device init; the self_node is used in hsr_rcv() to recognize * frames from self that's been looped over the HSR ring. */ int hsr_create_self_node(struct hsr_priv *hsr, const unsigned char addr_a[ETH_ALEN], const unsigned char addr_b[ETH_ALEN]) { - struct list_head *self_node_db = &hsr->self_node_db; - struct hsr_node *node, *oldnode; + struct hsr_self_node *sn, *old; - node = kmalloc(sizeof(*node), GFP_KERNEL); - if (!node) + sn = kmalloc(sizeof(*sn), GFP_KERNEL); + if (!sn) return -ENOMEM; - ether_addr_copy(node->macaddress_A, addr_a); - ether_addr_copy(node->macaddress_B, addr_b); + ether_addr_copy(sn->macaddress_A, addr_a); + ether_addr_copy(sn->macaddress_B, addr_b); spin_lock_bh(&hsr->list_lock); - oldnode = list_first_or_null_rcu(self_node_db, - struct hsr_node, mac_list); - if (oldnode) { - list_replace_rcu(&oldnode->mac_list, &node->mac_list); - spin_unlock_bh(&hsr->list_lock); - kfree_rcu(oldnode, rcu_head); - } else { - list_add_tail_rcu(&node->mac_list, self_node_db); - spin_unlock_bh(&hsr->list_lock); - } + old = rcu_replace_pointer(hsr->self_node, sn, + lockdep_is_held(&hsr->list_lock)); + spin_unlock_bh(&hsr->list_lock); 
+ if (old) + kfree_rcu(old, rcu_head); return 0; } void hsr_del_self_node(struct hsr_priv *hsr) { - struct list_head *self_node_db = &hsr->self_node_db; - struct hsr_node *node; + struct hsr_self_node *old; spin_lock_bh(&hsr->list_lock); - node = list_first_or_null_rcu(self_node_db, struct hsr_node, mac_list); - if (node) { - list_del_rcu(&node->mac_list); - kfree_rcu(node, rcu_head); - } + old = rcu_replace_pointer(hsr->self_node, NULL, + lockdep_is_held(&hsr->list_lock)); spin_unlock_bh(&hsr->list_lock); + if (old) + kfree_rcu(old, rcu_head); } void hsr_del_nodes(struct list_head *node_db) diff --git a/net/hsr/hsr_main.h b/net/hsr/hsr_main.h index 16ae9fb09ccd2..ea9aa5130ad6d 100644 --- a/net/hsr/hsr_main.h +++ b/net/hsr/hsr_main.h @@ -182,11 +182,17 @@ struct hsr_proto_ops { void (*update_san_info)(struct hsr_node *node, bool is_sup); }; +struct hsr_self_node { + unsigned char macaddress_A[ETH_ALEN]; + unsigned char macaddress_B[ETH_ALEN]; + struct rcu_head rcu_head; +}; + struct hsr_priv { struct rcu_head rcu_head; struct list_head ports; struct list_head node_db; /* Known HSR nodes */ - struct list_head self_node_db; /* MACs of slaves */ + struct hsr_self_node *self_node; /* MACs of slaves */ struct timer_list announce_timer; /* Supervision frame dispatch */ struct timer_list prune_timer; int announce_count; From patchwork Fri Nov 25 16:56:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sebastian Andrzej Siewior X-Patchwork-Id: 13056162 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 443AAC4321E for ; Fri, 25 Nov 2022 16:56:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230007AbiKYQ42 (ORCPT ); Fri, 25 Nov 2022 11:56:28 -0500 
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35180 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229973AbiKYQ4V (ORCPT ); Fri, 25 Nov 2022 11:56:21 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B6D202D1C7; Fri, 25 Nov 2022 08:56:18 -0800 (PST) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1669395376; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nHn+krM5zkqrj8eTzbiYYf9V3Ti8/2Hp14OPp+k9Zxs=; b=V5BFmn+peox5bSPvDeTnqHZLgfjhmzcLYsK63VuTo+RpWPzNVdgR12zDysitfqZwbGNr+0 8WUnONv6NaUPzAvzatYdUSXk7i1I77uqKagKqd6G/8FIju/MASmeRLEZGYnr4uYoOilxJi DqQsduSFUBrrJoaec8bqUeSBi1WxYEC1JgWVLYv3VcvCktc1CMmTB1IjbOnjYMP1EoQ461 M7ZaxcQ+g5Sz+xgdObU2Y2LpOjIj505O4e9WfLQrxJrHMh29nyjB+QF0Vvd24diEiGvjmq seykhXVCa3QTJGlyO3EalKrRxg5EpqyvR/CAJKBmouJe9eNhgkT0hJoo5mchvA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1669395376; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nHn+krM5zkqrj8eTzbiYYf9V3Ti8/2Hp14OPp+k9Zxs=; b=V4DI8XqkoF+otsTJMGygfdDPhHurt88CKvmSfYlNMTZPuAki1nN+8Ngi+ktJlHTWBB0eWl BvuIgO8U9i3F88Aw== To: netdev@vger.kernel.org Cc: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Thomas Gleixner , Kurt Kanzenbach , Sebastian Andrzej Siewior , Shuah Khan , linux-kselftest@vger.kernel.org Subject: [PATCH v4 net-next 8/8] selftests: Add a basic HSR test. 
Date: Fri, 25 Nov 2022 17:56:10 +0100 Message-Id: <20221125165610.3802446-9-bigeasy@linutronix.de> In-Reply-To: <20221125165610.3802446-1-bigeasy@linutronix.de> References: <20221125165610.3802446-1-bigeasy@linutronix.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org This test adds a basic HSRv0 network with 3 nodes. In its current shape it sends and forwards packets and announcements, and thereby merges nodes based on MAC A/B information. It is able to detect duplicate packets and packet loss should any occur. Cc: Shuah Khan Cc: linux-kselftest@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Reviewed-by: Kurt Kanzenbach --- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/net/hsr/Makefile | 7 + tools/testing/selftests/net/hsr/config | 4 + tools/testing/selftests/net/hsr/hsr_ping.sh | 256 ++++++++++++++++++++ 4 files changed, 268 insertions(+) create mode 100644 tools/testing/selftests/net/hsr/Makefile create mode 100644 tools/testing/selftests/net/hsr/config create mode 100755 tools/testing/selftests/net/hsr/hsr_ping.sh diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index f07aef7c592c2..b57b091d80268 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -48,6 +48,7 @@ TARGETS += nci TARGETS += net TARGETS += net/af_unix TARGETS += net/forwarding +TARGETS += net/hsr TARGETS += net/mptcp TARGETS += net/openvswitch TARGETS += netfilter diff --git a/tools/testing/selftests/net/hsr/Makefile b/tools/testing/selftests/net/hsr/Makefile new file mode 100644 index 0000000000000..92c1d9d080cd5 --- /dev/null +++ b/tools/testing/selftests/net/hsr/Makefile @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: GPL-2.0 + +top_srcdir = ../../../../..
+ +TEST_PROGS := hsr_ping.sh + +include ../../lib.mk diff --git a/tools/testing/selftests/net/hsr/config b/tools/testing/selftests/net/hsr/config new file mode 100644 index 0000000000000..22061204fb691 --- /dev/null +++ b/tools/testing/selftests/net/hsr/config @@ -0,0 +1,4 @@ +CONFIG_IPV6=y +CONFIG_NET_SCH_NETEM=m +CONFIG_HSR=y +CONFIG_VETH=y diff --git a/tools/testing/selftests/net/hsr/hsr_ping.sh b/tools/testing/selftests/net/hsr/hsr_ping.sh new file mode 100755 index 0000000000000..df91435387086 --- /dev/null +++ b/tools/testing/selftests/net/hsr/hsr_ping.sh @@ -0,0 +1,256 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +ret=0 +ksft_skip=4 +ipv6=true + +optstring="h4" +usage() { + echo "Usage: $0 [OPTION]" + echo -e "\t-4: IPv4 only: disable IPv6 tests (default: test both IPv4 and IPv6)" +} + +while getopts "$optstring" option;do + case "$option" in + "h") + usage $0 + exit 0 + ;; + "4") + ipv6=false + ;; + "?") + usage $0 + exit 1 + ;; +esac +done + +sec=$(date +%s) +rndh=$(printf %x $sec)-$(mktemp -u XXXXXX) +ns1="ns1-$rndh" +ns2="ns2-$rndh" +ns3="ns3-$rndh" + +cleanup() +{ + local netns + for netns in "$ns1" "$ns2" "$ns3" ;do + ip netns del "$netns" + done +} + +ip -Version > /dev/null 2>&1 +if [ $? -ne 0 ];then + echo "SKIP: Could not run test without ip tool" + exit $ksft_skip +fi + +trap cleanup EXIT + +for i in "$ns1" "$ns2" "$ns3" ;do + ip netns add "$i" || exit $ksft_skip + ip -net "$i" link set lo up +done + +echo "INFO: preparing interfaces." +# Three HSR nodes. Each node has one link to each of its neighbours, two links in total. +# +# ns1eth1 ----- ns2eth1 +# hsr1 hsr2 +# ns1eth2 ns2eth2 +# | | +# ns3eth1 ns3eth2 +# \ / +# hsr3 +# +# Interfaces +ip link add ns1eth1 netns "$ns1" type veth peer name ns2eth1 netns "$ns2" +ip link add ns1eth2 netns "$ns1" type veth peer name ns3eth1 netns "$ns3" +ip link add ns3eth2 netns "$ns3" type veth peer name ns2eth2 netns "$ns2" + +# HSRv0.
+ip -net "$ns1" link add name hsr1 type hsr slave1 ns1eth1 slave2 ns1eth2 supervision 45 version 0 proto 0 +ip -net "$ns2" link add name hsr2 type hsr slave1 ns2eth1 slave2 ns2eth2 supervision 45 version 0 proto 0 +ip -net "$ns3" link add name hsr3 type hsr slave1 ns3eth1 slave2 ns3eth2 supervision 45 version 0 proto 0 + +# IP for HSR +ip -net "$ns1" addr add 100.64.0.1/24 dev hsr1 +ip -net "$ns1" addr add dead:beef:1::1/64 dev hsr1 nodad +ip -net "$ns2" addr add 100.64.0.2/24 dev hsr2 +ip -net "$ns2" addr add dead:beef:1::2/64 dev hsr2 nodad +ip -net "$ns3" addr add 100.64.0.3/24 dev hsr3 +ip -net "$ns3" addr add dead:beef:1::3/64 dev hsr3 nodad + +# All Links up +ip -net "$ns1" link set ns1eth1 up +ip -net "$ns1" link set ns1eth2 up +ip -net "$ns1" link set hsr1 up + +ip -net "$ns2" link set ns2eth1 up +ip -net "$ns2" link set ns2eth2 up +ip -net "$ns2" link set hsr2 up + +ip -net "$ns3" link set ns3eth1 up +ip -net "$ns3" link set ns3eth2 up +ip -net "$ns3" link set hsr3 up + +# $1: IP address +is_v6() +{ + [ -z "${1##*:*}" ] +} + +do_ping() +{ + local netns="$1" + local connect_addr="$2" + local ping_args="-q -c 2" + + if is_v6 "${connect_addr}"; then + $ipv6 || return 0 + ping_args="${ping_args} -6" + fi + + ip netns exec ${netns} ping ${ping_args} $connect_addr >/dev/null + if [ $? -ne 0 ] ; then + echo "$netns -> $connect_addr connectivity [ FAIL ]" 1>&2 + ret=1 + return 1 + fi + + return 0 +} + +do_ping_long() +{ + local netns="$1" + local connect_addr="$2" + local ping_args="-q -c 10" + + if is_v6 "${connect_addr}"; then + $ipv6 || return 0 + ping_args="${ping_args} -6" + fi + + OUT="$(LANG=C ip netns exec ${netns} ping ${ping_args} $connect_addr | grep received)" + if [ $? 
-ne 0 ] ; then + echo "$netns -> $connect_addr ping [ FAIL ]" 1>&2 + ret=1 + return 1 + fi + + VAL="$(echo $OUT | cut -d' ' -f1-8)" + if [ "$VAL" != "10 packets transmitted, 10 received, 0% packet loss," ] + then + echo "$netns -> $connect_addr ping TEST [ FAIL ]" + echo "Expect to send and receive 10 packets and no duplicates." + echo "Full message: ${OUT}." + ret=1 + return 1 + fi + + return 0 +} + +stop_if_error() +{ + local msg="$1" + + if [ ${ret} -ne 0 ]; then + echo "FAIL: ${msg}" 1>&2 + exit ${ret} + fi +} + + +echo "INFO: Initial validation ping." +# Each node has to be able to reach each other node. +do_ping "$ns1" 100.64.0.2 +do_ping "$ns2" 100.64.0.1 +do_ping "$ns3" 100.64.0.1 +stop_if_error "Initial validation failed." + +do_ping "$ns1" 100.64.0.3 +do_ping "$ns2" 100.64.0.3 +do_ping "$ns3" 100.64.0.2 + +do_ping "$ns1" dead:beef:1::2 +do_ping "$ns1" dead:beef:1::3 +do_ping "$ns2" dead:beef:1::1 +do_ping "$ns2" dead:beef:1::3 +do_ping "$ns3" dead:beef:1::1 +do_ping "$ns3" dead:beef:1::2 + +stop_if_error "Initial validation failed." + +# Wait until all supervision frames have been processed and the node +# entries have been merged. Otherwise duplicate frames will be observed which is +# valid at this stage. +WAIT=5 +while [ ${WAIT} -gt 0 ] +do + grep 00:00:00:00:00:00 /sys/kernel/debug/hsr/hsr*/node_table + if [ $? -ne 0 ] + then + break + fi + sleep 1 + WAIT=$((WAIT - 1)) +done + +# Just a safety delay in case the above check didn't handle it. +sleep 1 + +echo "INFO: Longer ping test." +do_ping_long "$ns1" 100.64.0.2 +do_ping_long "$ns1" dead:beef:1::2 +do_ping_long "$ns1" 100.64.0.3 +do_ping_long "$ns1" dead:beef:1::3 + +stop_if_error "Longer ping test failed." + +do_ping_long "$ns2" 100.64.0.1 +do_ping_long "$ns2" dead:beef:1::1 +do_ping_long "$ns2" 100.64.0.3 +do_ping_long "$ns2" dead:beef:1::3 +stop_if_error "Longer ping test failed."
+ +do_ping_long "$ns3" 100.64.0.1 +do_ping_long "$ns3" dead:beef:1::1 +do_ping_long "$ns3" 100.64.0.2 +do_ping_long "$ns3" dead:beef:1::2 +stop_if_error "Longer ping test failed." + +echo "INFO: Cutting one link." +do_ping_long "$ns1" 100.64.0.3 & + +sleep 3 +ip -net "$ns3" link set ns3eth1 down +wait + +ip -net "$ns3" link set ns3eth1 up + +stop_if_error "Failed with one link down." + +echo "INFO: Delay the link and drop a few packets." +tc -net "$ns3" qdisc add dev ns3eth1 root netem delay 50ms +tc -net "$ns2" qdisc add dev ns2eth1 root netem delay 5ms loss 25% + +do_ping_long "$ns1" 100.64.0.2 +do_ping_long "$ns1" 100.64.0.3 + +stop_if_error "Failed with delay and packetloss." + +do_ping_long "$ns2" 100.64.0.1 +do_ping_long "$ns2" 100.64.0.3 + +stop_if_error "Failed with delay and packetloss." + +do_ping_long "$ns3" 100.64.0.1 +do_ping_long "$ns3" 100.64.0.2 +stop_if_error "Failed with delay and packetloss." + +echo "INFO: All good." +exit $ret
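For readers who want to try the address check and the polling pattern from hsr_ping.sh without root or network namespaces, the following is a minimal standalone sketch. is_v6() is copied from the script above. wait_for() and WAIT_INTERVAL are hypothetical names that are not part of the patch; they only illustrate a bounded retry loop of the kind the script uses while it waits for the node entries to be merged.

```shell
#!/bin/bash
# Standalone sketch; it only defines functions and prints two lines.

# Same test as in hsr_ping.sh: "${1##*:*}" removes the longest prefix
# matching *:*, which is the whole string whenever it contains a colon,
# so an empty result means "looks like IPv6".
is_v6()
{
	[ -z "${1##*:*}" ]
}

# Hypothetical helper (assumption, not in the patch): run the given command
# up to $1 times, sleeping WAIT_INTERVAL seconds (default 1) between tries.
# Returns 0 as soon as the command succeeds, 1 if every try failed.
wait_for()
{
	local tries="$1"
	shift

	while [ "$tries" -gt 0 ]; do
		"$@" && return 0
		sleep "${WAIT_INTERVAL:-1}"
		tries=$((tries - 1))
	done
	return 1
}

# Usage: classify the two address families used by the test.
for addr in 100.64.0.2 dead:beef:1::2; do
	if is_v6 "$addr"; then
		echo "$addr: IPv6"
	else
		echo "$addr: IPv4"
	fi
done
```

The countdown uses $((...)) arithmetic, which behaves the same regardless of spacing around the operators. Note that the script's own loop polls in the opposite sense: it stops once grep no longer finds an all-zero MAC in node_table, so a wrapper of this kind would need a command that succeeds when the pattern is absent.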