Message ID | 1657679096-38572-1-git-send-email-liyonglong@chinatelecom.cn (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: sort queues in xps maps | expand |
On Wed, 13 Jul 2022 10:24:56 +0800 Yonglong Li wrote: > in the following case that set xps of each tx-queue with same cpu mask, > packets in the same tcp stream may be hash to different tx queue. Because > the order of queues in each xps map is not the same. > > first set each tx-queue with different cpu mask > echo 0 > /sys/class/net/eth0/queues/tx-0 > echo 1 > /sys/class/net/eth0/queues/tx-1 > echo 2 > /sys/class/net/eth0/queues/tx-2 > echo 4 > /sys/class/net/eth0/queues/tx-3 > and then set each tx-queue with same cpu mask > echo f > /sys/class/net/eth0/queues/tx-0 > echo f > /sys/class/net/eth0/queues/tx-1 > echo f > /sys/class/net/eth0/queues/tx-2 > echo f > /sys/class/net/eth0/queues/tx-3 These commands look truncated. > at this point the order of each map queues is differnet, It will cause > packets in the same stream be hashed to diffetent tx queue: > attr_map[0].queues = [0,1,2,3] > attr_map[1].queues = [1,0,2,3] > attr_map[2].queues = [2,0,1,3] > attr_map[3].queues = [3,0,1,2] > > It is more reasonable that pacekts in the same stream be hashed to the same > tx queue when all tx queue bind with the same CPUs. > > Fixes: 537c00de1c9b ("net: Add functions netif_reset_xps_queue and netif_set_xps_queue") I'd suggest treating this as a general improvement rather than fix, the kernel always behaved this way - it seems logical that sorted is better but whether it's a bug not to sort is not as clear cut. > @@ -2654,6 +2660,13 @@ int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask, > skip_tc); > } > > + for (j = -1; j = netif_attrmask_next_and(j, online_mask, mask, nr_ids), > + j < nr_ids;) { > + tci = j * num_tc + tc; > + map = xmap_dereference(new_dev_maps->attr_map[tci]); > + sort(map->queues, map->len, sizeof(u16), cmp_u16, NULL); > + } > + Can we instead make sure that expand_xps_map() maintains order?
Hi Jakub, Thanks for your feedback. On 7/14/2022 10:07 AM, Jakub Kicinski wrote: > On Wed, 13 Jul 2022 10:24:56 +0800 Yonglong Li wrote: >> in the following case that set xps of each tx-queue with same cpu mask, >> packets in the same tcp stream may be hash to different tx queue. Because >> the order of queues in each xps map is not the same. >> >> first set each tx-queue with different cpu mask >> echo 0 > /sys/class/net/eth0/queues/tx-0 >> echo 1 > /sys/class/net/eth0/queues/tx-1 >> echo 2 > /sys/class/net/eth0/queues/tx-2 >> echo 4 > /sys/class/net/eth0/queues/tx-3 >> and then set each tx-queue with same cpu mask >> echo f > /sys/class/net/eth0/queues/tx-0 >> echo f > /sys/class/net/eth0/queues/tx-1 >> echo f > /sys/class/net/eth0/queues/tx-2 >> echo f > /sys/class/net/eth0/queues/tx-3 > > These commands look truncated. I will refill the commands. > >> at this point the order of each map queues is differnet, It will cause >> packets in the same stream be hashed to diffetent tx queue: >> attr_map[0].queues = [0,1,2,3] >> attr_map[1].queues = [1,0,2,3] >> attr_map[2].queues = [2,0,1,3] >> attr_map[3].queues = [3,0,1,2] >> >> It is more reasonable that pacekts in the same stream be hashed to the same >> tx queue when all tx queue bind with the same CPUs. >> >> Fixes: 537c00de1c9b ("net: Add functions netif_reset_xps_queue and netif_set_xps_queue") > > I'd suggest treating this as a general improvement rather than fix, > the kernel always behaved this way - it seems logical that sorted is > better but whether it's a bug not to sort is not as clear cut. > agree, will remove Fixes:tag in next version. >> @@ -2654,6 +2660,13 @@ int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask, >> skip_tc); >> } >> >> + for (j = -1; j = netif_attrmask_next_and(j, online_mask, mask, nr_ids), >> + j < nr_ids;) { >> + tci = j * num_tc + tc; >> + map = xmap_dereference(new_dev_maps->attr_map[tci]); >> + sort(map->queues, map->len, sizeof(u16), cmp_u16, NULL); >> + } >> + > > Can we instead make sure that expand_xps_map() maintains order? > expand_xps_map() only alloc new_map and copy old map's queue to new_map. I think it is not suitable to do it in expand_xps_map(). WDYT?
On Thu, 14 Jul 2022 11:24:31 +0800 Yonglong Li wrote: > >> @@ -2654,6 +2660,13 @@ int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask, > >> skip_tc); > >> } > >> > >> + for (j = -1; j = netif_attrmask_next_and(j, online_mask, mask, nr_ids), > >> + j < nr_ids;) { > >> + tci = j * num_tc + tc; > >> + map = xmap_dereference(new_dev_maps->attr_map[tci]); > >> + sort(map->queues, map->len, sizeof(u16), cmp_u16, NULL); > >> + } > >> + > > > > Can we instead make sure that expand_xps_map() maintains order? > > > expand_xps_map() only alloc new_map and copy old map's queue to new_map. > I think it is not suitable to do it in expand_xps_map(). > WDYT? Oh, right, sorry for the confusion, I assumed since it reallocates that it also fills the entry. It probably doesn't to make sure that all allocations succeed before making any modifications. Can we factor out the inside of the next loop - starting from the "add tx-queue to CPU/rx-queue maps" comment into a helper? My worry is that __netif_set_xps_queue() is already pretty long and complicated we should try to move some code out rather than make it longer.
On 7/14/2022 11:32 AM, Jakub Kicinski wrote: > On Thu, 14 Jul 2022 11:24:31 +0800 Yonglong Li wrote: >>>> @@ -2654,6 +2660,13 @@ int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask, >>>> skip_tc); >>>> } >>>> >>>> + for (j = -1; j = netif_attrmask_next_and(j, online_mask, mask, nr_ids), >>>> + j < nr_ids;) { >>>> + tci = j * num_tc + tc; >>>> + map = xmap_dereference(new_dev_maps->attr_map[tci]); >>>> + sort(map->queues, map->len, sizeof(u16), cmp_u16, NULL); >>>> + } >>>> + >>> >>> Can we instead make sure that expand_xps_map() maintains order? >>> >> expand_xps_map() only alloc new_map and copy old map's queue to new_map. >> I think it is not suitable to do it in expand_xps_map(). >> WDYT? > > Oh, right, sorry for the confusion, I assumed since it reallocates that > it also fills the entry. It probably doesn't to make sure that all > allocations succeed before making any modifications. > > Can we factor out the inside of the next loop - starting from the > "add tx-queue to CPU/rx-queue maps" comment into a helper? My worry is > that __netif_set_xps_queue() is already pretty long and complicated we > should try to move some code out rather than make it longer. > Ok, I will prepare a v2 patch as your suggestion.
diff --git a/net/core/dev.c b/net/core/dev.c index 978ed06..aab26b4 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -150,6 +150,7 @@ #include <linux/pm_runtime.h> #include <linux/prandom.h> #include <linux/once_lite.h> +#include <linux/sort.h> #include "dev.h" #include "net-sysfs.h" @@ -199,6 +200,11 @@ static int call_netdevice_notifiers_extack(unsigned long val, static DECLARE_RWSEM(devnet_rename_sem); +static int cmp_u16(const void *a, const void *b) +{ + return *(u16 *)a - *(u16 *)b; +} + static inline void dev_base_seq_inc(struct net *net) { while (++net->dev_base_seq == 0) @@ -2654,6 +2660,13 @@ int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask, skip_tc); } + for (j = -1; j = netif_attrmask_next_and(j, online_mask, mask, nr_ids), + j < nr_ids;) { + tci = j * num_tc + tc; + map = xmap_dereference(new_dev_maps->attr_map[tci]); + sort(map->queues, map->len, sizeof(u16), cmp_u16, NULL); + } + rcu_assign_pointer(dev->xps_maps[type], new_dev_maps); /* Cleanup old maps */
in the following case that set xps of each tx-queue with same cpu mask, packets in the same tcp stream may be hash to different tx queue. Because the order of queues in each xps map is not the same. first set each tx-queue with different cpu mask echo 0 > /sys/class/net/eth0/queues/tx-0 echo 1 > /sys/class/net/eth0/queues/tx-1 echo 2 > /sys/class/net/eth0/queues/tx-2 echo 4 > /sys/class/net/eth0/queues/tx-3 and then set each tx-queue with same cpu mask echo f > /sys/class/net/eth0/queues/tx-0 echo f > /sys/class/net/eth0/queues/tx-1 echo f > /sys/class/net/eth0/queues/tx-2 echo f > /sys/class/net/eth0/queues/tx-3 at this point the order of each map queues is differnet, It will cause packets in the same stream be hashed to diffetent tx queue: attr_map[0].queues = [0,1,2,3] attr_map[1].queues = [1,0,2,3] attr_map[2].queues = [2,0,1,3] attr_map[3].queues = [3,0,1,2] It is more reasonable that pacekts in the same stream be hashed to the same tx queue when all tx queue bind with the same CPUs. Fixes: 537c00de1c9b ("net: Add functions netif_reset_xps_queue and netif_set_xps_queue") Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn> --- net/core/dev.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)