Message ID | 20240730022623.98909-2-almasrymina@google.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | Device Memory TCP | expand |
On Tue, 30 Jul 2024 02:26:05 +0000, Mina Almasry <almasrymina@google.com> wrote: > Add netdev_rx_queue_restart() function to netdev_rx_queue.h Can you say more? As far as I understand, we just release the buffer submitted to the rx ring and get a new page pool. But I personally feel that the interface here is a bit too complicated. In particular, we also need to copy the rx struct memory, which means it is a dangerous operation for many pointers. Thanks. > > Signed-off-by: David Wei <dw@davidwei.uk> > Signed-off-by: Mina Almasry <almasrymina@google.com> > Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> > Reviewed-by: Jakub Kicinski <kuba@kernel.org> > > --- > > v17: > - Use ASSERT_RTNL() (Jakub). > > v13: > - Add reviewed-by from Pavel (thanks!) > - Fixed comment (Pavel) > > v11: > - Fix not checking dev->queue_mgmt_ops (Pavel). > - Fix ndo_queue_mem_free call that passed the wrong pointer (David). > > v9: https://lore.kernel.org/all/20240502045410.3524155-4-dw@davidwei.uk/ > (submitted by David). > - fixed SPDX license identifier (Simon). > - Rebased on top of merged queue API definition, and changed > implementation to match that. > - Replace rtnl_lock() with rtnl_is_locked() to make it useable from my > netlink code where rtnl is already locked. > > --- > include/net/netdev_rx_queue.h | 3 ++ > net/core/Makefile | 1 + > net/core/netdev_rx_queue.c | 74 +++++++++++++++++++++++++++++++++++ > 3 files changed, 78 insertions(+) > create mode 100644 net/core/netdev_rx_queue.c > > diff --git a/include/net/netdev_rx_queue.h b/include/net/netdev_rx_queue.h > index aa1716fb0e53c..e78ca52d67fbf 100644 > --- a/include/net/netdev_rx_queue.h > +++ b/include/net/netdev_rx_queue.h > @@ -54,4 +54,7 @@ get_netdev_rx_queue_index(struct netdev_rx_queue *queue) > return index; > } > #endif > + > +int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq); > + > #endif > diff --git a/net/core/Makefile b/net/core/Makefile > index 62be9aef25285..f82232b358a2c 100644 > --- a/net/core/Makefile > +++ b/net/core/Makefile > @@ -19,6 +19,7 @@ obj-$(CONFIG_NETDEV_ADDR_LIST_TEST) += dev_addr_lists_test.o > > obj-y += net-sysfs.o > obj-y += hotdata.o > +obj-y += netdev_rx_queue.o > obj-$(CONFIG_PAGE_POOL) += page_pool.o page_pool_user.o > obj-$(CONFIG_PROC_FS) += net-procfs.o > obj-$(CONFIG_NET_PKTGEN) += pktgen.o > diff --git a/net/core/netdev_rx_queue.c b/net/core/netdev_rx_queue.c > new file mode 100644 > index 0000000000000..da11720a59830 > --- /dev/null > +++ b/net/core/netdev_rx_queue.c > @@ -0,0 +1,74 @@ > +// SPDX-License-Identifier: GPL-2.0-or-later > + > +#include <linux/netdevice.h> > +#include <net/netdev_queues.h> > +#include <net/netdev_rx_queue.h> > + > +int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq_idx) > +{ > + void *new_mem, *old_mem; > + int err; > + > + if (!dev->queue_mgmt_ops || !dev->queue_mgmt_ops->ndo_queue_stop || > + !dev->queue_mgmt_ops->ndo_queue_mem_free || > + !dev->queue_mgmt_ops->ndo_queue_mem_alloc || > + !dev->queue_mgmt_ops->ndo_queue_start) > + return -EOPNOTSUPP; > + > + ASSERT_RTNL(); > + > + new_mem = kvzalloc(dev->queue_mgmt_ops->ndo_queue_mem_size, GFP_KERNEL); > + if (!new_mem) > + return -ENOMEM; > + > + old_mem = kvzalloc(dev->queue_mgmt_ops->ndo_queue_mem_size, GFP_KERNEL); > + if (!old_mem) { > + err = -ENOMEM; > + goto err_free_new_mem; > + } > + > + err = dev->queue_mgmt_ops->ndo_queue_mem_alloc(dev, new_mem, rxq_idx); > + if (err) > + goto err_free_old_mem; > + > + err = dev->queue_mgmt_ops->ndo_queue_stop(dev, old_mem, rxq_idx); > + if (err) > + goto err_free_new_queue_mem; > + > + err = dev->queue_mgmt_ops->ndo_queue_start(dev, new_mem, rxq_idx); > + if (err) > + goto err_start_queue; > + > + dev->queue_mgmt_ops->ndo_queue_mem_free(dev, old_mem); > + > + kvfree(old_mem); > + kvfree(new_mem); > + > + return 0; > + > +err_start_queue: > + /* Restarting the queue with old_mem should be successful as we haven't > + * changed any of the queue configuration, and there is not much we can > + * do to recover from a failure here. > + * > + * WARN if we fail to recover the old rx queue, and at least free > + * old_mem so we don't also leak that. > + */ > + if (dev->queue_mgmt_ops->ndo_queue_start(dev, old_mem, rxq_idx)) { > + WARN(1, > + "Failed to restart old queue in error path. RX queue %d may be unhealthy.", > + rxq_idx); > + dev->queue_mgmt_ops->ndo_queue_mem_free(dev, old_mem); > + } > + > +err_free_new_queue_mem: > + dev->queue_mgmt_ops->ndo_queue_mem_free(dev, new_mem); > + > +err_free_old_mem: > + kvfree(old_mem); > + > +err_free_new_mem: > + kvfree(new_mem); > + > + return err; > +} > -- > 2.46.0.rc1.232.g9752f9e123-goog > >
On Tue, Jul 30, 2024 at 4:17 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote: > > On Tue, 30 Jul 2024 02:26:05 +0000, Mina Almasry <almasrymina@google.com> wrote: > > Add netdev_rx_queue_restart() function to netdev_rx_queue.h > > > Can you say more? As far as I understand, we just release the buffer > submitted to the rx ring and get a new page pool. > Yes, I just noticed that this commit message is underwritten. I'll add more color. Maybe something like; ==== Add netdev_rx_queue_restart(), which resets an rx queue using the queue API recently merged[1]. The queue API was merged to enable the core net stack reset individual rx queues to actuate changes in the rx queue's configuration. In later patches in this series, we will use netdev_rx_queue_restart() to reset rx queues after binding or unbinding dmabuf configuration, which will cause reallocation of the page_pool to repopulate its memory using the new configuration. [1] https://lore.kernel.org/netdev/20240430231420.699177-1-shailend@google.com/T/ ==== > But I personally feel that the interface here is a bit too complicated. In > particular, we also need to copy the rx struct memory, which means it is a > dangerous operation for many pointers. > Understood, but the complication is necessary based on previous discussions. Jakub requests that we must allocate memory for a new rx queues before bringing down the existing queue, to guard against the interface remaining down on ENOMEM error. Btw, I notice the series was marked as changes requested; the only feedback I got was this one and the incorrect netmem_priv.h header. I'll fix and repost. It's just slightly weird because both v16 and v17 are marked as changes requested in patchwork.
diff --git a/include/net/netdev_rx_queue.h b/include/net/netdev_rx_queue.h index aa1716fb0e53c..e78ca52d67fbf 100644 --- a/include/net/netdev_rx_queue.h +++ b/include/net/netdev_rx_queue.h @@ -54,4 +54,7 @@ get_netdev_rx_queue_index(struct netdev_rx_queue *queue) return index; } #endif + +int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq); + #endif diff --git a/net/core/Makefile b/net/core/Makefile index 62be9aef25285..f82232b358a2c 100644 --- a/net/core/Makefile +++ b/net/core/Makefile @@ -19,6 +19,7 @@ obj-$(CONFIG_NETDEV_ADDR_LIST_TEST) += dev_addr_lists_test.o obj-y += net-sysfs.o obj-y += hotdata.o +obj-y += netdev_rx_queue.o obj-$(CONFIG_PAGE_POOL) += page_pool.o page_pool_user.o obj-$(CONFIG_PROC_FS) += net-procfs.o obj-$(CONFIG_NET_PKTGEN) += pktgen.o diff --git a/net/core/netdev_rx_queue.c b/net/core/netdev_rx_queue.c new file mode 100644 index 0000000000000..da11720a59830 --- /dev/null +++ b/net/core/netdev_rx_queue.c @@ -0,0 +1,74 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +#include <linux/netdevice.h> +#include <net/netdev_queues.h> +#include <net/netdev_rx_queue.h> + +int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq_idx) +{ + void *new_mem, *old_mem; + int err; + + if (!dev->queue_mgmt_ops || !dev->queue_mgmt_ops->ndo_queue_stop || + !dev->queue_mgmt_ops->ndo_queue_mem_free || + !dev->queue_mgmt_ops->ndo_queue_mem_alloc || + !dev->queue_mgmt_ops->ndo_queue_start) + return -EOPNOTSUPP; + + ASSERT_RTNL(); + + new_mem = kvzalloc(dev->queue_mgmt_ops->ndo_queue_mem_size, GFP_KERNEL); + if (!new_mem) + return -ENOMEM; + + old_mem = kvzalloc(dev->queue_mgmt_ops->ndo_queue_mem_size, GFP_KERNEL); + if (!old_mem) { + err = -ENOMEM; + goto err_free_new_mem; + } + + err = dev->queue_mgmt_ops->ndo_queue_mem_alloc(dev, new_mem, rxq_idx); + if (err) + goto err_free_old_mem; + + err = dev->queue_mgmt_ops->ndo_queue_stop(dev, old_mem, rxq_idx); + if (err) + goto err_free_new_queue_mem; + + err = dev->queue_mgmt_ops->ndo_queue_start(dev, new_mem, rxq_idx); + if (err) + goto err_start_queue; + + dev->queue_mgmt_ops->ndo_queue_mem_free(dev, old_mem); + + kvfree(old_mem); + kvfree(new_mem); + + return 0; + +err_start_queue: + /* Restarting the queue with old_mem should be successful as we haven't + * changed any of the queue configuration, and there is not much we can + * do to recover from a failure here. + * + * WARN if we fail to recover the old rx queue, and at least free + * old_mem so we don't also leak that. + */ + if (dev->queue_mgmt_ops->ndo_queue_start(dev, old_mem, rxq_idx)) { + WARN(1, + "Failed to restart old queue in error path. RX queue %d may be unhealthy.", + rxq_idx); + dev->queue_mgmt_ops->ndo_queue_mem_free(dev, old_mem); + } + +err_free_new_queue_mem: + dev->queue_mgmt_ops->ndo_queue_mem_free(dev, new_mem); + +err_free_old_mem: + kvfree(old_mem); + +err_free_new_mem: + kvfree(new_mem); + + return err; +}