Message ID | 20210322154329.340048-1-atenart@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Commit | 7f08ec6e04269ce53b664761c9108b44ed2f54ab |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next] net-sysfs: remove possible sleep from an RCU read-side critical section | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net-next |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | warning | 3 maintainers not CCed: alexanderduyck@fb.com edumazet@google.com weiwan@google.com |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 1 this patch: 1 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 8 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 1 this patch: 1 |
netdev/header_inline | success | Link |
On Mon, Mar 22, 2021 at 04:43:29PM +0100, Antoine Tenart wrote: > xps_queue_show is mostly made of an RCU read-side critical section and > calls bitmap_zalloc with GFP_KERNEL in the middle of it. That is not > allowed as this call may sleep and such behaviours aren't allowed in RCU > read-side critical sections. Fix this by using GFP_NOWAIT instead. This would be another way of fixing the problem that is slightly less complex than my initial proposal, but does allow for using GFP_KERNEL for fewer failures: @@ -1366,11 +1366,10 @@ static ssize_t xps_queue_show(struct net_device *dev, unsigned int index, { struct xps_dev_maps *dev_maps; unsigned long *mask; - unsigned int nr_ids; + unsigned int nr_ids, new_nr_ids; int j, len; - rcu_read_lock(); - dev_maps = rcu_dereference(dev->xps_maps[type]); + dev_maps = READ_ONCE(dev->xps_maps[type]); /* Default to nr_cpu_ids/dev->num_rx_queues and do not just return 0 * when dev_maps hasn't been allocated yet, to be backward compatible. @@ -1379,10 +1378,18 @@ static ssize_t xps_queue_show(struct net_device *dev, unsigned int index, (type == XPS_CPUS ? nr_cpu_ids : dev->num_rx_queues); mask = bitmap_zalloc(nr_ids, GFP_KERNEL); - if (!mask) { - rcu_read_unlock(); + if (!mask) return -ENOMEM; - } + + rcu_read_lock(); + dev_maps = rcu_dereference(dev->xps_maps[type]); + /* if nr_ids shrank in the meantime, do not overrun array. + * if it increased, we just won't show the new ones + */ + new_nr_ids = dev_maps ? dev_maps->nr_ids : + (type == XPS_CPUS ? nr_cpu_ids : dev->num_rx_queues); + if (new_nr_ids < nr_ids) + nr_ids = new_nr_ids; if (!dev_maps || tc >= dev_maps->num_tc) goto out_no_maps; (or do we need the rcu read lock to read dev->num_rcx_queues? i'm assuming we only need it to read the xps_maps array)
Quoting Matthew Wilcox (2021-03-22 17:54:39) > On Mon, Mar 22, 2021 at 04:43:29PM +0100, Antoine Tenart wrote: > > xps_queue_show is mostly made of an RCU read-side critical section and > > calls bitmap_zalloc with GFP_KERNEL in the middle of it. That is not > > allowed as this call may sleep and such behaviours aren't allowed in RCU > > read-side critical sections. Fix this by using GFP_NOWAIT instead. > > This would be another way of fixing the problem that is slightly less > complex than my initial proposal, but does allow for using GFP_KERNEL > for fewer failures: > > @@ -1366,11 +1366,10 @@ static ssize_t xps_queue_show(struct net_device *dev, unsigned int index, > { > struct xps_dev_maps *dev_maps; > unsigned long *mask; > - unsigned int nr_ids; > + unsigned int nr_ids, new_nr_ids; > int j, len; > > - rcu_read_lock(); > - dev_maps = rcu_dereference(dev->xps_maps[type]); > + dev_maps = READ_ONCE(dev->xps_maps[type]); Couldn't dev_maps be freed between here and the read of dev_maps->nr_ids as we're not in an RCU read-side critical section? > /* Default to nr_cpu_ids/dev->num_rx_queues and do not just return 0 > * when dev_maps hasn't been allocated yet, to be backward compatible. > @@ -1379,10 +1378,18 @@ static ssize_t xps_queue_show(struct net_device *dev, unsigned int index, > (type == XPS_CPUS ? nr_cpu_ids : dev->num_rx_queues); > > mask = bitmap_zalloc(nr_ids, GFP_KERNEL); > - if (!mask) { > - rcu_read_unlock(); > + if (!mask) > return -ENOMEM; > - } > + > + rcu_read_lock(); > + dev_maps = rcu_dereference(dev->xps_maps[type]); > + /* if nr_ids shrank in the meantime, do not overrun array. > + * if it increased, we just won't show the new ones > + */ > + new_nr_ids = dev_maps ? dev_maps->nr_ids : > + (type == XPS_CPUS ? nr_cpu_ids : dev->num_rx_queues); > + if (new_nr_ids < nr_ids) > + nr_ids = new_nr_ids; > > if (!dev_maps || tc >= dev_maps->num_tc) > goto out_no_maps; My feeling is there is not much value in having a tricky allocation logic for reads from xps_cpus and xps_rxqs. While we could come up with something, returning -ENOMEM on memory pressure should be fine. Antoine
On Mon, Mar 22, 2021 at 06:41:30PM +0100, Antoine Tenart wrote: > Quoting Matthew Wilcox (2021-03-22 17:54:39) > > - rcu_read_lock(); > > - dev_maps = rcu_dereference(dev->xps_maps[type]); > > + dev_maps = READ_ONCE(dev->xps_maps[type]); > > Couldn't dev_maps be freed between here and the read of dev_maps->nr_ids > as we're not in an RCU read-side critical section? Oh, good point. Never mind, then. > My feeling is there is not much value in having a tricky allocation > logic for reads from xps_cpus and xps_rxqs. While we could come up with > something, returning -ENOMEM on memory pressure should be fine. That's fine. It's your code, and this is probably a small allocation anyway.
Quoting Antoine Tenart (2021-03-22 18:41:30) > Quoting Matthew Wilcox (2021-03-22 17:54:39) > > On Mon, Mar 22, 2021 at 04:43:29PM +0100, Antoine Tenart wrote: > > > xps_queue_show is mostly made of an RCU read-side critical section and > > > calls bitmap_zalloc with GFP_KERNEL in the middle of it. That is not > > > allowed as this call may sleep and such behaviours aren't allowed in RCU > > > read-side critical sections. Fix this by using GFP_NOWAIT instead. > > > > This would be another way of fixing the problem that is slightly less > > complex than my initial proposal, but does allow for using GFP_KERNEL > > for fewer failures: > > > > @@ -1366,11 +1366,10 @@ static ssize_t xps_queue_show(struct net_device *dev, unsigned int index, > > { > > struct xps_dev_maps *dev_maps; > > unsigned long *mask; > > - unsigned int nr_ids; > > + unsigned int nr_ids, new_nr_ids; > > int j, len; > > > > - rcu_read_lock(); > > - dev_maps = rcu_dereference(dev->xps_maps[type]); > > + dev_maps = READ_ONCE(dev->xps_maps[type]); > > Couldn't dev_maps be freed between here and the read of dev_maps->nr_ids > as we're not in an RCU read-side critical section? * The first read of dev_maps->nr_ids, happening before rcu_read_lock, not the one shown below. > > /* Default to nr_cpu_ids/dev->num_rx_queues and do not just return 0 > > * when dev_maps hasn't been allocated yet, to be backward compatible. > > @@ -1379,10 +1378,18 @@ static ssize_t xps_queue_show(struct net_device *dev, unsigned int index, > > (type == XPS_CPUS ? nr_cpu_ids : dev->num_rx_queues); > > > > mask = bitmap_zalloc(nr_ids, GFP_KERNEL); > > - if (!mask) { > > - rcu_read_unlock(); > > + if (!mask) > > return -ENOMEM; > > - } > > + > > + rcu_read_lock(); > > + dev_maps = rcu_dereference(dev->xps_maps[type]); > > + /* if nr_ids shrank in the meantime, do not overrun array. > > + * if it increased, we just won't show the new ones > > + */ > > + new_nr_ids = dev_maps ? dev_maps->nr_ids : > > + (type == XPS_CPUS ? nr_cpu_ids : dev->num_rx_queues); > > + if (new_nr_ids < nr_ids) > > + nr_ids = new_nr_ids;
Quoting Matthew Wilcox (2021-03-22 18:44:21) > On Mon, Mar 22, 2021 at 06:41:30PM +0100, Antoine Tenart wrote: > > Quoting Matthew Wilcox (2021-03-22 17:54:39) > > > - rcu_read_lock(); > > > - dev_maps = rcu_dereference(dev->xps_maps[type]); > > > + dev_maps = READ_ONCE(dev->xps_maps[type]); > > > > Couldn't dev_maps be freed between here and the read of dev_maps->nr_ids > > as we're not in an RCU read-side critical section? > > Oh, good point. Never mind, then. > > > My feeling is there is not much value in having a tricky allocation > > logic for reads from xps_cpus and xps_rxqs. While we could come up with > > something, returning -ENOMEM on memory pressure should be fine. > > That's fine. It's your code, and this is probably a small allocation > anyway. All right. Thanks for the suggestions anyway! Antoine
Hello: This patch was applied to netdev/net-next.git (refs/heads/master): On Mon, 22 Mar 2021 16:43:29 +0100 you wrote: > xps_queue_show is mostly made of an RCU read-side critical section and > calls bitmap_zalloc with GFP_KERNEL in the middle of it. That is not > allowed as this call may sleep and such behaviours aren't allowed in RCU > read-side critical sections. Fix this by using GFP_NOWAIT instead. > > Fixes: 5478fcd0f483 ("net: embed nr_ids in the xps maps") > Reported-by: kernel test robot <oliver.sang@intel.com> > Suggested-by: Matthew Wilcox <willy@infradead.org> > Signed-off-by: Antoine Tenart <atenart@kernel.org> > > [...] Here is the summary with links: - [net-next] net-sysfs: remove possible sleep from an RCU read-side critical section https://git.kernel.org/netdev/net-next/c/7f08ec6e0426 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index 562a42fcd437..f6197774048b 100644 --- a/net/core/net-sysfs.c +++ b/net/core/net-sysfs.c @@ -1378,7 +1378,7 @@ static ssize_t xps_queue_show(struct net_device *dev, unsigned int index, nr_ids = dev_maps ? dev_maps->nr_ids : (type == XPS_CPUS ? nr_cpu_ids : dev->num_rx_queues); - mask = bitmap_zalloc(nr_ids, GFP_KERNEL); + mask = bitmap_zalloc(nr_ids, GFP_NOWAIT); if (!mask) { rcu_read_unlock(); return -ENOMEM;
xps_queue_show is mostly made of an RCU read-side critical section and calls bitmap_zalloc with GFP_KERNEL in the middle of it. That is not allowed as this call may sleep and such behaviours aren't allowed in RCU read-side critical sections. Fix this by using GFP_NOWAIT instead. Fixes: 5478fcd0f483 ("net: embed nr_ids in the xps maps") Reported-by: kernel test robot <oliver.sang@intel.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Antoine Tenart <atenart@kernel.org> --- Fix sent to net-next as it fixes an issue only in net-next. net/core/net-sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)