Message ID | 20230413090353.14448-1-liangchen.linux@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 0646dc31ca886693274df5749cd0c8c1eaaeb5ca |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [v5] skbuff: Fix a race between coalescing and releasing SKBs | expand |
On 2023/4/13 17:03, Liang Chen wrote: > Commit 1effe8ca4e34 ("skbuff: fix coalescing for page_pool fragment > recycling") allowed coalescing to proceed with non page pool page and page > pool page when @from is cloned, i.e. > > to->pp_recycle --> false > from->pp_recycle --> true > skb_cloned(from) --> true > > However, it actually requires skb_cloned(@from) to hold true until > coalescing finishes in this situation. If the other cloned SKB is > released while the merging is in process, from_shinfo->nr_frags will be > set to 0 toward the end of the function, causing the increment of frag > page _refcount to be unexpectedly skipped resulting in inconsistent > reference counts. Later when SKB(@to) is released, it frees the page > directly even though the page pool page is still in use, leading to > use-after-free or double-free errors. So it should be prohibited. > > The double-free error message below prompted us to investigate: > BUG: Bad page state in process swapper/1 pfn:0e0d1 > page:00000000c6548b28 refcount:-1 mapcount:0 mapping:0000000000000000 > index:0x2 pfn:0xe0d1 > flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff) > raw: 000fffffc0000000 0000000000000000 ffffffff00000101 0000000000000000 > raw: 0000000000000002 0000000000000000 ffffffffffffffff 0000000000000000 > page dumped because: nonzero _refcount > > CPU: 1 PID: 0 Comm: swapper/1 Tainted: G E 6.2.0+ > Call Trace: > <IRQ> > dump_stack_lvl+0x32/0x50 > bad_page+0x69/0xf0 > free_pcp_prepare+0x260/0x2f0 > free_unref_page+0x20/0x1c0 > skb_release_data+0x10b/0x1a0 > napi_consume_skb+0x56/0x150 > net_rx_action+0xf0/0x350 > ? __napi_schedule+0x79/0x90 > __do_softirq+0xc8/0x2b1 > __irq_exit_rcu+0xb9/0xf0 > common_interrupt+0x82/0xa0 > </IRQ> > <TASK> > asm_common_interrupt+0x22/0x40 > RIP: 0010:default_idle+0xb/0x20 > > Fixes: 53e0961da1c7 ("page_pool: add frag page recycling support in page pool") I am not quite sure the above is right Fixes tag. As 1effe8ca4e34 ("skbuff: fix coalescing for page_pool fragment recycling") has tried to fix it, and it missed the case this patch is fixing, so we need another fix here.
On Thu, Apr 13, 2023 at 1:42 PM Yunsheng Lin <linyunsheng@huawei.com> wrote: > > On 2023/4/13 17:03, Liang Chen wrote: > > Fixes: 53e0961da1c7 ("page_pool: add frag page recycling support in page pool") > > I am not quite sure the above is right Fixes tag. > As 1effe8ca4e34 ("skbuff: fix coalescing for page_pool fragment recycling") has tried > to fix it, and it missed the case this patch is fixing, so we need another fix here. The Fixes: tag is more about pointing to stable teams how to deal with backports. There is no point giving the full chain, because this 'blamed' commit is enough. If an old kernel does not contain this commit, there is no point trying harder.
On Thu, 13 Apr 2023 17:03:53 +0800 Liang Chen wrote: > Commit 1effe8ca4e34 ("skbuff: fix coalescing for page_pool fragment > recycling") allowed coalescing to proceed with non page pool page and page > pool page when @from is cloned, i.e. I think Alex is out for spring celebrations so let me take this in for today's PR. Thanks!
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Thu, 13 Apr 2023 17:03:53 +0800 you wrote: > Commit 1effe8ca4e34 ("skbuff: fix coalescing for page_pool fragment > recycling") allowed coalescing to proceed with non page pool page and page > pool page when @from is cloned, i.e. > > to->pp_recycle --> false > from->pp_recycle --> true > skb_cloned(from) --> true > > [...] Here is the summary with links: - [v5] skbuff: Fix a race between coalescing and releasing SKBs https://git.kernel.org/netdev/net/c/0646dc31ca88 You are awesome, thank you!
On 2023/4/13 21:45, Eric Dumazet wrote: > On Thu, Apr 13, 2023 at 1:42 PM Yunsheng Lin <linyunsheng@huawei.com> wrote: >> >> On 2023/4/13 17:03, Liang Chen wrote: > >>> Fixes: 53e0961da1c7 ("page_pool: add frag page recycling support in page pool") >> >> I am not quite sure the above is right Fixes tag. >> As 1effe8ca4e34 ("skbuff: fix coalescing for page_pool fragment recycling") has tried >> to fix it, and it missed the case this patch is fixing, so we need another fix here. > > The Fixes: tag is more about pointing to stable teams how to deal with > backports. > There is no point giving the full chain, because this 'blamed' commit is enough. > > If an old kernel does not contain this commit, there is no point trying harder. In that case, it may be better to point to 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling") too. As even without fragment support, for below case: to->pp_recycle --> true from->pp_recycle --> true skb_cloned(from) --> true It seems some page may be called with page_pool_release_page() twice if 'to' and cloned skb of 'from' are freed concurrently, as there is data race between checking page->pp_magic and clearing page->pp_magic. Anyway, as it is merged already, I guess it is not really matter anymore. > . >
diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 050a875d09c5..7b83410bbaae 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -5597,18 +5597,18 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from, if (skb_cloned(to)) return false; - /* In general, avoid mixing slab allocated and page_pool allocated - * pages within the same SKB. However when @to is not pp_recycle and - * @from is cloned, we can transition frag pages from page_pool to - * reference counted. - * - * On the other hand, don't allow coalescing two pp_recycle SKBs if - * @from is cloned, in case the SKB is using page_pool fragment - * references (PP_FLAG_PAGE_FRAG). Since we only take full page - * references for cloned SKBs at the moment that would result in + /* In general, avoid mixing page_pool and non-page_pool allocated + * pages within the same SKB. Additionally avoid dealing with clones + * containing page_pool pages, in case the SKB is using page_pool + * fragment references (PP_FLAG_PAGE_FRAG). Since we only take full + * page references for cloned SKBs at the moment that would result in * inconsistent reference counts. + * In theory we could take full references if from is cloned and + * !@to->pp_recycle but its tricky (due to potential race with the clone + * disappearing) and rare, so not worth dealing with. */ - if (to->pp_recycle != (from->pp_recycle && !skb_cloned(from))) + if (to->pp_recycle != from->pp_recycle || + (from->pp_recycle && skb_cloned(from))) return false; if (len <= skb_tailroom(to)) {