[RFC,net-next,v2,7/7] net: skbuff: always try to recycle PP pages directly when in softirq

Message ID 20230714170853.866018-10-aleksander.lobakin@intel.com (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Series net: page_pool: a couple of assorted optimizations

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next, async
netdev/apply fail Patch does not apply to net-next

Commit Message

Alexander Lobakin July 14, 2023, 5:08 p.m. UTC
Commit 8c48eea3adf3 ("page_pool: allow caching from safely localized
NAPI") allowed direct recycling of skb pages to their PP for some cases,
but unfortunately missed a couple of other major ones.
For example, %XDP_DROP in skb mode. The netstack just calls kfree_skb(),
which unconditionally passes `false` as @napi_safe. Thus, all pages go
through the ptr_ring and locks, although most of the time we're actually
inside the NAPI poll loop this PP is linked with, so it would be
perfectly safe to recycle pages directly.
Let's address such cases. If @napi_safe is true, we're fine; don't
change anything for this path. But if it's false, check whether we are
in softirq context. It will most likely be so, and then, if ->list_owner
is our current CPU, we're good to use direct recycling, even though
@napi_safe is false -- concurrent access is excluded. The in_softirq()
check is needed mostly because we can hit this place from process
context (not from hardirq, though).
For the mentioned xdp-drop-skb-mode case, the improvement I got is
3-4% in Mpps. As for page_pool stats, recycle_ring is now 0 and the
alloc_slow counter doesn't change most of the time, which means the
MM layer is not even called to allocate any new pages.

Suggested-by: Jakub Kicinski <kuba@kernel.org> # in_softirq()
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
 net/core/skbuff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
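
For context, the resulting check in page_pool_return_skb_page() would
look roughly as follows. This is a sketch assembled from the diff at the
bottom of this page; the truncated allow_direct assignment is filled in
based on commit 8c48eea3adf3 and may not match the tree verbatim:

	/* Allow direct recycle if we have reasons to believe that we are
	 * in the same context as the consumer would run, so there's
	 * no possible race.
	 */
	if (napi_safe || in_softirq()) {
		const struct napi_struct *napi = READ_ONCE(pp->p.napi);

		/* ->list_owner is the CPU currently polling this NAPI
		 * (-1 when idle), so a match excludes concurrent access.
		 */
		allow_direct = napi &&
			READ_ONCE(napi->list_owner) == smp_processor_id();
	}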

Comments

Jakub Kicinski July 19, 2023, 12:40 a.m. UTC | #1
On Fri, 14 Jul 2023 19:08:52 +0200 Alexander Lobakin wrote:
> Suggested-by: Jakub Kicinski <kuba@kernel.org> # in_softirq()

I thought I said something along the lines of "if this is safe you can
as well" which falls short of a suggestion, cause I don't think it is
safe :)

> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index fc1470aab5cf..1c22fd33be6c 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -902,7 +902,7 @@ bool page_pool_return_skb_page(struct page *page, bool napi_safe)
>  	 * in the same context as the consumer would run, so there's
>  	 * no possible race.
>  	 */
> -	if (napi_safe) {
> +	if (napi_safe || in_softirq()) {
>  		const struct napi_struct *napi = READ_ONCE(pp->p.napi);
>  
>  		allow_direct = napi &&

What if we got here from netpoll? napi budget was 0, so napi_safe is
false, but in_softirq() can be true or false.

XDP SKB is a toy, I really don't think 3-4% in XDP SKB warrants the
risk here.
Alexander Lobakin July 19, 2023, 4:34 p.m. UTC | #2
From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 18 Jul 2023 17:40:42 -0700

> On Fri, 14 Jul 2023 19:08:52 +0200 Alexander Lobakin wrote:
>> Suggested-by: Jakub Kicinski <kuba@kernel.org> # in_softirq()
> 
> I thought I said something along the lines of "if this is safe you can
> as well" which falls short of a suggestion, cause I don't think it is
> safe :)

Ok, I'll drop the tag if you insist :D

> 
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index fc1470aab5cf..1c22fd33be6c 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -902,7 +902,7 @@ bool page_pool_return_skb_page(struct page *page, bool napi_safe)
>>  	 * in the same context as the consumer would run, so there's
>>  	 * no possible race.
>>  	 */
>> -	if (napi_safe) {
>> +	if (napi_safe || in_softirq()) {
>>  		const struct napi_struct *napi = READ_ONCE(pp->p.napi);
>>  
>>  		allow_direct = napi &&
> 
> What if we got here from netpoll? napi budget was 0, so napi_safe is
> false, but in_softirq() can be true or false.

If we're on the same CPU where the NAPI would run and in the same
context, i.e. softirq, in which the NAPI would run, what is the problem?
If there really is a good one, I can handle it here.

> 
> XDP SKB is a toy, I really don't think 3-4% in XDP SKB warrants the
> risk here.

It affects not only XDP skb, I told ya already :D @napi_safe is very
conservative and misses a good bunch of stuff.
I really don't think that very rare corner cases (which can be handled
if needed) warrant the perf miss here.

Thanks,
Olek
Jakub Kicinski July 19, 2023, 8:51 p.m. UTC | #3
On Wed, 19 Jul 2023 18:34:46 +0200 Alexander Lobakin wrote:
> > What if we got here from netpoll? napi budget was 0, so napi_safe is
> > false, but in_softirq() can be true or false.  
> 
> If we're on the same CPU where the NAPI would run and in the same
> context, i.e. softirq, in which the NAPI would run, what is the problem?
> If there really is a good one, I can handle it here.

#define SOFTIRQ_BITS		8
#define SOFTIRQ_MASK		(__IRQ_MASK(SOFTIRQ_BITS) << SOFTIRQ_SHIFT)
# define softirq_count()	(preempt_count() & SOFTIRQ_MASK)
#define in_softirq()		(softirq_count())

I don't know what else to add beyond that and the earlier explanation.

AFAIK pages as allocated by page pool do not benefit from the usual
KASAN / KMSAN checkers, so if we were to double-recycle a page once
a day because of a netcons race - it's going to be a month long debug
for those of us using Linux in production.
Alexander Lobakin July 20, 2023, 4:46 p.m. UTC | #4
From: Jakub Kicinski <kuba@kernel.org>
Date: Wed, 19 Jul 2023 13:51:50 -0700

> On Wed, 19 Jul 2023 18:34:46 +0200 Alexander Lobakin wrote:
>>> What if we got here from netpoll? napi budget was 0, so napi_safe is
>>> false, but in_softirq() can be true or false.  
>>
>> If we're on the same CPU where the NAPI would run and in the same
>> context, i.e. softirq, in which the NAPI would run, what is the problem?
>> If there really is a good one, I can handle it here.
> 
> #define SOFTIRQ_BITS		8
> #define SOFTIRQ_MASK		(__IRQ_MASK(SOFTIRQ_BITS) << SOFTIRQ_SHIFT)
> # define softirq_count()	(preempt_count() & SOFTIRQ_MASK)
> #define in_softirq()		(softirq_count())

I do remember those, don't worry :)

> 
> I don't know what else to add beyond that and the earlier explanation.

My question was "how can two things race on one CPU in one context if it
implies they won't ever happen simultaneously", but maybe my zero
knowledge of netcons hides something from me.

> 
> AFAIK pages as allocated by page pool do not benefit from the usual
> KASAN / KMSAN checkers, so if we were to double-recycle a page once
> a day because of a netcons race - it's going to be a month long debug
> for those of us using Linux in production.

if (!test_bit(NAPI_STATE_NPSVC, &napi->state))

? It would mean we're not netpolling.
Otherwise, if this still is not enough, I'd go back to my v1 approach
with having a NAPI flag, which would tell for sure we're good to go. I
got confused by your "wouldn't just checking for softirq be enough"! T.T
Joking :D

Thanks,
Olek
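
For context, spelled out against the current check, the NPSVC test
suggested above would look roughly like this -- a sketch, untested, with
the flag name written in full:

	/* Refuse direct recycling while netpoll is servicing the NAPI
	 * instance (NAPI_STATE_NPSVC set), even if everything else says
	 * we're in the right softirq on the right CPU.
	 */
	allow_direct = napi &&
		       READ_ONCE(napi->list_owner) == smp_processor_id() &&
		       !test_bit(NAPI_STATE_NPSVC, &napi->state);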
Jakub Kicinski July 20, 2023, 5:12 p.m. UTC | #5
On Thu, 20 Jul 2023 18:46:02 +0200 Alexander Lobakin wrote:
> From: Jakub Kicinski <kuba@kernel.org>
> Date: Wed, 19 Jul 2023 13:51:50 -0700
> 
> > On Wed, 19 Jul 2023 18:34:46 +0200 Alexander Lobakin wrote:  
>  [...]  
> >>
> >> If we're on the same CPU where the NAPI would run and in the same
> >> context, i.e. softirq, in which the NAPI would run, what is the problem?
> >> If there really is a good one, I can handle it here.  
> > 
> > #define SOFTIRQ_BITS		8
> > #define SOFTIRQ_MASK		(__IRQ_MASK(SOFTIRQ_BITS) << SOFTIRQ_SHIFT)
> > # define softirq_count()	(preempt_count() & SOFTIRQ_MASK)
> > #define in_softirq()		(softirq_count())  
> 
> I do remember those, don't worry :)
> 
> > I don't know what else to add beyond that and the earlier explanation.  
> 
> My question was "how can two things race on one CPU in one context if it
> implies they won't ever happen simultaneously", but maybe my zero
> knowledge of netcons hides something from me.

One of them is in hardirq.

> > AFAIK pages as allocated by page pool do not benefit from the usual
> > KASAN / KMSAN checkers, so if we were to double-recycle a page once
> > a day because of a netcons race - it's going to be a month long debug
> > for those of us using Linux in production.  
> 
> if (!test_bit(NAPI_STATE_NPSVC, &napi->state))

if you have to, the right check is !in_hardirq()

> ? It would mean we're not netpolling.
> Otherwise, if this still is not enough, I'd go back to my v1 approach
> with having a NAPI flag, which would tell for sure we're good to go. I
> got confused by your "wouldn't just checking for softirq be enough"! T.T
> Joking :D

I guess the problem I'm concerned about can already happen.
I'll send a lockdep annotation shortly.
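
The annotation mentioned here was posted separately as "page_pool: add a
lockdep check for recycling in hardirq". A minimal sketch of the idea,
using an illustrative helper name rather than the exact merged code:

	/* Catch callers recycling page-pool pages from hard IRQ context
	 * or with IRQs disabled -- neither is supported by page pool.
	 */
	static inline void page_pool_assert_recycle_context(void)
	{
		if (IS_ENABLED(CONFIG_DEBUG_NET))
			WARN_ON_ONCE(in_hardirq() || irqs_disabled());
	}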
Alexander Lobakin July 20, 2023, 5:48 p.m. UTC | #6
From: Jakub Kicinski <kuba@kernel.org>
Date: Thu, 20 Jul 2023 10:12:31 -0700

> On Thu, 20 Jul 2023 18:46:02 +0200 Alexander Lobakin wrote:
>> From: Jakub Kicinski <kuba@kernel.org>
>> Date: Wed, 19 Jul 2023 13:51:50 -0700
>>
>>> On Wed, 19 Jul 2023 18:34:46 +0200 Alexander Lobakin wrote:  
>>  [...]  
>>>>
>>>> If we're on the same CPU where the NAPI would run and in the same
>>>> context, i.e. softirq, in which the NAPI would run, what is the problem?
>>>> If there really is a good one, I can handle it here.  
>>>
>>> #define SOFTIRQ_BITS		8
>>> #define SOFTIRQ_MASK		(__IRQ_MASK(SOFTIRQ_BITS) << SOFTIRQ_SHIFT)
>>> # define softirq_count()	(preempt_count() & SOFTIRQ_MASK)
>>> #define in_softirq()		(softirq_count())  
>>
>> I do remember those, don't worry :)
>>
>>> I don't know what else to add beyond that and the earlier explanation.  
>>
>> My question was "how can two things race on one CPU in one context if it
>> implies they won't ever happen simultaneously", but maybe my zero
>> knowledge of netcons hides something from me.
> 
> One of them is in hardirq.

If I got your message correctly, that means softirq_count() can return
`true` even if we're in hardirq context, but there are some softirqs
pending? I.e. if I call local_irq_save() inside NAPI poll loop,
in_softirq() will still return `true`? (I'll check it myself in a bit,
but why not ask).
Isn't checking for `interrupt_context_level() == 1` more appropriate
then? Page Pool core code also uses in_softirq(), as well as a hellaton
of other networking-related places.


> 
>>> AFAIK pages as allocated by page pool do not benefit from the usual
>>> KASAN / KMSAN checkers, so if we were to double-recycle a page once
>>> a day because of a netcons race - it's going to be a month long debug
>>> for those of us using Linux in production.  
>>
>> if (!test_bit(NAPI_STATE_NPSVC, &napi->state))
> 
> if you have to, the right check is !in_hardirq()
> 
>> ? It would mean we're not netpolling.
>> Otherwise, if this still is not enough, I'd go back to my v1 approach
>> with having a NAPI flag, which would tell for sure we're good to go. I
>> got confused by your "wouldn't just checking for softirq be enough"! T.T
>> Joking :D
> 
> I guess the problem I'm concerned about can already happen.
> I'll send a lockdep annotation shortly.

Interesten.

Thanks,
Olek
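
For reference, interrupt_context_level() is defined in
include/linux/preempt.h and, at the time of this thread, read roughly:

	static __always_inline unsigned char interrupt_context_level(void)
	{
		unsigned long pc = preempt_count();
		unsigned char level = 0;

		level += !!(pc & (NMI_MASK));
		level += !!(pc & (NMI_MASK | HARDIRQ_MASK));
		level += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET));

		return level;
	}

It derives everything from preempt_count(), so it cannot see IRQs
disabled manually via local_irq_save() -- which is exactly what the
experiment further down the thread demonstrates.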
Jakub Kicinski July 20, 2023, 6 p.m. UTC | #7
On Thu, 20 Jul 2023 19:48:06 +0200 Alexander Lobakin wrote:
> >> My question was "how can two things race on one CPU in one context if it
> >> implies they won't ever happen simultaneously", but maybe my zero
> >> knowledge of netcons hides something from me.  
> > 
> > One of them is in hardirq.  
> 
> If I got your message correctly, that means softirq_count() can return
> `true` even if we're in hardirq context, but there are some softirqs
> pending? 

Not pending, being executed. Hardirq can come during softirq.

> I.e. if I call local_irq_save() inside NAPI poll loop,
> in_softirq() will still return `true`? (I'll check it myself in a bit,
> but why not ask).

Yes.

> Isn't checking for `interrupt_context_level() == 1` more appropriate
> then? Page Pool core code also uses in_softirq(), as well as a hellaton
> of other networking-related places.

Right now page pool only supports BH and process contexts. IOW the
"else" branch of if (in_softirq()) in page pool is expecting to be
in process context.

Supporting hard irq would mean we need to switch to _irqsave() locking.
That's likely way too costly.

Or stash the freed pages away and free them lazily.

Or add a lockdep warning and hope nobody will ever free a page-pool
backed skb from hard IRQ context :)
Alexander Lobakin July 20, 2023, 6:01 p.m. UTC | #8
From: Jakub Kicinski <kuba@kernel.org>
Date: Thu, 20 Jul 2023 11:00:27 -0700

> On Thu, 20 Jul 2023 19:48:06 +0200 Alexander Lobakin wrote:
>>>> My question was "how can two things race on one CPU in one context if it
>>>> implies they won't ever happen simultaneously", but maybe my zero
>>>> knowledge of netcons hides something from me.  
>>>
>>> One of them is in hardirq.  
>>
>> If I got your message correctly, that means softirq_count() can return
>> `true` even if we're in hardirq context, but there are some softirqs
>> pending? 
> 
> Not pending, being executed. Hardirq can come during softirq.
> 
>> I.e. if I call local_irq_save() inside NAPI poll loop,
>> in_softirq() will still return `true`? (I'll check it myself in a bit,
>> but why not ask).
> 
> Yes.
> 
>> Isn't checking for `interrupt_context_level() == 1` more appropriate
>> then? Page Pool core code also uses in_softirq(), as well as a hellaton
>> of other networking-related places.
> 
> Right now page pool only supports BH and process contexts. IOW the
> "else" branch of if (in_softirq()) in page pool is expecting to be
> in process context.
> 
> Supporting hard irq would mean we need to switch to _irqsave() locking.
> That's likely way too costly.
> 
> Or stash the freed pages away and free them lazily.
> 
> Or add a lockdep warning and hope nobody will ever free a page-pool
> backed skb from hard IRQ context :)

I told you under the previous version that this function is not supposed
to be called under hardirq context, so we don't need to check for it :D
But I was assuming nobody would try to do that. Seems that's not really
the case (netcons), if you want to sanitize this...

Thanks,
Olek
Alexander Lobakin July 20, 2023, 6:13 p.m. UTC | #9
From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Thu, 20 Jul 2023 20:01:28 +0200

> From: Jakub Kicinski <kuba@kernel.org>
> Date: Thu, 20 Jul 2023 11:00:27 -0700
> 
>> On Thu, 20 Jul 2023 19:48:06 +0200 Alexander Lobakin wrote:
>>>>> My question was "how can two things race on one CPU in one context if it
>>>>> implies they won't ever happen simultaneously", but maybe my zero
>>>>> knowledge of netcons hides something from me.  
>>>>
>>>> One of them is in hardirq.  
>>>
>>> If I got your message correctly, that means softirq_count() can return
>>> `true` even if we're in hardirq context, but there are some softirqs
>>> pending? 
>>
>> Not pending, being executed. Hardirq can come during softirq.
>>
>>> I.e. if I call local_irq_save() inside NAPI poll loop,
>>> in_softirq() will still return `true`? (I'll check it myself in a bit,
>>> but why not ask).
>>
>> Yes.

My code, run from the NAPI poll loop:

	unsigned long flags;	/* for local_irq_save() below */

	pr_info("BH. irq_count(): %lu", irq_count());
	pr_info("BH. in_hardirq(): %lu", in_hardirq());
	pr_info("BH. in_softirq(): %lu", in_softirq());
	pr_info("BH. in_serving_softirq(): %lu", in_serving_softirq());
	pr_info("BH. interrupt_context_level(): %u",
		interrupt_context_level());

	local_irq_save(flags);
	pr_info("TH. irq_count(): %lu", irq_count());
	pr_info("TH. in_hardirq(): %lu", in_hardirq());
	pr_info("TH. in_softirq(): %lu", in_softirq());
	pr_info("TH. in_serving_softirq(): %lu", in_serving_softirq());
	pr_info("TH. interrupt_context_level(): %u",
		interrupt_context_level());
	local_irq_restore(flags);

The result:

BH. irq_count(): 256
BH. in_hardirq(): 0
BH. in_softirq(): 256
BH. in_serving_softirq(): 256
BH. interrupt_context_level(): 1
TH. irq_count(): 256
TH. in_hardirq(): 0
TH. in_softirq(): 256
TH. in_serving_softirq(): 256
TH. interrupt_context_level(): 1

IOW, it reports we're in softirq no bloody matter if interrupts are
enabled or not. Either I did something wrong or the entire in_*irq()
family, including interrupt_context_level(), doesn't protect from
anything at all and doesn't work the way that most devs expect it to work?

(or was it just me? :D)

I guess the only way to be sure is to always check irqs_disabled() when
in_softirq() returns true.

>>
>>> Isn't checking for `interrupt_context_level() == 1` more appropriate
>>> then? Page Pool core code also uses in_softirq(), as well as a hellaton
>>> of other networking-related places.
>>
>> Right now page pool only supports BH and process contexts. IOW the
>> "else" branch of if (in_softirq()) in page pool is expecting to be
>> in process context.
>>
>> Supporting hard irq would mean we need to switch to _irqsave() locking.
>> That's likely way too costly.
>>
>> Or stash the freed pages away and free them lazily.
>>
>> Or add a lockdep warning and hope nobody will ever free a page-pool
>> backed skb from hard IRQ context :)
> 
> I told you under the previous version that this function is not supposed
> to be called under hardirq context, so we don't need to check for it :D
> But I was assuming nobody would try to do that. Seems that's not really
> the case (netcons), if you want to sanitize this...
> 
> Thanks,
> Olek

Thanks,
Olek
Jakub Kicinski July 20, 2023, 7:20 p.m. UTC | #10
On Thu, 20 Jul 2023 20:13:07 +0200 Alexander Lobakin wrote:
> IOW, it reports we're in softirq no bloody matter if interrupts are
> enabled or not. Either I did something wrong or the entire in_*irq()
> family, including interrupt_context_level(), doesn't protect from
> anything at all and doesn't work the way that most devs expect it to work?
> 
> (or was it just me? :D)
> 
> I guess the only way to be sure is to always check irqs_disabled() when
> in_softirq() returns true.

We can as well check
	(in_softirq() && !irqs_disabled() && !in_hardirq())
?

The interrupt_context_level() thing is fairly new, I think.
Who knows what happens to it going forward...

> >> Right now page pool only supports BH and process contexts. IOW the
> >> "else" branch of if (in_softirq()) in page pool is expecting to be
> >> in process context.
> >>
> >> Supporting hard irq would mean we need to switch to _irqsave() locking.
> >> That's likely way too costly.
> >>
> >> Or stash the freed pages away and free them lazily.
> >>
> >> Or add a lockdep warning and hope nobody will ever free a page-pool
> >> backed skb from hard IRQ context :)  
> > 
> > I told you under the previous version that this function is not supposed
> > to be called under hardirq context, so we don't need to check for it :D
> > But I was assuming nobody would try to do that. Seems that's not really
> > the case (netcons), if you want to sanitize this...

netcons or anyone who freed socket-less skbs from hardirq.
Until pp recycling was added freeing an skb from hardirq was legal,
AFAICT.
Alexander Lobakin July 20, 2023, 7:33 p.m. UTC | #11
From: Jakub Kicinski <kuba@kernel.org>
Date: Thu, 20 Jul 2023 12:20:15 -0700

> On Thu, 20 Jul 2023 20:13:07 +0200 Alexander Lobakin wrote:
>> IOW, it reports we're in softirq no bloody matter if interrupts are
>> enabled or not. Either I did something wrong or the entire in_*irq()
>> family, including interrupt_context_level(), doesn't protect from
>> anything at all and doesn't work the way that most devs expect it to work?
>>
>> (or was it just me? :D)
>>
>> I guess the only way to be sure is to always check irqs_disabled() when
>> in_softirq() returns true.
> 
> We can as well check
> 	(in_softirq() && !irqs_disabled() && !in_hardirq())
> ?

Yes, something like that. Messy, but I see no other options...

So, I guess you want to add an assertion to make sure that we're *not*
in this:

in_hardirq() || irqs_disabled()

Does this mean that after it's added, my patch is sane? :p

> 
> The interrupt_context_level() thing is fairly new, I think.
> Who knows what happens to it going forward...

Well, it counts the number of active hard interrupts, but doesn't take
into account that if there are no hardirqs we can still disable them
manually. Meh.
Should I try to send a patch for it? :D

> 
>>>> Right now page pool only supports BH and process contexts. IOW the
>>>> "else" branch of if (in_softirq()) in page pool is expecting to be
>>>> in process context.
>>>>
>>>> Supporting hard irq would mean we need to switch to _irqsave() locking.
>>>> That's likely way too costly.
>>>>
>>>> Or stash the freed pages away and free them lazily.
>>>>
>>>> Or add a lockdep warning and hope nobody will ever free a page-pool
>>>> backed skb from hard IRQ context :)  
>>>
>>> I told you under the previous version that this function is not supposed
>>> to be called under hardirq context, so we don't need to check for it :D
>>> But I was assuming nobody would try to do that. Seems that's not really
>>> the case (netcons), if you want to sanitize this...
> 
> netcons or anyone who freed socket-less skbs from hardirq.
> Until pp recycling was added freeing an skb from hardirq was legal,
> AFAICT.

I don't think so. Why do we have dev_kfree_skb_any() then? It checks for

in_hardirq() || irqs_disabled()

and if that's true, defers the skb to be processed via the backlog.
"Regular" skb freeing functions don't do that. The _any() variant has
been there for a long time IIRC, so it's not something recent.

Thanks,
Olek
Jakub Kicinski July 20, 2023, 7:46 p.m. UTC | #12
On Thu, 20 Jul 2023 21:33:40 +0200 Alexander Lobakin wrote:
> > We can as well check
> > 	(in_softirq() && !irqs_disabled() && !in_hardirq())
> > ?  
> 
> Yes, something like that. Messy, but I see no other options...
> 
> So, I guess you want to add an assertion to make sure that we're *not*
> in this:
> 
> in_hardirq() || irqs_disabled()
> 
> Does this mean that after it's added, my patch is sane? :p

Well... it's acceptable. Make sure you add a good, informative
but concise comment :)

> > The interrupt_context_level() thing is fairly new, I think.
> > Who knows what happens to it going forward...  
> 
> Well, it counts the number of active hard interrupts, but doesn't take
> into account that if there are no hardirqs we can still disable them
> manually. Meh.
> Should I try to send a patch for it? :D

Depends on how you like to spend your time :)

> > netcons or anyone who freed socket-less skbs from hardirq.
> > Until pp recycling was added freeing an skb from hardirq was legal,
> > AFAICT.  
> 
> I don't think so. Why do we have dev_kfree_skb_any() then? It checks for
> 
> in_hardirq() || irqs_disabled()
> 
> and if that's true, defers the skb to be processed via the backlog.
> "Regular" skb freeing functions don't do that. The _any() variant has
> been there for a long time IIRC, so it's not something recent.

Drivers (or any other users of dev_kfree_skb_any()) should be fine. 
I'm only paranoid about some unknown bits of code which thought they
can be clever and call kfree_skb() directly, as long as !skb->sk.

But if you add the hard irq checks to your patch then you're strictly
safer than the existing code. Hopefully the checks are not too
expensive.
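
Putting the thread's conclusion together, the guarded version of the
patch's check would look roughly like this -- a sketch with an
illustrative comment, not necessarily the final merged code:

	/* Allow direct recycling only when we're sure we're in the very
	 * softirq context the pool's NAPI runs in: BH is being served,
	 * we're neither in hardirq nor running with IRQs disabled (e.g.
	 * netpoll), and ->list_owner confirms the CPU matches.
	 */
	if (napi_safe ||
	    (in_softirq() && !in_hardirq() && !irqs_disabled())) {
		const struct napi_struct *napi = READ_ONCE(pp->p.napi);

		allow_direct = napi &&
			READ_ONCE(napi->list_owner) == smp_processor_id();
	}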
Yunsheng Lin July 21, 2023, 11:53 a.m. UTC | #13
On 2023/7/21 3:46, Jakub Kicinski wrote:
> On Thu, 20 Jul 2023 21:33:40 +0200 Alexander Lobakin wrote:
>>> We can as well check
>>> 	(in_softirq() && !irqs_disabled() && !in_hardirq())
>>> ?  
>>
>> Yes, something like that. Messy, but I see no other options...
>>
>> So, I guess you want to add an assertion to make sure that we're *not*
>> in this:
>>
>> in_hardirq() || irqs_disabled()
>>
>> Does this mean that after it's added, my patch is sane? :p
> 
> Well... it's acceptable. Make sure you add a good, informative
> but concise comment :)
> 

Does it mean ptr_ring_produce_any() is needed in
page_pool_recycle_in_ring() too?

As it is assumed that the page pool API can be called in a context with
irqs_disabled() || in_hardirq(), and forced recycling happens in the
ptr_ring.

Doesn't it conflict with the below patch? As the below patch assumes the
page pool API can not be called in a context with irqs_disabled() || in_hardirq():
[PATCH net-next] page_pool: add a lockdep check for recycling in hardirq

Or am I missing something obvious here?
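
For context, the helper Yunsheng refers to picks the ptr_ring locking
variant based on in_softirq(); at the time it read roughly as follows
(lightly trimmed, stats update omitted):

	static bool page_pool_recycle_in_ring(struct page_pool *pool,
					      struct page *page)
	{
		int ret;

		/* BH protection not needed if current is softirq */
		if (in_softirq())
			ret = ptr_ring_produce(&pool->ring, page);
		else
			ret = ptr_ring_produce_bh(&pool->ring, page);

		return ret == 0;
	}

Neither branch is safe from hard IRQ context or with IRQs disabled, which
is what the lockdep annotation asserts against.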
Alexander Lobakin July 21, 2023, 3:37 p.m. UTC | #14
From: Yunsheng Lin <linyunsheng@huawei.com>
Date: Fri, 21 Jul 2023 19:53:33 +0800

> On 2023/7/21 3:46, Jakub Kicinski wrote:
>> On Thu, 20 Jul 2023 21:33:40 +0200 Alexander Lobakin wrote:
>>>> We can as well check
>>>> 	(in_softirq() && !irqs_disabled() && !in_hardirq())
>>>> ?  
>>>
>>> Yes, something like that. Messy, but I see no other options...
>>>
>>> So, I guess you want to add an assertion to make sure that we're *not*
>>> in this:
>>>
>>> in_hardirq() || irqs_disabled()
>>>
>>> Does this mean that after it's added, my patch is sane? :p
>>
>> Well... it's acceptable. Make sure you add a good, informative
>> but concise comment :)
>>
> 
> Does it mean ptr_ring_produce_any() is needed in
> page_pool_recycle_in_ring() too?
> 
> As it is assumed that the page pool API can be called in a context with
> irqs_disabled() || in_hardirq(), and forced recycling happens in the
> ptr_ring.
> 
> Doesn't it conflict with the below patch? As the below patch assumes the
> page pool API can not be called in a context with irqs_disabled() || in_hardirq():
> [PATCH net-next] page_pool: add a lockdep check for recycling in hardirq
> 
> Or am I missing something obvious here?

No, Page Pool is *not* intended to be called when IRQs are disabled,
hence the fix Jakub's posted in the separate thread.

Thanks,
Olek
Jakub Kicinski July 21, 2023, 4:01 p.m. UTC | #15
On Fri, 21 Jul 2023 17:37:57 +0200 Alexander Lobakin wrote:
> > Does it mean ptr_ring_produce_any() is needed in
> > page_pool_recycle_in_ring() too?
> > 
> > As it is assumed that the page pool API can be called in a context with
> > irqs_disabled() || in_hardirq(), and forced recycling happens in the
> > ptr_ring.
> > 
> > Doesn't it conflict with the below patch? As the below patch assumes the
> > page pool API can not be called in a context with irqs_disabled() || in_hardirq():
> > [PATCH net-next] page_pool: add a lockdep check for recycling in hardirq
> > 
> > Or am I missing something obvious here?  
> 
> No, Page Pool is *not* intended to be called when IRQs are disabled,
> hence the fix Jakub's posted in the separate thread.

Yeah, it's just a bandaid / compromise, since Olek really wants his
optimization and I really don't want to spend a month debugging
potential production crashes :)

On the ptr ring the use may still be incorrect, but there is a spin lock,
so things will explode in a much more obvious way, if they do.
Patch

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index fc1470aab5cf..1c22fd33be6c 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -902,7 +902,7 @@  bool page_pool_return_skb_page(struct page *page, bool napi_safe)
 	 * in the same context as the consumer would run, so there's
 	 * no possible race.
 	 */
-	if (napi_safe) {
+	if (napi_safe || in_softirq()) {
 		const struct napi_struct *napi = READ_ONCE(pp->p.napi);
 
 		allow_direct = napi &&