diff mbox series

[iwl-net] ice: recycle/free all of the fragments from multi-buffer packet

Message ID 20230512132331.125047-1-maciej.fijalkowski@intel.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [iwl-net] ice: recycle/free all of the fragments from multi-buffer packet | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers fail 2 blamed authors not CCed: daniel@iogearbox.net alexandr.lobakin@intel.com; 7 maintainers not CCed: kuba@kernel.org daniel@iogearbox.net jesse.brandeburg@intel.com davem@davemloft.net pabeni@redhat.com edumazet@google.com alexandr.lobakin@intel.com
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Maciej Fijalkowski May 12, 2023, 1:23 p.m. UTC
The ice driver caches next_to_clean value at the beginning of
ice_clean_rx_irq() in order to remember the first buffer that has to be
freed/recycled after main Rx processing loop. The end is indicated by
first descriptor of packet that the Rx processing loop has ended. Note
that if mentioned loop ended in the middle of gathering a multi-buffer
packet, next_to_clean would be pointing to the descriptor in the middle
of the packet BUT freeing/recycling stage will stop at the first
descriptor. This means that next iteration of ice_clean_rx_irq() will
miss the (first_desc, next_to_clean - 1) entries.

 When running various 9K MTU workloads, such splats were observed,
mostly related to rx_buf->page being NULL as behavior described in the
paragraph above breaks ICE_RX_DESC_UNUSED() macro which is used when
refilling Rx buffers:

[  540.780716] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  540.787787] #PF: supervisor read access in kernel mode
[  540.793002] #PF: error_code(0x0000) - not-present page
[  540.798218] PGD 0 P4D 0
[  540.800801] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  540.805231] CPU: 18 PID: 3984 Comm: xskxceiver Tainted: G        W          6.3.0-rc7+ #96
[  540.813619] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
[  540.824209] RIP: 0010:ice_clean_rx_irq+0x2b6/0xf00 [ice]
[  540.829678] Code: 74 24 10 e9 aa 00 00 00 8b 55 78 41 31 57 10 41 09 c4 4d 85 ff 0f 84 83 00 00 00 49 8b 57 08 41 8b 4f 1c 65 8b 35 1a fa 4b 3f <48> 8b 02 48 c1 e8 3a 39 c6 0f 85 a2 00 00 00 f6 42 08 02 0f 85 98
[  540.848717] RSP: 0018:ffffc9000f42fc50 EFLAGS: 00010282
[  540.854029] RAX: 0000000000000004 RBX: 0000000000000002 RCX: 000000000000fffe
[  540.861272] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000ffffffff
[  540.868519] RBP: ffff88984a05ac00 R08: 0000000000000000 R09: dead000000000100
[  540.875760] R10: ffff88983fffcd00 R11: 000000000010f2b8 R12: 0000000000000004
[  540.883008] R13: 0000000000000003 R14: 0000000000000800 R15: ffff889847a10040
[  540.890253] FS:  00007f6ddf7fe640(0000) GS:ffff88afdf800000(0000) knlGS:0000000000000000
[  540.898465] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  540.904299] CR2: 0000000000000000 CR3: 000000010d3da001 CR4: 00000000007706e0
[  540.911542] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  540.918789] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  540.926032] PKRU: 55555554
[  540.928790] Call Trace:
[  540.931276]  <TASK>
[  540.933418]  ice_napi_poll+0x4ca/0x6d0 [ice]
[  540.937804]  ? __pfx_ice_napi_poll+0x10/0x10 [ice]
[  540.942716]  napi_busy_loop+0xd7/0x320
[  540.946537]  xsk_recvmsg+0x143/0x170
[  540.950178]  sock_recvmsg+0x99/0xa0
[  540.953729]  __sys_recvfrom+0xa8/0x120
[  540.957543]  ? do_futex+0xbd/0x1d0
[  540.961008]  ? __x64_sys_futex+0x73/0x1d0
[  540.965083]  __x64_sys_recvfrom+0x20/0x30
[  540.969155]  do_syscall_64+0x38/0x90
[  540.972796]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  540.977934] RIP: 0033:0x7f6de5f27934

To fix this, compare next_to_clean with first_desc at the beginning of
ice_clean_rx_irq(). In the case they are not the same, set cached_ntc to
first_desc so that at the end, when freeing/recycling buffers,
descriptors from first to ntc are not missed.

Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Simon Horman May 12, 2023, 3:37 p.m. UTC | #1
On Fri, May 12, 2023 at 03:23:30PM +0200, Maciej Fijalkowski wrote:
> The ice driver caches next_to_clean value at the beginning of
> ice_clean_rx_irq() in order to remember the first buffer that has to be
> freed/recycled after main Rx processing loop. The end is indicated by
> first descriptor of packet that the Rx processing loop has ended. Note
> that if mentioned loop ended in the middle of gathering a multi-buffer
> packet, next_to_clean would be pointing to the descriptor in the middle
> of the packet BUT freeing/recycling stage will stop at the first
> descriptor. This means that next iteration of ice_clean_rx_irq() will
> miss the (first_desc, next_to_clean - 1) entries.
> 
>  When running various 9K MTU workloads, such splats were observed,
> mostly related to rx_buf->page being NULL as behavior described in the
> paragraph above breaks ICE_RX_DESC_UNUSED() macro which is used when
> refilling Rx buffers:
> 
> [  540.780716] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [  540.787787] #PF: supervisor read access in kernel mode
> [  540.793002] #PF: error_code(0x0000) - not-present page
> [  540.798218] PGD 0 P4D 0
> [  540.800801] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [  540.805231] CPU: 18 PID: 3984 Comm: xskxceiver Tainted: G        W          6.3.0-rc7+ #96
> [  540.813619] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
> [  540.824209] RIP: 0010:ice_clean_rx_irq+0x2b6/0xf00 [ice]
> [  540.829678] Code: 74 24 10 e9 aa 00 00 00 8b 55 78 41 31 57 10 41 09 c4 4d 85 ff 0f 84 83 00 00 00 49 8b 57 08 41 8b 4f 1c 65 8b 35 1a fa 4b 3f <48> 8b 02 48 c1 e8 3a 39 c6 0f 85 a2 00 00 00 f6 42 08 02 0f 85 98
> [  540.848717] RSP: 0018:ffffc9000f42fc50 EFLAGS: 00010282
> [  540.854029] RAX: 0000000000000004 RBX: 0000000000000002 RCX: 000000000000fffe
> [  540.861272] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000ffffffff
> [  540.868519] RBP: ffff88984a05ac00 R08: 0000000000000000 R09: dead000000000100
> [  540.875760] R10: ffff88983fffcd00 R11: 000000000010f2b8 R12: 0000000000000004
> [  540.883008] R13: 0000000000000003 R14: 0000000000000800 R15: ffff889847a10040
> [  540.890253] FS:  00007f6ddf7fe640(0000) GS:ffff88afdf800000(0000) knlGS:0000000000000000
> [  540.898465] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  540.904299] CR2: 0000000000000000 CR3: 000000010d3da001 CR4: 00000000007706e0
> [  540.911542] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  540.918789] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  540.926032] PKRU: 55555554
> [  540.928790] Call Trace:
> [  540.931276]  <TASK>
> [  540.933418]  ice_napi_poll+0x4ca/0x6d0 [ice]
> [  540.937804]  ? __pfx_ice_napi_poll+0x10/0x10 [ice]
> [  540.942716]  napi_busy_loop+0xd7/0x320
> [  540.946537]  xsk_recvmsg+0x143/0x170
> [  540.950178]  sock_recvmsg+0x99/0xa0
> [  540.953729]  __sys_recvfrom+0xa8/0x120
> [  540.957543]  ? do_futex+0xbd/0x1d0
> [  540.961008]  ? __x64_sys_futex+0x73/0x1d0
> [  540.965083]  __x64_sys_recvfrom+0x20/0x30
> [  540.969155]  do_syscall_64+0x38/0x90
> [  540.972796]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [  540.977934] RIP: 0033:0x7f6de5f27934
> 
> To fix this, compare next_to_clean with first_desc at the beginning of
> ice_clean_rx_irq(). In the case they are not the same, set cached_ntc to
> first_desc so that at the end, when freeing/recycling buffers,
> descriptors from first to ntc are not missed.
> 
> Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_txrx.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> index 4fcf2d07eb85..4d1fc047767f 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> @@ -1162,6 +1162,9 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
>  	bool failure;
>  	u32 first;
>  
> +	if (ntc != rx_ring->first_desc)
> +		cached_ntc = rx_ring->first_desc;
> +

Above ntc is initialised as rx_ring->next_to_clean.
And cached_ntc is initialised as ntc, i.e. rx_ring->next_to_clean.

	u32 ntc = rx_ring->next_to_clean;
        ...
        u32 cached_ntc = ntc;

So in effect we have:

	if (rx_ring->next_to_clean != rx_ring->first_desc)
		cached_ntc = rx_ring->first_desc;
	else
		cached_ntc = rx_ring->next_to_clean;

I wonder if we can simplify this by simply initialising cached_ntc as:

	cached_ntc = rx_ring->first_desc;
Piotr Raczynski May 12, 2023, 3:40 p.m. UTC | #2
> @@ -1162,6 +1162,9 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
>  	bool failure;
>  	u32 first;
>  
> +	if (ntc != rx_ring->first_desc)
> +		cached_ntc = rx_ring->first_desc;
> +
Would it make sense to apply likely/unlikely here since  we check it per
packet?

Piotr
>  	/* Frame size depend on rx_ring setup when PAGE_SIZE=4K */
>  #if (PAGE_SIZE < 8192)

>
Maciej Fijalkowski May 15, 2023, 1:13 p.m. UTC | #3
On Fri, May 12, 2023 at 05:40:11PM +0200, Piotr Raczynski wrote:
> > @@ -1162,6 +1162,9 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
> >  	bool failure;
> >  	u32 first;
> >  
> > +	if (ntc != rx_ring->first_desc)
> > +		cached_ntc = rx_ring->first_desc;
> > +
> Would it make sense to apply likely/unlikely here since  we check it per
> packet?

This was done before running Rx processing loop so this was not per
packet, only a single check, but in the end we don't need this, see
Simon's reply.

> 
> Piotr
> >  	/* Frame size depend on rx_ring setup when PAGE_SIZE=4K */
> >  #if (PAGE_SIZE < 8192)
> 
> >
Maciej Fijalkowski May 15, 2023, 1:17 p.m. UTC | #4
On Fri, May 12, 2023 at 05:37:39PM +0200, Simon Horman wrote:
> On Fri, May 12, 2023 at 03:23:30PM +0200, Maciej Fijalkowski wrote:
> > The ice driver caches next_to_clean value at the beginning of
> > ice_clean_rx_irq() in order to remember the first buffer that has to be
> > freed/recycled after main Rx processing loop. The end is indicated by
> > first descriptor of packet that the Rx processing loop has ended. Note
> > that if mentioned loop ended in the middle of gathering a multi-buffer
> > packet, next_to_clean would be pointing to the descriptor in the middle
> > of the packet BUT freeing/recycling stage will stop at the first
> > descriptor. This means that next iteration of ice_clean_rx_irq() will
> > miss the (first_desc, next_to_clean - 1) entries.
> > 
> >  When running various 9K MTU workloads, such splats were observed,
> > mostly related to rx_buf->page being NULL as behavior described in the
> > paragraph above breaks ICE_RX_DESC_UNUSED() macro which is used when
> > refilling Rx buffers:
> > 
> > [  540.780716] BUG: kernel NULL pointer dereference, address: 0000000000000000
> > [  540.787787] #PF: supervisor read access in kernel mode
> > [  540.793002] #PF: error_code(0x0000) - not-present page
> > [  540.798218] PGD 0 P4D 0
> > [  540.800801] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [  540.805231] CPU: 18 PID: 3984 Comm: xskxceiver Tainted: G        W          6.3.0-rc7+ #96
> > [  540.813619] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
> > [  540.824209] RIP: 0010:ice_clean_rx_irq+0x2b6/0xf00 [ice]
> > [  540.829678] Code: 74 24 10 e9 aa 00 00 00 8b 55 78 41 31 57 10 41 09 c4 4d 85 ff 0f 84 83 00 00 00 49 8b 57 08 41 8b 4f 1c 65 8b 35 1a fa 4b 3f <48> 8b 02 48 c1 e8 3a 39 c6 0f 85 a2 00 00 00 f6 42 08 02 0f 85 98
> > [  540.848717] RSP: 0018:ffffc9000f42fc50 EFLAGS: 00010282
> > [  540.854029] RAX: 0000000000000004 RBX: 0000000000000002 RCX: 000000000000fffe
> > [  540.861272] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000ffffffff
> > [  540.868519] RBP: ffff88984a05ac00 R08: 0000000000000000 R09: dead000000000100
> > [  540.875760] R10: ffff88983fffcd00 R11: 000000000010f2b8 R12: 0000000000000004
> > [  540.883008] R13: 0000000000000003 R14: 0000000000000800 R15: ffff889847a10040
> > [  540.890253] FS:  00007f6ddf7fe640(0000) GS:ffff88afdf800000(0000) knlGS:0000000000000000
> > [  540.898465] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  540.904299] CR2: 0000000000000000 CR3: 000000010d3da001 CR4: 00000000007706e0
> > [  540.911542] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  540.918789] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [  540.926032] PKRU: 55555554
> > [  540.928790] Call Trace:
> > [  540.931276]  <TASK>
> > [  540.933418]  ice_napi_poll+0x4ca/0x6d0 [ice]
> > [  540.937804]  ? __pfx_ice_napi_poll+0x10/0x10 [ice]
> > [  540.942716]  napi_busy_loop+0xd7/0x320
> > [  540.946537]  xsk_recvmsg+0x143/0x170
> > [  540.950178]  sock_recvmsg+0x99/0xa0
> > [  540.953729]  __sys_recvfrom+0xa8/0x120
> > [  540.957543]  ? do_futex+0xbd/0x1d0
> > [  540.961008]  ? __x64_sys_futex+0x73/0x1d0
> > [  540.965083]  __x64_sys_recvfrom+0x20/0x30
> > [  540.969155]  do_syscall_64+0x38/0x90
> > [  540.972796]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> > [  540.977934] RIP: 0033:0x7f6de5f27934
> > 
> > To fix this, compare next_to_clean with first_desc at the beginning of
> > ice_clean_rx_irq(). In the case they are not the same, set cached_ntc to
> > first_desc so that at the end, when freeing/recycling buffers,
> > descriptors from first to ntc are not missed.
> > 
> > Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_txrx.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > index 4fcf2d07eb85..4d1fc047767f 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > @@ -1162,6 +1162,9 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
> >  	bool failure;
> >  	u32 first;
> >  
> > +	if (ntc != rx_ring->first_desc)
> > +		cached_ntc = rx_ring->first_desc;
> > +
> 
> Above ntc is initialised as rx_ring->next_to_clean.
> And cached_ntc is initialised as ntc, i.e. rx_ring->next_to_clean.
> 
> 	u32 ntc = rx_ring->next_to_clean;
>         ...
>         u32 cached_ntc = ntc;
> 
> So in effect we have:
> 
> 	if (rx_ring->next_to_clean != rx_ring->first_desc)
> 		cached_ntc = rx_ring->first_desc;
> 	else
> 		cached_ntc = rx_ring->next_to_clean;
> 
> I wonder if we can simplify this by simply initialising cached_ntc as:
> 
> 	cached_ntc = rx_ring->first_desc;

Thanks for taking a fresh look over this, indeed this can be simplified as
you pointed out :) being stuck on some problem makes us sometimes unable
to see such obvious things ;)

Will send a v2.
Simon Horman May 15, 2023, 2:39 p.m. UTC | #5
On Mon, May 15, 2023 at 03:17:33PM +0200, Maciej Fijalkowski wrote:
> On Fri, May 12, 2023 at 05:37:39PM +0200, Simon Horman wrote:
> > On Fri, May 12, 2023 at 03:23:30PM +0200, Maciej Fijalkowski wrote:
> > > The ice driver caches next_to_clean value at the beginning of
> > > ice_clean_rx_irq() in order to remember the first buffer that has to be
> > > freed/recycled after main Rx processing loop. The end is indicated by
> > > first descriptor of packet that the Rx processing loop has ended. Note
> > > that if mentioned loop ended in the middle of gathering a multi-buffer
> > > packet, next_to_clean would be pointing to the descriptor in the middle
> > > of the packet BUT freeing/recycling stage will stop at the first
> > > descriptor. This means that next iteration of ice_clean_rx_irq() will
> > > miss the (first_desc, next_to_clean - 1) entries.
> > > 
> > >  When running various 9K MTU workloads, such splats were observed,
> > > mostly related to rx_buf->page being NULL as behavior described in the
> > > paragraph above breaks ICE_RX_DESC_UNUSED() macro which is used when
> > > refilling Rx buffers:
> > > 
> > > [  540.780716] BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > [  540.787787] #PF: supervisor read access in kernel mode
> > > [  540.793002] #PF: error_code(0x0000) - not-present page
> > > [  540.798218] PGD 0 P4D 0
> > > [  540.800801] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > > [  540.805231] CPU: 18 PID: 3984 Comm: xskxceiver Tainted: G        W          6.3.0-rc7+ #96
> > > [  540.813619] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
> > > [  540.824209] RIP: 0010:ice_clean_rx_irq+0x2b6/0xf00 [ice]
> > > [  540.829678] Code: 74 24 10 e9 aa 00 00 00 8b 55 78 41 31 57 10 41 09 c4 4d 85 ff 0f 84 83 00 00 00 49 8b 57 08 41 8b 4f 1c 65 8b 35 1a fa 4b 3f <48> 8b 02 48 c1 e8 3a 39 c6 0f 85 a2 00 00 00 f6 42 08 02 0f 85 98
> > > [  540.848717] RSP: 0018:ffffc9000f42fc50 EFLAGS: 00010282
> > > [  540.854029] RAX: 0000000000000004 RBX: 0000000000000002 RCX: 000000000000fffe
> > > [  540.861272] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000ffffffff
> > > [  540.868519] RBP: ffff88984a05ac00 R08: 0000000000000000 R09: dead000000000100
> > > [  540.875760] R10: ffff88983fffcd00 R11: 000000000010f2b8 R12: 0000000000000004
> > > [  540.883008] R13: 0000000000000003 R14: 0000000000000800 R15: ffff889847a10040
> > > [  540.890253] FS:  00007f6ddf7fe640(0000) GS:ffff88afdf800000(0000) knlGS:0000000000000000
> > > [  540.898465] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [  540.904299] CR2: 0000000000000000 CR3: 000000010d3da001 CR4: 00000000007706e0
> > > [  540.911542] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > [  540.918789] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > [  540.926032] PKRU: 55555554
> > > [  540.928790] Call Trace:
> > > [  540.931276]  <TASK>
> > > [  540.933418]  ice_napi_poll+0x4ca/0x6d0 [ice]
> > > [  540.937804]  ? __pfx_ice_napi_poll+0x10/0x10 [ice]
> > > [  540.942716]  napi_busy_loop+0xd7/0x320
> > > [  540.946537]  xsk_recvmsg+0x143/0x170
> > > [  540.950178]  sock_recvmsg+0x99/0xa0
> > > [  540.953729]  __sys_recvfrom+0xa8/0x120
> > > [  540.957543]  ? do_futex+0xbd/0x1d0
> > > [  540.961008]  ? __x64_sys_futex+0x73/0x1d0
> > > [  540.965083]  __x64_sys_recvfrom+0x20/0x30
> > > [  540.969155]  do_syscall_64+0x38/0x90
> > > [  540.972796]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> > > [  540.977934] RIP: 0033:0x7f6de5f27934
> > > 
> > > To fix this, compare next_to_clean with first_desc at the beginning of
> > > ice_clean_rx_irq(). In the case they are not the same, set cached_ntc to
> > > first_desc so that at the end, when freeing/recycling buffers,
> > > descriptors from first to ntc are not missed.
> > > 
> > > Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
> > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > > ---
> > >  drivers/net/ethernet/intel/ice/ice_txrx.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > index 4fcf2d07eb85..4d1fc047767f 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > @@ -1162,6 +1162,9 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
> > >  	bool failure;
> > >  	u32 first;
> > >  
> > > +	if (ntc != rx_ring->first_desc)
> > > +		cached_ntc = rx_ring->first_desc;
> > > +
> > 
> > Above ntc is initialised as rx_ring->next_to_clean.
> > And cached_ntc is initialised as ntc, i.e. rx_ring->next_to_clean.
> > 
> > 	u32 ntc = rx_ring->next_to_clean;
> >         ...
> >         u32 cached_ntc = ntc;
> > 
> > So in effect we have:
> > 
> > 	if (rx_ring->next_to_clean != rx_ring->first_desc)
> > 		cached_ntc = rx_ring->first_desc;
> > 	else
> > 		cached_ntc = rx_ring->next_to_clean;
> > 
> > I wonder if we can simplify this by simply initialising cached_ntc as:
> > 
> > 	cached_ntc = rx_ring->first_desc;
> 
> Thanks for taking a fresh look over this, indeed this can be simplified as
> you pointed out :) being stuck on some problem makes us sometimes unable
> to see such obvious things ;)

Indeed it can. Happy to help :)
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 4fcf2d07eb85..4d1fc047767f 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -1162,6 +1162,9 @@  int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 	bool failure;
 	u32 first;
 
+	if (ntc != rx_ring->first_desc)
+		cached_ntc = rx_ring->first_desc;
+
 	/* Frame size depend on rx_ring setup when PAGE_SIZE=4K */
 #if (PAGE_SIZE < 8192)
 	xdp->frame_sz = ice_rx_frame_truesize(rx_ring, 0);