diff mbox series

[net-next,v3,2/3] net: dev: Makes sure netif_rx() can be invoked in any context.

Message ID 20220211233839.2280731-3-bigeasy@linutronix.de (mailing list archive)
State Accepted
Commit baebdf48c360080710f80699eea3affbb13d6c65
Delegated to: Netdev Maintainers
Headers show
Series net: dev: PREEMPT_RT fixups. | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 4867 this patch: 4867
netdev/cc_maintainers warning 18 maintainers not CCed: andrii@kernel.org pablo@netfilter.org rostedt@goodmis.org osmocom-net-gprs@lists.osmocom.org yhs@fb.com jdmason@kudzu.us ap420073@gmail.com kpsingh@kernel.org linux-ntb@googlegroups.com kafai@fb.com songliubraving@fb.com dsahern@kernel.org laforge@gnumonks.org allenbh@gmail.com qitao.xu@bytedance.com cong.wang@bytedance.com dave.jiang@intel.com mingo@redhat.com
netdev/build_clang success Errors and warnings before: 822 this patch: 822
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 5024 this patch: 5024
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 293 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Sebastian Andrzej Siewior Feb. 11, 2022, 11:38 p.m. UTC
Dave suggested a while ago (eleven years by now) "Let's make netif_rx()
work in all contexts and get rid of netif_rx_ni()". Eric agreed and
pointed out that modern devices should use netif_receive_skb() to avoid
the overhead.
In the meantime someone added another variant, netif_rx_any_context(),
which behaves as suggested.

netif_rx() must be invoked with disabled bottom halves to ensure that
pending softirqs, which were raised within the function, are handled.
netif_rx_ni() can be invoked only from process context (bottom halves
must be enabled) because the function handles pending softirqs without
checking if bottom halves were disabled or not.
netif_rx_any_context() invokes on the former functions by checking
in_interrupts().

netif_rx() could be taught to handle both cases (disabled and enabled
bottom halves) by simply disabling bottom halves while invoking
netif_rx_internal(). The local_bh_enable() invocation will then invoke
pending softirqs only if the BH-disable counter drops to zero.

Eric is concerned about the overhead of BH-disable+enable especially in
regard to the loopback driver. As critical as this driver is, it will
receive a shortcut to avoid the additional overhead which is not needed.

Add a local_bh_disable() section in netif_rx() to ensure softirqs are
handled if needed.
Provide __netif_rx() which does not disable BH and has a lockdep assert
to ensure that interrupts are disabled. Use this shortcut in the
loopback driver and in drivers/net/*.c.
Make netif_rx_ni() and netif_rx_any_context() invoke netif_rx() so they
can be removed once they are no more users left.

Link: https://lkml.kernel.org/r/20100415.020246.218622820.davem@davemloft.net
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 drivers/net/amt.c          |  4 +--
 drivers/net/geneve.c       |  4 +--
 drivers/net/gtp.c          |  2 +-
 drivers/net/loopback.c     |  4 +--
 drivers/net/macsec.c       |  6 ++--
 drivers/net/macvlan.c      |  4 +--
 drivers/net/mhi_net.c      |  2 +-
 drivers/net/ntb_netdev.c   |  2 +-
 drivers/net/rionet.c       |  2 +-
 drivers/net/sb1000.c       |  2 +-
 drivers/net/veth.c         |  2 +-
 drivers/net/vrf.c          |  2 +-
 drivers/net/vxlan.c        |  2 +-
 include/linux/netdevice.h  | 14 ++++++--
 include/trace/events/net.h | 14 --------
 net/core/dev.c             | 67 +++++++++++++++++---------------------
 16 files changed, 60 insertions(+), 73 deletions(-)

Comments

Marek Szyprowski Feb. 16, 2022, 8:56 a.m. UTC | #1
Hi,

On 12.02.2022 00:38, Sebastian Andrzej Siewior wrote:
> Dave suggested a while ago (eleven years by now) "Let's make netif_rx()
> work in all contexts and get rid of netif_rx_ni()". Eric agreed and
> pointed out that modern devices should use netif_receive_skb() to avoid
> the overhead.
> In the meantime someone added another variant, netif_rx_any_context(),
> which behaves as suggested.
>
> netif_rx() must be invoked with disabled bottom halves to ensure that
> pending softirqs, which were raised within the function, are handled.
> netif_rx_ni() can be invoked only from process context (bottom halves
> must be enabled) because the function handles pending softirqs without
> checking if bottom halves were disabled or not.
> netif_rx_any_context() invokes on the former functions by checking
> in_interrupts().
>
> netif_rx() could be taught to handle both cases (disabled and enabled
> bottom halves) by simply disabling bottom halves while invoking
> netif_rx_internal(). The local_bh_enable() invocation will then invoke
> pending softirqs only if the BH-disable counter drops to zero.
>
> Eric is concerned about the overhead of BH-disable+enable especially in
> regard to the loopback driver. As critical as this driver is, it will
> receive a shortcut to avoid the additional overhead which is not needed.
>
> Add a local_bh_disable() section in netif_rx() to ensure softirqs are
> handled if needed.
> Provide __netif_rx() which does not disable BH and has a lockdep assert
> to ensure that interrupts are disabled. Use this shortcut in the
> loopback driver and in drivers/net/*.c.
> Make netif_rx_ni() and netif_rx_any_context() invoke netif_rx() so they
> can be removed once they are no more users left.
>
> Link: https://lkml.kernel.org/r/20100415.020246.218622820.davem@davemloft.net
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>

This patch landed in linux-next 20220215 as commit baebdf48c360 ("net: 
dev: Makes sure netif_rx() can be invoked in any context."). I found 
that it triggers the following warning on my test systems with USB CDC 
ethernet gadget:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 876 at kernel/softirq.c:308 
__local_bh_disable_ip+0xbc/0xc0
Modules linked in: hci_uart btbcm btintel brcmfmac bluetooth s5p_mfc 
s5p_csis s5p_fimc s5p_jpeg cfg80211 v4l2_mem2mem videobuf2_dma_contig 
videobuf2_memops exynos4_is_common videobuf2_v4l2 v4l2_fwnode 
videobuf2_common v4l2_async videodev ecdh_generic ecc brcmutil mc
CPU: 0 PID: 876 Comm: ip Not tainted 5.17.0-rc3-01145-gbaebdf48c360 #11325
Hardware name: Samsung Exynos (Flattened Device Tree)
  unwind_backtrace from show_stack+0x10/0x14
  show_stack from dump_stack_lvl+0x58/0x70
  dump_stack_lvl from __warn+0x238/0x23c
  __warn from warn_slowpath_fmt+0xac/0xb4
  warn_slowpath_fmt from __local_bh_disable_ip+0xbc/0xc0
  __local_bh_disable_ip from netif_rx+0x1c/0x28c
  netif_rx from rx_complete+0x12c/0x224
  rx_complete from dwc2_hsotg_complete_request+0x94/0x208
  dwc2_hsotg_complete_request from dwc2_hsotg_epint+0x9ac/0x1224
  dwc2_hsotg_epint from dwc2_hsotg_irq+0x264/0x1020
  dwc2_hsotg_irq from __handle_irq_event_percpu+0x74/0x3ac
  __handle_irq_event_percpu from handle_irq_event_percpu+0xc/0x38
  handle_irq_event_percpu from handle_irq_event+0x38/0x5c
  handle_irq_event from handle_fasteoi_irq+0xc4/0x180
  handle_fasteoi_irq from generic_handle_domain_irq+0x44/0x88
  generic_handle_domain_irq from gic_handle_irq+0x88/0xac
  gic_handle_irq from generic_handle_arch_irq+0x58/0x78
  generic_handle_arch_irq from __irq_svc+0x54/0x88
Exception stack(0xc3fb58e0 to 0xc3fb5928)
58e0: 00000000 c0f33a98 00000001 00000cbb 60000013 c2dc8730 c2dc875c 
00000cc0
5900: c2d2cc00 00000001 00000000 00000000 00000001 c3fb5930 c0b7dbc4 
c0b890d4
5920: 20000013 ffffffff
  __irq_svc from _raw_spin_unlock_irqrestore+0x2c/0x60
  _raw_spin_unlock_irqrestore from rx_fill+0x6c/0xa0
  rx_fill from eth_open+0x60/0x74
  eth_open from __dev_open+0xf0/0x174
  __dev_open from __dev_change_flags+0x16c/0x1b8
  __dev_change_flags from dev_change_flags+0x14/0x44
  dev_change_flags from do_setlink+0x33c/0xa3c
  do_setlink from __rtnl_newlink+0x524/0x80c
  __rtnl_newlink from rtnl_newlink+0x44/0x60
  rtnl_newlink from rtnetlink_rcv_msg+0x14c/0x4ec
  rtnetlink_rcv_msg from netlink_rcv_skb+0xdc/0x110
  netlink_rcv_skb from netlink_unicast+0x1ac/0x244
  netlink_unicast from netlink_sendmsg+0x328/0x448
  netlink_sendmsg from ____sys_sendmsg+0x1c4/0x220
  ____sys_sendmsg from ___sys_sendmsg+0x68/0x94
  ___sys_sendmsg from __sys_sendmsg+0x4c/0x88
  __sys_sendmsg from ret_fast_syscall+0x0/0x1c
Exception stack(0xc3fb5fa8 to 0xc3fb5ff0)
5fa0:                   beacb6dc beac3624 00000003 beac3630 00000000 
00000000
5fc0: beacb6dc beac3624 00000000 00000128 0054e304 386d3fd2 0054e000 
beac3630
5fe0: 0000006c beac35e0 00517f80 b6ea2ab8
irq event stamp: 3260
hardirqs last  enabled at (3259): [<c0b89104>] 
_raw_spin_unlock_irqrestore+0x5c/0x60
hardirqs last disabled at (3260): [<c0100aec>] __irq_svc+0x4c/0x88
softirqs last  enabled at (3250): [<c0982a90>] __dev_change_flags+0x80/0x1b8
softirqs last disabled at (3248): [<c0982690>] dev_set_rx_mode+0x0/0x40
---[ end trace 0000000000000000 ]---
------------[ cut here ]------------
WARNING: CPU: 0 PID: 876 at kernel/softirq.c:362 
__local_bh_enable_ip+0x200/0x204
Modules linked in: hci_uart btbcm btintel brcmfmac bluetooth s5p_mfc 
s5p_csis s5p_fimc s5p_jpeg cfg80211 v4l2_mem2mem videobuf2_dma_contig 
videobuf2_memops exynos4_is_common videobuf2_v4l2 v4l2_fwnode 
videobuf2_common v4l2_async videodev ecdh_generic ecc brcmutil mc
CPU: 0 PID: 876 Comm: ip Tainted: G        W 
5.17.0-rc3-01145-gbaebdf48c360 #11325
Hardware name: Samsung Exynos (Flattened Device Tree)
  unwind_backtrace from show_stack+0x10/0x14
  show_stack from dump_stack_lvl+0x58/0x70
  dump_stack_lvl from __warn+0x238/0x23c
  __warn from warn_slowpath_fmt+0xac/0xb4
  warn_slowpath_fmt from __local_bh_enable_ip+0x200/0x204
  __local_bh_enable_ip from netif_rx+0x9c/0x28c
  netif_rx from rx_complete+0x12c/0x224
  rx_complete from dwc2_hsotg_complete_request+0x94/0x208
  dwc2_hsotg_complete_request from dwc2_hsotg_epint+0x9ac/0x1224
  dwc2_hsotg_epint from dwc2_hsotg_irq+0x264/0x1020
  dwc2_hsotg_irq from __handle_irq_event_percpu+0x74/0x3ac
  __handle_irq_event_percpu from handle_irq_event_percpu+0xc/0x38
  handle_irq_event_percpu from handle_irq_event+0x38/0x5c
  handle_irq_event from handle_fasteoi_irq+0xc4/0x180
  handle_fasteoi_irq from generic_handle_domain_irq+0x44/0x88
  generic_handle_domain_irq from gic_handle_irq+0x88/0xac
  gic_handle_irq from generic_handle_arch_irq+0x58/0x78
  generic_handle_arch_irq from __irq_svc+0x54/0x88
Exception stack(0xc3fb58e0 to 0xc3fb5928)
58e0: 00000000 c0f33a98 00000001 00000cbb 60000013 c2dc8730 c2dc875c 
00000cc0
5900: c2d2cc00 00000001 00000000 00000000 00000001 c3fb5930 c0b7dbc4 
c0b890d4
5920: 20000013 ffffffff
  __irq_svc from _raw_spin_unlock_irqrestore+0x2c/0x60
  _raw_spin_unlock_irqrestore from rx_fill+0x6c/0xa0
  rx_fill from eth_open+0x60/0x74
  eth_open from __dev_open+0xf0/0x174
  __dev_open from __dev_change_flags+0x16c/0x1b8
  __dev_change_flags from dev_change_flags+0x14/0x44
  dev_change_flags from do_setlink+0x33c/0xa3c
  do_setlink from __rtnl_newlink+0x524/0x80c
  __rtnl_newlink from rtnl_newlink+0x44/0x60
  rtnl_newlink from rtnetlink_rcv_msg+0x14c/0x4ec
  rtnetlink_rcv_msg from netlink_rcv_skb+0xdc/0x110
  netlink_rcv_skb from netlink_unicast+0x1ac/0x244
  netlink_unicast from netlink_sendmsg+0x328/0x448
  netlink_sendmsg from ____sys_sendmsg+0x1c4/0x220
  ____sys_sendmsg from ___sys_sendmsg+0x68/0x94
  ___sys_sendmsg from __sys_sendmsg+0x4c/0x88
  __sys_sendmsg from ret_fast_syscall+0x0/0x1c
Exception stack(0xc3fb5fa8 to 0xc3fb5ff0)
5fa0:                   beacb6dc beac3624 00000003 beac3630 00000000 
00000000
5fc0: beacb6dc beac3624 00000000 00000128 0054e304 386d3fd2 0054e000 
beac3630
5fe0: 0000006c beac35e0 00517f80 b6ea2ab8
irq event stamp: 3261
hardirqs last  enabled at (3259): [<c0b89104>] 
_raw_spin_unlock_irqrestore+0x5c/0x60
hardirqs last disabled at (3260): [<c0100aec>] __irq_svc+0x4c/0x88
softirqs last  enabled at (3250): [<c0982a90>] __dev_change_flags+0x80/0x1b8
softirqs last disabled at (3261): [<c097b808>] netif_rx+0x0/0x28c
---[ end trace 0000000000000000 ]---

The above log comes from Samsung Exynos4210-based Trats board (ARM32 
bit), but similar are triggered on other boards too.

> ---
>   drivers/net/amt.c          |  4 +--
>   drivers/net/geneve.c       |  4 +--
>   drivers/net/gtp.c          |  2 +-
>   drivers/net/loopback.c     |  4 +--
>   drivers/net/macsec.c       |  6 ++--
>   drivers/net/macvlan.c      |  4 +--
>   drivers/net/mhi_net.c      |  2 +-
>   drivers/net/ntb_netdev.c   |  2 +-
>   drivers/net/rionet.c       |  2 +-
>   drivers/net/sb1000.c       |  2 +-
>   drivers/net/veth.c         |  2 +-
>   drivers/net/vrf.c          |  2 +-
>   drivers/net/vxlan.c        |  2 +-
>   include/linux/netdevice.h  | 14 ++++++--
>   include/trace/events/net.h | 14 --------
>   net/core/dev.c             | 67 +++++++++++++++++---------------------
>   16 files changed, 60 insertions(+), 73 deletions(-)
>
> diff --git a/drivers/net/amt.c b/drivers/net/amt.c
> index f1a36d7e2151c..10455c9b9da0e 100644
> --- a/drivers/net/amt.c
> +++ b/drivers/net/amt.c
> @@ -2373,7 +2373,7 @@ static bool amt_membership_query_handler(struct amt_dev *amt,
>   	skb->pkt_type = PACKET_MULTICAST;
>   	skb->ip_summed = CHECKSUM_NONE;
>   	len = skb->len;
> -	if (netif_rx(skb) == NET_RX_SUCCESS) {
> +	if (__netif_rx(skb) == NET_RX_SUCCESS) {
>   		amt_update_gw_status(amt, AMT_STATUS_RECEIVED_QUERY, true);
>   		dev_sw_netstats_rx_add(amt->dev, len);
>   	} else {
> @@ -2470,7 +2470,7 @@ static bool amt_update_handler(struct amt_dev *amt, struct sk_buff *skb)
>   	skb->pkt_type = PACKET_MULTICAST;
>   	skb->ip_summed = CHECKSUM_NONE;
>   	len = skb->len;
> -	if (netif_rx(skb) == NET_RX_SUCCESS) {
> +	if (__netif_rx(skb) == NET_RX_SUCCESS) {
>   		amt_update_relay_status(tunnel, AMT_STATUS_RECEIVED_UPDATE,
>   					true);
>   		dev_sw_netstats_rx_add(amt->dev, len);
> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> index c1fdd721a730d..a895ff756093a 100644
> --- a/drivers/net/geneve.c
> +++ b/drivers/net/geneve.c
> @@ -925,7 +925,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
>   		}
>   
>   		skb->protocol = eth_type_trans(skb, geneve->dev);
> -		netif_rx(skb);
> +		__netif_rx(skb);
>   		dst_release(&rt->dst);
>   		return -EMSGSIZE;
>   	}
> @@ -1021,7 +1021,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
>   		}
>   
>   		skb->protocol = eth_type_trans(skb, geneve->dev);
> -		netif_rx(skb);
> +		__netif_rx(skb);
>   		dst_release(dst);
>   		return -EMSGSIZE;
>   	}
> diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
> index 24e5c54d06c15..bf087171bcf04 100644
> --- a/drivers/net/gtp.c
> +++ b/drivers/net/gtp.c
> @@ -207,7 +207,7 @@ static int gtp_rx(struct pdp_ctx *pctx, struct sk_buff *skb,
>   
>   	dev_sw_netstats_rx_add(pctx->dev, skb->len);
>   
> -	netif_rx(skb);
> +	__netif_rx(skb);
>   	return 0;
>   
>   err:
> diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> index ed0edf5884ef8..d05f86fe78c95 100644
> --- a/drivers/net/loopback.c
> +++ b/drivers/net/loopback.c
> @@ -78,7 +78,7 @@ static netdev_tx_t loopback_xmit(struct sk_buff *skb,
>   
>   	skb_orphan(skb);
>   
> -	/* Before queueing this packet to netif_rx(),
> +	/* Before queueing this packet to __netif_rx(),
>   	 * make sure dst is refcounted.
>   	 */
>   	skb_dst_force(skb);
> @@ -86,7 +86,7 @@ static netdev_tx_t loopback_xmit(struct sk_buff *skb,
>   	skb->protocol = eth_type_trans(skb, dev);
>   
>   	len = skb->len;
> -	if (likely(netif_rx(skb) == NET_RX_SUCCESS))
> +	if (likely(__netif_rx(skb) == NET_RX_SUCCESS))
>   		dev_lstats_add(dev, len);
>   
>   	return NETDEV_TX_OK;
> diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
> index 3d08743317634..832f09ac075e7 100644
> --- a/drivers/net/macsec.c
> +++ b/drivers/net/macsec.c
> @@ -1033,7 +1033,7 @@ static enum rx_handler_result handle_not_macsec(struct sk_buff *skb)
>   				else
>   					nskb->pkt_type = PACKET_MULTICAST;
>   
> -				netif_rx(nskb);
> +				__netif_rx(nskb);
>   			}
>   			continue;
>   		}
> @@ -1056,7 +1056,7 @@ static enum rx_handler_result handle_not_macsec(struct sk_buff *skb)
>   
>   		nskb->dev = ndev;
>   
> -		if (netif_rx(nskb) == NET_RX_SUCCESS) {
> +		if (__netif_rx(nskb) == NET_RX_SUCCESS) {
>   			u64_stats_update_begin(&secy_stats->syncp);
>   			secy_stats->stats.InPktsUntagged++;
>   			u64_stats_update_end(&secy_stats->syncp);
> @@ -1288,7 +1288,7 @@ static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb)
>   
>   		macsec_reset_skb(nskb, macsec->secy.netdev);
>   
> -		ret = netif_rx(nskb);
> +		ret = __netif_rx(nskb);
>   		if (ret == NET_RX_SUCCESS) {
>   			u64_stats_update_begin(&secy_stats->syncp);
>   			secy_stats->stats.InPktsUnknownSCI++;
> diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
> index 6ef5f77be4d0a..d87c06c317ede 100644
> --- a/drivers/net/macvlan.c
> +++ b/drivers/net/macvlan.c
> @@ -410,7 +410,7 @@ static void macvlan_forward_source_one(struct sk_buff *skb,
>   	if (ether_addr_equal_64bits(eth_hdr(skb)->h_dest, dev->dev_addr))
>   		nskb->pkt_type = PACKET_HOST;
>   
> -	ret = netif_rx(nskb);
> +	ret = __netif_rx(nskb);
>   	macvlan_count_rx(vlan, len, ret == NET_RX_SUCCESS, false);
>   }
>   
> @@ -468,7 +468,7 @@ static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
>   			/* forward to original port. */
>   			vlan = src;
>   			ret = macvlan_broadcast_one(skb, vlan, eth, 0) ?:
> -			      netif_rx(skb);
> +			      __netif_rx(skb);
>   			handle_res = RX_HANDLER_CONSUMED;
>   			goto out;
>   		}
> diff --git a/drivers/net/mhi_net.c b/drivers/net/mhi_net.c
> index aaa628f859fd4..0b1b6f650104b 100644
> --- a/drivers/net/mhi_net.c
> +++ b/drivers/net/mhi_net.c
> @@ -225,7 +225,7 @@ static void mhi_net_dl_callback(struct mhi_device *mhi_dev,
>   		u64_stats_inc(&mhi_netdev->stats.rx_packets);
>   		u64_stats_add(&mhi_netdev->stats.rx_bytes, skb->len);
>   		u64_stats_update_end(&mhi_netdev->stats.rx_syncp);
> -		netif_rx(skb);
> +		__netif_rx(skb);
>   	}
>   
>   	/* Refill if RX buffers queue becomes low */
> diff --git a/drivers/net/ntb_netdev.c b/drivers/net/ntb_netdev.c
> index 98ca6b18415e7..80bdc07f2cd33 100644
> --- a/drivers/net/ntb_netdev.c
> +++ b/drivers/net/ntb_netdev.c
> @@ -119,7 +119,7 @@ static void ntb_netdev_rx_handler(struct ntb_transport_qp *qp, void *qp_data,
>   	skb->protocol = eth_type_trans(skb, ndev);
>   	skb->ip_summed = CHECKSUM_NONE;
>   
> -	if (netif_rx(skb) == NET_RX_DROP) {
> +	if (__netif_rx(skb) == NET_RX_DROP) {
>   		ndev->stats.rx_errors++;
>   		ndev->stats.rx_dropped++;
>   	} else {
> diff --git a/drivers/net/rionet.c b/drivers/net/rionet.c
> index 1a95f3beb784d..39e61e07e4894 100644
> --- a/drivers/net/rionet.c
> +++ b/drivers/net/rionet.c
> @@ -109,7 +109,7 @@ static int rionet_rx_clean(struct net_device *ndev)
>   		skb_put(rnet->rx_skb[i], RIO_MAX_MSG_SIZE);
>   		rnet->rx_skb[i]->protocol =
>   		    eth_type_trans(rnet->rx_skb[i], ndev);
> -		error = netif_rx(rnet->rx_skb[i]);
> +		error = __netif_rx(rnet->rx_skb[i]);
>   
>   		if (error == NET_RX_DROP) {
>   			ndev->stats.rx_dropped++;
> diff --git a/drivers/net/sb1000.c b/drivers/net/sb1000.c
> index 57a6d598467b2..c3f8020571add 100644
> --- a/drivers/net/sb1000.c
> +++ b/drivers/net/sb1000.c
> @@ -872,7 +872,7 @@ printk("cm0: IP identification: %02x%02x  fragment offset: %02x%02x\n", buffer[3
>   
>   	/* datagram completed: send to upper level */
>   	skb_trim(skb, dlen);
> -	netif_rx(skb);
> +	__netif_rx(skb);
>   	stats->rx_bytes+=dlen;
>   	stats->rx_packets++;
>   	lp->rx_skb[ns] = NULL;
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index 354a963075c5f..6754fb8d9fabc 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -286,7 +286,7 @@ static int veth_forward_skb(struct net_device *dev, struct sk_buff *skb,
>   {
>   	return __dev_forward_skb(dev, skb) ?: xdp ?
>   		veth_xdp_rx(rq, skb) :
> -		netif_rx(skb);
> +		__netif_rx(skb);
>   }
>   
>   /* return true if the specified skb has chances of GRO aggregation
> diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
> index e0b1ab99a359e..714cafcf6c6c8 100644
> --- a/drivers/net/vrf.c
> +++ b/drivers/net/vrf.c
> @@ -418,7 +418,7 @@ static int vrf_local_xmit(struct sk_buff *skb, struct net_device *dev,
>   
>   	skb->protocol = eth_type_trans(skb, dev);
>   
> -	if (likely(netif_rx(skb) == NET_RX_SUCCESS))
> +	if (likely(__netif_rx(skb) == NET_RX_SUCCESS))
>   		vrf_rx_stats(dev, len);
>   	else
>   		this_cpu_inc(dev->dstats->rx_drps);
> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index 359d16780dbbc..d0dc90d3dac28 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -2541,7 +2541,7 @@ static void vxlan_encap_bypass(struct sk_buff *skb, struct vxlan_dev *src_vxlan,
>   	tx_stats->tx_bytes += len;
>   	u64_stats_update_end(&tx_stats->syncp);
>   
> -	if (netif_rx(skb) == NET_RX_SUCCESS) {
> +	if (__netif_rx(skb) == NET_RX_SUCCESS) {
>   		u64_stats_update_begin(&rx_stats->syncp);
>   		rx_stats->rx_packets++;
>   		rx_stats->rx_bytes += len;
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index e490b84732d16..c9e883104adb1 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3669,8 +3669,18 @@ u32 bpf_prog_run_generic_xdp(struct sk_buff *skb, struct xdp_buff *xdp,
>   void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog);
>   int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb);
>   int netif_rx(struct sk_buff *skb);
> -int netif_rx_ni(struct sk_buff *skb);
> -int netif_rx_any_context(struct sk_buff *skb);
> +int __netif_rx(struct sk_buff *skb);
> +
> +static inline int netif_rx_ni(struct sk_buff *skb)
> +{
> +	return netif_rx(skb);
> +}
> +
> +static inline int netif_rx_any_context(struct sk_buff *skb)
> +{
> +	return netif_rx(skb);
> +}
> +
>   int netif_receive_skb(struct sk_buff *skb);
>   int netif_receive_skb_core(struct sk_buff *skb);
>   void netif_receive_skb_list_internal(struct list_head *head);
> diff --git a/include/trace/events/net.h b/include/trace/events/net.h
> index 78c448c6ab4c5..032b431b987b6 100644
> --- a/include/trace/events/net.h
> +++ b/include/trace/events/net.h
> @@ -260,13 +260,6 @@ DEFINE_EVENT(net_dev_rx_verbose_template, netif_rx_entry,
>   	TP_ARGS(skb)
>   );
>   
> -DEFINE_EVENT(net_dev_rx_verbose_template, netif_rx_ni_entry,
> -
> -	TP_PROTO(const struct sk_buff *skb),
> -
> -	TP_ARGS(skb)
> -);
> -
>   DECLARE_EVENT_CLASS(net_dev_rx_exit_template,
>   
>   	TP_PROTO(int ret),
> @@ -312,13 +305,6 @@ DEFINE_EVENT(net_dev_rx_exit_template, netif_rx_exit,
>   	TP_ARGS(ret)
>   );
>   
> -DEFINE_EVENT(net_dev_rx_exit_template, netif_rx_ni_exit,
> -
> -	TP_PROTO(int ret),
> -
> -	TP_ARGS(ret)
> -);
> -
>   DEFINE_EVENT(net_dev_rx_exit_template, netif_receive_skb_list_exit,
>   
>   	TP_PROTO(int ret),
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 0d13340ed4054..7d90f565e540a 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4815,66 +4815,57 @@ static int netif_rx_internal(struct sk_buff *skb)
>   	return ret;
>   }
>   
> +/**
> + *	__netif_rx	-	Slightly optimized version of netif_rx
> + *	@skb: buffer to post
> + *
> + *	This behaves as netif_rx except that it does not disable bottom halves.
> + *	As a result this function may only be invoked from the interrupt context
> + *	(either hard or soft interrupt).
> + */
> +int __netif_rx(struct sk_buff *skb)
> +{
> +	int ret;
> +
> +	lockdep_assert_once(hardirq_count() | softirq_count());
> +
> +	trace_netif_rx_entry(skb);
> +	ret = netif_rx_internal(skb);
> +	trace_netif_rx_exit(ret);
> +	return ret;
> +}
> +EXPORT_SYMBOL(__netif_rx);
> +
>   /**
>    *	netif_rx	-	post buffer to the network code
>    *	@skb: buffer to post
>    *
>    *	This function receives a packet from a device driver and queues it for
> - *	the upper (protocol) levels to process.  It always succeeds. The buffer
> - *	may be dropped during processing for congestion control or by the
> - *	protocol layers.
> + *	the upper (protocol) levels to process via the backlog NAPI device. It
> + *	always succeeds. The buffer may be dropped during processing for
> + *	congestion control or by the protocol layers.
> + *	The network buffer is passed via the backlog NAPI device. Modern NIC
> + *	driver should use NAPI and GRO.
> + *	This function can used from any context.
>    *
>    *	return values:
>    *	NET_RX_SUCCESS	(no congestion)
>    *	NET_RX_DROP     (packet was dropped)
>    *
>    */
> -
>   int netif_rx(struct sk_buff *skb)
>   {
>   	int ret;
>   
> +	local_bh_disable();
>   	trace_netif_rx_entry(skb);
> -
>   	ret = netif_rx_internal(skb);
>   	trace_netif_rx_exit(ret);
> -
> +	local_bh_enable();
>   	return ret;
>   }
>   EXPORT_SYMBOL(netif_rx);
>   
> -int netif_rx_ni(struct sk_buff *skb)
> -{
> -	int err;
> -
> -	trace_netif_rx_ni_entry(skb);
> -
> -	preempt_disable();
> -	err = netif_rx_internal(skb);
> -	if (local_softirq_pending())
> -		do_softirq();
> -	preempt_enable();
> -	trace_netif_rx_ni_exit(err);
> -
> -	return err;
> -}
> -EXPORT_SYMBOL(netif_rx_ni);
> -
> -int netif_rx_any_context(struct sk_buff *skb)
> -{
> -	/*
> -	 * If invoked from contexts which do not invoke bottom half
> -	 * processing either at return from interrupt or when softrqs are
> -	 * reenabled, use netif_rx_ni() which invokes bottomhalf processing
> -	 * directly.
> -	 */
> -	if (in_interrupt())
> -		return netif_rx(skb);
> -	else
> -		return netif_rx_ni(skb);
> -}
> -EXPORT_SYMBOL(netif_rx_any_context);
> -
>   static __latent_entropy void net_tx_action(struct softirq_action *h)
>   {
>   	struct softnet_data *sd = this_cpu_ptr(&softnet_data);

Best regards
Geert Uytterhoeven Feb. 22, 2022, 3:30 p.m. UTC | #2
Hi Sebastian,

On Wed, 16 Feb 2022, Marek Szyprowski wrote:
> On 12.02.2022 00:38, Sebastian Andrzej Siewior wrote:
>> Dave suggested a while ago (eleven years by now) "Let's make netif_rx()
>> work in all contexts and get rid of netif_rx_ni()". Eric agreed and
>> pointed out that modern devices should use netif_receive_skb() to avoid
>> the overhead.
>> In the meantime someone added another variant, netif_rx_any_context(),
>> which behaves as suggested.
>>
>> netif_rx() must be invoked with disabled bottom halves to ensure that
>> pending softirqs, which were raised within the function, are handled.
>> netif_rx_ni() can be invoked only from process context (bottom halves
>> must be enabled) because the function handles pending softirqs without
>> checking if bottom halves were disabled or not.
>> netif_rx_any_context() invokes on the former functions by checking
>> in_interrupts().
>>
>> netif_rx() could be taught to handle both cases (disabled and enabled
>> bottom halves) by simply disabling bottom halves while invoking
>> netif_rx_internal(). The local_bh_enable() invocation will then invoke
>> pending softirqs only if the BH-disable counter drops to zero.
>>
>> Eric is concerned about the overhead of BH-disable+enable especially in
>> regard to the loopback driver. As critical as this driver is, it will
>> receive a shortcut to avoid the additional overhead which is not needed.
>>
>> Add a local_bh_disable() section in netif_rx() to ensure softirqs are
>> handled if needed.
>> Provide __netif_rx() which does not disable BH and has a lockdep assert
>> to ensure that interrupts are disabled. Use this shortcut in the
>> loopback driver and in drivers/net/*.c.
>> Make netif_rx_ni() and netif_rx_any_context() invoke netif_rx() so they
>> can be removed once they are no more users left.
>>
>> Link: https://lkml.kernel.org/r/20100415.020246.218622820.davem@davemloft.net
>> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>> Reviewed-by: Eric Dumazet <edumazet@google.com>
>> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
>
> This patch landed in linux-next 20220215 as commit baebdf48c360 ("net:
> dev: Makes sure netif_rx() can be invoked in any context."). I found
> that it triggers the following warning on my test systems with USB CDC
> ethernet gadget:
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 876 at kernel/softirq.c:308
> __local_bh_disable_ip+0xbc/0xc0

Similar on rbtx4927 (CONFIG_NE2000=y), where I'm getting a slightly
different warning:

     Sending DHCP requests .
     ------------[ cut here ]------------
     WARNING: CPU: 0 PID: 0 at kernel/softirq.c:362 __local_bh_enable_ip+0x4c/0xc0
     Modules linked in:
     CPU: 0 PID: 0 Comm: swapper Not tainted 5.17.0-rc5-rbtx4927-00770-ga8ca72253967 #300
     Stack : 9800000000b80800 0000000000000008 0000000000000000 a5ba96d4be38c8b0
 	    0000000000000000 9800000000813c10 ffffffff80468188 9800000000813a90
 	    0000000000000001 9800000000813ab0 0000000000000000 20746f4e20726570
 	    0000000000000010 ffffffff802c1400 ffffffff8054ce76 722d302e37312e35
 	    0000000000000000 0000000000000000 0000000000000009 0000000000000000
 	    98000000008bfd40 000000000000004c 0000000006020283 0000000006020287
 	    0000000000000000 0000000000000000 0000000000000000 ffffffff80540000
 	    ffffffff804b8000 9800000000813c10 9800000000b80800 ffffffff801238bc
 	    0000000000000000 ffffffff80470000 0000000000000000 0000000000000009
 	    0000000000000000 ffffffff80108738 0000000000000000 ffffffff801238bc
 	    ...
     Call Trace:
     [<ffffffff80108738>] show_stack+0x68/0xf4
     [<ffffffff801238bc>] __warn+0xc0/0xf0
     [<ffffffff80123964>] warn_slowpath_fmt+0x78/0x94
     [<ffffffff80126408>] __local_bh_enable_ip+0x4c/0xc0
     [<ffffffff80341754>] netif_rx+0x20/0x30
     [<ffffffff8031d870>] ei_receive+0x2f0/0x36c
     [<ffffffff8031e624>] eip_interrupt+0x2dc/0x36c
     [<ffffffff8014f488>] __handle_irq_event_percpu+0x8c/0x134
     [<ffffffff8014f548>] handle_irq_event_percpu+0x18/0x60
     [<ffffffff8014f5c8>] handle_irq_event+0x38/0x60
     [<ffffffff80152008>] handle_level_irq+0x80/0xbc
     [<ffffffff8014eecc>] handle_irq_desc+0x24/0x3c
     [<ffffffff804014b8>] do_IRQ+0x18/0x24
     [<ffffffff801031b0>] handle_int+0x148/0x154
     [<ffffffff80104e18>] arch_local_irq_enable+0x18/0x24
     [<ffffffff8040148c>] default_idle_call+0x2c/0x3c
     [<ffffffff801445d0>] do_idle+0xcc/0x104
     [<ffffffff80144620>] cpu_startup_entry+0x18/0x20
     [<ffffffff80508e34>] start_kernel+0x6f4/0x738

     ---[ end trace 0000000000000000 ]---
     , OK
     IP-Config: Got DHCP answer from a.b.c.d, my address is a.b.c.e
     IP-Config: Complete:

Reverting baebdf48c3600807 ("net: dev: Makes sure netif_rx() can be
invoked in any context.") fixes the issue for me.

Gr{oetje,eeting}s,

 						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
 							    -- Linus Torvalds
Sebastian Andrzej Siewior Feb. 22, 2022, 4:13 p.m. UTC | #3
On 2022-02-22 16:30:37 [+0100], Geert Uytterhoeven wrote:
> 	Hi Sebastian,

Hi Geert,

> Similar on rbtx4927 (CONFIG_NE2000=y), where I'm getting a slightly
> different warning:

Based on the backtrace the patch in
   https://lore.kernel.org/all/Yg05duINKBqvnxUc@linutronix.de/

should fix it, right?

Sebastian
Geert Uytterhoeven Feb. 23, 2022, 7:56 a.m. UTC | #4
Hi Sebastian,

On Tue, Feb 22, 2022 at 5:13 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> On 2022-02-22 16:30:37 [+0100], Geert Uytterhoeven wrote:
> > Similar on rbtx4927 (CONFIG_NE2000=y), where I'm getting a slightly
> > different warning:
>
> Based on the backtrace the patch in
>    https://lore.kernel.org/all/Yg05duINKBqvnxUc@linutronix.de/
>
> should fix it, right?

Indeed, R-b provided in that thread.
Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
diff mbox series

Patch

diff --git a/drivers/net/amt.c b/drivers/net/amt.c
index f1a36d7e2151c..10455c9b9da0e 100644
--- a/drivers/net/amt.c
+++ b/drivers/net/amt.c
@@ -2373,7 +2373,7 @@  static bool amt_membership_query_handler(struct amt_dev *amt,
 	skb->pkt_type = PACKET_MULTICAST;
 	skb->ip_summed = CHECKSUM_NONE;
 	len = skb->len;
-	if (netif_rx(skb) == NET_RX_SUCCESS) {
+	if (__netif_rx(skb) == NET_RX_SUCCESS) {
 		amt_update_gw_status(amt, AMT_STATUS_RECEIVED_QUERY, true);
 		dev_sw_netstats_rx_add(amt->dev, len);
 	} else {
@@ -2470,7 +2470,7 @@  static bool amt_update_handler(struct amt_dev *amt, struct sk_buff *skb)
 	skb->pkt_type = PACKET_MULTICAST;
 	skb->ip_summed = CHECKSUM_NONE;
 	len = skb->len;
-	if (netif_rx(skb) == NET_RX_SUCCESS) {
+	if (__netif_rx(skb) == NET_RX_SUCCESS) {
 		amt_update_relay_status(tunnel, AMT_STATUS_RECEIVED_UPDATE,
 					true);
 		dev_sw_netstats_rx_add(amt->dev, len);
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index c1fdd721a730d..a895ff756093a 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -925,7 +925,7 @@  static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 		}
 
 		skb->protocol = eth_type_trans(skb, geneve->dev);
-		netif_rx(skb);
+		__netif_rx(skb);
 		dst_release(&rt->dst);
 		return -EMSGSIZE;
 	}
@@ -1021,7 +1021,7 @@  static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 		}
 
 		skb->protocol = eth_type_trans(skb, geneve->dev);
-		netif_rx(skb);
+		__netif_rx(skb);
 		dst_release(dst);
 		return -EMSGSIZE;
 	}
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 24e5c54d06c15..bf087171bcf04 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -207,7 +207,7 @@  static int gtp_rx(struct pdp_ctx *pctx, struct sk_buff *skb,
 
 	dev_sw_netstats_rx_add(pctx->dev, skb->len);
 
-	netif_rx(skb);
+	__netif_rx(skb);
 	return 0;
 
 err:
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index ed0edf5884ef8..d05f86fe78c95 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -78,7 +78,7 @@  static netdev_tx_t loopback_xmit(struct sk_buff *skb,
 
 	skb_orphan(skb);
 
-	/* Before queueing this packet to netif_rx(),
+	/* Before queueing this packet to __netif_rx(),
 	 * make sure dst is refcounted.
 	 */
 	skb_dst_force(skb);
@@ -86,7 +86,7 @@  static netdev_tx_t loopback_xmit(struct sk_buff *skb,
 	skb->protocol = eth_type_trans(skb, dev);
 
 	len = skb->len;
-	if (likely(netif_rx(skb) == NET_RX_SUCCESS))
+	if (likely(__netif_rx(skb) == NET_RX_SUCCESS))
 		dev_lstats_add(dev, len);
 
 	return NETDEV_TX_OK;
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index 3d08743317634..832f09ac075e7 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -1033,7 +1033,7 @@  static enum rx_handler_result handle_not_macsec(struct sk_buff *skb)
 				else
 					nskb->pkt_type = PACKET_MULTICAST;
 
-				netif_rx(nskb);
+				__netif_rx(nskb);
 			}
 			continue;
 		}
@@ -1056,7 +1056,7 @@  static enum rx_handler_result handle_not_macsec(struct sk_buff *skb)
 
 		nskb->dev = ndev;
 
-		if (netif_rx(nskb) == NET_RX_SUCCESS) {
+		if (__netif_rx(nskb) == NET_RX_SUCCESS) {
 			u64_stats_update_begin(&secy_stats->syncp);
 			secy_stats->stats.InPktsUntagged++;
 			u64_stats_update_end(&secy_stats->syncp);
@@ -1288,7 +1288,7 @@  static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb)
 
 		macsec_reset_skb(nskb, macsec->secy.netdev);
 
-		ret = netif_rx(nskb);
+		ret = __netif_rx(nskb);
 		if (ret == NET_RX_SUCCESS) {
 			u64_stats_update_begin(&secy_stats->syncp);
 			secy_stats->stats.InPktsUnknownSCI++;
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 6ef5f77be4d0a..d87c06c317ede 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -410,7 +410,7 @@  static void macvlan_forward_source_one(struct sk_buff *skb,
 	if (ether_addr_equal_64bits(eth_hdr(skb)->h_dest, dev->dev_addr))
 		nskb->pkt_type = PACKET_HOST;
 
-	ret = netif_rx(nskb);
+	ret = __netif_rx(nskb);
 	macvlan_count_rx(vlan, len, ret == NET_RX_SUCCESS, false);
 }
 
@@ -468,7 +468,7 @@  static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
 			/* forward to original port. */
 			vlan = src;
 			ret = macvlan_broadcast_one(skb, vlan, eth, 0) ?:
-			      netif_rx(skb);
+			      __netif_rx(skb);
 			handle_res = RX_HANDLER_CONSUMED;
 			goto out;
 		}
diff --git a/drivers/net/mhi_net.c b/drivers/net/mhi_net.c
index aaa628f859fd4..0b1b6f650104b 100644
--- a/drivers/net/mhi_net.c
+++ b/drivers/net/mhi_net.c
@@ -225,7 +225,7 @@  static void mhi_net_dl_callback(struct mhi_device *mhi_dev,
 		u64_stats_inc(&mhi_netdev->stats.rx_packets);
 		u64_stats_add(&mhi_netdev->stats.rx_bytes, skb->len);
 		u64_stats_update_end(&mhi_netdev->stats.rx_syncp);
-		netif_rx(skb);
+		__netif_rx(skb);
 	}
 
 	/* Refill if RX buffers queue becomes low */
diff --git a/drivers/net/ntb_netdev.c b/drivers/net/ntb_netdev.c
index 98ca6b18415e7..80bdc07f2cd33 100644
--- a/drivers/net/ntb_netdev.c
+++ b/drivers/net/ntb_netdev.c
@@ -119,7 +119,7 @@  static void ntb_netdev_rx_handler(struct ntb_transport_qp *qp, void *qp_data,
 	skb->protocol = eth_type_trans(skb, ndev);
 	skb->ip_summed = CHECKSUM_NONE;
 
-	if (netif_rx(skb) == NET_RX_DROP) {
+	if (__netif_rx(skb) == NET_RX_DROP) {
 		ndev->stats.rx_errors++;
 		ndev->stats.rx_dropped++;
 	} else {
diff --git a/drivers/net/rionet.c b/drivers/net/rionet.c
index 1a95f3beb784d..39e61e07e4894 100644
--- a/drivers/net/rionet.c
+++ b/drivers/net/rionet.c
@@ -109,7 +109,7 @@  static int rionet_rx_clean(struct net_device *ndev)
 		skb_put(rnet->rx_skb[i], RIO_MAX_MSG_SIZE);
 		rnet->rx_skb[i]->protocol =
 		    eth_type_trans(rnet->rx_skb[i], ndev);
-		error = netif_rx(rnet->rx_skb[i]);
+		error = __netif_rx(rnet->rx_skb[i]);
 
 		if (error == NET_RX_DROP) {
 			ndev->stats.rx_dropped++;
diff --git a/drivers/net/sb1000.c b/drivers/net/sb1000.c
index 57a6d598467b2..c3f8020571add 100644
--- a/drivers/net/sb1000.c
+++ b/drivers/net/sb1000.c
@@ -872,7 +872,7 @@  printk("cm0: IP identification: %02x%02x  fragment offset: %02x%02x\n", buffer[3
 
 	/* datagram completed: send to upper level */
 	skb_trim(skb, dlen);
-	netif_rx(skb);
+	__netif_rx(skb);
 	stats->rx_bytes+=dlen;
 	stats->rx_packets++;
 	lp->rx_skb[ns] = NULL;
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 354a963075c5f..6754fb8d9fabc 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -286,7 +286,7 @@  static int veth_forward_skb(struct net_device *dev, struct sk_buff *skb,
 {
 	return __dev_forward_skb(dev, skb) ?: xdp ?
 		veth_xdp_rx(rq, skb) :
-		netif_rx(skb);
+		__netif_rx(skb);
 }
 
 /* return true if the specified skb has chances of GRO aggregation
diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index e0b1ab99a359e..714cafcf6c6c8 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -418,7 +418,7 @@  static int vrf_local_xmit(struct sk_buff *skb, struct net_device *dev,
 
 	skb->protocol = eth_type_trans(skb, dev);
 
-	if (likely(netif_rx(skb) == NET_RX_SUCCESS))
+	if (likely(__netif_rx(skb) == NET_RX_SUCCESS))
 		vrf_rx_stats(dev, len);
 	else
 		this_cpu_inc(dev->dstats->rx_drps);
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 359d16780dbbc..d0dc90d3dac28 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2541,7 +2541,7 @@  static void vxlan_encap_bypass(struct sk_buff *skb, struct vxlan_dev *src_vxlan,
 	tx_stats->tx_bytes += len;
 	u64_stats_update_end(&tx_stats->syncp);
 
-	if (netif_rx(skb) == NET_RX_SUCCESS) {
+	if (__netif_rx(skb) == NET_RX_SUCCESS) {
 		u64_stats_update_begin(&rx_stats->syncp);
 		rx_stats->rx_packets++;
 		rx_stats->rx_bytes += len;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e490b84732d16..c9e883104adb1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3669,8 +3669,18 @@  u32 bpf_prog_run_generic_xdp(struct sk_buff *skb, struct xdp_buff *xdp,
 void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog);
 int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb);
 int netif_rx(struct sk_buff *skb);
-int netif_rx_ni(struct sk_buff *skb);
-int netif_rx_any_context(struct sk_buff *skb);
+int __netif_rx(struct sk_buff *skb);
+
+static inline int netif_rx_ni(struct sk_buff *skb)
+{
+	return netif_rx(skb);
+}
+
+static inline int netif_rx_any_context(struct sk_buff *skb)
+{
+	return netif_rx(skb);
+}
+
 int netif_receive_skb(struct sk_buff *skb);
 int netif_receive_skb_core(struct sk_buff *skb);
 void netif_receive_skb_list_internal(struct list_head *head);
diff --git a/include/trace/events/net.h b/include/trace/events/net.h
index 78c448c6ab4c5..032b431b987b6 100644
--- a/include/trace/events/net.h
+++ b/include/trace/events/net.h
@@ -260,13 +260,6 @@  DEFINE_EVENT(net_dev_rx_verbose_template, netif_rx_entry,
 	TP_ARGS(skb)
 );
 
-DEFINE_EVENT(net_dev_rx_verbose_template, netif_rx_ni_entry,
-
-	TP_PROTO(const struct sk_buff *skb),
-
-	TP_ARGS(skb)
-);
-
 DECLARE_EVENT_CLASS(net_dev_rx_exit_template,
 
 	TP_PROTO(int ret),
@@ -312,13 +305,6 @@  DEFINE_EVENT(net_dev_rx_exit_template, netif_rx_exit,
 	TP_ARGS(ret)
 );
 
-DEFINE_EVENT(net_dev_rx_exit_template, netif_rx_ni_exit,
-
-	TP_PROTO(int ret),
-
-	TP_ARGS(ret)
-);
-
 DEFINE_EVENT(net_dev_rx_exit_template, netif_receive_skb_list_exit,
 
 	TP_PROTO(int ret),
diff --git a/net/core/dev.c b/net/core/dev.c
index 0d13340ed4054..7d90f565e540a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4815,66 +4815,57 @@  static int netif_rx_internal(struct sk_buff *skb)
 	return ret;
 }
 
+/**
+ *	__netif_rx	-	Slightly optimized version of netif_rx
+ *	@skb: buffer to post
+ *
+ *	This behaves as netif_rx except that it does not disable bottom halves.
+ *	As a result this function may only be invoked from the interrupt context
+ *	(either hard or soft interrupt).
+ */
+int __netif_rx(struct sk_buff *skb)
+{
+	int ret;
+
+	lockdep_assert_once(hardirq_count() | softirq_count());
+
+	trace_netif_rx_entry(skb);
+	ret = netif_rx_internal(skb);
+	trace_netif_rx_exit(ret);
+	return ret;
+}
+EXPORT_SYMBOL(__netif_rx);
+
 /**
  *	netif_rx	-	post buffer to the network code
  *	@skb: buffer to post
  *
  *	This function receives a packet from a device driver and queues it for
- *	the upper (protocol) levels to process.  It always succeeds. The buffer
- *	may be dropped during processing for congestion control or by the
- *	protocol layers.
+ *	the upper (protocol) levels to process via the backlog NAPI device. It
+ *	always succeeds. The buffer may be dropped during processing for
+ *	congestion control or by the protocol layers.
+ *	The network buffer is passed via the backlog NAPI device. Modern NIC
+ *	driver should use NAPI and GRO.
+ *	This function can used from any context.
  *
  *	return values:
  *	NET_RX_SUCCESS	(no congestion)
  *	NET_RX_DROP     (packet was dropped)
  *
  */
-
 int netif_rx(struct sk_buff *skb)
 {
 	int ret;
 
+	local_bh_disable();
 	trace_netif_rx_entry(skb);
-
 	ret = netif_rx_internal(skb);
 	trace_netif_rx_exit(ret);
-
+	local_bh_enable();
 	return ret;
 }
 EXPORT_SYMBOL(netif_rx);
 
-int netif_rx_ni(struct sk_buff *skb)
-{
-	int err;
-
-	trace_netif_rx_ni_entry(skb);
-
-	preempt_disable();
-	err = netif_rx_internal(skb);
-	if (local_softirq_pending())
-		do_softirq();
-	preempt_enable();
-	trace_netif_rx_ni_exit(err);
-
-	return err;
-}
-EXPORT_SYMBOL(netif_rx_ni);
-
-int netif_rx_any_context(struct sk_buff *skb)
-{
-	/*
-	 * If invoked from contexts which do not invoke bottom half
-	 * processing either at return from interrupt or when softrqs are
-	 * reenabled, use netif_rx_ni() which invokes bottomhalf processing
-	 * directly.
-	 */
-	if (in_interrupt())
-		return netif_rx(skb);
-	else
-		return netif_rx_ni(skb);
-}
-EXPORT_SYMBOL(netif_rx_any_context);
-
 static __latent_entropy void net_tx_action(struct softirq_action *h)
 {
 	struct softnet_data *sd = this_cpu_ptr(&softnet_data);