Message ID | 20240409062407.1952728-1-lei.chen@smartx.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net:tun: limit printing rate when illegal packet received by tun dev | expand |
On Tue, Apr 09, 2024 at 02:24:05AM -0400, Lei Chen wrote: > vhost_worker will call tun call backs to receive packets. If too many > illegal packets arrives, tun_do_read will keep dumping packet contents. > When console is enabled, it will costs much more cpu time to dump > packet and soft lockup will be detected. > > Rate limit mechanism can be used to limit the dumping rate. > @@ -2125,14 +2126,16 @@ static ssize_t tun_put_user(struct tun_struct *tun, > tun_is_little_endian(tun), true, > vlan_hlen)) { > struct skb_shared_info *sinfo = skb_shinfo(skb); > - pr_err("unexpected GSO type: " > - "0x%x, gso_size %d, hdr_len %d\n", > - sinfo->gso_type, tun16_to_cpu(tun, gso.gso_size), > - tun16_to_cpu(tun, gso.hdr_len)); > - print_hex_dump(KERN_ERR, "tun: ", > - DUMP_PREFIX_NONE, > - 16, 1, skb->head, > - min((int)tun16_to_cpu(tun, gso.hdr_len), 64), true); > + > + if (__ratelimit(&ratelimit)) { Maybe just use net_ratelimit() rather than add a new ratelimit variable? A separate issue, i wounder if rather than pr_err(), netdev_err(tun->dev, ...) should be used to indicate which TUN device has been given bad GSO packets? Andrew
On Tue, 9 Apr 2024 02:24:05 -0400 Lei Chen wrote:
> --- <NMI exception stack> ---
You need to indent this line with a space, otherwise
git am will cut off the commit message here.
On Tue, Apr 9, 2024 at 8:52 PM Andrew Lunn <andrew@lunn.ch> wrote: > > On Tue, Apr 09, 2024 at 02:24:05AM -0400, Lei Chen wrote: > > vhost_worker will call tun call backs to receive packets. If too many > > illegal packets arrives, tun_do_read will keep dumping packet contents. > > When console is enabled, it will costs much more cpu time to dump > > packet and soft lockup will be detected. > > > > Rate limit mechanism can be used to limit the dumping rate. > > @@ -2125,14 +2126,16 @@ static ssize_t tun_put_user(struct tun_struct *tun, > > tun_is_little_endian(tun), true, > > vlan_hlen)) { > > struct skb_shared_info *sinfo = skb_shinfo(skb); > > - pr_err("unexpected GSO type: " > > - "0x%x, gso_size %d, hdr_len %d\n", > > - sinfo->gso_type, tun16_to_cpu(tun, gso.gso_size), > > - tun16_to_cpu(tun, gso.hdr_len)); > > - print_hex_dump(KERN_ERR, "tun: ", > > - DUMP_PREFIX_NONE, > > - 16, 1, skb->head, > > - min((int)tun16_to_cpu(tun, gso.hdr_len), 64), true); > > + > > + if (__ratelimit(&ratelimit)) { > > Maybe just use net_ratelimit() rather than add a new ratelimit > variable? Thanks for your suggestion, net_ratelimit is a better way to make it. > > A separate issue, i wounder if rather than pr_err(), > netdev_err(tun->dev, ...) should be used to indicate which TUN device > has been given bad GSO packets? I got it, I'll remake the patch, thanks.
On Wed, Apr 10, 2024 at 9:25 AM Jakub Kicinski <kuba@kernel.org> wrote: > > On Tue, 9 Apr 2024 02:24:05 -0400 Lei Chen wrote: > > --- <NMI exception stack> --- > > You need to indent this line with a space, otherwise > git am will cut off the commit message here. Thanks for your reply, and I'll remake the patch. :)
On Wed, Apr 10, 2024 at 10:23 AM Lei Chen <lei.chen@smartx.com> wrote: > > On Tue, Apr 9, 2024 at 8:52 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > > On Tue, Apr 09, 2024 at 02:24:05AM -0400, Lei Chen wrote: > > > vhost_worker will call tun call backs to receive packets. If too many > > > illegal packets arrives, tun_do_read will keep dumping packet contents. > > > When console is enabled, it will costs much more cpu time to dump > > > packet and soft lockup will be detected. > > > > > > Rate limit mechanism can be used to limit the dumping rate. > > > @@ -2125,14 +2126,16 @@ static ssize_t tun_put_user(struct tun_struct *tun, > > > tun_is_little_endian(tun), true, > > > vlan_hlen)) { > > > struct skb_shared_info *sinfo = skb_shinfo(skb); > > > - pr_err("unexpected GSO type: " > > > - "0x%x, gso_size %d, hdr_len %d\n", > > > - sinfo->gso_type, tun16_to_cpu(tun, gso.gso_size), > > > - tun16_to_cpu(tun, gso.hdr_len)); > > > - print_hex_dump(KERN_ERR, "tun: ", > > > - DUMP_PREFIX_NONE, > > > - 16, 1, skb->head, > > > - min((int)tun16_to_cpu(tun, gso.hdr_len), 64), true); > > > + > > > + if (__ratelimit(&ratelimit)) { > > > > Maybe just use net_ratelimit() rather than add a new ratelimit > > variable? > > Thanks for your suggestion, net_ratelimit is a better way to make it. +1 Thanks > > > > > A separate issue, i wounder if rather than pr_err(), > > netdev_err(tun->dev, ...) should be used to indicate which TUN device > > has been given bad GSO packets? > > I got it, I'll remake the patch, thanks. >
diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 0b3f21cba552..34c6b043764d 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -2087,6 +2087,7 @@ static ssize_t tun_put_user(struct tun_struct *tun, struct sk_buff *skb, struct iov_iter *iter) { + static DEFINE_RATELIMIT_STATE(ratelimit, 60 * HZ, 5); struct tun_pi pi = { 0, skb->protocol }; ssize_t total; int vlan_offset = 0; @@ -2125,14 +2126,16 @@ static ssize_t tun_put_user(struct tun_struct *tun, tun_is_little_endian(tun), true, vlan_hlen)) { struct skb_shared_info *sinfo = skb_shinfo(skb); - pr_err("unexpected GSO type: " - "0x%x, gso_size %d, hdr_len %d\n", - sinfo->gso_type, tun16_to_cpu(tun, gso.gso_size), - tun16_to_cpu(tun, gso.hdr_len)); - print_hex_dump(KERN_ERR, "tun: ", - DUMP_PREFIX_NONE, - 16, 1, skb->head, - min((int)tun16_to_cpu(tun, gso.hdr_len), 64), true); + + if (__ratelimit(&ratelimit)) { + pr_err("unexpected GSO type: 0x%x, gso_size %d, hdr_len %d\n", + sinfo->gso_type, tun16_to_cpu(tun, gso.gso_size), + tun16_to_cpu(tun, gso.hdr_len)); + print_hex_dump(KERN_ERR, "tun: ", + DUMP_PREFIX_NONE, + 16, 1, skb->head, + min((int)tun16_to_cpu(tun, gso.hdr_len), 64), true); + } WARN_ON_ONCE(1); return -EINVAL; }
vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected. Rate limit mechanism can be used to limit the dumping rate. PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e #3 [fffffe00003fced0] do_nmi at ffffffff8922660d #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 --- <NMI exception stack> --- #5 [ffffa655314979e8] io_serial_in at ffffffff89792594 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 Signed-off-by: Lei Chen <lei.chen@smartx.com> --- drivers/net/tun.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-)