Message ID | 20210209103209.482770-4-razor@blackwall.org (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | bonding: 3ad: support for 200G/400G ports and more verbose warning |

Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net-next |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | success | CCed 6 of 6 maintainers |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 15 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
netdev/stable | success | Stable not CCed |
On 09/02/2021 12:32, Nikolay Aleksandrov wrote:
> From: Ido Schimmel <idosch@nvidia.com>
>
> The bond driver needs to be patched to support new ethtool speeds.
> Currently it emits a single warning [1] when it encounters an unknown
> speed. As evident by the two previous patches, this is not explicit
> enough. Instead, use WARN_ONCE() to get a more verbose warning [2].
>
> [1]
> bond10: (slave swp1): unknown ethtool speed (200000) for port 1 (set it to 0)
>
> [2]
> bond20: (slave swp2): unknown ethtool speed (400000) for port 1 (set it to 0)
> WARNING: CPU: 5 PID: 96 at drivers/net/bonding/bond_3ad.c:317 __get_link_speed.isra.0+0x110/0x120
> Modules linked in:
> CPU: 5 PID: 96 Comm: kworker/u16:5 Not tainted 5.11.0-rc6-custom-02818-g69a767ec7302 #3243
> Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 01/06/2019
> Workqueue: bond20 bond_mii_monitor
> RIP: 0010:__get_link_speed.isra.0+0x110/0x120
> Code: 5b ff ff ff 52 4c 8b 4e 08 44 0f b7 c7 48 c7 c7 18 46 4a b8 48 8b 16 c6 05 d9 76 41 01 01 49 8b 31 89 44 24 04 e8 a2 8a 3f 00 <0f> 0b 8b 44 24 04 59 c3 0f 1f 84 00 00 00 00 00 48 85 ff 74 3b 53
> RSP: 0018:ffffb683c03afde0 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: ffff96bd3f2a9a38 RCX: 0000000000000000
> RDX: ffff96c06fd67560 RSI: ffff96c06fd57850 RDI: ffff96c06fd57850
> RBP: 0000000000000000 R08: ffffffffb8b49888 R09: 0000000000009ffb
> R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
> R13: ffff96bd3f2a9a38 R14: ffff96bd49c56400 R15: ffff96bd49c564f0
> FS: 0000000000000000(0000) GS:ffff96c06fd40000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f327ad804b0 CR3: 0000000142ad5006 CR4: 00000000003706e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> ad_update_actor_keys+0x36/0xc0
> bond_3ad_handle_link_change+0x5d/0xf0
> bond_mii_monitor.cold+0x1c2/0x1e8
> process_one_work+0x1c9/0x360
> worker_thread+0x48/0x3c0
> kthread+0x113/0x130
> ret_from_fork+0x1f/0x30
>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
> drivers/net/bonding/bond_3ad.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>

Oops, forgot to add my acked-by. :)

Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>

On Tue, Feb 9, 2021 at 2:42 AM Nikolay Aleksandrov <razor@blackwall.org> wrote:
>
> From: Ido Schimmel <idosch@nvidia.com>
>
> The bond driver needs to be patched to support new ethtool speeds.
> Currently it emits a single warning [1] when it encounters an unknown
> speed. As evident by the two previous patches, this is not explicit
> enough. Instead, use WARN_ONCE() to get a more verbose warning [2].
>
> [1]
> bond10: (slave swp1): unknown ethtool speed (200000) for port 1 (set it to 0)
>
> [2]
> bond20: (slave swp2): unknown ethtool speed (400000) for port 1 (set it to 0)
> WARNING: CPU: 5 PID: 96 at drivers/net/bonding/bond_3ad.c:317 __get_link_speed.isra.0+0x110/0x120
> Modules linked in:
> CPU: 5 PID: 96 Comm: kworker/u16:5 Not tainted 5.11.0-rc6-custom-02818-g69a767ec7302 #3243
> Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 01/06/2019
> Workqueue: bond20 bond_mii_monitor
> RIP: 0010:__get_link_speed.isra.0+0x110/0x120
> Code: 5b ff ff ff 52 4c 8b 4e 08 44 0f b7 c7 48 c7 c7 18 46 4a b8 48 8b 16 c6 05 d9 76 41 01 01 49 8b 31 89 44 24 04 e8 a2 8a 3f 00 <0f> 0b 8b 44 24 04 59 c3 0f 1f 84 00 00 00 00 00 48 85 ff 74 3b 53
> RSP: 0018:ffffb683c03afde0 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: ffff96bd3f2a9a38 RCX: 0000000000000000
> RDX: ffff96c06fd67560 RSI: ffff96c06fd57850 RDI: ffff96c06fd57850
> RBP: 0000000000000000 R08: ffffffffb8b49888 R09: 0000000000009ffb
> R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
> R13: ffff96bd3f2a9a38 R14: ffff96bd49c56400 R15: ffff96bd49c564f0
> FS: 0000000000000000(0000) GS:ffff96c06fd40000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f327ad804b0 CR3: 0000000142ad5006 CR4: 00000000003706e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> ad_update_actor_keys+0x36/0xc0
> bond_3ad_handle_link_change+0x5d/0xf0
> bond_mii_monitor.cold+0x1c2/0x1e8
> process_one_work+0x1c9/0x360
> worker_thread+0x48/0x3c0
> kthread+0x113/0x130
> ret_from_fork+0x1f/0x30
>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>

I'm not really sure making the warning consume more text is really going to solve the problem. I was actually much happier with just the first error as I don't need a stack trace. Just having the line is enough information for me to search and find the cause for the issue. Adding a backtrace is just overkill.

If we really think this is something that is important maybe we should move this up to an error instead of a warning. For example why not make this use pr_err_once, instead of pr_warn_once? It should make it more likely to be highlighted in the system log.

On Wed, Feb 10, 2021 at 11:44:31AM -0800, Alexander Duyck wrote:
> I'm not really sure making the warning consume more text is really
> going to solve the problem. I was actually much happier with just the
> first error as I don't need a stack trace. Just having the line is
> enough information for me to search and find the cause for the issue.
> Adding a backtrace is just overkill.
>
> If we really think this is something that is important maybe we should
> move this up to an error instead of a warning. For example why not
> make this use pr_err_once, instead of pr_warn_once? It should make it
> more likely to be highlighted in the system log.

Yea, I expected this comment. We are currently looking for patterns such as 'BUG', 'WARNING', 'BUG kmalloc', 'UBSAN' etc in regression. Mostly based on what syzkaller is doing [1] (which we are also running).

We can instead promote this warning to pr_err_once() and start looking at errors as well. It might uncover more issues / false positives.

[1] https://github.com/google/syzkaller/blob/42b90a7c596c2b7d8f8d034dff7d8c635631de5a/pkg/report/linux.go#L952

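To make the pattern-based regression check mentioned in this reply concrete, here is a minimal userspace sketch in C. It is not the actual regression tooling and not syzkaller's matcher (which is considerably more elaborate); the helper name `log_line_is_failure` and the plain substring matching are assumptions made for illustration. The pattern list is the one quoted in the reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Substrings treated as failures, taken from the reply above.
 * "BUG kmalloc" is listed there as well; with plain substring
 * matching it is already covered by "BUG". */
static const char *const failure_patterns[] = {
	"BUG",
	"WARNING",
	"UBSAN",
};

/* Hypothetical helper: returns true if a kernel log line contains
 * one of the failure patterns. */
static bool log_line_is_failure(const char *line)
{
	size_t n = sizeof(failure_patterns) / sizeof(failure_patterns[0]);

	assert(line != NULL);
	for (size_t i = 0; i < n; i++) {
		if (strstr(line, failure_patterns[i]) != NULL)
			return true;
	}
	return false;
}
```

With a matcher of this kind, the single pr_warn_once() line from [1] in the commit message does not match any pattern, while the "WARNING: CPU: ..." line produced by WARN_ONCE() does, which is the motivation given for the patch.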
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 2e670f68626d..460dc1bfc7a9 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -326,11 +326,10 @@ static u16 __get_link_speed(struct port *port)

 		default:
 			/* unknown speed value from ethtool. shouldn't happen */
-			if (slave->speed != SPEED_UNKNOWN)
-				pr_warn_once("%s: (slave %s): unknown ethtool speed (%d) for port %d (set it to 0)\n",
-					     slave->bond->dev->name,
-					     slave->dev->name, slave->speed,
-					     port->actor_port_number);
+			WARN_ONCE(slave->speed != SPEED_UNKNOWN,
+				  "%s: (slave %s): unknown ethtool speed (%d) for port %d (set it to 0)\n",
+				  slave->bond->dev->name, slave->dev->name,
+				  slave->speed, port->actor_port_number);
 			speed = 0;
 			break;
 		}
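
For readers unfamiliar with the semantics the patch relies on, the following is a simplified userspace imitation of the new default branch, not kernel code. `SKETCH_WARN_ONCE`, `SKETCH_SPEED_UNKNOWN`, `sketch_link_speed_key`, `warn_count`, and the return value 1 for a known speed are all hypothetical names and values chosen for illustration. The real WARN_ONCE() also dumps registers and a stack trace, as in [2] of the commit message. The macro uses GNU C extensions (statement expressions and `##__VA_ARGS__`), so it assumes GCC or Clang.

```c
#include <stdbool.h>
#include <stdio.h>

#define SKETCH_SPEED_UNKNOWN (-1) /* mirrors the kernel's SPEED_UNKNOWN */

static int warn_count; /* number of warnings emitted, for illustration */

/* Simplified stand-in for WARN_ONCE(): prints at most once per call
 * site when cond is true and evaluates to cond. */
#define SKETCH_WARN_ONCE(cond, fmt, ...)                              \
	({                                                            \
		static bool warned_;                                  \
		bool cond_ = (cond);                                  \
		if (cond_ && !warned_) {                              \
			warned_ = true;                               \
			warn_count++;                                 \
			fprintf(stderr, "WARNING: " fmt, ##__VA_ARGS__); \
		}                                                     \
		cond_;                                                \
	})

/* Hypothetical reduction of the switch in __get_link_speed(): known
 * speeds map to a non-zero key, anything else maps to 0. */
static int sketch_link_speed_key(int speed_mbps)
{
	switch (speed_mbps) {
	case 10000:
		return 1; /* placeholder key for a known speed */
	default:
		/* The condition moves into the macro, as in the patch:
		 * SPEED_UNKNOWN stays silent, any other value warns once. */
		(void)SKETCH_WARN_ONCE(speed_mbps != SKETCH_SPEED_UNKNOWN,
				       "unknown ethtool speed (%d) (set it to 0)\n",
				       speed_mbps);
		return 0;
	}
}
```

Calling `sketch_link_speed_key(200000)` and then `sketch_link_speed_key(400000)` emits a single warning, since both calls go through the same call site; passing `SKETCH_SPEED_UNKNOWN` emits none.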