
[net-next,3/3] bonding: 3ad: Use a more verbose warning for unknown speeds

Message ID 20210209103209.482770-4-razor@blackwall.org (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series bonding: 3ad: support for 200G/400G ports and more verbose warning

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success Link
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 15 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Nikolay Aleksandrov Feb. 9, 2021, 10:32 a.m. UTC
From: Ido Schimmel <idosch@nvidia.com>

The bond driver needs to be patched to support new ethtool speeds.
Currently it emits a single warning [1] when it encounters an unknown
speed. As evidenced by the two previous patches, this is not explicit
enough. Instead, use WARN_ONCE() to get a more verbose warning [2].

[1]
bond10: (slave swp1): unknown ethtool speed (200000) for port 1 (set it to 0)

[2]
bond20: (slave swp2): unknown ethtool speed (400000) for port 1 (set it to 0)
WARNING: CPU: 5 PID: 96 at drivers/net/bonding/bond_3ad.c:317 __get_link_speed.isra.0+0x110/0x120
Modules linked in:
CPU: 5 PID: 96 Comm: kworker/u16:5 Not tainted 5.11.0-rc6-custom-02818-g69a767ec7302 #3243
Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 01/06/2019
Workqueue: bond20 bond_mii_monitor
RIP: 0010:__get_link_speed.isra.0+0x110/0x120
Code: 5b ff ff ff 52 4c 8b 4e 08 44 0f b7 c7 48 c7 c7 18 46 4a b8 48 8b 16 c6 05 d9 76 41 01 01 49 8b 31 89 44 24 04 e8 a2 8a 3f 00 <0f> 0b 8b 44 24 04 59 c3 0f 1f 84 00 00 00 00 00 48 85 ff 74 3b 53
RSP: 0018:ffffb683c03afde0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff96bd3f2a9a38 RCX: 0000000000000000
RDX: ffff96c06fd67560 RSI: ffff96c06fd57850 RDI: ffff96c06fd57850
RBP: 0000000000000000 R08: ffffffffb8b49888 R09: 0000000000009ffb
R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
R13: ffff96bd3f2a9a38 R14: ffff96bd49c56400 R15: ffff96bd49c564f0
FS:  0000000000000000(0000) GS:ffff96c06fd40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f327ad804b0 CR3: 0000000142ad5006 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ad_update_actor_keys+0x36/0xc0
 bond_3ad_handle_link_change+0x5d/0xf0
 bond_mii_monitor.cold+0x1c2/0x1e8
 process_one_work+0x1c9/0x360
 worker_thread+0x48/0x3c0
 kthread+0x113/0x130
 ret_from_fork+0x1f/0x30

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/bonding/bond_3ad.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

Comments

Nikolay Aleksandrov Feb. 9, 2021, 10:40 a.m. UTC | #1
On 09/02/2021 12:32, Nikolay Aleksandrov wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> The bond driver needs to be patched to support new ethtool speeds.
> Currently it emits a single warning [1] when it encounters an unknown
> speed. As evidenced by the two previous patches, this is not explicit
> enough. Instead, use WARN_ONCE() to get a more verbose warning [2].
> 
> [1]
> bond10: (slave swp1): unknown ethtool speed (200000) for port 1 (set it to 0)
> 
> [2]
> bond20: (slave swp2): unknown ethtool speed (400000) for port 1 (set it to 0)
> WARNING: CPU: 5 PID: 96 at drivers/net/bonding/bond_3ad.c:317 __get_link_speed.isra.0+0x110/0x120
> Modules linked in:
> CPU: 5 PID: 96 Comm: kworker/u16:5 Not tainted 5.11.0-rc6-custom-02818-g69a767ec7302 #3243
> Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 01/06/2019
> Workqueue: bond20 bond_mii_monitor
> RIP: 0010:__get_link_speed.isra.0+0x110/0x120
> Code: 5b ff ff ff 52 4c 8b 4e 08 44 0f b7 c7 48 c7 c7 18 46 4a b8 48 8b 16 c6 05 d9 76 41 01 01 49 8b 31 89 44 24 04 e8 a2 8a 3f 00 <0f> 0b 8b 44 24 04 59 c3 0f 1f 84 00 00 00 00 00 48 85 ff 74 3b 53
> RSP: 0018:ffffb683c03afde0 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: ffff96bd3f2a9a38 RCX: 0000000000000000
> RDX: ffff96c06fd67560 RSI: ffff96c06fd57850 RDI: ffff96c06fd57850
> RBP: 0000000000000000 R08: ffffffffb8b49888 R09: 0000000000009ffb
> R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
> R13: ffff96bd3f2a9a38 R14: ffff96bd49c56400 R15: ffff96bd49c564f0
> FS:  0000000000000000(0000) GS:ffff96c06fd40000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f327ad804b0 CR3: 0000000142ad5006 CR4: 00000000003706e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  ad_update_actor_keys+0x36/0xc0
>  bond_3ad_handle_link_change+0x5d/0xf0
>  bond_mii_monitor.cold+0x1c2/0x1e8
>  process_one_work+0x1c9/0x360
>  worker_thread+0x48/0x3c0
>  kthread+0x113/0x130
>  ret_from_fork+0x1f/0x30
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/bonding/bond_3ad.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 

Oops, forgot to add my acked-by. :)
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Alexander Duyck Feb. 10, 2021, 7:44 p.m. UTC | #2
On Tue, Feb 9, 2021 at 2:42 AM Nikolay Aleksandrov <razor@blackwall.org> wrote:
>
> From: Ido Schimmel <idosch@nvidia.com>
>
> The bond driver needs to be patched to support new ethtool speeds.
> Currently it emits a single warning [1] when it encounters an unknown
> speed. As evidenced by the two previous patches, this is not explicit
> enough. Instead, use WARN_ONCE() to get a more verbose warning [2].
>
> [1]
> bond10: (slave swp1): unknown ethtool speed (200000) for port 1 (set it to 0)
>
> [2]
> bond20: (slave swp2): unknown ethtool speed (400000) for port 1 (set it to 0)
> WARNING: CPU: 5 PID: 96 at drivers/net/bonding/bond_3ad.c:317 __get_link_speed.isra.0+0x110/0x120
> Modules linked in:
> CPU: 5 PID: 96 Comm: kworker/u16:5 Not tainted 5.11.0-rc6-custom-02818-g69a767ec7302 #3243
> Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 01/06/2019
> Workqueue: bond20 bond_mii_monitor
> RIP: 0010:__get_link_speed.isra.0+0x110/0x120
> Code: 5b ff ff ff 52 4c 8b 4e 08 44 0f b7 c7 48 c7 c7 18 46 4a b8 48 8b 16 c6 05 d9 76 41 01 01 49 8b 31 89 44 24 04 e8 a2 8a 3f 00 <0f> 0b 8b 44 24 04 59 c3 0f 1f 84 00 00 00 00 00 48 85 ff 74 3b 53
> RSP: 0018:ffffb683c03afde0 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: ffff96bd3f2a9a38 RCX: 0000000000000000
> RDX: ffff96c06fd67560 RSI: ffff96c06fd57850 RDI: ffff96c06fd57850
> RBP: 0000000000000000 R08: ffffffffb8b49888 R09: 0000000000009ffb
> R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
> R13: ffff96bd3f2a9a38 R14: ffff96bd49c56400 R15: ffff96bd49c564f0
> FS:  0000000000000000(0000) GS:ffff96c06fd40000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f327ad804b0 CR3: 0000000142ad5006 CR4: 00000000003706e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  ad_update_actor_keys+0x36/0xc0
>  bond_3ad_handle_link_change+0x5d/0xf0
>  bond_mii_monitor.cold+0x1c2/0x1e8
>  process_one_work+0x1c9/0x360
>  worker_thread+0x48/0x3c0
>  kthread+0x113/0x130
>  ret_from_fork+0x1f/0x30
>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>

I'm not really sure making the warning consume more text is really
going to solve the problem. I was actually much happier with just the
first error as I don't need a stack trace. Just having the line is
enough information for me to search and find the cause for the issue.
Adding a backtrace is just overkill.

If we really think this is something that is important maybe we should
move this up to an error instead of a warning. For example why not
make this use pr_err_once, instead of pr_warn_once? It should make it
more likely to be highlighted in the system log.
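
For readers weighing the pr_err_once() suggestion above, the following is a minimal userspace sketch of how printk severity interacts with the console loglevel. The two numeric levels match include/linux/kern_levels.h; the helper name shown_on_console() and the console loglevel of 4 used in the note below are assumptions made for illustration, not anything proposed in this thread.

```c
#include <stdbool.h>

/* Numeric printk severities, as defined in include/linux/kern_levels.h.
 * A lower number means a more severe message. */
enum {
	LOGLEVEL_ERR     = 3,	/* used by pr_err() and pr_err_once() */
	LOGLEVEL_WARNING = 4	/* used by pr_warn() and pr_warn_once() */
};

/* Hypothetical helper: a message reaches the console only when its
 * level is numerically lower than the current console loglevel. */
static bool shown_on_console(int msg_level, int console_loglevel)
{
	return msg_level < console_loglevel;
}
```

With an assumed console loglevel of 4, shown_on_console(LOGLEVEL_ERR, 4) is true while shown_on_console(LOGLEVEL_WARNING, 4) is false, which is one way an error can be more visible than a warning. In both cases the message still ends up in the kernel log buffer.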
Ido Schimmel Feb. 10, 2021, 8:11 p.m. UTC | #3
On Wed, Feb 10, 2021 at 11:44:31AM -0800, Alexander Duyck wrote:
> I'm not really sure making the warning consume more text is really
> going to solve the problem. I was actually much happier with just the
> first error as I don't need a stack trace. Just having the line is
> enough information for me to search and find the cause for the issue.
> Adding a backtrace is just overkill.
> 
> If we really think this is something that is important maybe we should
> move this up to an error instead of a warning. For example why not
> make this use pr_err_once, instead of pr_warn_once? It should make it
> more likely to be highlighted in the system log.

Yea, I expected this comment.

We are currently looking for patterns such as 'BUG', 'WARNING', 'BUG
kmalloc', 'UBSAN' etc in regression. Mostly based on what syzkaller is
doing [1] (which we are also running). We can instead promote this
warning to pr_err_once() and start looking at errors as well. It might
uncover more issues / false positives.

[1] https://github.com/google/syzkaller/blob/42b90a7c596c2b7d8f8d034dff7d8c635631de5a/pkg/report/linux.go#L952
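
To illustrate why the choice of macro matters for this kind of regression check, here is a minimal sketch of a substring-based log scan. It is a simplification under stated assumptions: the pattern list is taken from the examples named above, the function name line_is_flagged() is hypothetical, and this is not how syzkaller's report parser is actually implemented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Patterns named in the comment above. "BUG kmalloc" is already
 * covered by the shorter "BUG" substring. */
static const char *const oops_patterns[] = { "BUG", "WARNING", "UBSAN" };

/* Hypothetical helper: returns true if a kernel log line contains
 * any of the patterns the regression run looks for. */
static bool line_is_flagged(const char *line)
{
	size_t n = sizeof(oops_patterns) / sizeof(oops_patterns[0]);

	for (size_t i = 0; i < n; i++) {
		if (strstr(line, oops_patterns[i]))
			return true;
	}
	return false;
}
```

Under this sketch, the single pr_warn_once() line from [1] in the commit message is not flagged, whereas the "WARNING: CPU: ..." line that WARN_ONCE() emits is.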

Patch

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 2e670f68626d..460dc1bfc7a9 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -326,11 +326,10 @@  static u16 __get_link_speed(struct port *port)
 
 		default:
 			/* unknown speed value from ethtool. shouldn't happen */
-			if (slave->speed != SPEED_UNKNOWN)
-				pr_warn_once("%s: (slave %s): unknown ethtool speed (%d) for port %d (set it to 0)\n",
-					     slave->bond->dev->name,
-					     slave->dev->name, slave->speed,
-					     port->actor_port_number);
+			WARN_ONCE(slave->speed != SPEED_UNKNOWN,
+				  "%s: (slave %s): unknown ethtool speed (%d) for port %d (set it to 0)\n",
+				  slave->bond->dev->name, slave->dev->name,
+				  slave->speed, port->actor_port_number);
 			speed = 0;
 			break;
 		}
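
The behaviour the patch relies on can be approximated in userspace as follows. This is a sketch only, loosely modelled on the kernel's WARN_ONCE(): the names WARN_ONCE_SKETCH, get_link_speed_sketch, messages_printed and the SKETCH_* key values are hypothetical, only one known speed is handled, and no stack trace is produced. SPEED_UNKNOWN has the same value as in include/uapi/linux/ethtool.h.

```c
#include <stdbool.h>
#include <stdio.h>

#define SPEED_UNKNOWN (-1)	/* same value as in the ethtool UAPI header */

/* Hypothetical key values; the real driver uses its own enum. */
enum { SKETCH_KEY_NONE = 0, SKETCH_KEY_10G = 1 };

static int messages_printed;	/* counts emitted warnings, for illustration */

/* Print the message at most once per call site, and only when cond is
 * true. Unlike the kernel macro, this neither dumps a backtrace nor
 * evaluates to the condition. */
#define WARN_ONCE_SKETCH(cond, ...)				\
	do {							\
		static bool warned_;				\
		if ((cond) && !warned_) {			\
			warned_ = true;				\
			messages_printed++;			\
			fprintf(stderr, __VA_ARGS__);		\
		}						\
	} while (0)

static unsigned short get_link_speed_sketch(int slave_speed)
{
	unsigned short speed;

	switch (slave_speed) {
	case 10000:
		speed = SKETCH_KEY_10G;
		break;
	default:
		/* SPEED_UNKNOWN is expected (e.g. link down) and stays silent. */
		WARN_ONCE_SKETCH(slave_speed != SPEED_UNKNOWN,
				 "unknown ethtool speed (%d) (set it to 0)\n",
				 slave_speed);
		speed = SKETCH_KEY_NONE;
		break;
	}
	return speed;
}
```

Folding the condition into the macro keeps the existing rule that SPEED_UNKNOWN is not reported, and an unrecognised value such as 400000 is reported only the first time it is seen at that call site.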