diff mbox

Regression: Bug 196547 - Since 4.12 - bonding module not working with wireless drivers

Message ID 8845e49b-3165-e6df-5935-c86278d220d9@broadcom.com (mailing list archive)
State RFC
Delegated to: Kalle Valo
Headers show

Commit Message

Arend van Spriel Aug. 10, 2017, 12:43 p.m. UTC
On 10-08-17 07:39, Kalle Valo wrote:
> Hi Mahesh and Andy,
> 
> James Feeney reported that there's a serious regression in bonding
> module since v4.12, it doesn't work with wireless drivers anymore as
> wireless drivers don't report the link speed via ethtool:
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=196547
> 
> In the bug report it's said that this commit is the culprit:
> 
> 3f3c278c94dd bonding: fix active-backup transition

This commit references another one. ie. commit c4adfc822bf5 ("bonding: 
make speed, duplex setting consistent with link state"). Before this 
commit the result of __ethtool_get_link_ksettings() was simply ignored.

ruling it out to be used as active bond slave. To the end-users who were 
using bonding this is simply a regression. So to fix that both changes 
should be reverted in my opinion.

Now specifically for wireless interfaces we could implement 
get_link_ksettings callback although most of the fields requested are 
meaningless in wireless context. Regarding the speed and half-duplex 
values we raised some concerns in an earlier discussion with James. 
Wireless is always half-duplex as there can be only one (unintended ref 
to [1]). If the reported speed in wifi is difficult. In wifi we have 
txrate and rxrate which are inherently asynchronous and it is a 
per-packet value so it is going to change a lot. Seeing only 4 call 
sites in the bonding code tells me that is not taken into account. All 
in all this shenanigan seems netconf material to me.

Regards,
Arend

[1] https://en.wikipedia.org/wiki/Highlander_(film)

Comments

Andreas Born Aug. 10, 2017, 5:52 p.m. UTC | #1
Hi everyone,

2017-08-10 14:43 GMT+02:00 Arend van Spriel <arend.vanspriel@broadcom.com>:
>
>
> On 10-08-17 07:39, Kalle Valo wrote:
>>
>> Hi Mahesh and Andy,
>>
>> James Feeney reported that there's a serious regression in bonding
>> module since v4.12, it doesn't work with wireless drivers anymore as
>> wireless drivers don't report the link speed via ethtool:
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=196547
>>
>> In the bug report it's said that this commit is the culprit:
>>
>> 3f3c278c94dd bonding: fix active-backup transition
>
>
> This commit references another one. ie. commit c4adfc822bf5 ("bonding: make speed, duplex setting consistent with link state"). Before this commit the result of __ethtool_get_link_ksettings() was simply ignored.

Actually it was not simply ignored in any case: Further down in
bond_miimon_commit() there's a conditional call to
bond_3ad_handle_link_change() which triggers an update using
__get_link_speed() to actually access the result. A similar handler is
also called for lb modes.

>
> Commit 3f3c278c94dd ("bonding: fix active-backup transition") moves setting the link state to the call sites of bond_update_speed_duplex(), just not all call sites.
>
>> Is there a fix for this or should that commit be reverted? This seems to
>> be a serious regression as there are multiple reports already and we
>> should get it fixed for v4.13, and the fix backported to v4.12 stable
>> release.
>
>
> The ethtool callbacks really seem optional. At least in brcmfmac, the wireless driver I maintain, I only provide get_drvinfo callback and there is no warning triggered upon registering the netdev. The changes above now require each netdev to implement the get_link_ksettings callback (get_settings is deprecated) or the link is marked as DOWN ruling it out to be used as active bond slave. To the end-users who were using bonding this is simply a regression. So to fix that both changes should be reverted in my opinion.

Yes, also to me as user of a wireless slave in an active-backup bond
it's clearly a regression. But only partially for some modes like
active-backup since the bonding documentation [1] clearly lists as a
prerequisite

1) for 802.3ad: "Ethtool support in the base drivers for retrieving
the speed and duplex of each slave."
2) for tlb/alb: "Ethtool support in the base drivers for retrieving
the speed of each slave."

This was previously not directly enforced in the bonding code and thus
probably occasionally caused unexpected behavior. At least such
behavior is what to my understanding commit c4adfc822bf5 ("bonding:
make speed, duplex setting consistent with link state") and
3f3c278c94dd ("bonding: fix active-backup transition") intend to fix
with an apparent focus on 802.3ad. However those commits went too far
by requiring a get_link_ksettings implementation by every slave driver
REGARDLESS of the bond mode.

Earlier today I submitted the patch (bonding: require speed/duplex
only for 802.3ad, alb and tlb) [2] that only partially reverts what is
a regression following my aforementioned logic. This seems to me like
the best solution in the short term since it should satisfy both
usergroups represented by Mahesh and James and restores consistence
with the bonding documentation. James already commented approvingly on
that patch in the bug report. [3]

Regards
Andreas

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/bonding.txt
[2] https://www.spinics.net/lists/netdev/msg448662.html
[3] https://bugzilla.kernel.org/show_bug.cgi?id=196547#c10
Kalle Valo Aug. 11, 2017, 1:14 p.m. UTC | #2
Andreas Born <futur.andy@googlemail.com> writes:

> Earlier today I submitted the patch (bonding: require speed/duplex
> only for 802.3ad, alb and tlb) [2] that only partially reverts what is
> a regression following my aforementioned logic. This seems to me like
> the best solution in the short term since it should satisfy both
> usergroups represented by Mahesh and James and restores consistence
> with the bonding documentation. James already commented approvingly on
> that patch in the bug report. [3]
>
> Regards
> Andreas
>
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/bonding.txt

Great, thanks.

I'll take it the patch is meant for net tree (and not net-next) so that
it will be fixed for v4.13? Also it should backported to v4.12 stable
tree. I don't see any mention of that in the patch submission and that's
why I'm asking.
Kalle Valo Aug. 12, 2017, 7:35 a.m. UTC | #3
Kalle Valo <kvalo@codeaurora.org> writes:

> Andreas Born <futur.andy@googlemail.com> writes:
>
>> Earlier today I submitted the patch (bonding: require speed/duplex
>> only for 802.3ad, alb and tlb) [2] that only partially reverts what is
>> a regression following my aforementioned logic. This seems to me like
>> the best solution in the short term since it should satisfy both
>> usergroups represented by Mahesh and James and restores consistence
>> with the bonding documentation. James already commented approvingly on
>> that patch in the bug report. [3]
>>
>> Regards
>> Andreas
>>
>> [1]
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/bonding.txt
>
> Great, thanks.
>
> I'll take it the patch is meant for net tree (and not net-next) so that
> it will be fixed for v4.13? Also it should backported to v4.12 stable
> tree. I don't see any mention of that in the patch submission and that's
> why I'm asking.

I see that Dave applied this to the net tree and queued also for stable,
excellent. Thanks everyone!

https://patchwork.ozlabs.org/patch/800080/
James Feeney Aug. 12, 2017, 7:30 p.m. UTC | #4
Hey Kalle

Still, a problem:

On 08/12/2017 01:35 AM, Kalle Valo wrote:
> Kalle Valo <kvalo@codeaurora.org> writes:
> 
>> Andreas Born <futur.andy@googlemail.com> writes:
>>
>>> Earlier today I submitted the patch (bonding: require speed/duplex
>>> only for 802.3ad, alb and tlb) [2] that only partially reverts what is
>>> a regression following my aforementioned logic. This seems to me like
>>> the best solution in the short term since it should satisfy both
>>> usergroups represented by Mahesh and James and restores consistence
>>> with the bonding documentation. James already commented approvingly on
>>> that patch in the bug report. [3]
>>>
>>> Regards
>>> Andreas
>>>
>>> [1]
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/bonding.txt
>>
>> Great, thanks.
>>
>> I'll take it the patch is meant for net tree (and not net-next) so that
>> it will be fixed for v4.13? Also it should backported to v4.12 stable
>> tree. I don't see any mention of that in the patch submission and that's
>> why I'm asking.
> 
> I see that Dave applied this to the net tree and queued also for stable,
> excellent. Thanks everyone!
> 
> https://patchwork.ozlabs.org/patch/800080/
> 

Andreas patch failed to address the continuous, *10-times per second* warning
which will "spam" the log file, sometimes the console, whenever the test fails:
 if (bond_update_speed_duplex(slave) && bond_needs_speed_duplex(bond)) {...}
which then has:
 netdev_warn(bond->dev, "failed to get link speed/duplex for%s\n",
			slave->dev->name);

That is the sort of irresponsible code that "works fine" as long as there are no
errors, and it never actually runs!

I'm guessing that the simple fix is to use "net_warn_ratelimited()" instead of
"netdev_warn()", where net/core/utils.c says:

/*
 * All net warning printk()s should be guarded by this function.
 */
int net_ratelimit(void)
{
        return __ratelimit(&net_ratelimit_state);
}

though Andreas has also suggested "pr_warn_ratelimited()", which instead uses
"__ratelimit(&_rs)".

Here's a link to the rate-limiting patch, after Andreas' patch is already
applied, since you say that David has already applied the first patch:

https://bugzilla.kernel.org/attachment.cgi?id=257903


Thanks
James
Andreas Born Aug. 13, 2017, 5:42 p.m. UTC | #5
2017-08-12 21:30 GMT+02:00 James Feeney <james@nurealm.net>:
>
>
> Andreas patch failed to address the continuous, *10-times per second* warning
> which will "spam" the log file, sometimes the console, whenever the test fails:
>  if (bond_update_speed_duplex(slave) && bond_needs_speed_duplex(bond)) {...}
> which then has:
>  netdev_warn(bond->dev, "failed to get link speed/duplex for%s\n",
>                         slave->dev->name);
>
> That is the sort of irresponsible code that "works fine" as long as there are no
> errors, and it never actually runs!
>
> I'm guessing that the simple fix is to use "net_warn_ratelimited()" instead of
> "netdev_warn()", where net/core/utils.c says:
>
> /*
>  * All net warning printk()s should be guarded by this function.
>  */
> int net_ratelimit(void)
> {
>         return __ratelimit(&net_ratelimit_state);
> }
>
> though Andreas has also suggested "pr_warn_ratelimited()", which instead uses
> "__ratelimit(&_rs)".
>
> Here's a link to the rate-limiting patch, after Andreas' patch is already
> applied, since you say that David has already applied the first patch:
>
> https://bugzilla.kernel.org/attachment.cgi?id=257903
>

The other day I also sent a patch for part 2 of your bug report on
this list. Better safe than sorry by doing one thing per patch. This
second patch received feedback and its v2 is currently awaiting
review. Sorry, for informing you only now. I should have cc'd you in
the first place. You can lookup the state of that patch on patchwork.
[1] It's basically the same as previous versions just changing the
original code even less.
.
I sort of expect that this second patch won't be queued for stable.
But let's see... Essentially it's a regression too and additionally
would've been fixed by a plain revert.

On a side note I would recommend some of my own reading to you about
patch submission in general [1] and on netdev specifically [2].
Putting your suggested changes in the right form makes everyone's life
easier especially your own saving you lots of back and forth by mail.
Seeing the amount of mail on this list during the last days was reason
enough to comprehend the necessity of those standards.

And, just wondering, who's going to eventually close that bugreport?
https://bugzilla.kernel.org/show_bug.cgi?id=196547

Cheers
Andreas

[1] https://patchwork.ozlabs.org/patch/800770/
[2] https://www.kernel.org/doc/html/latest/process/submitting-patches.html
[3] https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/tree/Documentation/networking/netdev-FAQ.txt
James Feeney Aug. 16, 2017, 8:44 p.m. UTC | #6
On 08/13/2017 11:42 AM, Andreas Born wrote:
> On a side note I would recommend some of my own reading to you about
> patch submission in general [1] and on netdev specifically [2].

Mmm - [2] and [3], I suspect.  Thanks Andreas.  I'll be studying those.  Yeah,
I'm still learning what is needed and what works.  Sometimes, just a note to the
author is more than enough to resolve a problem.  Sometimes, discussion is
needed.  And other times... well, certain people are infamous... but no problem
here, thankfully.

> And, just wondering, who's going to eventually close that bugreport?
> https://bugzilla.kernel.org/show_bug.cgi?id=196547

I can close it when the patches actually land in the kernel.  I'm glad to see
that there was an "Ack" from Mahesh.

On the topic of wireless support for kernel ethtool reporting, I'm wondering, is
there is any consensus about that?

And, for instance, is there any *other* way for the bonding module to make
"better link" decisions for wireless links?  As "wireless" becomes more capable,
possibly more diverse, and probably more essential for computing, this is likely
to become a bigger issue.

Ben Greear mentioned that he had added some support to the ath10k driver.  Dan
Williams mentioned the possibility of updating the mac80211 stack for support.
And Arend van Spriel suggested that the issue might best be left for the next
Netconf.

Immediate problem solved, but maybe a larger issue still needs to be addressed?

James
David Miller Aug. 16, 2017, 9:01 p.m. UTC | #7
From: James Feeney <james@nurealm.net>
Date: Wed, 16 Aug 2017 14:44:27 -0600

> On 08/13/2017 11:42 AM, Andreas Born wrote:
>> On a side note I would recommend some of my own reading to you about
>> patch submission in general [1] and on netdev specifically [2].
> 
> Mmm - [2] and [3], I suspect.  Thanks Andreas.  I'll be studying those.  Yeah,
> I'm still learning what is needed and what works.  Sometimes, just a note to the
> author is more than enough to resolve a problem.  Sometimes, discussion is
> needed.  And other times... well, certain people are infamous... but no problem
> here, thankfully.
> 
>> And, just wondering, who's going to eventually close that bugreport?
>> https://bugzilla.kernel.org/show_bug.cgi?id=196547
> 
> I can close it when the patches actually land in the kernel.

All of the patches are in Linus's tree.
Dan Williams Aug. 16, 2017, 9:22 p.m. UTC | #8
On Wed, 2017-08-16 at 14:44 -0600, James Feeney wrote:
> On 08/13/2017 11:42 AM, Andreas Born wrote:
> > On a side note I would recommend some of my own reading to you
> > about
> > patch submission in general [1] and on netdev specifically [2].
> 
> Mmm - [2] and [3], I suspect.  Thanks Andreas.  I'll be studying
> those.  Yeah,
> I'm still learning what is needed and what works.  Sometimes, just a
> note to the
> author is more than enough to resolve a problem.  Sometimes,
> discussion is
> needed.  And other times... well, certain people are infamous... but
> no problem
> here, thankfully.
> 
> > And, just wondering, who's going to eventually close that
> > bugreport?
> > https://bugzilla.kernel.org/show_bug.cgi?id=196547
> 
> I can close it when the patches actually land in the kernel.  I'm
> glad to see
> that there was an "Ack" from Mahesh.
> 
> On the topic of wireless support for kernel ethtool reporting, I'm
> wondering, is
> there is any consensus about that?
> 
> And, for instance, is there any *other* way for the bonding module to
> make
> "better link" decisions for wireless links?  As "wireless" becomes
> more capable,
> possibly more diverse, and probably more essential for computing,
> this is likely
> to become a bigger issue.
> 
> Ben Greear mentioned that he had added some support to the ath10k
> driver.  Dan
> Williams mentioned the possibility of updating the mac80211 stack for
> support.
> And Arend van Spriel suggested that the issue might best be left for
> the next
> Netconf.
> 
> Immediate problem solved, but maybe a larger issue still needs to be
> addressed?

Again, it's technically possible to add the link settings support to
wireless drivers.  But the issue is around what bonding would do with
that information in its various modes.

My biggest suggestion is that perhaps bonding should grow hysteresis
for link speeds. Since WiFi can change speed every packet, you probably
don't want the bond characteristics changing every couple seconds just
in case your WiFi link is jumping around.  Ethernet won't bounce around
that much, so the hysteresis would have no effect there.  Or, if people
are concerned about response time to speed changes on ethernet (where
you probably do want an instant switch-over) some new flag to indicate
that certain devices don't have stable speeds over time.

Dan
David Miller Aug. 16, 2017, 9:31 p.m. UTC | #9
From: Dan Williams <dcbw@redhat.com>
Date: Wed, 16 Aug 2017 16:22:41 -0500

> My biggest suggestion is that perhaps bonding should grow hysteresis
> for link speeds. Since WiFi can change speed every packet, you probably
> don't want the bond characteristics changing every couple seconds just
> in case your WiFi link is jumping around.  Ethernet won't bounce around
> that much, so the hysteresis would have no effect there.  Or, if people
> are concerned about response time to speed changes on ethernet (where
> you probably do want an instant switch-over) some new flag to indicate
> that certain devices don't have stable speeds over time.

Or just report the average of the range the wireless link can hit, and
be done with it.

I think you guys are overcomplicating things.
Dan Williams Aug. 17, 2017, 2:11 a.m. UTC | #10
On Wed, 2017-08-16 at 14:31 -0700, David Miller wrote:
> From: Dan Williams <dcbw@redhat.com>
> Date: Wed, 16 Aug 2017 16:22:41 -0500
> 
> > My biggest suggestion is that perhaps bonding should grow
> hysteresis
> > for link speeds. Since WiFi can change speed every packet, you
> probably
> > don't want the bond characteristics changing every couple seconds
> just
> > in case your WiFi link is jumping around.  Ethernet won't bounce
> around
> > that much, so the hysteresis would have no effect there.  Or, if
> people
> > are concerned about response time to speed changes on ethernet
> (where
> > you probably do want an instant switch-over) some new flag to
> indicate
> > that certain devices don't have stable speeds over time.
> 
> Or just report the average of the range the wireless link can hit,
> and
> be done with it.
> 
> I think you guys are overcomplicating things.

That range can be from 1 to > 800Mb/s.  No, it won't usually be all
over that range, but it won't be uncommon to fluctuate by hundreds of
Mb/s.  I'm not sure a simple average is really the answer here.  Even
doing that would require new knobs to ethtool, since the rate depends
heavily on card capabilities and also what AP you're connected to *at
that moment*.  If you roam to another AP, then the max speed can
certainly change.

You'll probably say "aim for the 75% case" or something like that,
which is fine, but then you're depending on your 75% case to be (a)
single AP, (b) never move (eg, only bond wifi + ethernet), (c) little
radio interference.  I'm not sure I'd buy that.  If I've put words in
your mouth, forgive me.

Dan
Ben Greear Aug. 17, 2017, 2:36 a.m. UTC | #11
On 08/16/2017 07:11 PM, Dan Williams wrote:
> On Wed, 2017-08-16 at 14:31 -0700, David Miller wrote:
>> From: Dan Williams <dcbw@redhat.com>
>> Date: Wed, 16 Aug 2017 16:22:41 -0500
>>
>>> My biggest suggestion is that perhaps bonding should grow
>> hysteresis
>>> for link speeds. Since WiFi can change speed every packet, you
>> probably
>>> don't want the bond characteristics changing every couple seconds
>> just
>>> in case your WiFi link is jumping around.  Ethernet won't bounce
>> around
>>> that much, so the hysteresis would have no effect there.  Or, if
>> people
>>> are concerned about response time to speed changes on ethernet
>> (where
>>> you probably do want an instant switch-over) some new flag to
>> indicate
>>> that certain devices don't have stable speeds over time.
>>
>> Or just report the average of the range the wireless link can hit,
>> and
>> be done with it.
>>
>> I think you guys are overcomplicating things.
>
> That range can be from 1 to > 800Mb/s.  No, it won't usually be all
> over that range, but it won't be uncommon to fluctuate by hundreds of
> Mb/s.  I'm not sure a simple average is really the answer here.  Even
> doing that would require new knobs to ethtool, since the rate depends
> heavily on card capabilities and also what AP you're connected to *at
> that moment*.  If you roam to another AP, then the max speed can
> certainly change.
>
> You'll probably say "aim for the 75% case" or something like that,
> which is fine, but then you're depending on your 75% case to be (a)
> single AP, (b) never move (eg, only bond wifi + ethernet), (c) little
> radio interference.  I'm not sure I'd buy that.  If I've put words in
> your mouth, forgive me.

If you keep ethtool API simple and just return the last (rx-rate + tx-rate) / 2, or the rate averaged
over the last 100 frames or 10 seconds, then the caller can do longer term averaging
as it sees fit.  Probably no need for lots of averaging complexity in the kernel.

rate-ctrl for wifi basically doesn't happen until you transmit or receive a
fairly steady stream, so it will fluctuate a lot.

Thanks,
Ben

>
> Dan
>
David Miller Aug. 17, 2017, 2:42 a.m. UTC | #12
From: Dan Williams <dcbw@redhat.com>
Date: Wed, 16 Aug 2017 21:11:47 -0500

> You'll probably say "aim for the 75% case" or something like that,
> which is fine, but then you're depending on your 75% case to be (a)
> single AP, (b) never move (eg, only bond wifi + ethernet), (c) little
> radio interference.  I'm not sure I'd buy that.  If I've put words in
> your mouth, forgive me.

You can interpret what I'm saying as "some reasonable value is
better than no value at all."

You can keep arguing about AP changes, radio interference, etc.
but anything is better than the current situation.

Start small and simple, try to handle everything over time and
gradually.
Dan Williams Aug. 17, 2017, 3:18 a.m. UTC | #13
On Wed, 2017-08-16 at 19:36 -0700, Ben Greear wrote:
> On 08/16/2017 07:11 PM, Dan Williams wrote:
> > On Wed, 2017-08-16 at 14:31 -0700, David Miller wrote:
> > > From: Dan Williams <dcbw@redhat.com>
> > > Date: Wed, 16 Aug 2017 16:22:41 -0500
> > > 
> > > > My biggest suggestion is that perhaps bonding should grow
> > > 
> > > hysteresis
> > > > for link speeds. Since WiFi can change speed every packet, you
> > > 
> > > probably
> > > > don't want the bond characteristics changing every couple
> > > > seconds
> > > 
> > > just
> > > > in case your WiFi link is jumping around.  Ethernet won't
> > > > bounce
> > > 
> > > around
> > > > that much, so the hysteresis would have no effect there.  Or,
> > > > if
> > > 
> > > people
> > > > are concerned about response time to speed changes on ethernet
> > > 
> > > (where
> > > > you probably do want an instant switch-over) some new flag to
> > > 
> > > indicate
> > > > that certain devices don't have stable speeds over time.
> > > 
> > > Or just report the average of the range the wireless link can
> > > hit,
> > > and
> > > be done with it.
> > > 
> > > I think you guys are overcomplicating things.
> > 
> > That range can be from 1 to > 800Mb/s.  No, it won't usually be all
> > over that range, but it won't be uncommon to fluctuate by hundreds
> > of
> > Mb/s.  I'm not sure a simple average is really the answer
> > here.  Even
> > doing that would require new knobs to ethtool, since the rate
> > depends
> > heavily on card capabilities and also what AP you're connected to
> > *at
> > that moment*.  If you roam to another AP, then the max speed can
> > certainly change.
> > 
> > You'll probably say "aim for the 75% case" or something like that,
> > which is fine, but then you're depending on your 75% case to be (a)
> > single AP, (b) never move (eg, only bond wifi + ethernet), (c)
> > little
> > radio interference.  I'm not sure I'd buy that.  If I've put words
> > in
> > your mouth, forgive me.
> 
> If you keep ethtool API simple and just return the last (rx-rate +
> tx-rate) / 2, or the rate averaged
> over the last 100 frames or 10 seconds, then the caller can do longer
> term averaging
> as it sees fit.  Probably no need for lots of averaging complexity in
> the kernel.

Yeah, that works too, but I was thinking it was better to present the
actual data through ethtool so that things other than bonding could use
it, and since bonding is the thing that actually cares about the
fluctuation, make it do the more extensive processing.

Dan


> rate-ctrl for wifi basically doesn't happen until you transmit or
> receive a
> fairly steady stream, so it will fluctuate a lot.
Ben Greear Aug. 17, 2017, 3:32 a.m. UTC | #14
On 08/16/2017 08:18 PM, Dan Williams wrote:
> On Wed, 2017-08-16 at 19:36 -0700, Ben Greear wrote:
>> On 08/16/2017 07:11 PM, Dan Williams wrote:
>>> On Wed, 2017-08-16 at 14:31 -0700, David Miller wrote:
>>>> From: Dan Williams <dcbw@redhat.com>
>>>> Date: Wed, 16 Aug 2017 16:22:41 -0500
>>>>
>>>>> My biggest suggestion is that perhaps bonding should grow
>>>>
>>>> hysteresis
>>>>> for link speeds. Since WiFi can change speed every packet, you
>>>>
>>>> probably
>>>>> don't want the bond characteristics changing every couple
>>>>> seconds
>>>>
>>>> just
>>>>> in case your WiFi link is jumping around.  Ethernet won't
>>>>> bounce
>>>>
>>>> around
>>>>> that much, so the hysteresis would have no effect there.  Or,
>>>>> if
>>>>
>>>> people
>>>>> are concerned about response time to speed changes on ethernet
>>>>
>>>> (where
>>>>> you probably do want an instant switch-over) some new flag to
>>>>
>>>> indicate
>>>>> that certain devices don't have stable speeds over time.
>>>>
>>>> Or just report the average of the range the wireless link can
>>>> hit,
>>>> and
>>>> be done with it.
>>>>
>>>> I think you guys are overcomplicating things.
>>>
>>> That range can be from 1 to > 800Mb/s.  No, it won't usually be all
>>> over that range, but it won't be uncommon to fluctuate by hundreds
>>> of
>>> Mb/s.  I'm not sure a simple average is really the answer
>>> here.  Even
>>> doing that would require new knobs to ethtool, since the rate
>>> depends
>>> heavily on card capabilities and also what AP you're connected to
>>> *at
>>> that moment*.  If you roam to another AP, then the max speed can
>>> certainly change.
>>>
>>> You'll probably say "aim for the 75% case" or something like that,
>>> which is fine, but then you're depending on your 75% case to be (a)
>>> single AP, (b) never move (eg, only bond wifi + ethernet), (c)
>>> little
>>> radio interference.  I'm not sure I'd buy that.  If I've put words
>>> in
>>> your mouth, forgive me.
>>
>> If you keep ethtool API simple and just return the last (rx-rate +
>> tx-rate) / 2, or the rate averaged
>> over the last 100 frames or 10 seconds, then the caller can do longer
>> term averaging
>> as it sees fit.  Probably no need for lots of averaging complexity in
>> the kernel.
>
> Yeah, that works too, but I was thinking it was better to present the
> actual data through ethtool so that things other than bonding could use
> it, and since bonding is the thing that actually cares about the
> fluctuation, make it do the more extensive processing.

What do you mean by 'actual data'?  If you want to know the most accurate
transmit/rx rate info, then you need to pay attention to each and every frame's tx/rx rate, as
well as it's ampdu/amsdu, retries, etc.  It is virtually impossible.

So, you will have to settle for something less...  I suggest something simple
to calculate, similar to existing values that are available via debugfs and/or
'iw dev foo station dump', etc.  Let higher layers manipulate the raw data
as they see fit (they can query ethtool as often as they like).

Thanks,
Ben
Jay Vosburgh Aug. 17, 2017, 5:33 a.m. UTC | #15
Dan Williams <dcbw@redhat.com> wrote:
[...]
>You'll probably say "aim for the 75% case" or something like that,
>which is fine, but then you're depending on your 75% case to be (a)
>single AP, (b) never move (eg, only bond wifi + ethernet), (c) little
>radio interference.  I'm not sure I'd buy that.  If I've put words in
>your mouth, forgive me.

	The primary use case that I'm aware of for bonding with wireless
devices is in active-backup mode, paired with an ethernet adapter set as
the bonding "primary" device.  I think this is the case that absolutely
should just work.  I don't think bonding cares (or should care) about
(a) - (c) for this use.

	Your point (b) suggests that there are use cases other than the
above; I'm unfamiliar with any use other than wifi + ethernet, can you
elaborate?

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com
diff mbox

Patch

--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -365,9 +365,10 @@  int bond_set_carrier(struct bonding *bond)
  /* Get link speed and duplex from the slave's base driver
   * using ethtool. If for some reason the call fails or the
   * values are invalid, set speed and duplex to -1,
- * and return.
+ * and return. Return 1 if speed or duplex settings are
+ * UNKNOWN; 0 otherwise.
   */
-static void bond_update_speed_duplex(struct slave *slave)
+static int bond_update_speed_duplex(struct slave *slave)
  {
         struct net_device *slave_dev = slave->dev;
         struct ethtool_link_ksettings ecmd;
@@ -377,24 +378,27 @@  static void bond_update_speed_duplex(struct slave 
*slave)
         slave->duplex = DUPLEX_UNKNOWN;

         res = __ethtool_get_link_ksettings(slave_dev, &ecmd);
-       if (res < 0)
-               return;
-
-       if (ecmd.base.speed == 0 || ecmd.base.speed == ((__u32)-1))
-               return;
-
+       if (res < 0) {
+               slave->link = BOND_LINK_DOWN;
+               return 1;
+       }
+       if (ecmd.base.speed == 0 || ecmd.base.speed == ((__u32)-1)) {
+               slave->link = BOND_LINK_DOWN;
+               return 1;
+       }

Commit 3f3c278c94dd ("bonding: fix active-backup transition") moves 
setting the link state to the call sites of bond_update_speed_duplex(), 
just not all call sites.

> Is there a fix for this or should that commit be reverted? This seems to
> be a serious regression as there are multiple reports already and we
> should get it fixed for v4.13, and the fix backported to v4.12 stable
> release.

The ethtool callbacks really seem optional. At least in brcmfmac, the 
wireless driver I maintain, I only provide get_drvinfo callback and 
there is no warning triggered upon registering the netdev. The changes 
above now require each netdev to implement the get_link_ksettings 
callback (get_settings is deprecated) or the link is marked as DOWN