diff mbox series

wifi: ath9k: Don't mark channelmap stack variable read-only in ath9k_mci_update_wlan_channels()

Message ID 20230413214118.153781-1-toke@toke.dk (mailing list archive)
State Accepted
Commit b956e3110a797a3663f91f9b8935b667cc23fe72
Delegated to: Kalle Valo
Headers show
Series wifi: ath9k: Don't mark channelmap stack variable read-only in ath9k_mci_update_wlan_channels() | expand

Commit Message

Toke Høiland-Jørgensen April 13, 2023, 9:41 p.m. UTC
This partially reverts commit e161d4b60ae3a5356e07202e0bfedb5fad82c6aa.

Turns out the channelmap variable is not actually read-only, it's modified
through the MCI_GPM_CLR_CHANNEL_BIT() macro further down in the function,
so making it read-only causes page faults when that code is hit.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217183
Fixes: e161d4b60ae3 ("wifi: ath9k: Make arrays prof_prio and channelmap static const")
Cc: stable@vger.kernel.org
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
 drivers/net/wireless/ath/ath9k/mci.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Kalle Valo April 14, 2023, 10 a.m. UTC | #1
Toke Høiland-Jørgensen <toke@toke.dk> writes:

> This partially reverts commit e161d4b60ae3a5356e07202e0bfedb5fad82c6aa.
>
> Turns out the channelmap variable is not actually read-only, it's modified
> through the MCI_GPM_CLR_CHANNEL_BIT() macro further down in the function,
> so making it read-only causes page faults when that code is hit.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217183
> Fixes: e161d4b60ae3 ("wifi: ath9k: Make arrays prof_prio and channelmap static const")
> Cc: stable@vger.kernel.org
> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>

I guess the casting in MCI_GPM_CLR_CHANNEL_BIT() hide this and made it
impossible for the compiler to detect it? A perfect example why I hate
casting :)
Toke Høiland-Jørgensen April 14, 2023, 10:32 a.m. UTC | #2
Kalle Valo <kvalo@kernel.org> writes:

> Toke Høiland-Jørgensen <toke@toke.dk> writes:
>
>> This partially reverts commit e161d4b60ae3a5356e07202e0bfedb5fad82c6aa.
>>
>> Turns out the channelmap variable is not actually read-only, it's modified
>> through the MCI_GPM_CLR_CHANNEL_BIT() macro further down in the function,
>> so making it read-only causes page faults when that code is hit.
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217183
>> Fixes: e161d4b60ae3 ("wifi: ath9k: Make arrays prof_prio and channelmap static const")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
>
> I guess the casting in MCI_GPM_CLR_CHANNEL_BIT() hide this and made it
> impossible for the compiler to detect it? A perfect example why I hate
> casting :)

Yup, exactly. I was also assuming the compiler would catch it, but yay, C! :/

Anyway, cf the bugzilla this was a pretty bad regression for 6.2, so
would be good to move this along reasonably quickly (although I guess we
just missed the -net PR for rc7)...

-Toke
Kalle Valo April 14, 2023, 12:38 p.m. UTC | #3
(dropping stable from cc)

Toke Høiland-Jørgensen <toke@toke.dk> writes:

> Kalle Valo <kvalo@kernel.org> writes:
>
>> Toke Høiland-Jørgensen <toke@toke.dk> writes:
>>
>>> This partially reverts commit e161d4b60ae3a5356e07202e0bfedb5fad82c6aa.
>>>
>>> Turns out the channelmap variable is not actually read-only, it's modified
>>> through the MCI_GPM_CLR_CHANNEL_BIT() macro further down in the function,
>>> so making it read-only causes page faults when that code is hit.
>>>
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217183
>>> Fixes: e161d4b60ae3 ("wifi: ath9k: Make arrays prof_prio and
>>> channelmap static const")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
>>
>> I guess the casting in MCI_GPM_CLR_CHANNEL_BIT() hide this and made it
>> impossible for the compiler to detect it? A perfect example why I hate
>> casting :)
>
> Yup, exactly. I was also assuming the compiler would catch it, but yay, C! :/

We have so many static checkers that I wonder if those would be able to
catch these kind of buggy casts? We had a similar bug in rtw89 something
like a year ago.

> Anyway, cf the bugzilla this was a pretty bad regression for 6.2, so
> would be good to move this along reasonably quickly (although I guess we
> just missed the -net PR for rc7)...

I'm not planning to send anymore stuff to v6.3 so my plan is to take
this to -next. The merge window is very close anyway so this shouldn't
cause too much delay.
Toke Høiland-Jørgensen April 18, 2023, 10:14 a.m. UTC | #4
Kalle Valo <kvalo@kernel.org> writes:

> (dropping stable from cc)
>
> Toke Høiland-Jørgensen <toke@toke.dk> writes:
>
>> Kalle Valo <kvalo@kernel.org> writes:
>>
>>> Toke Høiland-Jørgensen <toke@toke.dk> writes:
>>>
>>>> This partially reverts commit e161d4b60ae3a5356e07202e0bfedb5fad82c6aa.
>>>>
>>>> Turns out the channelmap variable is not actually read-only, it's modified
>>>> through the MCI_GPM_CLR_CHANNEL_BIT() macro further down in the function,
>>>> so making it read-only causes page faults when that code is hit.
>>>>
>>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217183
>>>> Fixes: e161d4b60ae3 ("wifi: ath9k: Make arrays prof_prio and
>>>> channelmap static const")
>>>> Cc: stable@vger.kernel.org
>>>> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
>>>
>>> I guess the casting in MCI_GPM_CLR_CHANNEL_BIT() hide this and made it
>>> impossible for the compiler to detect it? A perfect example why I hate
>>> casting :)
>>
>> Yup, exactly. I was also assuming the compiler would catch it, but yay, C! :/
>
> We have so many static checkers that I wonder if those would be able to
> catch these kind of buggy casts? We had a similar bug in rtw89 something
> like a year ago.

No idea. Would be nice, yeah... :)

>> Anyway, cf the bugzilla this was a pretty bad regression for 6.2, so
>> would be good to move this along reasonably quickly (although I guess we
>> just missed the -net PR for rc7)...
>
> I'm not planning to send anymore stuff to v6.3 so my plan is to take
> this to -next. The merge window is very close anyway so this shouldn't
> cause too much delay.

Hmm, okay, a bit unfortunate that we'll ship 6.3 with the same bug, but
if it goes in during the merge window, I guess we'll get the fix into
6.3.1 (or something close to that) via stable? I can live with that...

-Toke
Kalle Valo April 19, 2023, 4:54 a.m. UTC | #5
Toke Høiland-Jørgensen <toke@toke.dk> writes:

>>> Anyway, cf the bugzilla this was a pretty bad regression for 6.2, so
>>> would be good to move this along reasonably quickly (although I guess we
>>> just missed the -net PR for rc7)...
>>
>> I'm not planning to send anymore stuff to v6.3 so my plan is to take
>> this to -next. The merge window is very close anyway so this shouldn't
>> cause too much delay.
>
> Hmm, okay, a bit unfortunate that we'll ship 6.3 with the same bug

Yeah, it is unfortunate. But it is always a question of time :) To save
time I usually try to send two wireless tree pull requests per cycle,
one early in the cycle and the second around the middle.

> but if it goes in during the merge window, I guess we'll get the fix
> into 6.3.1 (or something close to that) via stable? I can live with
> that...

Yeah, that's what I'm hoping as well. The patch has cc stable so I
assume it should go quickly to stable releases.
Kalle Valo April 19, 2023, 2:24 p.m. UTC | #6
Toke Høiland-Jørgensen <toke@toke.dk> wrote:

> This partially reverts commit e161d4b60ae3a5356e07202e0bfedb5fad82c6aa.
> 
> Turns out the channelmap variable is not actually read-only, it's modified
> through the MCI_GPM_CLR_CHANNEL_BIT() macro further down in the function,
> so making it read-only causes page faults when that code is hit.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217183
> Fixes: e161d4b60ae3 ("wifi: ath9k: Make arrays prof_prio and channelmap static const")
> Cc: stable@vger.kernel.org
> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>

Patch applied to ath-next branch of ath.git, thanks.

b956e3110a79 wifi: ath9k: Don't mark channelmap stack variable read-only in ath9k_mci_update_wlan_channels()
Colin Ian King April 19, 2023, 3:18 p.m. UTC | #7
On 19/04/2023 15:24, Kalle Valo wrote:
> Toke Høiland-Jørgensen <toke@toke.dk> wrote:
> 
>> This partially reverts commit e161d4b60ae3a5356e07202e0bfedb5fad82c6aa.
>>
>> Turns out the channelmap variable is not actually read-only, it's modified
>> through the MCI_GPM_CLR_CHANNEL_BIT() macro further down in the function,
>> so making it read-only causes page faults when that code is hit.
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217183
>> Fixes: e161d4b60ae3 ("wifi: ath9k: Make arrays prof_prio and channelmap static const")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
>> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
> 
> Patch applied to ath-next branch of ath.git, thanks.
> 
> b956e3110a79 wifi: ath9k: Don't mark channelmap stack variable read-only in ath9k_mci_update_wlan_channels()
> 

Thanks. Apologies for the regression. My bad.
Thorsten Leemhuis April 20, 2023, 1:50 p.m. UTC | #8
[CCing Jakub, Greg and the regressions list]

On 19.04.23 06:54, Kalle Valo wrote:
> Toke Høiland-Jørgensen <toke@toke.dk> writes:
> 
>>>> Anyway, cf the bugzilla

[FWIW, it's this ticket Toke meant:
https://bugzilla.kernel.org/show_bug.cgi?id=217286 ]

>>>> this was a pretty bad regression for 6.2, so
>>>> would be good to move this along reasonably quickly (although I guess we
>>>> just missed the -net PR for rc7)...
>>>
>>> I'm not planning to send anymore stuff to v6.3 so my plan is to take
>>> this to -next. The merge window is very close anyway so this shouldn't
>>> cause too much delay.
>>
>> Hmm, okay, a bit unfortunate that we'll ship 6.3 with the same bug
> 
> Yeah, it is unfortunate.

Agreed.

> But it is always a question of time :) To save
> time I usually try to send two wireless tree pull requests per cycle,
> one early in the cycle and the second around the middle.

Why not ask Linus to pull this directly from the list then? He doesn't
mind doing that for an occasional regression fix. And then he can decide
himself if the change is worth the risk -- and obviously can take into
account if he'll release and rc8 or not.

That's why Documentation/process/handling-regressions.rst
(https://docs.kernel.org/process/handling-regressions.html ) even tells
people to use that approach in a situation like this one.

Fun fact: that document also says "Subsystem maintainers are expected to
assist in reaching those periods [...]. They thus might have to send
git-pull requests earlier or more often than usual; [...]". That whole
section has a lot of "Ideally" in it, because yes, this thing called
real life is complicated sometimes and we are all volunteers. But still
it's a bit unfortunate that such a important tree like wireless doesn't
sent its fixes upstream every week.

> The patch has cc stable so I assume it should go quickly to stable
> releases.

To me it looks a lot like Greg most of the time only backports important
bug fixes during merge windows (e.g. when asked or when he noticed
something important) -- and leaves everything else often until after
rc1. And when there are a lot of backports, he might even do it in two
batches then. Hence it might easily take until ~May 11th or 18th (if we
get a rc8) until this fix reaches 6.2.y and 6.3.y.

Then it in the end would have taken about one month for the fix to reach
the users. That is really unfortunate. Preventing such situations (which
are not rare) is one of the main reason why handling-regressions.rst was
written like it is...

Ciao, Thorsten
Toke Høiland-Jørgensen April 20, 2023, 2:24 p.m. UTC | #9
Thorsten Leemhuis <regressions@leemhuis.info> writes:

> [CCing Jakub, Greg and the regressions list]
>
> On 19.04.23 06:54, Kalle Valo wrote:
>> Toke Høiland-Jørgensen <toke@toke.dk> writes:
>> 
>>>>> Anyway, cf the bugzilla
>
> [FWIW, it's this ticket Toke meant:
> https://bugzilla.kernel.org/show_bug.cgi?id=217286 ]
>
>>>>> this was a pretty bad regression for 6.2, so
>>>>> would be good to move this along reasonably quickly (although I guess we
>>>>> just missed the -net PR for rc7)...
>>>>
>>>> I'm not planning to send anymore stuff to v6.3 so my plan is to take
>>>> this to -next. The merge window is very close anyway so this shouldn't
>>>> cause too much delay.
>>>
>>> Hmm, okay, a bit unfortunate that we'll ship 6.3 with the same bug
>> 
>> Yeah, it is unfortunate.
>
> Agreed.
>
>> But it is always a question of time :) To save
>> time I usually try to send two wireless tree pull requests per cycle,
>> one early in the cycle and the second around the middle.
>
> Why not ask Linus to pull this directly from the list then? He doesn't
> mind doing that for an occasional regression fix. And then he can decide
> himself if the change is worth the risk -- and obviously can take into
> account if he'll release and rc8 or not.

I'm OK with doing it that way; I'll do so later tonight unless Kalle or
Jakub complains before then...

-Toke
Jakub Kicinski April 20, 2023, 2:50 p.m. UTC | #10
On Thu, 20 Apr 2023 16:24:53 +0200 Toke Høiland-Jørgensen wrote:
> >> But it is always a question of time :) To save
> >> time I usually try to send two wireless tree pull requests per cycle,
> >> one early in the cycle and the second around the middle.  
> >
> > Why not ask Linus to pull this directly from the list then?

Out of curiosity, Thorsten, do you have stats on "how long does it take
fixes to reach Linus" per tree? Stats get people to act much quicker
than pleas, just sayin' ;)

> > He doesn't
> > mind doing that for an occasional regression fix. And then he can decide
> > himself if the change is worth the risk -- and obviously can take into
> > account if he'll release and rc8 or not.  
> 
> I'm OK with doing it that way; I'll do so later tonight unless Kalle or
> Jakub complains before then...

Ah, just after our last(?) 6.3 PR was submitted :(
No objections to you posting this directly to Linus...

That said it is a 6.2 regression AFAICT so it's not exactly in the
"must be fixed in 6.3" category. Assuming Kalle doesn't want a PR -
should we take it into net and have it reach Linus either next Tue
(assuming no -rc8) or Thu (if -rc8)?
Toke Høiland-Jørgensen April 20, 2023, 3:56 p.m. UTC | #11
Jakub Kicinski <kuba@kernel.org> writes:

>> > He doesn't
>> > mind doing that for an occasional regression fix. And then he can decide
>> > himself if the change is worth the risk -- and obviously can take into
>> > account if he'll release and rc8 or not.  
>> 
>> I'm OK with doing it that way; I'll do so later tonight unless Kalle or
>> Jakub complains before then...
>
> Ah, just after our last(?) 6.3 PR was submitted :(
> No objections to you posting this directly to Linus...

Heh, yeah :( In retrospect, just asking you to take it directly into
-net might have been the expedient thing to do here.

> That said it is a 6.2 regression AFAICT so it's not exactly in the
> "must be fixed in 6.3" category. Assuming Kalle doesn't want a PR -
> should we take it into net and have it reach Linus either next Tue
> (assuming no -rc8) or Thu (if -rc8)?

How about I ping Linus and if he doesn't want to take it directly, you
can pull it into -net?

-Toke
Thorsten Leemhuis April 20, 2023, 3:59 p.m. UTC | #12
On 20.04.23 16:50, Jakub Kicinski wrote:
> On Thu, 20 Apr 2023 16:24:53 +0200 Toke Høiland-Jørgensen wrote:
>>>> But it is always a question of time :) To save
>>>> time I usually try to send two wireless tree pull requests per cycle,
>>>> one early in the cycle and the second around the middle.  
>>>
>>> Why not ask Linus to pull this directly from the list then?

BTW (to reply to two mails with one): Toke, thx for giving it a shot!

> Out of curiosity, Thorsten, do you have stats on "how long does it take
> fixes to reach Linus" per tree? Stats get people to act much quicker
> than pleas, just sayin' ;)

I know, I know... :-( Nevertheless thx for the reminder. :-D

I really wish that I had some, but right now the data I have in regzbot
is too messy and not a good base to generate such stats, as they would
likely be misleading (that's the long story short).

I once had the rough plan to approach this differently by looking at all
commits that ended up in the first big batches of stable updates (e.g.
releases like 6.0.3 with many hundreds of changes). I wanted to filter
out the regression fixes and then (1)look how long it took from posting
the fix till it was mainlined and (2)backported to stable. But there
afaics is no good way to automate the first part of the job: filtering
out fixes for regressions that actually bothered someone and might or
might not have been tracked by regzbot (the "might not" share might be
the bigger one, which is part of the valid stats problem indicated above).

>>> He doesn't
>>> mind doing that for an occasional regression fix. And then he can decide
>>> himself if the change is worth the risk -- and obviously can take into
>>> account if he'll release and rc8 or not.  
>>
>> I'm OK with doing it that way; I'll do so later tonight unless Kalle or
>> Jakub complains before then...
> 
> Ah, just after our last(?) 6.3 PR was submitted :(
> No objections to you posting this directly to Linus...
> 
> That said it is a 6.2 regression AFAICT so it's not exactly in the
> "must be fixed in 6.3" category.

Out of curiosity (really, it's not meant as a rhetorical device, I'm
trying to understand this point of view):

Why do you think that? And should it really be like that?

Sure, if it was an old problem (say from 5.18) that was only recently
found I'd agree, especially if the fix might have a more than average
risk of causing other trouble. But shouldn't we care about regressions
that were found shortly after a release (e.g. 6.2 in this case) at least
as much (or maybe even more?) as we care about those found during the
weeks preceding it?

FWIW, it's not the first time I hear a statement like that and every
time I wonder how Linus thinks about this. But whatever, not going to CC
him for that.

Ciao, Thorsten
Jakub Kicinski April 20, 2023, 4:39 p.m. UTC | #13
On Thu, 20 Apr 2023 17:56:26 +0200 Toke Høiland-Jørgensen wrote:
> > That said it is a 6.2 regression AFAICT so it's not exactly in the
> > "must be fixed in 6.3" category. Assuming Kalle doesn't want a PR -
> > should we take it into net and have it reach Linus either next Tue
> > (assuming no -rc8) or Thu (if -rc8)?  
> 
> How about I ping Linus and if he doesn't want to take it directly, you
> can pull it into -net?

Fine by me :)  Just make sure to CC netdev
Jakub Kicinski April 20, 2023, 4:55 p.m. UTC | #14
On Thu, 20 Apr 2023 17:59:49 +0200 Thorsten Leemhuis wrote:
> > Out of curiosity, Thorsten, do you have stats on "how long does it take
> > fixes to reach Linus" per tree? Stats get people to act much quicker
> > than pleas, just sayin' ;)  
> 
> I know, I know... :-( Nevertheless thx for the reminder. :-D
> 
> I really wish that I had some, but right now the data I have in regzbot
> is too messy and not a good base to generate such stats, as they would
> likely be misleading (that's the long story short).
> 
> I once had the rough plan to approach this differently by looking at all
> commits that ended up in the first big batches of stable updates (e.g.
> releases like 6.0.3 with many hundreds of changes). I wanted to filter
> out the regression fixes and then (1)look how long it took from posting
> the fix till it was mainlined and (2)backported to stable. But there
> afaics is no good way to automate the first part of the job: filtering
> out fixes for regressions that actually bothered someone and might or
> might not have been tracked by regzbot (the "might not" share might be
> the bigger one, which is part of the valid stats problem indicated above).

I wouldn't bother with the patches you didn't track in regzbot.
This probably depends on how various people apply / maintain their
patches (sigh) but for networking (and anything else that's pretty 
pure git management) author date of the commit should give you the
time when patch was posted.

So we could go regzbot report date -> author date in upstream -> commit
date in stable potentially without much coding.

Going to Linus's tree vs stable should also be possible. Chris Mason
has showed me once a git incantation which finds the merge commit in
Linus's tree at which a given patch has arrived.. but I lost it.

> >> I'm OK with doing it that way; I'll do so later tonight unless Kalle or
> >> Jakub complains before then...  
> > 
> > Ah, just after our last(?) 6.3 PR was submitted :(
> > No objections to you posting this directly to Linus...
> > 
> > That said it is a 6.2 regression AFAICT so it's not exactly in the
> > "must be fixed in 6.3" category.  
> 
> Out of curiosity (really, it's not meant as a rhetorical device, I'm
> trying to understand this point of view):
> 
> Why do you think that? And should it really be like that?
> 
> Sure, if it was an old problem (say from 5.18) that was only recently
> found I'd agree, especially if the fix might have a more than average
> risk of causing other trouble. But shouldn't we care about regressions
> that were found shortly after a release (e.g. 6.2 in this case) at least
> as much (or maybe even more?) as we care about those found during the
> weeks preceding it?
> 
> FWIW, it's not the first time I hear a statement like that and every
> time I wonder how Linus thinks about this. But whatever, not going to CC
> him for that.

I'm a but curious what Linus would think, too.

Just to be clear the assumption I operate on is that all regressions
are important to fix within reasonable time frame. The question is
whether it matters for this regression that we're close to final.
Whether we should engage extraordinary means to get the fix in before
final is cut.

If it was a 6.3 regression we should try as hard as we can to fix it
in final (e.g. the mlx5 regression), if it's in 6.2 already - the extra
week of waiting may not be worth skipping trees and bothering Linus.

IOW for older regressions there's only the question of whether the fix
is in upstream in a reasonable time. It doesn't matter as much which
particular tag it hits.
Thorsten Leemhuis April 20, 2023, 6:27 p.m. UTC | #15
[CCing Linus: there are two things near the end of this mail where we're
wondering how you feel about them]

On 20.04.23 18:55, Jakub Kicinski wrote:
> On Thu, 20 Apr 2023 17:59:49 +0200 Thorsten Leemhuis wrote:
>>> Out of curiosity, Thorsten, do you have stats on "how long does it take
>>> fixes to reach Linus" per tree? Stats get people to act much quicker
>>> than pleas, just sayin' ;)  
>>
>> I know, I know... :-( Nevertheless thx for the reminder. :-D
>>
>> I really wish that I had some, but right now the data I have in regzbot
>> is too messy and not a good base to generate such stats, as they would
>> likely be misleading (that's the long story short).
>>
>> I once had the rough plan to approach this differently by looking at all
>> commits that ended up in the first big batches of stable updates (e.g.
>> releases like 6.0.3 with many hundreds of changes). I wanted to filter
>> out the regression fixes and then (1)look how long it took from posting
>> the fix till it was mainlined and (2)backported to stable. But there
>> afaics is no good way to automate the first part of the job: filtering
>> out fixes for regressions that actually bothered someone and might or
>> might not have been tracked by regzbot (the "might not" share might be
>> the bigger one, which is part of the valid stats problem indicated above).
> 
> I wouldn't bother with the patches you didn't track in regzbot.
> This probably depends on how various people apply / maintain their
> patches (sigh) but for networking (and anything else that's pretty 
> pure git management) author date of the commit should give you the
> time when patch was posted.

Unless the patch went through several revisions during review. But
that's just a detail.

> So we could go regzbot report date -> author date in upstream -> commit
> date in stable potentially without much coding.
>
> Going to Linus's tree vs stable should also be possible. Chris Mason
> has showed me once a git incantation which finds the merge commit in
> Linus's tree at which a given patch has arrived.. but I lost it.

I have something ugly for that (I needed a python variant of this
somewhere in regzbot):

commit=91dcf1e80;
branch=origin/master;
git show $(git rev-list "${commit}".."${branch}" --ancestry-path |
grep -f <(git rev-list "${commit}".."${branch}" --first-parent) |
tail -1)

[note: this will fail for any changes Linus committed directly to mainline]

>>>> I'm OK with doing it that way; I'll do so later tonight unless Kalle or
>>>> Jakub complains before then...  
>>>
>>> Ah, just after our last(?) 6.3 PR was submitted :(
>>> No objections to you posting this directly to Linus...
>>>
>>> That said it is a 6.2 regression AFAICT so it's not exactly in the
>>> "must be fixed in 6.3" category.  
>>
>> Out of curiosity (really, it's not meant as a rhetorical device, I'm
>> trying to understand this point of view):
>>
>> Why do you think that? And should it really be like that?
>>
>> Sure, if it was an old problem (say from 5.18) that was only recently
>> found I'd agree, especially if the fix might have a more than average
>> risk of causing other trouble. But shouldn't we care about regressions
>> that were found shortly after a release (e.g. 6.2 in this case) at least
>> as much (or maybe even more?) as we care about those found during the
>> weeks preceding it?
>>
>> FWIW, it's not the first time I hear a statement like that and every
>> time I wonder how Linus thinks about this. But whatever, not going to CC
>> him for that.
> 
> I'm a but curious what Linus would think, too.

:-)

I CCed Linus now, as another question for him came up below. With a bit
of luck he'll share his view on these matters. And if not I'm pretty
sure we'll all live happily ever after, too. :-D

> Just to be clear the assumption I operate on is that all regressions
> are important to fix within reasonable time frame. The question is
> whether it matters for this regression that we're close to final.
> Whether we should engage extraordinary means to get the fix in before
> final is cut.

Well, the risk obviously is a factor here, especially during a potential
release week (like the one we're in); but I don't think the risk
evaluation for "fixes for regressions introduced in the previous cycle"
and "fixes for regressions introduced this cycle" should be much
different. But that's see what Linus thinks about this.

> If it was a 6.3 regression we should try as hard as we can to fix it
> in final (e.g. the mlx5 regression), if it's in 6.2 already - the extra
> week 

Well, yes, in the ideal case it is just a week. But at this point of the
devel cycle a one week delay in mainlining can easily result in a two
week delay till the fix reaches users through the stable trees. To explain:

* Assume this fix (posted one week ago) is not merged this week and
Linus releases 6.3 on Sunday; assume further the fix then will go to
mainline during the merge window (e.g. the one week delay scenario). If
that happens during the first or second week of the merge window doesn't
even matter much, as Greg apparently often waits till after -rc1 is
released before be starts backporting most fixes. And that's where the
extra week comes from. We of course could avoid this by bothering Greg
to pick up the fix once it reached mainline; but then it might be better
to bother Linus in the first place to merge is this week.

Not to mention that there would have been another week of delay, if (a)
I had not spoken up in this thread and (b) we get a rc8. But we afaics
avoided that scenario already with the plan to merge the fix next week
through the -net tree in case we get a rc8.

> of waiting may not be worth skipping trees and bothering Linus.

Linus, to you feel bothered by an occational additional pull request or
a mail asking you to pick up a patch directly from the list?

> IOW for older regressions there's only the question of whether the fix
> is in upstream in a reasonable time. It doesn't matter as much which
> particular tag it hits.

Ciao, Thorsten
Toke Høiland-Jørgensen April 20, 2023, 9:09 p.m. UTC | #16
Toke Høiland-Jørgensen <toke@toke.dk> writes:

> This partially reverts commit e161d4b60ae3a5356e07202e0bfedb5fad82c6aa.
>
> Turns out the channelmap variable is not actually read-only, it's modified
> through the MCI_GPM_CLR_CHANNEL_BIT() macro further down in the function,
> so making it read-only causes page faults when that code is hit.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217183
> Fixes: e161d4b60ae3 ("wifi: ath9k: Make arrays prof_prio and channelmap static const")
> Cc: stable@vger.kernel.org
> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>

Hi Linus

Thorsten already pulled you into the thread further down, but I figured
I'd do this writeup anyway so you have the full context:

The patch quoted above[0] fixes a regression in the ath9k driver that was
introduced in 6.2, which causes a kernel BUG() whenever the "Bluetooth
co-existence" feature in the driver is enabled (which seems to be the
default on at least some systems).

Because of unfortunate timing (caused by an impedance mismatch between
the wireless tree and the -net tree, and my failure to realise this and
push it directly to -net), this patch did not make it into this week's
network tree pull request to you. Which means that unless you decide to
do an -rc8, this regression will also be in the 6.3 release, and it may
take several more weeks before the fix makes it into a stable release.

So, with a bit of prodding from Thorsten, I'm writing this to ask you if
you'd be willing to pull this patch directly from the mailing list as a
one-off? It's a fairly small patch, and since it's a (partial) revert
the risk of it being the cause of new regressions should be fairly
small. One of the reporters on the Bugzilla (linked above) confirmed
that the patch does indeed fix the regression.

In case you *don't* want to take this patch directly, Jakub has agreed
to pull it directly into -net, in which case it'll land in your tree via
the next networking pull request. Either way, as indicated by the
sibling thread Thorsten Cc'ed you on, we'll take your opinion on the
best way to handle this into account in the future. Just let us know :)

Thanks,
-Toke


[0] Direct Lore link: https://lore.kernel.org/r/20230413214118.153781-1-toke@toke.dk
Linus Torvalds April 20, 2023, 10:27 p.m. UTC | #17
On Thu, Apr 20, 2023 at 2:09 PM Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
> So, with a bit of prodding from Thorsten, I'm writing this to ask you if
> you'd be willing to pull this patch directly from the mailing list as a
> one-off? It's a fairly small patch, and since it's a (partial) revert
> the risk of it being the cause of new regressions should be fairly
> small.

Sure. I'm always open to direct fixes when there is no controversy
about the fix. No problem. I still happily deal with individual
patches.

And yes, I do consider "regression in an earlier release" to be a
regression that needs fixing.

There's obviously a time limit: if that "regression in an earlier
release" was a year or more ago, and just took forever for people to
notice, and it had semantic changes that now mean that fixing the
regression could cause a _new_ regression, then that can cause me to
go "Oh, now the new semantics are what we have to live with".

But something like this, where the regression was in the previous
release and it's just a clear fix with no semantic subtlety, I
consider to be just a regular regression that should be expedited -
partly to make it into stable, and partly to avoid having to put the
fix into _another_ stable kernel..

             Linus
Toke Høiland-Jørgensen April 20, 2023, 10:38 p.m. UTC | #18
Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Thu, Apr 20, 2023 at 2:09 PM Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>>
>> So, with a bit of prodding from Thorsten, I'm writing this to ask you if
>> you'd be willing to pull this patch directly from the mailing list as a
>> one-off? It's a fairly small patch, and since it's a (partial) revert
>> the risk of it being the cause of new regressions should be fairly
>> small.
>
> Sure. I'm always open to direct fixes when there is no controversy
> about the fix. No problem. I still happily deal with individual
> patches.

Awesome, thanks!

> And yes, I do consider "regression in an earlier release" to be a
> regression that needs fixing.
>
> There's obviously a time limit: if that "regression in an earlier
> release" was a year or more ago, and just took forever for people to
> notice, and it had semantic changes that now mean that fixing the
> regression could cause a _new_ regression, then that can cause me to
> go "Oh, now the new semantics are what we have to live with".
>
> But something like this, where the regression was in the previous
> release and it's just a clear fix with no semantic subtlety, I
> consider to be just a regular regression that should be expedited -
> partly to make it into stable, and partly to avoid having to put the
> fix into _another_ stable kernel..

OK, duly noted; thank you for clarifying :)

-Toke
diff mbox series

Patch

diff --git a/drivers/net/wireless/ath/ath9k/mci.c b/drivers/net/wireless/ath/ath9k/mci.c
index 3363fc4e8966..a0845002d6fe 100644
--- a/drivers/net/wireless/ath/ath9k/mci.c
+++ b/drivers/net/wireless/ath/ath9k/mci.c
@@ -646,9 +646,7 @@  void ath9k_mci_update_wlan_channels(struct ath_softc *sc, bool allow_all)
 	struct ath_hw *ah = sc->sc_ah;
 	struct ath9k_hw_mci *mci = &ah->btcoex_hw.mci;
 	struct ath9k_channel *chan = ah->curchan;
-	static const u32 channelmap[] = {
-		0x00000000, 0xffff0000, 0xffffffff, 0x7fffffff
-	};
+	u32 channelmap[] = {0x00000000, 0xffff0000, 0xffffffff, 0x7fffffff};
 	int i;
 	s16 chan_start, chan_end;
 	u16 wlan_chan;