diff mbox series

iwlwifi: Fix softirq/hardirq disabling in iwl_pcie_gen2_enqueue_hcmd()

Message ID 20210415120447.4424-1-hdegoede@redhat.com
State New
Delegated to: Luca Coelho
Headers show
Series iwlwifi: Fix softirq/hardirq disabling in iwl_pcie_gen2_enqueue_hcmd() | expand

Commit Message

Hans de Goede April 15, 2021, 12:04 p.m. UTC
This fixes the same locking problem fixed by commit 2800aadc18a6 ("iwlwifi:
Fix softirq/hardirq disabling in iwl_pcie_enqueue_hcmd()") but then for
the gen2 variant of enqueue_hcmd().

It's possible for iwl_pcie_enqueue_hcmd() to be called with hard IRQs
disabled (e.g. from LED core). We can't enable BHs in such a situation.

Turn the unconditional BH-enable/BH-disable code into
hardirq-disable/conditional-enable.

This fixes the warning below.

[   36.763543] WARNING: CPU: 6 PID: 1582 at kernel/softirq.c:178 __local_bh_enable_ip+0x97/0xd0
[   36.763550] Modules linked in: cmac bnep vfat fat snd_ctl_led snd_soc_skl_hda_dsp snd_soc_hdac_hdmi snd_soc_dmic snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_sof_pci_intel_cnl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda iTCO_wdt mei_wdt snd_sof_pci mei_hdcp intel_pmc_bxt iTCO_vendor_support snd_sof snd_sof_xtensa_dsp snd_soc_skl intel_rapl_msr snd_soc_hdac_hda snd_hda_ext_core snd_soc_sst_ipc x86_pkg_temp_thermal snd_soc_sst_dsp snd_soc_acpi_intel_match intel_powerclamp snd_soc_acpi coretemp snd_soc_core snd_compress ac97_bus cdc_ether kvm_intel usbnet snd_pcm_dmaengine r8152 snd_hda_intel mii snd_intel_dspcfg snd_usb_audio snd_intel_sdw_acpi snd_usbmidi_lib kvm snd_rawmidi snd_hda_codec irqbypass snd_hda_core snd_hwdep rapl intel_cstate snd_seq intel_uncore iwlmvm snd_seq_device pcspkr snd_pcm wmi_bmof intel_wmi_thunderbolt e1000e mac80211 uvcvideo btusb i2c_i801 videobuf2_vmalloc videobuf2_memops btrtl
[   36.763661]  snd_timer i2c_smbus videobuf2_v4l2 btbcm btintel libarc4 thunderbolt videobuf2_common bluetooth videodev mei_me iwlwifi mei ecdh_generic mc ecc nxp_nci_i2c nxp_nci joydev processor_thermal_device ucsi_acpi nci processor_thermal_rfim processor_thermal_mbox cfg80211 typec_ucsi processor_thermal_rapl intel_pch_thermal intel_rapl_common idma64 intel_soc_dts_iosf typec nfc int3403_thermal int340x_thermal_zone soc_button_array intel_hid sparse_keymap acpi_pad int3400_thermal acpi_thermal_rel binfmt_misc zram ip_tables dm_crypt trusted hid_logitech_hidpp hid_logitech_dj uas usb_storage crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel i915 serio_raw nvme nvme_core i2c_algo_bit drm_kms_helper cec hid_multitouch drm wmi i2c_hid_acpi i2c_hid thinkpad_acpi ledtrig_audio platform_profile snd soundcore rfkill drm_privacy_screen_helper video pinctrl_cannonlake i2c_dev fuse
[   36.763775] CPU: 6 PID: 1582 Comm: NetworkManager Not tainted 5.12.0-rc7+ #303
[   36.763778] Hardware name: LENOVO 20U90SIT19/20U90SIT19, BIOS N2WET24W (1.14 ) 10/15/2020
[   36.763780] RIP: 0010:__local_bh_enable_ip+0x97/0xd0
[   36.763783] Code: f3 48 a9 00 ff ff 00 74 36 65 ff 0d b3 02 f3 48 e8 de ee 12 00 fb 66 0f 1f 44 00 00 5b 5d c3 65 8b 05 d9 0a f3 48 85 c0 75 9c <0f> 0b eb 98 e8 00 ee 12 00 eb a7 48 89 ef e8 16 14 07 00 eb b0 65
[   36.763786] RSP: 0018:ffffbc2e016b72d0 EFLAGS: 00010046
[   36.763790] RAX: 0000000000000000 RBX: 0000000000000201 RCX: 0000000000000000
[   36.763792] RDX: 0000000000000003 RSI: 0000000000000201 RDI: ffffffffc0a8527b
[   36.763794] RBP: ffffffffc0a8527b R08: 0000000000000000 R09: 0000000000000001
[   36.763796] R10: ffffbc2e016b71c0 R11: 0000000000000001 R12: ffffa0570c902c10
[   36.763798] R13: 0000000000000000 R14: 0000000080000000 R15: 0000000000000000
[   36.763801] FS:  00007f44b889bbc0(0000) GS:ffffa05a5a580000(0000) knlGS:0000000000000000
[   36.763803] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.763805] CR2: 00007fde6d88b8f0 CR3: 0000000142b08005 CR4: 00000000003706e0
[   36.763808] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   36.763810] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   36.763812] Call Trace:
[   36.763815]  iwl_pcie_gen2_enqueue_hcmd+0x56b/0x8c0 [iwlwifi]
[   36.763848]  iwl_trans_txq_send_hcmd+0x59/0x440 [iwlwifi]
[   36.763872]  iwl_trans_send_cmd+0x81/0x180 [iwlwifi]
[   36.763890]  iwl_mvm_send_cmd+0x28/0x80 [iwlmvm]
[   36.763910]  iwl_mvm_led_set+0x9c/0xc0 [iwlmvm]
[   36.763933]  ? _raw_read_lock_irqsave+0x88/0xa0
[   36.763938]  led_trigger_event+0x46/0x70
[   36.763945]  ieee80211_do_open+0x4c2/0xa70 [mac80211]
[   36.764006]  ieee80211_open+0x69/0x90 [mac80211]
[   36.764057]  __dev_open+0xd4/0x1a0
[   36.764063]  __dev_change_flags+0x1c8/0x240
[   36.764072]  dev_change_flags+0x21/0x60
[   36.764077]  do_setlink+0x238/0x1110
[   36.764084]  ? cpumask_next+0x17/0x20
[   36.764087]  ? __snmp6_fill_stats64.constprop.0+0x53/0xe0
[   36.764093]  ? __nla_validate_parse+0x4f/0xbf0
[   36.764109]  __rtnl_newlink+0x601/0x9b0
[   36.764116]  ? __lock_acquire+0x389/0x1e10
[   36.764130]  ? lock_acquire+0xb5/0x380
[   36.764134]  ? sock_def_readable+0x5/0x2a0
[   36.764136]  ? lock_is_held_type+0xa5/0x120
[   36.764141]  ? find_held_lock+0x2b/0x80
[   36.764146]  ? sock_def_readable+0xb0/0x2a0
[   36.764148]  ? lock_release+0xba/0x2a0
[   36.764156]  ? netlink_unicast+0x1f7/0x230
[   36.764163]  ? rtnl_getlink+0x364/0x3e0
[   36.764207]  ? rcu_read_lock_sched_held+0x3f/0x80
[   36.764211]  ? kmem_cache_alloc_trace+0x29a/0x2c0
[   36.764220]  rtnl_newlink+0x44/0x70
[   36.764225]  rtnetlink_rcv_msg+0x16e/0x480
[   36.764229]  ? netlink_deliver_tap+0x95/0x3d0
[   36.764236]  ? rtnetlink_put_metrics+0x1c0/0x1c0
[   36.764241]  netlink_rcv_skb+0x50/0xf0
[   36.764251]  netlink_unicast+0x16d/0x230
[   36.764258]  netlink_sendmsg+0x24d/0x480
[   36.764270]  sock_sendmsg+0x5e/0x60
[   36.764274]  ____sys_sendmsg+0x22f/0x270
[   36.764278]  ? import_iovec+0x17/0x20
[   36.764282]  ? sendmsg_copy_msghdr+0x59/0x90
[   36.764290]  ___sys_sendmsg+0x81/0xc0
[   36.764303]  ? lock_is_held_type+0xa5/0x120
[   36.764307]  ? find_held_lock+0x2b/0x80
[   36.764313]  ? __fget_files+0xd0/0x1a0
[   36.764316]  ? lock_release+0xba/0x2a0
[   36.764324]  ? __fget_files+0xef/0x1a0
[   36.764333]  __sys_sendmsg+0x49/0x80
[   36.764342]  ? syscall_enter_from_user_mode+0x27/0x80
[   36.764348]  do_syscall_64+0x33/0x40
[   36.764352]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   36.764355] RIP: 0033:0x7f44b99876fd
[   36.764359] Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 fa ee ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 4e ef ff ff 48
[   36.764361] RSP: 002b:00007ffca4a55940 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[   36.764365] RAX: ffffffffffffffda RBX: 0000000000000014 RCX: 00007f44b99876fd
[   36.764368] RDX: 0000000000000000 RSI: 00007ffca4a55980 RDI: 000000000000000c
[   36.764370] RBP: 0000560f3b81d030 R08: 0000000000000000 R09: 0000000000000000
[   36.764372] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
[   36.764374] R13: 00007ffca4a55ae0 R14: 00007ffca4a55adc R15: 0000000000000000
[   36.764390] irq event stamp: 246365
[   36.764391] hardirqs last  enabled at (246363): [<ffffffffb70e7b12>] __local_bh_enable_ip+0x82/0xd0
[   36.764395] hardirqs last disabled at (246364): [<ffffffffb7d111c4>] _raw_read_lock_irqsave+0x94/0xa0
[   36.764398] softirqs last  enabled at (246362): [<ffffffffc0a8527b>] iwl_pcie_gen2_enqueue_hcmd+0x56b/0x8c0 [iwlwifi]
[   36.764414] softirqs last disabled at (246365): [<ffffffffc0a84dfb>] iwl_pcie_gen2_enqueue_hcmd+0xeb/0x8c0 [iwlwifi]

Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

Johannes Berg April 15, 2021, 12:21 p.m. UTC | #1
On Thu, 2021-04-15 at 14:04 +0200, Hans de Goede wrote:
> This fixes the same locking problem fixed by commit 2800aadc18a6 ("iwlwifi:
> Fix softirq/hardirq disabling in iwl_pcie_enqueue_hcmd()") but then for
> the gen2 variant of enqueue_hcmd().
> 
> It's possible for iwl_pcie_enqueue_hcmd() to be called with hard IRQs
> disabled (e.g. from LED core). We can't enable BHs in such a situation.
> 
> Turn the unconditional BH-enable/BH-disable code into
> hardirq-disable/conditional-enable.
> 
> This fixes the warning below.

I believe Jiri posted the same patch:

https://lore.kernel.org/linux-wireless/nycvar.YFH.7.76.2104070918090.12405@cbobk.fhfr.pm/

Not sure where it is now though, I guess Luca can comment.

I also had another fix in this area too.

johannes
Coelho, Luciano April 15, 2021, 12:37 p.m. UTC | #2
On Thu, 2021-04-15 at 14:21 +0200, Johannes Berg wrote:
> On Thu, 2021-04-15 at 14:04 +0200, Hans de Goede wrote:
> > This fixes the same locking problem fixed by commit 2800aadc18a6 ("iwlwifi:
> > Fix softirq/hardirq disabling in iwl_pcie_enqueue_hcmd()") but then for
> > the gen2 variant of enqueue_hcmd().
> > 
> > It's possible for iwl_pcie_enqueue_hcmd() to be called with hard IRQs
> > disabled (e.g. from LED core). We can't enable BHs in such a situation.
> > 
> > Turn the unconditional BH-enable/BH-disable code into
> > hardirq-disable/conditional-enable.
> > 
> > This fixes the warning below.
> 
> I believe Jiri posted the same patch:
> 
> https://lore.kernel.org/linux-wireless/nycvar.YFH.7.76.2104070918090.12405@cbobk.fhfr.pm/
> 
> Not sure where it is now though, I guess Luca can comment.

Jiri's patch is in 5.12-rc7.


> I also had another fix in this area too.

Your patch was not sent out yet.  Is this serious enough to justify
trying to get it into 5.12 so late in the series? Maybe it makes more
sense to wait for stable...

--
Luca.
Johannes Berg April 15, 2021, 12:37 p.m. UTC | #3
On Thu, 2021-04-15 at 12:37 +0000, Coelho, Luciano wrote:
> 
> > I also had another fix in this area too.
> 
> Your patch was not sent out yet.  Is this serious enough to justify
> trying to get it into 5.12 so late in the series? Maybe it makes more
> sense to wait for stable...

It *is* pretty serious, and given that Linus is contemplating rc8 I'd
probably say we could try?

johannes
Coelho, Luciano April 15, 2021, 12:43 p.m. UTC | #4
On Thu, 2021-04-15 at 14:37 +0200, Johannes Berg wrote:
> On Thu, 2021-04-15 at 12:37 +0000, Coelho, Luciano wrote:
> > 
> > > I also had another fix in this area too.
> > 
> > Your patch was not sent out yet.  Is this serious enough to justify
> > trying to get it into 5.12 so late in the series? Maybe it makes more
> > sense to wait for stable...
> 
> It *is* pretty serious, and given that Linus is contemplating rc8 I'd
> probably say we could try?

Okay, I'll send it out now and we can try to take it forward.  Kalle,
is that okay with you? As usual, this is for fixes and should go
directly to your tree.

--
Cheers,
Luca.
Jiri Kosina April 15, 2021, 12:44 p.m. UTC | #5
On Thu, 15 Apr 2021, Coelho, Luciano wrote:

> > I believe Jiri posted the same patch:
> > 
> > https://lore.kernel.org/linux-wireless/nycvar.YFH.7.76.2104070918090.12405@cbobk.fhfr.pm/
> > 
> > Not sure where it is now though, I guess Luca can comment.
> 
> Jiri's patch is in 5.12-rc7.

The iwl_pcie_enqueue_hcmd() is. The one referenced above 
(iwl_pcie_gen2_enqueue_hcmd()) is nowhere as far as I can tell.
Coelho, Luciano April 15, 2021, 1:04 p.m. UTC | #6
On Thu, 2021-04-15 at 14:44 +0200, Jiri Kosina wrote:
> On Thu, 15 Apr 2021, Coelho, Luciano wrote:
> 
> > > I believe Jiri posted the same patch:
> > > 
> > > https://lore.kernel.org/linux-wireless/nycvar.YFH.7.76.2104070918090.12405@cbobk.fhfr.pm/
> > > 
> > > Not sure where it is now though, I guess Luca can comment.
> > 
> > Jiri's patch is in 5.12-rc7.
> 
> The iwl_pcie_enqueue_hcmd() is. The one referenced above 
> (iwl_pcie_gen2_enqueue_hcmd()) is nowhere as far as I can tell.

Right, sorry, I confused the two.

In that same thread Johannes sent a third patch (which we have in our
internal tree).  Johannes, with your patch, the gen2 version is also
needed right?

Kalle, can you take that patch directly to your tree? I'll assign it to
you in patchwork.

And I'll send Johannes' patch out now too.

--
Cheers,
Luca.
Johannes Berg April 15, 2021, 1:06 p.m. UTC | #7
On Thu, 2021-04-15 at 13:04 +0000, Coelho, Luciano wrote:
> On Thu, 2021-04-15 at 14:44 +0200, Jiri Kosina wrote:
> > On Thu, 15 Apr 2021, Coelho, Luciano wrote:
> > 
> > > > I believe Jiri posted the same patch:
> > > > 
> > > > https://lore.kernel.org/linux-wireless/nycvar.YFH.7.76.2104070918090.12405@cbobk.fhfr.pm/
> > > > 
> > > > Not sure where it is now though, I guess Luca can comment.
> > > 
> > > Jiri's patch is in 5.12-rc7.
> > 
> > The iwl_pcie_enqueue_hcmd() is. The one referenced above 
> > (iwl_pcie_gen2_enqueue_hcmd()) is nowhere as far as I can tell.
> 
> Right, sorry, I confused the two.
> 
> In that same thread Johannes sent a third patch (which we have in our
> internal tree).  Johannes, with your patch, the gen2 version is also
> needed right?

Yes, they're independent.

johannes
Coelho, Luciano April 15, 2021, 1:54 p.m. UTC | #8
On Thu, 2021-04-15 at 14:44 +0200, Jiri Kosina wrote:
> On Thu, 15 Apr 2021, Coelho, Luciano wrote:
> 
> > > I believe Jiri posted the same patch:
> > > 
> > > https://lore.kernel.org/linux-wireless/nycvar.YFH.7.76.2104070918090.12405@cbobk.fhfr.pm/
> > > 
> > > Not sure where it is now though, I guess Luca can comment.
> > 
> > Jiri's patch is in 5.12-rc7.
> 
> The iwl_pcie_enqueue_hcmd() is. The one referenced above 
> (iwl_pcie_gen2_enqueue_hcmd()) is nowhere as far as I can tell.

Sorry, I missed that one somehow.  Kalle, I have assigned it to you. 
Can you please take it for -fixes (i.e. still for v5.12)? It's a
serious fix.

--
Cheers,
Luca.
Hans de Goede April 15, 2021, 3:32 p.m. UTC | #9
Hi,

On 4/15/21 2:21 PM, Johannes Berg wrote:
> On Thu, 2021-04-15 at 14:04 +0200, Hans de Goede wrote:
>> This fixes the same locking problem fixed by commit 2800aadc18a6 ("iwlwifi:
>> Fix softirq/hardirq disabling in iwl_pcie_enqueue_hcmd()") but then for
>> the gen2 variant of enqueue_hcmd().
>>
>> It's possible for iwl_pcie_enqueue_hcmd() to be called with hard IRQs
>> disabled (e.g. from LED core). We can't enable BHs in such a situation.
>>
>> Turn the unconditional BH-enable/BH-disable code into
>> hardirq-disable/conditional-enable.
>>
>> This fixes the warning below.
> 
> I believe Jiri posted the same patch:
> 
> https://lore.kernel.org/linux-wireless/nycvar.YFH.7.76.2104070918090.12405@cbobk.fhfr.pm/

Ah yes that is the same patch. I did reference Jiri's patch
for fixing the same issue in iwl_pcie_enqueue_hcmd() in the commit
message, without knowing that Jiri had send a later patch which also fixes
this in  iwl_pcie_gen2_enqueue_hcmd().

Going with Jiri's patch is fine then, but it would be nice if we
can get a fix for this in place soon-ish, which I see has
already been discussed further down in this thread :)

Regards,

Hans
diff mbox series

Patch

diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c
index 4456abb9a074..34bde8c87324 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c
@@ -40,6 +40,7 @@  int iwl_pcie_gen2_enqueue_hcmd(struct iwl_trans *trans,
 	const u8 *cmddata[IWL_MAX_CMD_TBS_PER_TFD];
 	u16 cmdlen[IWL_MAX_CMD_TBS_PER_TFD];
 	struct iwl_tfh_tfd *tfd;
+	unsigned long flags;
 
 	copy_size = sizeof(struct iwl_cmd_header_wide);
 	cmd_size = sizeof(struct iwl_cmd_header_wide);
@@ -108,14 +109,14 @@  int iwl_pcie_gen2_enqueue_hcmd(struct iwl_trans *trans,
 		goto free_dup_buf;
 	}
 
-	spin_lock_bh(&txq->lock);
+	spin_lock_irqsave(&txq->lock, flags);
 
 	idx = iwl_txq_get_cmd_index(txq, txq->write_ptr);
 	tfd = iwl_txq_get_tfd(trans, txq, txq->write_ptr);
 	memset(tfd, 0, sizeof(*tfd));
 
 	if (iwl_txq_space(trans, txq) < ((cmd->flags & CMD_ASYNC) ? 2 : 1)) {
-		spin_unlock_bh(&txq->lock);
+		spin_unlock_irqrestore(&txq->lock, flags);
 
 		IWL_ERR(trans, "No space in command queue\n");
 		iwl_op_mode_cmd_queue_full(trans->op_mode);
@@ -250,7 +251,7 @@  int iwl_pcie_gen2_enqueue_hcmd(struct iwl_trans *trans,
 	spin_unlock(&trans_pcie->reg_lock);
 
 out:
-	spin_unlock_bh(&txq->lock);
+	spin_unlock_irqrestore(&txq->lock, flags);
 free_dup_buf:
 	if (idx < 0)
 		kfree(dup_buf);