Message ID | ThT2jwvySn9tFQm9FxrlPNMQkiGUnx_87cOhmmeexoVOFZqOkpjmAntCWG-kIBMj2830LHZaOULlJxQKiRXkVtGYrrT8rBaB7R65qjIinYo=@dannyvanheumen.nl (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Kalle Valo |
Headers | show |
Series | [v3] brcmfmac: prevent double-free on hardware-reset. | expand |
+ Uffe On 6/17/2022 4:24 PM, Danny van Heumen wrote: > From f1fcceb65d4a44c078cd684ea25a2f2c7f53deb2 Mon Sep 17 00:00:00 2001 > From: Danny van Heumen <danny@dannyvanheumen.nl> > Date: Tue, 24 May 2022 18:30:50 +0200 > Subject: [PATCH v3] brcmfmac: prevent double-free on hardware-reset. > > In case of buggy firmware, brcmfmac may perform a hardware reset. If during > reset and subsequent probing an early failure occurs, a memory region is > accidentally double-freed. With hardened memory allocation enabled, this error > will be detected. > > - return early where appropriate to skip unnecessary clean-up. > - set '.freezer' pointer to NULL to prevent double-freeing under possible > other circumstances and to re-align result under various different > behaviors of memory allocation freeing. > > Given a firmware crash, function 'brcmf_sdio_bus_reset' is called. It calls > 'brcmf_sdiod_remove', then follows up with 'brcmf_sdiod_probe' to reinitialize > the hardware. If 'brcmf_sdiod_probe' fails to "set F1 blocksize", it exits > early, which includes calling 'brcmf_sdiod_remove'. In both cases > 'brcmf_sdiod_freezer_detach' is called to free allocated '.freezer', which > has not yet been re-allocated the second time. For testing you can also trigger the brcmf_sdio_bus_reset() through debugfs: # cd /sys/kernel/debug/ieee80211/phyX # echo 1 > reset So I did that without your patch and observed following: [ 531.481045] brcmfmac: brcmf_sdiod_probe: Failed to set F1 blocksize [ 531.487314] brcmfmac: brcmf_sdio_bus_reset: Failed to probe after sdio device reset: ret -123 [ 531.495893] mmc0: card 0001 removed [ 531.550567] mmc0: new high speed SDIO card at address 0001 [ 531.556561] brcmfmac: F1 signature read @0x18000000=0x16044330 So I looked in brcmf_sdio_bus_reset() and noticed that this function actually does a call to mmc_hw_reset() which is what I suggested earlier. Looking at this log it seems the actual remove and subsequent rescan is a deferred work and it will take care of probing the driver anew. So I changed debug message level for the sdio ops and then I see: [ 1327.686828] brcmfmac: brcmf_sdiod_probe: Failed to set F1 blocksize [ 1327.693091] brcmfmac: brcmf_sdio_bus_reset: Failed to probe after sdio device reset: ret -123 [ 1327.701626] brcmfmac: brcmf_ops_sdio_remove Enter [ 1327.706333] brcmfmac: brcmf_ops_sdio_remove Function: 1 [ 1327.711557] brcmfmac: brcmf_ops_sdio_remove Exit [ 1327.716203] brcmfmac: brcmf_ops_sdio_remove Enter [ 1327.720911] brcmfmac: brcmf_ops_sdio_remove Function: 2 [ 1327.726135] brcmfmac: brcmf_ops_sdio_remove Exit [ 1327.730768] mmc0: card 0001 removed [ 1327.785458] mmc0: new high speed SDIO card at address 0001 [ 1327.791068] brcmfmac: brcmf_ops_sdio_probe Enter [ 1327.795687] brcmfmac: brcmf_ops_sdio_probe Function#: 1 [ 1327.800988] brcmfmac: brcmf_ops_sdio_probe Enter [ 1327.805610] brcmfmac: brcmf_ops_sdio_probe Function#: 2 [ 1327.811317] brcmfmac: F1 signature read @0x18000000=0x16044330 So to me it seems mmc_hw_reset() is doing everything we need and we can make brcmf_sdio_bus_reset() a bit simpler. The only scary part here is that brcmf_ops_sdio_remove() is called first for func1 and then for func2. Upon rmmod this is different, ie. first func2 and then func1. Looking at the implementation in brcmf_ops_sdio_remove() it means we run into use-after-free upon mmc_hw_reset(). Regards, Arend diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c index 212fbbe1cd7e..21e15f571eea 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c @@ -4158,24 +4158,15 @@ static int brcmf_sdio_bus_reset(struct device *dev) brcmf_dbg(SDIO, "Enter\n"); - /* start by unregistering irqs */ brcmf_sdiod_intr_unregister(sdiodev); - brcmf_sdiod_remove(sdiodev); + brcmf_bus_change_state(sdiodev->bus_if, BRCMF_BUS_DOWN); - /* reset the adapter */ + /* reset the card */ sdio_claim_host(sdiodev->func1); mmc_hw_reset(sdiodev->func1->card); sdio_release_host(sdiodev->func1); - brcmf_bus_change_state(sdiodev->bus_if, BRCMF_BUS_DOWN); - - ret = brcmf_sdiod_probe(sdiodev); - if (ret) { - brcmf_err("Failed to probe after sdio device reset: ret %d\n", - ret); - } - return ret; }
Hi all, I missed the response for a while. Following up in-line. ------- Original Message ------- On Wednesday, June 22nd, 2022 at 14:36, Arend van Spriel <arend.vanspriel@broadcom.com> wrote: > + Uffe > > On 6/17/2022 4:24 PM, Danny van Heumen wrote: > > > From f1fcceb65d4a44c078cd684ea25a2f2c7f53deb2 Mon Sep 17 00:00:00 2001 > > From: Danny van Heumen danny@dannyvanheumen.nl > > Date: Tue, 24 May 2022 18:30:50 +0200 > > Subject: [PATCH v3] brcmfmac: prevent double-free on hardware-reset. > > > > In case of buggy firmware, brcmfmac may perform a hardware reset. If during > > reset and subsequent probing an early failure occurs, a memory region is > > accidentally double-freed. With hardened memory allocation enabled, this error > > will be detected. > > > > - return early where appropriate to skip unnecessary clean-up. > > - set '.freezer' pointer to NULL to prevent double-freeing under possible > > other circumstances and to re-align result under various different > > behaviors of memory allocation freeing. > > > > Given a firmware crash, function 'brcmf_sdio_bus_reset' is called. It calls > > 'brcmf_sdiod_remove', then follows up with 'brcmf_sdiod_probe' to reinitialize > > the hardware. If 'brcmf_sdiod_probe' fails to "set F1 blocksize", it exits > > early, which includes calling 'brcmf_sdiod_remove'. In both cases > > 'brcmf_sdiod_freezer_detach' is called to free allocated '.freezer', which > > has not yet been re-allocated the second time. > > > For testing you can also trigger the brcmf_sdio_bus_reset() through debugfs: > > # cd /sys/kernel/debug/ieee80211/phyX > # echo 1 > reset This works. I have done a few resets in a loop to see if the hardware keeps recovering and no other issues occur. Everything seems to be fine. > > > So I did that without your patch and observed following: > > [ 531.481045] brcmfmac: brcmf_sdiod_probe: Failed to set F1 blocksize > > [ 531.487314] brcmfmac: brcmf_sdio_bus_reset: Failed to probe after > sdio device > reset: ret -123 > > [ 531.495893] mmc0: card 0001 removed > > [ 531.550567] mmc0: new high speed SDIO card at address 0001 > > [ 531.556561] brcmfmac: F1 signature read @0x18000000=0x16044330 > > So I looked in brcmf_sdio_bus_reset() and noticed that this function > actually does a call to mmc_hw_reset() which is what I suggested > earlier. Looking at this log it seems the actual remove and subsequent > rescan is a deferred work and it will take care of probing the driver anew. I can confirm that this works. This is helpful. > > So I changed debug message level for the sdio ops and then I see: > > [ 1327.686828] brcmfmac: brcmf_sdiod_probe: Failed to set F1 blocksize > > [ 1327.693091] brcmfmac: brcmf_sdio_bus_reset: Failed to probe after > sdio device > reset: ret -123 > > [ 1327.701626] brcmfmac: brcmf_ops_sdio_remove Enter > > [ 1327.706333] brcmfmac: brcmf_ops_sdio_remove Function: 1 > > [ 1327.711557] brcmfmac: brcmf_ops_sdio_remove Exit > > [ 1327.716203] brcmfmac: brcmf_ops_sdio_remove Enter > > [ 1327.720911] brcmfmac: brcmf_ops_sdio_remove Function: 2 > > [ 1327.726135] brcmfmac: brcmf_ops_sdio_remove Exit > > [ 1327.730768] mmc0: card 0001 removed > > [ 1327.785458] mmc0: new high speed SDIO card at address 0001 > > [ 1327.791068] brcmfmac: brcmf_ops_sdio_probe Enter > > [ 1327.795687] brcmfmac: brcmf_ops_sdio_probe Function#: 1 > > [ 1327.800988] brcmfmac: brcmf_ops_sdio_probe Enter > > [ 1327.805610] brcmfmac: brcmf_ops_sdio_probe Function#: 2 > > [ 1327.811317] brcmfmac: F1 signature read @0x18000000=0x16044330 > > So to me it seems mmc_hw_reset() is doing everything we need and we can > make brcmf_sdio_bus_reset() a bit simpler. The only scary part here is > that brcmf_ops_sdio_remove() is called first for func1 and then for > func2. Upon rmmod this is different, ie. first func2 and then func1. > Looking at the implementation in brcmf_ops_sdio_remove() it means we run > into use-after-free upon mmc_hw_reset(). I have looked into this. The logic itself protects from duplicate behavior (at least partially) by checking which func is being removed. However, irq-releases are duplicated because the check only happens after that logic. This is also necessary, because the kernel warns if the irq isn't released after executing the callback. In the next patch (prefixed "PATCH v4", already sent), I have slightly adjusted the control flow such that irq-release happens for each func -- as per expectation -- while the removal logic remains rooted at func1, but executed based on the full device (`sdiodev` as function arg). This should, IIUC, be correct regardless of function order. This solution isn't a perfect redesign, but should respect both the design itself and also sdio driver expectations. Please find the patch at: g_Py6bM1lfcJOWWmHwKU8x4tCFrTRdgFtoM13qYHeN441F392j_6etJnEJ8gHJMRZ6OEKxpJYuP45x3iziHqY6HNXnVwIiyvJLYjvzxT0Xk=%20()%20dannyvanheumen%20!%20nl Regards, Danny
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c index 9c598ea97499..40f40ac4ef73 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c @@ -802,6 +802,7 @@ static void brcmf_sdiod_freezer_detach(struct brcmf_sdio_dev *sdiodev) if (sdiodev->freezer) { WARN_ON(atomic_read(&sdiodev->freezer->freezing)); kfree(sdiodev->freezer); + sdiodev->freezer = NULL; } } @@ -911,7 +912,7 @@ int brcmf_sdiod_probe(struct brcmf_sdio_dev *sdiodev) if (ret) { brcmf_err("Failed to set F1 blocksize\n"); sdio_release_host(sdiodev->func1); - goto out; + return ret; } switch (sdiodev->func2->device) { case SDIO_DEVICE_ID_BROADCOM_CYPRESS_4373: @@ -933,7 +934,7 @@ int brcmf_sdiod_probe(struct brcmf_sdio_dev *sdiodev) if (ret) { brcmf_err("Failed to set F2 blocksize\n"); sdio_release_host(sdiodev->func1); - goto out; + return ret; } else { brcmf_dbg(SDIO, "set F2 blocksize to %d\n", f2_blksz); }