Message ID | 20250114-ufshcd-fix-v2-1-2dc627590a4a@linaro.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2] scsi: ufs: fix use-after free in init error and remove paths | expand |
On Tue, 2025-01-14 at 16:16 +0000, André Draszik wrote: > devm_blk_crypto_profile_init() registers a cleanup handler to run when > the associated (platform-) device is being released. For UFS, the > crypto private data and pointers are stored as part of the ufs_hba's > data structure 'struct ufs_hba::crypto_profile'. This structure is > allocated as part of the underlying ufshd allocation. > > During driver release or during error handling in ufshcd_pltfrm_init(), > this structure is released as part of ufshcd_dealloc_host() before the > (platform-) device associated with the crypto call above is released. > Once this device is released, the crypto cleanup code will run, using > the just-released 'struct ufs_hba::crypto_profile'. This causes a > use-after-free situation: > > exynos-ufshc 14700000.ufs: ufshcd_pltfrm_init() failed -11 > exynos-ufshc 14700000.ufs: probe with driver exynos-ufshc failed with error -11 > Unable to handle kernel paging request at virtual address 01adafad6dadad88 > Mem abort info: > ESR = 0x0000000096000004 > EC = 0x25: DABT (current EL), IL = 32 bits > SET = 0, FnV = 0 > EA = 0, S1PTW = 0 > FSC = 0x04: level 0 translation fault > Data abort info: > ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 > CM = 0, WnR = 0, TnD = 0, TagAccess = 0 > GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 > [01adafad6dadad88] address between user and kernel address ranges > Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP > Modules linked in: > CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.13.0-rc5-next-20250106+ #70 > Tainted: [W]=WARN > Hardware name: Oriole (DT) > pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) > pc : kfree+0x60/0x2d8 > lr : kvfree+0x44/0x60 > sp : ffff80008009ba80 > x29: ffff80008009ba90 x28: 0000000000000000 x27: ffffbcc6591e0130 > x26: ffffbcc659309960 x25: ffffbcc658f89c50 x24: ffffbcc659539d80 > x23: ffff22e000940040 x22: ffff22e001539010 x21: ffffbcc65714b22c > x20: 6b6b6b6b6b6b6b6b x19: 01adafad6dadad80 x18: 0000000000000000 > x17: ffffbcc6579fbac8 x16: ffffbcc657a04300 x15: ffffbcc657a027f4 > x14: ffffbcc656f969cc x13: ffffbcc6579fdc80 x12: ffffbcc6579fb194 > x11: ffffbcc6579fbc34 x10: 0000000000000000 x9 : ffffbcc65714b22c > x8 : ffff80008009b880 x7 : 0000000000000000 x6 : ffff80008009b940 > x5 : ffff80008009b8c0 x4 : ffff22e000940518 x3 : ffff22e006f54f40 > x2 : ffffbcc657a02268 x1 : ffff80007fffffff x0 : ffffc1ffc0000000 > Call trace: > kfree+0x60/0x2d8 (P) > kvfree+0x44/0x60 > blk_crypto_profile_destroy_callback+0x28/0x70 > devm_action_release+0x1c/0x30 > release_nodes+0x6c/0x108 > devres_release_all+0x98/0x100 > device_unbind_cleanup+0x20/0x70 > really_probe+0x218/0x2d0 > > In other words, the initialisation code flow is: > > platform-device probe > ufshcd_pltfrm_init() > ufshcd_alloc_host() > scsi_host_alloc() > allocation of struct ufs_hba > creation of scsi-host devices > devm_blk_crypto_profile_init() > devm registration of cleanup handler using platform-device > > and during error handling of ufshcd_pltfrm_init() or during driver > removal: > > ufshcd_dealloc_host() > scsi_host_put() > put_device(scsi-host) > release of struct ufs_hba > put_device(platform-device) > crypto cleanup handler > > To fix this use-after free, change ufshcd_alloc_host() to register a > devres action to automatically cleanup the underlying SCSI device on > ufshcd destruction, without requiring explicit calls to > ufshcd_dealloc_host(). This way: > > * the crypto profile and all other ufs_hba-owned resources are > destroyed before SCSI (as they've been registered after) > * a memleak is plugged in tc-dwc-g210-pci.c as a side-effect > * EXPORT_SYMBOL_GPL(ufshcd_dealloc_host) can be removed fully as > it's not needed anymore > * no future drivers using ufshcd_alloc_host() could ever forget > adding the cleanup > > Fixes: cb77cb5abe1f ("blk-crypto: rename blk_keyslot_manager to blk_crypto_profile") > Fixes: d76d9d7d1009 ("scsi: ufs: use devm_blk_ksm_init()") > Cc: stable@vger.kernel.org > Signed-off-by: André Draszik <andre.draszik@linaro.org> > --- > Changes in v2: > - completely new approach using devres action for Scsi_host cleanup, to > ensure ordering As mentioned, I am not sure if this approach has wider implications (in particular if there is any underlying assumption or requirement for the Scsi_host device to clean up before the ufshcd device). Simple testing using a few iteration of manual module bind/unbind worked, as did the error handling / cleanup during init. But I'm not sure if that is sufficient testing for the changed release ordering. Cheers, Andre'
On 1/14/25 8:16 AM, André Draszik wrote: > +/** > + * ufshcd_scsi_host_put_callback - deallocate underlying Scsi_Host and > + * thereby the Host Bus Adapter (HBA) > + * @host: pointer to SCSI host > + */ > +static void ufshcd_scsi_host_put_callback(void *host) > +{ > + scsi_host_put(host); > +} Please rename ufshcd_scsi_host_put_callback() such that the function name makes clear when this function is called instead of what the function does. Thanks, Bart.
Hi Bart, On Tue, 2025-01-14 at 09:55 -0800, Bart Van Assche wrote: > On 1/14/25 8:16 AM, André Draszik wrote: > > +/** > > + * ufshcd_scsi_host_put_callback - deallocate underlying Scsi_Host and > > + * thereby the Host Bus Adapter (HBA) > > + * @host: pointer to SCSI host > > + */ > > +static void ufshcd_scsi_host_put_callback(void *host) > > +{ > > + scsi_host_put(host); > > +} > > Please rename ufshcd_scsi_host_put_callback() such that the function > name makes clear when this function is called instead of what the > function does. Would you have a suggestion for such a name? Something like ufshcd_driver_release_action()? Unless I'm misunderstanding you, I believe most drivers use a function name that says what the function does, e.g. dell_wmi_ddv_debugfs_remove (just as a completely random example out of many). If going by when it is called and if applying this principle throughout ufshcd, then there can only ever be one such function in ufshcd, as all devm_add_action() callback actions happen at driver release, which surely isn't what you mean. You probably meant something different? Cheers, Andre'
On 1/14/25 11:56 AM, André Draszik wrote: > Hi Bart, > > On Tue, 2025-01-14 at 09:55 -0800, Bart Van Assche wrote: >> On 1/14/25 8:16 AM, André Draszik wrote: >>> +/** >>> + * ufshcd_scsi_host_put_callback - deallocate underlying Scsi_Host and >>> + * thereby the Host Bus Adapter (HBA) >>> + * @host: pointer to SCSI host >>> + */ >>> +static void ufshcd_scsi_host_put_callback(void *host) >>> +{ >>> + scsi_host_put(host); >>> +} >> >> Please rename ufshcd_scsi_host_put_callback() such that the function >> name makes clear when this function is called instead of what the >> function does. > > Would you have a suggestion for such a name? Something like > ufshcd_driver_release_action()? > > Unless I'm misunderstanding you, I believe most drivers use > a function name that says what the function does, e.g. > dell_wmi_ddv_debugfs_remove (just as a completely random > example out of many). > > If going by when it is called and if applying this principle > throughout ufshcd, then there can only ever be one such > function in ufshcd, as all devm_add_action() callback actions > happen at driver release, which surely isn't what you mean. > > You probably meant something different? I meant what I wrote in my previous email: to chose another name for ufshcd_scsi_host_put_callback() only. Having a function name that duplicates the function body leaves readers of the code guessing from where the function is called. BTW, naming callbacks after their call site is a normal practice as far as I know. From ufs-qcom.c: static const struct ufs_hba_variant_ops ufs_hba_qcom_vops = { .name = "qcom", .init = ufs_qcom_init, .exit = ufs_qcom_exit, .get_ufs_hci_version = ufs_qcom_get_ufs_hci_version, .clk_scale_notify = ufs_qcom_clk_scale_notify, .setup_clocks = ufs_qcom_setup_clocks, .hce_enable_notify = ufs_qcom_hce_enable_notify, .link_startup_notify = ufs_qcom_link_startup_notify, .pwr_change_notify = ufs_qcom_pwr_change_notify, .apply_dev_quirks = ufs_qcom_apply_dev_quirks, .fixup_dev_quirks = ufs_qcom_fixup_dev_quirks, .suspend = ufs_qcom_suspend, .resume = ufs_qcom_resume, .dbg_register_dump = ufs_qcom_dump_dbg_regs, .device_reset = ufs_qcom_device_reset, .config_scaling_param = ufs_qcom_config_scaling_param, .reinit_notify = ufs_qcom_reinit_notify, .mcq_config_resource = ufs_qcom_mcq_config_resource, .get_hba_mac = ufs_qcom_get_hba_mac, .op_runtime_config = ufs_qcom_op_runtime_config, .get_outstanding_cqs = ufs_qcom_get_outstanding_cqs, .config_esi = ufs_qcom_config_esi, }; Bart.
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index 43ddae7318cb..84089f3f62b0 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -10279,16 +10279,6 @@ int ufshcd_system_thaw(struct device *dev) EXPORT_SYMBOL_GPL(ufshcd_system_thaw); #endif /* CONFIG_PM_SLEEP */ -/** - * ufshcd_dealloc_host - deallocate Host Bus Adapter (HBA) - * @hba: pointer to Host Bus Adapter (HBA) - */ -void ufshcd_dealloc_host(struct ufs_hba *hba) -{ - scsi_host_put(hba->host); -} -EXPORT_SYMBOL_GPL(ufshcd_dealloc_host); - /** * ufshcd_set_dma_mask - Set dma mask based on the controller * addressing capability @@ -10307,6 +10297,16 @@ static int ufshcd_set_dma_mask(struct ufs_hba *hba) return dma_set_mask_and_coherent(hba->dev, DMA_BIT_MASK(32)); } +/** + * ufshcd_scsi_host_put_callback - deallocate underlying Scsi_Host and + * thereby the Host Bus Adapter (HBA) + * @host: pointer to SCSI host + */ +static void ufshcd_scsi_host_put_callback(void *host) +{ + scsi_host_put(host); +} + /** * ufshcd_alloc_host - allocate Host Bus Adapter (HBA) * @dev: pointer to device handle @@ -10334,6 +10334,13 @@ int ufshcd_alloc_host(struct device *dev, struct ufs_hba **hba_handle) err = -ENOMEM; goto out_error; } + + err = devm_add_action_or_reset(dev, ufshcd_scsi_host_put_callback, + host); + if (err) + return dev_err_probe(dev, err, + "failed to add ufshcd dealloc action\n"); + host->nr_maps = HCTX_TYPE_POLL + 1; hba = shost_priv(host); hba->host = host; diff --git a/drivers/ufs/host/ufshcd-pci.c b/drivers/ufs/host/ufshcd-pci.c index ea39c5d5b8cf..9cfcaad23cf9 100644 --- a/drivers/ufs/host/ufshcd-pci.c +++ b/drivers/ufs/host/ufshcd-pci.c @@ -562,7 +562,6 @@ static void ufshcd_pci_remove(struct pci_dev *pdev) pm_runtime_forbid(&pdev->dev); pm_runtime_get_noresume(&pdev->dev); ufshcd_remove(hba); - ufshcd_dealloc_host(hba); } /** @@ -605,7 +604,6 @@ ufshcd_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) err = ufshcd_init(hba, mmio_base, pdev->irq); if (err) { dev_err(&pdev->dev, "Initialization failed\n"); - ufshcd_dealloc_host(hba); return err; } diff --git a/drivers/ufs/host/ufshcd-pltfrm.c b/drivers/ufs/host/ufshcd-pltfrm.c index 505572d4fa87..adb0a65d9df5 100644 --- a/drivers/ufs/host/ufshcd-pltfrm.c +++ b/drivers/ufs/host/ufshcd-pltfrm.c @@ -488,13 +488,13 @@ int ufshcd_pltfrm_init(struct platform_device *pdev, if (err) { dev_err(dev, "%s: clock parse failed %d\n", __func__, err); - goto dealloc_host; + goto out; } err = ufshcd_parse_regulator_info(hba); if (err) { dev_err(dev, "%s: regulator init failed %d\n", __func__, err); - goto dealloc_host; + goto out; } ufshcd_init_lanes_per_dir(hba); @@ -502,14 +502,14 @@ int ufshcd_pltfrm_init(struct platform_device *pdev, err = ufshcd_parse_operating_points(hba); if (err) { dev_err(dev, "%s: OPP parse failed %d\n", __func__, err); - goto dealloc_host; + goto out; } err = ufshcd_init(hba, mmio_base, irq); if (err) { dev_err_probe(dev, err, "Initialization failed with error %d\n", err); - goto dealloc_host; + goto out; } pm_runtime_set_active(dev); @@ -517,8 +517,6 @@ int ufshcd_pltfrm_init(struct platform_device *pdev, return 0; -dealloc_host: - ufshcd_dealloc_host(hba); out: return err; } @@ -534,7 +532,6 @@ void ufshcd_pltfrm_remove(struct platform_device *pdev) pm_runtime_get_sync(&pdev->dev); ufshcd_remove(hba); - ufshcd_dealloc_host(hba); pm_runtime_disable(&pdev->dev); pm_runtime_put_noidle(&pdev->dev); } diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h index da0fa5c65081..58eb6e897827 100644 --- a/include/ufs/ufshcd.h +++ b/include/ufs/ufshcd.h @@ -1311,7 +1311,6 @@ static inline void ufshcd_rmwl(struct ufs_hba *hba, u32 mask, u32 val, u32 reg) void ufshcd_enable_irq(struct ufs_hba *hba); void ufshcd_disable_irq(struct ufs_hba *hba); int ufshcd_alloc_host(struct device *, struct ufs_hba **); -void ufshcd_dealloc_host(struct ufs_hba *); int ufshcd_hba_enable(struct ufs_hba *hba); int ufshcd_init(struct ufs_hba *, void __iomem *, unsigned int); int ufshcd_link_recovery(struct ufs_hba *hba);
devm_blk_crypto_profile_init() registers a cleanup handler to run when the associated (platform-) device is being released. For UFS, the crypto private data and pointers are stored as part of the ufs_hba's data structure 'struct ufs_hba::crypto_profile'. This structure is allocated as part of the underlying ufshd allocation. During driver release or during error handling in ufshcd_pltfrm_init(), this structure is released as part of ufshcd_dealloc_host() before the (platform-) device associated with the crypto call above is released. Once this device is released, the crypto cleanup code will run, using the just-released 'struct ufs_hba::crypto_profile'. This causes a use-after-free situation: exynos-ufshc 14700000.ufs: ufshcd_pltfrm_init() failed -11 exynos-ufshc 14700000.ufs: probe with driver exynos-ufshc failed with error -11 Unable to handle kernel paging request at virtual address 01adafad6dadad88 Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [01adafad6dadad88] address between user and kernel address ranges Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP Modules linked in: CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.13.0-rc5-next-20250106+ #70 Tainted: [W]=WARN Hardware name: Oriole (DT) pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : kfree+0x60/0x2d8 lr : kvfree+0x44/0x60 sp : ffff80008009ba80 x29: ffff80008009ba90 x28: 0000000000000000 x27: ffffbcc6591e0130 x26: ffffbcc659309960 x25: ffffbcc658f89c50 x24: ffffbcc659539d80 x23: ffff22e000940040 x22: ffff22e001539010 x21: ffffbcc65714b22c x20: 6b6b6b6b6b6b6b6b x19: 01adafad6dadad80 x18: 0000000000000000 x17: ffffbcc6579fbac8 x16: ffffbcc657a04300 x15: ffffbcc657a027f4 x14: ffffbcc656f969cc x13: ffffbcc6579fdc80 x12: ffffbcc6579fb194 x11: ffffbcc6579fbc34 x10: 0000000000000000 x9 : ffffbcc65714b22c x8 : ffff80008009b880 x7 : 0000000000000000 x6 : ffff80008009b940 x5 : ffff80008009b8c0 x4 : ffff22e000940518 x3 : ffff22e006f54f40 x2 : ffffbcc657a02268 x1 : ffff80007fffffff x0 : ffffc1ffc0000000 Call trace: kfree+0x60/0x2d8 (P) kvfree+0x44/0x60 blk_crypto_profile_destroy_callback+0x28/0x70 devm_action_release+0x1c/0x30 release_nodes+0x6c/0x108 devres_release_all+0x98/0x100 device_unbind_cleanup+0x20/0x70 really_probe+0x218/0x2d0 In other words, the initialisation code flow is: platform-device probe ufshcd_pltfrm_init() ufshcd_alloc_host() scsi_host_alloc() allocation of struct ufs_hba creation of scsi-host devices devm_blk_crypto_profile_init() devm registration of cleanup handler using platform-device and during error handling of ufshcd_pltfrm_init() or during driver removal: ufshcd_dealloc_host() scsi_host_put() put_device(scsi-host) release of struct ufs_hba put_device(platform-device) crypto cleanup handler To fix this use-after free, change ufshcd_alloc_host() to register a devres action to automatically cleanup the underlying SCSI device on ufshcd destruction, without requiring explicit calls to ufshcd_dealloc_host(). This way: * the crypto profile and all other ufs_hba-owned resources are destroyed before SCSI (as they've been registered after) * a memleak is plugged in tc-dwc-g210-pci.c as a side-effect * EXPORT_SYMBOL_GPL(ufshcd_dealloc_host) can be removed fully as it's not needed anymore * no future drivers using ufshcd_alloc_host() could ever forget adding the cleanup Fixes: cb77cb5abe1f ("blk-crypto: rename blk_keyslot_manager to blk_crypto_profile") Fixes: d76d9d7d1009 ("scsi: ufs: use devm_blk_ksm_init()") Cc: stable@vger.kernel.org Signed-off-by: André Draszik <andre.draszik@linaro.org> --- Changes in v2: - completely new approach using devres action for Scsi_host cleanup, to ensure ordering - add Fixes: and CC: stable tags (Eric) - Link to v1: https://lore.kernel.org/r/20250113-ufshcd-fix-v1-1-ca63d1d4bd55@linaro.org --- In my case, as per above trace I initially encountered an error in ufshcd_verify_dev_init(), which made me notice this problem. For reproducing, it'd be possible to change that function to just return an error. Other approaches for solving this issue I see are the following, but I believe this one here is the cleanest: * turn 'struct ufs_hba::crypto_profile' into a dynamically allocated pointer, in which case it doesn't matter if cleanup runs after scsi_host_put() * add an explicit devm_blk_crypto_profile_deinit() to be called by API users when necessary, e.g. before ufshcd_dealloc_host() in this case * register the crypto cleanup handler against the scsi-host device instead, like in v1 of this patch --- drivers/ufs/core/ufshcd.c | 27 +++++++++++++++++---------- drivers/ufs/host/ufshcd-pci.c | 2 -- drivers/ufs/host/ufshcd-pltfrm.c | 11 ++++------- include/ufs/ufshcd.h | 1 - 4 files changed, 21 insertions(+), 20 deletions(-) --- base-commit: 4e16367cfe0ce395f29d0482b78970cce8e1db73 change-id: 20250113-ufshcd-fix-52409f2d32ff Best regards,