Message ID | 20250330-thinkpad-fix-v1-1-4906b3fe6b74@gmail.com (mailing list archive) |
---|---|
State | Accepted, archived |
Series | platform/x86: thinkpad_acpi: Fix NULL pointer dereferences while probing |
On Sun Mar 30, 2025 at 12:39 PM -03, Kurt Borja wrote:
> Some subdrivers make use of the global reference tpacpi_pdev during
> initialization, which is called from the platform driver's probe.
> However, after
>
> commit 38b9ab80db31 ("platform/x86: thinkpad_acpi: Move subdriver initialization to tpacpi_pdriver's probe.")
>
> this variable is only properly initialized *after* probing and this can
> result in a NULL pointer dereference.
>
> In order to fix this without reverting the commit, register the platform
> bundle in two steps, first create and initialize tpacpi_pdev, then
> register the driver synchronously with platform_driver_probe(). This way
> the benefits of commit 38b9ab80db31 are preserved.
>
> Additionally,
>
> commit 43fc63a1e8f6 ("platform/x86: thinkpad_acpi: Move HWMON initialization to tpacpi_hwmon_pdriver's probe")
>
> introduced a similar problem, however tpacpi_sensors_pdev is only used
> once inside the probe, so replace the global reference with the one
> given by the probe.

I don't understand why b4 added the linux-riscv list to the recipients,
but it was definitely not intended. Sorry for the noise.

On Sun, 2025-03-30 at 12:39 -0300, Kurt Borja wrote:
> Some subdrivers make use of the global reference tpacpi_pdev during
> initialization, which is called from the platform driver's probe.
> However, after
>
> commit 38b9ab80db31 ("platform/x86: thinkpad_acpi: Move subdriver
> initialization to tpacpi_pdriver's probe.")
>
> this variable is only properly initialized *after* probing and this can
> result in a NULL pointer dereference.
>
> In order to fix this without reverting the commit, register the platform
> bundle in two steps, first create and initialize tpacpi_pdev, then
> register the driver synchronously with platform_driver_probe(). This way
> the benefits of commit 38b9ab80db31 are preserved.
>
> Additionally,
>
> commit 43fc63a1e8f6 ("platform/x86: thinkpad_acpi: Move HWMON
> initialization to tpacpi_hwmon_pdriver's probe")
>
> introduced a similar problem, however tpacpi_sensors_pdev is only used
> once inside the probe, so replace the global reference with the one
> given by the probe.
>
> ...
> base-commit: 1a9239bb4253f9076b5b4b2a1a4e8d7defd77a95
> change-id: 20250330-thinkpad-fix-98db0d8c3be3
>

Fixed problem seen here on thinkpad.

Tested on mainline commit 4e82c87058f45e79eeaa4d5bcc3b38dd3dce7209

Tested-by: Gene C <arch@sapience.com>

On Sun, 30 Mar 2025, Kurt Borja wrote:

> Some subdrivers make use of the global reference tpacpi_pdev during
> initialization, which is called from the platform driver's probe.
> However, after
>
> commit 38b9ab80db31 ("platform/x86: thinkpad_acpi: Move subdriver initialization to tpacpi_pdriver's probe.")
>

Next time, please include these into the paragraph flow normally obeying
the normal paragraph formatting. I changed them in this case.

> this variable is only properly initialized *after* probing and this can
> result in a NULL pointer dereference.
>
> In order to fix this without reverting the commit, register the platform
> bundle in two steps, first create and initialize tpacpi_pdev, then
> register the driver synchronously with platform_driver_probe(). This way
> the benefits of commit 38b9ab80db31 are preserved.
>
> Additionally,
>
> commit 43fc63a1e8f6 ("platform/x86: thinkpad_acpi: Move HWMON initialization to tpacpi_hwmon_pdriver's probe")
>
> introduced a similar problem, however tpacpi_sensors_pdev is only used
> once inside the probe, so replace the global reference with the one
> given by the probe.
>
> Reported-by: Damian Tometzki <damian@riscv-rocks.de>
> Closes: https://lore.kernel.org/r/CAL=B37kdL1orSQZD2A3skDOevRXBzF__cJJgY_GFh9LZO3FMsw@mail.gmail.com/
> Fixes: 38b9ab80db31 ("platform/x86: thinkpad_acpi: Move subdriver initialization to tpacpi_pdriver's probe.")
> Fixes: 43fc63a1e8f6 ("platform/x86: thinkpad_acpi: Move HWMON initialization to tpacpi_hwmon_pdriver's probe")
> Tested-by: Damian Tometzki <damian@riscv-rocks.de>
> Signed-off-by: Kurt Borja <kuurtb@gmail.com>

Applied to the review-ilpo-fixes branch.

> ---
> Hi all,
>
> The commit message is pretty self-explanatory. I have one question
> though. As you can see in the crash dump of the original report:
>
> Mar 29 17:43:16.180758 fedora kernel: ? asm_exc_page_fault+0x26/0x30
> Mar 29 17:43:16.180769 fedora kernel: ? __pfx_klist_children_get+0x10/0x10
> Mar 29 17:43:16.180781 fedora kernel: ? kobject_get+0xd/0x70
> Mar 29 17:43:16.180792 fedora kernel: device_add+0x8f/0x6e0
> Mar 29 17:43:16.180804 fedora kernel: rfkill_register+0xbc/0x2c0 [rfkill]
> Mar 29 17:43:16.180813 fedora kernel: tpacpi_new_rfkill+0x185/0x230 [thinkpad_acpi]
>
> The NULL dereference happens in device_add(), inside rfkill_register().
> This bothers me because, as you can see here:
>
> 1198         atp_rfk->rfkill = rfkill_alloc(name,
> 1199                                         &tpacpi_pdev->dev,
> 1200                                         rfktype,
> 1201                                         &tpacpi_rfk_rfkill_ops,
> 1202                                         atp_rfk);
>
> the NULL dereference happens in line 1199, inside tpacpi_new_rfkill(). I
> think this disagreement might be due to compile time optimizations?

How did you map it to line numbers? Is it just about difference in the
compiled binaries that results in different line numbers?

Hi Ilpo,

On Tue Apr 1, 2025 at 8:24 AM -03, Ilpo Järvinen wrote:
> On Sun, 30 Mar 2025, Kurt Borja wrote:
>
>> Some subdrivers make use of the global reference tpacpi_pdev during
>> initialization, which is called from the platform driver's probe.
>> However, after
>>
>> commit 38b9ab80db31 ("platform/x86: thinkpad_acpi: Move subdriver initialization to tpacpi_pdriver's probe.")
>>
>
> Next time, please include these into the paragraph flow normally obeying
> the normal paragraph formatting. I changed them in this case.

Thanks, won't happen next time.

>
>> this variable is only properly initialized *after* probing and this can
>> result in a NULL pointer dereference.
>>
>> In order to fix this without reverting the commit, register the platform
>> bundle in two steps, first create and initialize tpacpi_pdev, then
>> register the driver synchronously with platform_driver_probe(). This way
>> the benefits of commit 38b9ab80db31 are preserved.
>>
>> Additionally,
>>
>> commit 43fc63a1e8f6 ("platform/x86: thinkpad_acpi: Move HWMON initialization to tpacpi_hwmon_pdriver's probe")
>>
>> introduced a similar problem, however tpacpi_sensors_pdev is only used
>> once inside the probe, so replace the global reference with the one
>> given by the probe.
>>
>> Reported-by: Damian Tometzki <damian@riscv-rocks.de>
>> Closes: https://lore.kernel.org/r/CAL=B37kdL1orSQZD2A3skDOevRXBzF__cJJgY_GFh9LZO3FMsw@mail.gmail.com/
>> Fixes: 38b9ab80db31 ("platform/x86: thinkpad_acpi: Move subdriver initialization to tpacpi_pdriver's probe.")
>> Fixes: 43fc63a1e8f6 ("platform/x86: thinkpad_acpi: Move HWMON initialization to tpacpi_hwmon_pdriver's probe")
>> Tested-by: Damian Tometzki <damian@riscv-rocks.de>
>> Signed-off-by: Kurt Borja <kuurtb@gmail.com>
>
> Applied to the review-ilpo-fixes branch.

Thank you!

>
>> ---
>> Hi all,
>>
>> The commit message is pretty self-explanatory. I have one question
>> though. As you can see in the crash dump of the original report:
>>
>> Mar 29 17:43:16.180758 fedora kernel: ? asm_exc_page_fault+0x26/0x30
>> Mar 29 17:43:16.180769 fedora kernel: ? __pfx_klist_children_get+0x10/0x10
>> Mar 29 17:43:16.180781 fedora kernel: ? kobject_get+0xd/0x70
>> Mar 29 17:43:16.180792 fedora kernel: device_add+0x8f/0x6e0
>> Mar 29 17:43:16.180804 fedora kernel: rfkill_register+0xbc/0x2c0 [rfkill]
>> Mar 29 17:43:16.180813 fedora kernel: tpacpi_new_rfkill+0x185/0x230 [thinkpad_acpi]
>>
>> The NULL dereference happens in device_add(), inside rfkill_register().
>> This bothers me because, as you can see here:
>>
>> 1198         atp_rfk->rfkill = rfkill_alloc(name,
>> 1199                                         &tpacpi_pdev->dev,
>> 1200                                         rfktype,
>> 1201                                         &tpacpi_rfk_rfkill_ops,
>> 1202                                         atp_rfk);
>>
>> the NULL dereference happens in line 1199, inside tpacpi_new_rfkill(). I
>> think this disagreement might be due to compile time optimizations?
>
> How did you map it to line numbers? Is it just about difference in the
> compiled binaries that results in different line numbers?

Oh - I just manually followed the dump trace in search of the first
instance of a NULL dereference.

If I understand correctly, inside thinkpad_acpi we do reach
rfkill_register(), which is line

1227         res = rfkill_register(atp_rfk->rfkill);

and I imagine the RIP happens when device_add() tries to get a reference
to the parent of the allocated rfkill device. But it's weird because we
shouldn't even reach 1227, as the NULL deref first happens at 1199. NULL
deref is UB so I guess it makes sense?

BTW I got all these line numbers using the base commit.

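One possible reading of the line 1199 versus line 1227 question, offered as an assumption and not confirmed anywhere in this thread: an expression of the form &pdev->dev normally compiles to "base address plus a constant offset" and reads no memory, so with a NULL base it only yields a small bogus pointer. The fault would then show up later, when something actually follows that pointer, which is consistent with the trace ending in kobject_get() under device_add(). The userspace sketch below models only the address arithmetic. The struct names and field layout are hypothetical stand-ins, not the kernel's definitions, and integer arithmetic is used so the sketch itself stays well-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the field layout is illustrative, not the kernel's. */
struct mock_device {
	struct mock_device *parent;
};

struct mock_platform_device {
	const char *name;
	int id;
	struct mock_device dev;	/* embedded at a nonzero offset */
};

/*
 * Models what `&pdev->dev` evaluates to for a given base address:
 * the base plus a compile-time constant. No memory is read here.
 */
uintptr_t member_address(uintptr_t pdev_base)
{
	return pdev_base + offsetof(struct mock_platform_device, dev);
}

/*
 * With a NULL (zero) base, the computed "parent" is small but nonzero,
 * so a later `if (parent)` style check would not catch it.
 */
int parent_from_null_base_is_nonzero(void)
{
	return member_address(0) != 0;
}

/* Sanity check of the model; it does not touch the bogus address. */
void self_check(void)
{
	assert(parent_from_null_base_is_nonzero());
}
```

Under this assumption the two observations do not conflict: line 1199 is where the bad pointer is produced, and the device_add() path is where it is first dereferenced.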
diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c
index 0384cf31187872df90f5ac3def9b1d6617e82ed5..a17efb68664c9c7723daa2aba023ba0cbc6b96dd 100644
--- a/drivers/platform/x86/thinkpad_acpi.c
+++ b/drivers/platform/x86/thinkpad_acpi.c
@@ -367,6 +367,7 @@ static struct {
 	u32 beep_needs_two_args:1;
 	u32 mixer_no_level_control:1;
 	u32 battery_force_primary:1;
+	u32 platform_drv_registered:1;
 	u32 hotkey_poll_active:1;
 	u32 has_adaptive_kbd:1;
 	u32 kbd_lang:1;
@@ -11820,10 +11821,10 @@ static void thinkpad_acpi_module_exit(void)
 		platform_device_unregister(tpacpi_sensors_pdev);
 	}
 
-	if (tpacpi_pdev) {
+	if (tp_features.platform_drv_registered)
 		platform_driver_unregister(&tpacpi_pdriver);
+	if (tpacpi_pdev)
 		platform_device_unregister(tpacpi_pdev);
-	}
 
 	if (proc_dir)
 		remove_proc_entry(TPACPI_PROC_DIR, acpi_root_dir);
@@ -11893,9 +11894,8 @@ static int __init tpacpi_pdriver_probe(struct platform_device *pdev)
 
 static int __init tpacpi_hwmon_pdriver_probe(struct platform_device *pdev)
 {
-	tpacpi_hwmon = devm_hwmon_device_register_with_groups(
-		&tpacpi_sensors_pdev->dev, TPACPI_NAME, NULL, tpacpi_hwmon_groups);
-
+	tpacpi_hwmon = devm_hwmon_device_register_with_groups(&pdev->dev, TPACPI_NAME,
+							      NULL, tpacpi_hwmon_groups);
 	if (IS_ERR(tpacpi_hwmon))
 		pr_err("unable to register hwmon device\n");
 
@@ -11965,16 +11965,24 @@ static int __init thinkpad_acpi_module_init(void)
 		tp_features.quirks = dmi_id->driver_data;
 
 	/* Device initialization */
-	tpacpi_pdev = platform_create_bundle(&tpacpi_pdriver, tpacpi_pdriver_probe,
-					     NULL, 0, NULL, 0);
+	tpacpi_pdev = platform_device_register_simple(TPACPI_DRVR_NAME, PLATFORM_DEVID_NONE,
+						      NULL, 0);
 	if (IS_ERR(tpacpi_pdev)) {
 		ret = PTR_ERR(tpacpi_pdev);
 		tpacpi_pdev = NULL;
-		pr_err("unable to register platform device/driver bundle\n");
+		pr_err("unable to register platform device\n");
 		thinkpad_acpi_module_exit();
 		return ret;
 	}
 
+	ret = platform_driver_probe(&tpacpi_pdriver, tpacpi_pdriver_probe);
+	if (ret) {
+		pr_err("unable to register main platform driver\n");
+		thinkpad_acpi_module_exit();
+		return ret;
+	}
+	tp_features.platform_drv_registered = 1;
+
 	tpacpi_sensors_pdev = platform_create_bundle(&tpacpi_hwmon_pdriver,
 						     tpacpi_hwmon_pdriver_probe,
 						     NULL, 0, NULL, 0);
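The ordering that the patch relies on can be illustrated with a small userspace model. This is only a sketch: every model_* name below is hypothetical and none of it is kernel API. It mirrors the idea from the commit message that the device pointer is published first, the probe runs second (so it can safely use the global), and a separate flag records whether the driver step succeeded so the exit path only undoes what was actually done.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_device {
	int registered;
};

static struct model_device model_dev_storage;
static struct model_device *model_pdev;	/* plays the role of tpacpi_pdev */
static int model_drv_registered;	/* plays the role of platform_drv_registered */
static int model_probe_saw_device;

/* The probe uses the global reference, as the subdrivers do. */
static int model_probe(void)
{
	model_probe_saw_device = (model_pdev != NULL);
	return model_pdev ? 0 : -1;
}

/* Undo only the steps that completed, driver first, then device. */
void model_exit(void)
{
	if (model_drv_registered)
		model_drv_registered = 0;
	if (model_pdev) {
		model_pdev->registered = 0;
		model_pdev = NULL;
	}
}

int model_init(int device_ok)
{
	/* Step 1: create the device and publish the global pointer. */
	if (!device_ok) {
		model_exit();
		return -1;
	}
	model_dev_storage.registered = 1;
	model_pdev = &model_dev_storage;

	/* Step 2: run the probe synchronously; the global is already valid. */
	if (model_probe()) {
		model_exit();
		return -1;
	}
	model_drv_registered = 1;
	return 0;
}

int model_probe_saw_valid_device(void)
{
	return model_probe_saw_device;
}
```

In this model, calling model_init(1) reaches the probe with a valid model_pdev, while model_init(0) fails before the probe and model_exit() has nothing to unregister on the driver side.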