Message ID | 20240722083658.54518-1-en-wei.wu@canonical.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | platform/x86: intel-vbtn: Support for tablet mode on Dell Venue 11 Pro 7140 | expand |
Hi En-Wei, On 7/22/24 10:36 AM, En-Wei Wu wrote: > On a Dell Venue 7140 tablet with the keyboard/touchpad/battery dock, when > disconnecting the dock there is a kernel bug: > > BUG: kernel NULL pointer dereference, address: 0000000000000018 > > And this causes the following things not to work: > 1. Suspend to idle - the system simply hangs > 2. Poweroff normally (the only way is forcing it via long press the power button) > 3. USB ports: both the USB port on the tablet and also plugging in the keyboard again > > The error message above (plus some crash dump) isn't so useful for debugging, but we > have noticed that there is a debug message shown before the crash dump: > > intel-vbtn INT33D6:00: Registering Intel Virtual Switches input-dev after receiving > a switch event > > The messages above is shown right after the dock is disconnected, and the message implies: > > We failed to set the priv->has_switches to true in the probe function since the > Dell Venue 11 Pro 7140 is not shown in the dmi_switches_allow_list, and this causes a problem > that no input_register_device() on the switch device is called. Afterward, When a user > disconnects the dock, intel-vbtn receives the ACPI event and finally find that there is a > switch out there. So intel-vbtn starts to register the switch device, which may be a dangerous > behavior since there might be some device-related objects/structs that has been freed (due to > the disconnection of the dock). > > To solve this problem from the root cause, simply add the Dell Venue 11 pro 7140 to the > dmi_switches_allow_list. > (The Dell Venue 11 Pro 7140 is a 2-in-1 model that has chassis-type "Portable".) > > BugLink: https://bugs.launchpad.net/bugs/2073001 > > Fixes: 8169bd3e6e19 ("platform/x86: intel-vbtn: Switch to an allow-list > for SW_TABLET_MODE reporting") > Reported-by: Kostadin Stoilov <kmstoilov@gmail.com> > Tested-by: Kostadin Stoilov <kmstoilov@gmail.com> > Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com> Thank you for your patch. Looking at the logs from the launchpad bug I noticed that intel-vbtn INT33D6:00: Registering Intel Virtual Switches input-dev after receiving a switch event is reported in the logs twice. Which strongly suggests that the intel-vbtn notify_handler() function is racing with itself. In the past ACPI notify handlers could never run more then once (at the same time) but since commit e2ffcda16290 ("ACPI: OSL: Allow Notify () handlers to run on all CPUs") ACPI notify handlers like the intel-vbtn notify_handler() may run on multiple CPU cores racing with themselves. I believe that this is the real problem here. I have attached a patch which should fix this. Can you build a test-kernel with this patch instead of your patch and ask the reported of: https://bugs.launchpad.net/bugs/2073001 To test a kernel with the attached patch (and without your patch) to confirm that this fixes it in a more generic manner ? Regards, Hans > --- > drivers/platform/x86/intel/vbtn.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/platform/x86/intel/vbtn.c b/drivers/platform/x86/intel/vbtn.c > index 9b7ce03ba085..46d07d3cd34b 100644 > --- a/drivers/platform/x86/intel/vbtn.c > +++ b/drivers/platform/x86/intel/vbtn.c > @@ -235,6 +235,12 @@ static const struct dmi_system_id dmi_switches_allow_list[] = { > DMI_MATCH(DMI_PRODUCT_NAME, "Venue 11 Pro 7130"), > }, > }, > + { > + .matches = { > + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), > + DMI_MATCH(DMI_PRODUCT_NAME, "Venue 11 Pro 7140"), > + }, > + }, > { > .matches = { > DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"),
Hi Kostadin, On 7/25/24 8:26 PM, Kostadin Stoilov wrote: > Hi Hans, > > I tested a kernel built by En-Wei with your patch and I can confirm that it resolves the issue. I tried several connect/disconnect suspend/resume cycles. > > The kernel bug messages are not observed and the system suspend/resumes/powers off successfully. > > The dmesg log from the test run can be found here: > https://launchpadlibrarian.net/740661223/6.10-maintainer-dmesg.txt <https://launchpadlibrarian.net/740661223/6.10-maintainer-dmesg.txt> Thank you for testing. I've now done an official submission of this patch upstream: https://lore.kernel.org/platform-driver-x86/20240729110030.8016-1-hdegoede@redhat.com/ This should be included in a fixes pull-req for 6.11 soon-ish and then get backported to various stable kernel series. Regards. Hans > On Mon, 22 Jul 2024 at 12:45, Hans de Goede <hdegoede@redhat.com <mailto:hdegoede@redhat.com>> wrote: > > Hi En-Wei, > > On 7/22/24 10:36 AM, En-Wei Wu wrote: > > On a Dell Venue 7140 tablet with the keyboard/touchpad/battery dock, when > > disconnecting the dock there is a kernel bug: > > > > BUG: kernel NULL pointer dereference, address: 0000000000000018 > > > > And this causes the following things not to work: > > 1. Suspend to idle - the system simply hangs > > 2. Poweroff normally (the only way is forcing it via long press the power button) > > 3. USB ports: both the USB port on the tablet and also plugging in the keyboard again > > > > The error message above (plus some crash dump) isn't so useful for debugging, but we > > have noticed that there is a debug message shown before the crash dump: > > > > intel-vbtn INT33D6:00: Registering Intel Virtual Switches input-dev after receiving > > a switch event > > > > The messages above is shown right after the dock is disconnected, and the message implies: > > > > We failed to set the priv->has_switches to true in the probe function since the > > Dell Venue 11 Pro 7140 is not shown in the dmi_switches_allow_list, and this causes a problem > > that no input_register_device() on the switch device is called. Afterward, When a user > > disconnects the dock, intel-vbtn receives the ACPI event and finally find that there is a > > switch out there. So intel-vbtn starts to register the switch device, which may be a dangerous > > behavior since there might be some device-related objects/structs that has been freed (due to > > the disconnection of the dock). > > > > To solve this problem from the root cause, simply add the Dell Venue 11 pro 7140 to the > > dmi_switches_allow_list. > > (The Dell Venue 11 Pro 7140 is a 2-in-1 model that has chassis-type "Portable".) > > > > BugLink: https://bugs.launchpad.net/bugs/2073001 <https://bugs.launchpad.net/bugs/2073001> > > > > Fixes: 8169bd3e6e19 ("platform/x86: intel-vbtn: Switch to an allow-list > > for SW_TABLET_MODE reporting") > > Reported-by: Kostadin Stoilov <kmstoilov@gmail.com <mailto:kmstoilov@gmail.com>> > > Tested-by: Kostadin Stoilov <kmstoilov@gmail.com <mailto:kmstoilov@gmail.com>> > > Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com <mailto:en-wei.wu@canonical.com>> > > Thank you for your patch. Looking at the logs from the launchpad bug I noticed that > > intel-vbtn INT33D6:00: Registering Intel Virtual Switches input-dev after receiving a switch event > > is reported in the logs twice. Which strongly suggests that the intel-vbtn notify_handler() > function is racing with itself. > > In the past ACPI notify handlers could never run more then once (at the same time) > but since commit e2ffcda16290 ("ACPI: OSL: Allow Notify () handlers to run on > all CPUs") ACPI notify handlers like the intel-vbtn notify_handler() may > run on multiple CPU cores racing with themselves. > > I believe that this is the real problem here. I have attached a patch which should > fix this. Can you build a test-kernel with this patch instead of your patch and > ask the reported of: https://bugs.launchpad.net/bugs/2073001 <https://bugs.launchpad.net/bugs/2073001> > > To test a kernel with the attached patch (and without your patch) to confirm > that this fixes it in a more generic manner ? > > Regards, > > Hans > > > > > > > --- > > drivers/platform/x86/intel/vbtn.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/drivers/platform/x86/intel/vbtn.c b/drivers/platform/x86/intel/vbtn.c > > index 9b7ce03ba085..46d07d3cd34b 100644 > > --- a/drivers/platform/x86/intel/vbtn.c > > +++ b/drivers/platform/x86/intel/vbtn.c > > @@ -235,6 +235,12 @@ static const struct dmi_system_id dmi_switches_allow_list[] = { > > DMI_MATCH(DMI_PRODUCT_NAME, "Venue 11 Pro 7130"), > > }, > > }, > > + { > > + .matches = { > > + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), > > + DMI_MATCH(DMI_PRODUCT_NAME, "Venue 11 Pro 7140"), > > + }, > > + }, > > { > > .matches = { > > DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"), >
Hi Kostadin, On 7/25/24 8:26 PM, Kostadin Stoilov wrote: > Hi Hans, > > I tested a kernel built by En-Wei with your patch and I can confirm that it resolves the issue. I tried several connect/disconnect suspend/resume cycles. > > The kernel bug messages are not observed and the system suspend/resumes/powers off successfully. > > The dmesg log from the test run can be found here: > https://launchpadlibrarian.net/740661223/6.10-maintainer-dmesg.txt <https://launchpadlibrarian.net/740661223/6.10-maintainer-dmesg.txt> Thank you for testing. I've now done an official submission of this patch upstream: https://lore.kernel.org/platform-driver-x86/20240729110030.8016-1-hdegoede@redhat.com/ This should be included in a fixes pull-req for 6.11 soon-ish and then get backported to various stable kernel series. Regards. Hans > On Mon, 22 Jul 2024 at 12:45, Hans de Goede <hdegoede@redhat.com <mailto:hdegoede@redhat.com>> wrote: > > Hi En-Wei, > > On 7/22/24 10:36 AM, En-Wei Wu wrote: > > On a Dell Venue 7140 tablet with the keyboard/touchpad/battery dock, when > > disconnecting the dock there is a kernel bug: > > > > BUG: kernel NULL pointer dereference, address: 0000000000000018 > > > > And this causes the following things not to work: > > 1. Suspend to idle - the system simply hangs > > 2. Poweroff normally (the only way is forcing it via long press the power button) > > 3. USB ports: both the USB port on the tablet and also plugging in the keyboard again > > > > The error message above (plus some crash dump) isn't so useful for debugging, but we > > have noticed that there is a debug message shown before the crash dump: > > > > intel-vbtn INT33D6:00: Registering Intel Virtual Switches input-dev after receiving > > a switch event > > > > The messages above is shown right after the dock is disconnected, and the message implies: > > > > We failed to set the priv->has_switches to true in the probe function since the > > Dell Venue 11 Pro 7140 is not shown in the dmi_switches_allow_list, and this causes a problem > > that no input_register_device() on the switch device is called. Afterward, When a user > > disconnects the dock, intel-vbtn receives the ACPI event and finally find that there is a > > switch out there. So intel-vbtn starts to register the switch device, which may be a dangerous > > behavior since there might be some device-related objects/structs that has been freed (due to > > the disconnection of the dock). > > > > To solve this problem from the root cause, simply add the Dell Venue 11 pro 7140 to the > > dmi_switches_allow_list. > > (The Dell Venue 11 Pro 7140 is a 2-in-1 model that has chassis-type "Portable".) > > > > BugLink: https://bugs.launchpad.net/bugs/2073001 <https://bugs.launchpad.net/bugs/2073001> > > > > Fixes: 8169bd3e6e19 ("platform/x86: intel-vbtn: Switch to an allow-list > > for SW_TABLET_MODE reporting") > > Reported-by: Kostadin Stoilov <kmstoilov@gmail.com <mailto:kmstoilov@gmail.com>> > > Tested-by: Kostadin Stoilov <kmstoilov@gmail.com <mailto:kmstoilov@gmail.com>> > > Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com <mailto:en-wei.wu@canonical.com>> > > Thank you for your patch. Looking at the logs from the launchpad bug I noticed that > > intel-vbtn INT33D6:00: Registering Intel Virtual Switches input-dev after receiving a switch event > > is reported in the logs twice. Which strongly suggests that the intel-vbtn notify_handler() > function is racing with itself. > > In the past ACPI notify handlers could never run more then once (at the same time) > but since commit e2ffcda16290 ("ACPI: OSL: Allow Notify () handlers to run on > all CPUs") ACPI notify handlers like the intel-vbtn notify_handler() may > run on multiple CPU cores racing with themselves. > > I believe that this is the real problem here. I have attached a patch which should > fix this. Can you build a test-kernel with this patch instead of your patch and > ask the reported of: https://bugs.launchpad.net/bugs/2073001 <https://bugs.launchpad.net/bugs/2073001> > > To test a kernel with the attached patch (and without your patch) to confirm > that this fixes it in a more generic manner ? > > Regards, > > Hans > > > > > > > --- > > drivers/platform/x86/intel/vbtn.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/drivers/platform/x86/intel/vbtn.c b/drivers/platform/x86/intel/vbtn.c > > index 9b7ce03ba085..46d07d3cd34b 100644 > > --- a/drivers/platform/x86/intel/vbtn.c > > +++ b/drivers/platform/x86/intel/vbtn.c > > @@ -235,6 +235,12 @@ static const struct dmi_system_id dmi_switches_allow_list[] = { > > DMI_MATCH(DMI_PRODUCT_NAME, "Venue 11 Pro 7130"), > > }, > > }, > > + { > > + .matches = { > > + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), > > + DMI_MATCH(DMI_PRODUCT_NAME, "Venue 11 Pro 7140"), > > + }, > > + }, > > { > > .matches = { > > DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"), >
diff --git a/drivers/platform/x86/intel/vbtn.c b/drivers/platform/x86/intel/vbtn.c index 9b7ce03ba085..46d07d3cd34b 100644 --- a/drivers/platform/x86/intel/vbtn.c +++ b/drivers/platform/x86/intel/vbtn.c @@ -235,6 +235,12 @@ static const struct dmi_system_id dmi_switches_allow_list[] = { DMI_MATCH(DMI_PRODUCT_NAME, "Venue 11 Pro 7130"), }, }, + { + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "Venue 11 Pro 7140"), + }, + }, { .matches = { DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"),