diff mbox series

[v4] acpi,pci: warn about duplicate IRQ routing entries returned from _PRT

Message ID 20231226124254.66102-1-mat.jonczyk@o2.pl (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show
Series [v4] acpi,pci: warn about duplicate IRQ routing entries returned from _PRT | expand

Commit Message

Mateusz Jończyk Dec. 26, 2023, 12:42 p.m. UTC
On some platforms, the ACPI _PRT function returns duplicate interrupt
routing entries. Linux uses the first matching entry, but sometimes the
second matching entry contains the correct interrupt vector.

As a debugging aid, print a warning to dmesg if duplicate interrupt
routing entries are present. This way, we could check how many models
are affected.

This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
SMBus controller. This controller is nonfunctional unless its interrupt
usage is disabled (using the "disable_features=0x10" module parameter).

After investigation, it turned out that the driver was using an
incorrect interrupt vector: in lspci output for this device there was:
        Interrupt: pin B routed to IRQ 19
but after running i2cdetect (without using any i2c-i801 module
parameters) the following was logged to dmesg:

        [...]
        i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
        i801_smbus 0000:00:1f.3: Transaction timeout
        i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
        i801_smbus 0000:00:1f.3: Transaction timeout
        irq 17: nobody cared (try booting with the "irqpoll" option)

Existence of duplicate entries in a table returned by the _PRT method
was confirmed by disassembling the ACPI DSDT table.

Windows XP is using IRQ3 (as reported by HWiNFO32 and in the Device
Manager), which is neither of the two vectors returned by _PRT.
As HWiNFO32 decoded contents of the SPD EEPROMs, the i2c-i801 device is
working under Windows. It appears that Windows has reconfigured the
chipset independently to use another interrupt vector for the device.
This is possible, according to the chipset datasheet [1], page 436 for
example (PIRQ[n]_ROUT—PIRQ[A,B,C,D] Routing Control Register).

[1] https://www.intel.com/content/dam/doc/datasheet/io-controller-hub-9-datasheet.pdf

Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jean Delvare <jdelvare@suse.de>
Previously-reviewed-by: Jean Delvare <jdelvare@suse.de>
Previously-tested-by: Jean Delvare <jdelvare@suse.de>

---
Hello,

I'm resurrecting an older patch that was discussed back in January:

https://lore.kernel.org/lkml/20230121153314.6109-1-mat.jonczyk@o2.pl/T/#u

To consider: should we print a warning or an error in case of duplicate
entries? This may not be serious enough to disturb the user with an
error message at boot.

I'm also looking into modifying the i2c-i801 driver to disable its usage
of interrupts if one did not fire.

v2: - add a newline at the end of the kernel log message,
    - replace: "if (match == NULL)" -> "if (!match)"
    - patch description tweaks.
v3: - fix C style issues pointed by Jean Delvare,
    - switch severity from warning to error.
v3 RESEND: retested on top of v6.2-rc4
v4: - rebase and retest on top of v6.7-rc7
    - switch severity back to warning,
    - change pr_err() to dev_warn() and simplify the code,
    - modify patch description (describe Windows behaviour etc.)
---
 drivers/acpi/pci_irq.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)


base-commit: 861deac3b092f37b2c5e6871732f3e11486f7082

Comments

Jean Delvare Feb. 13, 2024, 9:39 p.m. UTC | #1
Hi Mateusz,

On Tue, 26 Dec 2023 13:42:54 +0100, Mateusz Jończyk wrote:
> On some platforms, the ACPI _PRT function returns duplicate interrupt
> routing entries. Linux uses the first matching entry, but sometimes the
> second matching entry contains the correct interrupt vector.
> 
> As a debugging aid, print a warning to dmesg if duplicate interrupt
> routing entries are present. This way, we could check how many models
> are affected.
> 
> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
> SMBus controller. This controller is nonfunctional unless its interrupt
> usage is disabled (using the "disable_features=0x10" module parameter).
> 
> After investigation, it turned out that the driver was using an
> incorrect interrupt vector: in lspci output for this device there was:
>         Interrupt: pin B routed to IRQ 19
> but after running i2cdetect (without using any i2c-i801 module
> parameters) the following was logged to dmesg:
> 
>         [...]
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         irq 17: nobody cared (try booting with the "irqpoll" option)
> 
> Existence of duplicate entries in a table returned by the _PRT method
> was confirmed by disassembling the ACPI DSDT table.
> 
> Windows XP is using IRQ3 (as reported by HWiNFO32 and in the Device
> Manager), which is neither of the two vectors returned by _PRT.
> As HWiNFO32 decoded contents of the SPD EEPROMs, the i2c-i801 device is
> working under Windows. It appears that Windows has reconfigured the
> chipset independently to use another interrupt vector for the device.
> This is possible, according to the chipset datasheet [1], page 436 for
> example (PIRQ[n]_ROUT—PIRQ[A,B,C,D] Routing Control Register).
> 
> [1] https://www.intel.com/content/dam/doc/datasheet/io-controller-hub-9-datasheet.pdf
> 
> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Jean Delvare <jdelvare@suse.de>
> Previously-reviewed-by: Jean Delvare <jdelvare@suse.de>
> Previously-tested-by: Jean Delvare <jdelvare@suse.de>

I'm still happy with this patch, so you can change that back to:

Reviewed-by: Jean Delvare <jdelvare@suse.de>
Tested-by: Jean Delvare <jdelvare@suse.de>

Thanks,
Rafael J. Wysocki Feb. 16, 2024, 6:26 p.m. UTC | #2
On Tue, Dec 26, 2023 at 1:50 PM Mateusz Jończyk <mat.jonczyk@o2.pl> wrote:
>
> On some platforms, the ACPI _PRT function returns duplicate interrupt
> routing entries. Linux uses the first matching entry, but sometimes the
> second matching entry contains the correct interrupt vector.
>
> As a debugging aid, print a warning to dmesg if duplicate interrupt
> routing entries are present. This way, we could check how many models
> are affected.
>
> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
> SMBus controller. This controller is nonfunctional unless its interrupt
> usage is disabled (using the "disable_features=0x10" module parameter).
>
> After investigation, it turned out that the driver was using an
> incorrect interrupt vector: in lspci output for this device there was:
>         Interrupt: pin B routed to IRQ 19
> but after running i2cdetect (without using any i2c-i801 module
> parameters) the following was logged to dmesg:
>
>         [...]
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         irq 17: nobody cared (try booting with the "irqpoll" option)
>
> Existence of duplicate entries in a table returned by the _PRT method
> was confirmed by disassembling the ACPI DSDT table.
>
> Windows XP is using IRQ3 (as reported by HWiNFO32 and in the Device
> Manager), which is neither of the two vectors returned by _PRT.
> As HWiNFO32 decoded contents of the SPD EEPROMs, the i2c-i801 device is
> working under Windows. It appears that Windows has reconfigured the
> chipset independently to use another interrupt vector for the device.
> This is possible, according to the chipset datasheet [1], page 436 for
> example (PIRQ[n]_ROUT—PIRQ[A,B,C,D] Routing Control Register).
>
> [1] https://www.intel.com/content/dam/doc/datasheet/io-controller-hub-9-datasheet.pdf
>
> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Jean Delvare <jdelvare@suse.de>
> Previously-reviewed-by: Jean Delvare <jdelvare@suse.de>
> Previously-tested-by: Jean Delvare <jdelvare@suse.de>
>
> ---
> Hello,
>
> I'm resurrecting an older patch that was discussed back in January:
>
> https://lore.kernel.org/lkml/20230121153314.6109-1-mat.jonczyk@o2.pl/T/#u
>
> To consider: should we print a warning or an error in case of duplicate
> entries? This may not be serious enough to disturb the user with an
> error message at boot.
>
> I'm also looking into modifying the i2c-i801 driver to disable its usage
> of interrupts if one did not fire.
>
> v2: - add a newline at the end of the kernel log message,
>     - replace: "if (match == NULL)" -> "if (!match)"
>     - patch description tweaks.
> v3: - fix C style issues pointed by Jean Delvare,
>     - switch severity from warning to error.
> v3 RESEND: retested on top of v6.2-rc4
> v4: - rebase and retest on top of v6.7-rc7
>     - switch severity back to warning,
>     - change pr_err() to dev_warn() and simplify the code,
>     - modify patch description (describe Windows behaviour etc.)
> ---
>  drivers/acpi/pci_irq.c | 25 ++++++++++++++++++++++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
> index ff30ceca2203..1fcf72e335b0 100644
> --- a/drivers/acpi/pci_irq.c
> +++ b/drivers/acpi/pci_irq.c
> @@ -203,6 +203,8 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
>         struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
>         struct acpi_pci_routing_table *entry;
>         acpi_handle handle = NULL;
> +       struct acpi_prt_entry *match = NULL;
> +       const char *match_int_source = NULL;
>
>         if (dev->bus->bridge)
>                 handle = ACPI_HANDLE(dev->bus->bridge);
> @@ -219,13 +221,30 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
>
>         entry = buffer.pointer;
>         while (entry && (entry->length > 0)) {
> -               if (!acpi_pci_irq_check_entry(handle, dev, pin,
> -                                                entry, entry_ptr))
> -                       break;
> +               struct acpi_prt_entry *curr;
> +
> +               if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {
> +                       if (!match) {
> +                               match = curr;
> +                               match_int_source = entry->source;
> +                       } else {
> +                               dev_warn(&dev->dev, FW_BUG

dev_info() would be sufficient here IMV.

> +                                      "ACPI _PRT returned duplicate IRQ routing entries for INT%c: %s[%d] and %s[%d]\n",
> +                                      pin_name(curr->pin),
> +                                      match_int_source, match->index,
> +                                      entry->source, curr->index);
> +                               /* We use the first matching entry nonetheless,
> +                                * for compatibility with older kernels.
> +                                */
> +                       }
> +               }
> +
>                 entry = (struct acpi_pci_routing_table *)
>                     ((unsigned long)entry + entry->length);
>         }
>
> +       *entry_ptr = match;
> +
>         kfree(buffer.pointer);
>         return 0;
>  }
>
> base-commit: 861deac3b092f37b2c5e6871732f3e11486f7082
> --

Bjorn, any concerns regarding this one?
Bjorn Helgaas Feb. 16, 2024, 6:49 p.m. UTC | #3
On Fri, Feb 16, 2024 at 07:26:06PM +0100, Rafael J. Wysocki wrote:
> On Tue, Dec 26, 2023 at 1:50 PM Mateusz Jończyk <mat.jonczyk@o2.pl> wrote:
> >
> > On some platforms, the ACPI _PRT function returns duplicate interrupt
> > routing entries. Linux uses the first matching entry, but sometimes the
> > second matching entry contains the correct interrupt vector.
> >
> > As a debugging aid, print a warning to dmesg if duplicate interrupt
> > routing entries are present. This way, we could check how many models
> > are affected.
> >
> > This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
> > SMBus controller. This controller is nonfunctional unless its interrupt
> > usage is disabled (using the "disable_features=0x10" module parameter).
> >
> > After investigation, it turned out that the driver was using an
> > incorrect interrupt vector: in lspci output for this device there was:
> >         Interrupt: pin B routed to IRQ 19
> > but after running i2cdetect (without using any i2c-i801 module
> > parameters) the following was logged to dmesg:
> >
> >         [...]
> >         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
> >         i801_smbus 0000:00:1f.3: Transaction timeout
> >         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
> >         i801_smbus 0000:00:1f.3: Transaction timeout
> >         irq 17: nobody cared (try booting with the "irqpoll" option)
> >
> > Existence of duplicate entries in a table returned by the _PRT method
> > was confirmed by disassembling the ACPI DSDT table.
> >
> > Windows XP is using IRQ3 (as reported by HWiNFO32 and in the Device
> > Manager), which is neither of the two vectors returned by _PRT.
> > As HWiNFO32 decoded contents of the SPD EEPROMs, the i2c-i801 device is
> > working under Windows. It appears that Windows has reconfigured the
> > chipset independently to use another interrupt vector for the device.
> > This is possible, according to the chipset datasheet [1], page 436 for
> > example (PIRQ[n]_ROUT—PIRQ[A,B,C,D] Routing Control Register).
> >
> > [1] https://www.intel.com/content/dam/doc/datasheet/io-controller-hub-9-datasheet.pdf
> >
> > Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
> > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> > Cc: Len Brown <lenb@kernel.org>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Jean Delvare <jdelvare@suse.de>
> > Previously-reviewed-by: Jean Delvare <jdelvare@suse.de>
> > Previously-tested-by: Jean Delvare <jdelvare@suse.de>
> >
> > ---
> > Hello,
> >
> > I'm resurrecting an older patch that was discussed back in January:
> >
> > https://lore.kernel.org/lkml/20230121153314.6109-1-mat.jonczyk@o2.pl/T/#u
> >
> > To consider: should we print a warning or an error in case of duplicate
> > entries? This may not be serious enough to disturb the user with an
> > error message at boot.
> >
> > I'm also looking into modifying the i2c-i801 driver to disable its usage
> > of interrupts if one did not fire.
> >
> > v2: - add a newline at the end of the kernel log message,
> >     - replace: "if (match == NULL)" -> "if (!match)"
> >     - patch description tweaks.
> > v3: - fix C style issues pointed by Jean Delvare,
> >     - switch severity from warning to error.
> > v3 RESEND: retested on top of v6.2-rc4
> > v4: - rebase and retest on top of v6.7-rc7
> >     - switch severity back to warning,
> >     - change pr_err() to dev_warn() and simplify the code,
> >     - modify patch description (describe Windows behaviour etc.)
> > ---
> >  drivers/acpi/pci_irq.c | 25 ++++++++++++++++++++++---
> >  1 file changed, 22 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
> > index ff30ceca2203..1fcf72e335b0 100644
> > --- a/drivers/acpi/pci_irq.c
> > +++ b/drivers/acpi/pci_irq.c
> > @@ -203,6 +203,8 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
> >         struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
> >         struct acpi_pci_routing_table *entry;
> >         acpi_handle handle = NULL;
> > +       struct acpi_prt_entry *match = NULL;
> > +       const char *match_int_source = NULL;
> >
> >         if (dev->bus->bridge)
> >                 handle = ACPI_HANDLE(dev->bus->bridge);
> > @@ -219,13 +221,30 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
> >
> >         entry = buffer.pointer;
> >         while (entry && (entry->length > 0)) {
> > -               if (!acpi_pci_irq_check_entry(handle, dev, pin,
> > -                                                entry, entry_ptr))
> > -                       break;
> > +               struct acpi_prt_entry *curr;
> > +
> > +               if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {
> > +                       if (!match) {
> > +                               match = curr;
> > +                               match_int_source = entry->source;
> > +                        } else {
> > +                               dev_warn(&dev->dev, FW_BUG
> 
> dev_info() would be sufficient here IMV.
> 
> > +                                      "ACPI _PRT returned duplicate IRQ routing entries for INT%c: %s[%d] and %s[%d]\n",
> > +                                      pin_name(curr->pin),
> > +                                      match_int_source, match->index,
> > +                                      entry->source, curr->index);
> > +                               /* We use the first matching entry nonetheless,
> > +                                * for compatibility with older kernels.

The usual comment style in this file is:

  /*
   * We use ...
   */

> > +                                */
> > +                       }
> > +               }
> > +
> >                 entry = (struct acpi_pci_routing_table *)
> >                     ((unsigned long)entry + entry->length);
> >         }
> >
> > +       *entry_ptr = match;
> > +
> >         kfree(buffer.pointer);
> >         return 0;
> >  }
> >
> > base-commit: 861deac3b092f37b2c5e6871732f3e11486f7082
> > --
> 
> Bjorn, any concerns regarding this one?

No concerns from me.  

I guess this only adds a message, right?  It doesn't actually fix
anything or change any behavior?

This talks about "duplicate" entries, which suggests to me that they
are identical, but I don't think they are.  It sounds like it's two
"matching" entries, i.e., two entries for the same (device, pin)?

And neither of the two _PRT entries yields a working i801 device?

Bjorn
Mateusz Jończyk Feb. 16, 2024, 8:20 p.m. UTC | #4
W dniu 16.02.2024 o 19:49, Bjorn Helgaas pisze:
> On Fri, Feb 16, 2024 at 07:26:06PM +0100, Rafael J. Wysocki wrote:
>> On Tue, Dec 26, 2023 at 1:50 PM Mateusz Jończyk <mat.jonczyk@o2.pl> wrote:
>>> On some platforms, the ACPI _PRT function returns duplicate interrupt
>>> routing entries. Linux uses the first matching entry, but sometimes the
>>> second matching entry contains the correct interrupt vector.
>>>
>>> As a debugging aid, print a warning to dmesg if duplicate interrupt
>>> routing entries are present. This way, we could check how many models
>>> are affected.
>>>
>>> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
>>> SMBus controller. This controller is nonfunctional unless its interrupt
>>> usage is disabled (using the "disable_features=0x10" module parameter).
>>>
>>> After investigation, it turned out that the driver was using an
>>> incorrect interrupt vector: in lspci output for this device there was:
>>>         Interrupt: pin B routed to IRQ 19
>>> but after running i2cdetect (without using any i2c-i801 module
>>> parameters) the following was logged to dmesg:
>>>
>>>         [...]
>>>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>>>         i801_smbus 0000:00:1f.3: Transaction timeout
>>>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>>>         i801_smbus 0000:00:1f.3: Transaction timeout
>>>         irq 17: nobody cared (try booting with the "irqpoll" option)
>>>
>>> Existence of duplicate entries in a table returned by the _PRT method
>>> was confirmed by disassembling the ACPI DSDT table.
>>>
>>> Windows XP is using IRQ3 (as reported by HWiNFO32 and in the Device
>>> Manager), which is neither of the two vectors returned by _PRT.
>>> As HWiNFO32 decoded contents of the SPD EEPROMs, the i2c-i801 device is
>>> working under Windows. It appears that Windows has reconfigured the
>>> chipset independently to use another interrupt vector for the device.
>>> This is possible, according to the chipset datasheet [1], page 436 for
>>> example (PIRQ[n]_ROUT—PIRQ[A,B,C,D] Routing Control Register).
>>>
>>> [1] https://www.intel.com/content/dam/doc/datasheet/io-controller-hub-9-datasheet.pdf
>>>
>>> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
>>> Cc: Bjorn Helgaas <bhelgaas@google.com>
>>> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>>> Cc: Len Brown <lenb@kernel.org>
>>> Cc: Borislav Petkov <bp@suse.de>
>>> Cc: Jean Delvare <jdelvare@suse.de>
>>> Previously-reviewed-by: Jean Delvare <jdelvare@suse.de>
>>> Previously-tested-by: Jean Delvare <jdelvare@suse.de>
>>>
>>> ---
>>> Hello,
>>>
>>> I'm resurrecting an older patch that was discussed back in January:
>>>
>>> https://lore.kernel.org/lkml/20230121153314.6109-1-mat.jonczyk@o2.pl/T/#u
>>>
>>> To consider: should we print a warning or an error in case of duplicate
>>> entries? This may not be serious enough to disturb the user with an
>>> error message at boot.
>>>
>>> I'm also looking into modifying the i2c-i801 driver to disable its usage
>>> of interrupts if one did not fire.
>>>
>>> v2: - add a newline at the end of the kernel log message,
>>>     - replace: "if (match == NULL)" -> "if (!match)"
>>>     - patch description tweaks.
>>> v3: - fix C style issues pointed by Jean Delvare,
>>>     - switch severity from warning to error.
>>> v3 RESEND: retested on top of v6.2-rc4
>>> v4: - rebase and retest on top of v6.7-rc7
>>>     - switch severity back to warning,
>>>     - change pr_err() to dev_warn() and simplify the code,
>>>     - modify patch description (describe Windows behaviour etc.)
>>> ---
>>>  drivers/acpi/pci_irq.c | 25 ++++++++++++++++++++++---
>>>  1 file changed, 22 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
>>> index ff30ceca2203..1fcf72e335b0 100644
>>> --- a/drivers/acpi/pci_irq.c
>>> +++ b/drivers/acpi/pci_irq.c
>>> @@ -203,6 +203,8 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
>>>         struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
>>>         struct acpi_pci_routing_table *entry;
>>>         acpi_handle handle = NULL;
>>> +       struct acpi_prt_entry *match = NULL;
>>> +       const char *match_int_source = NULL;
>>>
>>>         if (dev->bus->bridge)
>>>                 handle = ACPI_HANDLE(dev->bus->bridge);
>>> @@ -219,13 +221,30 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
>>>
>>>         entry = buffer.pointer;
>>>         while (entry && (entry->length > 0)) {
>>> -               if (!acpi_pci_irq_check_entry(handle, dev, pin,
>>> -                                                entry, entry_ptr))
>>> -                       break;
>>> +               struct acpi_prt_entry *curr;
>>> +
>>> +               if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {
>>> +                       if (!match) {
>>> +                               match = curr;
>>> +                               match_int_source = entry->source;
>>> +                        } else {
>>> +                               dev_warn(&dev->dev, FW_BUG
>> dev_info() would be sufficient here IMV.
>>
>>> +                                      "ACPI _PRT returned duplicate IRQ routing entries for INT%c: %s[%d] and %s[%d]\n",
>>> +                                      pin_name(curr->pin),
>>> +                                      match_int_source, match->index,
>>> +                                      entry->source, curr->index);
>>> +                               /* We use the first matching entry nonetheless,
>>> +                                * for compatibility with older kernels.
> The usual comment style in this file is:
>
>   /*
>    * We use ...
>    */
>
>>> +                                */
>>> +                       }
>>> +               }
>>> +
>>>                 entry = (struct acpi_pci_routing_table *)
>>>                     ((unsigned long)entry + entry->length);
>>>         }
>>>
>>> +       *entry_ptr = match;
>>> +
>>>         kfree(buffer.pointer);
>>>         return 0;
>>>  }
>>>
>>> base-commit: 861deac3b092f37b2c5e6871732f3e11486f7082
>>> --
>> Bjorn, any concerns regarding this one?
> No concerns from me.  
>
> I guess this only adds a message, right?  It doesn't actually fix
> anything or change any behavior?
Exactly.
> This talks about "duplicate" entries, which suggests to me that they
> are identical, but I don't think they are.  It sounds like it's two
> "matching" entries, i.e., two entries for the same (device, pin)?

Right.

> And neither of the two _PRT entries yields a working i801 device?

Unpatched Linux uses the first matching entry, but the second one gives
a working i801 device. The point is to print a warning message to see
how many devices are affected and whether it is safe to switch the code
to use the last matching entry in all instances.

Therefore I used dev_warn().

> Bjorn

Greetings,

Mateusz
Rafael J. Wysocki Feb. 16, 2024, 8:51 p.m. UTC | #5
On Fri, Feb 16, 2024 at 9:20 PM Mateusz Jończyk <mat.jonczyk@o2.pl> wrote:
>
> W dniu 16.02.2024 o 19:49, Bjorn Helgaas pisze:
> > On Fri, Feb 16, 2024 at 07:26:06PM +0100, Rafael J. Wysocki wrote:
> >> On Tue, Dec 26, 2023 at 1:50 PM Mateusz Jończyk <mat.jonczyk@o2.pl> wrote:
> >>> On some platforms, the ACPI _PRT function returns duplicate interrupt
> >>> routing entries. Linux uses the first matching entry, but sometimes the
> >>> second matching entry contains the correct interrupt vector.
> >>>
> >>> As a debugging aid, print a warning to dmesg if duplicate interrupt
> >>> routing entries are present. This way, we could check how many models
> >>> are affected.
> >>>
> >>> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
> >>> SMBus controller. This controller is nonfunctional unless its interrupt
> >>> usage is disabled (using the "disable_features=0x10" module parameter).
> >>>
> >>> After investigation, it turned out that the driver was using an
> >>> incorrect interrupt vector: in lspci output for this device there was:
> >>>         Interrupt: pin B routed to IRQ 19
> >>> but after running i2cdetect (without using any i2c-i801 module
> >>> parameters) the following was logged to dmesg:
> >>>
> >>>         [...]
> >>>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
> >>>         i801_smbus 0000:00:1f.3: Transaction timeout
> >>>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
> >>>         i801_smbus 0000:00:1f.3: Transaction timeout
> >>>         irq 17: nobody cared (try booting with the "irqpoll" option)
> >>>
> >>> Existence of duplicate entries in a table returned by the _PRT method
> >>> was confirmed by disassembling the ACPI DSDT table.
> >>>
> >>> Windows XP is using IRQ3 (as reported by HWiNFO32 and in the Device
> >>> Manager), which is neither of the two vectors returned by _PRT.
> >>> As HWiNFO32 decoded contents of the SPD EEPROMs, the i2c-i801 device is
> >>> working under Windows. It appears that Windows has reconfigured the
> >>> chipset independently to use another interrupt vector for the device.
> >>> This is possible, according to the chipset datasheet [1], page 436 for
> >>> example (PIRQ[n]_ROUT—PIRQ[A,B,C,D] Routing Control Register).
> >>>
> >>> [1] https://www.intel.com/content/dam/doc/datasheet/io-controller-hub-9-datasheet.pdf
> >>>
> >>> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
> >>> Cc: Bjorn Helgaas <bhelgaas@google.com>
> >>> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> >>> Cc: Len Brown <lenb@kernel.org>
> >>> Cc: Borislav Petkov <bp@suse.de>
> >>> Cc: Jean Delvare <jdelvare@suse.de>
> >>> Previously-reviewed-by: Jean Delvare <jdelvare@suse.de>
> >>> Previously-tested-by: Jean Delvare <jdelvare@suse.de>
> >>>
> >>> ---
> >>> Hello,
> >>>
> >>> I'm resurrecting an older patch that was discussed back in January:
> >>>
> >>> https://lore.kernel.org/lkml/20230121153314.6109-1-mat.jonczyk@o2.pl/T/#u
> >>>
> >>> To consider: should we print a warning or an error in case of duplicate
> >>> entries? This may not be serious enough to disturb the user with an
> >>> error message at boot.
> >>>
> >>> I'm also looking into modifying the i2c-i801 driver to disable its usage
> >>> of interrupts if one did not fire.
> >>>
> >>> v2: - add a newline at the end of the kernel log message,
> >>>     - replace: "if (match == NULL)" -> "if (!match)"
> >>>     - patch description tweaks.
> >>> v3: - fix C style issues pointed by Jean Delvare,
> >>>     - switch severity from warning to error.
> >>> v3 RESEND: retested on top of v6.2-rc4
> >>> v4: - rebase and retest on top of v6.7-rc7
> >>>     - switch severity back to warning,
> >>>     - change pr_err() to dev_warn() and simplify the code,
> >>>     - modify patch description (describe Windows behaviour etc.)
> >>> ---
> >>>  drivers/acpi/pci_irq.c | 25 ++++++++++++++++++++++---
> >>>  1 file changed, 22 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
> >>> index ff30ceca2203..1fcf72e335b0 100644
> >>> --- a/drivers/acpi/pci_irq.c
> >>> +++ b/drivers/acpi/pci_irq.c
> >>> @@ -203,6 +203,8 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
> >>>         struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
> >>>         struct acpi_pci_routing_table *entry;
> >>>         acpi_handle handle = NULL;
> >>> +       struct acpi_prt_entry *match = NULL;
> >>> +       const char *match_int_source = NULL;
> >>>
> >>>         if (dev->bus->bridge)
> >>>                 handle = ACPI_HANDLE(dev->bus->bridge);
> >>> @@ -219,13 +221,30 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
> >>>
> >>>         entry = buffer.pointer;
> >>>         while (entry && (entry->length > 0)) {
> >>> -               if (!acpi_pci_irq_check_entry(handle, dev, pin,
> >>> -                                                entry, entry_ptr))
> >>> -                       break;
> >>> +               struct acpi_prt_entry *curr;
> >>> +
> >>> +               if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {
> >>> +                       if (!match) {
> >>> +                               match = curr;
> >>> +                               match_int_source = entry->source;
> >>> +                        } else {
> >>> +                               dev_warn(&dev->dev, FW_BUG
> >> dev_info() would be sufficient here IMV.
> >>
> >>> +                                      "ACPI _PRT returned duplicate IRQ routing entries for INT%c: %s[%d] and %s[%d]\n",
> >>> +                                      pin_name(curr->pin),
> >>> +                                      match_int_source, match->index,
> >>> +                                      entry->source, curr->index);
> >>> +                               /* We use the first matching entry nonetheless,
> >>> +                                * for compatibility with older kernels.
> > The usual comment style in this file is:
> >
> >   /*
> >    * We use ...
> >    */
> >
> >>> +                                */
> >>> +                       }
> >>> +               }
> >>> +
> >>>                 entry = (struct acpi_pci_routing_table *)
> >>>                     ((unsigned long)entry + entry->length);
> >>>         }
> >>>
> >>> +       *entry_ptr = match;
> >>> +
> >>>         kfree(buffer.pointer);
> >>>         return 0;
> >>>  }
> >>>
> >>> base-commit: 861deac3b092f37b2c5e6871732f3e11486f7082
> >>> --
> >> Bjorn, any concerns regarding this one?
> > No concerns from me.
> >
> > I guess this only adds a message, right?  It doesn't actually fix
> > anything or change any behavior?
> Exactly.
> > This talks about "duplicate" entries, which suggests to me that they
> > are identical, but I don't think they are.  It sounds like it's two
> > "matching" entries, i.e., two entries for the same (device, pin)?
>
> Right.
>
> > And neither of the two _PRT entries yields a working i801 device?
>
> Unpatched Linux uses the first matching entry, but the second one gives
> a working i801 device. The point is to print a warning message to see
> how many devices are affected and whether it is safe to switch the code
> to use the last matching entry in all instances.
>
> Therefore I used dev_warn().

I don't quite see a connection between the above and the log level.
Mateusz Jończyk Feb. 16, 2024, 9:19 p.m. UTC | #6
W dniu 16.02.2024 o 21:51, Rafael J. Wysocki pisze:

> On Fri, Feb 16, 2024 at 9:20 PM Mateusz Jończyk <mat.jonczyk@o2.pl> wrote:
>> W dniu 16.02.2024 o 19:49, Bjorn Helgaas pisze:
>>> On Fri, Feb 16, 2024 at 07:26:06PM +0100, Rafael J. Wysocki wrote:
>>>> On Tue, Dec 26, 2023 at 1:50 PM Mateusz Jończyk <mat.jonczyk@o2.pl> wrote:
>>>>> On some platforms, the ACPI _PRT function returns duplicate interrupt
>>>>> routing entries. Linux uses the first matching entry, but sometimes the
>>>>> second matching entry contains the correct interrupt vector.
>>>>>
>>>>> As a debugging aid, print a warning to dmesg if duplicate interrupt
>>>>> routing entries are present. This way, we could check how many models
>>>>> are affected.
>>>>>
>>>>> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
>>>>> SMBus controller. This controller is nonfunctional unless its interrupt
>>>>> usage is disabled (using the "disable_features=0x10" module parameter).
>>>>>
>>>>> After investigation, it turned out that the driver was using an
>>>>> incorrect interrupt vector: in lspci output for this device there was:
>>>>>         Interrupt: pin B routed to IRQ 19
>>>>> but after running i2cdetect (without using any i2c-i801 module
>>>>> parameters) the following was logged to dmesg:
>>>>>
>>>>>         [...]
>>>>>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>>>>>         i801_smbus 0000:00:1f.3: Transaction timeout
>>>>>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>>>>>         i801_smbus 0000:00:1f.3: Transaction timeout
>>>>>         irq 17: nobody cared (try booting with the "irqpoll" option)
>>>>>
>>>>> Existence of duplicate entries in a table returned by the _PRT method
>>>>> was confirmed by disassembling the ACPI DSDT table.

[snip]

>>> And neither of the two _PRT entries yields a working i801 device?
>> Unpatched Linux uses the first matching entry, but the second one gives
>> a working i801 device. The point is to print a warning message to see
>> how many devices are affected and whether it is safe to switch the code
>> to use the last matching entry in all instances.
>>
>> Therefore I used dev_warn().
> I don't quite see a connection between the above and the log level.

OK, so I'll use dev_info() then.

Greetings,
Mateusz
Andy Shevchenko Feb. 20, 2024, 10:12 p.m. UTC | #7
Tue, Dec 26, 2023 at 01:42:54PM +0100, Mateusz Jończyk kirjoitti:
> On some platforms, the ACPI _PRT function returns duplicate interrupt
> routing entries. Linux uses the first matching entry, but sometimes the
> second matching entry contains the correct interrupt vector.
> 
> As a debugging aid, print a warning to dmesg if duplicate interrupt
> routing entries are present. This way, we could check how many models
> are affected.
> 
> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
> SMBus controller. This controller is nonfunctional unless its interrupt
> usage is disabled (using the "disable_features=0x10" module parameter).
> 
> After investigation, it turned out that the driver was using an
> incorrect interrupt vector: in lspci output for this device there was:
>         Interrupt: pin B routed to IRQ 19
> but after running i2cdetect (without using any i2c-i801 module
> parameters) the following was logged to dmesg:
> 
>         [...]
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         irq 17: nobody cared (try booting with the "irqpoll" option)
> 
> Existence of duplicate entries in a table returned by the _PRT method
> was confirmed by disassembling the ACPI DSDT table.
> 
> Windows XP is using IRQ3 (as reported by HWiNFO32 and in the Device
> Manager), which is neither of the two vectors returned by _PRT.
> As HWiNFO32 decoded contents of the SPD EEPROMs, the i2c-i801 device is
> working under Windows. It appears that Windows has reconfigured the
> chipset independently to use another interrupt vector for the device.
> This is possible, according to the chipset datasheet [1], page 436 for
> example (PIRQ[n]_ROUT—PIRQ[A,B,C,D] Routing Control Register).

> [1] https://www.intel.com/content/dam/doc/datasheet/io-controller-hub-9-datasheet.pdf
> 

Can you convert this to be a Link tag?

Link: ...URL... # [1]
Signed-off-by: ...

> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>

> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Jean Delvare <jdelvare@suse.de>

Please, move these (Cc lines) down after --- cutter line.

> Previously-reviewed-by: Jean Delvare <jdelvare@suse.de>
> Previously-tested-by: Jean Delvare <jdelvare@suse.de>

This shouldn't be in the commit message, just use the comment block
(after --- line) for this.

...

>  	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
>  	struct acpi_pci_routing_table *entry;
>  	acpi_handle handle = NULL;
> +	struct acpi_prt_entry *match = NULL;
> +	const char *match_int_source = NULL;

Can you preserve reversed xmas tree ordering?

...

>  	while (entry && (entry->length > 0)) {
> -		if (!acpi_pci_irq_check_entry(handle, dev, pin,
> -						 entry, entry_ptr))
> -			break;
> +		struct acpi_prt_entry *curr;
> +
> +		if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {

> +			if (!match) {

Why not positive condition?

> +				match = curr;
> +				match_int_source = entry->source;
> +			} else {
> +				dev_warn(&dev->dev, FW_BUG
> +				       "ACPI _PRT returned duplicate IRQ routing entries for INT%c: %s[%d] and %s[%d]\n",
> +				       pin_name(curr->pin),
> +				       match_int_source, match->index,
> +				       entry->source, curr->index);
> +				/* We use the first matching entry nonetheless,
> +				 * for compatibility with older kernels.
> +				 */
> +			}
> +		}
diff mbox series

Patch

diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index ff30ceca2203..1fcf72e335b0 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -203,6 +203,8 @@  static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
 	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
 	struct acpi_pci_routing_table *entry;
 	acpi_handle handle = NULL;
+	struct acpi_prt_entry *match = NULL;
+	const char *match_int_source = NULL;
 
 	if (dev->bus->bridge)
 		handle = ACPI_HANDLE(dev->bus->bridge);
@@ -219,13 +221,30 @@  static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
 
 	entry = buffer.pointer;
 	while (entry && (entry->length > 0)) {
-		if (!acpi_pci_irq_check_entry(handle, dev, pin,
-						 entry, entry_ptr))
-			break;
+		struct acpi_prt_entry *curr;
+
+		if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {
+			if (!match) {
+				match = curr;
+				match_int_source = entry->source;
+			} else {
+				dev_warn(&dev->dev, FW_BUG
+				       "ACPI _PRT returned duplicate IRQ routing entries for INT%c: %s[%d] and %s[%d]\n",
+				       pin_name(curr->pin),
+				       match_int_source, match->index,
+				       entry->source, curr->index);
+				/* We use the first matching entry nonetheless,
+				 * for compatibility with older kernels.
+				 */
+			}
+		}
+
 		entry = (struct acpi_pci_routing_table *)
 		    ((unsigned long)entry + entry->length);
 	}
 
+	*entry_ptr = match;
+
 	kfree(buffer.pointer);
 	return 0;
 }