iommu: Remove stack trace from broken irq remapping warning

Message ID 1380300815-1864-1-git-send-email-nhorman@tuxdriver.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas

Commit Message

Neil Horman Sept. 27, 2013, 4:53 p.m. UTC
The warning for the irq remapping broken check in intel_irq_remapping.c is
pretty pointless.  We need the warning, but we know where it's coming from, the
stack trace will always be the same, and it needlessly triggers things like
Abrt.  This changes the warning to just print a text warning about the BIOS being
broken, without the stack trace, then sets the appropriate taint bit.  Since we
automatically disable irq remapping, there's no need to continue making Abrt jump
at this problem.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Joerg Roedel <joro@8bytes.org>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 drivers/iommu/intel_irq_remapping.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)
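
For readers who want to see the intended behavioural difference outside the kernel, here is a rough user-space sketch. Everything prefixed mock_ is a hypothetical stand-in and not kernel code: the banner text is simplified, the backtrace is elided, and the idea that a collector such as Abrt keys on the warning banner or the call-trace marker is an assumption made for illustration. The only value taken from the kernel is bit index 11 for TAINT_FIRMWARE_WORKAROUND.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Bit index the kernel uses for TAINT_FIRMWARE_WORKAROUND. */
#define MOCK_TAINT_FIRMWARE_WORKAROUND 11

/* Hypothetical stand-in for the kernel's global taint mask. */
static unsigned long mock_tainted_mask;

static void mock_add_taint(unsigned int flag)
{
	mock_tainted_mask |= 1UL << flag;
}

/*
 * Old behaviour (WARN_TAINT-like): banner, message, backtrace, taint.
 * The banner and trace text are simplified placeholders.
 */
static int mock_warn_taint(char *buf, size_t len, unsigned int flag,
			   const char *msg)
{
	assert(buf != NULL && msg != NULL);
	mock_add_taint(flag);
	return snprintf(buf, len,
			"WARNING: at drivers/iommu/intel_irq_remapping.c\n"
			"%s"
			"Call Trace:\n"
			" <frames elided>\n", msg);
}

/* New behaviour (printk + add_taint-like): message and taint only. */
static int mock_printk_and_taint(char *buf, size_t len, unsigned int flag,
				 const char *msg)
{
	assert(buf != NULL && msg != NULL);
	mock_add_taint(flag);
	return snprintf(buf, len, "%s", msg);
}

/*
 * Crude stand-in for a log scanner. Assumption: it reports anything that
 * contains a warning banner or a call-trace marker.
 */
static int mock_scanner_would_report(const char *log)
{
	assert(log != NULL);
	return strstr(log, "WARNING: at") != NULL ||
	       strstr(log, "Call Trace:") != NULL;
}
```

Feeding the same message through both helpers sets the same bit in mock_tainted_mask, but only the output of mock_warn_taint() trips mock_scanner_would_report(). That is the whole point of the change: keep the text and the taint, lose the trace.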

Comments

Andy Lutomirski Sept. 27, 2013, 7:24 p.m. UTC | #1
On Fri, Sep 27, 2013 at 9:53 AM, Neil Horman <nhorman@tuxdriver.com> wrote:
> The warning for the irq remapping broken check in intel_irq_remapping.c is
> pretty pointless.  We need the warning, but we know where it's coming from, the
> stack trace will always be the same, and it needlessly triggers things like
> Abrt.  This changes the warning to just print a text warning about the BIOS being
> broken, without the stack trace, then sets the appropriate taint bit.  Since we
> automatically disable irq remapping, there's no need to continue making Abrt jump
> at this problem.
>
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: Joerg Roedel <joro@8bytes.org>
> CC: Bjorn Helgaas <bhelgaas@google.com>
> CC: Andy Lutomirski <luto@amacapital.net>
> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> CC: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> ---
>  drivers/iommu/intel_irq_remapping.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
> index f71673d..b97d70b 100644
> --- a/drivers/iommu/intel_irq_remapping.c
> +++ b/drivers/iommu/intel_irq_remapping.c
> @@ -525,12 +525,13 @@ static int __init intel_irq_remapping_supported(void)
>         if (disable_irq_remap)
>                 return 0;
>         if (irq_remap_broken) {
> -               WARN_TAINT(1, TAINT_FIRMWARE_WORKAROUND,
> -                          "This system BIOS has enabled interrupt remapping\n"
> -                          "on a chipset that contains an erratum making that\n"
> -                          "feature unstable.  To maintain system stability\n"
> -                          "interrupt remapping is being disabled.  Please\n"
> -                          "contact your BIOS vendor for an update\n");
> +               printk(KERN_WARNING
> +                       "This system BIOS has enabled interrupt remapping\n"
> +                       "on a chipset that contains an erratum making that\n"
> +                       "feature unstable.  To maintain system stability\n"
> +                       "interrupt remapping is being disabled.  Please\n"
> +                       "contact your BIOS vendor for an update\n");
> +               add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);

Is the taint bit actually useful?  It seems like functionality will be
missing if this workaround happens, but everything should be stable.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Neil Horman Sept. 27, 2013, 7:41 p.m. UTC | #2
On Fri, Sep 27, 2013 at 12:24:10PM -0700, Andy Lutomirski wrote:
> On Fri, Sep 27, 2013 at 9:53 AM, Neil Horman <nhorman@tuxdriver.com> wrote:
> > The warning for the irq remapping broken check in intel_irq_remapping.c is
> > pretty pointless.  We need the warning, but we know where it's coming from, the
> > stack trace will always be the same, and it needlessly triggers things like
> > Abrt.  This changes the warning to just print a text warning about the BIOS being
> > broken, without the stack trace, then sets the appropriate taint bit.  Since we
> > automatically disable irq remapping, there's no need to continue making Abrt jump
> > at this problem.
> >
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > CC: Joerg Roedel <joro@8bytes.org>
> > CC: Bjorn Helgaas <bhelgaas@google.com>
> > CC: Andy Lutomirski <luto@amacapital.net>
> > CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > CC: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> > ---
> >  drivers/iommu/intel_irq_remapping.c | 13 +++++++------
> >  1 file changed, 7 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
> > index f71673d..b97d70b 100644
> > --- a/drivers/iommu/intel_irq_remapping.c
> > +++ b/drivers/iommu/intel_irq_remapping.c
> > @@ -525,12 +525,13 @@ static int __init intel_irq_remapping_supported(void)
> >         if (disable_irq_remap)
> >                 return 0;
> >         if (irq_remap_broken) {
> > -               WARN_TAINT(1, TAINT_FIRMWARE_WORKAROUND,
> > -                          "This system BIOS has enabled interrupt remapping\n"
> > -                          "on a chipset that contains an erratum making that\n"
> > -                          "feature unstable.  To maintain system stability\n"
> > -                          "interrupt remapping is being disabled.  Please\n"
> > -                          "contact your BIOS vendor for an update\n");
> > +               printk(KERN_WARNING
> > +                       "This system BIOS has enabled interrupt remapping\n"
> > +                       "on a chipset that contains an erratum making that\n"
> > +                       "feature unstable.  To maintain system stability\n"
> > +                       "interrupt remapping is being disabled.  Please\n"
> > +                       "contact your BIOS vendor for an update\n");
> > +               add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
> 
> Is the taint bit actually useful?  It seems like functionality will be
> missing if this workaround happens, but everything should be stable.
> 
I think it's useful, yes.  The system will be stable, and in fact should run
exactly as it did before, but since the erratum indicates this should be fixed in
the BIOS, it's a reminder to the admin that they should investigate an update, or
take action by manually disabling the iommu on the command line.

It's also in keeping with the way this was structured previously.
Neil
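
To make the "reminder to the admin" concrete, a minimal sketch of how a user-space tool might check for this taint. It assumes the taint mask is exposed as a decimal number in /proc/sys/kernel/tainted and that TAINT_FIRMWARE_WORKAROUND is bit 11; the helper names are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Bit index of TAINT_FIRMWARE_WORKAROUND in the kernel taint mask. */
#define FIRMWARE_WORKAROUND_BIT 11

/* Returns 1 if the firmware-workaround bit is set in mask, else 0. */
static int firmware_workaround_tainted(unsigned long mask)
{
	return (int)((mask >> FIRMWARE_WORKAROUND_BIT) & 1UL);
}

/*
 * Reads a decimal taint mask from path (normally /proc/sys/kernel/tainted).
 * Returns 0 on success and -1 if the file cannot be opened or parsed.
 */
static int read_taint_mask(const char *path, unsigned long *mask)
{
	FILE *f;
	int ok;

	assert(path != NULL && mask != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%lu", mask) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would pass "/proc/sys/kernel/tainted" to read_taint_mask() and, if firmware_workaround_tainted() returns 1, point the admin at the kernel log for the BIOS message.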

> --Andy
> 
Neil Horman Oct. 3, 2013, 5:21 p.m. UTC | #3
On Fri, Sep 27, 2013 at 12:53:35PM -0400, Neil Horman wrote:
> The warning for the irq remapping broken check in intel_irq_remapping.c is
> pretty pointless.  We need the warning, but we know where it's coming from, the
> stack trace will always be the same, and it needlessly triggers things like
> Abrt.  This changes the warning to just print a text warning about the BIOS being
> broken, without the stack trace, then sets the appropriate taint bit.  Since we
> automatically disable irq remapping, there's no need to continue making Abrt jump
> at this problem.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: Joerg Roedel <joro@8bytes.org>
> CC: Bjorn Helgaas <bhelgaas@google.com>
> CC: Andy Lutomirski <luto@amacapital.net>
> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> CC: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> ---
>  drivers/iommu/intel_irq_remapping.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
> index f71673d..b97d70b 100644
> --- a/drivers/iommu/intel_irq_remapping.c
> +++ b/drivers/iommu/intel_irq_remapping.c
> @@ -525,12 +525,13 @@ static int __init intel_irq_remapping_supported(void)
>  	if (disable_irq_remap)
>  		return 0;
>  	if (irq_remap_broken) {
> -		WARN_TAINT(1, TAINT_FIRMWARE_WORKAROUND,
> -			   "This system BIOS has enabled interrupt remapping\n"
> -			   "on a chipset that contains an erratum making that\n"
> -			   "feature unstable.  To maintain system stability\n"
> -			   "interrupt remapping is being disabled.  Please\n"
> -			   "contact your BIOS vendor for an update\n");
> +		printk(KERN_WARNING
> +			"This system BIOS has enabled interrupt remapping\n"
> +			"on a chipset that contains an erratum making that\n"
> +			"feature unstable.  To maintain system stability\n"
> +			"interrupt remapping is being disabled.  Please\n"
> +			"contact your BIOS vendor for an update\n");
> +		add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
>  		disable_irq_remap = 1;
>  		return 0;
>  	}
> -- 
> 1.8.3.1
> 
> 

Ping Bjorn, Joerg, any thoughts here?
Neil

Bjorn Helgaas Oct. 3, 2013, 7:21 p.m. UTC | #4
On Thu, Oct 3, 2013 at 11:21 AM, Neil Horman <nhorman@tuxdriver.com> wrote:
> On Fri, Sep 27, 2013 at 12:53:35PM -0400, Neil Horman wrote:
>> The warning for the irq remapping broken check in intel_irq_remapping.c is
>> pretty pointless.  We need the warning, but we know where it's coming from, the
>> stack trace will always be the same, and it needlessly triggers things like
>> Abrt.  This changes the warning to just print a text warning about the BIOS being
>> broken, without the stack trace, then sets the appropriate taint bit.  Since we
>> automatically disable irq remapping, there's no need to continue making Abrt jump
>> at this problem.
>>
>> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
>> CC: Joerg Roedel <joro@8bytes.org>
>> CC: Bjorn Helgaas <bhelgaas@google.com>
>> CC: Andy Lutomirski <luto@amacapital.net>
>> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> CC: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
>> ---
>>  drivers/iommu/intel_irq_remapping.c | 13 +++++++------
>>  1 file changed, 7 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
>> index f71673d..b97d70b 100644
>> --- a/drivers/iommu/intel_irq_remapping.c
>> +++ b/drivers/iommu/intel_irq_remapping.c
>> @@ -525,12 +525,13 @@ static int __init intel_irq_remapping_supported(void)
>>       if (disable_irq_remap)
>>               return 0;
>>       if (irq_remap_broken) {
>> -             WARN_TAINT(1, TAINT_FIRMWARE_WORKAROUND,
>> -                        "This system BIOS has enabled interrupt remapping\n"
>> -                        "on a chipset that contains an erratum making that\n"
>> -                        "feature unstable.  To maintain system stability\n"
>> -                        "interrupt remapping is being disabled.  Please\n"
>> -                        "contact your BIOS vendor for an update\n");
>> +             printk(KERN_WARNING
>> +                     "This system BIOS has enabled interrupt remapping\n"
>> +                     "on a chipset that contains an erratum making that\n"
>> +                     "feature unstable.  To maintain system stability\n"
>> +                     "interrupt remapping is being disabled.  Please\n"
>> +                     "contact your BIOS vendor for an update\n");
>> +             add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
>>               disable_irq_remap = 1;
>>               return 0;
>>       }
>> --
>> 1.8.3.1
>>
>>
>
> Ping Bjorn, Joerg, any thoughts here?

This is in drivers/iommu, so I'd prefer that Joerg handle this.

My opinion is that this patch does the right thing in dropping the
backtrace and keeping the taint.

I tend to agree with Andy that we should also consider a second patch
that drops the taint, because this is basically just a quirk that
avoids broken chipset functionality, but as far as I can tell, there's
no reason to fear an undebuggable problem related to disabling
interrupt remapping.  We don't taint the kernel for other similar
quirks, so I don't see why we should here.

Bjorn
Joerg Roedel Oct. 3, 2013, 8:08 p.m. UTC | #5
On Thu, Oct 03, 2013 at 01:21:42PM -0400, Neil Horman wrote:
> On Fri, Sep 27, 2013 at 12:53:35PM -0400, Neil Horman wrote:
> > The warning for the irq remapping broken check in intel_irq_remapping.c is
> > pretty pointless.  We need the warning, but we know where it's coming from, the
> > stack trace will always be the same, and it needlessly triggers things like
> > Abrt.  This changes the warning to just print a text warning about the BIOS being
> > broken, without the stack trace, then sets the appropriate taint bit.  Since we
> > automatically disable irq remapping, there's no need to continue making Abrt jump
> > at this problem.
> > 
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > CC: Joerg Roedel <joro@8bytes.org>
> > CC: Bjorn Helgaas <bhelgaas@google.com>
> > CC: Andy Lutomirski <luto@amacapital.net>
> > CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > CC: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> 
> Ping Bjorn, Joerg, any thoughts here?

Yes, the patch is doing the right thing. I have it already on my list
and will merge it soon.


	Joerg


Neil Horman Oct. 3, 2013, 8:26 p.m. UTC | #6
On Thu, Oct 03, 2013 at 10:08:24PM +0200, Joerg Roedel wrote:
> On Thu, Oct 03, 2013 at 01:21:42PM -0400, Neil Horman wrote:
> > On Fri, Sep 27, 2013 at 12:53:35PM -0400, Neil Horman wrote:
> > > The warning for the irq remapping broken check in intel_irq_remapping.c is
> > > pretty pointless.  We need the warning, but we know where it's coming from, the
> > > stack trace will always be the same, and it needlessly triggers things like
> > > Abrt.  This changes the warning to just print a text warning about the BIOS being
> > > broken, without the stack trace, then sets the appropriate taint bit.  Since we
> > > automatically disable irq remapping, there's no need to continue making Abrt jump
> > > at this problem.
> > > 
> > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > > CC: Joerg Roedel <joro@8bytes.org>
> > > CC: Bjorn Helgaas <bhelgaas@google.com>
> > > CC: Andy Lutomirski <luto@amacapital.net>
> > > CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > CC: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> > 
> > Ping Bjorn, Joerg, any thoughts here?
> 
> Yes, the patch is doing the right thing. I have it already on my list
> and will merge it soon.
> 
Awesome, thanks guys.  Regarding the taint, I'll propose something for that
early next week.

Regards
Neil

> 
> 	Joerg
> 
> 
> 
Joerg Roedel Oct. 4, 2013, 2:32 p.m. UTC | #7
On Fri, Sep 27, 2013 at 12:53:35PM -0400, Neil Horman wrote:
> The warning for the irq remapping broken check in intel_irq_remapping.c is
> pretty pointless.  We need the warning, but we know where it's coming from, the
> stack trace will always be the same, and it needlessly triggers things like
> Abrt.  This changes the warning to just print a text warning about the BIOS being
> broken, without the stack trace, then sets the appropriate taint bit.  Since we
> automatically disable irq remapping, there's no need to continue making Abrt jump
> at this problem.

Applied to x86/vt-d, thanks Neil.



Patch

diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index f71673d..b97d70b 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -525,12 +525,13 @@  static int __init intel_irq_remapping_supported(void)
 	if (disable_irq_remap)
 		return 0;
 	if (irq_remap_broken) {
-		WARN_TAINT(1, TAINT_FIRMWARE_WORKAROUND,
-			   "This system BIOS has enabled interrupt remapping\n"
-			   "on a chipset that contains an erratum making that\n"
-			   "feature unstable.  To maintain system stability\n"
-			   "interrupt remapping is being disabled.  Please\n"
-			   "contact your BIOS vendor for an update\n");
+		printk(KERN_WARNING
+			"This system BIOS has enabled interrupt remapping\n"
+			"on a chipset that contains an erratum making that\n"
+			"feature unstable.  To maintain system stability\n"
+			"interrupt remapping is being disabled.  Please\n"
+			"contact your BIOS vendor for an update\n");
+		add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
 		disable_irq_remap = 1;
 		return 0;
 	}