diff mbox

x86/efi-bgrt: Switch pr_err() to pr_debug() for invalid BGRT

Message ID 1435579602-6612-1-git-send-email-matt@codeblueprint.co.uk (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Matt Fleming June 29, 2015, 12:06 p.m. UTC
From: Matt Fleming <matt.fleming@intel.com>

It's totally legitimate, per the ACPI spec, for the firmware to set the
BGRT 'status' field to zero to indicate that the BGRT image isn't being
displayed, and we shouldn't be printing an error message in that case
because it's just noise for users. So swap pr_err() for pr_debug().

However, Josh points that out it still makes sense to test the validity
of the upper 7 bits of the 'status' field, since they're marked as
"reserved" in the spec and must be zero. If firmware violates this it
really *is* an error.

Reported-by: Tom Yan <tom.ty89@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
---

Tom, if you could test out this patch that would be really helpful. I'll
take this one through the EFI git repo.

 arch/x86/platform/efi/efi-bgrt.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Borislav Petkov June 29, 2015, 1:13 p.m. UTC | #1
On Mon, Jun 29, 2015 at 01:06:42PM +0100, Matt Fleming wrote:
> From: Matt Fleming <matt.fleming@intel.com>
> 
> It's totally legitimate, per the ACPI spec, for the firmware to set the
> BGRT 'status' field to zero to indicate that the BGRT image isn't being
> displayed, and we shouldn't be printing an error message in that case
> because it's just noise for users. So swap pr_err() for pr_debug().
> 
> However, Josh points that out it still makes sense to test the validity
> of the upper 7 bits of the 'status' field, since they're marked as
> "reserved" in the spec and must be zero. If firmware violates this it
> really *is* an error.

Sounds to me this should be

	pr_warn(FW_WARN "... );

then, no?

So that it hopefully gets caught at early testing when fw can still be
fixed...?

Better yet FW_BUG even...
Josh Triplett June 29, 2015, 2 p.m. UTC | #2
On Mon, Jun 29, 2015 at 03:13:05PM +0200, Borislav Petkov wrote:
> On Mon, Jun 29, 2015 at 01:06:42PM +0100, Matt Fleming wrote:
> > From: Matt Fleming <matt.fleming@intel.com>
> > 
> > It's totally legitimate, per the ACPI spec, for the firmware to set the
> > BGRT 'status' field to zero to indicate that the BGRT image isn't being
> > displayed, and we shouldn't be printing an error message in that case
> > because it's just noise for users. So swap pr_err() for pr_debug().
> > 
> > However, Josh points that out it still makes sense to test the validity
> > of the upper 7 bits of the 'status' field, since they're marked as
> > "reserved" in the spec and must be zero. If firmware violates this it
> > really *is* an error.
> 
> Sounds to me this should be
> 
> 	pr_warn(FW_WARN "... );
> 
> then, no?
> 
> So that it hopefully gets caught at early testing when fw can still be
> fixed...?
> 
> Better yet FW_BUG even...

Definitely not FW_BUG.  The field is reserved *now*; it would be
legitimate for a new version of the BGRT spec to define one of those
bits for something else.

- Josh Triplett
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josh Triplett June 29, 2015, 2:02 p.m. UTC | #3
On Mon, Jun 29, 2015 at 01:06:42PM +0100, Matt Fleming wrote:
> From: Matt Fleming <matt.fleming@intel.com>
> 
> It's totally legitimate, per the ACPI spec, for the firmware to set the
> BGRT 'status' field to zero to indicate that the BGRT image isn't being
> displayed, and we shouldn't be printing an error message in that case
> because it's just noise for users. So swap pr_err() for pr_debug().
> 
> However, Josh points that out it still makes sense to test the validity
> of the upper 7 bits of the 'status' field, since they're marked as
> "reserved" in the spec and must be zero. If firmware violates this it
> really *is* an error.
> 
> Reported-by: Tom Yan <tom.ty89@gmail.com>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Matthew Garrett <mjg59@srcf.ucam.org>
> Signed-off-by: Matt Fleming <matt.fleming@intel.com>

Looks reasonable to me.

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

> 
> Tom, if you could test out this patch that would be really helpful. I'll
> take this one through the EFI git repo.
> 
>  arch/x86/platform/efi/efi-bgrt.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/platform/efi/efi-bgrt.c b/arch/x86/platform/efi/efi-bgrt.c
> index d7f997f7c26d..ea48449b2e63 100644
> --- a/arch/x86/platform/efi/efi-bgrt.c
> +++ b/arch/x86/platform/efi/efi-bgrt.c
> @@ -50,11 +50,16 @@ void __init efi_bgrt_init(void)
>  		       bgrt_tab->version);
>  		return;
>  	}
> -	if (bgrt_tab->status != 1) {
> -		pr_err("Ignoring BGRT: invalid status %u (expected 1)\n",
> +	if (bgrt_tab->status & 0xfe) {
> +		pr_err("Ignoring BGRT: reserved status bits are non-zero %u\n",
>  		       bgrt_tab->status);
>  		return;
>  	}
> +	if (bgrt_tab->status != 1) {
> +		pr_debug("Ignoring BGRT: invalid status %u (expected 1)\n",
> +			 bgrt_tab->status);
> +		return;
> +	}
>  	if (bgrt_tab->image_type != 0) {
>  		pr_err("Ignoring BGRT: invalid image type %u (expected 0)\n",
>  		       bgrt_tab->image_type);
> -- 
> 2.1.0
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Borislav Petkov June 29, 2015, 2:17 p.m. UTC | #4
On Mon, Jun 29, 2015 at 07:00:22AM -0700, Josh Triplett wrote:
> Definitely not FW_BUG.  The field is reserved *now*; it would be
> legitimate for a new version of the BGRT spec to define one of those
> bits for something else.

Which would mean that booting old kernels on new FW which defines those
reserved bits would cause that warning to fire erroneously.

So then we probably don't need it at all or we need to check implemented
BGRT version of the FW running to know which bits are defined by the
spec and which are reserved...

Also, does the spec really say that reserved bits must be zero? Or it
doesn't specify their value?
Josh Triplett June 29, 2015, 2:45 p.m. UTC | #5
On Mon, Jun 29, 2015 at 04:17:24PM +0200, Borislav Petkov wrote:
> On Mon, Jun 29, 2015 at 07:00:22AM -0700, Josh Triplett wrote:
> > Definitely not FW_BUG.  The field is reserved *now*; it would be
> > legitimate for a new version of the BGRT spec to define one of those
> > bits for something else.
> 
> Which would mean that booting old kernels on new FW which defines those
> reserved bits would cause that warning to fire erroneously.

Not erroneously; those bits could potentially indicate some status
condition we don't know about, so we need to assume we can't handle the
table if a bit we don't understand is set.

(Specs like this should really do what ext4 does, and define whether a
given set of currently undefined bits are optional or mandatory; as in,
if you don't understand them, can you proceed or should you stop?)

> So then we probably don't need it at all or we need to check implemented
> BGRT version of the FW running to know which bits are defined by the
> spec and which are reserved...
> 
> Also, does the spec really say that reserved bits must be zero? Or it
> doesn't specify their value?

The spec says those bits of the status field are reserved and must be
zero.

- Josh Triplett
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Matt Fleming June 29, 2015, 2:49 p.m. UTC | #6
On Mon, 29 Jun, at 04:17:24PM, Borislav Petkov wrote:
> On Mon, Jun 29, 2015 at 07:00:22AM -0700, Josh Triplett wrote:
> > Definitely not FW_BUG.  The field is reserved *now*; it would be
> > legitimate for a new version of the BGRT spec to define one of those
> > bits for something else.
> 
> Which would mean that booting old kernels on new FW which defines those
> reserved bits would cause that warning to fire erroneously.
> 
> So then we probably don't need it at all or we need to check implemented
> BGRT version of the FW running to know which bits are defined by the
> spec and which are reserved...
 
It still makes sense to have the error message because the kernel
literally does not know what the firmware is trying to achieve by
setting those bits.

But I agree with Josh that for the specific case of "reserved bits",
FW_BUG is wrong, because if in some future version of the spec those
bits get used, seeing,

  "[Firmware Bug]: Ignoring BGRT: reserved bits are non-zero 0x3"

is going to confuse the hell outta any firmware engineers, since it's
not the firmware that's buggy, it's just that the kernel lacks support.

This discussion should definitely be summarised in printk.h.

However, other error messages in efi-bgrt.c probably do want to be
sprinkled with FW_* if they represent things that should never happen
ever.

> Also, does the spec really say that reserved bits must be zero? Or it
> doesn't specify their value?

" 1-byte status field indicating current status about the table.
  Bits[7:1] = Reserved (must be zero)
  Bit [0] = Displayed. A one indicates the boot image graphic is
  displayed. "
Josh Triplett June 29, 2015, 2:53 p.m. UTC | #7
On Mon, Jun 29, 2015 at 03:49:40PM +0100, Matt Fleming wrote:
> On Mon, 29 Jun, at 04:17:24PM, Borislav Petkov wrote:
> > On Mon, Jun 29, 2015 at 07:00:22AM -0700, Josh Triplett wrote:
> > > Definitely not FW_BUG.  The field is reserved *now*; it would be
> > > legitimate for a new version of the BGRT spec to define one of those
> > > bits for something else.
> > 
> > Which would mean that booting old kernels on new FW which defines those
> > reserved bits would cause that warning to fire erroneously.
> > 
> > So then we probably don't need it at all or we need to check implemented
> > BGRT version of the FW running to know which bits are defined by the
> > spec and which are reserved...
>  
> It still makes sense to have the error message because the kernel
> literally does not know what the firmware is trying to achieve by
> setting those bits.
> 
> But I agree with Josh that for the specific case of "reserved bits",
> FW_BUG is wrong, because if in some future version of the spec those
> bits get used, seeing,
> 
>   "[Firmware Bug]: Ignoring BGRT: reserved bits are non-zero 0x3"
> 
> is going to confuse the hell outta any firmware engineers, since it's
> not the firmware that's buggy, it's just that the kernel lacks support.

Right.  Same reason we shouldn't use FW_BUG for bgrt_tab->version != 1
or bgrt_tab->image_type != 0 .

> This discussion should definitely be summarised in printk.h.
> 
> However, other error messages in efi-bgrt.c probably do want to be
> sprinkled with FW_* if they represent things that should never happen
> ever.

I think the length check and null pointer check would fall in that
category.

- Josh Triplett
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Borislav Petkov June 29, 2015, 3:44 p.m. UTC | #8
On Mon, Jun 29, 2015 at 03:49:40PM +0100, Matt Fleming wrote:
> It still makes sense to have the error message because the kernel
> literally does not know what the firmware is trying to achieve by
> setting those bits.
> 
> But I agree with Josh that for the specific case of "reserved bits",
> FW_BUG is wrong, because if in some future version of the spec those
> bits get used, seeing,
> 
>   "[Firmware Bug]: Ignoring BGRT: reserved bits are non-zero 0x3"

I still don't see what that message brings if some kernel complains that
some bits are !0 then. Are they valid bits which the kernel doesn't know
about or are they erroneously set and reserved. This, IMHO, is confusing
because the error message is not correct in all cases.

Thus my suggestion to either check the spec version before looking at
the bits or find out in some other way which bits are defined and which
are reserved and warn only about the reserved ones which are 1b.
Josh Triplett June 29, 2015, 4:35 p.m. UTC | #9
On Mon, Jun 29, 2015 at 05:44:58PM +0200, Borislav Petkov wrote:
> On Mon, Jun 29, 2015 at 03:49:40PM +0100, Matt Fleming wrote:
> > It still makes sense to have the error message because the kernel
> > literally does not know what the firmware is trying to achieve by
> > setting those bits.
> > 
> > But I agree with Josh that for the specific case of "reserved bits",
> > FW_BUG is wrong, because if in some future version of the spec those
> > bits get used, seeing,
> > 
> >   "[Firmware Bug]: Ignoring BGRT: reserved bits are non-zero 0x3"
> 
> I still don't see what that message brings if some kernel complains that
> some bits are !0 then. Are they valid bits which the kernel doesn't know
> about or are they erroneously set and reserved. This, IMHO, is confusing
> because the error message is not correct in all cases.
> 
> Thus my suggestion to either check the spec version before looking at
> the bits or find out in some other way which bits are defined and which
> are reserved and warn only about the reserved ones which are 1b.

The version is already checked *before* the status bits: if the version
is not 1, the kernel stops there and ignores the BGRT, before printing a
message about status.

- Josh Triplett
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tom Yan June 29, 2015, 9:52 p.m. UTC | #10
Hi Matt!

The patch works for me against linux 4.0.6. I no longer see the
message in dmesg with the BIOS option set to "Auto". I assume this is
expected? It will only be displayed if I build it with a debug flag,
right?
https://www.kernel.org/doc/local/pr_debug.txt

On 29 June 2015 at 20:06, Matt Fleming <matt@codeblueprint.co.uk> wrote:
> From: Matt Fleming <matt.fleming@intel.com>
>
> It's totally legitimate, per the ACPI spec, for the firmware to set the
> BGRT 'status' field to zero to indicate that the BGRT image isn't being
> displayed, and we shouldn't be printing an error message in that case
> because it's just noise for users. So swap pr_err() for pr_debug().
>
> However, Josh points that out it still makes sense to test the validity
> of the upper 7 bits of the 'status' field, since they're marked as
> "reserved" in the spec and must be zero. If firmware violates this it
> really *is* an error.
>
> Reported-by: Tom Yan <tom.ty89@gmail.com>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Matthew Garrett <mjg59@srcf.ucam.org>
> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
> ---
>
> Tom, if you could test out this patch that would be really helpful. I'll
> take this one through the EFI git repo.
>
>  arch/x86/platform/efi/efi-bgrt.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/platform/efi/efi-bgrt.c b/arch/x86/platform/efi/efi-bgrt.c
> index d7f997f7c26d..ea48449b2e63 100644
> --- a/arch/x86/platform/efi/efi-bgrt.c
> +++ b/arch/x86/platform/efi/efi-bgrt.c
> @@ -50,11 +50,16 @@ void __init efi_bgrt_init(void)
>                        bgrt_tab->version);
>                 return;
>         }
> -       if (bgrt_tab->status != 1) {
> -               pr_err("Ignoring BGRT: invalid status %u (expected 1)\n",
> +       if (bgrt_tab->status & 0xfe) {
> +               pr_err("Ignoring BGRT: reserved status bits are non-zero %u\n",
>                        bgrt_tab->status);
>                 return;
>         }
> +       if (bgrt_tab->status != 1) {
> +               pr_debug("Ignoring BGRT: invalid status %u (expected 1)\n",
> +                        bgrt_tab->status);
> +               return;
> +       }
>         if (bgrt_tab->image_type != 0) {
>                 pr_err("Ignoring BGRT: invalid image type %u (expected 0)\n",
>                        bgrt_tab->image_type);
> --
> 2.1.0
>
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Matt Fleming June 30, 2015, 9:31 a.m. UTC | #11
On Mon, 29 Jun, at 09:35:53AM, Josh Triplett wrote:
> On Mon, Jun 29, 2015 at 05:44:58PM +0200, Borislav Petkov wrote:
> > On Mon, Jun 29, 2015 at 03:49:40PM +0100, Matt Fleming wrote:
> > > It still makes sense to have the error message because the kernel
> > > literally does not know what the firmware is trying to achieve by
> > > setting those bits.
> > > 
> > > But I agree with Josh that for the specific case of "reserved bits",
> > > FW_BUG is wrong, because if in some future version of the spec those
> > > bits get used, seeing,
> > > 
> > >   "[Firmware Bug]: Ignoring BGRT: reserved bits are non-zero 0x3"
> > 
> > I still don't see what that message brings if some kernel complains that
> > some bits are !0 then. Are they valid bits which the kernel doesn't know
> > about or are they erroneously set and reserved. This, IMHO, is confusing
> > because the error message is not correct in all cases.
> > 
> > Thus my suggestion to either check the spec version before looking at
> > the bits or find out in some other way which bits are defined and which
> > are reserved and warn only about the reserved ones which are 1b.
> 
> The version is already checked *before* the status bits: if the version
> is not 1, the kernel stops there and ignores the BGRT, before printing a
> message about status.

If we're performing version[1] checks then I think it's fair game to use
FW_BUG, since these bits are reserved for that version.

[1] Note the version of the BGRT table is checked, not the ACPI spec
version, and it's not clear which would get a bump if new bits were
defined for the 'status' field. Historically, new bits have been added
to the EFI memory map without bumping the expected "memory descriptor"
version - a spec version update was considered to be sufficient (see
EFI_MEMORY_RO).
Matt Fleming June 30, 2015, 10 a.m. UTC | #12
On Tue, 30 Jun, at 05:52:43AM, Tom Yan wrote:
> Hi Matt!
> 
> The patch works for me against linux 4.0.6. I no longer see the
> message in dmesg with the BIOS option set to "Auto". I assume this is
> expected? It will only be displayed if I build it with a debug flag,
> right?
> https://www.kernel.org/doc/local/pr_debug.txt

Right, that is the expected outcome but it's always good to check that I
didn't somehow goof up the patch.

Thanks for testing!
diff mbox

Patch

diff --git a/arch/x86/platform/efi/efi-bgrt.c b/arch/x86/platform/efi/efi-bgrt.c
index d7f997f7c26d..ea48449b2e63 100644
--- a/arch/x86/platform/efi/efi-bgrt.c
+++ b/arch/x86/platform/efi/efi-bgrt.c
@@ -50,11 +50,16 @@  void __init efi_bgrt_init(void)
 		       bgrt_tab->version);
 		return;
 	}
-	if (bgrt_tab->status != 1) {
-		pr_err("Ignoring BGRT: invalid status %u (expected 1)\n",
+	if (bgrt_tab->status & 0xfe) {
+		pr_err("Ignoring BGRT: reserved status bits are non-zero %u\n",
 		       bgrt_tab->status);
 		return;
 	}
+	if (bgrt_tab->status != 1) {
+		pr_debug("Ignoring BGRT: invalid status %u (expected 1)\n",
+			 bgrt_tab->status);
+		return;
+	}
 	if (bgrt_tab->image_type != 0) {
 		pr_err("Ignoring BGRT: invalid image type %u (expected 0)\n",
 		       bgrt_tab->image_type);