diff mbox series

[v3,03/10] efi: Enumerate EFI_MEMORY_SP

Message ID 155993564854.3036719.3692507629721494555.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive)
State Superseded
Headers show
Series EFI Specific Purpose Memory Support | expand

Commit Message

Dan Williams June 7, 2019, 7:27 p.m. UTC
UEFI 2.8 defines an EFI_MEMORY_SP attribute bit to augment the
interpretation of the EFI Memory Types as "reserved for a specific
purpose". The intent of this bit is to allow the OS to identify precious
or scarce memory resources and optionally manage it separately from
EfiConventionalMemory. As defined older OSes that do not know about this
attribute are permitted to ignore it and the memory will be handled
according to the OS default policy for the given memory type.

In other words, this "specific purpose" hint is deliberately weaker than
EfiReservedMemoryType in that the system continues to operate if the OS
takes no action on the attribute. The risk of taking no action is
potentially unwanted / unmovable kernel allocations from the designated
resource that prevent the full realization of the "specific purpose".
For example, consider a system with a high-bandwidth memory pool. Older
kernels are permitted to boot and consume that memory as conventional
"System-RAM" newer kernels may arrange for that memory to be set aside
by the system administrator for a dedicated high-bandwidth memory aware
application to consume.

Specifically, this mechanism allows for the elimination of scenarios
where platform firmware tries to game OS policy by lying about ACPI SLIT
values, i.e. claiming that a precious memory resource has a high
distance to trigger the OS to avoid it by default.

Implement simple detection of the bit for EFI memory table dumps and
save the kernel policy for a follow-on change.

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/firmware/efi/efi.c |    5 +++--
 include/linux/efi.h        |    1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

Comments

Dave Hansen June 7, 2019, 7:53 p.m. UTC | #1
On 6/7/19 12:27 PM, Dan Williams wrote:
> @@ -848,15 +848,16 @@ char * __init efi_md_typeattr_format(char *buf, size_t size,
>  	if (attr & ~(EFI_MEMORY_UC | EFI_MEMORY_WC | EFI_MEMORY_WT |
>  		     EFI_MEMORY_WB | EFI_MEMORY_UCE | EFI_MEMORY_RO |
>  		     EFI_MEMORY_WP | EFI_MEMORY_RP | EFI_MEMORY_XP |
> -		     EFI_MEMORY_NV |
> +		     EFI_MEMORY_NV | EFI_MEMORY_SP |
>  		     EFI_MEMORY_RUNTIME | EFI_MEMORY_MORE_RELIABLE))
>  		snprintf(pos, size, "|attr=0x%016llx]",
>  			 (unsigned long long)attr);
>  	else
>  		snprintf(pos, size,
> -			 "|%3s|%2s|%2s|%2s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",
> +			 "|%3s|%2s|%2s|%2s|%2s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",
>  			 attr & EFI_MEMORY_RUNTIME ? "RUN" : "",
>  			 attr & EFI_MEMORY_MORE_RELIABLE ? "MR" : "",
> +			 attr & EFI_MEMORY_SP      ? "SP"  : "",
>  			 attr & EFI_MEMORY_NV      ? "NV"  : "",
>  			 attr & EFI_MEMORY_XP      ? "XP"  : "",
>  			 attr & EFI_MEMORY_RP      ? "RP"  : "",

Haha, I went digging in sysfs to find out where this gets dumped out.
The joke was on me because it seems to only go to dmesg.

Separate from these patches, should we have a runtime file that dumps
out the same info?  dmesg isn't always available, and hotplug could
change this too, I'd imagine.
Dan Williams June 7, 2019, 8:03 p.m. UTC | #2
On Fri, Jun 7, 2019 at 12:54 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 6/7/19 12:27 PM, Dan Williams wrote:
> > @@ -848,15 +848,16 @@ char * __init efi_md_typeattr_format(char *buf, size_t size,
> >       if (attr & ~(EFI_MEMORY_UC | EFI_MEMORY_WC | EFI_MEMORY_WT |
> >                    EFI_MEMORY_WB | EFI_MEMORY_UCE | EFI_MEMORY_RO |
> >                    EFI_MEMORY_WP | EFI_MEMORY_RP | EFI_MEMORY_XP |
> > -                  EFI_MEMORY_NV |
> > +                  EFI_MEMORY_NV | EFI_MEMORY_SP |
> >                    EFI_MEMORY_RUNTIME | EFI_MEMORY_MORE_RELIABLE))
> >               snprintf(pos, size, "|attr=0x%016llx]",
> >                        (unsigned long long)attr);
> >       else
> >               snprintf(pos, size,
> > -                      "|%3s|%2s|%2s|%2s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",
> > +                      "|%3s|%2s|%2s|%2s|%2s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",
> >                        attr & EFI_MEMORY_RUNTIME ? "RUN" : "",
> >                        attr & EFI_MEMORY_MORE_RELIABLE ? "MR" : "",
> > +                      attr & EFI_MEMORY_SP      ? "SP"  : "",
> >                        attr & EFI_MEMORY_NV      ? "NV"  : "",
> >                        attr & EFI_MEMORY_XP      ? "XP"  : "",
> >                        attr & EFI_MEMORY_RP      ? "RP"  : "",
>
> Haha, I went digging in sysfs to find out where this gets dumped out.
> The joke was on me because it seems to only go to dmesg.
>
> Separate from these patches, should we have a runtime file that dumps
> out the same info?  dmesg isn't always available, and hotplug could
> change this too, I'd imagine.

Perhaps, but I thought /proc/iomem was that runtime file. Given that
x86/Linux only seems to care about the the EFI to E820 translation of
the map and the E820 map is directly reflected in /proc/iomem, do we
need another file?
Dave Hansen June 7, 2019, 9:12 p.m. UTC | #3
On 6/7/19 1:03 PM, Dan Williams wrote:
>> Separate from these patches, should we have a runtime file that dumps
>> out the same info?  dmesg isn't always available, and hotplug could
>> change this too, I'd imagine.
> Perhaps, but I thought /proc/iomem was that runtime file. Given that
> x86/Linux only seems to care about the the EFI to E820 translation of
> the map and the E820 map is directly reflected in /proc/iomem, do we
> need another file?

Probably not.

I'm just trying to think of ways that we can debug systems where someone
"loses" a bunch of memory, especially if they're moving from an old
kernel to a new one with these patches.  From their perspective, they
just lost a bunch of expensive memory.

Do we owe a pr_info(), perhaps?  Or even a /proc/meminfo entry for how
much memory these devices own?
Dan Williams June 7, 2019, 10:07 p.m. UTC | #4
On Fri, Jun 7, 2019 at 2:12 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 6/7/19 1:03 PM, Dan Williams wrote:
> >> Separate from these patches, should we have a runtime file that dumps
> >> out the same info?  dmesg isn't always available, and hotplug could
> >> change this too, I'd imagine.
> > Perhaps, but I thought /proc/iomem was that runtime file. Given that
> > x86/Linux only seems to care about the the EFI to E820 translation of
> > the map and the E820 map is directly reflected in /proc/iomem, do we
> > need another file?
>
> Probably not.
>
> I'm just trying to think of ways that we can debug systems where someone
> "loses" a bunch of memory, especially if they're moving from an old
> kernel to a new one with these patches.  From their perspective, they
> just lost a bunch of expensive memory.
>
> Do we owe a pr_info(), perhaps?  Or even a /proc/meminfo entry for how
> much memory these devices own?

We have this existing print when this bit is found:

[    0.023650] e820: update [mem 0x240000000-0x43fffffff] usable ==>
application reserved

...but perhaps /proc/meminfo could grow:

ApplicationReservedOffline
ApplicationReservedOnline

...to show the relative amount of this memory that has been routed to
device-dax and how much has been returned to the core-mm?
diff mbox series

Patch

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 55b77c576c42..81db09485881 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -848,15 +848,16 @@  char * __init efi_md_typeattr_format(char *buf, size_t size,
 	if (attr & ~(EFI_MEMORY_UC | EFI_MEMORY_WC | EFI_MEMORY_WT |
 		     EFI_MEMORY_WB | EFI_MEMORY_UCE | EFI_MEMORY_RO |
 		     EFI_MEMORY_WP | EFI_MEMORY_RP | EFI_MEMORY_XP |
-		     EFI_MEMORY_NV |
+		     EFI_MEMORY_NV | EFI_MEMORY_SP |
 		     EFI_MEMORY_RUNTIME | EFI_MEMORY_MORE_RELIABLE))
 		snprintf(pos, size, "|attr=0x%016llx]",
 			 (unsigned long long)attr);
 	else
 		snprintf(pos, size,
-			 "|%3s|%2s|%2s|%2s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",
+			 "|%3s|%2s|%2s|%2s|%2s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",
 			 attr & EFI_MEMORY_RUNTIME ? "RUN" : "",
 			 attr & EFI_MEMORY_MORE_RELIABLE ? "MR" : "",
+			 attr & EFI_MEMORY_SP      ? "SP"  : "",
 			 attr & EFI_MEMORY_NV      ? "NV"  : "",
 			 attr & EFI_MEMORY_XP      ? "XP"  : "",
 			 attr & EFI_MEMORY_RP      ? "RP"  : "",
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 6ebc2098cfe1..91368f5ce114 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -112,6 +112,7 @@  typedef	struct {
 #define EFI_MEMORY_MORE_RELIABLE \
 				((u64)0x0000000000010000ULL)	/* higher reliability */
 #define EFI_MEMORY_RO		((u64)0x0000000000020000ULL)	/* read-only */
+#define EFI_MEMORY_SP		((u64)0x0000000000040000ULL)	/* special purpose */
 #define EFI_MEMORY_RUNTIME	((u64)0x8000000000000000ULL)	/* range requires runtime mapping */
 #define EFI_MEMORY_DESCRIPTOR_VERSION	1