[2/3] arm64: reimplement page_is_ram() using memblock and UEFI memory map

Message ID 1446126059-25336-3-git-send-email-ard.biesheuvel@linaro.org (mailing list archive)
State New, archived

Commit Message

Ard Biesheuvel Oct. 29, 2015, 1:40 p.m. UTC
This patch overrides the __weak default implementation of page_is_ram(),
which uses string comparisons to find entries called 'System RAM' in
/proc/iomem. Since we used the contents of memblock to create those entries
in the first place, let's use memblock directly.

Also, since the UEFI memory map may describe regions backed by RAM that are
not in memblock (i.e., reserved regions that were removed from the linear
mapping), check the pfn against the UEFI memory map as well.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/mm/mmu.c | 34 ++++++++++++++++++++
 1 file changed, 34 insertions(+)

Comments

Matt Fleming Nov. 12, 2015, 3:31 p.m. UTC | #1
On Thu, 29 Oct, at 02:40:58PM, Ard Biesheuvel wrote:
> This patch overrides the __weak default implementation of page_is_ram(),
> which uses string comparisons to find entries called 'System RAM' in
> /proc/iomem. Since we used the contents of memblock to create those entries
> in the first place, let's use memblock directly.
> 
> Also, since the UEFI memory map may describe regions backed by RAM that are
> not in memblock (i.e., reserved regions that were removed from the linear
> mapping), check the pfn against the UEFI memory map as well.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/mm/mmu.c | 34 ++++++++++++++++++++
>  1 file changed, 34 insertions(+)
 
Am I correct in thinking that the purpose of this series is just to
placate acpi_os_ioremap() on arm64, and its use of page_is_ram()?

While there aren't many users of page_is_ram() right now, I can see
how in the future if new users are added they'd be extremely confused
to find that page_is_ram(pfn) returns true but 'pfn' isn't accessible
by the kernel proper.

Wouldn't it make more sense to teach acpi_os_ioremap() about these
special reserved regions outside of page_is_ram()?

Ard Biesheuvel Nov. 12, 2015, 3:40 p.m. UTC | #2
On 12 November 2015 at 16:31, Matt Fleming <matt@codeblueprint.co.uk> wrote:
> On Thu, 29 Oct, at 02:40:58PM, Ard Biesheuvel wrote:
>> This patch overrides the __weak default implementation of page_is_ram(),
>> which uses string comparisons to find entries called 'System RAM' in
>> /proc/iomem. Since we used the contents of memblock to create those entries
>> in the first place, let's use memblock directly.
>>
>> Also, since the UEFI memory map may describe regions backed by RAM that are
>> not in memblock (i.e., reserved regions that were removed from the linear
>> mapping), check the pfn against the UEFI memory map as well.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  arch/arm64/mm/mmu.c | 34 ++++++++++++++++++++
>>  1 file changed, 34 insertions(+)
>
> Am I correct in thinking that the purpose of this series is just to
> placate acpi_os_ioremap() on arm64, and its use of page_is_ram()?
>

That is currently the primary user, but we need this information for
other purposes as well. One example is /dev/mem, which is used for
both devices and memory (for instance, tools like dmidecode rely
heavily on it). When using it to access a memory region that we
punched out of the linear mapping, we should typically not map it as a
device, since unaligned accesses cause faults in that case.

In summary, it would be nice if we could preserve the 'is ram'
annotation for regions that are not covered by the linear mapping.

> While there aren't many users of page_is_ram() right now, I can see
> how in the future if new users are added they'd be extremely confused
> to find that page_is_ram(pfn) returns true but 'pfn' isn't accessible
> by the kernel proper.
>

Well, who knows. page_is_ram() is poorly documented, and so is the
'System RAM' iomem annotation that its default implementation relies
on.

> Wouldn't it make more sense to teach acpi_os_ioremap() about these
> special reserved regions outside of page_is_ram()?

Perhaps. But it would introduce EFI dependencies into that code.

The bottom line is that I would like to be able to remove UEFI
occupied or reserved regions from the linear mapping without breaking
ACPI, whose use of page_is_ram() results in alignment faults when
accessing such regions.

Mark Rutland Nov. 12, 2015, 4:03 p.m. UTC | #3
On Thu, Nov 12, 2015 at 04:40:23PM +0100, Ard Biesheuvel wrote:
> On 12 November 2015 at 16:31, Matt Fleming <matt@codeblueprint.co.uk> wrote:
> > On Thu, 29 Oct, at 02:40:58PM, Ard Biesheuvel wrote:
> >> This patch overrides the __weak default implementation of page_is_ram(),
> >> which uses string comparisons to find entries called 'System RAM' in
> >> /proc/iomem. Since we used the contents of memblock to create those entries
> >> in the first place, let's use memblock directly.
> >>
> >> Also, since the UEFI memory map may describe regions backed by RAM that are
> >> not in memblock (i.e., reserved regions that were removed from the linear
> >> mapping), check the pfn against the UEFI memory map as well.
> >>
> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> >> ---
> >>  arch/arm64/mm/mmu.c | 34 ++++++++++++++++++++
> >>  1 file changed, 34 insertions(+)
> >
> > Am I correct in thinking that the purpose of this series is just to
> > placate acpi_os_ioremap() on arm64, and its use of page_is_ram()?
> >
> 
> That is currently the primary user, but we need this information for
> other purposes as well. One example is /dev/mem, which is used for
> both devices and memory (for instance, tools like dmidecode rely
> heavily on it). When using it to access a memory region that we
> punched out of the linear mapping, we should typically not map it as a
> device, since unaligned accesses cause faults in that case.
> 
> In summary, it would be nice if we could preserve the 'is ram'
> annotation for regions that are not covered by the linear mapping.
> 
> > While there aren't many users of page_is_ram() right now, I can see
> > how in the future if new users are added they'd be extremely confused
> > to find that page_is_ram(pfn) returns true but 'pfn' isn't accessible
> > by the kernel proper.
> >
> 
> Well, who knows. page_is_ram() is poorly documented, and so is the
> 'System RAM' iomem annotation that its default implementation relies
> on.

Sorry if this is a bit of a derailment, but perhaps now is a good
opportunity to introduce something like:

#ifndef page_is_linear_mapped
#define page_is_linear_mapped page_is_ram
#endif

With documentation as to the semantic difference, and a conversion of
existing users.

Thanks,
Mark.

Ard Biesheuvel Nov. 12, 2015, 4:06 p.m. UTC | #4
On 12 November 2015 at 17:03, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Nov 12, 2015 at 04:40:23PM +0100, Ard Biesheuvel wrote:
>> On 12 November 2015 at 16:31, Matt Fleming <matt@codeblueprint.co.uk> wrote:
>> > On Thu, 29 Oct, at 02:40:58PM, Ard Biesheuvel wrote:
>> >> This patch overrides the __weak default implementation of page_is_ram(),
>> >> which uses string comparisons to find entries called 'System RAM' in
>> >> /proc/iomem. Since we used the contents of memblock to create those entries
>> >> in the first place, let's use memblock directly.
>> >>
>> >> Also, since the UEFI memory map may describe regions backed by RAM that are
>> >> not in memblock (i.e., reserved regions that were removed from the linear
>> >> mapping), check the pfn against the UEFI memory map as well.
>> >>
>> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> >> ---
>> >>  arch/arm64/mm/mmu.c | 34 ++++++++++++++++++++
>> >>  1 file changed, 34 insertions(+)
>> >
>> > Am I correct in thinking that the purpose of this series is just to
>> > placate acpi_os_ioremap() on arm64, and its use of page_is_ram()?
>> >
>>
>> That is currently the primary user, but we need this information for
>> other purposes as well. One example is /dev/mem, which is used for
>> both devices and memory (for instance, tools like dmidecode rely
>> heavily on it). When using it to access a memory region that we
>> punched out of the linear mapping, we should typically not map it as a
>> device, since unaligned accesses cause faults in that case.
>>
>> In summary, it would be nice if we could preserve the 'is ram'
>> annotation for regions that are not covered by the linear mapping.
>>
>> > While there aren't many users of page_is_ram() right now, I can see
>> > how in the future if new users are added they'd be extremely confused
>> > to find that page_is_ram(pfn) returns true but 'pfn' isn't accessible
>> > by the kernel proper.
>> >
>>
>> Well, who knows. page_is_ram() is poorly documented, and so is the
>> 'System RAM' iomem annotation that its default implementation relies
>> on.
>
> Sorry if this is a bit of a derailment, but perhaps now is a good
> opportunity to introduce something like:
>
> #ifndef page_is_linear_mapped
> #define page_is_linear_mapped page_is_ram
> #endif
>
> With documentation as to the semantic difference, and a conversion of
> existing users.
>

As I replied in the other thread, this does not cover all cases on
highmem platforms.
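
The distinction debated in this thread ("is backed by RAM" versus "is covered by the linear mapping") can be illustrated with a small userspace model. This is only a sketch: every name prefixed with sketch_ is hypothetical and does not exist in the kernel, the region table stands in for memblock plus the UEFI memory map, and highmem (which, per the reply above, a simple page_is_linear_mapped alias does not cover) is ignored entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types; none of these names exist in the kernel. */
enum sketch_map_type { SKETCH_MAP_DEVICE, SKETCH_MAP_NORMAL };

struct sketch_region {
	uint64_t base;
	uint64_t size;
	bool backed_by_ram;   /* memory, as opposed to MMIO */
	bool linear_mapped;   /* still covered by the linear mapping */
};

/* Return the region containing addr, or NULL if no region does. */
const struct sketch_region *sketch_find(const struct sketch_region *r,
					size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (r[i].base <= addr && addr - r[i].base < r[i].size)
			return &r[i];
	return NULL;
}

/* "Is RAM" in the sense this patch proposes: backed by memory at all. */
bool sketch_is_ram(const struct sketch_region *r, size_t n, uint64_t addr)
{
	const struct sketch_region *m = sketch_find(r, n, addr);

	return m && m->backed_by_ram;
}

/* The narrower question: can the kernel reach it via the linear mapping? */
bool sketch_is_linear_mapped(const struct sketch_region *r, size_t n,
			     uint64_t addr)
{
	const struct sketch_region *m = sketch_find(r, n, addr);

	return m && m->backed_by_ram && m->linear_mapped;
}

/*
 * Choose memory attributes for a new mapping. Memory that was punched out
 * of the linear mapping should still be mapped as normal memory, because
 * unaligned accesses through a device mapping fault.
 */
enum sketch_map_type sketch_pick_mapping(const struct sketch_region *r,
					 size_t n, uint64_t addr)
{
	return sketch_is_ram(r, n, addr) ? SKETCH_MAP_NORMAL
					 : SKETCH_MAP_DEVICE;
}
```

In this model, a reserved region with backed_by_ram set and linear_mapped clear still gets SKETCH_MAP_NORMAL, which is the behaviour the patch is after for acpi_os_ioremap() and /dev/mem, while a caller that needs a usable linear-map address would have to ask the second question instead.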

Patch

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index c2fa6b56613c..737bfaecb489 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -19,6 +19,7 @@ 
 
 #include <linux/export.h>
 #include <linux/kernel.h>
+#include <linux/efi.h>
 #include <linux/errno.h>
 #include <linux/init.h>
 #include <linux/libfdt.h>
@@ -31,6 +32,7 @@ 
 #include <linux/stop_machine.h>
 
 #include <asm/cputype.h>
+#include <asm/efi.h>
 #include <asm/fixmap.h>
 #include <asm/kernel-pgtable.h>
 #include <asm/sections.h>
@@ -743,3 +745,35 @@  void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
 
 	return dt_virt;
 }
+
+/*
+ * On a UEFI system, the memory map may describe regions that are backed by
+ * memory, but are not covered by the linear mapping and so are not listed as
+ * 'System RAM' in /proc/iomem, which is what the default __weak implementation
+ * of page_is_ram looks for. So check the UEFI memory map as well if the pfn is
+ * not covered by memblock.
+ */
+int page_is_ram(unsigned long pfn)
+{
+	u64 addr = PFN_PHYS(pfn);
+	efi_memory_desc_t *md;
+
+	if (memblock_is_memory(addr))
+		return 1;
+
+	if (!efi_enabled(EFI_MEMMAP))
+		return 0;
+
+	/*
+	 * A pfn could intersect multiple regions in the UEFI memory map if the
+	 * OS page size exceeds 4 KB. However, the UEFI spec explicitly forbids
+	 * mixed attribute mappings within the same 64 KB page frame so just use
+	 * the region that intersects the page address.
+	 */
+	for_each_efi_memory_desc(&memmap, md)
+		if (md->phys_addr <= addr &&
+		    (addr - md->phys_addr) < (md->num_pages << EFI_PAGE_SHIFT))
+			return !!(md->attribute & EFI_MEMORY_WB);
+
+	return 0;
+}
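
The descriptor walk in the patch can be tried outside the kernel with a minimal userspace model. This is a sketch under stated assumptions: struct model_md is a simplified stand-in for efi_memory_desc_t that keeps only the three fields the check reads, model_efi_page_is_ram() is a hypothetical name, and the memblock_is_memory() and efi_enabled() early exits are left out. The constants follow the UEFI definitions (4 KB UEFI pages, write-back attribute bit 0x8).

```c
#include <stddef.h>
#include <stdint.h>

#define EFI_PAGE_SHIFT	12		/* UEFI pages are always 4 KB */
#define EFI_MEMORY_WB	0x8ULL		/* write-back cacheable attribute */

/* Simplified stand-in for efi_memory_desc_t (assumption, not the real type). */
struct model_md {
	uint64_t phys_addr;	/* start of the region */
	uint64_t num_pages;	/* length in 4 KB UEFI pages */
	uint64_t attribute;	/* EFI_MEMORY_* capability bits */
};

/*
 * Mirror of the loop in the patch: find the descriptor that contains addr
 * and report whether it is write-back capable, i.e. ordinary memory.
 * Writing the bounds check as (addr - base) < length avoids overflow in
 * base + length for regions near the top of the address space.
 */
int model_efi_page_is_ram(const struct model_md *map, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		const struct model_md *md = &map[i];

		if (md->phys_addr <= addr &&
		    (addr - md->phys_addr) < (md->num_pages << EFI_PAGE_SHIFT))
			return !!(md->attribute & EFI_MEMORY_WB);
	}

	return 0;	/* not described by the memory map at all */
}
```

Note that the first matching descriptor decides the result, so an address inside a region without the write-back attribute (for example a device region) yields 0 without any further descriptors being consulted.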