Message ID | 7e3cd1dc48e30f19f2ac97794e21d1b78bc0c082.1734717055.git.sanastasio@raptorengineering.com
---|---
State | New |
Series | xen/mm: Update page APIs to use unsigned long flags
On 20/12/2024 5:53 pm, Shawn Anastasio wrote:
> Xen's memory management APIs map_pages_to_xen, modify_xen_mappings,
> set_fixmap, ioremap_attr, and __vmap all use an unsigned int to
> represent architecture-dependent page table entry flags. This assumption
> does not work on PPC/radix where some flags go past 32-bits, so update
> these APIs to use unsigned long for flags instead.
>
> Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>

Funnily enough, the same is true on x86 too. NX and ProtKey bits are
from bit 63 downwards.

x86 funnels all PTE flags through {get,put}_pte_flags(), which compresses
a 64bit PTE's flags (top and bottom 12 bits) into a 24 bit number.

This was allegedly for the benefit of 32bit builds of Xen with PAE
paging. I'm not convinced this claim was backed up by evidence, even in
the 32bit days, because the areas of code working on flags separately
from a 64bit PTE are minimal.

Nevertheless, it's been 11 years since we dropped x86_32, and it's
definitely tech debt and unnecessary overhead in x86_64.

I firmly support making this adjustment. It's been on my todo list for
years, but never high enough to get started.

However, instead of just a plain unsigned long, perhaps we should have a
concrete type, perhaps pte_attr_t?

This lets different architectures handle things differently, and also
lets us make it TYPE_SAFE() like mfn and friends.

Most importantly though, it makes it less likely that we're going to end
up with paths that accidentally have some long->int truncation.

Thoughts?

~Andrew
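For context, the {get,put}_pte_flags() compression described above boils down to a pair of helpers along these lines. This is a self-contained sketch of the scheme, with intpte_t typedef'd locally for illustration (in Xen it comes from the x86 page headers):

```c
#include <stdint.h>

typedef uint64_t intpte_t; /* stand-in for Xen's x86 PTE type */

/*
 * x86 PTE attributes live in bits 0-11 (Present, RW, US, ...) and bits
 * 52-63 (NX at bit 63, protection keys, ...).  Shifting the high group
 * down by 40 bits parks it directly above the low group, producing a
 * 24-bit value.  Nothing is lost, but every conversion pays for the
 * shifts and masks -- the overhead mentioned above.
 */
static inline unsigned int get_pte_flags(intpte_t x)
{
    return ((x >> 40) & ~0xfffU) | (x & 0xfff);
}

static inline intpte_t put_pte_flags(unsigned int x)
{
    return (((intpte_t)x & ~0xfffU) << 40) | (x & 0xfff);
}
```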
Hi Andrew,

On 12/20/24 12:23 PM, Andrew Cooper wrote:
> On 20/12/2024 5:53 pm, Shawn Anastasio wrote:
>> Xen's memory management APIs map_pages_to_xen, modify_xen_mappings,
>> set_fixmap, ioremap_attr, and __vmap all use an unsigned int to
>> represent architecture-dependent page table entry flags. This assumption
>> does not work on PPC/radix where some flags go past 32-bits, so update
>> these APIs to use unsigned long for flags instead.
>>
>> Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
>
> Funnily enough, the same is true on x86 too. NX and ProtKey bits are
> from bit 63 downwards.
>
> x86 funnels all PTE flags through {get,put}_pte_flags(), which compresses
> a 64bit PTE's flags (top and bottom 12 bits) into a 24 bit number.
>
> This was allegedly for the benefit of 32bit builds of Xen with PAE
> paging. I'm not convinced this claim was backed up by evidence, even in
> the 32bit days, because the areas of code working on flags separately
> from a 64bit PTE are minimal.
>
> Nevertheless, it's been 11 years since we dropped x86_32, and it's
> definitely tech debt and unnecessary overhead in x86_64.

Interesting -- I wasn't aware that x86 was running into this issue too.
That approach to solving it definitely seems a bit dubious from an
overhead standpoint.

> I firmly support making this adjustment. It's been on my todo list for
> years, but never high enough to get started.
>
> However, instead of just a plain unsigned long, perhaps we should have a
> concrete type, perhaps pte_attr_t?
>
> This lets different architectures handle things differently, and also
> lets us make it TYPE_SAFE() like mfn and friends.

I fully agree that introducing a TYPE_SAFE per-arch type for this is the
more robust solution. I went with plain unsigned long as it requires the
least invasive changes to other architectures, but if there's sufficient
buy-in from everyone then I think that this would be the better route.

One other consideration with that approach would be the ergonomics of
using the TYPE_SAFE macros for these flags, which are often OR'd
together. I know mfn addresses this by offering mfn_{add,max,min,eq}
helper functions, so introducing similar bitwise helpers for the new
pte_attr_t type could work. typesafe.h could even be expanded to include
macros to generate these helper functions for numeric types to reduce
duplication, if you think that'd be reasonable.

> Most importantly though, it makes it less likely that we're going to end
> up with paths that accidentally have some long->int truncation.

In this patch some of those truncating paths are explicitly present, for
example in arm's pt.c. The thinking was that these architectures already
obviously have no issue with 32-bit pte flags, so the truncation won't
cause any issues, but I agree it's not ideal. At the very least, it
presents a potential pitfall if architectures like x86 transition to
using the full 64-bit field in the future.
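The bitwise helpers mentioned here could, for illustration, look like the following sketch. The pte_attr_t name and the helpers are hypothetical, modelled on Xen's TYPE_SAFE() machinery and the existing mfn_* helpers:

```c
#include <xen/types.h>
#include <xen/typesafe.h>

/* Hypothetical: a typesafe wrapper around the PTE flag type. */
TYPE_SAFE(unsigned long, pte_attr);

/* OR two attribute sets together, e.g. to add PTE_SMALL to caller flags. */
static inline pte_attr_t pte_attr_or(pte_attr_t a, pte_attr_t b)
{
    return _pte_attr(pte_attr_x(a) | pte_attr_x(b));
}

/* Test whether any bit in @mask is set in @a. */
static inline bool pte_attr_test(pte_attr_t a, pte_attr_t mask)
{
    return (pte_attr_x(a) & pte_attr_x(mask)) != 0;
}
```

With the flag constants redefined as pte_attr_t, a call like map_pages_to_xen(..., flags | PTE_SMALL) would become map_pages_to_xen(..., pte_attr_or(flags, PTE_SMALL)) -- the verbosity trade-off being weighed in this exchange.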
On 20/12/2024 6:51 pm, Shawn Anastasio wrote:
> Hi Andrew,
>
> On 12/20/24 12:23 PM, Andrew Cooper wrote:
>> On 20/12/2024 5:53 pm, Shawn Anastasio wrote:
>>> Xen's memory management APIs map_pages_to_xen, modify_xen_mappings,
>>> set_fixmap, ioremap_attr, and __vmap all use an unsigned int to
>>> represent architecture-dependent page table entry flags. This assumption
>>> does not work on PPC/radix where some flags go past 32-bits, so update
>>> these APIs to use unsigned long for flags instead.
>>>
>>> Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
>>
>> Funnily enough, the same is true on x86 too. NX and ProtKey bits are
>> from bit 63 downwards.
>>
>> x86 funnels all PTE flags through {get,put}_pte_flags(), which compresses
>> a 64bit PTE's flags (top and bottom 12 bits) into a 24 bit number.
>>
>> This was allegedly for the benefit of 32bit builds of Xen with PAE
>> paging. I'm not convinced this claim was backed up by evidence, even in
>> the 32bit days, because the areas of code working on flags separately
>> from a 64bit PTE are minimal.
>>
>> Nevertheless, it's been 11 years since we dropped x86_32, and it's
>> definitely tech debt and unnecessary overhead in x86_64.
>
> Interesting -- I wasn't aware that x86 was running into this issue too.
> That approach to solving it definitely seems a bit dubious from an
> overhead standpoint.

It is history and inertia, most likely.

Original paging in x86 had 32bit PTEs with attributes in the low 12 bits
only. PSE and PSE36 paging came next, but aren't interesting in this
context.

PAE paging changed to 64bit PTEs, and the NX bit was added in bit 63.
This is where the upper 12 bits got introduced.

However, using uint64_t's in 32bit code is somewhat unergonomic, and Xen
had a bunch of optimisations to use 32bit frames (== 44 bits of
addressable memory). To this day, we've still got this limitation with
__pdx_t, which dictates the size of the frametable (and CONFIG_BIGMEM to
switch to something larger).

x86_64 has a hard dependency on PAE paging, and added other bits into
the upper 12 range, but at least now a PTE isn't larger than the GPR
size.

>> I firmly support making this adjustment. It's been on my todo list for
>> years, but never high enough to get started.
>>
>> However, instead of just a plain unsigned long, perhaps we should have a
>> concrete type, perhaps pte_attr_t?
>>
>> This lets different architectures handle things differently, and also
>> lets us make it TYPE_SAFE() like mfn and friends.
>
> I fully agree that introducing a TYPE_SAFE per-arch type for this is the
> more robust solution. I went with plain unsigned long as it requires the
> least invasive changes to other architectures, but if there's sufficient
> buy-in from everyone then I think that this would be the better route.
>
> One other consideration with that approach would be the ergonomics of
> using the TYPE_SAFE macros for these flags, which are often OR'd
> together. I know mfn addresses this by offering mfn_{add,max,min,eq}
> helper functions, so introducing similar bitwise helpers for the new
> pte_attr_t type could work. typesafe.h could even be expanded to include
> macros to generate these helper functions for numeric types to reduce
> duplication, if you think that'd be reasonable.

I was thinking of TYPE_SAFE() as a second step. I'm certainly not asking
you to make it happen in this patch/series.

But yes, the ergonomics aren't completely great given how many bit
operations we do.
>> Most importantly though, it makes it less likely that we're going to end
>> up with paths that accidentally have some long->int truncation.
>
> In this patch some of those truncating paths are explicitly present, for
> example in arm's pt.c. The thinking was that these architectures already
> obviously have no issue with 32-bit pte flags, so the truncation won't
> cause any issues, but I agree it's not ideal. At the very least, it
> presents a potential pitfall if architectures like x86 transition to
> using the full 64-bit field in the future.

Ok, in which case you want to start by introducing a per-arch typedef
defaulting to unsigned int. Then rearrange the code necessary for PPC,
and then change PPC to use unsigned long.

This results in no changes to other architectures until they take
explicit action to switch fully over to the new typedef.

~Andrew
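Concretely, the staging suggested here might look something like the sketch below. The ARCH_HAS_PTE_ATTR_T guard and the pte_attr_t spelling are hypothetical, not part of the patch as posted (mfn_t as usual comes from xen/mm.h):

```c
/* In PPC's asm headers (file placement illustrative), once its code has
 * been rearranged to cope with 64-bit flags: */
#define ARCH_HAS_PTE_ATTR_T
typedef unsigned long pte_attr_t;   /* radix PTE flags reach past bit 31 */

/* In a common header, included after the asm headers.  Every other
 * architecture keeps unsigned int and sees no change until it opts in: */
#ifndef ARCH_HAS_PTE_ATTR_T
typedef unsigned int pte_attr_t;
#endif

/* Common prototypes then take the typedef rather than a bare integer: */
int map_pages_to_xen(unsigned long virt, mfn_t mfn, unsigned long nr_mfns,
                     pte_attr_t flags);
```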
diff --git a/xen/arch/arm/include/asm/fixmap.h b/xen/arch/arm/include/asm/fixmap.h
index 0cb5d54d1c..96361e670f 100644
--- a/xen/arch/arm/include/asm/fixmap.h
+++ b/xen/arch/arm/include/asm/fixmap.h
@@ -30,7 +30,7 @@ extern lpae_t xen_fixmap[XEN_PT_LPAE_ENTRIES];
 
 /* Map a page in a fixmap entry */
-extern void set_fixmap(unsigned int map, mfn_t mfn, unsigned int flags);
+extern void set_fixmap(unsigned int map, mfn_t mfn, unsigned long flags);
 
 /* Remove a mapping from a fixmap entry */
 extern void clear_fixmap(unsigned int map);
diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 59b774b7b8..4cfd71538b 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -209,7 +209,7 @@ extern int prepare_secondary_mm(int cpu);
 /* Map a frame table to cover physical addresses ps through pe */
 extern void setup_frametable_mappings(paddr_t ps, paddr_t pe);
 /* map a physical range in virtual memory */
-void __iomem *ioremap_attr(paddr_t start, size_t len, unsigned int attributes);
+void __iomem *ioremap_attr(paddr_t start, size_t len, unsigned long attributes);
 
 static inline void __iomem *ioremap_nocache(paddr_t start, size_t len)
 {
diff --git a/xen/arch/arm/mmu/pt.c b/xen/arch/arm/mmu/pt.c
index da28d669e7..f8967fca01 100644
--- a/xen/arch/arm/mmu/pt.c
+++ b/xen/arch/arm/mmu/pt.c
@@ -189,7 +189,7 @@ lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned int attr)
 }
 
 /* Map a 4k page in a fixmap entry */
-void set_fixmap(unsigned int map, mfn_t mfn, unsigned int flags)
+void set_fixmap(unsigned int map, mfn_t mfn, unsigned long flags)
 {
     int res;
 
@@ -210,7 +210,7 @@ void clear_fixmap(unsigned int map)
  * This function should only be used to remap device address ranges
  * TODO: add a check to verify this assumption
  */
-void *ioremap_attr(paddr_t start, size_t len, unsigned int attributes)
+void *ioremap_attr(paddr_t start, size_t len, unsigned long attributes)
 {
     mfn_t mfn = _mfn(PFN_DOWN(start));
     unsigned int offs = start & (PAGE_SIZE - 1);
@@ -701,7 +701,7 @@ static int xen_pt_update(unsigned long virt,
 int map_pages_to_xen(unsigned long virt,
                      mfn_t mfn,
                      unsigned long nr_mfns,
-                     unsigned int flags)
+                     unsigned long flags)
 {
     return xen_pt_update(virt, mfn, nr_mfns, flags);
 }
@@ -719,7 +719,7 @@ int destroy_xen_mappings(unsigned long s, unsigned long e)
     return xen_pt_update(s, INVALID_MFN, (e - s) >> PAGE_SHIFT, 0);
 }
 
-int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
+int modify_xen_mappings(unsigned long s, unsigned long e, unsigned long nf)
 {
     ASSERT(IS_ALIGNED(s, PAGE_SIZE));
     ASSERT(IS_ALIGNED(e, PAGE_SIZE));
diff --git a/xen/arch/ppc/mm-radix.c b/xen/arch/ppc/mm-radix.c
index 1ca3bcba7f..fa326a159b 100644
--- a/xen/arch/ppc/mm-radix.c
+++ b/xen/arch/ppc/mm-radix.c
@@ -285,7 +285,7 @@ int destroy_xen_mappings(unsigned long s, unsigned long e)
 int map_pages_to_xen(unsigned long virt,
                      mfn_t mfn,
                      unsigned long nr_mfns,
-                     unsigned int flags)
+                     unsigned long flags)
 {
     BUG_ON("unimplemented");
 }
diff --git a/xen/arch/riscv/include/asm/fixmap.h b/xen/arch/riscv/include/asm/fixmap.h
index e399a15f53..7302a5c052 100644
--- a/xen/arch/riscv/include/asm/fixmap.h
+++ b/xen/arch/riscv/include/asm/fixmap.h
@@ -33,7 +33,7 @@ extern pte_t xen_fixmap[];
 
 /* Map a page in a fixmap entry */
-void set_fixmap(unsigned int map, mfn_t mfn, unsigned int flags);
+void set_fixmap(unsigned int map, mfn_t mfn, unsigned long flags);
 
 /* Remove a mapping from a fixmap entry */
 void clear_fixmap(unsigned int map);
diff --git a/xen/arch/riscv/pt.c b/xen/arch/riscv/pt.c
index a703e0f1bd..4334db8cce 100644
--- a/xen/arch/riscv/pt.c
+++ b/xen/arch/riscv/pt.c
@@ -405,7 +405,7 @@ static int pt_update(vaddr_t virt, mfn_t mfn,
 int map_pages_to_xen(unsigned long virt,
                      mfn_t mfn,
                      unsigned long nr_mfns,
-                     unsigned int flags)
+                     unsigned long flags)
 {
     /*
      * Ensure that flags has PTE_VALID bit as map_pages_to_xen() is supposed
@@ -436,7 +436,7 @@ int __init populate_pt_range(unsigned long virt, unsigned long nr_mfns)
 }
 
 /* Map a 4k page in a fixmap entry */
-void set_fixmap(unsigned int map, mfn_t mfn, unsigned int flags)
+void set_fixmap(unsigned int map, mfn_t mfn, unsigned long flags)
 {
     if ( map_pages_to_xen(FIXMAP_ADDR(map), mfn, 1, flags | PTE_SMALL) != 0 )
         BUG();
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index fa21903eb2..a5b74dad38 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5469,7 +5469,7 @@ int map_pages_to_xen(
     unsigned long virt,
     mfn_t mfn,
     unsigned long nr_mfns,
-    unsigned int flags)
+    unsigned long flags)
 {
     bool locking = system_state > SYS_STATE_boot;
     l3_pgentry_t *pl3e = NULL, ol3e;
@@ -5887,7 +5887,7 @@ int __init populate_pt_range(unsigned long virt, unsigned long nr_mfns)
  *
  * It is an error to call with present flags over an unpopulated range.
  */
-int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
+int modify_xen_mappings(unsigned long s, unsigned long e, unsigned long nf)
 {
     bool locking = system_state > SYS_STATE_boot;
     l3_pgentry_t *pl3e = NULL;
diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index efbec00af9..76215cb6b9 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -1656,7 +1656,7 @@ void __init efi_init_memory(void)
     struct rt_extra {
         struct rt_extra *next;
         unsigned long smfn, emfn;
-        unsigned int prot;
+        unsigned long prot;
     } *extra, *extra_head = NULL;
 
     free_ebmalloc_unused_mem();
@@ -1671,7 +1671,7 @@ void __init efi_init_memory(void)
         EFI_MEMORY_DESCRIPTOR *desc = efi_memmap + i;
         u64 len = desc->NumberOfPages << EFI_PAGE_SHIFT;
         unsigned long smfn, emfn;
-        unsigned int prot = PAGE_HYPERVISOR_RWX;
+        unsigned long prot = PAGE_HYPERVISOR_RWX;
         paddr_t mem_base;
         unsigned long mem_npages;
diff --git a/xen/common/vmap.c b/xen/common/vmap.c
index 47225fecc0..c2d86fff74 100644
--- a/xen/common/vmap.c
+++ b/xen/common/vmap.c
@@ -222,7 +222,7 @@ static void vm_free(const void *va)
 }
 
 void *__vmap(const mfn_t *mfn, unsigned int granularity,
-             unsigned int nr, unsigned int align, unsigned int flags,
+             unsigned int nr, unsigned int align, unsigned long flags,
              enum vmap_region type)
 {
     void *va = vm_alloc(nr * granularity, align, type);
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 16f733281a..9efddc51bc 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -113,9 +113,9 @@ int map_pages_to_xen(
     unsigned long virt,
     mfn_t mfn,
     unsigned long nr_mfns,
-    unsigned int flags);
+    unsigned long flags);
 /* Alter the permissions of a range of Xen virtual address space. */
-int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf);
+int modify_xen_mappings(unsigned long s, unsigned long e, unsigned long nf);
 void modify_xen_mappings_lite(unsigned long s, unsigned long e,
                               unsigned int nf);
 int destroy_xen_mappings(unsigned long s, unsigned long e);
diff --git a/xen/include/xen/vmap.h b/xen/include/xen/vmap.h
index 26c831757a..65403c77e9 100644
--- a/xen/include/xen/vmap.h
+++ b/xen/include/xen/vmap.h
@@ -57,7 +57,7 @@ void vm_init_type(enum vmap_region type, void *start, void *end);
  * @return Pointer to the mapped area on success; NULL otherwise.
  */
 void *__vmap(const mfn_t *mfn, unsigned int granularity, unsigned int nr,
-             unsigned int align, unsigned int flags, enum vmap_region type);
+             unsigned int align, unsigned long flags, enum vmap_region type);
 
 /*
  * Map an array of pages contiguously into the VMAP_DEFAULT vmap region
Xen's memory management APIs map_pages_to_xen, modify_xen_mappings,
set_fixmap, ioremap_attr, and __vmap all use an unsigned int to
represent architecture-dependent page table entry flags. This assumption
does not work on PPC/radix where some flags go past 32-bits, so update
these APIs to use unsigned long for flags instead.

Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
---
 xen/arch/arm/include/asm/fixmap.h   | 2 +-
 xen/arch/arm/include/asm/mm.h       | 2 +-
 xen/arch/arm/mmu/pt.c               | 8 ++++----
 xen/arch/ppc/mm-radix.c             | 2 +-
 xen/arch/riscv/include/asm/fixmap.h | 2 +-
 xen/arch/riscv/pt.c                 | 4 ++--
 xen/arch/x86/mm.c                   | 4 ++--
 xen/common/efi/boot.c               | 4 ++--
 xen/common/vmap.c                   | 2 +-
 xen/include/xen/mm.h                | 4 ++--
 xen/include/xen/vmap.h              | 2 +-
 11 files changed, 18 insertions(+), 18 deletions(-)