Message ID | 20190218170245.14915-1-steve.capper@arm.com (mailing list archive) |
---|---|
Series | 52-bit kernel + user VAs |
On Mon, 18 Feb 2019 at 18:05, Steve Capper <steve.capper@arm.com> wrote:
>
> This patch series adds support for 52-bit kernel VAs using some of the
> machinery already introduced by the 52-bit userspace VA code in 5.0.
>
> As 52-bit virtual address support is an optional hardware feature,
> software support for 52-bit kernel VAs needs to be deduced at early boot
> time. If HW support is not available, the kernel falls back to 48-bit.
>
> A significant proportion of this series focuses on "de-constifying"
> VA_BITS related constants.
>
> In order to allow for a KASAN shadow that changes size at boot time, one
> must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the
> start address. Also, it is highly desirable to maintain the same
> function addresses in the kernel .text between VA sizes. Both of these
> requirements necessitate us to flip the kernel address space halves s.t.
> the direct linear map occupies the lower addresses.
>
> One obvious omission is 52-bit kernel VA + 48-bit userspace VA which I
> can add with some more #ifdef'ery if needed.
>

Hi Steve,

Apologies if I am bringing up things that have been addressed
internally already. We discussed the 52-bit kernel VA work at
Plumbers at some point, and IIUC, KASAN is the complicating factor
when it comes to having compile time constants for VA_BITS_MIN,
VA_BITS_MAX and PAGE_OFFSET, right?

To clarify what I mean, please refer to the diagram below, which
describes a hybrid 48/52 kernel VA arrangement that does not rely on
runtime variable quantities. (VA_BITS_MIN == 48, VA_BITS_MAX == 52)

+--------------------- (~0) ---------------------+
|                                                |
|             PCI IO / fixmap spaces             |
|                                                |
+------------------------------------------------+
|                                                |
|              kernel/vmalloc space              |
|                                                |
+------------------------------------------------+
|                                                |
|                  module space                  |
|                                                |
+------------------------------------------------+
|                                                |
|                   BPF space                    |
|                                                |
+------------------------------------------------+
|                                                |
|                                                |
|   vmemmap space (size based on VA_BITS_MAX)    |
|                                                |
|                                                |
+-- linear/vmalloc split based on VA_BITS_MIN ---+
|                                                |
|   linear mapping (48 bit addressable region)   |
|                                                |
+------------------------------------------------+
|                                                |
|   linear mapping (52 bit addressable region)   |
|                                                |
+------ PAGE_OFFSET based on VA_BITS_MAX --------+

Since KASAN is what is preventing this, would it be acceptable for
KASAN to only be supported when you use a true 48 bit or a true 52 bit
configuration, and disable it for the 48/52 hybrid configuration?

Just thinking out loud (and in ASCII art :-))

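The two boundaries named in the diagram can be written down as compile-time quantities in a few lines of C. This is only an illustrative sketch, not code from the series: it assumes that "based on VA_BITS_MAX" means all-ones shifted left by VA_BITS_MAX, and the helper names (`kernel_va_start`, `page_offset`, `min_va_start`, `addressable_with_min_va`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VA_BITS_MIN 48 /* smallest kernel VA size the image must boot with */
#define VA_BITS_MAX 52 /* largest kernel VA size the image can make use of */

/* Lowest kernel address for a given VA size: all ones shifted left. */
static inline uint64_t kernel_va_start(unsigned int va_bits)
{
	assert(va_bits < 64); /* shifting by 64 or more is undefined in C */
	return UINT64_MAX << va_bits;
}

/* Bottom of the linear map; fixed at build time from VA_BITS_MAX. */
static inline uint64_t page_offset(void)
{
	return kernel_va_start(VA_BITS_MAX);
}

/* Lowest address that 48-bit-only hardware can translate. */
static inline uint64_t min_va_start(void)
{
	return kernel_va_start(VA_BITS_MIN);
}

/* Non-zero if addr is usable on hardware limited to VA_BITS_MIN. */
static inline int addressable_with_min_va(uint64_t addr)
{
	return addr >= min_va_start();
}
```

Under these assumptions `page_offset()` is 0xfff0000000000000 and `min_va_start()` is 0xffff000000000000, so the bottom of the linear map is only reachable on hardware that implements 52-bit VAs, while everything from the 48-bit boundary up to (~0) is reachable everywhere.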
On Tue, Feb 19, 2019 at 01:13:32PM +0100, Ard Biesheuvel wrote:
> [...]
> Since KASAN is what is preventing this, would it be acceptable for
> KASAN to only be supported when you use a true 48 bit or a true 52 bit
> configuration, and disable it for the 48/52 hybrid configuration?
>
> Just thinking out loud (and in ASCII art :-))

TBH, if we end up having support for 52-bit kernel VA, I'd be inclined to
drop the 48/52 configuration altogether. But Catalin's on holiday at the
moment, and may have a different opinion ;)

Will

On Tue, 19 Feb 2019 at 13:48, Will Deacon <will.deacon@arm.com> wrote:
>
> On Tue, Feb 19, 2019 at 01:13:32PM +0100, Ard Biesheuvel wrote:
> > [...]
> > Since KASAN is what is preventing this, would it be acceptable for
> > KASAN to only be supported when you use a true 48 bit or a true 52 bit
> > configuration, and disable it for the 48/52 hybrid configuration?
> >
> > Just thinking out loud (and in ASCII art :-))
>
> TBH, if we end up having support for 52-bit kernel VA, I'd be inclined to
> drop the 48/52 configuration altogether. But Catalin's on holiday at the
> moment, and may have a different opinion ;)
>

But that implies that you cannot have an image that supports 52-bit
kernel VAs but can still boot on hardware that does not implement
support for it. If that is acceptable, then none of this hoop jumping
that Steve is doing in these patches is necessary to begin with,
right?

On Tue, Feb 19, 2019 at 01:51:51PM +0100, Ard Biesheuvel wrote:
> On Tue, 19 Feb 2019 at 13:48, Will Deacon <will.deacon@arm.com> wrote:
> > [...]
> > TBH, if we end up having support for 52-bit kernel VA, I'd be inclined to
> > drop the 48/52 configuration altogether. But Catalin's on holiday at the
> > moment, and may have a different opinion ;)
> >
>
> But that implies that you cannot have an image that supports 52-bit
> kernel VAs but can still boot on hardware that does not implement
> support for it. If that is acceptable, then none of this hoop jumping
> that Steve is doing in these patches is necessary to begin with,
> right?

Sorry, I misunderstood what you meant by a "48/52 hybrid configuration". I
thought you were referring to the configuration where userspace is 52-bit
and the kernel is 48-bit, which is something I think we can drop if we gain
support for 52-bit kernel.

Now that I understand what you mean, I think disabling KASAN would be fine
as long as it's a runtime thing and the kernel continues to work in every
other respect.

Will

On Tue, 19 Feb 2019 at 14:01, Will Deacon <will.deacon@arm.com> wrote:
> [...]
> Sorry, I misunderstood what you meant by a "48/52 hybrid configuration". I
> thought you were referring to the configuration where userspace is 52-bit
> and the kernel is 48-bit, which is something I think we can drop if we gain
> support for 52-bit kernel.
>
> Now that I understand what you mean, I think disabling KASAN would be fine
> as long as it's a runtime thing and the kernel continues to work in every
> other respect.
>

No, it would be a limitation of the 52-bit config which also supports
48-bit-VA-only-h/w that the address space is laid out in such a way
that there is simply no room for the KASAN shadow region, since it
would have to live in the 48-bit addressable area, but be big enough
to cover 52 bits of VA, which is impossible.

For the vmemmap space, we could live with sizing it statically to
cover a 52-bit VA linear region, but the KASAN shadow region is simply
too big.

So if KASAN support in that configuration is a requirement, then I
agree with Steve's approach, but it does imply that quite a number of
formerly compile-time constants now get turned into runtime variables.

Steve, do you have any idea what the impact of that is?

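The "no room" argument can be checked with a little arithmetic. The sketch below assumes generic KASAN's usual scale of one shadow byte per 8 bytes of memory (a shift of 3) and treats the whole 2^48-byte kernel range of 48-bit hardware as available, which is more generous than any real layout; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define KASAN_SHADOW_SCALE_SHIFT 3 /* assumed: 1 shadow byte per 8 bytes */

/* Bytes of kernel VA space reachable with a given VA size. */
static inline uint64_t kernel_va_size(unsigned int va_bits)
{
	assert(va_bits < 64);
	return 1ULL << va_bits;
}

/* Shadow bytes needed to cover a kernel VA space of that size. */
static inline uint64_t kasan_shadow_size(unsigned int va_bits)
{
	assert(va_bits > KASAN_SHADOW_SCALE_SHIFT && va_bits < 64);
	return 1ULL << (va_bits - KASAN_SHADOW_SCALE_SHIFT);
}

/*
 * Upper bound only: can a shadow covering cover_bits of VA fit inside
 * the kernel range that host_bits hardware is able to address at all?
 */
static inline int shadow_fits(unsigned int cover_bits, unsigned int host_bits)
{
	return kasan_shadow_size(cover_bits) <= kernel_va_size(host_bits);
}
```

A shadow for 52 bits of VA needs 2^49 bytes, twice the entire 2^48-byte kernel range of a 48-bit machine, so `shadow_fits(52, 48)` is false even before any other region is placed, whereas `shadow_fits(48, 48)` and `shadow_fits(52, 52)` both hold.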
On Tue, Feb 19, 2019 at 02:15:26PM +0100, Ard Biesheuvel wrote:
> [...]
> So if KASAN support in that configuration is a requirement, then I
> agree with Steve's approach, but it does imply that quite a number of
> formerly compile-time constants now get turned into runtime variables.
>
> Steve, do you have any idea what the impact of that is?

Hi Guys,

The KASAN region only really necessitates two things: 1) that we think
about the end address of the region (which is invariant) rather than the
start address; and that 2) we flip the kernel VA space. IIUC both these
changes have a negligible perf impact.

As for VA_BITS_ACTUAL, we need this in a few places: KVM mapping
support, and the big one phys_to/from_virt. For phys_to/from_virt the
logic is changed s.t. we use a variable lookup for translation but this
is folded into a new variable physvirt_offset (before the patch we used
a single variable read too).

Again IIUC there should be a minimal perf impact (unless one tries to do
cat /sys/kernel/debug/kernel_page_tables with KASAN enabled - but that
can be optimised later).

I didn't have the patience for ASCII art ;-), but I have a picture of
what I think it looks like here:
https://s3.amazonaws.com/connect.linaro.org/yvr18/presentations/yvr18-119.pdf
What I've tried to do is have most parts of the kernel VA space
invariant between 48/52 bits. If it's helpful I can type this up into a
document/commit log message?

For this series I have tried to introduce VA_BITS_MIN in its own patch
and also VA_BITS_ACTUAL into its own patch to make it easier to follow.

If I've overlooked something, please let me know.

Cheers,

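The two mechanisms described above, a shadow region with a fixed end and a moving start, and a translation that reads a single precomputed offset, might look roughly like this. It is a sketch under stated assumptions rather than the code in the series: only the name `physvirt_offset` comes from the message; the helper names, the shadow-size formula (1 << (va_bits - 3)) and all example addresses are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define KASAN_SHADOW_SCALE_SHIFT 3 /* assumed: 1 shadow byte per 8 bytes */

/* The single runtime variable read by both translation helpers. */
static uint64_t physvirt_offset;

/* Computed once at early boot, when the start of the linear map is known. */
static inline void init_physvirt_offset(uint64_t phys_base, uint64_t linear_base)
{
	/* Unsigned arithmetic wraps modulo 2^64, which is what we want here. */
	physvirt_offset = phys_base - linear_base;
}

static inline uint64_t sketch_phys_to_virt(uint64_t phys)
{
	return phys - physvirt_offset;
}

static inline uint64_t sketch_virt_to_phys(uint64_t virt)
{
	return virt + physvirt_offset;
}

/* The end of the shadow is invariant; only the start moves with va_bits. */
static inline uint64_t kasan_shadow_start(uint64_t shadow_end, unsigned int va_bits)
{
	assert(va_bits > KASAN_SHADOW_SCALE_SHIFT && va_bits < 64);
	return shadow_end - (1ULL << (va_bits - KASAN_SHADOW_SCALE_SHIFT));
}
```

Whatever the VA size turns out to be at boot, each translation stays one load plus one add or subtract, which is consistent with the expectation of a minimal performance impact.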
On Tue, 19 Feb 2019 at 14:56, Steve Capper <Steve.Capper@arm.com> wrote: > > On Tue, Feb 19, 2019 at 02:15:26PM +0100, Ard Biesheuvel wrote: > > On Tue, 19 Feb 2019 at 14:01, Will Deacon <will.deacon@arm.com> wrote: > > > > > > On Tue, Feb 19, 2019 at 01:51:51PM +0100, Ard Biesheuvel wrote: > > > > On Tue, 19 Feb 2019 at 13:48, Will Deacon <will.deacon@arm.com> wrote: > > > > > > > > > > On Tue, Feb 19, 2019 at 01:13:32PM +0100, Ard Biesheuvel wrote: > > > > > > On Mon, 18 Feb 2019 at 18:05, Steve Capper <steve.capper@arm.com> wrote: > > > > > > > > > > > > > > This patch series adds support for 52-bit kernel VAs using some of the > > > > > > > machinery already introduced by the 52-bit userspace VA code in 5.0. > > > > > > > > > > > > > > As 52-bit virtual address support is an optional hardware feature, > > > > > > > software support for 52-bit kernel VAs needs to be deduced at early boot > > > > > > > time. If HW support is not available, the kernel falls back to 48-bit. > > > > > > > > > > > > > > A significant proportion of this series focuses on "de-constifying" > > > > > > > VA_BITS related constants. > > > > > > > > > > > > > > In order to allow for a KASAN shadow that changes size at boot time, one > > > > > > > must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the > > > > > > > start address. Also, it is highly desirable to maintain the same > > > > > > > function addresses in the kernel .text between VA sizes. Both of these > > > > > > > requirements necessitate us to flip the kernel address space halves s.t. > > > > > > > the direct linear map occupies the lower addresses. > > > > > > > > > > > > > > One obvious omission is 52-bit kernel VA + 48-bit userspace VA which I > > > > > > > can add with some more #ifdef'ery if needed. > > > > > > > > > > > > > > > > > > > Hi Steve, > > > > > > > > > > > > Apologies if I am bringing up things that have been addressed > > > > > > internally already. 
We discussed the 52-bit kernel VA work at > > > > > > plumber's at some point, and IIUC, KASAN is the complicating factor > > > > > > when it comes to having compile time constants for VA_BITS_MIN, > > > > > > VA_BITS_MAX and PAGE_OFFSET, right? > > > > > > > > > > > > To clarify what I mean, please refer to the diagram below, which > > > > > > describes a hybrid 48/52 kernel VA arrangement that does not rely on > > > > > > runtime variable quantities. (VA_BITS_MIN == 48, VA_BITS_MAX == 52) > > > > > > > > > > > > +------------------- (~0) -------------------------+ > > > > > > | | > > > > > > | PCI IO / fixmap spaces | > > > > > > | | > > > > > > +------------------------------------------------+ > > > > > > | | > > > > > > | kernel/vmalloc space | > > > > > > | | > > > > > > +------------------------------------------------+ > > > > > > | | > > > > > > | module space | > > > > > > | | > > > > > > +------------------------------------------------+ > > > > > > | | > > > > > > | BPF space | > > > > > > | | > > > > > > +------------------------------------------------+ > > > > > > | | > > > > > > | | > > > > > > | vmemmap space (size based on VA_BITS_MAX) | > > > > > > | | > > > > > > | | > > > > > > +-- linear/vmalloc split based on VA_BITS_MIN -- + > > > > > > | | > > > > > > | linear mapping (48 bit addressable region) | > > > > > > | | > > > > > > +------------------------------------------------+ > > > > > > | | > > > > > > | linear mapping (52 bit addressable region) | > > > > > > | | > > > > > > +------ PAGE_OFFSET based on VA_BITS_MAX --------+ > > > > > > > > > > > > Since KASAN is what is preventing this, would it be acceptable for > > > > > > KASAN to only be supported when you use a true 48 bit or a true 52 bit > > > > > > configuration, and disable it for the 48/52 hybrid configuration? 
> > > > > > > > > > > > Just thinking out loud (and in ASCII art :-)) > > > > > > > > > > TBH, if we end up having support for 52-bit kernel VA, I'd be inclined to > > > > > drop the 48/52 configuration altogether. But Catalin's on holiday at the > > > > > moment, and may have a different opinion ;) > > > > > > > > > > > > > But that implies that you cannot have an image that supports 52-bit > > > > kernel VAs but can still boot on hardware that does not implement > > > > support for it. If that is acceptable, then none of this hoop jumping > > > > that Steve is doing in these patches is necessary to begin with, > > > > right? > > > > > > Sorry, I misunderstood what you meant by a "48/52 hybrid configuration". I > > > thought you were referring to the configuration where userspace is 52-bit > > > and the kernel is 48-bit, which is something I think we can drop if we gain > > > support for 52-bit kernel. > > > > > > Now that I understand what you mean, I think disabling KASAN would be fine > > > as long as it's a runtime thing and the kernel continues to work in every > > > other respect. > > > > > > > No, it would be a limitation of the 52-bit config which also supports > > 48-bit-VA-only-h/w that the address space is laid out in such a way > > that there is simply no room for the KASAN shadow region, since it > > would have to live in the 48-bit addressable area, but be big enough > > to cover 52 bits of VA, which is impossible. > > > > For the vmemmap space, we could live with sizing it statically to > > cover a 52-bit VA linear region, but the KASAN shadow region is simply > > too big. > > > > So if KASAN support in that configuration is a requirement, then I > > agree with Steve's approach, but it does imply that quite a number of > > formerly compile-time constants now get turned into runtime variables. > > > > Steve, do you have any idea what the impact of that is? 
> > Hi Guys, > > The KASAN region only really necessitates two things: 1) that we think > about the end address of the region (which is invariant) rather than the > start address; and that 2) we flip the kernel VA space. IIUC both these > changes have a neglible perf impact. > > As for VA_BITS_ACTUAL, we need this in a few places: KVM mapping > support, and the big one phys_to/from_virt. For phys_to/from_virt the > logic is changed s.t. we use a variable lookup for translation but this > is folded into a new variable physvirt_offset (before the patch we used > a single variable read too). > > Again IIUC there should be a minimal perf impact (unless one tries to do > cat /sys/kernel/debug/kernel_page_tables with KASAN enabled - but that > can be optimised later). > > I didn't have the patience for ASCII art ;-), but I have a picture of > what I think it looks like here: > https://s3.amazonaws.com/connect.linaro.org/yvr18/presentations/yvr18-119.pdf > What I've tried to do is have most parts of the kernel VA space > invariant between 48/52 bits. If it's helpful I can type this up into a > document/commit log message? > > For this series I have tried to introduce VA_BITS_MIN in its own patch > and also VA_BITS_ACTUAL into its own patch to make it easier to follow. > OK, perhaps I am just rephrasing what you essentially implemented already, but let me try to explain a bit better what I mean: - we flip the VA space in the way you suggest - we limit the size of the top half of the address space to 47 bits - KASAN region growns downwards from (~0) << 47 - we define PAGE_OFFSET as (~0) << 52, regardless of whether the h/w supports LVA or not - however, we tweak the phys/virt translation so that memory appears in the 48-bit addressable part of the linear region on non-LVA hardware The latter basically means that the KASAN shadow region will intersect the linear region, but whether we map memory or shadow pages there depends on the h/w config at runtime. 
The heart of the matter is probably the different placement of the memory inside the linear region, depending on whether the h/w is LVA capable or not, which is also reflected in your physvirt_offset. I am just trying to figure out why we need VA_BITS_ACTUAL to be a runtime variable.
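
The constants in the list above can be worked out with plain 64-bit arithmetic. The following is only a sketch of that arithmetic, not code from the series: the helper names `kernel_va_start()`, `page_offset()`, `kasan_shadow_end()` and `linear_map_base()` are made up for illustration, and `has_lva` stands in for whatever early-boot check detects 52-bit VA support.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VA_BITS_MIN 48u
#define VA_BITS_MAX 52u

/* Lowest address of a kernel VA range that is va_bits wide: (~0) << va_bits. */
static inline uint64_t kernel_va_start(unsigned int va_bits)
{
	assert(va_bits > 0 && va_bits < 64);
	return ~UINT64_C(0) << va_bits;
}

/* PAGE_OFFSET is fixed at (~0) << 52, whether or not the h/w has LVA. */
static inline uint64_t page_offset(void)
{
	return kernel_va_start(VA_BITS_MAX);
}

/* The KASAN shadow region grows downwards from (~0) << 47. */
static inline uint64_t kasan_shadow_end(void)
{
	return kernel_va_start(VA_BITS_MIN - 1);
}

/*
 * Hypothetical helper: the address at which memory starts to appear in the
 * linear region. On non-LVA hardware it is moved up into the part that a
 * 48-bit VA can reach.
 */
static inline uint64_t linear_map_base(bool has_lva)
{
	return has_lva ? page_offset() : kernel_va_start(VA_BITS_MIN);
}
```

Under these assumptions PAGE_OFFSET stays a constant and only the placement of memory inside the linear region depends on the hardware.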
On Tue, Feb 19, 2019 at 05:18:18PM +0100, Ard Biesheuvel wrote:
Hi Ard,

Apologies for my late reply, I had been staring at this for a while.

>
> OK, perhaps I am just rephrasing what you essentially implemented
> already, but let me try to explain a bit better what I mean:
>
> - we flip the VA space in the way you suggest
> - we limit the size of the top half of the address space to 47 bits
> - KASAN region grows downwards from (~0) << 47
> - we define PAGE_OFFSET as (~0) << 52, regardless of whether the h/w
>   supports LVA or not
> - however, we tweak the phys/virt translation so that memory appears
>   in the 48-bit addressable part of the linear region on non-LVA
>   hardware
>
> The latter basically means that the KASAN shadow region will intersect
> the linear region, but whether we map memory or shadow pages there
> depends on the h/w config at runtime.
>
> The heart of the matter is probably the different placement of the
> memory inside the linear region, depending on whether the h/w is LVA
> capable or not, which is also reflected in your physvirt_offset. I am
> just trying to figure out why we need VA_BITS_ACTUAL to be a runtime
> variable.

Currently the direct linear map between configurations does not overlap,
we have:

FFF00000_00000000 - Direct linear map start (52-bit)
FFF80000_00000000 - Direct linear map end (52-bit)
FFFF0000_00000000 - Direct linear map start (48-bit)
FFFF8000_00000000 - Direct linear map end (48-bit)

We *can* make PAGE_OFFSET a constant for both 48/52 bit VA_BITS, if we
offset it. vmemmap can then be adjusted on early boot to ensure that
everything points to the right place. However we will get overlap for
52-bit configurations between KASAN and the direct linear map.

The question is: are we okay with quite a large overlap?

The KASAN region begins on 0xFFFDA000_00000000 for 52-bit. If we wish to
employ a "full" 47-bit direct linear map on 48-bit systems we need a
PAGE_OFFSET of 0xFFF78000_00000000 in order to make the direct linear
map end addresses "match up" between 48/52 bit configurations.

This doesn't leave us with a lot of room for 52-bit configurations
though, if KASAN is enabled.

Cheers,
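
The four linear-map addresses and the offset PAGE_OFFSET quoted above follow from one rule: in the flipped layout the linear map starts at (~0) << VA_BITS and covers half of the kernel VA range. The sketch below reproduces that arithmetic; the function names are illustrative only, and reading "match up" as "a 47-bit map that ends where the 52-bit map ends" is an interpretation of the message rather than something it spells out.

```c
#include <assert.h>
#include <stdint.h>

/* Start of the direct linear map for a given VA size: (~0) << va_bits. */
static inline uint64_t linear_map_start(unsigned int va_bits)
{
	assert(va_bits > 1 && va_bits < 64);
	return ~UINT64_C(0) << va_bits;
}

/* End of the direct linear map: it spans half of the kernel VA range. */
static inline uint64_t linear_map_end(unsigned int va_bits)
{
	assert(va_bits > 1 && va_bits < 64);
	return ~UINT64_C(0) << (va_bits - 1);
}

/*
 * PAGE_OFFSET for which a "full" 47-bit linear map ends at the same
 * address as the 52-bit linear map (assumed meaning of "match up").
 */
static inline uint64_t matched_page_offset(void)
{
	return linear_map_end(52) - (UINT64_C(1) << 47);
}
```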
On Tue, 26 Feb 2019 at 18:30, Steve Capper <Steve.Capper@arm.com> wrote:
>
> Currently the direct linear map between configurations does not overlap,
> we have:
>
> FFF00000_00000000 - Direct linear map start (52-bit)
> FFF80000_00000000 - Direct linear map end (52-bit)
> FFFF0000_00000000 - Direct linear map start (48-bit)
> FFFF8000_00000000 - Direct linear map end (48-bit)
>
> We *can* make PAGE_OFFSET a constant for both 48/52 bit VA_BITS, if we
> offset it. vmemmap can then be adjusted on early boot to ensure that
> everything points to the right place. However we will get overlap for
> 52-bit configurations between KASAN and the direct linear map.
>
> The question is: are we okay with quite a large overlap?
>
> The KASAN region begins on 0xFFFDA000_00000000 for 52-bit. If we wish to
> employ a "full" 47-bit direct linear map on 48-bit systems we need a
> PAGE_OFFSET of 0xFFF78000_00000000 in order to make the direct linear
> map end addresses "match up" between 48/52 bit configurations.
>
> This doesn't leave us with a lot of room for 52-bit configurations
> though, if KASAN is enabled.
>

OK, so with actual numbers, what I had in mind was

FFF00000_00000000 start of 52-bit addressable linear region | PAGE_OFFSET
FFFD8000_00000000 start of KASAN shadow region | KASAN_SHADOW_OFFSET
FFFF0000_00000000 start of 48-bit addressable linear region
FFFF6000_00000000 start of used KASAN shadow region (48-bit VA)
                  (KASAN_SHADOW_OFFSET + (F0000_00000000 >> 3))
FFFF8000_00000000 start of vmemmap area - end of KASAN shadow region
FFFF8200_00000000 end of vmemmap area - start of bpf/module/etc area

The trick is that the full (52 - 3)-bit KASAN shadow space overlaps
with the 48-bit linear region, but since you don't need KASAN shadow
pages for memory that does not exist, the region FFFF0000_00000000 -
FFFF6000_00000000 can be used for mapping the memory in case the h/w
is 48-bit only.

So in this case, PAGE_OFFSET and KASAN_SHADOW_OFFSET remain compile
time constants, and as long as we don't attempt to map anything
outside of the 48-bit addressable area on h/w that does not support
it, the fact that those quantities are outside the 48-bit range does
not really matter.
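
The KASAN figures in the table above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: it uses the table's convention that the shadow of an address is the start of the shadow region plus the address's offset from PAGE_OFFSET shifted right by 3, which may not match how KASAN_SHADOW_OFFSET is defined in the kernel tree; `shadow_of()` and the KASAN_SHADOW_* macros here are defined locally for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET              UINT64_C(0xFFF0000000000000) /* (~0) << 52 */
#define KASAN_SHADOW_SCALE_SHIFT 3u  /* one shadow byte covers 8 bytes */
#define KASAN_SHADOW_END         UINT64_C(0xFFFF800000000000) /* (~0) << 47 */
#define KASAN_SHADOW_SIZE        (UINT64_C(1) << (52 - KASAN_SHADOW_SCALE_SHIFT))
#define KASAN_SHADOW_START       (KASAN_SHADOW_END - KASAN_SHADOW_SIZE)

/* Shadow address of a kernel address, relative to PAGE_OFFSET (assumed convention). */
static inline uint64_t shadow_of(uint64_t addr)
{
	assert(addr >= PAGE_OFFSET);
	return KASAN_SHADOW_START +
	       ((addr - PAGE_OFFSET) >> KASAN_SHADOW_SCALE_SHIFT);
}
```

With these definitions the shadow of the first 48-bit addressable linear address lands at FFFF6000_00000000, so everything below it in the shadow region only describes addresses that 48-bit hardware cannot use.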
On Tue, Feb 26, 2019 at 09:17:49PM +0100, Ard Biesheuvel wrote:
>
> OK, so with actual numbers, what I had in mind was
>
> FFF00000_00000000 start of 52-bit addressable linear region | PAGE_OFFSET
> FFFD8000_00000000 start of KASAN shadow region | KASAN_SHADOW_OFFSET
> FFFF0000_00000000 start of 48-bit addressable linear region
> FFFF6000_00000000 start of used KASAN shadow region (48-bit VA)
>                   (KASAN_SHADOW_OFFSET + (F0000_00000000 >> 3))
> FFFF8000_00000000 start of vmemmap area - end of KASAN shadow region
> FFFF8200_00000000 end of vmemmap area - start of bpf/module/etc area
>
> The trick is that the full (52 - 3)-bit KASAN shadow space overlaps
> with the 48-bit linear region, but since you don't need KASAN shadow
> pages for memory that does not exist, the region FFFF0000_00000000 -
> FFFF6000_00000000 can be used for mapping the memory in case the h/w
> is 48-bit only.
>
> So in this case, PAGE_OFFSET and KASAN_SHADOW_OFFSET remain compile
> time constants, and as long as we don't attempt to map anything
> outside of the 48-bit addressable area on h/w that does not support
> it, the fact that those quantities are outside the 48-bit range does
> not really matter.

Thanks Ard,

I'll elaborate more on what I'm worrying about :-).

The 48/52 bit linear regions above do not overlap and this creates the
following issue. To go from a struct page * to a linear address we do the
following:

lva = (page - VMEMMAP_START) * PAGE_SIZE / sizeof(struct page) + PAGE_OFFSET

(Before my series) all the constants are fixed at compile time and thus
translation is very quick. My understanding is that you would like
PAGE_OFFSET to be constant to preserve the optimised nature of this
transform? (if not, please shout :-) )

The problem is that a 52-bit PAGE_OFFSET = 0xFFF00000_00000000 will never
be able to give us an lva within a 48-bit addressable range. At best we
will get an lva of FFF80000_00000000.

We can get around this by adding a variable to the above transform, but
this is essentially what my series does by making PAGE_OFFSET variable.

Cheers,
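
The limit described above can be reproduced numerically. The sketch below makes several assumptions that are not stated in the message: 64K pages, a 64-byte struct page, `page` treated as a byte address, and the vmemmap bounds taken from the table earlier in the thread. `page_to_lva_adjusted()` and its `linear_adjust` parameter are hypothetical and only illustrate "adding a variable to the transform".

```c
#include <assert.h>
#include <stdint.h>

static const uint64_t page_size        = UINT64_C(1) << 16; /* assumed 64K pages */
static const uint64_t struct_page_size = 64;  /* assumed sizeof(struct page) */
static const uint64_t vmemmap_start    = UINT64_C(0xFFFF800000000000);
static const uint64_t vmemmap_end      = UINT64_C(0xFFFF820000000000);
static const uint64_t page_offset      = UINT64_C(0xFFF0000000000000);

/* The transform with compile-time constants only. */
static inline uint64_t page_to_lva(uint64_t page)
{
	assert(page >= vmemmap_start && page <= vmemmap_end);
	return (page - vmemmap_start) * page_size / struct_page_size
	       + page_offset;
}

/* Same transform with one runtime term added (hypothetical linear_adjust). */
static inline uint64_t page_to_lva_adjusted(uint64_t page,
					    uint64_t linear_adjust)
{
	return page_to_lva(page) + linear_adjust;
}
```

With the constant-only version the highest reachable result is FFF80000_00000000, which is below the 48-bit addressable range starting at FFFF0000_00000000. Reaching that range takes the extra runtime term, which amounts to the same thing as a variable PAGE_OFFSET.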
On Thu, 28 Feb 2019 at 11:36, Steve Capper <Steve.Capper@arm.com> wrote:
>
> Thanks Ard,
> I'll elaborate more on what I'm worrying about :-).
>
> The 48/52 bit linear regions above do not overlap and this creates the
> following issue.
>

OK, I see what you mean (I think). In my proposal, the linear regions
*do* overlap.

In my example, the vmemmap region is only sized to cover 51 bits of
linear region, but this is not sufficient, since the 52-bit linear
region is actually bigger than that.

So based on a linear region that goes from

FFF0_0000_0000_0000 ... FFFF_8000_0000_0000

we would end up with a vmemmap region

FFFF_8000_0000_0000 ... FFFF_83E0_0000_0000

covering the entire combined linear region. This is a fair chunk of
the vmalloc space for 48-bit configuration, but I don't think that is
anything to worry about.

> To go from a struct page * to a linear address we do the following:
> lva = (page - VMEMMAP_START) * PAGE_SIZE / sizeof(struct page) + PAGE_OFFSET
>

OK, so given the above correction, we can take

VMEMMAP_START := FFFF_8000_0000_0000
PAGE_OFFSET := FFF0_0000_0000_0000

and everything still adds up afaict, and struct pages in the 48-bit VA
region are covered from FFFF_83C0_0000_0000 and up.

> (Before my series) all the constants are fixed at compile time and thus
> translation is very quick. My understanding is that you would like
> PAGE_OFFSET to be constant to preserve the optimised nature of this
> transform? (if not, please shout :-) )
>

Yes, the main idea is to have compile time constants for PAGE_OFFSET,
VA_BITS, etc

> The problem is that a 52-bit PAGE_OFFSET = 0xFFF00000_00000000 will
> never be able to give us an lva within a 48-bit addressable range. At
> best we will get an lva of FFF80000_00000000.
>

You are assuming that we have to split the address space down the
middle, but I don't think that is necessary at all.

> We can get around this by adding a variable to the above transform, but
> this is essentially what my series does by making PAGE_OFFSET variable.
>
> Cheers,
> --
> Steve
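
The struct page to linear address transform discussed above can be
checked numerically. The following is only a rough sketch, not kernel
code: it assumes 64K pages and a 64-byte struct page (the combination
under which the addresses quoted in this message work out), and the
names page_to_lva(), lva_to_page() and STRUCT_PAGE_SHIFT are
hypothetical helpers introduced for illustration. Addresses are handled
as plain 64-bit integers rather than pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration only: 64K pages, 64-byte struct page. */
#define PAGE_SHIFT        16
#define STRUCT_PAGE_SHIFT 6   /* log2(sizeof(struct page)), assumed */

/* Compile-time constants as proposed in the message above. */
#define VMEMMAP_START 0xFFFF800000000000ULL
#define PAGE_OFFSET   0xFFF0000000000000ULL

/* Byte address of a struct page -> linear address of the page it describes. */
static inline uint64_t page_to_lva(uint64_t page)
{
    assert(page >= VMEMMAP_START);  /* guard against unsigned underflow */
    uint64_t index = (page - VMEMMAP_START) >> STRUCT_PAGE_SHIFT;
    return (index << PAGE_SHIFT) + PAGE_OFFSET;
}

/* Inverse: linear address -> byte address of its struct page. */
static inline uint64_t lva_to_page(uint64_t lva)
{
    assert(lva >= PAGE_OFFSET);     /* guard against unsigned underflow */
    uint64_t index = (lva - PAGE_OFFSET) >> PAGE_SHIFT;
    return VMEMMAP_START + (index << STRUCT_PAGE_SHIFT);
}
```

With these assumptions, the struct page at FFFF_83C0_0000_0000 maps to
the start of the 48-bit addressable linear region at
FFFF_0000_0000_0000, and the end of the linear region at
FFFF_8000_0000_0000 corresponds to the end of vmemmap at
FFFF_83E0_0000_0000, with both PAGE_OFFSET and VMEMMAP_START kept as
constants.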
On Thu, Feb 28, 2019 at 12:22:09PM +0100, Ard Biesheuvel wrote:
> On Thu, 28 Feb 2019 at 11:36, Steve Capper <Steve.Capper@arm.com> wrote:
> > Thanks Ard,
> > I'll elaborate more on what I'm worrying about :-).
> >
> > The 48/52 bit linear regions above do not overlap and this creates the
> > following issue.
> >
>
> OK, I see what you mean (I think). In my proposal, the linear regions
> *do* overlap.
>
> In my example, the vmemmap region is only sized to cover 51 bits of
> linear region, but this is not sufficient, since the 52-bit linear
> region is actually bigger than that.
>

Ahhhh, okay, nice (sorry I didn't parse your numbers correctly before).

> So based on a linear region that goes from
>
> FFF0_0000_0000_0000 ... FFFF_8000_0000_0000
>
> we would end up with a vmemmap region
>
> FFFF_8000_0000_0000 ... FFFF_83E0_0000_0000
>
> covering the entire combined linear region. This is a fair chunk of
> the vmalloc space for 48-bit configuration, but I don't think that is
> anything to worry about.
>
> > To go from a struct page * to a linear address we do the following:
> > lva = (page - VMEMMAP_START) * PAGE_SIZE / sizeof(struct page) + PAGE_OFFSET
> >
>
> OK, so given the above correction, we can take
>
> VMEMMAP_START := FFFF_8000_0000_0000
> PAGE_OFFSET := FFF0_0000_0000_0000
>
> and everything still adds up afaict, and struct pages in the 48-bit VA
> region are covered from FFFF_83C0_0000_0000 and up.
>
> > (Before my series) all the constants are fixed at compile time and thus
> > translation is very quick. My understanding is that you would like
> > PAGE_OFFSET to be constant to preserve the optimised nature of this
> > transform? (if not, please shout :-) )
> >
>
> Yes, the main idea is to have compile time constants for PAGE_OFFSET,
> VA_BITS, etc
>
> > The problem is that a 52-bit PAGE_OFFSET = 0xFFF00000_00000000 will
> > never be able to give us an lva within a 48-bit addressable range. At
> > best we will get an lva of FFF80000_00000000.
> >
>
> You are assuming that we have to split the address space down the
> middle, but I don't think that is necessary at all.
>

Agreed, some minor tweaks are needed to some helper functions to allow
for this.

Many thanks Ard, I'll give this a go.

Cheers,
On Thu, Feb 28, 2019 at 12:22:09PM +0100, Ard Biesheuvel wrote:
> On Thu, 28 Feb 2019 at 11:36, Steve Capper <Steve.Capper@arm.com> wrote:
> > The 48/52 bit linear regions above do not overlap and this creates the
> > following issue.
>
> OK, I see what you mean (I think). In my proposal, the linear regions
> *do* overlap.
>
> In my example, the vmemmap region is only sized to cover 51 bits of
> linear region, but this is not sufficient, since the 52-bit linear
> region is actually bigger than that.
>
> So based on a linear region that goes from
>
> FFF0_0000_0000_0000 ... FFFF_8000_0000_0000
>
> we would end up with a vmemmap region
>
> FFFF_8000_0000_0000 ... FFFF_83E0_0000_0000
>
> covering the entire combined linear region. This is a fair chunk of
> the vmalloc space for 48-bit configuration, but I don't think that is
> anything to worry about.

So that's about 42-bit for vmemmap (my calculations were 2^(52-16+6),
assuming a 64 byte sizeof(page)), so 1/64 of the 48-bit va space. I
don't think that's a problem.

> > To go from a struct page * to a linear address we do the following:
> > lva = (page - VMEMMAP_START) * PAGE_SIZE / sizeof(struct page) + PAGE_OFFSET
>
> OK, so given the above correction, we can take
>
> VMEMMAP_START := FFFF_8000_0000_0000
> PAGE_OFFSET := FFF0_0000_0000_0000
>
> and everything still adds up afaict, and struct pages in the 48-bit VA
> region are covered from FFFF_83C0_0000_0000 and up.
>
> > (Before my series) all the constants are fixed at compile time and thus
> > translation is very quick. My understanding is that you would like
> > PAGE_OFFSET to be constant to preserve the optimised nature of this
> > transform? (if not, please shout :-) )
>
> Yes, the main idea is to have compile time constants for PAGE_OFFSET,
> VA_BITS, etc

If we can have all of them constant, it would be great (we managed from
the early versions of the patch to have VA_BITS constant).

I have to figure out the KASan story with this (still parsing this
thread). Can we still get KASan and single-image in the 52-bit VA
configuration? Or, at least, not have the kernel fall apart if a 52-bit
image is booted on 48-bit hw with KASan enabled, whether KASan still
works or not (to get the chance to print some warning)?
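
The vmemmap sizing estimate above (2^(52-16+6)) can be reproduced with
a few lines of C. This is only an illustrative sketch under the same
assumptions as the message (64K pages, so a page shift of 16, and a
64-byte struct page); vmemmap_bytes() and STRUCT_PAGE_SHIFT are
hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT        16  /* 64K pages, assumed */
#define STRUCT_PAGE_SHIFT 6   /* 64-byte struct page, assumed as in the message */

/* Bytes of vmemmap needed to describe linear_bytes of linear mapping. */
static inline uint64_t vmemmap_bytes(uint64_t linear_bytes)
{
    /* The linear region is expected to be a whole number of pages. */
    assert((linear_bytes & ((1ULL << PAGE_SHIFT) - 1)) == 0);
    return (linear_bytes >> PAGE_SHIFT) << STRUCT_PAGE_SHIFT;
}
```

For a full 2^52-byte region this gives 2^42 bytes, i.e. 1/64 of a
2^48-byte VA space. For the linear region actually proposed
(FFF0_0000_0000_0000 ... FFFF_8000_0000_0000) it gives 0x3E0_0000_0000
bytes, slightly under 2^42, which matches the vmemmap end address of
FFFF_83E0_0000_0000 quoted above.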
On Mon, 25 Mar 2019 at 19:38, Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Thu, Feb 28, 2019 at 12:22:09PM +0100, Ard Biesheuvel wrote:
> > On Thu, 28 Feb 2019 at 11:36, Steve Capper <Steve.Capper@arm.com> wrote:
> > > The 48/52 bit linear regions above do not overlap and this creates the
> > > following issue.
> >
> > OK, I see what you mean (I think). In my proposal, the linear regions
> > *do* overlap.
> >
> > In my example, the vmemmap region is only sized to cover 51 bits of
> > linear region, but this is not sufficient, since the 52-bit linear
> > region is actually bigger than that.
> >
> > So based on a linear region that goes from
> >
> > FFF0_0000_0000_0000 ... FFFF_8000_0000_0000
> >
> > we would end up with a vmemmap region
> >
> > FFFF_8000_0000_0000 ... FFFF_83E0_0000_0000
> >
> > covering the entire combined linear region. This is a fair chunk of
> > the vmalloc space for 48-bit configuration, but I don't think that is
> > anything to worry about.
>
> So that's about 42-bit for vmemmap (my calculations were 2^(52-16+6),
> assuming a 64 byte sizeof(page)), so 1/64 of the 48-bit va space. I
> don't think that's a problem.
>
> > > To go from a struct page * to a linear address we do the following:
> > > lva = (page - VMEMMAP_START) * PAGE_SIZE / sizeof(struct page) + PAGE_OFFSET
> >
> > OK, so given the above correction, we can take
> >
> > VMEMMAP_START := FFFF_8000_0000_0000
> > PAGE_OFFSET := FFF0_0000_0000_0000
> >
> > and everything still adds up afaict, and struct pages in the 48-bit VA
> > region are covered from FFFF_83C0_0000_0000 and up.
> >
> > > (Before my series) all the constants are fixed at compile time and thus
> > > translation is very quick. My understanding is that you would like
> > > PAGE_OFFSET to be constant to preserve the optimised nature of this
> > > transform? (if not, please shout :-) )
> >
> > Yes, the main idea is to have compile time constants for PAGE_OFFSET,
> > VA_BITS, etc
>
> If we can have all of the constant, it would be great (we managed from
> the early versions of the patch to have VA_BITS constant).
>
> I have to figure out the KASan story with this (still parsing this
> thread). Can we still get KASan and single-image in the 52-bit VA
> configuration? Or, at least, not have the kernel fall apart if a 52-bit
> image is booted on 48-bit hw with KASan enabled, whether KASan still
> works or not (to get the chance to print some warning)?
>

It should work in both cases, but there will be a part of the linear
region that is used as KASAN shadow on 52-bit hardware, and as a
linear mapping on 48-bit hardware. I'm pretty sure this shouldn't be a
problem, but someone has to double check (and perhaps some boundary
definition macros for KASAN need to be turned into runtime variables)
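For anyone who wants to re-check the numbers in the exchange above, here is a small stand-alone C sketch of the arithmetic. It assumes 64K pages (PAGE_SHIFT == 16) and a 64-byte struct page, as in Catalin's 2^(52-16+6) estimate, and it uses the VMEMMAP_START and PAGE_OFFSET values Ard proposes. The helper names (lva_to_page, page_to_lva, vmemmap_bytes, check_thread_figures) are made up for illustration; they are not the kernel's macros, and struct page pointers are treated as plain byte addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions taken from the thread, not from any kernel header. */
#define PAGE_SHIFT        16  /* 64K pages: the "16" in 2^(52-16+6) */
#define STRUCT_PAGE_SHIFT 6   /* assumed sizeof(struct page) == 64 */
#define PAGE_OFFSET       UINT64_C(0xFFF0000000000000) /* start of 52-bit linear map */
#define VMEMMAP_START     UINT64_C(0xFFFF800000000000) /* also end of the linear map */
#define VA48_LINEAR_START UINT64_C(0xFFFF000000000000) /* lowest 48-bit kernel VA */

/* Byte address of the struct page describing a linear-map address. */
static inline uint64_t lva_to_page(uint64_t lva)
{
    return VMEMMAP_START +
           (((lva - PAGE_OFFSET) >> PAGE_SHIFT) << STRUCT_PAGE_SHIFT);
}

/*
 * Inverse transform, equivalent for aligned inputs to
 * lva = (page - VMEMMAP_START) * PAGE_SIZE / sizeof(struct page) + PAGE_OFFSET
 */
static inline uint64_t page_to_lva(uint64_t page)
{
    return PAGE_OFFSET +
           (((page - VMEMMAP_START) >> STRUCT_PAGE_SHIFT) << PAGE_SHIFT);
}

/* vmemmap bytes needed to describe a linear region of 2^va_bits bytes. */
static inline uint64_t vmemmap_bytes(unsigned int va_bits)
{
    return UINT64_C(1) << (va_bits - PAGE_SHIFT + STRUCT_PAGE_SHIFT);
}

/* Re-derive the figures quoted in the thread; asserts fire on a mismatch. */
static inline void check_thread_figures(void)
{
    /* The linear map ends at VMEMMAP_START, so vmemmap ends here. */
    assert(lva_to_page(VMEMMAP_START) == UINT64_C(0xFFFF83E000000000));
    /* struct pages for the 48-bit addressable part start here. */
    assert(lva_to_page(VA48_LINEAR_START) == UINT64_C(0xFFFF83C000000000));
    /* Catalin's upper bound: about 42 bits of vmemmap. */
    assert(vmemmap_bytes(52) == (UINT64_C(1) << 42));
    /* 2^42 is 1/64 of a 48-bit VA space. */
    assert((UINT64_C(1) << 48) / vmemmap_bytes(52) == 64);
}
```

Since every quantity here is a compile-time constant, both transforms reduce to a subtract, two shifts and an add, which is the property the constant PAGE_OFFSET proposal is trying to preserve.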
Hi Steve,

On 02/18/2019 10:32 PM, Steve Capper wrote:
> This patch series adds support for 52-bit kernel VAs using some of the
> machinery already introduced by the 52-bit userspace VA code in 5.0.
>
> As 52-bit virtual address support is an optional hardware feature,
> software support for 52-bit kernel VAs needs to be deduced at early boot
> time. If HW support is not available, the kernel falls back to 48-bit.
>
> A significant proportion of this series focuses on "de-constifying"
> VA_BITS related constants.
>
> In order to allow for a KASAN shadow that changes size at boot time, one
> must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the
> start address. Also, it is highly desirable to maintain the same
> function addresses in the kernel .text between VA sizes. Both of these
> requirements necessitate us to flip the kernel address space halves s.t.
> the direct linear map occupies the lower addresses.
>
> One obvious omission is 52-bit kernel VA + 48-bit userspace VA which I
> can add with some more #ifdef'ery if needed.

Thanks for the patchset.

I did some work on the user-space side to see how user-space tools like
makedumpfile and kexec-tools are affected by these changes. I see that Dave
Anderson (in Cc) also did some work on crash-utility side [0] to have the
basic framework in place to have the user-space tools work with the inverted
memory map.

I have a couple of concerns regarding:

1. VA_BITS_ACTUAL, and how user-space gets to know its value.

2. Overall bits that we need to be aware in user-space now to address the
following combinations of address space (phew ..):

a) 48-bit Kernel VA + 48-bit User-space VA + 48-bit PA
b) 48-bit Kernel VA + 48-bit User-space VA + 52-bit PA
c) 48-bit Kernel VA + 52-bit User-space VA + 52-bit PA
d) 52-bit Kernel VA + 52-bit User-space VA + 52-bit PA
e) 52-bit Kernel VA + 48-bit User-space VA + 52-bit PA (is this even
used-somewhere? Not sure but James [in Cc] had some queries on this [1].
Personally, I am not aware of any users of this combination).

I have added detailed comments/concerns in individual patches in mails to
follow regarding the above points.

[0]. https://github.com/crash-utility/crash/commit/b0b3ef2eda543413762b32710b8a63dd9ed55de5
[1]. http://lists.infradead.org/pipermail/kexec/2019-April/022729.html

Thanks,
Bhupesh
On Wed, Apr 03, 2019 at 01:39:36PM +0530, Bhupesh Sharma wrote:
> Hi Steve,
>
> On 02/18/2019 10:32 PM, Steve Capper wrote:
> > This patch series adds support for 52-bit kernel VAs using some of the
> > machinery already introduced by the 52-bit userspace VA code in 5.0.
> >
> > As 52-bit virtual address support is an optional hardware feature,
> > software support for 52-bit kernel VAs needs to be deduced at early boot
> > time. If HW support is not available, the kernel falls back to 48-bit.
> >
> > A significant proportion of this series focuses on "de-constifying"
> > VA_BITS related constants.
> >
> > In order to allow for a KASAN shadow that changes size at boot time, one
> > must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the
> > start address. Also, it is highly desirable to maintain the same
> > function addresses in the kernel .text between VA sizes. Both of these
> > requirements necessitate us to flip the kernel address space halves s.t.
> > the direct linear map occupies the lower addresses.
> >
> > One obvious omission is 52-bit kernel VA + 48-bit userspace VA which I
> > can add with some more #ifdef'ery if needed.
>
> Thanks for the patchset.
>
> I did some work on the user-space side to see how user-space tools like
> makedumpfile and kexec-tools are affected by these changes. I see that Dave
> Anderson (in Cc) also did some work on crash-utility side [0] to have the
> basic framework in place to have the user-space tools work with the inverted
> memory map.
>
> I have a couple of concerns regarding:
>
> 1. VA_BITS_ACTUAL, and how user-space gets to know its value.
>
> 2. Overall bits that we need to be aware in user-space now to address the
> following combinations of address space (phew ..):
>
> a) 48-bit Kernel VA + 48-bit User-space VA + 48-bit PA
> b) 48-bit Kernel VA + 48-bit User-space VA + 52-bit PA
> c) 48-bit Kernel VA + 52-bit User-space VA + 52-bit PA
> d) 52-bit Kernel VA + 52-bit User-space VA + 52-bit PA
> e) 52-bit Kernel VA + 48-bit User-space VA + 52-bit PA (is this even
> used-somewhere? Not sure but James [in Cc] had some queries on this [1].
> Personally, I am not aware of any users of this combination).
>
> I have added detailed comments/concerns in individual patches in mails to
> follow regarding the above points.
>
> [0]. https://github.com/crash-utility/crash/commit/b0b3ef2eda543413762b32710b8a63dd9ed55de5
> [1]. http://lists.infradead.org/pipermail/kexec/2019-April/022729.html
>
>

Thanks Bhupesh,

Ard,
As an update to this series, I have got the constant PAGE_OFFSET working
(basically got back to it after a while away and also caught a silly
typo that I made) and am tidying things up. I am also figuring out
whether or not I can keep the vmemmap in the same place and just expand
it.

I don't think we'll see a 52-bit kernel VA and 48-bit user VA unless
anyone wants it?

Cheers,