Message ID | 20250224225246.3712295-1-jeffxu@google.com (mailing list archive) |
---|---|
Headers | show |
Series | mseal system mappings | expand |
On Mon, Feb 24, 2025 at 10:52 PM <jeffxu@chromium.org> wrote: > > From: Jeff Xu <jeffxu@chromium.org> > > This is V7 version, addressing comments from V6, without code logic > change. > > -------------------------------------------------- > > History: > V7: > - Remove cover letter from the first patch (Liam R. Howlett) > - Change macro name to VM_SEALED_SYSMAP (Liam R. Howlett) > - logging and fclose() in selftest (Liam R. Howlett) Jeff, please don't send out new versions of the patchset that quickly. We were having a discussion on v5, you sent v6 today (acceptable) and now v7 (while changing barely anything of note). It's hard to track things this way, and you're just flooding a bunch of mailboxes. Thanks, Pedro
On Mon, Feb 24, 2025 at 3:03 PM Pedro Falcato <pedro.falcato@gmail.com> wrote: > > On Mon, Feb 24, 2025 at 10:52 PM <jeffxu@chromium.org> wrote: > > > > From: Jeff Xu <jeffxu@chromium.org> > > > > This is V7 version, addressing comments from V6, without code logic > > change. > > > > -------------------------------------------------- > > > > History: > > V7: > > - Remove cover letter from the first patch (Liam R. Howlett) > > - Change macro name to VM_SEALED_SYSMAP (Liam R. Howlett) > > - logging and fclose() in selftest (Liam R. Howlett) > > Jeff, please don't send out new versions of the patchset that quickly. > We were having a discussion on v5, you sent v6 today (acceptable) and > now v7 (while changing barely anything of note). It's hard to track > things this way, and you're just flooding a bunch of mailboxes. > Ah, I apologize. Sure. -Jeff > Thanks, > Pedro
On Mon, Feb 24, 2025 at 03:07:03PM -0800, Jeff Xu wrote: > On Mon, Feb 24, 2025 at 3:03 PM Pedro Falcato <pedro.falcato@gmail.com> wrote: > > > > On Mon, Feb 24, 2025 at 10:52 PM <jeffxu@chromium.org> wrote: > > > > > > From: Jeff Xu <jeffxu@chromium.org> > > > > > > This is V7 version, addressing comments from V6, without code logic > > > change. > > > > > > -------------------------------------------------- > > > > > > History: > > > V7: > > > - Remove cover letter from the first patch (Liam R. Howlett) > > > - Change macro name to VM_SEALED_SYSMAP (Liam R. Howlett) > > > - logging and fclose() in selftest (Liam R. Howlett) > > > > Jeff, please don't send out new versions of the patchset that quickly. > > We were having a discussion on v5, you sent v6 today (acceptable) and > > now v7 (while changing barely anything of note). It's hard to track > > things this way, and you're just flooding a bunch of mailboxes. > > > Ah, I apologize. Sure. > > -Jeff Thanks, I am behind the eight ball on this in a post-viral fatigue haze, so I'd especially appreciate relaxing a bit on series pace here haha. I mean being reasonable I don't want you to feel you're needing to be told 'ok send vX' now, but I'd simply suggest - wait until things 'settle down' a bit on comments + everything's addressed and the 'usual suspects' have commented, then this is a good time to send next version. I realise I'm maybe not well placed to say this as I've previouisly been famous for resending WAY too quick haha, but it's something I've worked on myself so I guess, I relate... > > > > Thanks, > > Pedro Thanks!
BTW can we please drop the 'mseal, system mappings' prefixes on this series, it's really weird and makes it really hard for me to actually read the individual summary lines for each commit. 'mseal:' will do. I mean really you could argue it's 'mm: mseal: ...' but I'm not quite _that_ pedantic :)
Jeff - looking further in this series, I asked for a couple things for this series which you've not provided: 1. Some assurance based on code that the kernel-side code doesn't rely on VDSO/VVAR etc. mapping. I gave up waiting for this and went and checked myself, it looks fine for arm64, x86-64. But it might have been nice had you done it :) Apologies if you had and I just missed it. 2. Tests - could you please add some tests to assert that mremap() fails for VDSO for instance? You've edited an existing test for VDSO in x86 to skip the test should this be enabled, but this is not the same as a self test. And obviously doesn't cover arm64. This should be relatively strightforward right? You already have code for finding out whether VDSO is msealed, so just use that to see if you skip, then attempt mremap(), mmap() over etc. + assert it fails. Ideally these tests would cover all the cases you've changed. Please do try to ensure you address requests from maintainers to save on iterations, while I get the desire to shoot out new versions (I've been guilty of this in the past), it makes life so much easier and will reduce version count if you try to get everything done in a one go. Having said the above, we're really not far off this being a viable series. You just need to address comments here (+ in v6...) + provide some tests. Thanks! On Mon, Feb 24, 2025 at 10:52:39PM +0000, jeffxu@chromium.org wrote: > From: Jeff Xu <jeffxu@chromium.org> > > This is V7 version, addressing comments from V6, without code logic > change. > > -------------------------------------------------- > > History: > V7: > - Remove cover letter from the first patch (Liam R. Howlett) > - Change macro name to VM_SEALED_SYSMAP (Liam R. Howlett) > - logging and fclose() in selftest (Liam R. Howlett) > > V6: > https://lore.kernel.org/all/20250224174513.3600914-1-jeffxu@google.com/ > Nitty, but it's really useful to actually include the history for all of these. > V5 > https://lore.kernel.org/all/20250212032155.1276806-1-jeffxu@google.com/ > > V4: > https://lore.kernel.org/all/20241125202021.3684919-1-jeffxu@google.com/ > > V3: > https://lore.kernel.org/all/20241113191602.3541870-1-jeffxu@google.com/ > > V2: > https://lore.kernel.org/all/20241014215022.68530-1-jeffxu@google.com/ > > V1: > https://lore.kernel.org/all/20241004163155.3493183-1-jeffxu@google.com/ > > -------------------------------------------------- > As discussed during mseal() upstream process [1], mseal() protects > the VMAs of a given virtual memory range against modifications, such > as the read/write (RW) and no-execute (NX) bits. For complete > descriptions of memory sealing, please see mseal.rst [2]. > > The mseal() is useful to mitigate memory corruption issues where a > corrupted pointer is passed to a memory management system. For > example, such an attacker primitive can break control-flow integrity > guarantees since read-only memory that is supposed to be trusted can > become writable or .text pages can get remapped. > > The system mappings are readonly only, memory sealing can protect > them from ever changing to writable or unmmap/remapped as different > attributes. > > System mappings such as vdso, vvar, and sigpage (arm), vectors (arm) > are created by the kernel during program initialization, and could > be sealed after creation. > > Unlike the aforementioned mappings, the uprobe mapping is not > established during program startup. However, its lifetime is the same > as the process's lifetime [3]. It could be sealed from creation. > > The vsyscall on x86-64 uses a special address (0xffffffffff600000), > which is outside the mm managed range. This means mprotect, munmap, and > mremap won't work on the vsyscall. Since sealing doesn't enhance > the vsyscall's security, it is skipped in this patch. If we ever seal > the vsyscall, it is probably only for decorative purpose, i.e. showing > the 'sl' flag in the /proc/pid/smaps. For this patch, it is ignored. > > It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may > alter the system mappings during restore operations. UML(User Mode Linux) > and gVisor, rr are also known to change the vdso/vvar mappings. > Consequently, this feature cannot be universally enabled across all > systems. As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default. > > To support mseal of system mappings, architectures must define > CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS and update their special mappings > calls to pass mseal flag. Additionally, architectures must confirm they > do not unmap/remap system mappings during the process lifetime. > > In this version, we've improved the handling of system mapping sealing from > previous versions, instead of modifying the _install_special_mapping > function itself, which would affect all architectures, we now call > _install_special_mapping with a sealing flag only within the specific > architecture that requires it. This targeted approach offers two key > advantages: 1) It limits the code change's impact to the necessary > architectures, and 2) It aligns with the software architecture by keeping > the core memory management within the mm layer, while delegating the > decision of sealing system mappings to the individual architecture, which > is particularly relevant since 32-bit architectures never require sealing. > > Prior to this patch series, we explored sealing special mappings from > userspace using glibc's dynamic linker. This approach revealed several > issues: > - The PT_LOAD header may report an incorrect length for vdso, (smaller > than its actual size). The dynamic linker, which relies on PT_LOAD > information to determine mapping size, would then split and partially > seal the vdso mapping. Since each architecture has its own vdso/vvar > code, fixing this in the kernel would require going through each > archiecture. Our initial goal was to enable sealing readonly mappings, > e.g. .text, across all architectures, sealing vdso from kernel since > creation appears to be simpler than sealing vdso at glibc. > - The [vvar] mapping header only contains address information, not length > information. Similar issues might exist for other special mappings. > - Mappings like uprobe are not covered by the dynamic linker, > and there is no effective solution for them. > > This feature's security enhancements will benefit ChromeOS, Android, > and other high security systems. > > Testing: > This feature was tested on ChromeOS and Android for both x86-64 and ARM64. > - Enable sealing and verify vdso/vvar, sigpage, vector are sealed properly, > i.e. "sl" shown in the smaps for those mappings, and mremap is blocked. > - Passing various automation tests (e.g. pre-checkin) on ChromeOS and > Android to ensure the sealing doesn't affect the functionality of > Chromebook and Android phone. > > I also tested the feature on Ubuntu on x86-64: > - With config disabled, vdso/vvar is not sealed, > - with config enabled, vdso/vvar is sealed, and booting up Ubuntu is OK, > normal operations such as browsing the web, open/edit doc are OK. > > In addition, Benjamin Berg tested this on UML. > > Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1] > Link: Documentation/userspace-api/mseal.rst [2] > Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ [3] > > > > > Jeff Xu (7): > mseal, system mappings: kernel config and header change > selftests: x86: test_mremap_vdso: skip if vdso is msealed > mseal, system mappings: enable x86-64 > mseal, system mappings: enable arm64 > mseal, system mappings: enable uml architecture > mseal, system mappings: uprobe mapping > mseal, system mappings: update mseal.rst > > Documentation/userspace-api/mseal.rst | 7 +++ > arch/arm64/Kconfig | 1 + > arch/arm64/kernel/vdso.c | 22 +++++++--- > arch/um/Kconfig | 1 + > arch/x86/Kconfig | 1 + > arch/x86/entry/vdso/vma.c | 16 ++++--- > arch/x86/um/vdso/vma.c | 6 ++- > include/linux/mm.h | 10 +++++ > init/Kconfig | 18 ++++++++ > kernel/events/uprobes.c | 5 ++- > security/Kconfig | 18 ++++++++ > .../testing/selftests/x86/test_mremap_vdso.c | 43 +++++++++++++++++++ > 12 files changed, 132 insertions(+), 16 deletions(-) > > -- > 2.48.1.658.g4767266eb4-goog >
On Tue, Feb 25, 2025 at 7:18 AM Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > Jeff - looking further in this series, I asked for a couple things for this > series which you've not provided: > > 1. Some assurance based on code that the kernel-side code doesn't rely on > VDSO/VVAR etc. mapping. I gave up waiting for this and went and checked > myself, it looks fine for arm64, x86-64. But it might have been nice had > you done it :) Apologies if you had and I just missed it. > Thanks for checking this. Do ppc in kernel code unmap/remap vdso ? I am aware that userspace can remap/unmap special mappings, but I don't know if the kernel will remap/unmap a special mapping. Could you please point out the code ? > 2. Tests - could you please add some tests to assert that mremap() fails > for VDSO for instance? You've edited an existing test for VDSO in x86 to > skip the test should this be enabled, but this is not the same as a self > test. And obviously doesn't cover arm64. > > This should be relatively strightforward right? You already have code > for finding out whether VDSO is msealed, so just use that to see if you > skip, then attempt mremap(), mmap() over etc. + assert it fails. > > Ideally these tests would cover all the cases you've changed. > It is not as easy. The config is disabled by default. And I don't know a way to detect KCONFIG from selftest itself. Without this, I can't reasonably determine the test result.
On Tue, Feb 25, 2025 at 2:32 AM Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > BTW can we please drop the 'mseal, system mappings' prefixes on this > series, it's really weird and makes it really hard for me to actually read > the individual summary lines for each commit. 'mseal:' will do. > I am not sure. I had comments about adding mseal, system mappings, as prefixes, and I think it is reasonable.
On Tue, Feb 25, 2025 at 04:12:40PM -0800, Jeff Xu wrote: > On Tue, Feb 25, 2025 at 7:18 AM Lorenzo Stoakes > <lorenzo.stoakes@oracle.com> wrote: > > > > Jeff - looking further in this series, I asked for a couple things for this > > series which you've not provided: > > > > 1. Some assurance based on code that the kernel-side code doesn't rely on > > VDSO/VVAR etc. mapping. I gave up waiting for this and went and checked > > myself, it looks fine for arm64, x86-64. But it might have been nice had > > you done it :) Apologies if you had and I just missed it. > > > Thanks for checking this. > Do ppc in kernel code unmap/remap vdso ? > > I am aware that userspace can remap/unmap special mappings, but I > don't know if the kernel will remap/unmap a special mapping. Could you > please point out the code ? Again as discussed in other thread, let's leave this until as/when you try to attack PPC. I am not 100% this is the case, I may be mistaken sure, but gathered it _might_ be. > > > > 2. Tests - could you please add some tests to assert that mremap() fails > > for VDSO for instance? You've edited an existing test for VDSO in x86 to > > skip the test should this be enabled, but this is not the same as a self > > test. And obviously doesn't cover arm64. > > > > This should be relatively strightforward right? You already have code > > for finding out whether VDSO is msealed, so just use that to see if you > > skip, then attempt mremap(), mmap() over etc. + assert it fails. > > > > Ideally these tests would cover all the cases you've changed. > > > It is not as easy. > > The config is disabled by default. And I don't know a way to detect > KCONFIG from selftest itself. Without this, I can't reasonably > determine the test result. Please in future let's try to get this kind of response at the point of the request being made :) makes life much easier. So I think it is easy actually. As I say here (perhaps you missed it?) you literally already have code you added to the x86 test to prevent test failure. So you essentially look for [vdso] or whatever, then you look up to see if it is sealed, ensure an mremap() fails. Of course this doesn't check to see if it _should_ be enabled or not. I'm being nice by not _insisting_ we export a way for you to determine whether this config option is enabled or not for the sake of a test (since I don't want to hold up this series). But that'd be nice! Maybe in a /sys/kernel/security endpoint... ...but I think we can potentially add this later on so we don't hold things up here/add yet more to the series. I suspect you and Kees alike would prioritise getting _this series_ in at this point :) You could, if you wanted to, check to see if /proc/config.gz exists and zgrep it (only there if CONFIG_IKCONFIG_PROC is set) and assert based on that, but you know kinda hacky. But anyway, FWIW I think it'd be useful to assert that mremap() or munmap() fails here for system mappings for the sake of it. I guess this is, in effect, only checking mseal-ing works right? But it at least asserts the existence of the thing, and that it behaves. Later on, when you add some way of observing that it's enabled or not, you can extend the test to check this. Thanks!
On Tue, Feb 25, 2025 at 04:17:01PM -0800, Jeff Xu wrote: > On Tue, Feb 25, 2025 at 2:32 AM Lorenzo Stoakes > <lorenzo.stoakes@oracle.com> wrote: > > > > BTW can we please drop the 'mseal, system mappings' prefixes on this > > series, it's really weird and makes it really hard for me to actually read > > the individual summary lines for each commit. 'mseal:' will do. > > > > I am not sure. > I had comments about adding mseal, system mappings, as prefixes, and I > think it is reasonable. No it's really horrible (sorry I know it's hyperbolic but it really really looks horrid to me) , I've never seen prefixes like that before in mm in my life and I don't think it adds anything. I also find it MIGHTY confusing. Please remove this :) you can git log the relevant files to see the conventions people use. Typically xxx: has something really short and sweet for the 'xxx'. I realise this is subjective, but it's a small thing and would be helpful for parsing your series at a glance. Thanks!
On Tue, Feb 25, 2025 at 10:01 PM Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > On Tue, Feb 25, 2025 at 04:17:01PM -0800, Jeff Xu wrote: > > On Tue, Feb 25, 2025 at 2:32 AM Lorenzo Stoakes > > <lorenzo.stoakes@oracle.com> wrote: > > > > > > BTW can we please drop the 'mseal, system mappings' prefixes on this > > > series, it's really weird and makes it really hard for me to actually read > > > the individual summary lines for each commit. 'mseal:' will do. > > > > > > > I am not sure. > > I had comments about adding mseal, system mappings, as prefixes, and I > > think it is reasonable. > > No it's really horrible (sorry I know it's hyperbolic but it really really looks > horrid to me) , I've never seen prefixes like that before in mm in my life and I > don't think it adds anything. > > I also find it MIGHTY confusing. > > Please remove this :) you can git log the relevant files to see the conventions > people use. Typically xxx: has something really short and sweet for the 'xxx'. > > I realise this is subjective, but it's a small thing and would be helpful for > parsing your series at a glance. > I need a prefix to group the patches, especially when the first patch is just a config change. "mseal" won't work because this patch isn't about core business logic of mseal. How about "mseal sysmap:"? -Jeff
Hi Lorenzo, On Tue, Feb 25, 2025, 9:43 PM Lorenzo Stoakes > > > > 2. Tests - could you please add some tests to assert that mremap() fails > > > for VDSO for instance? You've edited an existing test for VDSO in x86 to > > > skip the test should this be enabled, but this is not the same as a self > > > test. And obviously doesn't cover arm64. > > > > > > This should be relatively strightforward right? You already have code > > > for finding out whether VDSO is msealed, so just use that to see if you > > > skip, then attempt mremap(), mmap() over etc. + assert it fails. > > > > > > Ideally these tests would cover all the cases you've changed. > > > > > It is not as easy. > > > > The config is disabled by default. And I don't know a way to detect > > KCONFIG from selftest itself. Without this, I can't reasonably > > determine the test result. > > Please in future let's try to get this kind of response at the point of the > request being made :) makes life much easier. > There might be miscommunication ? This version is the first time you ask about adding a self test. IIRC, you had comments about providing the details of what tests I did, in V4. As a follow-up, I added a test-info section in the cover letter of V5 Though I have thought about adding a selftest since the beginning, there are two problems: 1> This config is by default disabled, 2> No good pattern to check KCONFIG from selftest. > > So I think it is easy actually. As I say here (perhaps you missed it?) you > literally already have code you added to the x86 test to prevent test > failure. > > So you essentially look for [vdso] or whatever, then you look up to see if > it is sealed, ensure an mremap() fails. > This suggestion doesn't test the core change of this series, which is to enable mseal for vdso. When the vdso is marked with "sl", mremap() will fail, that's part of the mseal() business logic and belongs in mseal_test. The mseal_test already covers the mseal, and this series doesn't change mseal. As for the "sl", as I responded in the "refactor mseal_test" [1] , it will be part of the verifying step: [1] https://lore.kernel.org/all/CABi2SkUv_1gsuGh86AON-xRLAggCvDqJMHxT17mGy-XutSTAwg@mail.gmail.com/ > Of course this doesn't check to see if it _should_ be enabled or not. I'm > being nice by not _insisting_ we export a way for you to determine whether > this config option is enabled or not for the sake of a test (since I don't > want to hold up this series). > > But that'd be nice! Maybe in a > /sys/kernel/security endpoint... > > > ...but I think we can potentially add this later on so we don't hold things > up here/add yet more to the series. I suspect you and Kees alike would > prioritise getting _this series_ in at this point :) > > You could, if you wanted to, check to see if /proc/config.gz exists and > zgrep it (only there if CONFIG_IKCONFIG_PROC is set) and assert based on > that, but you know kinda hacky. Ya, it is hacky. None of the existing selftest uses this pattern, and I'm not sure /proc/config.gz is enabled in the default kernel config. One option is to have ChromeOS or Android to maintain an out-of-tree test, since the configuration will be enabled there. Option two is to create a new path: tools/testing/selftests/sealsysmap. Then, add CONFIG_SEAL_SYSTEM_MAPPING=y to the config file and add a selftest to this path. This seems to be the preferred way by selftest, but we need a new dir for that. Option three is to add a self-test when we have a process-level opt-in solution. This would allow the test to deterministically know whether the vdso should be sealed or not. > > But anyway, FWIW I think it'd be useful to assert that mremap() or munmap() > fails here for system mappings for the sake of it. I guess this is, in > effect, only checking mseal-ing works right? But it at least asserts the > existence of the thing, and that it behaves. > > Later on, when you add some way of observing that it's enabled or not, you > can extend the test to check this. I think it is best that we don't add a test that doesn't actually check the code change. Do you think one of the above three options works ? maybe the second option (with a new path) ? In any case, I think the risk is low, and the code changes are quite simple, and fully tested. Thanks. -Jeff.
On Thu, Feb 27, 2025 at 04:55:06PM -0800, Jeff Xu wrote: > Hi Lorenzo, > > On Tue, Feb 25, 2025, 9:43 PM Lorenzo Stoakes > > > > > > 2. Tests - could you please add some tests to assert that mremap() fails > > > > for VDSO for instance? You've edited an existing test for VDSO in x86 to > > > > skip the test should this be enabled, but this is not the same as a self > > > > test. And obviously doesn't cover arm64. > > > > > > > > This should be relatively strightforward right? You already have code > > > > for finding out whether VDSO is msealed, so just use that to see if you > > > > skip, then attempt mremap(), mmap() over etc. + assert it fails. > > > > > > > > Ideally these tests would cover all the cases you've changed. > > > > > > > It is not as easy. > > > > > > The config is disabled by default. And I don't know a way to detect > > > KCONFIG from selftest itself. Without this, I can't reasonably > > > determine the test result. > > > > Please in future let's try to get this kind of response at the point of the > > request being made :) makes life much easier. > > > There might be miscommunication ? > This version is the first time you ask about adding a self test. Yeah I thought I'd been clear but this might _very well_ have been me not having been, so apologies if so. I mean 'make sure it's tested' is an overloaded term right? So fact you've tested on android, chromeos, etc. implies 'tested', but also self tests/kunit/whatever. > > IIRC, you had comments about providing the details of what tests I did, in V4. > As a follow-up, I added a test-info section in the cover letter of V5 Thanks. Appreciate that! And this really does point towards a miscommunication on my part, will try to be super explicit in future. > > Though I have thought about adding a selftest since the beginning, > there are two problems: > 1> This config is by default disabled, > 2> No good pattern to check KCONFIG from selftest. Yeah agreed, it's a TOTAL pain. I wish we had a better way of doing this. Maybe a self-volunteering exercise (*goes to write on writeboard :P*) > > > > > So I think it is easy actually. As I say here (perhaps you missed it?) you > > literally already have code you added to the x86 test to prevent test > > failure. > > > > So you essentially look for [vdso] or whatever, then you look up to see if > > it is sealed, ensure an mremap() fails. > > > This suggestion doesn't test the core change of this series, which is > to enable mseal for vdso. Right, and thinking about it, what does this test? Just that mseal works right? It's sort of implicit that, if a VMA is sealed, the seal should work (or rather, tested in mseal tests themselves, rather than mseal system settings). > > When the vdso is marked with "sl", mremap() will fail, that's part of > the mseal() business logic and belongs in mseal_test. The mseal_test > already covers the mseal, and this series doesn't change mseal. > > As for the "sl", as I responded in the "refactor mseal_test" [1] , it > will be part of the verifying step: > > [1] https://lore.kernel.org/all/CABi2SkUv_1gsuGh86AON-xRLAggCvDqJMHxT17mGy-XutSTAwg@mail.gmail.com/ OK cool thanks. I will look at this when I can, I'm just snowed under pre-LSF and have been sick so backlog, backlog as discussed off-list. But it's not been forgotten (whiteboard etc. etc.) > > > Of course this doesn't check to see if it _should_ be enabled or not. I'm > > being nice by not _insisting_ we export a way for you to determine whether > > this config option is enabled or not for the sake of a test (since I don't > > want to hold up this series). > > > > But that'd be nice! Maybe in a > > /sys/kernel/security endpoint... > > > > > > ...but I think we can potentially add this later on so we don't hold things > > up here/add yet more to the series. I suspect you and Kees alike would > > prioritise getting _this series_ in at this point :) > > > > You could, if you wanted to, check to see if /proc/config.gz exists and > > zgrep it (only there if CONFIG_IKCONFIG_PROC is set) and assert based on > > that, but you know kinda hacky. > > Ya, it is hacky. None of the existing selftest uses this pattern, and > I'm not sure /proc/config.gz is enabled in the default kernel config. Yeah and I'm not sure I even like my hacky suggestion here, I mean let's be honest, it sucks. > > One option is to have ChromeOS or Android to maintain an out-of-tree > test, since the configuration will be enabled there. Nah haha, though of course you can do what you want. Really want something upstream. > > Option two is to create a new path: > tools/testing/selftests/sealsysmap. Then, add > CONFIG_SEAL_SYSTEM_MAPPING=y to the config file and add a selftest to > this path. This seems to be the preferred way by selftest, but we need > a new dir for that. OK I like option 2 let's do this. But let's call it mseal_system_mappings (yes I"m being nitty again :) I really want to try to _explicitly_ say 'mseal' because we have other forms of sealing. Not your fault, but we overload terms like _crazy_ in mm and need to be cautious as not to confuse vs. e.g. memfd seals. > > Option three is to add a self-test when we have a process-level opt-in > solution. This would allow the test to deterministically know whether > the vdso should be sealed or not. Yeah one for future. > > > > > But anyway, FWIW I think it'd be useful to assert that mremap() or munmap() > > fails here for system mappings for the sake of it. I guess this is, in > > effect, only checking mseal-ing works right? But it at least asserts the > > existence of the thing, and that it behaves. > > > > Later on, when you add some way of observing that it's enabled or not, you > > can extend the test to check this. > > I think it is best that we don't add a test that doesn't actually > check the code change. Do you think one of the above three options > works ? maybe the second option (with a new path) ? Yeah I actually agree on reflection. And yes agreed option 2 is great, thanks! > > In any case, I think the risk is low, and the code changes are quite > simple, and fully tested. Yeah indeed, but I'd really like _something_ if possible. If option 2 is relatively quick let's get that sorted out! > > Thanks. > -Jeff. Cheers, Lorenzo
On Thu, Feb 27, 2025 at 03:43:07PM -0800, Jeff Xu wrote: > On Tue, Feb 25, 2025 at 10:01 PM Lorenzo Stoakes > <lorenzo.stoakes@oracle.com> wrote: > > > > On Tue, Feb 25, 2025 at 04:17:01PM -0800, Jeff Xu wrote: > > > On Tue, Feb 25, 2025 at 2:32 AM Lorenzo Stoakes > > > <lorenzo.stoakes@oracle.com> wrote: > > > > > > > > BTW can we please drop the 'mseal, system mappings' prefixes on this > > > > series, it's really weird and makes it really hard for me to actually read > > > > the individual summary lines for each commit. 'mseal:' will do. > > > > > > > > > > I am not sure. > > > I had comments about adding mseal, system mappings, as prefixes, and I > > > think it is reasonable. > > > > No it's really horrible (sorry I know it's hyperbolic but it really really looks > > horrid to me) , I've never seen prefixes like that before in mm in my life and I > > don't think it adds anything. > > > > I also find it MIGHTY confusing. > > > > Please remove this :) you can git log the relevant files to see the conventions > > people use. Typically xxx: has something really short and sweet for the 'xxx'. > > > > I realise this is subjective, but it's a small thing and would be helpful for > > parsing your series at a glance. > > > I need a prefix to group the patches, especially when the first patch > is just a config change. "mseal" won't work because this patch isn't > about core business logic of mseal. > > How about "mseal sysmap:"? OK let's go with that as a compromise. This isn't a huge big deal. > > -Jeff
On Fri, Feb 28, 2025 at 1:35 AM Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > On Thu, Feb 27, 2025 at 04:55:06PM -0800, Jeff Xu wrote: > > Hi Lorenzo, > > > > On Tue, Feb 25, 2025, 9:43 PM Lorenzo Stoakes > > > > > > > > 2. Tests - could you please add some tests to assert that mremap() fails > > > > > for VDSO for instance? You've edited an existing test for VDSO in x86 to > > > > > skip the test should this be enabled, but this is not the same as a self > > > > > test. And obviously doesn't cover arm64. > > > > > > > > > > This should be relatively strightforward right? You already have code > > > > > for finding out whether VDSO is msealed, so just use that to see if you > > > > > skip, then attempt mremap(), mmap() over etc. + assert it fails. > > > > > > > > > > Ideally these tests would cover all the cases you've changed. > > > > > > > > > It is not as easy. > > > > > > > > The config is disabled by default. And I don't know a way to detect > > > > KCONFIG from selftest itself. Without this, I can't reasonably > > > > determine the test result. > > > > > > Please in future let's try to get this kind of response at the point of the > > > request being made :) makes life much easier. > > > > > There might be miscommunication ? > > This version is the first time you ask about adding a self test. > > Yeah I thought I'd been clear but this might _very well_ have been me not > having been, so apologies if so. > > I mean 'make sure it's tested' is an overloaded term right? So fact you've > tested on android, chromeos, etc. implies 'tested', but also self > tests/kunit/whatever. > > > > > IIRC, you had comments about providing the details of what tests I did, in V4. > > As a follow-up, I added a test-info section in the cover letter of V5 > > Thanks. Appreciate that! And this really does point towards a > miscommunication on my part, will try to be super explicit in future. > > > > > Though I have thought about adding a selftest since the beginning, > > there are two problems: > > 1> This config is by default disabled, > > 2> No good pattern to check KCONFIG from selftest. > > Yeah agreed, it's a TOTAL pain. > > I wish we had a better way of doing this. Maybe a self-volunteering > exercise (*goes to write on writeboard :P*) > > > > > > > > > So I think it is easy actually. As I say here (perhaps you missed it?) you > > > literally already have code you added to the x86 test to prevent test > > > failure. > > > > > > So you essentially look for [vdso] or whatever, then you look up to see if > > > it is sealed, ensure an mremap() fails. > > > > > This suggestion doesn't test the core change of this series, which is > > to enable mseal for vdso. > > Right, and thinking about it, what does this test? Just that mseal works > right? > > It's sort of implicit that, if a VMA is sealed, the seal should work (or > rather, tested in mseal tests themselves, rather than mseal system > settings). > > > > > When the vdso is marked with "sl", mremap() will fail, that's part of > > the mseal() business logic and belongs in mseal_test. The mseal_test > > already covers the mseal, and this series doesn't change mseal. > > > > As for the "sl", as I responded in the "refactor mseal_test" [1] , it > > will be part of the verifying step: > > > > [1] https://lore.kernel.org/all/CABi2SkUv_1gsuGh86AON-xRLAggCvDqJMHxT17mGy-XutSTAwg@mail.gmail.com/ > > OK cool thanks. I will look at this when I can, I'm just snowed under > pre-LSF and have been sick so backlog, backlog as discussed off-list. But > it's not been forgotten (whiteboard etc. etc.) > Ya, no worry about that review, please take time to recover, I will wait. And appreciate your time and take priority for reviewing this series. > > > > > Of course this doesn't check to see if it _should_ be enabled or not. I'm > > > being nice by not _insisting_ we export a way for you to determine whether > > > this config option is enabled or not for the sake of a test (since I don't > > > want to hold up this series). > > > > > > But that'd be nice! Maybe in a > > > /sys/kernel/security endpoint... > > > > > > > > > ...but I think we can potentially add this later on so we don't hold things > > > up here/add yet more to the series. I suspect you and Kees alike would > > > prioritise getting _this series_ in at this point :) > > > > > > You could, if you wanted to, check to see if /proc/config.gz exists and > > > zgrep it (only there if CONFIG_IKCONFIG_PROC is set) and assert based on > > > that, but you know kinda hacky. > > > > Ya, it is hacky. None of the existing selftest uses this pattern, and > > I'm not sure /proc/config.gz is enabled in the default kernel config. > > Yeah and I'm not sure I even like my hacky suggestion here, I mean let's be > honest, it sucks. > > > > > One option is to have ChromeOS or Android to maintain an out-of-tree > > test, since the configuration will be enabled there. > > Nah haha, though of course you can do what you want. Really want something > upstream. > > > > > Option two is to create a new path: > > tools/testing/selftests/sealsysmap. Then, add > > CONFIG_SEAL_SYSTEM_MAPPING=y to the config file and add a selftest to > > this path. This seems to be the preferred way by selftest, but we need > > a new dir for that. > > OK I like option 2 let's do this. > > But let's call it mseal_system_mappings (yes I"m being nitty again :) I > really want to try to _explicitly_ say 'mseal' because we have other forms > of sealing. > Sure. If long path names aren't a problem, I will use mseal_system_mappings, otherwise mseal_sysmap. > Not your fault, but we overload terms like _crazy_ in mm and need to be > cautious as not to confuse vs. e.g. memfd seals. > > > > > > Option three is to add a self-test when we have a process-level opt-in > > solution. This would allow the test to deterministically know whether > > the vdso should be sealed or not. > > Yeah one for future. > > > > > > > > > But anyway, FWIW I think it'd be useful to assert that mremap() or munmap() > > > fails here for system mappings for the sake of it. I guess this is, in > > > effect, only checking mseal-ing works right? But it at least asserts the > > > existence of the thing, and that it behaves. > > > > > > Later on, when you add some way of observing that it's enabled or not, you > > > can extend the test to check this. > > > > I think it is best that we don't add a test that doesn't actually > > check the code change. Do you think one of the above three options > > works ? maybe the second option (with a new path) ? > > Yeah I actually agree on reflection. And yes agreed option 2 is great, > thanks! > > > > > In any case, I think the risk is low, and the code changes are quite > > simple, and fully tested. > > Yeah indeed, but I'd really like _something_ if possible. If option 2 is > relatively quick let's get that sorted out! > Great ! I will work on option 2. Thanks -Jeff
On Fri, Feb 28, 2025 at 09:24:11AM -0800, Jeff Xu wrote: [snip] > > > > > > When the vdso is marked with "sl", mremap() will fail, that's part of > > > the mseal() business logic and belongs in mseal_test. The mseal_test > > > already covers the mseal, and this series doesn't change mseal. > > > > > > As for the "sl", as I responded in the "refactor mseal_test" [1] , it > > > will be part of the verifying step: > > > > > > [1] https://lore.kernel.org/all/CABi2SkUv_1gsuGh86AON-xRLAggCvDqJMHxT17mGy-XutSTAwg@mail.gmail.com/ > > > > OK cool thanks. I will look at this when I can, I'm just snowed under > > pre-LSF and have been sick so backlog, backlog as discussed off-list. But > > it's not been forgotten (whiteboard etc. etc.) > > > Ya, no worry about that review, please take time to recover, I will wait. > And appreciate your time and take priority for reviewing this series. Thanks :) [snip] > > > > > > Option two is to create a new path: > > > tools/testing/selftests/sealsysmap. Then, add > > > CONFIG_SEAL_SYSTEM_MAPPING=y to the config file and add a selftest to > > > this path. This seems to be the preferred way by selftest, but we need > > > a new dir for that. > > > > OK I like option 2 let's do this. > > > > But let's call it mseal_system_mappings (yes I"m being nitty again :) I > > really want to try to _explicitly_ say 'mseal' because we have other forms > > of sealing. > > > Sure. Thanks! > > If long path names aren't a problem, I will use mseal_system_mappings, > otherwise mseal_sysmap. I think long form is fine. [snip] > > > > > > In any case, I think the risk is low, and the code changes are quite > > > simple, and fully tested. > > > > Yeah indeed, but I'd really like _something_ if possible. If option 2 is > > relatively quick let's get that sorted out! > > > Great ! I will work on option 2. Perfect cheers! > > Thanks > -Jeff
From: Jeff Xu <jeffxu@chromium.org> This is V7 version, addressing comments from V6, without code logic change. -------------------------------------------------- History: V7: - Remove cover letter from the first patch (Liam R. Howlett) - Change macro name to VM_SEALED_SYSMAP (Liam R. Howlett) - logging and fclose() in selftest (Liam R. Howlett) V6: https://lore.kernel.org/all/20250224174513.3600914-1-jeffxu@google.com/ V5 https://lore.kernel.org/all/20250212032155.1276806-1-jeffxu@google.com/ V4: https://lore.kernel.org/all/20241125202021.3684919-1-jeffxu@google.com/ V3: https://lore.kernel.org/all/20241113191602.3541870-1-jeffxu@google.com/ V2: https://lore.kernel.org/all/20241014215022.68530-1-jeffxu@google.com/ V1: https://lore.kernel.org/all/20241004163155.3493183-1-jeffxu@google.com/ -------------------------------------------------- As discussed during mseal() upstream process [1], mseal() protects the VMAs of a given virtual memory range against modifications, such as the read/write (RW) and no-execute (NX) bits. For complete descriptions of memory sealing, please see mseal.rst [2]. The mseal() is useful to mitigate memory corruption issues where a corrupted pointer is passed to a memory management system. For example, such an attacker primitive can break control-flow integrity guarantees since read-only memory that is supposed to be trusted can become writable or .text pages can get remapped. The system mappings are readonly only, memory sealing can protect them from ever changing to writable or unmmap/remapped as different attributes. System mappings such as vdso, vvar, and sigpage (arm), vectors (arm) are created by the kernel during program initialization, and could be sealed after creation. Unlike the aforementioned mappings, the uprobe mapping is not established during program startup. However, its lifetime is the same as the process's lifetime [3]. It could be sealed from creation. The vsyscall on x86-64 uses a special address (0xffffffffff600000), which is outside the mm managed range. This means mprotect, munmap, and mremap won't work on the vsyscall. Since sealing doesn't enhance the vsyscall's security, it is skipped in this patch. If we ever seal the vsyscall, it is probably only for decorative purpose, i.e. showing the 'sl' flag in the /proc/pid/smaps. For this patch, it is ignored. It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may alter the system mappings during restore operations. UML(User Mode Linux) and gVisor, rr are also known to change the vdso/vvar mappings. Consequently, this feature cannot be universally enabled across all systems. As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default. To support mseal of system mappings, architectures must define CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS and update their special mappings calls to pass mseal flag. Additionally, architectures must confirm they do not unmap/remap system mappings during the process lifetime. In this version, we've improved the handling of system mapping sealing from previous versions, instead of modifying the _install_special_mapping function itself, which would affect all architectures, we now call _install_special_mapping with a sealing flag only within the specific architecture that requires it. This targeted approach offers two key advantages: 1) It limits the code change's impact to the necessary architectures, and 2) It aligns with the software architecture by keeping the core memory management within the mm layer, while delegating the decision of sealing system mappings to the individual architecture, which is particularly relevant since 32-bit architectures never require sealing. Prior to this patch series, we explored sealing special mappings from userspace using glibc's dynamic linker. This approach revealed several issues: - The PT_LOAD header may report an incorrect length for vdso, (smaller than its actual size). The dynamic linker, which relies on PT_LOAD information to determine mapping size, would then split and partially seal the vdso mapping. Since each architecture has its own vdso/vvar code, fixing this in the kernel would require going through each archiecture. Our initial goal was to enable sealing readonly mappings, e.g. .text, across all architectures, sealing vdso from kernel since creation appears to be simpler than sealing vdso at glibc. - The [vvar] mapping header only contains address information, not length information. Similar issues might exist for other special mappings. - Mappings like uprobe are not covered by the dynamic linker, and there is no effective solution for them. This feature's security enhancements will benefit ChromeOS, Android, and other high security systems. Testing: This feature was tested on ChromeOS and Android for both x86-64 and ARM64. - Enable sealing and verify vdso/vvar, sigpage, vector are sealed properly, i.e. "sl" shown in the smaps for those mappings, and mremap is blocked. - Passing various automation tests (e.g. pre-checkin) on ChromeOS and Android to ensure the sealing doesn't affect the functionality of Chromebook and Android phone. I also tested the feature on Ubuntu on x86-64: - With config disabled, vdso/vvar is not sealed, - with config enabled, vdso/vvar is sealed, and booting up Ubuntu is OK, normal operations such as browsing the web, open/edit doc are OK. In addition, Benjamin Berg tested this on UML. Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1] Link: Documentation/userspace-api/mseal.rst [2] Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ [3] Jeff Xu (7): mseal, system mappings: kernel config and header change selftests: x86: test_mremap_vdso: skip if vdso is msealed mseal, system mappings: enable x86-64 mseal, system mappings: enable arm64 mseal, system mappings: enable uml architecture mseal, system mappings: uprobe mapping mseal, system mappings: update mseal.rst Documentation/userspace-api/mseal.rst | 7 +++ arch/arm64/Kconfig | 1 + arch/arm64/kernel/vdso.c | 22 +++++++--- arch/um/Kconfig | 1 + arch/x86/Kconfig | 1 + arch/x86/entry/vdso/vma.c | 16 ++++--- arch/x86/um/vdso/vma.c | 6 ++- include/linux/mm.h | 10 +++++ init/Kconfig | 18 ++++++++ kernel/events/uprobes.c | 5 ++- security/Kconfig | 18 ++++++++ .../testing/selftests/x86/test_mremap_vdso.c | 43 +++++++++++++++++++ 12 files changed, 132 insertions(+), 16 deletions(-)