Message ID | 20250224174513.3600914-2-jeffxu@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mseal system mappings | expand |
On 2/24/25 09:45, jeffxu@chromium.org wrote: > +/* > + * mseal of userspace process's system mappings. > + */ > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED > +#else > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE > +#endif This ends up looking pretty wonky in practice: > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP; > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG; because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the other ones. Would it really hurt to have #ifdef CONFIG_64BIT /* VM is sealed, in vm_flags */ #define VM_SEALED _BITUL(63) +#else +#define VM_SEALED VM_NONE #endif ? Then all the users could just do: vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP| VM_SEALED That seems to be a the common way of doing things. Take a look at: # define VM_GROWSUP VM_NONE ... # define VM_MTE VM_NONE # define VM_MTE_ALLOWED VM_NONE ... # define VM_UFFD_MINOR VM_NONE ... #define VM_DROPPABLE VM_NONE ... and more
On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote: > > On 2/24/25 09:45, jeffxu@chromium.org wrote: > > +/* > > + * mseal of userspace process's system mappings. > > + */ > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED > > +#else > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE > > +#endif > > This ends up looking pretty wonky in practice: > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP; > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG; > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the > other ones. > > Would it really hurt to have > > #ifdef CONFIG_64BIT > /* VM is sealed, in vm_flags */ > #define VM_SEALED _BITUL(63) > +#else > +#define VM_SEALED VM_NONE > #endif > > ? > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of the build. This is intentional. Any 32-bit code trying to use the sealing function or the VM_SEALED flag will immediately fail compilation. This makes it easier to identify incorrect usage. For example: Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c, #ifdef CONFIG_64BIT [ilog2(VM_SEALED)] = "sl", #endif Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem in case that "#ifdef CONFIG_64BIT" line is missing. Please note, this has been like this since the first version of mseal() RFC patch, and I prefer to keep it this way. Thanks -Jeff
On 2/24/25 10:44, Jeff Xu wrote: > For example: > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c, > > #ifdef CONFIG_64BIT > [ilog2(VM_SEALED)] = "sl", > #endif > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem > in case that "#ifdef CONFIG_64BIT" line is missing. > > Please note, this has been like this since the first version of > mseal() RFC patch, and I prefer to keep it this way. That logic is reasonable. But it's different from the _vast_ majority of other flags. So what justifies VM_SEALED being so different? It's leading to pretty objectively ugly code in this series.
On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote: > On 2/24/25 10:44, Jeff Xu wrote: > > For example: > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c, > > > > #ifdef CONFIG_64BIT > > [ilog2(VM_SEALED)] = "sl", > > #endif > > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem > > in case that "#ifdef CONFIG_64BIT" line is missing. > > > > Please note, this has been like this since the first version of > > mseal() RFC patch, and I prefer to keep it this way. > > That logic is reasonable. But it's different from the _vast_ majority of > other flags. > > So what justifies VM_SEALED being so different? It's leading to pretty > objectively ugly code in this series. Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define for "should we perform system mapping sealing?" is intentionally separate here, so that it can be Kconfig and per-arch toggled, etc. As for the name, I have no strong opinion. Perhaps VM_SEALED_SYSTEM_MAPPING ? -Kees
On Mon, Feb 24, 2025 at 10:55 AM Kees Cook <kees@kernel.org> wrote: > > On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote: > > On 2/24/25 10:44, Jeff Xu wrote: > > > For example: > > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c, > > > > > > #ifdef CONFIG_64BIT > > > [ilog2(VM_SEALED)] = "sl", > > > #endif > > > > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem > > > in case that "#ifdef CONFIG_64BIT" line is missing. > > > > > > Please note, this has been like this since the first version of > > > mseal() RFC patch, and I prefer to keep it this way. > > > > That logic is reasonable. But it's different from the _vast_ majority of > > other flags. > > > > So what justifies VM_SEALED being so different? It's leading to pretty > > objectively ugly code in this series. > > Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define > for "should we perform system mapping sealing?" is intentionally separate > here, so that it can be Kconfig and per-arch toggled, etc. > Ya, it is a layer of separation also. Thanks for pointing it out. > As for the name, I have no strong opinion. Perhaps VM_SEALED_SYSTEM_MAPPING ? > OK. Thanks -Jeff > -Kees > > -- > Kees Cook
On 2/24/25 10:55, Kees Cook wrote: >> That logic is reasonable. But it's different from the _vast_ majority of >> other flags. >> >> So what justifies VM_SEALED being so different? It's leading to pretty >> objectively ugly code in this series. > Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define > for "should we perform system mapping sealing?" is intentionally separate > here, so that it can be Kconfig and per-arch toggled, etc. Ahh, makes sense. > As for the name, I have no strong opinion. Perhaps VM_SEALED_SYSTEM_MAPPING ? Yeah, that'd work. Just something more consistent with the existing naming and more compact. I think: VM_SEALED_SYS would fit in nicely.
* Jeff Xu <jeffxu@chromium.org> [250224 13:44]: > On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote: > > > > On 2/24/25 09:45, jeffxu@chromium.org wrote: > > > +/* > > > + * mseal of userspace process's system mappings. > > > + */ > > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED > > > +#else > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE > > > +#endif > > > > This ends up looking pretty wonky in practice: > > > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP; > > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG; > > > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the > > other ones. > > > > Would it really hurt to have > > > > #ifdef CONFIG_64BIT > > /* VM is sealed, in vm_flags */ > > #define VM_SEALED _BITUL(63) > > +#else > > +#define VM_SEALED VM_NONE > > #endif > > > > ? > > > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of > the build. This is intentional. Any 32-bit code trying to use the > sealing function or the VM_SEALED flag will immediately fail > compilation. This makes it easier to identify incorrect usage. > The reason that two #defines are needed is because you can have mseal enabled while not sealing system mappings, so for this to be clean we need two defines. However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion. Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED". > For example: > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c, third_party? > > #ifdef CONFIG_64BIT > [ilog2(VM_SEALED)] = "sl", > #endif > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem > in case that "#ifdef CONFIG_64BIT" line is missing. I don't think it is reasonable to insist on doing things differently in the kernel because you have external tests that would need updating. These things can change independently, so I don't think this is a valid argument. If these are upstream tests, and we need these tests to work then they can be fixed. > > Please note, this has been like this since the first version of > mseal() RFC patch, and I prefer to keep it this way. Thanks, Liam
On Mon, Feb 24, 2025 at 11:03 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote: > > * Jeff Xu <jeffxu@chromium.org> [250224 13:44]: > > On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote: > > > > > > On 2/24/25 09:45, jeffxu@chromium.org wrote: > > > > +/* > > > > + * mseal of userspace process's system mappings. > > > > + */ > > > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED > > > > +#else > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE > > > > +#endif > > > > > > This ends up looking pretty wonky in practice: > > > > > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP; > > > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG; > > > > > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the > > > other ones. > > > > > > Would it really hurt to have > > > > > > #ifdef CONFIG_64BIT > > > /* VM is sealed, in vm_flags */ > > > #define VM_SEALED _BITUL(63) > > > +#else > > > +#define VM_SEALED VM_NONE > > > #endif > > > > > > ? > > > > > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of > > the build. This is intentional. Any 32-bit code trying to use the > > sealing function or the VM_SEALED flag will immediately fail > > compilation. This makes it easier to identify incorrect usage. > > > > The reason that two #defines are needed is because you can have mseal > enabled while not sealing system mappings, so for this to be clean we > need two defines. > > However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion. > Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED". > > > For example: > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c, > > third_party? > Sorry, I pasted the code path from ChromeOS code base, it is actually in the kernel itself. fs/proc/task_mmu.c > > > > #ifdef CONFIG_64BIT > > [ilog2(VM_SEALED)] = "sl", > > #endif > > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem > > in case that "#ifdef CONFIG_64BIT" line is missing. > > I don't think it is reasonable to insist on doing things differently in > the kernel because you have external tests that would need updating. > These things can change independently, so I don't think this is a valid > argument. > > If these are upstream tests, and we need these tests to work then they > can be fixed. > As above, this is actually kernel code, not test. -Jeff > > > > Please note, this has been like this since the first version of > > mseal() RFC patch, and I prefer to keep it this way. > > Thanks, > Liam
On Mon, Feb 24, 2025 at 11:03 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote: > > * Jeff Xu <jeffxu@chromium.org> [250224 13:44]: > > On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote: > > > > > > On 2/24/25 09:45, jeffxu@chromium.org wrote: > > > > +/* > > > > + * mseal of userspace process's system mappings. > > > > + */ > > > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED > > > > +#else > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE > > > > +#endif > > > > > > This ends up looking pretty wonky in practice: > > > > > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP; > > > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG; > > > > > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the > > > other ones. > > > > > > Would it really hurt to have > > > > > > #ifdef CONFIG_64BIT > > > /* VM is sealed, in vm_flags */ > > > #define VM_SEALED _BITUL(63) > > > +#else > > > +#define VM_SEALED VM_NONE > > > #endif > > > > > > ? > > > > > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of > > the build. This is intentional. Any 32-bit code trying to use the > > sealing function or the VM_SEALED flag will immediately fail > > compilation. This makes it easier to identify incorrect usage. > > > > The reason that two #defines are needed is because you can have mseal > enabled while not sealing system mappings, so for this to be clean we > need two defines. > > However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion. > Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED". > How about MSEAL_SYSTME_MAPPINGS as Kees suggested ? The VM_SYSTEM_SEALED doesn't have the MSEAL key or the MAPPING, so it might take longer for the new reader to understand what it is. -Jeff
* Kees Cook <kees@kernel.org> [250224 13:55]: > On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote: > > On 2/24/25 10:44, Jeff Xu wrote: > > > For example: > > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c, > > > > > > #ifdef CONFIG_64BIT > > > [ilog2(VM_SEALED)] = "sl", > > > #endif > > > > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem > > > in case that "#ifdef CONFIG_64BIT" line is missing. > > > > > > Please note, this has been like this since the first version of > > > mseal() RFC patch, and I prefer to keep it this way. > > > > That logic is reasonable. But it's different from the _vast_ majority of > > other flags. > > > > So what justifies VM_SEALED being so different? It's leading to pretty > > objectively ugly code in this series. > > Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define > for "should we perform system mapping sealing?" is intentionally separate > here, so that it can be Kconfig and per-arch toggled, etc. > Considering Dave is the second person that did not find the huge commit message helpful, can we please limit the commit message to be about the actual code and not the entire series? I thought we said that it was worth while making this change in v5? Thanks, Liam
* Jeff Xu <jeffxu@chromium.org> [250224 14:07]: ... > > > > > > #ifdef CONFIG_64BIT > > > [ilog2(VM_SEALED)] = "sl", > > > #endif > > > > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem > > > in case that "#ifdef CONFIG_64BIT" line is missing. > > > > I don't think it is reasonable to insist on doing things differently in > > the kernel because you have external tests that would need updating. > > These things can change independently, so I don't think this is a valid > > argument. > > > > If these are upstream tests, and we need these tests to work then they > > can be fixed. > > > As above, this is actually kernel code, not test. Okay, so then they could be easily fixed to work with changing the flags and it would transparently function in your ChromeOS code base. That's a good way of doing things. It doesn't make sense here since, as stated in this thread, we need two defines. But it does mean we are less locked into how this functions and it could change later, if needed. Also, do we need something like the above test for VM_SEALED_SYS? Thanks, Liam
On Mon, Feb 24, 2025 at 11:11 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote: > > * Kees Cook <kees@kernel.org> [250224 13:55]: > > On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote: > > > On 2/24/25 10:44, Jeff Xu wrote: > > > > For example: > > > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c, > > > > > > > > #ifdef CONFIG_64BIT > > > > [ilog2(VM_SEALED)] = "sl", > > > > #endif > > > > > > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem > > > > in case that "#ifdef CONFIG_64BIT" line is missing. > > > > > > > > Please note, this has been like this since the first version of > > > > mseal() RFC patch, and I prefer to keep it this way. > > > > > > That logic is reasonable. But it's different from the _vast_ majority of > > > other flags. > > > > > > So what justifies VM_SEALED being so different? It's leading to pretty > > > objectively ugly code in this series. > > > > Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define > > for "should we perform system mapping sealing?" is intentionally separate > > here, so that it can be Kconfig and per-arch toggled, etc. > > > > Considering Dave is the second person that did not find the huge commit > message helpful, can we please limit the commit message to be about the > actual code and not the entire series? > > I thought we said that it was worth while making this change in v5? > I include the cover letter's content in the first commit message to clearly communicate the purpose of the entire patch series, saving maintainers' time when accepting the patch. Should I still include that, and add what the first patch does, and separate it from the cover letter with "----"? What do you think? -Jeff > Thanks, > Liam
On Mon, Feb 24, 2025 at 11:10:22AM -0800, Jeff Xu wrote: > On Mon, Feb 24, 2025 at 11:03 AM Liam R. Howlett > <Liam.Howlett@oracle.com> wrote: > > > > * Jeff Xu <jeffxu@chromium.org> [250224 13:44]: > > > On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote: > > > > > > > > On 2/24/25 09:45, jeffxu@chromium.org wrote: > > > > > +/* > > > > > + * mseal of userspace process's system mappings. > > > > > + */ > > > > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED > > > > > +#else > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE > > > > > +#endif > > > > > > > > This ends up looking pretty wonky in practice: > > > > > > > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP; > > > > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG; > > > > > > > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the > > > > other ones. > > > > > > > > Would it really hurt to have > > > > > > > > #ifdef CONFIG_64BIT > > > > /* VM is sealed, in vm_flags */ > > > > #define VM_SEALED _BITUL(63) > > > > +#else > > > > +#define VM_SEALED VM_NONE > > > > #endif > > > > > > > > ? > > > > > > > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of > > > the build. This is intentional. Any 32-bit code trying to use the > > > sealing function or the VM_SEALED flag will immediately fail > > > compilation. This makes it easier to identify incorrect usage. > > > > > > > The reason that two #defines are needed is because you can have mseal > > enabled while not sealing system mappings, so for this to be clean we > > need two defines. > > > > However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion. > > Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED". > > > How about MSEAL_SYSTME_MAPPINGS as Kees suggested ? > > The VM_SYSTEM_SEALED doesn't have the MSEAL key or the MAPPING, so it > might take longer for the new reader to understand what it is. I think to address Dave's concern, it should start with "VM_". We use "SEAL" already with VM_SEALED, so the "M" in "MSEAL" may be redundant, and to clarify the system mapping, how avout VM_SEAL_SYSMAP ? That capture's, hopefully, Dave, Liam, and your thoughts about the naming?
On Mon, Feb 24, 2025 at 02:10:58PM -0500, Liam R. Howlett wrote: > * Kees Cook <kees@kernel.org> [250224 13:55]: > > On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote: > > > On 2/24/25 10:44, Jeff Xu wrote: > > > > For example: > > > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c, > > > > > > > > #ifdef CONFIG_64BIT > > > > [ilog2(VM_SEALED)] = "sl", > > > > #endif > > > > > > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem > > > > in case that "#ifdef CONFIG_64BIT" line is missing. > > > > > > > > Please note, this has been like this since the first version of > > > > mseal() RFC patch, and I prefer to keep it this way. > > > > > > That logic is reasonable. But it's different from the _vast_ majority of > > > other flags. > > > > > > So what justifies VM_SEALED being so different? It's leading to pretty > > > objectively ugly code in this series. > > > > Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define > > for "should we perform system mapping sealing?" is intentionally separate > > here, so that it can be Kconfig and per-arch toggled, etc. > > > > Considering Dave is the second person that did not find the huge commit > message helpful, can we please limit the commit message to be about the > actual code and not the entire series? > > I thought we said that it was worth while making this change in v5? Right, please minimize patch #1's commit log to just what it is doing, etc, and leave the rest of the rationale in the 0/N cover letter.
* Jeff Xu <jeffxu@chromium.org> [250224 14:23]: > On Mon, Feb 24, 2025 at 11:11 AM Liam R. Howlett > <Liam.Howlett@oracle.com> wrote: > > > > * Kees Cook <kees@kernel.org> [250224 13:55]: > > > On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote: > > > > On 2/24/25 10:44, Jeff Xu wrote: > > > > > For example: > > > > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c, > > > > > > > > > > #ifdef CONFIG_64BIT > > > > > [ilog2(VM_SEALED)] = "sl", > > > > > #endif > > > > > > > > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem > > > > > in case that "#ifdef CONFIG_64BIT" line is missing. > > > > > > > > > > Please note, this has been like this since the first version of > > > > > mseal() RFC patch, and I prefer to keep it this way. > > > > > > > > That logic is reasonable. But it's different from the _vast_ majority of > > > > other flags. > > > > > > > > So what justifies VM_SEALED being so different? It's leading to pretty > > > > objectively ugly code in this series. > > > > > > Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define > > > for "should we perform system mapping sealing?" is intentionally separate > > > here, so that it can be Kconfig and per-arch toggled, etc. > > > > > > > Considering Dave is the second person that did not find the huge commit > > message helpful, can we please limit the commit message to be about the > > actual code and not the entire series? > > > > I thought we said that it was worth while making this change in v5? > > > I include the cover letter's content in the first commit message to > clearly communicate the purpose of the entire patch series, saving > maintainers' time when accepting the patch. Having more text than patch for such a patch seems unreasonable. I'd find it more acceptable if it were a complicated race condition, but everyone is getting lost in the summary. > > Should I still include that, and add what the first patch does, and > separate it from the cover letter with "----"? What do you think? Here is my v5 answer, I think it was clear about not putting the entire summary into the first patch. [1]. https://lore.kernel.org/all/ml3x5qchmnehdzz2rxsdcdghivaqffojiweuhvpvzw45u3l5bh@23sblrom3m36/ Thanks, Liam
On Mon, Feb 24, 2025 at 11:32 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote: > > * Jeff Xu <jeffxu@chromium.org> [250224 14:23]: > > On Mon, Feb 24, 2025 at 11:11 AM Liam R. Howlett > > <Liam.Howlett@oracle.com> wrote: > > > > > > * Kees Cook <kees@kernel.org> [250224 13:55]: > > > > On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote: > > > > > On 2/24/25 10:44, Jeff Xu wrote: > > > > > > For example: > > > > > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c, > > > > > > > > > > > > #ifdef CONFIG_64BIT > > > > > > [ilog2(VM_SEALED)] = "sl", > > > > > > #endif > > > > > > > > > > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem > > > > > > in case that "#ifdef CONFIG_64BIT" line is missing. > > > > > > > > > > > > Please note, this has been like this since the first version of > > > > > > mseal() RFC patch, and I prefer to keep it this way. > > > > > > > > > > That logic is reasonable. But it's different from the _vast_ majority of > > > > > other flags. > > > > > > > > > > So what justifies VM_SEALED being so different? It's leading to pretty > > > > > objectively ugly code in this series. > > > > > > > > Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define > > > > for "should we perform system mapping sealing?" is intentionally separate > > > > here, so that it can be Kconfig and per-arch toggled, etc. > > > > > > > > > > Considering Dave is the second person that did not find the huge commit > > > message helpful, can we please limit the commit message to be about the > > > actual code and not the entire series? > > > > > > I thought we said that it was worth while making this change in v5? > > > > > I include the cover letter's content in the first commit message to > > clearly communicate the purpose of the entire patch series, saving > > maintainers' time when accepting the patch. > > Having more text than patch for such a patch seems unreasonable. I'd > find it more acceptable if it were a complicated race condition, but > everyone is getting lost in the summary. > I will remove the cover letter from the first patch then. > > > > Should I still include that, and add what the first patch does, and > > separate it from the cover letter with "----"? What do you think? > > Here is my v5 answer, I think it was clear about not putting the entire > summary into the first patch. > Thanks. > [1]. https://lore.kernel.org/all/ml3x5qchmnehdzz2rxsdcdghivaqffojiweuhvpvzw45u3l5bh@23sblrom3m36/ > > Thanks, > Liam
On Mon, Feb 24, 2025 at 11:26 AM Kees Cook <kees@kernel.org> wrote: > > On Mon, Feb 24, 2025 at 02:10:58PM -0500, Liam R. Howlett wrote: > > * Kees Cook <kees@kernel.org> [250224 13:55]: > > > On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote: > > > > On 2/24/25 10:44, Jeff Xu wrote: > > > > > For example: > > > > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c, > > > > > > > > > > #ifdef CONFIG_64BIT > > > > > [ilog2(VM_SEALED)] = "sl", > > > > > #endif > > > > > > > > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem > > > > > in case that "#ifdef CONFIG_64BIT" line is missing. > > > > > > > > > > Please note, this has been like this since the first version of > > > > > mseal() RFC patch, and I prefer to keep it this way. > > > > > > > > That logic is reasonable. But it's different from the _vast_ majority of > > > > other flags. > > > > > > > > So what justifies VM_SEALED being so different? It's leading to pretty > > > > objectively ugly code in this series. > > > > > > Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define > > > for "should we perform system mapping sealing?" is intentionally separate > > > here, so that it can be Kconfig and per-arch toggled, etc. > > > > > > > Considering Dave is the second person that did not find the huge commit > > message helpful, can we please limit the commit message to be about the > > actual code and not the entire series? > > > > I thought we said that it was worth while making this change in v5? > > Right, please minimize patch #1's commit log to just what it is doing, > etc, and leave the rest of the rationale in the 0/N cover letter. > Sure. > -- > Kees Cook
On Mon, Feb 24, 2025 at 11:18 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote: > > Also, do we need something like the above test for VM_SEALED_SYS? > Do you mean adding selftest for sealing vdso ? or test the VM_SEALED_SYS macro in 32 bit vs 64 bits ? CONFIG_MSEAL_SYSTEM_MAPPINGS is by default disabled. I'm not sure about the common practice here. I also don't know how to detect CONFIG_MSEAL_SYSTEM_MAPPINGS from selftest, that is needed for the selftest. -Jeff > Thanks, > Liam >
On Mon, Feb 24, 2025 at 11:25 AM Kees Cook <kees@kernel.org> wrote: > > On Mon, Feb 24, 2025 at 11:10:22AM -0800, Jeff Xu wrote: > > On Mon, Feb 24, 2025 at 11:03 AM Liam R. Howlett > > <Liam.Howlett@oracle.com> wrote: > > > > > > * Jeff Xu <jeffxu@chromium.org> [250224 13:44]: > > > > On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote: > > > > > > > > > > On 2/24/25 09:45, jeffxu@chromium.org wrote: > > > > > > +/* > > > > > > + * mseal of userspace process's system mappings. > > > > > > + */ > > > > > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS > > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED > > > > > > +#else > > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE > > > > > > +#endif > > > > > > > > > > This ends up looking pretty wonky in practice: > > > > > > > > > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP; > > > > > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG; > > > > > > > > > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the > > > > > other ones. > > > > > > > > > > Would it really hurt to have > > > > > > > > > > #ifdef CONFIG_64BIT > > > > > /* VM is sealed, in vm_flags */ > > > > > #define VM_SEALED _BITUL(63) > > > > > +#else > > > > > +#define VM_SEALED VM_NONE > > > > > #endif > > > > > > > > > > ? > > > > > > > > > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of > > > > the build. This is intentional. Any 32-bit code trying to use the > > > > sealing function or the VM_SEALED flag will immediately fail > > > > compilation. This makes it easier to identify incorrect usage. > > > > > > > > > > The reason that two #defines are needed is because you can have mseal > > > enabled while not sealing system mappings, so for this to be clean we > > > need two defines. > > > > > > However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion. > > > Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED". > > > > > How about MSEAL_SYSTME_MAPPINGS as Kees suggested ? > > > > The VM_SYSTEM_SEALED doesn't have the MSEAL key or the MAPPING, so it > > might take longer for the new reader to understand what it is. > > I think to address Dave's concern, it should start with "VM_". We use > "SEAL" already with VM_SEALED, so the "M" in "MSEAL" may be redundant, > and to clarify the system mapping, how avout VM_SEAL_SYSMAP ? That > capture's, hopefully, Dave, Liam, and your thoughts about the naming? > Liam had a comment in the previous version, asking everything related with mseal() to have the MSEAL keyword, that is why KCONFIG change has MSEAL. How about VM_MSEAL_SYSMAP ? -Jeff > -- > Kees Cook
* Jeff Xu <jeffxu@chromium.org> [250224 14:42]: > On Mon, Feb 24, 2025 at 11:25 AM Kees Cook <kees@kernel.org> wrote: > > > > On Mon, Feb 24, 2025 at 11:10:22AM -0800, Jeff Xu wrote: > > > On Mon, Feb 24, 2025 at 11:03 AM Liam R. Howlett > > > <Liam.Howlett@oracle.com> wrote: > > > > > > > > * Jeff Xu <jeffxu@chromium.org> [250224 13:44]: > > > > > On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote: > > > > > > > > > > > > On 2/24/25 09:45, jeffxu@chromium.org wrote: > > > > > > > +/* > > > > > > > + * mseal of userspace process's system mappings. > > > > > > > + */ > > > > > > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS > > > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED > > > > > > > +#else > > > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE > > > > > > > +#endif > > > > > > > > > > > > This ends up looking pretty wonky in practice: > > > > > > > > > > > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP; > > > > > > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG; > > > > > > > > > > > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the > > > > > > other ones. > > > > > > > > > > > > Would it really hurt to have > > > > > > > > > > > > #ifdef CONFIG_64BIT > > > > > > /* VM is sealed, in vm_flags */ > > > > > > #define VM_SEALED _BITUL(63) > > > > > > +#else > > > > > > +#define VM_SEALED VM_NONE > > > > > > #endif > > > > > > > > > > > > ? > > > > > > > > > > > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of > > > > > the build. This is intentional. Any 32-bit code trying to use the > > > > > sealing function or the VM_SEALED flag will immediately fail > > > > > compilation. This makes it easier to identify incorrect usage. > > > > > > > > > > > > > The reason that two #defines are needed is because you can have mseal > > > > enabled while not sealing system mappings, so for this to be clean we > > > > need two defines. > > > > > > > > However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion. > > > > Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED". > > > > > > > How about MSEAL_SYSTME_MAPPINGS as Kees suggested ? > > > > > > The VM_SYSTEM_SEALED doesn't have the MSEAL key or the MAPPING, so it > > > might take longer for the new reader to understand what it is. > > > > I think to address Dave's concern, it should start with "VM_". We use > > "SEAL" already with VM_SEALED, so the "M" in "MSEAL" may be redundant, > > and to clarify the system mapping, how avout VM_SEAL_SYSMAP ? That > > capture's, hopefully, Dave, Liam, and your thoughts about the naming? > > > Liam had a comment in the previous version, asking everything related > with mseal() to have the MSEAL keyword, that is why KCONFIG change has > MSEAL. > > How about VM_MSEAL_SYSMAP ? I don't recall if it was this set or previous ones, but seal is becoming overloaded and having mseal would have been good for this one. VM_MSEAL does gain more greping, but since we have VM_SEALED, VM_SEAL_SYSMAP is going to show up on VM_SEAL grep, but not VM_SEALED greps. Maybe VM_SEALED_SYSMAP would be better for searching. Thanks, Liam
* Jeff Xu <jeffxu@chromium.org> [250224 14:40]: > On Mon, Feb 24, 2025 at 11:18 AM Liam R. Howlett > <Liam.Howlett@oracle.com> wrote: > > > > Also, do we need something like the above test for VM_SEALED_SYS? > > > Do you mean adding selftest for sealing vdso ? or test the > VM_SEALED_SYS macro in 32 bit vs 64 bits ? > > CONFIG_MSEAL_SYSTEM_MAPPINGS is by default disabled. I'm not sure > about the common practice here. > > I also don't know how to detect CONFIG_MSEAL_SYSTEM_MAPPINGS from > selftest, that is needed for the selftest. Looking more into this, I think it's fine without anything in there changing. Thanks, Liam
On Mon, Feb 24, 2025 at 12:13 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote: > > * Jeff Xu <jeffxu@chromium.org> [250224 14:42]: > > On Mon, Feb 24, 2025 at 11:25 AM Kees Cook <kees@kernel.org> wrote: > > > > > > On Mon, Feb 24, 2025 at 11:10:22AM -0800, Jeff Xu wrote: > > > > On Mon, Feb 24, 2025 at 11:03 AM Liam R. Howlett > > > > <Liam.Howlett@oracle.com> wrote: > > > > > > > > > > * Jeff Xu <jeffxu@chromium.org> [250224 13:44]: > > > > > > On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote: > > > > > > > > > > > > > > On 2/24/25 09:45, jeffxu@chromium.org wrote: > > > > > > > > +/* > > > > > > > > + * mseal of userspace process's system mappings. > > > > > > > > + */ > > > > > > > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS > > > > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED > > > > > > > > +#else > > > > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE > > > > > > > > +#endif > > > > > > > > > > > > > > This ends up looking pretty wonky in practice: > > > > > > > > > > > > > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP; > > > > > > > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG; > > > > > > > > > > > > > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the > > > > > > > other ones. > > > > > > > > > > > > > > Would it really hurt to have > > > > > > > > > > > > > > #ifdef CONFIG_64BIT > > > > > > > /* VM is sealed, in vm_flags */ > > > > > > > #define VM_SEALED _BITUL(63) > > > > > > > +#else > > > > > > > +#define VM_SEALED VM_NONE > > > > > > > #endif > > > > > > > > > > > > > > ? > > > > > > > > > > > > > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of > > > > > > the build. This is intentional. Any 32-bit code trying to use the > > > > > > sealing function or the VM_SEALED flag will immediately fail > > > > > > compilation. This makes it easier to identify incorrect usage. > > > > > > > > > > > > > > > > The reason that two #defines are needed is because you can have mseal > > > > > enabled while not sealing system mappings, so for this to be clean we > > > > > need two defines. > > > > > > > > > > However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion. > > > > > Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED". > > > > > > > > > How about MSEAL_SYSTME_MAPPINGS as Kees suggested ? > > > > > > > > The VM_SYSTEM_SEALED doesn't have the MSEAL key or the MAPPING, so it > > > > might take longer for the new reader to understand what it is. > > > > > > I think to address Dave's concern, it should start with "VM_". We use > > > "SEAL" already with VM_SEALED, so the "M" in "MSEAL" may be redundant, > > > and to clarify the system mapping, how avout VM_SEAL_SYSMAP ? That > > > capture's, hopefully, Dave, Liam, and your thoughts about the naming? > > > > > Liam had a comment in the previous version, asking everything related > > with mseal() to have the MSEAL keyword, that is why KCONFIG change has > > MSEAL. > > > > How about VM_MSEAL_SYSMAP ? > > I don't recall if it was this set or previous ones, but seal is becoming > overloaded and having mseal would have been good for this one. > > VM_MSEAL does gain more greping, but since we have VM_SEALED, > VM_SEAL_SYSMAP is going to show up on VM_SEAL grep, but not VM_SEALED > greps. Maybe VM_SEALED_SYSMAP would be better for searching. > OK, I will change to VM_SEALED_SYSMAP -Jeff > Thanks, > Liam > >
diff --git a/include/linux/mm.h b/include/linux/mm.h index 7b1068ddcbb7..0e2a4a45d245 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -4155,4 +4155,14 @@ int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *st int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status); int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status); + +/* + * mseal of userspace process's system mappings. + */ +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED +#else +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE +#endif + #endif /* _LINUX_MM_H */ diff --git a/init/Kconfig b/init/Kconfig index d0d021b3fa3b..07435e33f965 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1882,6 +1882,24 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS config ARCH_HAS_MEMBARRIER_SYNC_CORE bool +config ARCH_HAS_MSEAL_SYSTEM_MAPPINGS + bool + help + Control MSEAL_SYSTEM_MAPPINGS access based on architecture. + + A 64-bit kernel is required for the memory sealing feature. + No specific hardware features from the CPU are needed. + + To enable this feature, the architecture needs to update their + special mappings calls to include the sealing flag and confirm + that it doesn't unmap/remap system mappings during the life + time of the process. After the architecture enables this, a + distribution can set CONFIG_MSEAL_SYSTEM_MAPPING to manage access + to the feature. + + For complete descriptions of memory sealing, please see + Documentation/userspace-api/mseal.rst + config HAVE_PERF_EVENTS bool help diff --git a/security/Kconfig b/security/Kconfig index f10dbf15c294..15a86a952910 100644 --- a/security/Kconfig +++ b/security/Kconfig @@ -51,6 +51,24 @@ config PROC_MEM_NO_FORCE endchoice +config MSEAL_SYSTEM_MAPPINGS + bool "mseal system mappings" + depends on 64BIT + depends on ARCH_HAS_MSEAL_SYSTEM_MAPPINGS + depends on !CHECKPOINT_RESTORE + help + Seal system mappings such as vdso, vvar, sigpage, uprobes, etc. + + A 64-bit kernel is required for the memory sealing feature. + No specific hardware features from the CPU are needed. + + Note: CHECKPOINT_RESTORE, UML, gVisor, rr are known to relocate or + unmap system mapping, therefore this config can't be enabled + universally. + + For complete descriptions of memory sealing, please see + Documentation/userspace-api/mseal.rst + config SECURITY bool "Enable different security models" depends on SYSFS