Message ID | CAAseMr6Fae=4otZZX9HmO1YgehjsVmd+44G5RN7Wu8iF2+fruA@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Nov 16, 2016 at 10:39:38PM +0800, Gengjia Chen wrote: > Hello kees and everyone : Hi, > This email introduces two patches. As a general note, it would be better to use git send-email to send these patches as separate emails, in reply to a cover letter. That's more in keeping with usual Linux list style, and makes review and application of patches easier/possible. e.g. assuming you have two patches on a branch foo: $ git format-patch --cover-leter foo~2..foo $ ${EDITOR} 0000-* # write your cover letter comments $ git send-email --to=${whoever} --cc={who_else} *.patch See $(man git-send-email) and $(man git-config) for how to set that up. > Patch 1 introduce the write-rarely memory section for > those kernel objects which are read only mostly > but need to be written to sometimes. > Patch 2 introduce two helper functions (mark_wrdata_rw/ > mark_wrdata_ro) to make __write_rarely memory section > writable or unwritable. They play like the pax_open_kernel/ > pax_close_kernel functions in grsecurity patch.Right now > this only been implemented on arm32. I've mentioned this elsewhere, but I don't think {open,close}_kernel() works as an interface. Specifically, I don't believe it can be safely and efficiently implemented for arm with LPAE, nor arm64. To cater for those it may be possible to use a temporary RW mapping separate from the usual kernel mapping (e.g. in TTBR0 on arm64). It should be possible to have an API that can use either approach, if writers are suitable annotated. That all said, from the two patches it's not even clear how this would be used, as nothing is marked __write_rarely, nor are any writers updated. A demonstration would be helpful. It would also be worth getting these onto the relevant lists (per scripts/get_maintainer.pl), so as to get the relevant core and architecture maintainers involved as early as possible. Thanks, Mark.
On Wed, 2016-11-16 at 22:39 +0800, Gengjia Chen wrote: > Hello kees and everyone : > > This email introduces two patches. > > Patch 1 introduce the write-rarely memory section for > those kernel objects which are read only mostly > but need to be written to sometimes. > > Patch 2 introduce two helper functions (mark_wrdata_rw/ > mark_wrdata_ro) to make __write_rarely memory section > writable or unwritable. They play like the pax_open_kernel/ > pax_close_kernel functions in grsecurity patch.Right now > this only been implemented on arm32. Is this the way we want to go with this, or could it make more sense to have a special ELF section in the kernel that is allowed to write to write-rarely memory, and have the exception handler deal with writes coming from that code? That way it would be much harder to insert code into the kernel that calls your mark_wrdata_rw/mark_wrdata_ro functions.
On Wed, Nov 16, 2016 at 10:39:38PM +0800, Gengjia Chen wrote: > From: chengengjia <chengjia4574@gmail.com> > Date: Wed, 16 Nov 2016 21:46:49 +0800 > Subject: [PATCH] arch: introduce write-rarely memory section > > This introduces __write_rarely section as a way > to mark those memory that read only in most case, > but also need to be written to sometimes. > > Signed-off-by: chengengjia <chengjia4574@gmail.com> Minor nit, please use your full/real name for From: and signed-off-by: lines. thanks, greg k-h
On Wed, Nov 16, 2016 at 7:16 AM, Mark Rutland <mark.rutland@arm.com> wrote: > On Wed, Nov 16, 2016 at 10:39:38PM +0800, Gengjia Chen wrote: >> Patch 1 introduce the write-rarely memory section for >> those kernel objects which are read only mostly >> but need to be written to sometimes. >> Patch 2 introduce two helper functions (mark_wrdata_rw/ >> mark_wrdata_ro) to make __write_rarely memory section >> writable or unwritable. They play like the pax_open_kernel/ >> pax_close_kernel functions in grsecurity patch.Right now >> this only been implemented on arm32. Thanks Gengjia for working on this! As part of this, can you add some lkdtm (drivers/misc/lkdtm*.c) tests for these kinds of memory sections, so it's possible to test the feature? And as other have mentioned, finding some simple examples of where to use this primitive would be great. I think I saw a small one in grsecurity in drivers/cdrom/cdrom.c where cdo->generic_packet gets assigned. It's the only thing updated in the entire cdo object: - if (!cdo->generic_packet) - cdo->generic_packet = cdrom_dummy_generic_packet; + if (!cdo->generic_packet) { + pax_open_kernel(); + const_cast(cdo->generic_packet) = cdrom_dummy_generic_packet; + pax_close_kernel(); + } > I've mentioned this elsewhere, but I don't think {open,close}_kernel() > works as an interface. Specifically, I don't believe it can be safely > and efficiently implemented for arm with LPAE, nor arm64. I've seen similar objections from x86 maintainers. > To cater for those it may be possible to use a temporary RW mapping > separate from the usual kernel mapping (e.g. in TTBR0 on arm64). It > should be possible to have an API that can use either approach, if > writers are suitable annotated. Do you mean something besides a new fixmap? As for annotation, this was something that came up during Kernel Summit too. It'd be nice to have a type-checked function of some kind that could be used. Maybe something like this (using the cdrom example above): rare_write(cdo, generic_packet, cdrom_dummy_generic_packet); > That all said, from the two patches it's not even clear how this would > be used, as nothing is marked __write_rarely, nor are any writers > updated. A demonstration would be helpful. The primary user of this would be the constify plugin, which marks all structures that contain only function pointers as being read-only. It also takes annotations for other things that are found manually. > It would also be worth getting these onto the relevant lists (per > scripts/get_maintainer.pl), so as to get the relevant core and > architecture maintainers involved as early as possible. Yeah, once some more examples are seen, it should be clear how to best build the API. -Kees
On Thu, Nov 17, 2016 at 03:47:31PM -0800, Kees Cook wrote: > On Wed, Nov 16, 2016 at 7:16 AM, Mark Rutland <mark.rutland@arm.com> wrote: > > On Wed, Nov 16, 2016 at 10:39:38PM +0800, Gengjia Chen wrote: > + pax_open_kernel(); > + const_cast(cdo->generic_packet) = cdrom_dummy_generic_packet; > + pax_close_kernel(); > > I've mentioned this elsewhere, but I don't think {open,close}_kernel() > > works as an interface. Specifically, I don't believe it can be safely > > and efficiently implemented for arm with LPAE, nor arm64. > > I've seen similar objections from x86 maintainers. > > > To cater for those it may be possible to use a temporary RW mapping > > separate from the usual kernel mapping (e.g. in TTBR0 on arm64). It > > should be possible to have an API that can use either approach, if > > writers are suitable annotated. > > Do you mean something besides a new fixmap? Yes. I think i mentioned this in passing at LSS. For arm64, my idea was to have the RW mapping in its own mm, using the processes/idmap address area. That way it will only be mapped per-thread as required. Other architectures without something like domains should be able to do something similar. I believe the only additional part we need for that is something to convert a pointer to its RW alias. See below for an example. We already do both of these things on arm64 in other cases, e.g. for temporarily mapping EFI runtime services, switching TTBR1 safely. > As for annotation, this was something that came up during Kernel > Summit too. It'd be nice to have a type-checked function of some kind > that could be used. Maybe something like this (using the cdrom example > above): > > rare_write(cdo, generic_packet, cdrom_dummy_generic_packet); Something like that should be possible. Though I'd expect just var and val as arguments, e.g. rare_write(cdo->generic_packet, cdrom_dummy_generic_packet); With something like the below (assuming all helpers are inlined): #define rare_write(__var, __val) ({ \ typeof(var) *__rw_var; \ __some_typecheck_perhaps(__var, __val); \ \ __rw_var = __rare_rw_ptr(&(__var)); \ \ __rare_rw_map(); \ *__rw_var = (__val); \ __rare_rw_unmap(); \ \ __clobber_var_so_gcc_knows(__var); \ }) ... which should work in the case of a separate mapping, or finer-grained per-thread permission controls (e.g. ARM domains, x86 pkeys). ... and the trivial case is simple enough: #if !IS_ENABLED(WHATEVER_THIS_IS_CALLED) #define __rare_rw_ptr(__p) __p #define __rare_rw_map() #define __rare_rw_unmap() #endif Thanks, Mark.
On Fri, Nov 18, 2016 at 11:10:59AM +0000, Mark Rutland wrote: > With something like the below (assuming all helpers are inlined): > > #define rare_write(__var, __val) ({ \ > typeof(var) *__rw_var; \ ... iwith s/var/__var/ here... > __some_typecheck_perhaps(__var, __val); \ > \ > __rw_var = __rare_rw_ptr(&(__var)); \ > \ > __rare_rw_map(); \ > *__rw_var = (__val); \ > __rare_rw_unmap(); \ > \ > __clobber_var_so_gcc_knows(__var); \ > }) ... and with explicit clobber gone, since __rare_rw_unmap would presumably have to have a memory clobber anyway, just like the fixmap code. Thanks, Mark.
==== patch 1 ===== From 7ee1ed5a723723112167ca8603f4e2945d133b33 Mon Sep 17 00:00:00 2001 From: chengengjia <chengjia4574@gmail.com> Date: Wed, 16 Nov 2016 21:46:49 +0800 Subject: [PATCH] arch: introduce write-rarely memory section This introduces __write_rarely section as a way to mark those memory that read only in most case, but also need to be written to sometimes. Signed-off-by: chengengjia <chengjia4574@gmail.com> --- include/asm-generic/vmlinux.lds.h | 9 +++++++++ include/linux/cache.h | 9 +++++++++ 2 files changed, 18 insertions(+) diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 3074796..684bcf2 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -263,6 +263,14 @@ #endif /* + * Allow architectures to handle write_rarely data on their + * own by defining an empty WRITE_RARELY_DATA. + */ +#ifndef WRITE_RARELY_DATA +#define WRITE_RARELY_DATA *(.data..write_rarely) +#endif + +/* * Read only Data */ #define RO_DATA_SECTION(align) \ @@ -271,6 +279,7 @@ VMLINUX_SYMBOL(__start_rodata) = .; \ *(.rodata) *(.rodata.*) \ RO_AFTER_INIT_DATA /* Read only after init */ \ + WRITE_RARELY_DATA /* write rarely*/ \ *(__vermagic) /* Kernel version magic */ \ . = ALIGN(8); \ VMLINUX_SYMBOL(__start___tracepoints_ptrs) = .; \ diff --git a/include/linux/cache.h b/include/linux/cache.h index 1be04f8..d2c30cd 100644 --- a/include/linux/cache.h +++ b/include/linux/cache.h @@ -30,6 +30,15 @@ #define __ro_after_init __attribute__((__section__(".data..ro_after_init"))) #endif +/* + * __write_rarely is used to mark things that are read-only mostly but need to be + * written to sometimes. To write to __write_rarely variables, mark_wrdata_rw() + * need to be called, after write, mark_wrdata_ro() should be called. + */ +#ifndef __write_rarely +#define __write_rarely __attribute__((__section__(".data..write_rarely"))) +#endif + #ifndef ____cacheline_aligned #define ____cacheline_aligned __attribute__((__aligned__(SMP_CACHE_BYTES))) #endif -- 1.9.1 ==== patch 2 ===== From 0472f3f14532577a1963d517da8a0edfe1d99375 Mon Sep 17 00:00:00 2001 From: chengengjia <chengjia4574@gmail.com> Date: Wed, 16 Nov 2016 22:16:43 +0800 Subject: [PATCH] arch: arm: add mark_wrdata_rw/mark_wrdata_ro functions This introduces mark_wrdata_rw/mark_wrdata_ro to mark __write_rarely section writable/unwritable. Based on work by PaxTeam and Grsecurity. Signed-off-by: chengengjia <chengjia4574@gmail.com> --- arch/arm/include/asm/domain.h | 14 ++++++++++++++ arch/arm/include/asm/pgtable.h | 38 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 52 insertions(+) diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h index 99d9f63..01f18a5 100644 --- a/arch/arm/include/asm/domain.h +++ b/arch/arm/include/asm/domain.h @@ -54,6 +54,10 @@ #else #define DOMAIN_MANAGER 1 #endif +#ifdef CONFIG_DEBUG_RODATA +#define DOMAIN_OPENKERN 3 +#endif + #define domain_mask(dom) ((3) << (2 * (dom))) #define domain_val(dom,type) ((type) << (2 * (dom))) @@ -124,6 +128,16 @@ static inline void set_domain(unsigned val) set_domain(domain); \ } while (0) +#elif defined(CONFIG_DEBUG_RODATA) +#define modify_domain(dom,type) \ + do { \ + struct thread_info *thread = current_thread_info(); \ + unsigned int domain = get_domain(); \ + domain &= ~domain_mask(dom); \ + domain = domain | domain_val(dom, type); \ + thread->cpu_domain = domain; \ + set_domain(domain); \ + } while (0) #else static inline void modify_domain(unsigned dom, unsigned type) { } #endif diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h index a8d656d..5e6f553 100644 --- a/arch/arm/include/asm/pgtable.h +++ b/arch/arm/include/asm/pgtable.h @@ -56,6 +56,44 @@ #define pmd_ERROR(pmd) __pmd_error(__FILE__, __LINE__, pmd) #define pgd_ERROR(pgd) __pgd_error(__FILE__, __LINE__, pgd) +#ifdef CONFIG_DEBUG_RODATA +#include <asm/domain.h> +#include <linux/thread_info.h> +#include <linux/preempt.h> + +static inline int test_domain(int domain, int domaintype) +{ + return ((current_thread_info()->cpu_domain) & domain_val(domain, 3)) + == domain_val(domain, domaintype); +} + +static inline unsigned long mark_wrdata_rw(void) { +#ifdef CONFIG_ARM_LPAE + /* TODO */ +#else + preempt_disable(); + BUG_ON(test_domain(DOMAIN_KERNEL, DOMAIN_OPENKERN)); + modify_domain(DOMAIN_KERNEL, DOMAIN_OPENKERN); +#endif + return 0; +} + +static inline unsigned long mark_wrdata_ro(void) { +#ifdef CONFIG_ARM_LPAE + /* TODO */ +#else + BUG_ON(test_domain(DOMAIN_KERNEL, DOMAIN_MANAGER)); + modify_domain(DOMAIN_KERNEL, DOMAIN_MANAGER); + preempt_enable_no_resched(); +#endif + return 0; +} +#else +static inline unsigned long mark_wrdata_rw(void) { return 0; } +static inline unsigned long mark_wrdata_ro(void) { return 0; } +#endif + + /* * This is the lowest virtual address we can permit any user space