| Message ID | 20200709040316.12789-3-cmr@informatik.wtf (mailing list archive) |
|---|---|
| State | New, archived |
| Series | Use per-CPU temporary mappings for patching |
Hi "Christopher, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on powerpc/next] [also build test WARNING on char-misc/char-misc-testing v5.8-rc5 next-20200716] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Christopher-M-Riedl/Use-per-CPU-temporary-mappings-for-patching/20200709-123827 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc-randconfig-r013-20200717 (attached as .config) compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project ed6b578040a85977026c93bf4188f996148f3218) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install powerpc cross compiling tool for clang build # apt-get install binutils-powerpc-linux-gnu # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): >> arch/powerpc/lib/code-patching.c:53:13: warning: no previous prototype for function 'poking_init' [-Wmissing-prototypes] void __init poking_init(void) ^ arch/powerpc/lib/code-patching.c:53:1: note: declare 'static' if the function is not intended to be used outside of this translation unit void __init poking_init(void) ^ static 1 warning generated. vim +/poking_init +53 arch/powerpc/lib/code-patching.c 52 > 53 void __init poking_init(void) 54 { 55 spinlock_t *ptl; /* for protecting pte table */ 56 pte_t *ptep; 57 58 /* 59 * Some parts of the kernel (static keys for example) depend on 60 * successful code patching. Code patching under STRICT_KERNEL_RWX 61 * requires this setup - otherwise we cannot patch at all. We use 62 * BUG_ON() here and later since an early failure is preferred to 63 * buggy behavior and/or strange crashes later. 64 */ 65 patching_mm = copy_init_mm(); 66 BUG_ON(!patching_mm); 67 68 /* 69 * In hash we cannot go above DEFAULT_MAP_WINDOW easily. 70 * XXX: Do we want additional bits of entropy for radix? 71 */ 72 patching_addr = (get_random_long() & PAGE_MASK) % 73 (DEFAULT_MAP_WINDOW - PAGE_SIZE); 74 75 ptep = get_locked_pte(patching_mm, patching_addr, &ptl); 76 BUG_ON(!ptep); 77 pte_unmap_unlock(ptep, ptl); 78 } 79 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
"Christopher M. Riedl" <cmr@informatik.wtf> writes: > When code patching a STRICT_KERNEL_RWX kernel the page containing the > address to be patched is temporarily mapped with permissive memory > protections. Currently, a per-cpu vmalloc patch area is used for this > purpose. While the patch area is per-cpu, the temporary page mapping is > inserted into the kernel page tables for the duration of the patching. > The mapping is exposed to CPUs other than the patching CPU - this is > undesirable from a hardening perspective. > > Use the `poking_init` init hook to prepare a temporary mm and patching > address. Initialize the temporary mm by copying the init mm. Choose a > randomized patching address inside the temporary mm userspace address > portion. The next patch uses the temporary mm and patching address for > code patching. > > Based on x86 implementation: > > commit 4fc19708b165 > ("x86/alternatives: Initialize temporary mm for patching") > > Signed-off-by: Christopher M. Riedl <cmr@informatik.wtf> > --- > arch/powerpc/lib/code-patching.c | 33 ++++++++++++++++++++++++++++++++ > 1 file changed, 33 insertions(+) > > diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c > index 0a051dfeb177..8ae1a9e5fe6e 100644 > --- a/arch/powerpc/lib/code-patching.c > +++ b/arch/powerpc/lib/code-patching.c > @@ -11,6 +11,8 @@ > #include <linux/cpuhotplug.h> > #include <linux/slab.h> > #include <linux/uaccess.h> > +#include <linux/sched/task.h> > +#include <linux/random.h> > > #include <asm/tlbflush.h> > #include <asm/page.h> > @@ -44,6 +46,37 @@ int raw_patch_instruction(struct ppc_inst *addr, struct ppc_inst instr) > } > > #ifdef CONFIG_STRICT_KERNEL_RWX > + > +static struct mm_struct *patching_mm __ro_after_init; > +static unsigned long patching_addr __ro_after_init; > + > +void __init poking_init(void) > +{ > + spinlock_t *ptl; /* for protecting pte table */ > + pte_t *ptep; > + > + /* > + * Some parts of the kernel (static keys for example) depend on > + * successful code patching. Code patching under STRICT_KERNEL_RWX > + * requires this setup - otherwise we cannot patch at all. We use > + * BUG_ON() here and later since an early failure is preferred to > + * buggy behavior and/or strange crashes later. > + */ > + patching_mm = copy_init_mm(); > + BUG_ON(!patching_mm); > + > + /* > + * In hash we cannot go above DEFAULT_MAP_WINDOW easily. > + * XXX: Do we want additional bits of entropy for radix? > + */ > + patching_addr = (get_random_long() & PAGE_MASK) % > + (DEFAULT_MAP_WINDOW - PAGE_SIZE); It took me a while to understand this calculation. I see that it's calculating a base address for a page in which to do patching. It does the following: - get a random long - mask with PAGE_MASK so as to get a page aligned value - make sure that the base address is at least one PAGE_SIZE below DEFAULT_MAP_WINDOW so we have a clear page between the base and DEFAULT_MAP_WINDOW. On 64-bit Book3S with 64K pages, that works out to be PAGE_SIZE = 0x0000 0000 0001 0000 PAGE_MASK = 0xFFFF FFFF FFFF 0000 DEFAULT_MAP_WINDOW = DEFAULT_MAP_WINDOW_USER64 = TASK_SIZE_128TB = 0x0000_8000_0000_0000 DEFAULT_MAP_WINDOW - PAGE_SIZE = 0x0000 7FFF FFFF 0000 It took a while (and a conversation with my wife who studied pure maths!) but I am convinced that the modulo preserves the page-alignement of the patching address. One thing I did realise is that patching_addr can be zero at the end of this process. 
That seems dubious and slightly error-prone to me - is the patching process robust to that or should we exclude it? Anyway, if I have the maths right, that there are 0x7fffffff or ~2 billion possible locations for the patching page, which is just shy of 31 bits of entropy. I think this compares pretty favourably to most (K)ASLR implementations? What's the range if built with 4k pages? Kind regards, Daniel > + > + ptep = get_locked_pte(patching_mm, patching_addr, &ptl); > + BUG_ON(!ptep); > + pte_unmap_unlock(ptep, ptl); > +} > + > static DEFINE_PER_CPU(struct vm_struct *, text_poke_area); > > static int text_area_cpu_up(unsigned int cpu) > -- > 2.27.0
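Daniel's alignment argument can be checked mechanically: the modulus (DEFAULT_MAP_WINDOW - PAGE_SIZE) is itself a multiple of PAGE_SIZE, and a multiple of PAGE_SIZE reduced modulo another multiple of PAGE_SIZE stays a multiple of PAGE_SIZE. A small userspace sketch (not part of the patch; the constants mirror the 64K-page figures quoted above, and rand() stands in for the kernel's get_random_long()):

```c
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SIZE          0x10000UL            /* 64K */
#define PAGE_MASK          (~(PAGE_SIZE - 1))
#define DEFAULT_MAP_WINDOW 0x0000800000000000UL /* TASK_SIZE_128TB */

int main(void)
{
	unsigned long mod = DEFAULT_MAP_WINDOW - PAGE_SIZE;

	for (int i = 0; i < 1000000; i++) {
		/* stand-in for the kernel's get_random_long() */
		unsigned long r = ((unsigned long)rand() << 32) |
				  (unsigned long)rand();
		unsigned long addr = (r & PAGE_MASK) % mod;

		/* page-aligned % (multiple of PAGE_SIZE) stays aligned */
		if (addr & ~PAGE_MASK) {
			printf("not page-aligned: %#lx\n", addr);
			return 1;
		}
	}
	printf("OK: %#lx possible page locations\n", mod / PAGE_SIZE);
	return 0;
}
```

The final printf reproduces Daniel's count: 0x7FFFFFFF0000 / 0x10000 = 0x7fffffff possible pages, i.e. just shy of 31 bits.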
On Thu Aug 6, 2020 at 8:24 AM CDT, Daniel Axtens wrote:
> "Christopher M. Riedl" <cmr@informatik.wtf> writes:
>
> > When code patching a STRICT_KERNEL_RWX kernel the page containing the
> > address to be patched is temporarily mapped with permissive memory
> > protections. Currently, a per-cpu vmalloc patch area is used for this
> > purpose. While the patch area is per-cpu, the temporary page mapping is
> > inserted into the kernel page tables for the duration of the patching.
> > The mapping is exposed to CPUs other than the patching CPU - this is
> > undesirable from a hardening perspective.
> >
> > Use the `poking_init` init hook to prepare a temporary mm and patching
> > address. Initialize the temporary mm by copying the init mm. Choose a
> > randomized patching address inside the temporary mm userspace address
> > portion. The next patch uses the temporary mm and patching address for
> > code patching.
> >
> > Based on x86 implementation:
> >
> > commit 4fc19708b165
> > ("x86/alternatives: Initialize temporary mm for patching")
> >
> > Signed-off-by: Christopher M. Riedl <cmr@informatik.wtf>
> > ---
> >  arch/powerpc/lib/code-patching.c | 33 ++++++++++++++++++++++++++++++++
> >  1 file changed, 33 insertions(+)
> >
> > diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> > index 0a051dfeb177..8ae1a9e5fe6e 100644
> > --- a/arch/powerpc/lib/code-patching.c
> > +++ b/arch/powerpc/lib/code-patching.c
> > @@ -11,6 +11,8 @@
> >  #include <linux/cpuhotplug.h>
> >  #include <linux/slab.h>
> >  #include <linux/uaccess.h>
> > +#include <linux/sched/task.h>
> > +#include <linux/random.h>
> >
> >  #include <asm/tlbflush.h>
> >  #include <asm/page.h>
> > @@ -44,6 +46,37 @@ int raw_patch_instruction(struct ppc_inst *addr, struct ppc_inst instr)
> >  }
> >
> >  #ifdef CONFIG_STRICT_KERNEL_RWX
> > +
> > +static struct mm_struct *patching_mm __ro_after_init;
> > +static unsigned long patching_addr __ro_after_init;
> > +
> > +void __init poking_init(void)
> > +{
> > +	spinlock_t *ptl; /* for protecting pte table */
> > +	pte_t *ptep;
> > +
> > +	/*
> > +	 * Some parts of the kernel (static keys for example) depend on
> > +	 * successful code patching. Code patching under STRICT_KERNEL_RWX
> > +	 * requires this setup - otherwise we cannot patch at all. We use
> > +	 * BUG_ON() here and later since an early failure is preferred to
> > +	 * buggy behavior and/or strange crashes later.
> > +	 */
> > +	patching_mm = copy_init_mm();
> > +	BUG_ON(!patching_mm);
> > +
> > +	/*
> > +	 * In hash we cannot go above DEFAULT_MAP_WINDOW easily.
> > +	 * XXX: Do we want additional bits of entropy for radix?
> > +	 */
> > +	patching_addr = (get_random_long() & PAGE_MASK) %
> > +			(DEFAULT_MAP_WINDOW - PAGE_SIZE);
>
> It took me a while to understand this calculation. I see that it's
> calculating a base address for a page in which to do patching. It does
> the following:

I will add a comment explaining the calculation in the next spin.

> - get a random long
>
> - mask with PAGE_MASK so as to get a page-aligned value
>
> - make sure that the base address is at least one PAGE_SIZE below
>   DEFAULT_MAP_WINDOW so we have a clear page between the base and
>   DEFAULT_MAP_WINDOW.
>
> On 64-bit Book3S with 64K pages, that works out to be:
>
> PAGE_SIZE = 0x0000 0000 0001 0000
> PAGE_MASK = 0xFFFF FFFF FFFF 0000
>
> DEFAULT_MAP_WINDOW = DEFAULT_MAP_WINDOW_USER64 = TASK_SIZE_128TB
>                    = 0x0000_8000_0000_0000
>
> DEFAULT_MAP_WINDOW - PAGE_SIZE = 0x0000 7FFF FFFF 0000
>
> It took a while (and a conversation with my wife who studied pure
> maths!) but I am convinced that the modulo preserves the page alignment
> of the patching address.

I am glad a proper mathematician agrees because my maths are decidedly
unpure :)

> One thing I did realise is that patching_addr can be zero at the end of
> this process. That seems dubious and slightly error-prone to me - is
> the patching process robust to that or should we exclude it?

Good catch! I will fix this in the next spin.

> Anyway, if I have the maths right, there are 0x7fffffff or ~2 billion
> possible locations for the patching page, which is just shy of 31 bits
> of entropy.
>
> I think this compares pretty favourably to most (K)ASLR implementations?

I will stress that I am not an expert here, but it looks like this
compares favorably against other 64b ASLR [0].

[0]: https://www.cs.ucdavis.edu/~peisert/research/2017-SecDev-AnalysisASLR.pdf

> What's the range if built with 4k pages?

Using the formula from my series coverletter, we should expect 34 bits
of entropy since DEFAULT_MAP_WINDOW_USER64 is 64TB for 4K pages:

bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)

PAGE_SIZE=4K, DEFAULT_MAP_WINDOW_USER64=64TB

bits of entropy = log2(64TB / 4K)
bits of entropy = 34

> Kind regards,
> Daniel
>
> > +
> > +	ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
> > +	BUG_ON(!ptep);
> > +	pte_unmap_unlock(ptep, ptl);
> > +}
> > +
> >  static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
> >
> >  static int text_area_cpu_up(unsigned int cpu)
> > --
> > 2.27.0
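The entropy figures in this exchange can be reproduced with a quick userspace check. The 128TB and 64TB window sizes below are the values quoted in the thread; since every quantity is a power of two, log2 reduces to a trailing-zero count:

```c
#include <stdio.h>

int main(void)
{
	/* Window sizes quoted in the thread (both powers of two). */
	unsigned long window_64k = 0x0000800000000000UL; /* 128TB */
	unsigned long window_4k  = 0x0000400000000000UL; /* 64TB  */

	/* log2 of a power of two == number of trailing zero bits */
	printf("64K pages: %d bits of entropy\n",
	       __builtin_ctzl(window_64k) - __builtin_ctzl(0x10000UL));
	printf("4K pages:  %d bits of entropy\n",
	       __builtin_ctzl(window_4k) - __builtin_ctzl(0x1000UL));
	return 0;
}
```

This prints 31 and 34, matching both messages; the 64K case is "just shy of" 31 bits in Daniel's phrasing because excluding the top page leaves 2^31 - 1 locations rather than 2^31.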
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 0a051dfeb177..8ae1a9e5fe6e 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -11,6 +11,8 @@
 #include <linux/cpuhotplug.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
+#include <linux/sched/task.h>
+#include <linux/random.h>
 
 #include <asm/tlbflush.h>
 #include <asm/page.h>
@@ -44,6 +46,37 @@ int raw_patch_instruction(struct ppc_inst *addr, struct ppc_inst instr)
 }
 
 #ifdef CONFIG_STRICT_KERNEL_RWX
+
+static struct mm_struct *patching_mm __ro_after_init;
+static unsigned long patching_addr __ro_after_init;
+
+void __init poking_init(void)
+{
+	spinlock_t *ptl; /* for protecting pte table */
+	pte_t *ptep;
+
+	/*
+	 * Some parts of the kernel (static keys for example) depend on
+	 * successful code patching. Code patching under STRICT_KERNEL_RWX
+	 * requires this setup - otherwise we cannot patch at all. We use
+	 * BUG_ON() here and later since an early failure is preferred to
+	 * buggy behavior and/or strange crashes later.
+	 */
+	patching_mm = copy_init_mm();
+	BUG_ON(!patching_mm);
+
+	/*
+	 * In hash we cannot go above DEFAULT_MAP_WINDOW easily.
+	 * XXX: Do we want additional bits of entropy for radix?
+	 */
+	patching_addr = (get_random_long() & PAGE_MASK) %
+			(DEFAULT_MAP_WINDOW - PAGE_SIZE);
+
+	ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
+	BUG_ON(!ptep);
+	pte_unmap_unlock(ptep, ptl);
+}
+
 static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
 
 static int text_area_cpu_up(unsigned int cpu)
When code patching a STRICT_KERNEL_RWX kernel the page containing the
address to be patched is temporarily mapped with permissive memory
protections. Currently, a per-cpu vmalloc patch area is used for this
purpose. While the patch area is per-cpu, the temporary page mapping is
inserted into the kernel page tables for the duration of the patching.
The mapping is exposed to CPUs other than the patching CPU - this is
undesirable from a hardening perspective.

Use the `poking_init` init hook to prepare a temporary mm and patching
address. Initialize the temporary mm by copying the init mm. Choose a
randomized patching address inside the temporary mm userspace address
portion. The next patch uses the temporary mm and patching address for
code patching.

Based on x86 implementation:

commit 4fc19708b165
("x86/alternatives: Initialize temporary mm for patching")

Signed-off-by: Christopher M. Riedl <cmr@informatik.wtf>
---
 arch/powerpc/lib/code-patching.c | 33 ++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)
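One loose end from the review is Daniel's observation that patching_addr can come out as zero, which Christopher agreed to fix in the next spin. A hedged sketch of one way the respin could exclude zero (an illustration of the idea only, not the actual follow-up code): shift the window up by one page, keeping the modulus a multiple of PAGE_SIZE so page alignment is preserved:

```c
/*
 * Sketch of one possible fix for the zero-address case discussed in
 * the review; NOT the code from the respun series. The modulus
 * (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE) remains a multiple of
 * PAGE_SIZE, so alignment is preserved, and adding PAGE_SIZE puts the
 * result in [PAGE_SIZE, DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE], which is
 * both nonzero and still a clear page below DEFAULT_MAP_WINDOW.
 */
patching_addr = PAGE_SIZE + ((get_random_long() & PAGE_MASK) %
			     (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE));
```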