Message ID | 1397626297-23873-2-git-send-email-victor.kamensky@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 04/15, Victor Kamensky wrote: > > --- a/kernel/events/uprobes.c > +++ b/kernel/events/uprobes.c > @@ -1149,7 +1149,7 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area) > } > > ret = install_special_mapping(mm, area->vaddr, PAGE_SIZE, > - VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO, &area->page); > + VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO|VM_WRITE, &area->page); Yes, this is nasty. I would like to have a reason to nack this change ;) Unfortunately the current code is buggy too and we need to protect the kernel from malicious applications which can rewrite the insn we are going to step over in UTASK_SSTEP state anyway. > +void __weak arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, > + void *src, unsigned long len) > +{ > + /* > + * Note if CPU does not support instructions write snooping > + * from dcache it needs to define its own version of this > + * function that would take care of proper cache flushes. > + * > + * Nothing we can do if it fails, added if to make unused > + * result warning happy. If xol write failed because process > + * unmapped xol area by mistake, process will crash in some > + * other place. > + */ > + if (__copy_to_user((void *) vaddr, src, len)); > +} Plus, again, this can write to another mapping, say to file-backed memory. Finally, with this change it won't be possible to share this xol memory with other tasks. But it seems that it is pointless to argue. Oleg.
From: Oleg Nesterov <oleg@redhat.com> Date: Wed, 16 Apr 2014 16:51:07 +0200 > On 04/15, Victor Kamensky wrote: >> >> --- a/kernel/events/uprobes.c >> +++ b/kernel/events/uprobes.c >> @@ -1149,7 +1149,7 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area) >> } >> >> ret = install_special_mapping(mm, area->vaddr, PAGE_SIZE, >> - VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO, &area->page); >> + VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO|VM_WRITE, &area->page); > > Yes, this is nasty. > > I would like to have a reason to nack this change ;) Unfortunately the current > code is buggy too and we need to protect the kernel from malicious applications > which can rewrite the insn we are going to step over in UTASK_SSTEP state anyway. I think there may be a way to achieve your objectives. Pass MAP_SHARED into the flags argument of get_unmapped_area(), and pass the pfn of the xol page in as "pgoff". This will make the xol page get mapped into the user process at an address which is "D-cache congruent" to the kernel side mapping. So all kernel stores to the page will use the same D-cache line that user space accesses to it will. So we end up with all of the benefits of storing directly to userspace, along with what you're trying to achieve.
On 04/16, David Miller wrote: > > From: Oleg Nesterov <oleg@redhat.com> > Date: Wed, 16 Apr 2014 16:51:07 +0200 > > > On 04/15, Victor Kamensky wrote: > >> > >> --- a/kernel/events/uprobes.c > >> +++ b/kernel/events/uprobes.c > >> @@ -1149,7 +1149,7 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area) > >> } > >> > >> ret = install_special_mapping(mm, area->vaddr, PAGE_SIZE, > >> - VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO, &area->page); > >> + VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO|VM_WRITE, &area->page); > > > > Yes, this is nasty. > > > > I would like to have a reason to nack this change ;) Unfortunately the current > > code is buggy too and we need to protect the kernel from malicious applications > > which can rewrite the insn we are going to step over in UTASK_SSTEP state anyway. > > I think there may be a way to achieve your objectives. > > Pass MAP_SHARED into the flags argument of get_unmapped_area(), and > pass the pfn of the xol page in as "pgoff". > > This will make the xol page get mapped into the user process at an > address which is "D-cache congruent" to the kernel side mapping. > > So all kernel stores to the page will use the same D-cache line that > user space accesses to it will. Thanks... I didn't know.. But did you really mean get_unmapped_area(pgoff => page_to_pfn(area->page)) ? I simply can't understand how this can work, arm (and x86) really use it as "pgoff << PAGE_SHIFT" align_offset accounted in unmapped_area() ? _Perhaps_ this can help as a "random number" unique for every xol mapping ? Hmm, no, I don't understand this COLOUR_ALIGN() magic on arm, but unlikely this is true. Help! > So we end up with all of the benefits of storing directly to > userspace, along with what you're trying to achieve. And in this case we can avoid copy_to_user(), right ? Oleg.
From: Oleg Nesterov <oleg@redhat.com> Date: Wed, 16 Apr 2014 18:43:10 +0200 > But did you really mean get_unmapped_area(pgoff => page_to_pfn(area->page)) ? Yes. > I simply can't understand how this can work, arm (and x86) really use it as > "pgoff << PAGE_SHIFT" align_offset accounted in unmapped_area() ? When a platform has D-cache aliasing issues, it must make sure that every shared page is mapped to the same D-cache alias in all such shared mappings. The way this is done is to map each pgoff to a page in the D-cache. For example, pgoff 0 would be given a virtual address that maps to the first page in the D-cache, pgoff 1 to the second, and so forth. What we're doing with this get_unmapped_area() call is to make it so that userspace's virtual address will land at the same virtual alias as the kernel one does. So that stores on the kernel side will be seen properly, without any cache flushing, by the user side mapping. >> So we end up with all of the benefits of storing directly to >> userspace, along with what you're trying to achieve. > > And in this case we can avoid copy_to_user(), right ? Yes, that's the whole idea, you can forget about the VM_WRITE etc. that caused your concerns. It'd be just memcpy to kernel side mapping of the page + i-cache flush where necessary.
On 04/16, David Miller wrote: > > From: Oleg Nesterov <oleg@redhat.com> > Date: Wed, 16 Apr 2014 18:43:10 +0200 > > > But did you really mean get_unmapped_area(pgoff => page_to_pfn(area->page)) ? > > Yes. > > > I simply can't understand how this can work, arm (and x86) really use it as > > "pgoff << PAGE_SHIFT" align_offset accounted in unmapped_area() ? > > When a platform has D-cache aliasing issues, it must make sure that > every shared page is mapped to the same D-cache alias in all such > shared mappings. > > The way this is done is to map each pgoff to a page in the D-cache. > For example, pgoff 0 would be given a virtual address that maps to the > first page in the D-cache, pgoff 1 to the second, and so forth. > > What we're doing with this get_unmapped_area() call is to make it so > that userspace's virtual address will land at the same virtual alias > as the kernel one does. ^^^^^^^^^^^^^^^ and page_address() == xxx + (pfn << PAGE_SHIFT), I seem to understand... David, thanks a lot. I am not saying I fully understand this all, I'll try to reread your email tomorrow, but it seems that I see the light. The last question... area->page = alloc_page(GFP_HIGHUSER), and I am not sure that arch/arm/mm/highmem.c:kmap_atomic() can't break the aliasing, __fix_to_virt() in this case will use the same (per-cpu) idx. Looks like, __kunmap_atomic()->__cpuc_flush_dcache_area() should take care, but could you please ack/nack my understanding? Thanks! > >> So we end up with all of the benefits of storing directly to > >> userspace, along with what you're trying to achieve. > > > > And in this case we can avoid copy_to_user(), right ? > > Yes, that's the whole idea, you can forget about the VM_WRITE etc. > that caused your concerns. > > It'd be just memcpy to kernel side mapping of the page + i-cache flush > where necessary. Great. Victor, Russel, what do you think? It seems that this patch can be updated. Oleg.
From: Oleg Nesterov <oleg@redhat.com> Date: Wed, 16 Apr 2014 21:18:25 +0200 > The last question... area->page = alloc_page(GFP_HIGHUSER), and I am > not sure that arch/arm/mm/highmem.c:kmap_atomic() can't break the > aliasing, __fix_to_virt() in this case will use the same (per-cpu) idx. > > Looks like, __kunmap_atomic()->__cpuc_flush_dcache_area() should take > care, but could you please ack/nack my understanding? Good point, it might therefore make sense to use a low-mem page.
On Wed, Apr 16, 2014 at 09:18:25PM +0200, Oleg Nesterov wrote: > Looks like, __kunmap_atomic()->__cpuc_flush_dcache_area() should take > care, but could you please ack/nack my understanding? flush_dcache_area() doesn't touch the I-cache... the hint is in the name. :) This is also the function which is used for flush_dcache_page() which we've already established isn't sufficient (for the same reason.) Plus... we still would need to know the user address(es) to flush for the I-cache side...
It is too late for me to even try to read emails ;) perhaps I am totally confused. On 04/16, Russell King - ARM Linux wrote: > > On Wed, Apr 16, 2014 at 09:18:25PM +0200, Oleg Nesterov wrote: > > Looks like, __kunmap_atomic()->__cpuc_flush_dcache_area() should take > > care, but could you please ack/nack my understanding? > > flush_dcache_area() doesn't touch the I-cache... the hint is in the > name. :) This is also the function which is used for flush_dcache_page() > which we've already established isn't sufficient (for the same reason.) Yes, we still need "i-cache flush where necessary" as David pointed. > Plus... we still would need to know the user address(es) to flush for > the I-cache side... We know it, it is xol_vaddr in xol_get_insn_slot(). Oleg.
diff --git a/arch/arm/kernel/uprobes.c b/arch/arm/kernel/uprobes.c index f9bacee..4836e54 100644 --- a/arch/arm/kernel/uprobes.c +++ b/arch/arm/kernel/uprobes.c @@ -113,6 +113,14 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, return 0; } +void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, + void *src, unsigned long len) +{ + if (!__copy_to_user((void *) vaddr, src, len)) + flush_cache_user_range(vaddr, len); +} + + int arch_uprobe_pre_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) { struct uprobe_task *utask = current->utask; diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h index edff2b9..c52f827 100644 --- a/include/linux/uprobes.h +++ b/include/linux/uprobes.h @@ -32,6 +32,7 @@ struct vm_area_struct; struct mm_struct; struct inode; struct notifier_block; +struct page; #define UPROBE_HANDLER_REMOVE 1 #define UPROBE_HANDLER_MASK 1 @@ -127,6 +128,8 @@ extern int arch_uprobe_exception_notify(struct notifier_block *self, unsigned l extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs); extern unsigned long arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs); extern bool __weak arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs); +extern void __weak arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, + void *src, unsigned long len); #else /* !CONFIG_UPROBES */ struct uprobes_state { }; diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 04709b6..1038e57 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -1149,7 +1149,7 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area) } ret = install_special_mapping(mm, area->vaddr, PAGE_SIZE, - VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO, &area->page); + VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO|VM_WRITE, &area->page); if (ret) goto fail; @@ -1296,14 +1296,8 @@ static unsigned long xol_get_insn_slot(struct uprobe *uprobe) if (unlikely(!xol_vaddr)) return 0; - /* Initialize the slot */ - copy_to_page(area->page, xol_vaddr, - &uprobe->arch.ixol, sizeof(uprobe->arch.ixol)); - /* - * We probably need flush_icache_user_range() but it needs vma. - * This should work on supported architectures too. - */ - flush_dcache_page(area->page); + arch_uprobe_copy_ixol(area->page, xol_vaddr, + &uprobe->arch.ixol, sizeof(uprobe->arch.ixol)); return xol_vaddr; } @@ -1346,6 +1340,22 @@ static void xol_free_insn_slot(struct task_struct *tsk) } } +void __weak arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, + void *src, unsigned long len) +{ + /* + * Note if CPU does not support instructions write snooping + * from dcache it needs to define its own version of this + * function that would take care of proper cache flushes. + * + * Nothing we can do if it fails, added if to make unused + * result warning happy. If xol write failed because process + * unmapped xol area by mistake, process will crash in some + * other place. + */ + if (__copy_to_user((void *) vaddr, src, len)); +} + /** * uprobe_get_swbp_addr - compute address of swbp given post-swbp regs * @regs: Reflects the saved state of the task after it has hit a breakpoint
After instruction write into xol area, on ARM V7 architecture code need to flush dcache and icache to sync them up for given set of addresses. Having just 'flush_dcache_page(page)' call is not enough - it is possible to have stale instruction sitting in icache for given xol area slot address. Introduce arch_uprobe_ixol_copy weak function that by default calls __copy_to_user function, and that sufficient for CPUs that can snoop instruction writes from dcache. On ARM define new one that handles xol slot copy in ARM specific way. Arm implementation of arch_uprobe_ixol_copy function makes __copy_to_user call which does not have dcache aliasing issues and then flush_cache_user_range to push dcache out and invalidate corresponding icache entries. Note in order to write into uprobes xol area had to add VM_WRITE to xol area mapping. Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org> --- arch/arm/kernel/uprobes.c | 8 ++++++++ include/linux/uprobes.h | 3 +++ kernel/events/uprobes.c | 28 +++++++++++++++++++--------- 3 files changed, 30 insertions(+), 9 deletions(-)