Message ID | 1364235486-17738-3-git-send-email-will.deacon@arm.com (mailing list archive) |
---|---
State | New, archived |
On Mon, Mar 25, 2013 at 06:18:05PM +0000, Will Deacon wrote:
> do_cache_op finds the lowest VMA contained in the specified address
> range and rounds the range to cover only the mapped addresses.
>
> Since commit 4542b6a0fa6b ("ARM: 7365/1: drop unused parameter from
> flush_cache_user_range") the VMA is not used for anything else in this
> code and, seeing as the low-level cache flushing routines return -EFAULT
> if the address is not valid, there is no need for this range truncation.
>
> This patch removes the VMA handling code from the cache flushing syscall.
>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm/kernel/traps.c | 17 +----------------
>  1 file changed, 1 insertion(+), 16 deletions(-)
>
> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> index 1c08911..da5e268 100644
> --- a/arch/arm/kernel/traps.c
> +++ b/arch/arm/kernel/traps.c
> @@ -509,25 +509,10 @@ static int bad_syscall(int n, struct pt_regs *regs)
>  static inline int
>  do_cache_op(unsigned long start, unsigned long end, int flags)
>  {
> -	struct mm_struct *mm = current->active_mm;
> -	struct vm_area_struct *vma;
> -
>  	if (end < start || flags)
>  		return -EINVAL;
>
> -	down_read(&mm->mmap_sem);
> -	vma = find_vma(mm, start);
> -	if (vma && vma->vm_start < end) {
> -		if (start < vma->vm_start)
> -			start = vma->vm_start;
> -		if (end > vma->vm_end)
> -			end = vma->vm_end;
> -
> -		up_read(&mm->mmap_sem);
> -		return flush_cache_user_range(start, end);
> -	}
> -	up_read(&mm->mmap_sem);
> -	return -EINVAL;
> +	return flush_cache_user_range(start, end);

While this would work, it introduces the possibility of a DoS where an
application passes a bigger valid range (the kernel linear mapping) and
the kernel code would not be preempted (CONFIG_PREEMPT disabled). IIRC,
that's why Russell rejected such a patch a while back.
Hello,

On Wed, Mar 27, 2013 at 11:09:38AM +0000, Catalin Marinas wrote:
> On Mon, Mar 25, 2013 at 06:18:05PM +0000, Will Deacon wrote:
> > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> > index 1c08911..da5e268 100644
> > --- a/arch/arm/kernel/traps.c
> > +++ b/arch/arm/kernel/traps.c
> > @@ -509,25 +509,10 @@ static int bad_syscall(int n, struct pt_regs *regs)
> >  static inline int
> >  do_cache_op(unsigned long start, unsigned long end, int flags)
> >  {
> > -	struct mm_struct *mm = current->active_mm;
> > -	struct vm_area_struct *vma;
> > -
> >  	if (end < start || flags)
> >  		return -EINVAL;
> >
> > -	down_read(&mm->mmap_sem);
> > -	vma = find_vma(mm, start);
> > -	if (vma && vma->vm_start < end) {
> > -		if (start < vma->vm_start)
> > -			start = vma->vm_start;
> > -		if (end > vma->vm_end)
> > -			end = vma->vm_end;
> > -
> > -		up_read(&mm->mmap_sem);
> > -		return flush_cache_user_range(start, end);
> > -	}
> > -	up_read(&mm->mmap_sem);
> > -	return -EINVAL;
> > +	return flush_cache_user_range(start, end);
>
> While this would work, it introduces the possibility of a DoS where an
> application passes a bigger valid range (the kernel linear mapping) and
> the kernel code would not be preempted (CONFIG_PREEMPT disabled). IIRC,
> that's why Russell rejected such a patch a while back.

Hmm, I'm not sure I buy that argument. Firstly, you can't just pass a
kernel linear mapping address -- we'll fault straight away because it's
not a userspace address. Secondly, what's to stop an application from
mmaping a large area into a single VMA and giving rise to the same
situation? Finally, interrupts are enabled during this operation, so I
don't understand how you can trigger a DoS, irrespective of the preempt
configuration.

Is there an old thread I can refer to with more details about this? It
may be that some of the assumptions there no longer hold with subsequent
changes to the fault handling on this path.

Will
On Wed, Mar 27, 2013 at 12:15:12PM +0000, Will Deacon wrote:
> On Wed, Mar 27, 2013 at 11:09:38AM +0000, Catalin Marinas wrote:
> > On Mon, Mar 25, 2013 at 06:18:05PM +0000, Will Deacon wrote:
> > > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> > > index 1c08911..da5e268 100644
> > > --- a/arch/arm/kernel/traps.c
> > > +++ b/arch/arm/kernel/traps.c
> > > @@ -509,25 +509,10 @@ static int bad_syscall(int n, struct pt_regs *regs)
> > >  static inline int
> > >  do_cache_op(unsigned long start, unsigned long end, int flags)
> > >  {
> > > -	struct mm_struct *mm = current->active_mm;
> > > -	struct vm_area_struct *vma;
> > > -
> > >  	if (end < start || flags)
> > >  		return -EINVAL;
> > >
> > > -	down_read(&mm->mmap_sem);
> > > -	vma = find_vma(mm, start);
> > > -	if (vma && vma->vm_start < end) {
> > > -		if (start < vma->vm_start)
> > > -			start = vma->vm_start;
> > > -		if (end > vma->vm_end)
> > > -			end = vma->vm_end;
> > > -
> > > -		up_read(&mm->mmap_sem);
> > > -		return flush_cache_user_range(start, end);
> > > -	}
> > > -	up_read(&mm->mmap_sem);
> > > -	return -EINVAL;
> > > +	return flush_cache_user_range(start, end);
> >
> > While this would work, it introduces the possibility of a DoS where an
> > application passes a bigger valid range (the kernel linear mapping) and
> > the kernel code would not be preempted (CONFIG_PREEMPT disabled). IIRC,
> > that's why Russell rejected such a patch a while back.
>
> Hmm, I'm not sure I buy that argument. Firstly, you can't just pass a
> kernel linear mapping address -- we'll fault straight away because it's
> not a userspace address.

Fault where?

> Secondly, what's to stop an application from mmaping a large area into
> a single VMA and giving rise to the same situation? Finally,
> interrupts are enabled during this operation, so I don't understand
> how you can trigger a DoS, irrespective of the preempt configuration.

You can prevent context switching to other threads. But I agree, with a
large vma (which is already faulted in), you can get similar behaviour.

> Is there an old thread I can refer to with more details about this? It
> may be that some of the assumptions there no longer hold with subsequent
> changes to the fault handling on this path.

You could search the list for "do_cache_op", I don't have any at hand.
On Wed, Mar 27, 2013 at 12:21:59PM +0000, Catalin Marinas wrote:
> On Wed, Mar 27, 2013 at 12:15:12PM +0000, Will Deacon wrote:
> > On Wed, Mar 27, 2013 at 11:09:38AM +0000, Catalin Marinas wrote:
> > > While this would work, it introduces the possibility of a DoS where an
> > > application passes a bigger valid range (the kernel linear mapping) and
> > > the kernel code would not be preempted (CONFIG_PREEMPT disabled). IIRC,
> > > that's why Russell rejected such a patch a while back.
> >
> > Hmm, I'm not sure I buy that argument. Firstly, you can't just pass a
> > kernel linear mapping address -- we'll fault straight away because it's
> > not a userspace address.
>
> Fault where?

I was expecting something like an access_ok check, but you're right, we
don't have one (and I guess that's not strictly needed given that
flushing isn't destructive). I still find it a bit scary that we allow
userspace to pass kernel addresses through though -- especially if
there's something like a DMA or CPU suspend operation running on
another core.

> > Secondly, what's to stop an application from mmaping a large area into
> > a single VMA and giving rise to the same situation? Finally,
> > interrupts are enabled during this operation, so I don't understand
> > how you can trigger a DoS, irrespective of the preempt configuration.
>
> You can prevent context switching to other threads. But I agree, with a
> large vma (which is already faulted in), you can get similar behaviour.

So the easy fix is to split the range up into chunks and call
cond_resched after processing each one.

Will
On Wed, Mar 27, 2013 at 12:43:52PM +0000, Will Deacon wrote:
> On Wed, Mar 27, 2013 at 12:21:59PM +0000, Catalin Marinas wrote:
> > On Wed, Mar 27, 2013 at 12:15:12PM +0000, Will Deacon wrote:
> > > On Wed, Mar 27, 2013 at 11:09:38AM +0000, Catalin Marinas wrote:
> > > > While this would work, it introduces the possibility of a DoS where an
> > > > application passes a bigger valid range (the kernel linear mapping) and
> > > > the kernel code would not be preempted (CONFIG_PREEMPT disabled). IIRC,
> > > > that's why Russell rejected such a patch a while back.
> > >
> > > Hmm, I'm not sure I buy that argument. Firstly, you can't just pass a
> > > kernel linear mapping address -- we'll fault straight away because it's
> > > not a userspace address.
> >
> > Fault where?
>
> I was expecting something like an access_ok check, but you're right, we
> don't have one (and I guess that's not strictly needed given that
> flushing isn't destructive). I still find it a bit scary that we allow
> userspace to pass kernel addresses through though -- especially if
> there's something like a DMA or CPU suspend operation running on
> another core.

Currently we don't allow kernel addresses since we require a valid vma.
If we drop the vma search, we should add an access_ok.

> > > Secondly, what's to stop an application from mmaping a large area into
> > > a single VMA and giving rise to the same situation? Finally,
> > > interrupts are enabled during this operation, so I don't understand
> > > how you can trigger a DoS, irrespective of the preempt configuration.
> >
> > You can prevent context switching to other threads. But I agree, with a
> > large vma (which is already faulted in), you can get similar behaviour.
>
> So the easy fix is to split the range up into chunks and call
> cond_resched after processing each one.

This would work.
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 1c08911..da5e268 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -509,25 +509,10 @@ static int bad_syscall(int n, struct pt_regs *regs)
 static inline int
 do_cache_op(unsigned long start, unsigned long end, int flags)
 {
-	struct mm_struct *mm = current->active_mm;
-	struct vm_area_struct *vma;
-
 	if (end < start || flags)
 		return -EINVAL;
 
-	down_read(&mm->mmap_sem);
-	vma = find_vma(mm, start);
-	if (vma && vma->vm_start < end) {
-		if (start < vma->vm_start)
-			start = vma->vm_start;
-		if (end > vma->vm_end)
-			end = vma->vm_end;
-
-		up_read(&mm->mmap_sem);
-		return flush_cache_user_range(start, end);
-	}
-	up_read(&mm->mmap_sem);
-	return -EINVAL;
+	return flush_cache_user_range(start, end);
 }
 
 /*
do_cache_op finds the lowest VMA contained in the specified address
range and rounds the range to cover only the mapped addresses.

Since commit 4542b6a0fa6b ("ARM: 7365/1: drop unused parameter from
flush_cache_user_range") the VMA is not used for anything else in this
code and, seeing as the low-level cache flushing routines return -EFAULT
if the address is not valid, there is no need for this range truncation.

This patch removes the VMA handling code from the cache flushing
syscall.

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/kernel/traps.c | 17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)