diff mbox

[RFC] vb2: force output buffers to fault into memory

Message ID 201212041648.40108.hverkuil@xs4all.nl (mailing list archive)
State New, archived
Headers show

Commit Message

Hans Verkuil Dec. 4, 2012, 3:48 p.m. UTC
(repost after accidentally using HTML formatting)

This needs a good review. The change is minor, but as I am not a mm expert,
I'd like to get some more feedback on this. The dma-sg change has been
successfully tested on our hardware, but I don't have any hardware to test
the vmalloc change.

Note that the 'write' attribute is still stored internally and used to tell
whether set_page_dirty_lock() should be called during put_userptr.

It is my understanding that that still makes sense, so I didn't change that.

Regards,

	Hans

--- start patch ---

When calling get_user_pages for output buffers, the 'write' argument is set
to 0 (since the DMA isn't writing to memory), This can cause unexpected results:

If you calloc() buffer memory and do not fill that memory afterwards, then
the kernel assigns most of that memory to one single physical 'zero' page.

If you queue that buffer to the V4L2 driver, then it will call get_user_pages
and store the results. Next you dequeue it, fill the buffer and queue it
again. Now the V4L2 core code sees the same userptr and length and expects it
to be the same buffer that it got before and it will reuse the results of the
previous get_user_pages call. But that still points to zero pages, whereas
userspace filled it up and so changed the buffer to use different pages. In
other words, the pages the V4L2 core knows about are no longer correct.

The solution is to always set 'write' to 1 as this will force the kernel to
fault in proper pages.

We do this for videobuf2-dma-sg.c and videobuf2-vmalloc.c, but not for
videobuf2-dma-contig.c since the userptr there is already supposed to
point to contiguous memory and shouldn't use the zero page at all.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
---
 drivers/media/v4l2-core/videobuf2-dma-sg.c  |    3 ++-
 drivers/media/v4l2-core/videobuf2-vmalloc.c |    4 +++-
 2 files changed, 5 insertions(+), 2 deletions(-)

Comments

Laurent Pinchart Dec. 6, 2012, 1:23 a.m. UTC | #1
Hi Hans,

On Tuesday 04 December 2012 16:48:40 Hans Verkuil wrote:
> (repost after accidentally using HTML formatting)
> 
> This needs a good review. The change is minor, but as I am not a mm expert,
> I'd like to get some more feedback on this. The dma-sg change has been
> successfully tested on our hardware, but I don't have any hardware to test
> the vmalloc change.
> 
> Note that the 'write' attribute is still stored internally and used to tell
> whether set_page_dirty_lock() should be called during put_userptr.
> 
> It is my understanding that that still makes sense, so I didn't change that.
> 
> Regards,
> 
> 	Hans
> 
> --- start patch ---
> 
> When calling get_user_pages for output buffers, the 'write' argument is set
> to 0 (since the DMA isn't writing to memory), This can cause unexpected
> results:
> 
> If you calloc() buffer memory and do not fill that memory afterwards, then
> the kernel assigns most of that memory to one single physical 'zero' page.
> 
> If you queue that buffer to the V4L2 driver, then it will call
> get_user_pages and store the results. Next you dequeue it, fill the buffer
> and queue it again. Now the V4L2 core code sees the same userptr and length
> and expects it to be the same buffer that it got before and it will reuse
> the results of the previous get_user_pages call. But that still points to
> zero pages, whereas userspace filled it up and so changed the buffer to use
> different pages. In other words, the pages the V4L2 core knows about are no
> longer correct.
> 
> The solution is to always set 'write' to 1 as this will force the kernel to
> fault in proper pages.

I'm wondering if we should really force faulting pages. The buffer given to 
the driver might be real read-only memory, in which case faulting the pages 
would probably hurt (agreed, that's pretty unlikely, but it's still a valid 
use case according to the API).

Moreover, this wouldn't fix all similar userptr-related issues. An application 
could remap a completely different memory area to the same userspace address 
between two QBUF calls, and videobuf2 would not handle that properly. That's 
also a valid use case according to the API, and it would be pretty difficult 
to handle it correctly in an efficient way on the kernel side. I think we 
could modify the spec to forbid mapping changes between QBUF calls, and in 
that case the zero-page mapping situation you described could be forbidden as 
well. If we clearly document it in the spec we could push the responsibility 
of faulting the pages to userspace.

> We do this for videobuf2-dma-sg.c and videobuf2-vmalloc.c, but not for
> videobuf2-dma-contig.c since the userptr there is already supposed to
> point to contiguous memory and shouldn't use the zero page at all.
> 
> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
> ---
>  drivers/media/v4l2-core/videobuf2-dma-sg.c  |    3 ++-
>  drivers/media/v4l2-core/videobuf2-vmalloc.c |    4 +++-
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/media/v4l2-core/videobuf2-dma-sg.c
> b/drivers/media/v4l2-core/videobuf2-dma-sg.c index 25c3b36..c29f159 100644
> --- a/drivers/media/v4l2-core/videobuf2-dma-sg.c
> +++ b/drivers/media/v4l2-core/videobuf2-dma-sg.c
> @@ -143,7 +143,8 @@ static void *vb2_dma_sg_get_userptr(void *alloc_ctx,
> unsigned long vaddr, num_pages_from_user = get_user_pages(current,
> current->mm,
>  					     vaddr & PAGE_MASK,
>  					     buf->sg_desc.num_pages,
> -					     write,
> +					     1, /* always set write to force
> +						   faulting all pages */
>  					     1, /* force */
>  					     buf->pages,
>  					     NULL);
> diff --git a/drivers/media/v4l2-core/videobuf2-vmalloc.c
> b/drivers/media/v4l2-core/videobuf2-vmalloc.c index a47fd4f..c8d8519 100644
> --- a/drivers/media/v4l2-core/videobuf2-vmalloc.c
> +++ b/drivers/media/v4l2-core/videobuf2-vmalloc.c
> @@ -107,7 +107,9 @@ static void *vb2_vmalloc_get_userptr(void *alloc_ctx,
> unsigned long vaddr, /* current->mm->mmap_sem is taken by videobuf2 core */
>  		n_pages = get_user_pages(current, current->mm,
>  					 vaddr & PAGE_MASK, buf->n_pages,
> -					 write, 1, /* force */
> +					 1, /* always set write to force
> +					       faulting all pages */
> +					 1, /* force */
>  					 buf->pages, NULL);
>  		if (n_pages != buf->n_pages)
>  			goto fail_get_user_pages;
Marek Szyprowski Dec. 7, 2012, 2:29 p.m. UTC | #2
Hello,

On 12/6/2012 2:23 AM, Laurent Pinchart wrote:
> Hi Hans,
>
> On Tuesday 04 December 2012 16:48:40 Hans Verkuil wrote:
> > (repost after accidentally using HTML formatting)
> >
> > This needs a good review. The change is minor, but as I am not a mm expert,
> > I'd like to get some more feedback on this. The dma-sg change has been
> > successfully tested on our hardware, but I don't have any hardware to test
> > the vmalloc change.
> >
> > Note that the 'write' attribute is still stored internally and used to tell
> > whether set_page_dirty_lock() should be called during put_userptr.
> >
> > It is my understanding that that still makes sense, so I didn't change that.
> >
> > Regards,
> >
> > 	Hans
> >
> > --- start patch ---
> >
> > When calling get_user_pages for output buffers, the 'write' argument is set
> > to 0 (since the DMA isn't writing to memory), This can cause unexpected
> > results:
> >
> > If you calloc() buffer memory and do not fill that memory afterwards, then
> > the kernel assigns most of that memory to one single physical 'zero' page.
> >
> > If you queue that buffer to the V4L2 driver, then it will call
> > get_user_pages and store the results. Next you dequeue it, fill the buffer
> > and queue it again. Now the V4L2 core code sees the same userptr and length
> > and expects it to be the same buffer that it got before and it will reuse
> > the results of the previous get_user_pages call. But that still points to
> > zero pages, whereas userspace filled it up and so changed the buffer to use
> > different pages. In other words, the pages the V4L2 core knows about are no
> > longer correct.
> >
> > The solution is to always set 'write' to 1 as this will force the kernel to
> > fault in proper pages.
>
> I'm wondering if we should really force faulting pages. The buffer given to
> the driver might be real read-only memory, in which case faulting the pages
> would probably hurt (agreed, that's pretty unlikely, but it's still a valid
> use case according to the API).
>
> Moreover, this wouldn't fix all similar userptr-related issues. An application
> could remap a completely different memory area to the same userspace address
> between two QBUF calls, and videobuf2 would not handle that properly. That's
> also a valid use case according to the API, and it would be pretty difficult
> to handle it correctly in an efficient way on the kernel side. I think we
> could modify the spec to forbid mapping changes between QBUF calls, and in
> that case the zero-page mapping situation you described could be forbidden as
> well. If we clearly document it in the spec we could push the responsibility
> of faulting the pages to userspace.

I've spent some time thinking on this issue and I came to the conclusion 
that it
might be a good idea extend the v4l2 spec with such case. Changing the 
write
parameter for get_user_pages() is only a workaround, which doesn't even 
work for
all possible use cases, so I agree with Laurent that user ptr usage 
should be
much better documented. I only wonder if it a good idea to require 
faulting in
pages from the user space application. Can we assume that memset(ptr, 0, 
buf_size)
will do the job in all cases? I expect yes, but I would like to be 
really sure,
this will be probably put into the documentation for the reference code.

Best regards
Laurent Pinchart Dec. 11, 2012, 2 p.m. UTC | #3
Hi Marek,

On Friday 07 December 2012 15:29:30 Marek Szyprowski wrote:
> On 12/6/2012 2:23 AM, Laurent Pinchart wrote:
> > On Tuesday 04 December 2012 16:48:40 Hans Verkuil wrote:
> > > (repost after accidentally using HTML formatting)
> > > 
> > > This needs a good review. The change is minor, but as I am not a mm
> > > expert, I'd like to get some more feedback on this. The dma-sg change
> > > has been successfully tested on our hardware, but I don't have any
> > > hardware to test the vmalloc change.
> > > 
> > > Note that the 'write' attribute is still stored internally and used to
> > > tell whether set_page_dirty_lock() should be called during put_userptr.
> > > 
> > > It is my understanding that that still makes sense, so I didn't change
> > > that.
> > > 
> > > Regards,
> > > 
> > > 	Hans
> > > 
> > > --- start patch ---
> > > 
> > > When calling get_user_pages for output buffers, the 'write' argument is
> > > set to 0 (since the DMA isn't writing to memory), This can cause
> > > unexpected results:
> > > 
> > > If you calloc() buffer memory and do not fill that memory afterwards,
> > > then the kernel assigns most of that memory to one single physical
> > > 'zero' page.
> > > 
> > > If you queue that buffer to the V4L2 driver, then it will call
> > > get_user_pages and store the results. Next you dequeue it, fill the
> > > buffer and queue it again. Now the V4L2 core code sees the same userptr
> > > and length and expects it to be the same buffer that it got before and
> > > it will reuse the results of the previous get_user_pages call. But that
> > > still points to zero pages, whereas userspace filled it up and so
> > > changed the buffer to use different pages. In other words, the pages the
> > > V4L2 core knows about are no longer correct.
> > > 
> > > The solution is to always set 'write' to 1 as this will force the kernel
> > > to fault in proper pages.
> > 
> > I'm wondering if we should really force faulting pages. The buffer given
> > to the driver might be real read-only memory, in which case faulting the
> > pages would probably hurt (agreed, that's pretty unlikely, but it's still
> > a valid use case according to the API).
> > 
> > Moreover, this wouldn't fix all similar userptr-related issues. An
> > application could remap a completely different memory area to the same
> > userspace address between two QBUF calls, and videobuf2 would not handle
> > that properly. That's also a valid use case according to the API, and it
> > would be pretty difficult to handle it correctly in an efficient way on
> > the kernel side. I think we could modify the spec to forbid mapping
> > changes between QBUF calls, and in that case the zero-page mapping
> > situation you described could be forbidden as well. If we clearly
> > document it in the spec we could push the responsibility of faulting the
> > pages to userspace.
> 
> I've spent some time thinking on this issue and I came to the conclusion
> that it might be a good idea extend the v4l2 spec with such case. Changing
> the write parameter for get_user_pages() is only a workaround, which doesn't
> even work for all possible use cases, so I agree with Laurent that user ptr
> usage should be much better documented. I only wonder if it a good idea to
> require faulting in pages from the user space application. Can we assume
> that memset(ptr, 0, buf_size) will do the job in all cases? I expect yes,
> but I would like to be really sure,

Good question, I would also guess that memset should do, but I'm no expert on 
memory management code.

> this will be probably put into the documentation for the reference code.

At the end of the day it's just an API contract, like all the others. The 
USERPTR API should, in my opinion, require userspace to provide a buffer 
backed by usable pages (for non PFNMAP buffers) or usable physical memory (for 
PFNMAP buffers). Otherwise the kernel will need to switch to a guesswork mode, 
which is hackish at best.

Of course people should use DMABUF instead of USERPTR when possible :-)
diff mbox

Patch

diff --git a/drivers/media/v4l2-core/videobuf2-dma-sg.c b/drivers/media/v4l2-core/videobuf2-dma-sg.c
index 25c3b36..c29f159 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-sg.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-sg.c
@@ -143,7 +143,8 @@  static void *vb2_dma_sg_get_userptr(void *alloc_ctx, unsigned long vaddr,
 	num_pages_from_user = get_user_pages(current, current->mm,
 					     vaddr & PAGE_MASK,
 					     buf->sg_desc.num_pages,
-					     write,
+					     1, /* always set write to force
+						   faulting all pages */
 					     1, /* force */
 					     buf->pages,
 					     NULL);
diff --git a/drivers/media/v4l2-core/videobuf2-vmalloc.c b/drivers/media/v4l2-core/videobuf2-vmalloc.c
index a47fd4f..c8d8519 100644
--- a/drivers/media/v4l2-core/videobuf2-vmalloc.c
+++ b/drivers/media/v4l2-core/videobuf2-vmalloc.c
@@ -107,7 +107,9 @@  static void *vb2_vmalloc_get_userptr(void *alloc_ctx, unsigned long vaddr,
 		/* current->mm->mmap_sem is taken by videobuf2 core */
 		n_pages = get_user_pages(current, current->mm,
 					 vaddr & PAGE_MASK, buf->n_pages,
-					 write, 1, /* force */
+					 1, /* always set write to force
+					       faulting all pages */
+					 1, /* force */
 					 buf->pages, NULL);
 		if (n_pages != buf->n_pages)
 			goto fail_get_user_pages;