Message ID | 20221107161740.144456-17-david@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm/gup: remove FOLL_FORCE usage from drivers (reliable R/O long-term pinning) | expand |
Hi David, On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote: > > FOLL_FORCE is really only for debugger access. According to commit > 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always > writable"), the pinned pages are always writable. Actually that patch is only a workaround to temporarily disable support for read-only pages as they seemed to suffer from some corruption issues in the retrieved user pages. We expect to support read-only pages as hardware input after. That said, FOLL_FORCE doesn't sound like the right thing even in that case, but I don't know the background behind it being added here in the first place. +Hans Verkuil +Marek Szyprowski do you happen to remember anything about it? Best regards, Tomasz > > FOLL_FORCE in this case seems to be a legacy leftover. Let's just remove > it. > > Cc: Tomasz Figa <tfiga@chromium.org> > Cc: Marek Szyprowski <m.szyprowski@samsung.com> > Cc: Mauro Carvalho Chehab <mchehab@kernel.org> > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > drivers/media/common/videobuf2/frame_vector.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c > index 542dde9d2609..062e98148c53 100644 > --- a/drivers/media/common/videobuf2/frame_vector.c > +++ b/drivers/media/common/videobuf2/frame_vector.c > @@ -50,7 +50,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, > start = untagged_addr(start); > > ret = pin_user_pages_fast(start, nr_frames, > - FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, > + FOLL_WRITE | FOLL_LONGTERM, > (struct page **)(vec->ptrs)); > if (ret > 0) { > vec->got_ref = true; > -- > 2.38.1 >
On 08.11.22 05:45, Tomasz Figa wrote: > Hi David, Hi Tomasz, thanks for looking into this! > > On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote: >> >> FOLL_FORCE is really only for debugger access. According to commit >> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always >> writable"), the pinned pages are always writable. > As reference, the cover letter of the series: https://lkml.kernel.org/r/20221107161740.144456-1-david@redhat.com > Actually that patch is only a workaround to temporarily disable > support for read-only pages as they seemed to suffer from some > corruption issues in the retrieved user pages. We expect to support > read-only pages as hardware input after. That said, FOLL_FORCE doesn't > sound like the right thing even in that case, but I don't know the > background behind it being added here in the first place. +Hans > Verkuil +Marek Szyprowski do you happen to remember anything about it? Maybe I mis-interpreted 707947247e95; re-reading it again, I am not quite sure what the actual problem is and how it relates to GUP FOLL_WRITE handling. FOLL_FORCE already was in place before 707947247e95 and should be removed. What I understood is "Just always call vb2_create_framevec() with FOLL_WRITE to always allow writing to the buffers.". If the pinned page is never written to via the obtained GUP reference: * FOLL_WRITE should not be set * FOLL_FORCE should not be set * We should not dirty the page when unpinning If the pinned page may be written to via the obtained GUP reference: * FOLL_WRITE should be set * FOLL_FORCE should not be set * We should dirty the page when unpinning If the function is called for both, we should think about doing it conditional based on a "write" variable, like pre-707947247e95 did. @Hans, any insight?
Hi Tomasz, David, On 11/8/22 05:45, Tomasz Figa wrote: > Hi David, > > On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote: >> >> FOLL_FORCE is really only for debugger access. According to commit >> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always >> writable"), the pinned pages are always writable. > > Actually that patch is only a workaround to temporarily disable > support for read-only pages as they seemed to suffer from some > corruption issues in the retrieved user pages. We expect to support > read-only pages as hardware input after. That said, FOLL_FORCE doesn't > sound like the right thing even in that case, but I don't know the > background behind it being added here in the first place. +Hans > Verkuil +Marek Szyprowski do you happen to remember anything about it? I tracked the use of 'force' all the way back to the first git commit (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the reason is lost in the mists of time. I'm not sure if the 'force' argument of get_user_pages() at that time even meant the same as FOLL_FORCE today. From what I can tell it has just been faithfully used ever since, but I have my doubt that anyone understands the reason behind it since it was never explained. Looking at this old LWN article https://lwn.net/Articles/28548/ suggests that it might be related to calling get_user_pages for write buffers (non-zero write argument) where you also want to be able to read from the buffer. That is certainly something that some drivers need to do post-capture fixups. But 'force' was also always set for read buffers, and I don't know if that was something that was actually needed, or just laziness. I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still allow drivers to read from the buffer? Regards, Hans > > Best regards, > Tomasz > >> >> FOLL_FORCE in this case seems to be a legacy leftover. Let's just remove >> it. >> >> Cc: Tomasz Figa <tfiga@chromium.org> >> Cc: Marek Szyprowski <m.szyprowski@samsung.com> >> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> >> Signed-off-by: David Hildenbrand <david@redhat.com> >> --- >> drivers/media/common/videobuf2/frame_vector.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c >> index 542dde9d2609..062e98148c53 100644 >> --- a/drivers/media/common/videobuf2/frame_vector.c >> +++ b/drivers/media/common/videobuf2/frame_vector.c >> @@ -50,7 +50,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, >> start = untagged_addr(start); >> >> ret = pin_user_pages_fast(start, nr_frames, >> - FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, >> + FOLL_WRITE | FOLL_LONGTERM, >> (struct page **)(vec->ptrs)); >> if (ret > 0) { >> vec->got_ref = true; >> -- >> 2.38.1 >>
On 22.11.22 13:25, Hans Verkuil wrote: > Hi Tomasz, David, > > On 11/8/22 05:45, Tomasz Figa wrote: >> Hi David, >> >> On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote: >>> >>> FOLL_FORCE is really only for debugger access. According to commit >>> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always >>> writable"), the pinned pages are always writable. >> >> Actually that patch is only a workaround to temporarily disable >> support for read-only pages as they seemed to suffer from some >> corruption issues in the retrieved user pages. We expect to support >> read-only pages as hardware input after. That said, FOLL_FORCE doesn't >> sound like the right thing even in that case, but I don't know the >> background behind it being added here in the first place. +Hans >> Verkuil +Marek Szyprowski do you happen to remember anything about it? > > I tracked the use of 'force' all the way back to the first git commit > (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the > reason is lost in the mists of time. > > I'm not sure if the 'force' argument of get_user_pages() at that time > even meant the same as FOLL_FORCE today. From what I can tell it has just > been faithfully used ever since, but I have my doubt that anyone understands > the reason behind it since it was never explained. > > Looking at this old LWN article https://lwn.net/Articles/28548/ suggests > that it might be related to calling get_user_pages for write buffers > (non-zero write argument) where you also want to be able to read from the > buffer. That is certainly something that some drivers need to do post-capture > fixups. > > But 'force' was also always set for read buffers, and I don't know if that > was something that was actually needed, or just laziness. > > I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still > allow drivers to read from the buffer? Yes. The only problematic corner case I can imagine is if someone has a VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin user space pages as a read buffer. We'd specify now FOLL_WRITE without FOLL_FORCE and GUP would reject that: write access without write permissions is invalid. There would be no way around "fixing" this implementation to not specify FOLL_WRITE when only reading from user-space pages. Not sure what the implications are regarding that corruption that was mentioned in 707947247e95. Having said that, I assume such a scenario is unlikely -- but you might know better how user space usually uses this interface. There would be three options: 1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid. 2) Remove FOLL_FORCE and see if anybody even notices (this patch) and leave the implementation as is for now. 3) Remove FOLL_FORCE and fixup the implementation to only specify FOLL_WRITE if the pages will actually get written to. 3) would most probably ideal, however, I am no expert on that code and can't do it (707947247e95 confuses me). So naive me would go with 2) first.
On 11/22/22 13:38, David Hildenbrand wrote: > On 22.11.22 13:25, Hans Verkuil wrote: >> Hi Tomasz, David, >> >> On 11/8/22 05:45, Tomasz Figa wrote: >>> Hi David, >>> >>> On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote: >>>> >>>> FOLL_FORCE is really only for debugger access. According to commit >>>> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always >>>> writable"), the pinned pages are always writable. >>> >>> Actually that patch is only a workaround to temporarily disable >>> support for read-only pages as they seemed to suffer from some >>> corruption issues in the retrieved user pages. We expect to support >>> read-only pages as hardware input after. That said, FOLL_FORCE doesn't >>> sound like the right thing even in that case, but I don't know the >>> background behind it being added here in the first place. +Hans >>> Verkuil +Marek Szyprowski do you happen to remember anything about it? >> >> I tracked the use of 'force' all the way back to the first git commit >> (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the >> reason is lost in the mists of time. >> >> I'm not sure if the 'force' argument of get_user_pages() at that time >> even meant the same as FOLL_FORCE today. From what I can tell it has just >> been faithfully used ever since, but I have my doubt that anyone understands >> the reason behind it since it was never explained. >> >> Looking at this old LWN article https://lwn.net/Articles/28548/ suggests >> that it might be related to calling get_user_pages for write buffers >> (non-zero write argument) where you also want to be able to read from the >> buffer. That is certainly something that some drivers need to do post-capture >> fixups. >> >> But 'force' was also always set for read buffers, and I don't know if that >> was something that was actually needed, or just laziness. >> >> I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still >> allow drivers to read from the buffer? > > Yes. The only problematic corner case I can imagine is if someone has a > VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin > user space pages as a read buffer. We'd specify now FOLL_WRITE without > FOLL_FORCE and GUP would reject that: write access without write > permissions is invalid. I do not believe this will be an issue. > > There would be no way around "fixing" this implementation to not specify > FOLL_WRITE when only reading from user-space pages. Not sure what the > implications are regarding that corruption that was mentioned in > 707947247e95. Before 707947247e95 the FOLL_WRITE flag was only set for write buffers (i.e. video capture, DMA_FROM_DEVICE), not for read buffers (video output, DMA_TO_DEVICE). In the video output case there should never be any need for drivers to write to the buffer to the best of my knowledge. But I have had some complaints about that commit that it causes problems in some scenarios, and it has been on my todo list for quite some time now to dig deeper into this. I probably should prioritize this for this or next week. > > Having said that, I assume such a scenario is unlikely -- but you might > know better how user space usually uses this interface. There would be > three options: > > 1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid. > 2) Remove FOLL_FORCE and see if anybody even notices (this patch) and > leave the implementation as is for now. > 3) Remove FOLL_FORCE and fixup the implementation to only specify > FOLL_WRITE if the pages will actually get written to. > > 3) would most probably ideal, however, I am no expert on that code and > can't do it (707947247e95 confuses me). So naive me would go with 2) first. > Option 3 would be best. And 707947247e95 confuses me as well, and I actually wrote it :-) I am wondering whether it was addressed at the right level, but as I said, I need to dig a bit deeper into this. Regards, Hans
On 22.11.22 15:07, Hans Verkuil wrote: > On 11/22/22 13:38, David Hildenbrand wrote: >> On 22.11.22 13:25, Hans Verkuil wrote: >>> Hi Tomasz, David, >>> >>> On 11/8/22 05:45, Tomasz Figa wrote: >>>> Hi David, >>>> >>>> On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote: >>>>> >>>>> FOLL_FORCE is really only for debugger access. According to commit >>>>> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always >>>>> writable"), the pinned pages are always writable. >>>> >>>> Actually that patch is only a workaround to temporarily disable >>>> support for read-only pages as they seemed to suffer from some >>>> corruption issues in the retrieved user pages. We expect to support >>>> read-only pages as hardware input after. That said, FOLL_FORCE doesn't >>>> sound like the right thing even in that case, but I don't know the >>>> background behind it being added here in the first place. +Hans >>>> Verkuil +Marek Szyprowski do you happen to remember anything about it? >>> >>> I tracked the use of 'force' all the way back to the first git commit >>> (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the >>> reason is lost in the mists of time. >>> >>> I'm not sure if the 'force' argument of get_user_pages() at that time >>> even meant the same as FOLL_FORCE today. From what I can tell it has just >>> been faithfully used ever since, but I have my doubt that anyone understands >>> the reason behind it since it was never explained. >>> >>> Looking at this old LWN article https://lwn.net/Articles/28548/ suggests >>> that it might be related to calling get_user_pages for write buffers >>> (non-zero write argument) where you also want to be able to read from the >>> buffer. That is certainly something that some drivers need to do post-capture >>> fixups. >>> >>> But 'force' was also always set for read buffers, and I don't know if that >>> was something that was actually needed, or just laziness. >>> >>> I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still >>> allow drivers to read from the buffer? >> >> Yes. The only problematic corner case I can imagine is if someone has a >> VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin >> user space pages as a read buffer. We'd specify now FOLL_WRITE without >> FOLL_FORCE and GUP would reject that: write access without write >> permissions is invalid. > > I do not believe this will be an issue. > >> >> There would be no way around "fixing" this implementation to not specify >> FOLL_WRITE when only reading from user-space pages. Not sure what the >> implications are regarding that corruption that was mentioned in >> 707947247e95. > > Before 707947247e95 the FOLL_WRITE flag was only set for write buffers > (i.e. video capture, DMA_FROM_DEVICE), not for read buffers (video output, > DMA_TO_DEVICE). In the video output case there should never be any need > for drivers to write to the buffer to the best of my knowledge. > > But I have had some complaints about that commit that it causes problems > in some scenarios, and it has been on my todo list for quite some time now > to dig deeper into this. I probably should prioritize this for this or > next week. > >> >> Having said that, I assume such a scenario is unlikely -- but you might >> know better how user space usually uses this interface. There would be >> three options: >> >> 1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid. >> 2) Remove FOLL_FORCE and see if anybody even notices (this patch) and >> leave the implementation as is for now. >> 3) Remove FOLL_FORCE and fixup the implementation to only specify >> FOLL_WRITE if the pages will actually get written to. >> >> 3) would most probably ideal, however, I am no expert on that code and >> can't do it (707947247e95 confuses me). So naive me would go with 2) first. >> > > Option 3 would be best. And 707947247e95 confuses me as well, and I actually > wrote it :-) I am wondering whether it was addressed at the right level, but > as I said, I need to dig a bit deeper into this. Cool, let me know if I can help!
On Tue, Nov 22, 2022 at 4:25 AM Hans Verkuil <hverkuil@xs4all.nl> wrote: > > I tracked the use of 'force' all the way back to the first git commit > (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the > reason is lost in the mists of time. Well, not entirely. For archaeology reasons, I went back to the old BK history, which exists as a git conversion in https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/ and there you can actually find it. Not with a lot of explanations, though - it's commit b7649ef789 ("[PATCH] videobuf update"): This updates the video-buf.c module (helper module for video buffer management). Some memory management fixes, also some adaptions to the final v4l2 api. but it went from err = get_user_pages(current,current->mm, - data, dma->nr_pages, - rw == READ, 0, /* don't force */ + data & PAGE_MASK, dma->nr_pages, + rw == READ, 1, /* force */ dma->pages, NULL); in that commit. So it goes back to October 2002. > Looking at this old LWN article https://lwn.net/Articles/28548/ suggests > that it might be related to calling get_user_pages for write buffers The timing roughly matches. > I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still > allow drivers to read from the buffer? The issue with some of the driver hacks has been that - they only want to read, and the buffer may be read-only - they then used FOLL_WRITE despite that, because they want to break COW (due to the issue that David is now fixing with his series) - but that means that the VM layer says "nope, you can't write to this read-only user mapping" - ... and then they use FOLL_FORCE to say "yes, I can". iOW, the FOLL_FORCE may be entirely due to an (incorrect, but historically needed) FOLL_WRITE. Linus
diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c index 542dde9d2609..062e98148c53 100644 --- a/drivers/media/common/videobuf2/frame_vector.c +++ b/drivers/media/common/videobuf2/frame_vector.c @@ -50,7 +50,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, start = untagged_addr(start); ret = pin_user_pages_fast(start, nr_frames, - FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, + FOLL_WRITE | FOLL_LONGTERM, (struct page **)(vec->ptrs)); if (ret > 0) { vec->got_ref = true;
FOLL_FORCE is really only for debugger access. According to commit 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always writable"), the pinned pages are always writable. FOLL_FORCE in this case seems to be a legacy leftover. Let's just remove it. Cc: Tomasz Figa <tfiga@chromium.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: David Hildenbrand <david@redhat.com> --- drivers/media/common/videobuf2/frame_vector.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)