mbox series

[RFC,v2,0/3] expose mapping wrprotect, fix fb_defio use

Message ID cover.1736809898.git.lorenzo.stoakes@oracle.com (mailing list archive)
Headers show
Series expose mapping wrprotect, fix fb_defio use | expand

Message

Lorenzo Stoakes Jan. 13, 2025, 11:15 p.m. UTC
Right now the only means by which we can write-protect a range using the
reverse mapping is via folio_mkclean().

However this is not always the appropriate means of doing so, specifically
in the case of the framebuffer deferred I/O logic (fb_defio enabled by
CONFIG_FB_DEFERRED_IO). There, kernel pages are mapped read-only and
write-protect faults used to batch up I/O operations.

Each time the deferred work is done, folio_mkclean() is used to mark the
framebuffer page as having had I/O performed on it. However doing so
requires the kernel page (perhaps allocated via vmalloc()) to have its
page->mapping, index fields set so the rmap can find everything that maps
it in order to write-protect.

This is problematic as firstly, these fields should not be set for
kernel-allocated memory, and secondly these are not folios (it's not user
memory) and page->index, mapping fields are now deprecated and soon to be
removed.

The implementers cannot be blamed for having used this however, as there is
simply no other way of performing this operation correctly.

This series fixes this - we provide the mapping_wrprotect_page() function
to allow the reverse mapping to be used to look up mappings from the page
cache object (i.e. its address_space pointer) at a specific offset.

The fb_defio logic already stores this offset, and can simply be expanded
to keep track of the page cache object, so the change then becomes
straight-forward.

This series should have no functional change.

*** REVIEWERS NOTES: ***

I do not have any hardware that uses fb_defio, so I'm asking for help with
testing this series from those who do :) I have tested the mm side of this,
and done a quick compile smoke test of the fb_defio side but this _very
much_ requires testing on actual hardware to ensure everything behaves as
expected.

This is based on Andrew's tree [0] in the mm-unstable branch - I was
thinking it'd be best to go through the mm tree (with fb_defio maintainer
approval, of course!) as it relies upon the mm changes to work correctly.

[0]: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/

RFC v2:
* Updated Jaya Kumar's email on cc - the MAINTAINERS section is apparently incorrect.
* Corrected rmap_walk_file() comment to refer to folios as per Matthew.
* Reference folio->mapping rather than folio_mapping(folio) in rmap_walk_file()
  as per Matthew.
* Reference folio->index rather than folio_pgoff(folio) in rmap_walk_file() as
  per Matthew.
* Renamed rmap_wrprotect_file_page() to mapping_wrprotect_page() as per Matthew.
* Fixed kerneldoc and moved to implementation as per Matthew.
* Updated mapping_wrprotect_page() to take a struct page pointer as per David.
* Removed folio lock when invoking mapping_wrprotect_page() in
  fb_deferred_io_work() as per Matthew.
* Removed compound_nr() in fb_deferred_io_work() as per Matthew.

RFC v1:
https://lore.kernel.org/all/1e452b5b65f15a9a5d0c2ed3f5f812fdd1367603.1736352361.git.lorenzo.stoakes@oracle.com/

Lorenzo Stoakes (3):
  mm: refactor rmap_walk_file() to separate out traversal logic
  mm: provide mapping_wrprotect_page() function
  fb_defio: do not use deprecated page->mapping, index fields

 drivers/video/fbdev/core/fb_defio.c |  38 ++-----
 include/linux/fb.h                  |   1 +
 include/linux/rmap.h                |   3 +
 mm/rmap.c                           | 152 +++++++++++++++++++++++-----
 4 files changed, 141 insertions(+), 53 deletions(-)

--
2.48.0

Comments

Lorenzo Stoakes Jan. 27, 2025, 6:45 p.m. UTC | #1
Hi guys,

It'd be good if somebody from the drm side could look at this as you are
using fields that are deprecated and will be made unavailable (in a
non-optional fashion), and relatively soon :)

It'd be good to confirm this works on real hardware (none of mine supports
this sadly, I tried 3 separate machines...).

In effect I provide a whole new mechanism just so defio can do the right
thing and retain the exact same behaviour, so just need to confirm it does
what you need from our side.

Thanks, Lorenzo

On Mon, Jan 13, 2025 at 11:15:45PM +0000, Lorenzo Stoakes wrote:
> Right now the only means by which we can write-protect a range using the
> reverse mapping is via folio_mkclean().
>
> However this is not always the appropriate means of doing so, specifically
> in the case of the framebuffer deferred I/O logic (fb_defio enabled by
> CONFIG_FB_DEFERRED_IO). There, kernel pages are mapped read-only and
> write-protect faults used to batch up I/O operations.
>
> Each time the deferred work is done, folio_mkclean() is used to mark the
> framebuffer page as having had I/O performed on it. However doing so
> requires the kernel page (perhaps allocated via vmalloc()) to have its
> page->mapping, index fields set so the rmap can find everything that maps
> it in order to write-protect.
>
> This is problematic as firstly, these fields should not be set for
> kernel-allocated memory, and secondly these are not folios (it's not user
> memory) and page->index, mapping fields are now deprecated and soon to be
> removed.
>
> The implementers cannot be blamed for having used this however, as there is
> simply no other way of performing this operation correctly.
>
> This series fixes this - we provide the mapping_wrprotect_page() function
> to allow the reverse mapping to be used to look up mappings from the page
> cache object (i.e. its address_space pointer) at a specific offset.
>
> The fb_defio logic already stores this offset, and can simply be expanded
> to keep track of the page cache object, so the change then becomes
> straight-forward.
>
> This series should have no functional change.
>
> *** REVIEWERS NOTES: ***
>
> I do not have any hardware that uses fb_defio, so I'm asking for help with
> testing this series from those who do :) I have tested the mm side of this,
> and done a quick compile smoke test of the fb_defio side but this _very
> much_ requires testing on actual hardware to ensure everything behaves as
> expected.
>
> This is based on Andrew's tree [0] in the mm-unstable branch - I was
> thinking it'd be best to go through the mm tree (with fb_defio maintainer
> approval, of course!) as it relies upon the mm changes to work correctly.
>
> [0]: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/
>
> RFC v2:
> * Updated Jaya Kumar's email on cc - the MAINTAINERS section is apparently incorrect.
> * Corrected rmap_walk_file() comment to refer to folios as per Matthew.
> * Reference folio->mapping rather than folio_mapping(folio) in rmap_walk_file()
>   as per Matthew.
> * Reference folio->index rather than folio_pgoff(folio) in rmap_walk_file() as
>   per Matthew.
> * Renamed rmap_wrprotect_file_page() to mapping_wrprotect_page() as per Matthew.
> * Fixed kerneldoc and moved to implementation as per Matthew.
> * Updated mapping_wrprotect_page() to take a struct page pointer as per David.
> * Removed folio lock when invoking mapping_wrprotect_page() in
>   fb_deferred_io_work() as per Matthew.
> * Removed compound_nr() in fb_deferred_io_work() as per Matthew.
>
> RFC v1:
> https://lore.kernel.org/all/1e452b5b65f15a9a5d0c2ed3f5f812fdd1367603.1736352361.git.lorenzo.stoakes@oracle.com/
>
> Lorenzo Stoakes (3):
>   mm: refactor rmap_walk_file() to separate out traversal logic
>   mm: provide mapping_wrprotect_page() function
>   fb_defio: do not use deprecated page->mapping, index fields
>
>  drivers/video/fbdev/core/fb_defio.c |  38 ++-----
>  include/linux/fb.h                  |   1 +
>  include/linux/rmap.h                |   3 +
>  mm/rmap.c                           | 152 +++++++++++++++++++++++-----
>  4 files changed, 141 insertions(+), 53 deletions(-)
>
> --
> 2.48.0
Kajtár Zsolt Jan. 30, 2025, 6:58 p.m. UTC | #2
Hello Lorenzo!

I tried the patches below yesterday with the latest mm tree from git at the
time. I have some work in progress code which uses defio, and hardware which
needs that for shuffling bytes around to get the right picture.

There was no obvious difference in operation, it looked the same with and
without the patches. I've skipped the card's memory initialization so it 
contained garbage while the system memory behind was clean. That garbage
remained and was only cleaned or overwritten where the system memory was
updated by the application. Fbcon wasn't assigned and the app mapped the
memory of course. Left it running at configured to HZ/50 for a while with
1/3 of the screen updated at ~50 FPS by the test app, nothing interesting
happened. Same tearing both ways.

I can't comment on performance as the CPU is moving the bytes in the defio
callback and that code is too heavy at the moment to notice a difference.
Test was performed on a dual core x86-64 (E3400).

So defio seems to still work after these changes for me.

Tested-by: Zsolt Kajtar <soci@c64.rulez.org>

> On Mon, Jan 13, 2025 at 11:15:45PM +0000, Lorenzo Stoakes wrote:
>> Right now the only means by which we can write-protect a range using the
>> reverse mapping is via folio_mkclean().
>>
>> However this is not always the appropriate means of doing so, specifically
>> in the case of the framebuffer deferred I/O logic (fb_defio enabled by
>> CONFIG_FB_DEFERRED_IO). There, kernel pages are mapped read-only and
>> write-protect faults used to batch up I/O operations.
>>
>> Each time the deferred work is done, folio_mkclean() is used to mark the
>> framebuffer page as having had I/O performed on it. However doing so
>> requires the kernel page (perhaps allocated via vmalloc()) to have its
>> page->mapping, index fields set so the rmap can find everything that maps
>> it in order to write-protect.
>>
>> This is problematic as firstly, these fields should not be set for
>> kernel-allocated memory, and secondly these are not folios (it's not user
>> memory) and page->index, mapping fields are now deprecated and soon to be
>> removed.
>>
>> The implementers cannot be blamed for having used this however, as there is
>> simply no other way of performing this operation correctly.
>>
>> This series fixes this - we provide the mapping_wrprotect_page() function
>> to allow the reverse mapping to be used to look up mappings from the page
>> cache object (i.e. its address_space pointer) at a specific offset.
>>
>> The fb_defio logic already stores this offset, and can simply be expanded
>> to keep track of the page cache object, so the change then becomes
>> straight-forward.
>>
>> This series should have no functional change.
>>
>> *** REVIEWERS NOTES: ***
>>
>> I do not have any hardware that uses fb_defio, so I'm asking for help with
>> testing this series from those who do :) I have tested the mm side of this,
>> and done a quick compile smoke test of the fb_defio side but this _very
>> much_ requires testing on actual hardware to ensure everything behaves as
>> expected.
>>
>> This is based on Andrew's tree [0] in the mm-unstable branch - I was
>> thinking it'd be best to go through the mm tree (with fb_defio maintainer
>> approval, of course!) as it relies upon the mm changes to work correctly.
>>
>> [0]: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/
>>
>> RFC v2:
>> * Updated Jaya Kumar's email on cc - the MAINTAINERS section is apparently incorrect.
>> * Corrected rmap_walk_file() comment to refer to folios as per Matthew.
>> * Reference folio->mapping rather than folio_mapping(folio) in rmap_walk_file()
>>   as per Matthew.
>> * Reference folio->index rather than folio_pgoff(folio) in rmap_walk_file() as
>>   per Matthew.
>> * Renamed rmap_wrprotect_file_page() to mapping_wrprotect_page() as per Matthew.
>> * Fixed kerneldoc and moved to implementation as per Matthew.
>> * Updated mapping_wrprotect_page() to take a struct page pointer as per David.
>> * Removed folio lock when invoking mapping_wrprotect_page() in
>>   fb_deferred_io_work() as per Matthew.
>> * Removed compound_nr() in fb_deferred_io_work() as per Matthew.
>>
>> RFC v1:
>> https://lore.kernel.org/all/1e452b5b65f15a9a5d0c2ed3f5f812fdd1367603.1736352361.git.lorenzo.stoakes@oracle.com/
>>
>> Lorenzo Stoakes (3):
>>   mm: refactor rmap_walk_file() to separate out traversal logic
>>   mm: provide mapping_wrprotect_page() function
>>   fb_defio: do not use deprecated page->mapping, index fields
>>
>>  drivers/video/fbdev/core/fb_defio.c |  38 ++-----
>>  include/linux/fb.h                  |   1 +
>>  include/linux/rmap.h                |   3 +
>>  mm/rmap.c                           | 152 +++++++++++++++++++++++-----
>>  4 files changed, 141 insertions(+), 53 deletions(-)
>>
>> --
>> 2.48.0
Lorenzo Stoakes Jan. 31, 2025, 5:03 p.m. UTC | #3
+cc Kajtar, who has kindly smoke tested this series on real hardware and
confirmed things are working ostensibly the same as before.

On this basis I will be un-RFC'ing this and, if Kajtar can reply to
confirm, will add a Tested-by tag to patch 3/3.

Again to emphasise - there is some urgency here - as struct page fields
that defio relies upon prior to this series are being removed
non-optionally.

Thanks, Lorenzo


On Mon, Jan 13, 2025 at 11:15:45PM +0000, Lorenzo Stoakes wrote:
> Right now the only means by which we can write-protect a range using the
> reverse mapping is via folio_mkclean().
>
> However this is not always the appropriate means of doing so, specifically
> in the case of the framebuffer deferred I/O logic (fb_defio enabled by
> CONFIG_FB_DEFERRED_IO). There, kernel pages are mapped read-only and
> write-protect faults used to batch up I/O operations.
>
> Each time the deferred work is done, folio_mkclean() is used to mark the
> framebuffer page as having had I/O performed on it. However doing so
> requires the kernel page (perhaps allocated via vmalloc()) to have its
> page->mapping, index fields set so the rmap can find everything that maps
> it in order to write-protect.
>
> This is problematic as firstly, these fields should not be set for
> kernel-allocated memory, and secondly these are not folios (it's not user
> memory) and page->index, mapping fields are now deprecated and soon to be
> removed.
>
> The implementers cannot be blamed for having used this however, as there is
> simply no other way of performing this operation correctly.
>
> This series fixes this - we provide the mapping_wrprotect_page() function
> to allow the reverse mapping to be used to look up mappings from the page
> cache object (i.e. its address_space pointer) at a specific offset.
>
> The fb_defio logic already stores this offset, and can simply be expanded
> to keep track of the page cache object, so the change then becomes
> straight-forward.
>
> This series should have no functional change.
>
> *** REVIEWERS NOTES: ***
>
> I do not have any hardware that uses fb_defio, so I'm asking for help with
> testing this series from those who do :) I have tested the mm side of this,
> and done a quick compile smoke test of the fb_defio side but this _very
> much_ requires testing on actual hardware to ensure everything behaves as
> expected.
>
> This is based on Andrew's tree [0] in the mm-unstable branch - I was
> thinking it'd be best to go through the mm tree (with fb_defio maintainer
> approval, of course!) as it relies upon the mm changes to work correctly.
>
> [0]: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/
>
> RFC v2:
> * Updated Jaya Kumar's email on cc - the MAINTAINERS section is apparently incorrect.
> * Corrected rmap_walk_file() comment to refer to folios as per Matthew.
> * Reference folio->mapping rather than folio_mapping(folio) in rmap_walk_file()
>   as per Matthew.
> * Reference folio->index rather than folio_pgoff(folio) in rmap_walk_file() as
>   per Matthew.
> * Renamed rmap_wrprotect_file_page() to mapping_wrprotect_page() as per Matthew.
> * Fixed kerneldoc and moved to implementation as per Matthew.
> * Updated mapping_wrprotect_page() to take a struct page pointer as per David.
> * Removed folio lock when invoking mapping_wrprotect_page() in
>   fb_deferred_io_work() as per Matthew.
> * Removed compound_nr() in fb_deferred_io_work() as per Matthew.
>
> RFC v1:
> https://lore.kernel.org/all/1e452b5b65f15a9a5d0c2ed3f5f812fdd1367603.1736352361.git.lorenzo.stoakes@oracle.com/
>
> Lorenzo Stoakes (3):
>   mm: refactor rmap_walk_file() to separate out traversal logic
>   mm: provide mapping_wrprotect_page() function
>   fb_defio: do not use deprecated page->mapping, index fields
>
>  drivers/video/fbdev/core/fb_defio.c |  38 ++-----
>  include/linux/fb.h                  |   1 +
>  include/linux/rmap.h                |   3 +
>  mm/rmap.c                           | 152 +++++++++++++++++++++++-----
>  4 files changed, 141 insertions(+), 53 deletions(-)
>
> --
> 2.48.0
Kajtár Zsolt Jan. 31, 2025, 6:01 p.m. UTC | #4
> +cc Kajtar, who has kindly smoke tested this series on real hardware and
> confirmed things are working ostensibly the same as before.
> 
> On this basis I will be un-RFC'ing this and, if Kajtar can reply to
> confirm, will add a Tested-by tag to patch 3/3.

No problem, I'm glad I could help. Using defio is required for higher
resolution modes anyway so I just had to code that part a bit earlier
than planned.
Thomas Zimmermann Feb. 3, 2025, 10:24 a.m. UTC | #5
Hi


Am 14.01.25 um 00:15 schrieb Lorenzo Stoakes:
[...]
>
> *** REVIEWERS NOTES: ***
>
> I do not have any hardware that uses fb_defio, so I'm asking for help with
> testing this series from those who do :) I have tested the mm side of this,
> and done a quick compile smoke test of the fb_defio side but this _very
> much_ requires testing on actual hardware to ensure everything behaves as
> expected.

With a recent Linux distro, you likely boot up graphics with simpledrm, 
which uses fb_defio as part of its console emulation. To test, boot the 
kernel with the 'nomodeset' parameter and write to /dev/fb0.

Best regards
Thomas

>
> This is based on Andrew's tree [0] in the mm-unstable branch - I was
> thinking it'd be best to go through the mm tree (with fb_defio maintainer
> approval, of course!) as it relies upon the mm changes to work correctly.
>
> [0]: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/
>
> RFC v2:
> * Updated Jaya Kumar's email on cc - the MAINTAINERS section is apparently incorrect.
> * Corrected rmap_walk_file() comment to refer to folios as per Matthew.
> * Reference folio->mapping rather than folio_mapping(folio) in rmap_walk_file()
>    as per Matthew.
> * Reference folio->index rather than folio_pgoff(folio) in rmap_walk_file() as
>    per Matthew.
> * Renamed rmap_wrprotect_file_page() to mapping_wrprotect_page() as per Matthew.
> * Fixed kerneldoc and moved to implementation as per Matthew.
> * Updated mapping_wrprotect_page() to take a struct page pointer as per David.
> * Removed folio lock when invoking mapping_wrprotect_page() in
>    fb_deferred_io_work() as per Matthew.
> * Removed compound_nr() in fb_deferred_io_work() as per Matthew.
>
> RFC v1:
> https://lore.kernel.org/all/1e452b5b65f15a9a5d0c2ed3f5f812fdd1367603.1736352361.git.lorenzo.stoakes@oracle.com/
>
> Lorenzo Stoakes (3):
>    mm: refactor rmap_walk_file() to separate out traversal logic
>    mm: provide mapping_wrprotect_page() function
>    fb_defio: do not use deprecated page->mapping, index fields
>
>   drivers/video/fbdev/core/fb_defio.c |  38 ++-----
>   include/linux/fb.h                  |   1 +
>   include/linux/rmap.h                |   3 +
>   mm/rmap.c                           | 152 +++++++++++++++++++++++-----
>   4 files changed, 141 insertions(+), 53 deletions(-)
>
> --
> 2.48.0
Lorenzo Stoakes Feb. 3, 2025, 10:54 a.m. UTC | #6
On Mon, Feb 03, 2025 at 11:24:50AM +0100, Thomas Zimmermann wrote:
> Hi
>
>
> Am 14.01.25 um 00:15 schrieb Lorenzo Stoakes:
> [...]
> >
> > *** REVIEWERS NOTES: ***
> >
> > I do not have any hardware that uses fb_defio, so I'm asking for help with
> > testing this series from those who do :) I have tested the mm side of this,
> > and done a quick compile smoke test of the fb_defio side but this _very
> > much_ requires testing on actual hardware to ensure everything behaves as
> > expected.
>
> With a recent Linux distro, you likely boot up graphics with simpledrm,
> which uses fb_defio as part of its console emulation. To test, boot the
> kernel with the 'nomodeset' parameter and write to /dev/fb0.
>
> Best regards
> Thomas

Thanks, luckily a user kindly tested the series on real hardware and
confirmed it worked, so the series is now un-RFC'd [1] :) but will try this
if it needs significant respin.

[1]:https://lore.kernel.org/all/cover.1738347308.git.lorenzo.stoakes@oracle.com/

>
> >
> > This is based on Andrew's tree [0] in the mm-unstable branch - I was
> > thinking it'd be best to go through the mm tree (with fb_defio maintainer
> > approval, of course!) as it relies upon the mm changes to work correctly.
> >
> > [0]: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/
> >
> > RFC v2:
> > * Updated Jaya Kumar's email on cc - the MAINTAINERS section is apparently incorrect.
> > * Corrected rmap_walk_file() comment to refer to folios as per Matthew.
> > * Reference folio->mapping rather than folio_mapping(folio) in rmap_walk_file()
> >    as per Matthew.
> > * Reference folio->index rather than folio_pgoff(folio) in rmap_walk_file() as
> >    per Matthew.
> > * Renamed rmap_wrprotect_file_page() to mapping_wrprotect_page() as per Matthew.
> > * Fixed kerneldoc and moved to implementation as per Matthew.
> > * Updated mapping_wrprotect_page() to take a struct page pointer as per David.
> > * Removed folio lock when invoking mapping_wrprotect_page() in
> >    fb_deferred_io_work() as per Matthew.
> > * Removed compound_nr() in fb_deferred_io_work() as per Matthew.
> >
> > RFC v1:
> > https://lore.kernel.org/all/1e452b5b65f15a9a5d0c2ed3f5f812fdd1367603.1736352361.git.lorenzo.stoakes@oracle.com/
> >
> > Lorenzo Stoakes (3):
> >    mm: refactor rmap_walk_file() to separate out traversal logic
> >    mm: provide mapping_wrprotect_page() function
> >    fb_defio: do not use deprecated page->mapping, index fields
> >
> >   drivers/video/fbdev/core/fb_defio.c |  38 ++-----
> >   include/linux/fb.h                  |   1 +
> >   include/linux/rmap.h                |   3 +
> >   mm/rmap.c                           | 152 +++++++++++++++++++++++-----
> >   4 files changed, 141 insertions(+), 53 deletions(-)
> >
> > --
> > 2.48.0
>
> --
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Frankenstrasse 146, 90461 Nuernberg, Germany
> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
> HRB 36809 (AG Nuernberg)
>