diff mbox series

[v2,1/5] fuse: Fix VM_MAYSHARE and direct_io_allow_mmap

Message ID 20240131230827.207552-2-bschubert@ddn.com (mailing list archive)
State New
Headers show
Series fuse: inode IO modes and mmap | expand

Commit Message

Bernd Schubert Jan. 31, 2024, 11:08 p.m. UTC
There were multiple issues with direct_io_allow_mmap:
- fuse_link_write_file() was missing, resulting in warnings in
  fuse_write_file_get() and EIO from msync()
- "vma->vm_ops = &fuse_file_vm_ops" was not set, but especially
  fuse_page_mkwrite is needed.

The semantics of invalidate_inode_pages2() is so far not clearly defined
in fuse_file_mmap. It dates back to
commit 3121bfe76311 ("fuse: fix "direct_io" private mmap")
Though, as direct_io_allow_mmap is a new feature, that was for MAP_PRIVATE
only. As invalidate_inode_pages2() is calling into fuse_launder_folio()
and writes out dirty pages, it should be safe to call
invalidate_inode_pages2 for MAP_PRIVATE and MAP_SHARED as well.

Cc: Hao Xu <howeyxu@tencent.com>
Cc: stable@vger.kernel.org
Fixes: e78662e818f9 ("fuse: add a new fuse init flag to relax restrictions in no cache mode")
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/fuse/file.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Miklos Szeredi Feb. 1, 2024, 8:45 a.m. UTC | #1
On Thu, 1 Feb 2024 at 00:09, Bernd Schubert <bschubert@ddn.com> wrote:
>
> There were multiple issues with direct_io_allow_mmap:
> - fuse_link_write_file() was missing, resulting in warnings in
>   fuse_write_file_get() and EIO from msync()
> - "vma->vm_ops = &fuse_file_vm_ops" was not set, but especially
>   fuse_page_mkwrite is needed.
>
> The semantics of invalidate_inode_pages2() is so far not clearly defined
> in fuse_file_mmap. It dates back to
> commit 3121bfe76311 ("fuse: fix "direct_io" private mmap")
> Though, as direct_io_allow_mmap is a new feature, that was for MAP_PRIVATE
> only. As invalidate_inode_pages2() is calling into fuse_launder_folio()
> and writes out dirty pages, it should be safe to call
> invalidate_inode_pages2 for MAP_PRIVATE and MAP_SHARED as well.

Did you test with fsx (various versions can be found in LTP/xfstests)?
 It's very good at finding  mapped vs. non-mapped bugs.

Thanks,
Miklos
Bernd Schubert Feb. 1, 2024, 2:36 p.m. UTC | #2
On 2/1/24 09:45, Miklos Szeredi wrote:
> On Thu, 1 Feb 2024 at 00:09, Bernd Schubert <bschubert@ddn.com> wrote:
>>
>> There were multiple issues with direct_io_allow_mmap:
>> - fuse_link_write_file() was missing, resulting in warnings in
>>    fuse_write_file_get() and EIO from msync()
>> - "vma->vm_ops = &fuse_file_vm_ops" was not set, but especially
>>    fuse_page_mkwrite is needed.
>>
>> The semantics of invalidate_inode_pages2() is so far not clearly defined
>> in fuse_file_mmap. It dates back to
>> commit 3121bfe76311 ("fuse: fix "direct_io" private mmap")
>> Though, as direct_io_allow_mmap is a new feature, that was for MAP_PRIVATE
>> only. As invalidate_inode_pages2() is calling into fuse_launder_folio()
>> and writes out dirty pages, it should be safe to call
>> invalidate_inode_pages2 for MAP_PRIVATE and MAP_SHARED as well.
> 
> Did you test with fsx (various versions can be found in LTP/xfstests)?
>   It's very good at finding  mapped vs. non-mapped bugs.

I tested with xfstest, but not with fsx yet. I can look into that. Do 
you have by any chance an exact command I should run?


Thanks,
Bernd
Miklos Szeredi Feb. 1, 2024, 2:52 p.m. UTC | #3
On Thu, 1 Feb 2024 at 15:36, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:
>
>
>
> On 2/1/24 09:45, Miklos Szeredi wrote:
> > On Thu, 1 Feb 2024 at 00:09, Bernd Schubert <bschubert@ddn.com> wrote:
> >>
> >> There were multiple issues with direct_io_allow_mmap:
> >> - fuse_link_write_file() was missing, resulting in warnings in
> >>    fuse_write_file_get() and EIO from msync()
> >> - "vma->vm_ops = &fuse_file_vm_ops" was not set, but especially
> >>    fuse_page_mkwrite is needed.
> >>
> >> The semantics of invalidate_inode_pages2() is so far not clearly defined
> >> in fuse_file_mmap. It dates back to
> >> commit 3121bfe76311 ("fuse: fix "direct_io" private mmap")
> >> Though, as direct_io_allow_mmap is a new feature, that was for MAP_PRIVATE
> >> only. As invalidate_inode_pages2() is calling into fuse_launder_folio()
> >> and writes out dirty pages, it should be safe to call
> >> invalidate_inode_pages2 for MAP_PRIVATE and MAP_SHARED as well.
> >
> > Did you test with fsx (various versions can be found in LTP/xfstests)?
> >   It's very good at finding  mapped vs. non-mapped bugs.
>
> I tested with xfstest, but not with fsx yet. I can look into that. Do
> you have by any chance an exact command I should run?

Just specifying a filename should be good.  Make sure you test with
various open modes.

Thanks,
Miklos
Bernd Schubert Feb. 1, 2024, 3:39 p.m. UTC | #4
On 2/1/24 15:52, Miklos Szeredi wrote:
> On Thu, 1 Feb 2024 at 15:36, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:
>>
>>
>>
>> On 2/1/24 09:45, Miklos Szeredi wrote:
>>> On Thu, 1 Feb 2024 at 00:09, Bernd Schubert <bschubert@ddn.com> wrote:
>>>>
>>>> There were multiple issues with direct_io_allow_mmap:
>>>> - fuse_link_write_file() was missing, resulting in warnings in
>>>>     fuse_write_file_get() and EIO from msync()
>>>> - "vma->vm_ops = &fuse_file_vm_ops" was not set, but especially
>>>>     fuse_page_mkwrite is needed.
>>>>
>>>> The semantics of invalidate_inode_pages2() is so far not clearly defined
>>>> in fuse_file_mmap. It dates back to
>>>> commit 3121bfe76311 ("fuse: fix "direct_io" private mmap")
>>>> Though, as direct_io_allow_mmap is a new feature, that was for MAP_PRIVATE
>>>> only. As invalidate_inode_pages2() is calling into fuse_launder_folio()
>>>> and writes out dirty pages, it should be safe to call
>>>> invalidate_inode_pages2 for MAP_PRIVATE and MAP_SHARED as well.
>>>
>>> Did you test with fsx (various versions can be found in LTP/xfstests)?
>>>    It's very good at finding  mapped vs. non-mapped bugs.
>>
>> I tested with xfstest, but not with fsx yet. I can look into that. Do
>> you have by any chance an exact command I should run?
> 
> Just specifying a filename should be good.  Make sure you test with
> various open modes.



fsx immediately fails in FOPEN_DIRECT_IP mode ("passthrough_hp 
--direct-io ...") on an unpatched kernel, but continues to run in 
patched mode.

Given
-N numops: total # operations to do (default infinity)

How long do you think I should run it? Maybe something like 3 hours 
(--duration=$(3 * 60 * 60))?


Thanks,
Bernd
Miklos Szeredi Feb. 1, 2024, 3:43 p.m. UTC | #5
On Thu, 1 Feb 2024 at 16:40, Bernd Schubert <bschubert@ddn.com> wrote:

> Given
> -N numops: total # operations to do (default infinity)
>
> How long do you think I should run it? Maybe something like 3 hours
> (--duration=$(3 * 60 * 60))?

I used -N1000000.  If there were any issues they usually triggered much earlier.

Thanks,
Miklos
Amir Goldstein Feb. 1, 2024, 3:48 p.m. UTC | #6
On Thu, Feb 1, 2024 at 5:43 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Thu, 1 Feb 2024 at 16:40, Bernd Schubert <bschubert@ddn.com> wrote:
>
> > Given
> > -N numops: total # operations to do (default infinity)
> >
> > How long do you think I should run it? Maybe something like 3 hours
> > (--duration=$(3 * 60 * 60))?
>
> I used -N1000000.  If there were any issues they usually triggered much earlier.
>

Note that at least these fstests run fsx in several configurations:

$ grep begin `git grep -l run_fsx tests/generic/`
tests/generic/091:_begin_fstest rw auto quick
tests/generic/263:_begin_fstest rw auto quick
tests/generic/469:_begin_fstest auto quick punch zero prealloc
tests/generic/521:_begin_fstest soak long_rw smoketest
tests/generic/522:_begin_fstest soak long_rw smoketest
tests/generic/616:_begin_fstest auto rw io_uring stress soak
tests/generic/617:_begin_fstest auto rw io_uring stress soak

Bernd, you've probably already ran them if you are running auto, quick
or rw test groups.

Possibly you want to try and run also the -g soak.long_rw tests.

They use nr_ops=$((1000000 * TIME_FACTOR))

Thanks,
Amir.
Bernd Schubert Feb. 1, 2024, 4:16 p.m. UTC | #7
On 2/1/24 16:48, Amir Goldstein wrote:
> On Thu, Feb 1, 2024 at 5:43 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>>
>> On Thu, 1 Feb 2024 at 16:40, Bernd Schubert <bschubert@ddn.com> wrote:
>>
>>> Given
>>> -N numops: total # operations to do (default infinity)
>>>
>>> How long do you think I should run it? Maybe something like 3 hours
>>> (--duration=$(3 * 60 * 60))?
>>
>> I used -N1000000.  If there were any issues they usually triggered much earlier.
>>
> 
> Note that at least these fstests run fsx in several configurations:
> 
> $ grep begin `git grep -l run_fsx tests/generic/`
> tests/generic/091:_begin_fstest rw auto quick
> tests/generic/263:_begin_fstest rw auto quick
> tests/generic/469:_begin_fstest auto quick punch zero prealloc
> tests/generic/521:_begin_fstest soak long_rw smoketest
> tests/generic/522:_begin_fstest soak long_rw smoketest
> tests/generic/616:_begin_fstest auto rw io_uring stress soak
> tests/generic/617:_begin_fstest auto rw io_uring stress soak
> 
> Bernd, you've probably already ran them if you are running auto, quick
> or rw test groups.
> 
> Possibly you want to try and run also the -g soak.long_rw tests.
> 
> They use nr_ops=$((1000000 * TIME_FACTOR))


I didn't check yet what is the actual value, but TIME_FACTOR must be 
smaller than 1 - using "-N1000000" is taking quite some time. I should 
have started in screen. Some of the tests are marked as failed, like

generic/263:
     +main: filesystem does not support fallocate mode 
FALLOC_FL_KEEP_SIZE, disabling!
     +main: filesystem does not support fallocate mode 
FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, disabling!
     +main: filesystem does not support fallocate mode 
FALLOC_FL_ZERO_RANGE, disabling!


Although the test runs

generic/469 7s ... [not run] xfs_io falloc -k failed (old kernel/wrong fs?)


generic/521
generic/522
--> Somehow not in the output list at all. Ah I see, that needs "-g 
soak.long_rw"



Going to add the the soak.long option to the tests and will run again, 
once current fsx is over.


Thanks,
Bernd
Bernd Schubert Feb. 2, 2024, 2:47 p.m. UTC | #8
On 2/1/24 16:43, Miklos Szeredi wrote:
> On Thu, 1 Feb 2024 at 16:40, Bernd Schubert <bschubert@ddn.com> wrote:
> 
>> Given
>> -N numops: total # operations to do (default infinity)
>>
>> How long do you think I should run it? Maybe something like 3 hours
>> (--duration=$(3 * 60 * 60))?
> 
> I used -N1000000.  If there were any issues they usually triggered much earlier.

Forgot to post, it succeeds both, with and without FOPEN_DIRECT_IO with


bernd@squeeze1 ~>/home/fusetests/src/xfstests-dev/ltp/fsx -N1000000
/scratch/dest/testfile
Seed set to 1
main: filesystem does not support fallocate mode FALLOC_FL_KEEP_SIZE,
disabling!
main: filesystem does not support fallocate mode FALLOC_FL_PUNCH_HOLE |
FALLOC_FL_KEEP_SIZE, disabling!
main: filesystem does not support fallocate mode FALLOC_FL_ZERO_RANGE,
disabling!
main: filesystem does not support fallocate mode
FALLOC_FL_COLLAPSE_RANGE, disabling!
main: filesystem does not support fallocate mode FALLOC_FL_INSERT_RANGE,
disabling!
main: filesystem does not support clone range, disabling!
main: filesystem does not support dedupe range, disabling!
main: filesystem does not support exchange range, disabling!
truncating to largest ever: 0x3aea7
copying to largest ever: 0x3e19b
copying to largest ever: 0x3e343
fallocating to largest ever: 0x40000
skipping zero length fallocate
skipping zero size write
All 1000000 operations completed A-OK!

(I always tested the entire patch series).


Thanks,
Bernd
diff mbox series

Patch

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 148a71b8b4d0..243f469cac07 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2476,7 +2476,10 @@  static int fuse_file_mmap(struct file *file, struct vm_area_struct *vma)
 
 		invalidate_inode_pages2(file->f_mapping);
 
-		return generic_file_mmap(file, vma);
+		if (!(vma->vm_flags & VM_MAYSHARE)) {
+			/* MAP_PRIVATE */
+			return generic_file_mmap(file, vma);
+		}
 	}
 
 	if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE))