
fuse: dax: No-op writepages callback

Message ID 20241113-dax-no-writeback-v1-1-ee2c3a8d9f84@asahilina.net (mailing list archive)
State New
Series fuse: dax: No-op writepages callback

Commit Message

Asahi Lina Nov. 12, 2024, 7:55 p.m. UTC
When using FUSE DAX with virtiofs, cache coherency is managed by the
host. Disk persistence is handled via fsync() and friends, which are
passed directly via the FUSE layer to the host. Therefore, there's no
need to do dax_writeback_mapping_range(). All that ends up doing is a
cache flush operation, which is not caught by KVM and doesn't do much,
since the host and guest are already cache-coherent.

Since dax_writeback_mapping_range() checks that the inode block size is
equal to PAGE_SIZE, this fixes a spurious WARN when virtiofs is used
with a mismatched guest PAGE_SIZE and virtiofs backing FS block size
(this happens, for example, when it's a tmpfs and the host and guest
have a different PAGE_SIZE). FUSE DAX does not require any particular FS
block size, since it always performs DAX mappings in aligned 2MiB
blocks.

See discussion in [1].

[1] https://lore.kernel.org/lkml/20241101-dax-page-size-v1-1-eedbd0c6b08f@asahilina.net/T/#u

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Asahi Lina <lina@asahilina.net>
---
 fs/fuse/dax.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)


---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20241113-dax-no-writeback-41e6bb3698bc

Cheers,
~~ Lina
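
A minimal userspace sketch of the spurious WARN described in the commit message may help. It is a model, not kernel code: `model_inode`, `model_dax_writeback_check()`, `model_fuse_dax_writepages()` and `model_warn_count` are hypothetical names, and the `-EIO` return on mismatch is an assumption about what the real check does beyond warning. The example values (a 16 KiB backing block size against a 4 KiB guest page) are illustrative only.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the one inode field the check looks at. */
struct model_inode {
	unsigned int i_blkbits;	/* log2 of the backing FS block size */
};

/* Counts how many times the modelled warning fired. */
static int model_warn_count;

/*
 * Models the precondition the commit message attributes to
 * dax_writeback_mapping_range(): block size must equal PAGE_SIZE.
 * Returning -EIO on mismatch is an assumption of this sketch.
 */
static int model_dax_writeback_check(const struct model_inode *inode,
				     unsigned int page_shift)
{
	if (inode->i_blkbits != page_shift) {
		model_warn_count++;
		fprintf(stderr, "WARN: i_blkbits %u != PAGE_SHIFT %u\n",
			inode->i_blkbits, page_shift);
		return -EIO;
	}
	return 0;
}

/*
 * Models the patched callback: it returns success without ever
 * reaching the block size check, so the warning cannot fire.
 */
static int model_fuse_dax_writepages(const struct model_inode *inode,
				     unsigned int page_shift)
{
	(void)inode;
	(void)page_shift;
	return 0;
}
```

With `i_blkbits = 14` and `page_shift = 12`, the modelled check warns and fails, while the modelled no-op callback succeeds regardless of the block size.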

Comments

Dan Williams Nov. 12, 2024, 8:11 p.m. UTC | #1
Asahi Lina wrote:
> When using FUSE DAX with virtiofs, cache coherency is managed by the
> host. Disk persistence is handled via fsync() and friends, which are
> passed directly via the FUSE layer to the host. Therefore, there's no
> need to do dax_writeback_mapping_range(). All that ends up doing is a
> cache flush operation, which is not caught by KVM and doesn't do much,
> since the host and guest are already cache-coherent.
> 
> Since dax_writeback_mapping_range() checks that the inode block size is
> equal to PAGE_SIZE, this fixes a spurious WARN when virtiofs is used
> with a mismatched guest PAGE_SIZE and virtiofs backing FS block size
> (this happens, for example, when it's a tmpfs and the host and guest
> have a different PAGE_SIZE). FUSE DAX does not require any particular FS
> block size, since it always performs DAX mappings in aligned 2MiB
> blocks.
> 
> See discussion in [1].
> 
> [1] https://lore.kernel.org/lkml/20241101-dax-page-size-v1-1-eedbd0c6b08f@asahilina.net/T/#u
> 
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Asahi Lina <lina@asahilina.net>
> ---
>  fs/fuse/dax.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)

Looks good to me, thanks for the discussion on this.

Acked-by: Dan Williams <dan.j.williams@intel.com>
Miklos Szeredi Nov. 13, 2024, 10:48 a.m. UTC | #2
On Tue, 12 Nov 2024 at 20:55, Asahi Lina <lina@asahilina.net> wrote:
>
> When using FUSE DAX with virtiofs, cache coherency is managed by the
> host. Disk persistence is handled via fsync() and friends, which are
> passed directly via the FUSE layer to the host. Therefore, there's no
> need to do dax_writeback_mapping_range(). All that ends up doing is a
> cache flush operation, which is not caught by KVM and doesn't do much,
> since the host and guest are already cache-coherent.

The conclusion seems convincing.  But adding Vivek, who originally
added this in commit 9483e7d5809a ("virtiofs: define dax address space
operations").

What I'm not clearly seeing is how virtually aliased CPU caches
interact with this.  In mm/filemap.c I see the flush_dcache_folio()
calls which deal with the kernel mapping of a page being in a
different cacheline as the user mapping.  How does that work in the
virt environment?

Also, I suggest removing the writepages callback instead of leaving it
as a no-op.

Thanks,
Miklos
Asahi Lina Nov. 13, 2024, 3:17 p.m. UTC | #3
On 11/13/24 7:48 PM, Miklos Szeredi wrote:
> On Tue, 12 Nov 2024 at 20:55, Asahi Lina <lina@asahilina.net> wrote:
>>
>> When using FUSE DAX with virtiofs, cache coherency is managed by the
>> host. Disk persistence is handled via fsync() and friends, which are
>> passed directly via the FUSE layer to the host. Therefore, there's no
>> need to do dax_writeback_mapping_range(). All that ends up doing is a
>> cache flush operation, which is not caught by KVM and doesn't do much,
>> since the host and guest are already cache-coherent.
> 
> The conclusion seems convincing.  But adding Vivek, who originally
> added this in commit 9483e7d5809a ("virtiofs: define dax address space
> operations").
> 
> What I'm not clearly seeing is how virtually aliased CPU caches
> interact with this.  In mm/filemap.c I see the flush_dcache_folio()
> calls which deal with the kernel mapping of a page being in a
> different cacheline as the user mapping.  How does that work in the
> virt environment?
> 

Oof, I forgot those architectures existed...

The only architecture that has both a KVM implementation and selects
ARCH_HAS_CPU_CACHE_ALIASING is mips. Is it possible that no MIPS
implementations with virtualization also have cache aliasing, and we can
just not care about this?

~~ Lina

Patch

diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
index 12ef91d170bb3091ac35a33d2b9dc38330b00948..15cf7bb20b5ebf15451190dac2fcc2e841148e6c 100644
--- a/fs/fuse/dax.c
+++ b/fs/fuse/dax.c
@@ -777,11 +777,8 @@  ssize_t fuse_dax_write_iter(struct kiocb *iocb, struct iov_iter *from)
 static int fuse_dax_writepages(struct address_space *mapping,
 			       struct writeback_control *wbc)
 {
-
-	struct inode *inode = mapping->host;
-	struct fuse_conn *fc = get_fuse_conn(inode);
-
-	return dax_writeback_mapping_range(mapping, fc->dax->dev, wbc);
+	/* nothing to flush, fuse cache coherency is managed by the host */
+	return 0;
 }
 
 static vm_fault_t __fuse_dax_fault(struct vm_fault *vmf, unsigned int order,
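
The difference between the no-op callback in this patch and the removal suggested in comment #2 can be sketched in userspace C. This is a model under a stated assumption: that the caller treats a missing writepages callback as "nothing to write" and reports success. The thread does not confirm that behaviour, and every name below (`model_aops`, `model_mapping`, `model_do_writepages()`, and so on) is hypothetical rather than a kernel symbol.

```c
#include <stddef.h>

struct model_mapping;

/* Hypothetical, reduced analogue of an address space operations table. */
struct model_aops {
	int (*writepages)(struct model_mapping *mapping);
};

struct model_mapping {
	const struct model_aops *a_ops;
	int writepages_calls;	/* how many times the callback actually ran */
};

/* Variant 1: keep the callback, but make it do nothing (this patch). */
static int model_noop_writepages(struct model_mapping *mapping)
{
	mapping->writepages_calls++;
	return 0;
}

/*
 * Assumed caller behaviour: an absent callback means there is nothing
 * to write back, so the dispatcher returns success without a call.
 */
static int model_do_writepages(struct model_mapping *mapping)
{
	if (mapping->a_ops->writepages)
		return mapping->a_ops->writepages(mapping);
	return 0;
}

/* Variant 1 table: callback present but empty. */
static const struct model_aops model_aops_noop = {
	.writepages = model_noop_writepages,
};

/* Variant 2 table: callback removed (the suggestion in comment #2). */
static const struct model_aops model_aops_removed = {
	.writepages = NULL,
};
```

Under that assumption both variants return 0 to the caller; the removed variant simply skips one indirect call and leaves no empty function behind.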