diff mbox series

[RFC] vfs: wrap CONFIG_READ_ONLY_THP_FOR_FS-related code with an ifdef

Message ID 20240624074135.486845-1-mjguzik@gmail.com (mailing list archive)
State New
Headers show
Series [RFC] vfs: wrap CONFIG_READ_ONLY_THP_FOR_FS-related code with an ifdef | expand

Commit Message

Mateusz Guzik June 24, 2024, 7:41 a.m. UTC
On kernels compiled without this option (which is currently the default
state) filemap_nr_thps expands to 0.

do_dentry_open has a big chunk dependent on it, most of which gets
optimized away, except for a branch and a full fence:

if (f->f_mode & FMODE_WRITE) {
[snip]
        smp_mb();
        if (filemap_nr_thps(inode->i_mapping)) {
[snip]
	}
}

While the branch is pretty minor the fence really does not need to be
there.

This is a bare-minimum patch which takes care of it until someone(tm)
cleans this up. Notably it does not conditionally compile other spots
which issue the matching fence.

I did not bother benchmarking it, not issuing a spurious full fence in
the fast path does not warrant justification from perf standpoint.

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---

I am not particularly familiar with any of this, the smp_mb in the open
for write path was sticking out like a sore thumb on code read so I
figured there may be One Weird Trick to whack it.

If the stock code is correct as is, then the ifdef as above is fine.

The ifdefed chunk is big enough that it should probably be its own
routine. I don't want to bikeshed so I did not go for it.

For a moment I considered adding filemap_nr_thps_mb which would expand
to 0 or issue the fence + do the read, but then I figured a routine
claiming to post a fence and only conditionally do it is misleading at
best.

As per the commit message fences in collapse_file remain compiled in.
It is unclear to me if the code following them is doing anything useful
on kernels !CONFIG_READ_ONLY_THP_FOR_FS.

All that said, if there is cosmetic touch ups you want done here, I can
do them.

However, a nice full patch would take care of all of the above and I
have neither the information needed to do it nor the interest to get it,
so should someone insinst on a full version I'm going to suggest they
write it themselves. I repeat this is merely a damage control until
someone sorts thigs out.

 fs/open.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Mateusz Guzik June 24, 2024, 8:17 a.m. UTC | #1
On Mon, Jun 24, 2024 at 09:41:34AM +0200, Mateusz Guzik wrote:
> On kernels compiled without this option (which is currently the default
> state) filemap_nr_thps expands to 0.
> 
> do_dentry_open has a big chunk dependent on it, most of which gets
> optimized away, except for a branch and a full fence:
> 
> if (f->f_mode & FMODE_WRITE) {
> [snip]
>         smp_mb();
>         if (filemap_nr_thps(inode->i_mapping)) {
> [snip]
> 	}
> }
> 
> While the branch is pretty minor the fence really does not need to be
> there.
> 

[the To: recipient bounces, thus got dropped]

Now that I sent this I remembered that some of the atomic ops in Linux
provide a full fence regardless of an architecture.

get_write_access uses atomic_inc_unless_negative which qualifies?

  1551 static __always_inline bool
  1552 atomic_inc_unless_negative(atomic_t *v)
  1553 {
  1554 │       kcsan_mb();
  1555 │       instrument_atomic_read_write(v, sizeof(*v));
  1556 │       return raw_atomic_inc_unless_negative(v);
  1557 }

If so, the ifdefed-out smp_mb can be eliminated altogether if I got my
fences right.

On top of that mnt_get_write_access injects another smp_mb() anyway.

> This is a bare-minimum patch which takes care of it until someone(tm)
> cleans this up. Notably it does not conditionally compile other spots
> which issue the matching fence.
> 
> I did not bother benchmarking it, not issuing a spurious full fence in
> the fast path does not warrant justification from perf standpoint.
> 
> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
> ---
> 
> I am not particularly familiar with any of this, the smp_mb in the open
> for write path was sticking out like a sore thumb on code read so I
> figured there may be One Weird Trick to whack it.
> 
> If the stock code is correct as is, then the ifdef as above is fine.
> 
> The ifdefed chunk is big enough that it should probably be its own
> routine. I don't want to bikeshed so I did not go for it.
> 
> For a moment I considered adding filemap_nr_thps_mb which would expand
> to 0 or issue the fence + do the read, but then I figured a routine
> claiming to post a fence and only conditionally do it is misleading at
> best.
> 
> As per the commit message fences in collapse_file remain compiled in.
> It is unclear to me if the code following them is doing anything useful
> on kernels !CONFIG_READ_ONLY_THP_FOR_FS.
> 
> All that said, if there is cosmetic touch ups you want done here, I can
> do them.
> 
> However, a nice full patch would take care of all of the above and I
> have neither the information needed to do it nor the interest to get it,
> so should someone insinst on a full version I'm going to suggest they
> write it themselves. I repeat this is merely a damage control until
> someone sorts thigs out.
> 
>  fs/open.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/open.c b/fs/open.c
> index 28f2fcbebb1b..654c300b3c33 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -980,6 +980,7 @@ static int do_dentry_open(struct file *f,
>  	if ((f->f_flags & O_DIRECT) && !(f->f_mode & FMODE_CAN_ODIRECT))
>  		return -EINVAL;
>  
> +#ifdef CONFIG_READ_ONLY_THP_FOR_FS
>  	/*
>  	 * XXX: Huge page cache doesn't support writing yet. Drop all page
>  	 * cache for this file before processing writes.
> @@ -1007,6 +1008,7 @@ static int do_dentry_open(struct file *f,
>  			filemap_invalidate_unlock(inode->i_mapping);
>  		}
>  	}
> +#endif
>  
>  	return 0;
>  
> -- 
> 2.43.0
>
diff mbox series

Patch

diff --git a/fs/open.c b/fs/open.c
index 28f2fcbebb1b..654c300b3c33 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -980,6 +980,7 @@  static int do_dentry_open(struct file *f,
 	if ((f->f_flags & O_DIRECT) && !(f->f_mode & FMODE_CAN_ODIRECT))
 		return -EINVAL;
 
+#ifdef CONFIG_READ_ONLY_THP_FOR_FS
 	/*
 	 * XXX: Huge page cache doesn't support writing yet. Drop all page
 	 * cache for this file before processing writes.
@@ -1007,6 +1008,7 @@  static int do_dentry_open(struct file *f,
 			filemap_invalidate_unlock(inode->i_mapping);
 		}
 	}
+#endif
 
 	return 0;