diff mbox series

fs: optimise kiocb_set_rw_flags()

Message ID e523f51f59ad6ecdad4ad22c560cb9c913e96e1a.1596277420.git.asml.silence@gmail.com (mailing list archive)
State New, archived
Headers show
Series fs: optimise kiocb_set_rw_flags() | expand

Commit Message

Pavel Begunkov Aug. 1, 2020, 10:36 a.m. UTC
Use a local var to collect flags in kiocb_set_rw_flags(). That spares
some memory writes and allows to replace most of the jumps with MOVEcc.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---

v2: fast exit on flags == 0 (Matthew Wilcox)

 include/linux/fs.h | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

Comments

Pavel Begunkov Aug. 1, 2020, 10:42 a.m. UTC | #1
On 01/08/2020 13:36, Pavel Begunkov wrote:
> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
> some memory writes and allows to replace most of the jumps with MOVEcc.

This one is pretty old, but looks like nobody in fsdevel cares.

Jens, any chance you can pick this up? For instancec, I don't like the
binary it generates for io_uring.

> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
> 
> v2: fast exit on flags == 0 (Matthew Wilcox)
> 
>  include/linux/fs.h | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 8a00ba99284e..b7f1f1b7d691 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3282,22 +3282,28 @@ static inline int iocb_flags(struct file *file)
>  
>  static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
>  {
> +	int kiocb_flags = 0;
> +
> +	if (!flags)
> +		return 0;
>  	if (unlikely(flags & ~RWF_SUPPORTED))
>  		return -EOPNOTSUPP;
>  
>  	if (flags & RWF_NOWAIT) {
>  		if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
>  			return -EOPNOTSUPP;
> -		ki->ki_flags |= IOCB_NOWAIT;
> +		kiocb_flags |= IOCB_NOWAIT;
>  	}
>  	if (flags & RWF_HIPRI)
> -		ki->ki_flags |= IOCB_HIPRI;
> +		kiocb_flags |= IOCB_HIPRI;
>  	if (flags & RWF_DSYNC)
> -		ki->ki_flags |= IOCB_DSYNC;
> +		kiocb_flags |= IOCB_DSYNC;
>  	if (flags & RWF_SYNC)
> -		ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
> +		kiocb_flags |= (IOCB_DSYNC | IOCB_SYNC);
>  	if (flags & RWF_APPEND)
> -		ki->ki_flags |= IOCB_APPEND;
> +		kiocb_flags |= IOCB_APPEND;
> +
> +	ki->ki_flags |= kiocb_flags;
>  	return 0;
>  }
>  
>
Matthew Wilcox Aug. 1, 2020, 3:37 p.m. UTC | #2
On Sat, Aug 01, 2020 at 01:36:33PM +0300, Pavel Begunkov wrote:
> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
> some memory writes and allows to replace most of the jumps with MOVEcc.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>

If you want to improve the codegen here further, I would suggest that
renumbering the IOCB flags to match the RWF flags would lead to better
codegen (can't do it the other way around; RWF flags are userspace ABI,
IOCB flags are not).  iocb_flags() probably doesn't get any worse because
the IOCB_ flags don't have the same numbers as the O_ bits (which differ
by arch anyway).
Jens Axboe Aug. 1, 2020, 5:01 p.m. UTC | #3
On 8/1/20 9:37 AM, Matthew Wilcox wrote:
> On Sat, Aug 01, 2020 at 01:36:33PM +0300, Pavel Begunkov wrote:
>> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
>> some memory writes and allows to replace most of the jumps with MOVEcc.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> 
> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> 
> If you want to improve the codegen here further, I would suggest that
> renumbering the IOCB flags to match the RWF flags would lead to better
> codegen (can't do it the other way around; RWF flags are userspace ABI,
> IOCB flags are not).  iocb_flags() probably doesn't get any worse because
> the IOCB_ flags don't have the same numbers as the O_ bits (which differ
> by arch anyway).

Yeah that's not a bad idea, would kill a lot of branches.
Jens Axboe Aug. 1, 2020, 5:02 p.m. UTC | #4
On 8/1/20 4:36 AM, Pavel Begunkov wrote:
> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
> some memory writes and allows to replace most of the jumps with MOVEcc.

I've picked this one up.
Pavel Begunkov Aug. 2, 2020, 8:21 a.m. UTC | #5
On 01/08/2020 20:02, Jens Axboe wrote:
> On 8/1/20 4:36 AM, Pavel Begunkov wrote:
>> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
>> some memory writes and allows to replace most of the jumps with MOVEcc.
> 
> I've picked this one up.

Thanks
Pavel Begunkov Aug. 2, 2020, 8:33 a.m. UTC | #6
On 01/08/2020 20:01, Jens Axboe wrote:
> On 8/1/20 9:37 AM, Matthew Wilcox wrote:
>> On Sat, Aug 01, 2020 at 01:36:33PM +0300, Pavel Begunkov wrote:
>>> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
>>> some memory writes and allows to replace most of the jumps with MOVEcc.
>>>
>>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>>
>> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
>>
>> If you want to improve the codegen here further, I would suggest that
>> renumbering the IOCB flags to match the RWF flags would lead to better
>> codegen (can't do it the other way around; RWF flags are userspace ABI,
>> IOCB flags are not).  iocb_flags() probably doesn't get any worse because
>> the IOCB_ flags don't have the same numbers as the O_ bits (which differ
>> by arch anyway).
> 
> Yeah that's not a bad idea, would kill a lot of branches.

Is that common here to do so? I've done this for io_uring flags a while
ago, but left RWF alone at the time, reluctant to check for possible
complications (e.g. bit magic).
Pavel Begunkov Aug. 2, 2020, 8:41 a.m. UTC | #7
On 01/08/2020 18:37, Matthew Wilcox wrote:
> On Sat, Aug 01, 2020 at 01:36:33PM +0300, Pavel Begunkov wrote:
>> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
>> some memory writes and allows to replace most of the jumps with MOVEcc.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> 
> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Thanks for reviewing it

> 
> If you want to improve the codegen here further, I would suggest that
> renumbering the IOCB flags to match the RWF flags would lead to better
> codegen (can't do it the other way around; RWF flags are userspace ABI,
> IOCB flags are not).  iocb_flags() probably doesn't get any worse because
> the IOCB_ flags don't have the same numbers as the O_ bits (which differ
> by arch anyway).
>
diff mbox series

Patch

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8a00ba99284e..b7f1f1b7d691 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3282,22 +3282,28 @@  static inline int iocb_flags(struct file *file)
 
 static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 {
+	int kiocb_flags = 0;
+
+	if (!flags)
+		return 0;
 	if (unlikely(flags & ~RWF_SUPPORTED))
 		return -EOPNOTSUPP;
 
 	if (flags & RWF_NOWAIT) {
 		if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
 			return -EOPNOTSUPP;
-		ki->ki_flags |= IOCB_NOWAIT;
+		kiocb_flags |= IOCB_NOWAIT;
 	}
 	if (flags & RWF_HIPRI)
-		ki->ki_flags |= IOCB_HIPRI;
+		kiocb_flags |= IOCB_HIPRI;
 	if (flags & RWF_DSYNC)
-		ki->ki_flags |= IOCB_DSYNC;
+		kiocb_flags |= IOCB_DSYNC;
 	if (flags & RWF_SYNC)
-		ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
+		kiocb_flags |= (IOCB_DSYNC | IOCB_SYNC);
 	if (flags & RWF_APPEND)
-		ki->ki_flags |= IOCB_APPEND;
+		kiocb_flags |= IOCB_APPEND;
+
+	ki->ki_flags |= kiocb_flags;
 	return 0;
 }