
[PATCHSET for-next 0/3] Add FMODE_NOWAIT support to pipes

Message ID 20230308031033.155717-1-axboe@kernel.dk

Message

Jens Axboe March 8, 2023, 3:10 a.m. UTC
Hi,

One thing that's always been a bit slower than I'd like with io_uring is
dealing with pipes. They don't support IOCB_NOWAIT, and hence we need to
punt them to io-wq for handling.

This series adds support for FMODE_NOWAIT to pipes.
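
For context, io_uring only attempts nonblocking issue on files that
advertise FMODE_NOWAIT; everything else goes to io-wq. As a rough,
illustrative stand-in for that gating (the real check in io_uring is
more involved):

	/* illustrative only, not the actual io_uring code */
	static bool file_supports_nowait(struct file *file)
	{
		return file->f_mode & FMODE_NOWAIT;
	}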

Patch 1 extends pipe_buf_operations->confirm() to accept a nonblock
parameter, and wires up the caller, pipe_buf_confirm(), to have that
argument too.
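
As a rough sketch of the shape of that change (the exact prototypes in
the series may differ):

	struct pipe_buf_operations {
		/* ... */
		int (*confirm)(struct pipe_inode_info *pipe,
			       struct pipe_buffer *buf, bool nonblock);
	};

	static inline int pipe_buf_confirm(struct pipe_inode_info *pipe,
					   struct pipe_buffer *buf,
					   bool nonblock)
	{
		if (!buf->ops->confirm)
			return 0;
		return buf->ops->confirm(pipe, buf, nonblock);
	}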

Patch 2 makes pipes deal with IOCB_NOWAIT for locking the pipe, calling
pipe_buf_confirm(), and for allocating new pages on writes.
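
The locking side of that is the usual trylock pattern, along these
lines (illustrative, not the exact code in the series):

	if (iocb->ki_flags & IOCB_NOWAIT) {
		if (!mutex_trylock(&pipe->mutex))
			return -EAGAIN;
	} else {
		mutex_lock(&pipe->mutex);
	}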

Patch 3 flicks the switch and enables FMODE_NOWAIT for pipes.
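
At its core that is just setting the flag when a pipe file is opened,
e.g. (illustrative):

	filp->f_mode |= FMODE_NOWAIT;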

Curious how big a difference this makes, I wrote a small benchmark
that simply opens 128 pipes and then does 256 rounds of reading and
writing to them. This was run 10 times, discarding the first run as it's
always a bit slower. Before the patch:

Avg:	262.52 msec
Stdev:	  2.12 msec
Min:	261.07 msec
Max:	267.91 msec

and after the patch:

Avg:	24.14 msec
Stdev:	 9.61 msec
Min:	17.84 msec
Max:	43.75 msec

or about a 10x improvement in performance (and efficiency).
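
For reference, the benchmark is roughly the following shape (a minimal
sketch, not the actual tool; uses liburing, error handling omitted):

	#include <liburing.h>
	#include <unistd.h>

	#define NR_PIPES	128
	#define ROUNDS		256

	int main(void)
	{
		struct io_uring ring;
		int fds[NR_PIPES][2];
		char wbuf[32] = "ping", rbuf[32];
		int i, r;

		io_uring_queue_init(2 * NR_PIPES, &ring, 0);
		for (i = 0; i < NR_PIPES; i++)
			pipe(fds[i]);

		for (r = 0; r < ROUNDS; r++) {
			for (i = 0; i < NR_PIPES; i++) {
				struct io_uring_sqe *sqe;

				/* write first so the read will find data */
				sqe = io_uring_get_sqe(&ring);
				io_uring_prep_write(sqe, fds[i][1], wbuf,
						    sizeof(wbuf), 0);
				sqe = io_uring_get_sqe(&ring);
				io_uring_prep_read(sqe, fds[i][0], rbuf,
						   sizeof(rbuf), 0);
			}
			io_uring_submit(&ring);

			/* reap one CQE per SQE submitted this round */
			for (i = 0; i < 2 * NR_PIPES; i++) {
				struct io_uring_cqe *cqe;

				io_uring_wait_cqe(&ring, &cqe);
				io_uring_cqe_seen(&ring, cqe);
			}
		}
		io_uring_queue_exit(&ring);
		return 0;
	}

(As written this is the data-available variant; swapping the prep order
so the read is issued before the write would exercise the empty-pipe
case.)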

I ran the patches through the LTP pipe and splice tests; no regressions
observed. Looking at io_uring traces, we can see that we no longer have
any io_uring_queue_async_work() traces after the patch, where previously
everything was done via io-wq.
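
One way to check for those (illustrative; the tracefs mount point may
vary on your system):

	echo 1 > /sys/kernel/debug/tracing/events/io_uring/io_uring_queue_async_work/enable
	cat /sys/kernel/debug/tracing/trace_pipe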

Comments

Jens Axboe March 8, 2023, 3:33 a.m. UTC | #1
On 3/7/23 8:10 PM, Jens Axboe wrote:
> [...]
> or about a 10x improvement in performance (and efficiency).

The above test was for a pipe that is empty when the read is issued. If
the test is changed so that the pipe has data available when the read is
issued, it looks even better:

Before:

Avg:	249.24 msec
Stdev:	  0.20 msec
Min:	248.96 msec
Max:	249.53 msec

After:

Avg:	 10.86 msec
Stdev:	  0.91 msec
Min:	 10.02 msec
Max:	 12.67 msec

or about a 23x improvement.
Dave Chinner March 8, 2023, 6:46 a.m. UTC | #2
On Tue, Mar 07, 2023 at 08:33:24PM -0700, Jens Axboe wrote:
> On 3/7/23 8:10 PM, Jens Axboe wrote:
> > [...]
> > or about a 10x improvement in performance (and efficiency).
> 
> The above test was for a pipe that is empty when the read is issued. If
> the test is changed so that the pipe has data available when the read is
> issued, it looks even better:
> 
> Before:
> 
> Avg:	249.24 msec
> Stdev:	  0.20 msec
> Min:	248.96 msec
> Max:	249.53 msec
> 
> After:
> 
> Avg:	 10.86 msec
> Stdev:	  0.91 msec
> Min:	 10.02 msec
> Max:	 12.67 msec
> 
> or about a 23x improvement.

Nice!

Code looks OK, maybe consider s/nonblock/nowait/, but I'm not a pipe
expert so I'll leave nitty gritty details to Al, et al.

Acked-by: Dave Chinner <dchinner@redhat.com>
Jens Axboe March 8, 2023, 2:30 p.m. UTC | #3
On 3/7/23 11:46 PM, Dave Chinner wrote:
> On Tue, Mar 07, 2023 at 08:33:24PM -0700, Jens Axboe wrote:
>> [...]
> 
> Nice!
> 
> Code looks OK, maybe consider s/nonblock/nowait/, but I'm not a pipe
> expert so I'll leave nitty gritty details to Al, et al.

We seem to use both somewhat interchangeably throughout the kernel. Don't
feel strongly about that one, so I'll let the majority speak on what
they prefer.

> Acked-by: Dave Chinner <dchinner@redhat.com>

Thanks, added.