Message ID: 20220428233849.321495-1-yinxin.x@bytedance.com
Series: erofs: change to use asynchronous io for fscache readahead
Hi Xin,

On Fri, Apr 29, 2022 at 07:38:48AM +0800, Xin Yin wrote:
> Hi Jeffle & Xiang,
>
> I have tested your "fscache,erofs: fscache-based on-demand read semantics"
> v9 patch set https://www.spinics.net/lists/linux-fsdevel/msg216178.html.
> For now, it works fine with the nydus image-service. After the image data
> is fully loaded to local storage, it shows a great I/O performance gain
> compared with nydus v5, which is based on FUSE.

Yeah, thanks for your interest and efforts. Actually I'm pretty sure you
could observe CPU, bandwidth and latency improvements in densely deployed
scenarios, since our goal is to provide native performance once the data
is ready, as well as on-demand image reads and flexible cache data
management for end users.

> For 4K random reads, fscache-based erofs gets the same performance as
> the original local filesystem. But I still saw a performance drop in the
> 4K sequential read case, and I found the root cause: in
> erofs_fscache_readahead() we use synchronous I/O, which may stall the
> readahead pipelining.

Yeah, that is a known TODO. In principle, when such part of the data is
locally available, it will have similar performance (bandwidth, latency,
CPU loading) to a loop device. But we don't implement asynchronous I/O for
now, since we need to make the functionality work first, so thanks for
your patch addressing this.

> I have tried switching to asynchronous I/O during the erofs fscache
> readahead procedure, as netfs does. Then I saw a great performance gain.
>
> Here are my test steps and results:
> - generate a nydus v6 format image, in which a large file is stored for the I/O test.
> - launch the nydus image-service, and make the image data fully loaded to local storage (ext4).
> - run fio with the command below:
>   fio -ioengine=psync -bs=4k -size=5G -direct=0 -thread -rw=read -filename=./test_image -name="test" -numjobs=1 -iodepth=16 -runtime=60

Yeah, although I can see what you mean (to test buffered I/O), the
arguments are still somewhat messy (maybe because we don't support
fscache-based direct I/O for now; that is another TODO, but with low
priority.)

> v9 patches: 202654 KB/s
> v9 patches + async readahead patch: 407213 KB/s
> ext4: 439912 KB/s

May I ask whether this ext4 image is accessed through a loop device? If
not, the gap is reasonable. Anyway, it's not a big problem for now; we can
optimize it later since it should eventually be exactly the same.

And I will drop a message to Jeffle for further review since we're
approaching another 5-day national holiday.

Thanks again!
Gao Xiang
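To make the synchronous-vs-asynchronous distinction discussed above concrete, below is a minimal sketch of issuing an fscache read from a readahead path with a completion callback instead of waiting inline. It is not the patch under discussion: the request structure and helpers (demo_readahead_req, demo_read_terminated(), demo_issue_async_read()) are made-up names for illustration; only fscache_read(), the netfs_io_terminated_t callback signature, NETFS_READ_HOLE_FAIL and iov_iter_xarray() are taken from the kernel interfaces of that era.

```c
/*
 * Illustrative sketch only -- NOT the actual erofs patch discussed in this
 * thread.  It shows the general shape of the change: instead of calling
 * fscache_read() and waiting for the result inside ->readahead(), a
 * completion callback (netfs_io_terminated_t) is passed so the call can
 * return while the backing I/O is still in flight, keeping the readahead
 * pipeline busy.
 */
#include <linux/fscache.h>
#include <linux/pagemap.h>
#include <linux/refcount.h>
#include <linux/slab.h>
#include <linux/uio.h>

struct demo_readahead_req {
	struct address_space	*mapping;	/* page cache being filled */
	loff_t			 start;		/* file offset of the range */
	size_t			 len;		/* length of the range */
	refcount_t		 ref;
};

/* Matches the netfs_io_terminated_t signature expected by fscache_read(). */
static void demo_read_terminated(void *priv, ssize_t transferred_or_error,
				 bool was_async)
{
	struct demo_readahead_req *req = priv;

	/*
	 * Here the folios covered by the completed range would be marked
	 * uptodate (or errored) and unlocked; omitted to keep the sketch
	 * short.
	 */
	if (refcount_dec_and_test(&req->ref))
		kfree(req);
}

static int demo_issue_async_read(struct netfs_cache_resources *cres,
				 struct demo_readahead_req *req, loff_t pstart)
{
	struct iov_iter iter;

	iov_iter_xarray(&iter, READ, &req->mapping->i_pages,
			req->start, req->len);
	refcount_inc(&req->ref);

	/*
	 * With a termination callback (instead of NULL), fscache_read() does
	 * not need to be waited on here; completion is reported through
	 * demo_read_terminated(), so ->readahead() can go on issuing further
	 * ranges.
	 */
	return fscache_read(cres, pstart, &iter, NETFS_READ_HOLE_FAIL,
			    demo_read_terminated, req);
}
```

The refcount is only needed because several ranges can be in flight at once and the last completion has to free the request; the synchronous path discussed in the thread avoids this by waiting on each read before issuing the next, which is exactly what stalls the readahead pipeline.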
On Fri, Apr 29, 2022 at 8:36 PM Gao Xiang <hsiangkao@linux.alibaba.com> wrote:
>
> [...]
>
> > v9 patches: 202654 KB/s
> > v9 patches + async readahead patch: 407213 KB/s
> > ext4: 439912 KB/s
>
> May I ask whether this ext4 image is accessed through a loop device? If
> not, the gap is reasonable. Anyway, it's not a big problem for now; we can
> optimize it later since it should eventually be exactly the same.

This ext4 image is not accessed through a loop device; it is just the same
test file stored on a native ext4 filesystem. Actually, after further
tests, I can see that fscache-based erofs with the async readahead patch
almost achieves native performance in sequential buffered read cases.

Thanks,
Xin Yin

> And I will drop a message to Jeffle for further review since we're
> approaching another 5-day national holiday.
>
> Thanks again!
> Gao Xiang