Message ID: 20240122111751.449762-1-kernel@pankajraghav.com (mailing list archive)
Series: fstest changes for LBS
On Mon, Jan 22, 2024 at 12:17:49PM +0100, Pankaj Raghav (Samsung) wrote:
> From: Pankaj Raghav <p.raghav@samsung.com>
>
> Some tests need to be adapted to for LBS[1] based on the filesystem
> blocksize. These are generic changes where it uses the filesystem
> blocksize instead of assuming it.
>
> There are some more generic test cases that are failing due to logdev
> size requirement that changes with filesystem blocksize. I will address
> them in a separate series.
>
> [1] https://lore.kernel.org/lkml/20230915183848.1018717-1-kernel@pankajraghav.com/
>
> Pankaj Raghav (2):
>   xfs/558: scale blk IO size based on the filesystem blksz
>   xfs/161: adapt the test case for LBS filesystem

Do either of these fail and require fixing for a 64k page size
system running 64kB block size?

i.e. are these actual 64kB block size issues, or just issues with
the LBS patchset?

-Dave.
On 23/01/2024 01:25, Dave Chinner wrote:
> On Mon, Jan 22, 2024 at 12:17:49PM +0100, Pankaj Raghav (Samsung) wrote:
>> [...]
>
> Do either of these fail and require fixing for a 64k page size
> system running 64kB block size?
>
> i.e. are these actual 64kB block size issues, or just issues with
> the LBS patchset?

I had the same question in mind. Unfortunately, I don't have access to
any 64k page size machine at the moment. I will ask around if I can get
access to one.

@Zorro I saw you posted a test report for 64k blocksize. Is it possible
for you to see if these test cases (xfs/161, xfs/558) work in your setup
with a 64k block size?

CCing Ritesh as I saw him post a patch to fix a testcase for 64k block
size.
On Tue, Jan 23, 2024 at 09:52:39AM +0100, Pankaj Raghav wrote:
> On 23/01/2024 01:25, Dave Chinner wrote:
>> [...]
>
> I had the same question in mind. Unfortunately, I don't have access to
> any 64k page size machine at the moment. I will ask around if I can get
> access to one.
>
> @Zorro I saw you posted a test report for 64k blocksize. Is it possible
> for you to see if these test cases (xfs/161, xfs/558) work in your
> setup with a 64k block size?

Sure, I'll reserve one ppc64le and give it a try. But I remember there
are more failed cases on 64k blocksize xfs.

Thanks,
Zorro

> CCing Ritesh as I saw him post a patch to fix a testcase for 64k block
> size.
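For reference, running just these two tests against a 64k-block XFS looks roughly like the following (a sketch; the checkout path is hypothetical, but MKFS_OPTIONS is the standard fstests knob for passing extra mkfs flags):

```shell
# Sketch: run only xfs/161 and xfs/558 with a 64k filesystem block size.
# Assumes an fstests checkout with TEST_DEV/SCRATCH_DEV already set up
# in local.config; the ./check harness passes MKFS_OPTIONS to mkfs.xfs.
cd ~/xfstests-dev                     # hypothetical checkout path
export MKFS_OPTIONS="-b size=65536"   # format with 64k blocks
./check xfs/161 xfs/558
```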
Pankaj Raghav <p.raghav@samsung.com> writes:
> On 23/01/2024 01:25, Dave Chinner wrote:
>> [...]
>>
>> Do either of these fail and require fixing for a 64k page size
>> system running 64kB block size?
>>
>> i.e. are these actual 64kB block size issues, or just issues with
>> the LBS patchset?
>
> I had the same question in mind. Unfortunately, I don't have access to
> any 64k page size machine at the moment. I will ask around if I can get
> access to one.
>
> @Zorro I saw you posted a test report for 64k blocksize. Is it possible
> for you to see if these test cases (xfs/161, xfs/558) work in your
> setup with a 64k block size?
>
> CCing Ritesh as I saw him post a patch to fix a testcase for 64k block
> size.

Hi Pankaj,

So I tested this on Linux 6.6 on Power8 qemu (which I had handy).
xfs/558 passed with both 64k blocksize and 4k blocksize on a 64k
pagesize system.

However, since the quota tools on this system were v4.05, which does not
support the bigtime feature, I could not run xfs/161.

xfs/161       [not run] quota: bigtime support not detected
xfs/558 7s ... 21s

I will collect this info on a different system with the latest kernel
and will update for xfs/161 too.

-ritesh
Zorro Lang <zlang@redhat.com> writes:
> On Tue, Jan 23, 2024 at 09:52:39AM +0100, Pankaj Raghav wrote:
>> [...]
>>
>> @Zorro I saw you posted a test report for 64k blocksize. Is it
>> possible for you to see if these test cases (xfs/161, xfs/558) work
>> in your setup with a 64k block size?
>
> Sure, I'll reserve one ppc64le and give it a try. But I remember
> there're more failed cases on 64k blocksize xfs.

Please share the list of failed testcases with 64k bs xfs (if you have
it handy). IIRC, many of them could be due to 64k bs itself, but yes, I
can take a look and work on those.

Thanks!
-ritesh
>> @Zorro I saw you posted a test report for 64k blocksize. Is it
>> possible for you to see if these test cases (xfs/161, xfs/558) work
>> in your setup with a 64k block size?
>
> Sure, I'll reserve one ppc64le and give it a try. But I remember
> there're more failed cases on 64k blocksize xfs.

Thanks a lot, Zorro. I am also having issues with xfs/166 with LBS. I am
not sure if this exists on a 64k base page size system.

FYI, there are a lot of generic tests that are failing due to the
filesystem size being too small to fit the log with a 64k block size. At
least with LBS (I am not sure about a 64k base page system), these are
the failures due to filesystem size: generic/042, generic/081,
generic/108, generic/455, generic/457, generic/482, generic/704,
generic/730, generic/731, shared/298.

For example, in generic/042 with a 64k block size:

max log size 388 smaller than min log size 2028, filesystem is too small
[mkfs.xfs usage output snipped]

--
Pankaj
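Converting the block counts in that generic/042 error into bytes shows why the same small scratch filesystem stops fitting a log at 64k; the arithmetic below is ours and assumes the reported counts are in filesystem blocks:

```shell
# mkfs.xfs reports the log size limits in filesystem blocks, so the same
# minimum block count costs 16x more space at 64k than at 4k blocks.
blksz=65536          # 64k filesystem block size
min_log_blocks=2028  # minimum log size mkfs will accept
max_log_blocks=388   # largest log the small test filesystem can hold
echo "min log: $((min_log_blocks * blksz)) bytes"  # 132907008 (~127 MiB)
echo "max log: $((max_log_blocks * blksz)) bytes"  # 25427968 (~24 MiB)
```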
>> CCing Ritesh as I saw him post a patch to fix a testcase for 64k
>> block size.
>
> Hi Pankaj,
>
> So I tested this on Linux 6.6 on Power8 qemu (which I had handy).
> xfs/558 passed with both 64k blocksize & with 4k blocksize on a 64k
> pagesize system.

Thanks for testing it out. I will investigate this further and see why I
have this failure in LBS for 64k but not for 32k and 16k block sizes.

As this test also expects some invalidation during page cache writeback,
this might be an issue just with LBS and not for 64k page size machines.

I will probably also spend some time setting up a Power8 qemu to test
these failures.

> However, since on this system the quota was v4.05, it does not support
> bigtime feature hence could not run xfs/161.
>
> xfs/161       [not run] quota: bigtime support not detected
> xfs/558 7s ... 21s
>
> I will collect this info on a different system with latest kernel and
> will update for xfs/161 too.

Sounds good! Thanks!

> -ritesh
Pankaj Raghav <p.raghav@samsung.com> writes:
>>> CCing Ritesh as I saw him post a patch to fix a testcase for 64k
>>> block size.
>>
>> Hi Pankaj,
>>
>> So I tested this on Linux 6.6 on Power8 qemu (which I had handy).
>> xfs/558 passed with both 64k blocksize & with 4k blocksize on a 64k
>> pagesize system.

Ok, so it looks like the testcase xfs/558 is failing on linux-next with
64k blocksize but passing with 4k blocksize. I thought it was passing on
my previous Linux 6.6 release, but I guess those too were just some
lucky runs. Here is the report -

linux-next: xfs/558 aggregate results across 11 runs: pass=2 (18.2%), fail=9 (81.8%)
v6.6:       xfs/558 aggregate results across 11 runs: pass=5 (45.5%), fail=6 (54.5%)

So I guess I will spend some time analyzing the failure.

Failure log
================
xfs/558 36s ... - output mismatch (see /root/xfstests-dev/results//xfs_64k_iomap/xfs/558.out.bad)
    --- tests/xfs/558.out   2023-06-29 12:06:13.824276289 +0000
    +++ /root/xfstests-dev/results//xfs_64k_iomap/xfs/558.out.bad  2024-01-23 18:54:56.613116520 +0000
    @@ -1,2 +1,3 @@
     QA output created by 558
    +Expected to hear about writeback iomap invalidations?
     Silence is golden
    ...
    (Run 'diff -u /root/xfstests-dev/tests/xfs/558.out /root/xfstests-dev/results//xfs_64k_iomap/xfs/558.out.bad' to see the entire diff)

HINT: You _MAY_ be missing kernel fix:
      5c665e5b5af6 xfs: remove xfs_map_cow

-ritesh

> Thanks for testing it out. I will investigate this further, and see why
> I have this failure in LBS for 64k and not for 32k and 16k block sizes.
>
> [...]
>
> Sounds good! Thanks!
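As a sanity check, the percentages in that report follow directly from the raw pass/fail counts:

```shell
# Recompute the aggregate pass rates quoted above from the raw counts.
awk 'BEGIN {
    printf "linux-next: %.1f%% pass\n", 100 * 2 / (2 + 9)   # 11 runs
    printf "v6.6:       %.1f%% pass\n", 100 * 5 / (5 + 6)   # 11 runs
}'
# prints:
# linux-next: 18.2% pass
# v6.6:       45.5% pass
```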
On 23/01/2024 20:42, Ritesh Harjani (IBM) wrote:
> Pankaj Raghav <p.raghav@samsung.com> writes:
>> [...]
>
> Ok, so it looks like the testcase xfs/558 is failing on linux-next with
> 64k blocksize but passing with 4k blocksize. I thought it was passing
> on my previous Linux 6.6 release, but I guess those too were just some
> lucky runs. Here is the report -
>
> linux-next: xfs/558 aggregate results across 11 runs: pass=2 (18.2%), fail=9 (81.8%)
> v6.6:       xfs/558 aggregate results across 11 runs: pass=5 (45.5%), fail=6 (54.5%)

Oh, thanks for reporting back!

I can confirm that it happens 100% of the time with my LBS patches
enabled for 64k bs.

Let's see what Zorro reports back on real 64k hardware.

> So I guess I will spend some time analyzing the failure.

Could you try the patch I sent for xfs/558 and see if it works all the
time?

The issue is 'xfs_wb*iomap_invalid' not getting triggered when we have a
larger bs. I basically increased the blksz in the test based on the
underlying bs. Maybe there is a better solution than what I proposed,
but it fixes the test.

> Failure log
> ================
> [...]
On Tue, Jan 23, 2024 at 09:21:50PM +0100, Pankaj Raghav wrote:
> On 23/01/2024 20:42, Ritesh Harjani (IBM) wrote:
>> [...]
>
> Oh, thanks for reporting back!
>
> I can confirm that it happens 100% of time with my LBS patch enabled
> for 64k bs.
>
> Let's see what Zorro reports back on a real 64k hardware.
>
> Could you try the patch I sent for xfs/558 and see if it works all the
> time?
>
> The issue is 'xfs_wb*iomap_invalid' not getting triggered when we have
> larger bs. I basically increased the blksz in the test based on the
> underlying bs. Maybe there is a better solution than what I proposed,
> but it fixes the test.

The only improvement I can think of would be to force-disable large
folios on the file being tested. Large folios mess with testing because
the race depends on write and writeback needing to walk multiple pages.
Right now the pagecache only institutes large folios if the IO patterns
are large IOs, but in theory that could change some day.

I suspect that the iomap tracepoint data and possibly
trace_mm_filemap_add_to_page_cache might help figure out what size
folios are actually in use during the invalidation test.

(Perhaps it's time for me to add a 64k bs VM to the test fleet.)

--D

> [...]
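One way to collect the data suggested here is to turn on those tracepoints around a test run. A sketch, assuming root and the usual tracefs mount point (the tracepoint names are real; the grep at the end is illustrative):

```shell
# Enable folio-insertion and iomap tracepoints, run the reproducer, then
# inspect the recorded events to see what folio sizes the page cache used.
cd /sys/kernel/tracing
echo 1 > events/filemap/mm_filemap_add_to_page_cache/enable
echo 1 > events/iomap/enable   # covers the iomap writeback/invalidate events
echo 1 > tracing_on
# ... run xfs/558 from the fstests checkout here ...
echo 0 > tracing_on
grep mm_filemap_add_to_page_cache trace | head   # look for the order= field
```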
>> The issue is 'xfs_wb*iomap_invalid' not getting triggered when we
>> have larger bs. I basically increased the blksz in the test based on
>> the underlying bs. Maybe there is a better solution than what I
>> proposed, but it fixes the test.
>
> The only improvement I can think of would be to force-disable large
> folios on the file being tested. Large folios mess with testing because
> the race depends on write and writeback needing to walk multiple pages.
> Right now the pagecache only institutes large folios if the IO patterns
> are large IOs, but in theory that could change some day.

Hmm, so we create something like a debug parameter to disable large
folios while the file is being tested? The only issue is that the LBS
work needs large folios to be enabled. So I think the solution is to add
a debug parameter that disables large folios for normal block sizes
(bs <= ps) while running the test, but to disable this test altogether
for LBS (bs > ps)?

> I suspect that the iomap tracepoint data and possibly
> trace_mm_filemap_add_to_page_cache might help figure out what size
> folios are actually in use during the invalidation test.

Cool! I will see if I can do some analysis by adding
trace_mm_filemap_add_to_page_cache while running the test.

> (Perhaps it's time for me to add a 64k bs VM to the test fleet.)

I confirmed with Chandan that Oracle OCI with Ampere supports 64kb page
sizes. We (Luis and I) are also looking into running kdevops on XFS with
a 64kb page size and block size, as it might be useful for the LBS work
to cross-verify the failures.
From: Pankaj Raghav <p.raghav@samsung.com>

Some tests need to be adapted for LBS[1] based on the filesystem
blocksize. These are generic changes that use the filesystem blocksize
instead of assuming a fixed one.

There are some more generic test cases that are failing due to a logdev
size requirement that changes with the filesystem blocksize. I will
address them in a separate series.

[1] https://lore.kernel.org/lkml/20230915183848.1018717-1-kernel@pankajraghav.com/

Pankaj Raghav (2):
  xfs/558: scale blk IO size based on the filesystem blksz
  xfs/161: adapt the test case for LBS filesystem

 tests/xfs/161 | 9 +++++++--
 tests/xfs/558 | 7 ++++++-
 2 files changed, 13 insertions(+), 3 deletions(-)

base-commit: c46ca4d1f6c0c45f9a3ea18bc31ba5ae89e02c70
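As a rough illustration of what the "scale blk IO size" approach in the first patch means (hypothetical helper names; this is not the actual patch — fstests' real helper for this is `_get_block_size`):

```shell
# Sketch: derive the I/O size from the mounted filesystem's block size
# instead of hard-coding it, so a test issues the same number of blocks
# per I/O at 4k and at 64k.
get_fs_block_size() {
    # `stat -f -c %S` reports the fundamental block size of a filesystem
    stat -f -c '%S' "$1"
}

scale_io_size() {
    local mnt=$1 blocks_per_io=$2
    local blksz
    blksz=$(get_fs_block_size "$mnt")
    echo $((blksz * blocks_per_io))
}

# e.g. on a 64k-block filesystem, scale_io_size "$mnt" 4 yields 262144
```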