| Message ID | 20230805055537.147835-1-hch@lst.de (mailing list archive) |
|---|---|
| State | New, archived |
| Series | zram: take device and not only bvec offset into account |
On (23/08/05 07:55), Christoph Hellwig wrote:
> Commit af8b04c63708 ("zram: simplify bvec iteration in
> __zram_make_request") changed the bio iteration in zram to rely on the
> implicit capping to page boundaries in bio_for_each_segment. But it
> failed to care for the fact zram not only care about the page alignment
> of the bio payload, but also the page alignment into the device. For
> buffered I/O and swap those are the same, but for direct I/O or kernel
> internal I/O like XFS log buffer writes they can differ.
>
> Fix this by open coding bio_for_each_segment and limiting the bvec len
> so that it never crosses over a page alignment boundary in the device
> in addition to the payload boundary already taken care of by
> bio_iter_iovec.
>
> Fixes: af8b04c63708 ("zram: simplify bvec iteration in __zram_make_request")
> Reported-by: Dusty Mabe <dusty@dustymabe.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>
On Sat, Aug 05, 2023 at 04:46:45PM +0900, Sergey Senozhatsky wrote:
> > Fixes: af8b04c63708 ("zram: simplify bvec iteration in __zram_make_request")
> > Reported-by: Dusty Mabe <dusty@dustymabe.com>
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>

Btw, are there any interesting test suites you want me to run on
a > 4K page size system now that I do have this setup available?
On 8/5/23 04:13, Christoph Hellwig wrote:
> On Sat, Aug 05, 2023 at 04:46:45PM +0900, Sergey Senozhatsky wrote:
>>> Fixes: af8b04c63708 ("zram: simplify bvec iteration in __zram_make_request")
>>> Reported-by: Dusty Mabe <dusty@dustymabe.com>
>>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>>
>> Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>
>
> Btw, are there any interesting test suites you want me to run on
> a > 4K page size system now that I do have this setup available?

The patch is passing tests for me. I ran the Fedora CoreOS root
reprovision tests (which are the tests that caught this bug to begin
with) and the trivial reproducer:

```
#!/bin/bash
set -eux -o pipefail
modprobe zram num_devices=0
read dev < /sys/class/zram-control/hot_add
echo 10G > /sys/block/zram"${dev}"/disksize
mkfs.xfs /dev/zram"${dev}"
mkdir -p /tmp/foo
mount -t xfs /dev/zram"${dev}" /tmp/foo
```

Dusty
On Sat, 05 Aug 2023 07:55:37 +0200, Christoph Hellwig wrote:
> Commit af8b04c63708 ("zram: simplify bvec iteration in
> __zram_make_request") changed the bio iteration in zram to rely on the
> implicit capping to page boundaries in bio_for_each_segment. But it
> failed to care for the fact zram not only care about the page alignment
> of the bio payload, but also the page alignment into the device. For
> buffered I/O and swap those are the same, but for direct I/O or kernel
> internal I/O like XFS log buffer writes they can differ.
>
> [...]

Applied, thanks!

[1/1] zram: take device and not only bvec offset into account
      commit: 95848dcb9d676738411a8ff70a9704039f1b3982

Best regards,
On (23/08/05 10:13), Christoph Hellwig wrote:
> On Sat, Aug 05, 2023 at 04:46:45PM +0900, Sergey Senozhatsky wrote:
> > > Fixes: af8b04c63708 ("zram: simplify bvec iteration in __zram_make_request")
> > > Reported-by: Dusty Mabe <dusty@dustymabe.com>
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> >
> > Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>
>
> Btw, are there any interesting test suites you want me to run on
> a > 4K page size system now that I do have this setup available?

I don't really have any special tests. I used to run fio, but switched
to a shell script that:

1) configures zram0 and adds zram1 as writeback
2) mkfs.ext4 on zram0, cp linux tar.gz, compile (in parallel)
3) deferred recompress (idle and size based)
4) idle writeback
5) re-reads all written-back pages

I test on a system with 4K pages, though; I probably need to get an
image with a larger PAGE_SIZE.
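For readers who want to reproduce a similar flow, a rough sketch of the steps Sergey describes might look like the following. This is not his actual script: the device sizes, the zstd recompression choice, the tarball path, and the mount point are made up for illustration. The sysfs knobs used (backing_dev, recomp_algorithm, idle, recompress, writeback) are the documented zram ABI and require CONFIG_ZRAM_WRITEBACK and CONFIG_ZRAM_MULTI_COMP.

```
#!/bin/bash
# Sketch only; sizes, paths and algorithm choice are illustrative.
set -eux -o pipefail

modprobe zram num_devices=2

# 1) configure zram0 and add zram1 as its writeback (backing) device;
#    backing_dev must be written before zram0's disksize is set
echo 4G > /sys/block/zram1/disksize
echo /dev/zram1 > /sys/block/zram0/backing_dev
echo "algo=zstd priority=1" > /sys/block/zram0/recomp_algorithm
echo 8G > /sys/block/zram0/disksize

# 2) real workload: ext4 on zram0, unpack a kernel tarball, parallel build
mkfs.ext4 /dev/zram0
mount /dev/zram0 /mnt
tar -C /mnt -xzf linux.tar.gz
cd /mnt/linux-*
make defconfig
make -j"$(nproc)"
cd /

# 3) deferred recompression: idle-based, then size-filtered
echo all > /sys/block/zram0/idle
echo "type=idle" > /sys/block/zram0/recompress
echo "type=idle threshold=3000" > /sys/block/zram0/recompress

# 4) idle writeback to the backing device
echo all > /sys/block/zram0/idle
echo idle > /sys/block/zram0/writeback

# 5) re-read the whole device so written-back pages are pulled back in
dd if=/dev/zram0 of=/dev/null bs=1M
```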
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 5676e6dd5b1672..06673c6ca25555 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1870,15 +1870,16 @@ static void zram_bio_discard(struct zram *zram, struct bio *bio)
 
 static void zram_bio_read(struct zram *zram, struct bio *bio)
 {
-	struct bvec_iter iter;
-	struct bio_vec bv;
-	unsigned long start_time;
+	unsigned long start_time = bio_start_io_acct(bio);
+	struct bvec_iter iter = bio->bi_iter;
 
-	start_time = bio_start_io_acct(bio);
-	bio_for_each_segment(bv, bio, iter) {
+	do {
 		u32 index = iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
 		u32 offset = (iter.bi_sector & (SECTORS_PER_PAGE - 1)) <<
 				SECTOR_SHIFT;
+		struct bio_vec bv = bio_iter_iovec(bio, iter);
+
+		bv.bv_len = min_t(u32, bv.bv_len, PAGE_SIZE - offset);
 
 		if (zram_bvec_read(zram, &bv, index, offset, bio) < 0) {
 			atomic64_inc(&zram->stats.failed_reads);
@@ -1890,22 +1891,26 @@ static void zram_bio_read(struct zram *zram, struct bio *bio)
 		zram_slot_lock(zram, index);
 		zram_accessed(zram, index);
 		zram_slot_unlock(zram, index);
-	}
+
+		bio_advance_iter_single(bio, &iter, bv.bv_len);
+	} while (iter.bi_size);
+
 	bio_end_io_acct(bio, start_time);
 	bio_endio(bio);
 }
 
 static void zram_bio_write(struct zram *zram, struct bio *bio)
 {
-	struct bvec_iter iter;
-	struct bio_vec bv;
-	unsigned long start_time;
+	unsigned long start_time = bio_start_io_acct(bio);
+	struct bvec_iter iter = bio->bi_iter;
 
-	start_time = bio_start_io_acct(bio);
-	bio_for_each_segment(bv, bio, iter) {
+	do {
 		u32 index = iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
 		u32 offset = (iter.bi_sector & (SECTORS_PER_PAGE - 1)) <<
 				SECTOR_SHIFT;
+		struct bio_vec bv = bio_iter_iovec(bio, iter);
+
+		bv.bv_len = min_t(u32, bv.bv_len, PAGE_SIZE - offset);
 
 		if (zram_bvec_write(zram, &bv, index, offset, bio) < 0) {
 			atomic64_inc(&zram->stats.failed_writes);
@@ -1916,7 +1921,10 @@ static void zram_bio_write(struct zram *zram, struct bio *bio)
 		zram_slot_lock(zram, index);
 		zram_accessed(zram, index);
 		zram_slot_unlock(zram, index);
-	}
+
+		bio_advance_iter_single(bio, &iter, bv.bv_len);
+	} while (iter.bi_size);
+
 	bio_end_io_acct(bio, start_time);
 	bio_endio(bio);
 }
Commit af8b04c63708 ("zram: simplify bvec iteration in
__zram_make_request") changed the bio iteration in zram to rely on the
implicit capping to page boundaries in bio_for_each_segment. But it
failed to take into account that zram cares not only about the page
alignment of the bio payload, but also about the page alignment into
the device. For buffered I/O and swap those are the same, but for
direct I/O or kernel-internal I/O like XFS log buffer writes they can
differ.

Fix this by open coding bio_for_each_segment and limiting the bvec len
so that it never crosses over a page alignment boundary in the device,
in addition to the payload boundary already taken care of by
bio_iter_iovec.

Fixes: af8b04c63708 ("zram: simplify bvec iteration in __zram_make_request")
Reported-by: Dusty Mabe <dusty@dustymabe.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/zram/zram_drv.c | 32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)
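To make the new capping concrete, here is a small worked example with made-up numbers for a 4K-page system (SECTOR_SHIFT = 9, so SECTORS_PER_PAGE = 8 and SECTORS_PER_PAGE_SHIFT = 3). A bvec whose payload is a single, page-aligned memory page can still start in the middle of a device page, for example when earlier bvecs in the same bio are not page-sized; the patched loop caps such a bvec at the device page boundary and re-derives index and offset on the next iteration.

```
#!/bin/bash
# Hypothetical bvec: 4096 bytes of payload starting at device sector 5,
# i.e. 2560 bytes into device page 0 (numbers are illustrative only).
sector=5
bv_len=4096

index=$((  sector >> 3 ))          # zram slot (device page)        -> 0
offset=$(( (sector & 7) << 9 ))    # byte offset within that page   -> 2560
room=$((   4096 - offset ))        # space left in the device page  -> 1536
if (( bv_len > room )); then       # bv.bv_len = min(bv_len, PAGE_SIZE - offset)
        bv_len=$room
fi

echo "index=$index offset=$offset capped_len=$bv_len"
# The loop then advances the iterator by 1536 bytes; the remaining 2560
# bytes of payload land at offset 0 of device page 1 on the next pass.
```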