Message ID | 20240206202511.4799-1-21cnbao@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] zram: easy the allocation of zcomp_strm's buffers through vmalloc | expand |
On (24/02/07 09:25), Barry Song wrote: > From: Barry Song <v-songbaohua@oppo.com> > > Firstly, there is no need to keep zcomp_strm's buffers contiguous > physically. > > Secondly, The recent mTHP project has provided the possibility to > swapout and swapin large folios. Compressing/decompressing large > blocks can hugely decrease CPU consumption and improve compression > ratio. This requires us to make zRAM support the compression and > decompression for large objects. > With the support of large objects in zRAM of our out-of-tree code, > we have observed many allocation failures during CPU hotplug as > large objects need larger buffers. So this change is also more > future-proof once we begin to bring up multiple sizes in zRAM. > > Signed-off-by: Barry Song <v-songbaohua@oppo.com> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Note: Taking it in NOT because of the out-of-tree code (we don't really do that), but because this is executed from CPU offline/online paths, which can happen on devices with fragmented memory (a valid concern IMHO). Minchan, if you have any objections, please chime in. > @@ -37,7 +38,7 @@ static void zcomp_strm_free(struct zcomp_strm *zstrm) > { > if (!IS_ERR_OR_NULL(zstrm->tfm)) > crypto_free_comp(zstrm->tfm); > - free_pages((unsigned long)zstrm->buffer, 1); > + vfree(zstrm->buffer); > zstrm->tfm = NULL; > zstrm->buffer = NULL; > } > @@ -53,7 +54,7 @@ static int zcomp_strm_init(struct zcomp_strm *zstrm, struct zcomp *comp) > * allocate 2 pages. 1 for compressed data, plus 1 extra for the > * case when compressed size is larger than the original one > */ > - zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1); > + zstrm->buffer = vzalloc(2 * PAGE_SIZE); > if (IS_ERR_OR_NULL(zstrm->tfm) || !zstrm->buffer) { > zcomp_strm_free(zstrm); > return -ENOMEM; > -- > 2.34.1 >
On 2/6/24 6:44 PM, Sergey Senozhatsky wrote: > On (24/02/07 09:25), Barry Song wrote: >> From: Barry Song <v-songbaohua@oppo.com> >> >> Firstly, there is no need to keep zcomp_strm's buffers contiguous >> physically. >> >> Secondly, The recent mTHP project has provided the possibility to >> swapout and swapin large folios. Compressing/decompressing large >> blocks can hugely decrease CPU consumption and improve compression >> ratio. This requires us to make zRAM support the compression and >> decompression for large objects. >> With the support of large objects in zRAM of our out-of-tree code, >> we have observed many allocation failures during CPU hotplug as >> large objects need larger buffers. So this change is also more >> future-proof once we begin to bring up multiple sizes in zRAM. >> >> Signed-off-by: Barry Song <v-songbaohua@oppo.com> > > Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> > > Note: > Taking it in NOT because of the out-of-tree code (we don't really > do that), but because this is executed from CPU offline/online > paths, which can happen on devices with fragmented memory (a valid > concern IMHO). > > Minchan, if you have any objections, please chime in. Not Minchan, but I do have an issue with the title of the commit, it doesn't make any sense. Can the maintainer please re-write that to be something that is appropriate and actually describes what the patch does?
On (24/02/06 19:40), Jens Axboe wrote: > On 2/6/24 6:44 PM, Sergey Senozhatsky wrote: > > On (24/02/07 09:25), Barry Song wrote: > >> From: Barry Song <v-songbaohua@oppo.com> > >> > >> Firstly, there is no need to keep zcomp_strm's buffers contiguous > >> physically. > >> > >> Secondly, The recent mTHP project has provided the possibility to > >> swapout and swapin large folios. Compressing/decompressing large > >> blocks can hugely decrease CPU consumption and improve compression > >> ratio. This requires us to make zRAM support the compression and > >> decompression for large objects. > >> With the support of large objects in zRAM of our out-of-tree code, > >> we have observed many allocation failures during CPU hotplug as > >> large objects need larger buffers. So this change is also more > >> future-proof once we begin to bring up multiple sizes in zRAM. > >> > >> Signed-off-by: Barry Song <v-songbaohua@oppo.com> > > > > Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> > > > > Note: > > Taking it in NOT because of the out-of-tree code (we don't really > > do that), but because this is executed from CPU offline/online > > paths, which can happen on devices with fragmented memory (a valid > > concern IMHO). > > > > Minchan, if you have any objections, please chime in. > > Not Minchan, but I do have an issue with the title of the commit, it > doesn't make any sense. Can the maintainer please re-write that to be > something that is appropriate and actually describes what the patch > does? Thanks Jens. I fully agree, I requested a better commit message in v1 feedback, we probably still can improve on this. Something like this? --- zram: do not allocate physically contiguous strm buffers Currently zram allocates 2 physically contigous pages per-CPU's compression stream (we may have up to 3 streams per-CPU). Since those buffers are per-CPU we allocate them from CPU hotplug path, which may have higher risks of failed allocations on devices with fragmented memory. Switch to virtually contiguos allocations - crypto comp does not seem impose requirements on compression working buffers to be physically contiguous.
On 2/6/24 8:14 PM, Sergey Senozhatsky wrote: > On (24/02/06 19:40), Jens Axboe wrote: >> On 2/6/24 6:44 PM, Sergey Senozhatsky wrote: >>> On (24/02/07 09:25), Barry Song wrote: >>>> From: Barry Song <v-songbaohua@oppo.com> >>>> >>>> Firstly, there is no need to keep zcomp_strm's buffers contiguous >>>> physically. >>>> >>>> Secondly, The recent mTHP project has provided the possibility to >>>> swapout and swapin large folios. Compressing/decompressing large >>>> blocks can hugely decrease CPU consumption and improve compression >>>> ratio. This requires us to make zRAM support the compression and >>>> decompression for large objects. >>>> With the support of large objects in zRAM of our out-of-tree code, >>>> we have observed many allocation failures during CPU hotplug as >>>> large objects need larger buffers. So this change is also more >>>> future-proof once we begin to bring up multiple sizes in zRAM. >>>> >>>> Signed-off-by: Barry Song <v-songbaohua@oppo.com> >>> >>> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> >>> >>> Note: >>> Taking it in NOT because of the out-of-tree code (we don't really >>> do that), but because this is executed from CPU offline/online >>> paths, which can happen on devices with fragmented memory (a valid >>> concern IMHO). >>> >>> Minchan, if you have any objections, please chime in. >> >> Not Minchan, but I do have an issue with the title of the commit, it >> doesn't make any sense. Can the maintainer please re-write that to be >> something that is appropriate and actually describes what the patch >> does? > > Thanks Jens. I fully agree, I requested a better commit message in > v1 feedback, we probably still can improve on this. > > > Something like this? > > --- > > zram: do not allocate physically contiguous strm buffers > > Currently zram allocates 2 physically contigous pages per-CPU's > compression stream (we may have up to 3 streams per-CPU). Since > those buffers are per-CPU we allocate them from CPU hotplug path, > which may have higher risks of failed allocations on devices with > fragmented memory. > > Switch to virtually contiguos allocations - crypto comp does not > seem impose requirements on compression working buffers to be > physically contiguous. Yep, this is much better! Thanks.
On (24/02/06 20:17), Jens Axboe wrote: [..] > >>> Minchan, if you have any objections, please chime in. > >> > >> Not Minchan, but I do have an issue with the title of the commit, it > >> doesn't make any sense. Can the maintainer please re-write that to be > >> something that is appropriate and actually describes what the patch > >> does? > > > > Thanks Jens. I fully agree, I requested a better commit message in > > v1 feedback, we probably still can improve on this. > > > > > > Something like this? > > > > --- > > > > zram: do not allocate physically contiguous strm buffers > > > > Currently zram allocates 2 physically contigous pages per-CPU's > > compression stream (we may have up to 3 streams per-CPU). Since > > those buffers are per-CPU we allocate them from CPU hotplug path, > > which may have higher risks of failed allocations on devices with > > fragmented memory. > > > > Switch to virtually contiguos allocations - crypto comp does not > > seem impose requirements on compression working buffers to be > > physically contiguous. > > Yep, this is much better! Thanks. Thanks. Barry, can you please send v3 with the suggested subject and commit message?
On (24/02/07 12:14), Sergey Senozhatsky wrote: > --- > > zram: do not allocate physically contiguous strm buffers > > Currently zram allocates 2 physically contigous pages per-CPU's > compression stream (we may have up to 3 streams per-CPU). Since Correction: ^ up to 4 > those buffers are per-CPU we allocate them from CPU hotplug path, > which may have higher risks of failed allocations on devices with > fragmented memory. > > Switch to virtually contiguos allocations - crypto comp does not > seem impose requirements on compression working buffers to be > physically contiguous.
On Wed, Feb 7, 2024 at 4:21 PM Sergey Senozhatsky <senozhatsky@chromium.org> wrote: > > On (24/02/06 20:17), Jens Axboe wrote: > [..] > > >>> Minchan, if you have any objections, please chime in. > > >> > > >> Not Minchan, but I do have an issue with the title of the commit, it > > >> doesn't make any sense. Can the maintainer please re-write that to be > > >> something that is appropriate and actually describes what the patch > > >> does? > > > > > > Thanks Jens. I fully agree, I requested a better commit message in > > > v1 feedback, we probably still can improve on this. > > > > > > > > > Something like this? > > > > > > --- > > > > > > zram: do not allocate physically contiguous strm buffers > > > > > > Currently zram allocates 2 physically contigous pages per-CPU's > > > compression stream (we may have up to 3 streams per-CPU). Since > > > those buffers are per-CPU we allocate them from CPU hotplug path, > > > which may have higher risks of failed allocations on devices with > > > fragmented memory. > > > > > > Switch to virtually contiguos allocations - crypto comp does not > > > seem impose requirements on compression working buffers to be > > > physically contiguous. > > > > Yep, this is much better! Thanks. > > Thanks. > Hi Sergey, Jens, > Barry, can you please send v3 with the suggested subject and commit > message? Thanks for your comments and improvements. will send v3 accordingly. Best regard Barry
diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c index 55af4efd7983..8237b08c49d8 100644 --- a/drivers/block/zram/zcomp.c +++ b/drivers/block/zram/zcomp.c @@ -11,6 +11,7 @@ #include <linux/sched.h> #include <linux/cpu.h> #include <linux/crypto.h> +#include <linux/vmalloc.h> #include "zcomp.h" @@ -37,7 +38,7 @@ static void zcomp_strm_free(struct zcomp_strm *zstrm) { if (!IS_ERR_OR_NULL(zstrm->tfm)) crypto_free_comp(zstrm->tfm); - free_pages((unsigned long)zstrm->buffer, 1); + vfree(zstrm->buffer); zstrm->tfm = NULL; zstrm->buffer = NULL; } @@ -53,7 +54,7 @@ static int zcomp_strm_init(struct zcomp_strm *zstrm, struct zcomp *comp) * allocate 2 pages. 1 for compressed data, plus 1 extra for the * case when compressed size is larger than the original one */ - zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1); + zstrm->buffer = vzalloc(2 * PAGE_SIZE); if (IS_ERR_OR_NULL(zstrm->tfm) || !zstrm->buffer) { zcomp_strm_free(zstrm); return -ENOMEM;