
zram: use copy_page for full page copy

Message ID 20240613000422.1918-1-jszhang@kernel.org (mailing list archive)
State New

Commit Message

Jisheng Zhang June 13, 2024, 12:04 a.m. UTC
Commit 42e99bd975fd ("zram: optimize memory operations with
clear_page()/copy_page()") optimized the page copy/clear operations, but
commit d72e9a7a93e4 ("zram: do not use copy_page with non-page
aligned address") later removed the optimization because it caused
memory corruption; that commit explains the reason well. Since
commit 1f7319c74275 ("zram: partial IO refactoring"), partial IO uses
alloc_page() instead of kmalloc() to allocate a page, so we can bring
back the optimization.

Commit 80ba4caf8ba9 ("zram: use copy_page for full page copy") brought
back part of the optimization but missed one spot in zram_write_page().
Optimize the full page copy in zram_write_page() with copy_page().

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
 drivers/block/zram/zram_drv.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Sergey Senozhatsky June 13, 2024, 3:17 a.m. UTC | #1
On (24/06/13 08:04), Jisheng Zhang wrote:
> commit 42e99bd975fd ("zram: optimize memory operations with
> clear_page()/copy_page()") optimize page copy/clean operations, but
> then commit d72e9a7a93e4 ("zram: do not use copy_page with non-page
> aligned address") removes the optimization because there's memory
> corruption at that time, the reason was well explained. But after
> commit 1f7319c74275 ("zram: partial IO refactoring"), partial IO uses
> alloc_page() instead of kmalloc to allocate a page, so we can bring
> back the optimization.
> 
> commit 80ba4caf8ba9 ("zram: use copy_page for full page copy") brings
> back partial optimization, missed one point in zram_write_page().
> optimize the full page copying in zram_write_page() with copy_page()

Is copy_page() really more optimal than memcpy(PAGE_SIZE)?

Jisheng Zhang June 13, 2024, 12:58 p.m. UTC | #2
On Thu, Jun 13, 2024 at 12:17:31PM +0900, Sergey Senozhatsky wrote:
> Is copy_page() really more optimal than memcpy(PAGE_SIZE)?

I think so: copy_page() performs better than memcpy(PAGE_SIZE).
Commit afb2d666d025 ("zsmalloc: use copy_page for full page copy")
also shows the result.

Christoph Hellwig June 14, 2024, 5:25 a.m. UTC | #3
On Thu, Jun 13, 2024 at 08:04:22AM +0800, Jisheng Zhang wrote:
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 3acd7006ad2c..4b2b5098062f 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -1478,11 +1478,13 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
>  	dst = zs_map_object(zram->mem_pool, handle, ZS_MM_WO);
>  
>  	src = zstrm->buffer;
> -	if (comp_len == PAGE_SIZE)
> +	if (comp_len == PAGE_SIZE) {
>  		src = kmap_local_page(page);
> -	memcpy(dst, src, comp_len);
> -	if (comp_len == PAGE_SIZE)
> +		copy_page(dst, src);
>  		kunmap_local(src);
> +	} else {
> +		memcpy(dst, src, comp_len);
> +	}

I know this is pre-existing code, but why do we need to kmap
for comp_len == PAGE_SIZE and not for the other cases?  Something
feels really obfuscated here.

Sergey Senozhatsky June 14, 2024, 5:31 a.m. UTC | #4
On (24/06/13 22:25), Christoph Hellwig wrote:
> I know this is pre-existing code, but why do we need to kmap
> for comp_len == PAGE_SIZE and not for the other cases?  Something
> feels really obfuscated here.

It is a little tricky.

If we managed to compress the page (size < zsmalloc incompressible
watermark), then src is the per-CPU buffer with the compressed data.
Otherwise src is the original page (with uncompressed data).
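
The source selection described above can be sketched as a small userspace
model. This is not the kernel code: store_object(), MODEL_PAGE_SIZE and the
watermark parameter are hypothetical names introduced for illustration, and
the model assumes that a compressed length at or above the watermark means
the page is stored uncompressed at full page size. In the driver only the
original page needs kmap_local_page(), because it is referenced through a
struct page, while the per-CPU stream buffer is already addressable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/*
 * Hypothetical userspace model, not the kernel code.
 * stream_buf stands in for zstrm->buffer (compressed output),
 * page_data for the address returned by kmap_local_page(page),
 * and watermark for the zsmalloc size limit mentioned above.
 * Returns the number of bytes stored in dst.
 */
static size_t store_object(void *dst, const void *stream_buf,
                           const void *page_data, size_t comp_len,
                           size_t watermark)
{
    const void *src = stream_buf;

    /* Model assumption: compressed output never exceeds one page. */
    assert(comp_len <= MODEL_PAGE_SIZE);

    /* Compression did not help enough: store the page uncompressed. */
    if (comp_len >= watermark) {
        comp_len = MODEL_PAGE_SIZE;
        src = page_data;
    }

    memcpy(dst, src, comp_len);
    return comp_len;
}
```

With a watermark of 3000 bytes, a 100-byte compressed result is copied from
the stream buffer, while a 3500-byte result causes the whole original page to
be copied instead.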

Patch

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 3acd7006ad2c..4b2b5098062f 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1478,11 +1478,13 @@  static int zram_write_page(struct zram *zram, struct page *page, u32 index)
 	dst = zs_map_object(zram->mem_pool, handle, ZS_MM_WO);
 
 	src = zstrm->buffer;
-	if (comp_len == PAGE_SIZE)
+	if (comp_len == PAGE_SIZE) {
 		src = kmap_local_page(page);
-	memcpy(dst, src, comp_len);
-	if (comp_len == PAGE_SIZE)
+		copy_page(dst, src);
 		kunmap_local(src);
+	} else {
+		memcpy(dst, src, comp_len);
+	}
 
 	zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
 	zs_unmap_object(zram->mem_pool, handle);
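
The commit message ties the earlier revert to copy_page() being used with
addresses that were not page aligned. A minimal userspace sketch of that
precondition follows. copy_page_model() and MODEL_PAGE_SIZE are hypothetical
stand-ins, not kernel APIs: the real copy_page() is architecture specific and
does not return an error, so the explicit check here only illustrates why
both pointers must start on a page boundary before a whole-page copy is safe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Returns nonzero when p sits on a page boundary. */
static int is_page_aligned(const void *p)
{
    return ((uintptr_t)p & (MODEL_PAGE_SIZE - 1)) == 0;
}

/*
 * Hypothetical stand-in for copy_page(): copies exactly one page,
 * but refuses to run when either address is not page aligned.
 * Returns 0 on success and -1 on a misaligned argument.
 */
static int copy_page_model(void *dst, const void *src)
{
    if (!is_page_aligned(dst) || !is_page_aligned(src))
        return -1;

    memcpy(dst, src, MODEL_PAGE_SIZE);
    return 0;
}
```

In this model a buffer that starts in the middle of a page is rejected, which
corresponds to the case where a plain memcpy() of the actual length is the
safe choice.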