diff mbox series

[v5,3/3] crypto: scompress: remove memcpy if sg_nents is 1

Message ID 20240220064414.262582-4-21cnbao@gmail.com (mailing list archive)
State Superseded
Delegated to: Herbert Xu
Headers show
Series mm/zswap & crypto/compress: remove a couple of memcpy | expand

Commit Message

Barry Song Feb. 20, 2024, 6:44 a.m. UTC
From: Barry Song <v-songbaohua@oppo.com>

while sg_nents is 1 which is always true for the current kernel
as the only user - zswap is the case, we should remove two big
memcpy.

Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 crypto/scompress.c | 36 +++++++++++++++++++++++++++++-------
 1 file changed, 29 insertions(+), 7 deletions(-)

Comments

Herbert Xu Feb. 21, 2024, 5:35 a.m. UTC | #1
On Tue, Feb 20, 2024 at 07:44:14PM +1300, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> while sg_nents is 1 which is always true for the current kernel
> as the only user - zswap is the case, we should remove two big
> memcpy.
> 
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
> ---
>  crypto/scompress.c | 36 +++++++++++++++++++++++++++++-------
>  1 file changed, 29 insertions(+), 7 deletions(-)

This patch is independent of the other two.  Please split it
out so I can apply it directly.

> @@ -134,13 +135,25 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
>  	scratch = raw_cpu_ptr(&scomp_scratch);
>  	spin_lock(&scratch->lock);
>  
> -	scatterwalk_map_and_copy(scratch->src, req->src, 0, req->slen, 0);
> +	if (sg_nents(req->src) == 1) {
> +		src = kmap_local_page(sg_page(req->src)) + req->src->offset;

What if the SG entry is longer than PAGE_SIZE (or indeed crosses a
page boundary)? I think the test needs to be strengthened.

Thanks,
Barry Song Feb. 21, 2024, 5:55 a.m. UTC | #2
On Wed, Feb 21, 2024 at 6:35 PM Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Tue, Feb 20, 2024 at 07:44:14PM +1300, Barry Song wrote:
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > while sg_nents is 1 which is always true for the current kernel
> > as the only user - zswap is the case, we should remove two big
> > memcpy.
> >
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
> > ---
> >  crypto/scompress.c | 36 +++++++++++++++++++++++++++++-------
> >  1 file changed, 29 insertions(+), 7 deletions(-)
>
> This patch is independent of the other two.  Please split it
> out so I can apply it directly.

Ok. OTOH, patch 3/3 has no dependency with other patches. so patch
3/3 should be perfectly applicable to crypto :-)

Hi Andrew,
Would you please handle patch 1/3 and 2/3 in mm-tree given Herbert's ack on
1/3?

>
> > @@ -134,13 +135,25 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
> >       scratch = raw_cpu_ptr(&scomp_scratch);
> >       spin_lock(&scratch->lock);
> >
> > -     scatterwalk_map_and_copy(scratch->src, req->src, 0, req->slen, 0);
> > +     if (sg_nents(req->src) == 1) {
> > +             src = kmap_local_page(sg_page(req->src)) + req->src->offset;
>
> What if the SG entry is longer than PAGE_SIZE (or indeed crosses a
> page boundary)? I think the test needs to be strengthened.

I don't understand what is the problem for a nents to cross two pages
as anyway they are contiguous in both physical and virtual addresses.
if they are not contiguous, they will be two nents.

>
> Thanks,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Thanks
Barry
Barry Song Feb. 21, 2024, 6:10 a.m. UTC | #3
On Wed, Feb 21, 2024 at 6:55 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Wed, Feb 21, 2024 at 6:35 PM Herbert Xu <herbert@gondor.apana.org.au> wrote:
> >
> > On Tue, Feb 20, 2024 at 07:44:14PM +1300, Barry Song wrote:
> > > From: Barry Song <v-songbaohua@oppo.com>
> > >
> > > while sg_nents is 1 which is always true for the current kernel
> > > as the only user - zswap is the case, we should remove two big
> > > memcpy.
> > >
> > > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > > Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
> > > ---
> > >  crypto/scompress.c | 36 +++++++++++++++++++++++++++++-------
> > >  1 file changed, 29 insertions(+), 7 deletions(-)
> >
> > This patch is independent of the other two.  Please split it
> > out so I can apply it directly.
>
> Ok. OTOH, patch 3/3 has no dependency with other patches. so patch
> 3/3 should be perfectly applicable to crypto :-)
>
> Hi Andrew,
> Would you please handle patch 1/3 and 2/3 in mm-tree given Herbert's ack on
> 1/3?
>
> >
> > > @@ -134,13 +135,25 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
> > >       scratch = raw_cpu_ptr(&scomp_scratch);
> > >       spin_lock(&scratch->lock);
> > >
> > > -     scatterwalk_map_and_copy(scratch->src, req->src, 0, req->slen, 0);
> > > +     if (sg_nents(req->src) == 1) {
> > > +             src = kmap_local_page(sg_page(req->src)) + req->src->offset;
> >
> > What if the SG entry is longer than PAGE_SIZE (or indeed crosses a
> > page boundary)? I think the test needs to be strengthened.
>
> I don't understand what is the problem for a nents to cross two pages
> as anyway they are contiguous in both physical and virtual addresses.
> if they are not contiguous, they will be two nents.

second thought,  you are right.  sorry for my noise.
The test was running on a platform like arm64 without HIGHMEM. thus,
kmap_local_page always returns mapped page_address of normal
zone.

but for platforms with HIGHMEM for example arm32, x86_32 , we can't use
the virtual address of the first page as the start address of two pages though
they are physically contiguous.

I will rework on this.  ideally, we should still avoid the memcpy though
two pages are within one nents :-) we are really this case for zswap
as the dst is always two pages in case the compressed data is
longer than the original data.

>
> >
> > Thanks,
> > --
> > Email: Herbert Xu <herbert@gondor.apana.org.au>
> > Home Page: http://gondor.apana.org.au/~herbert/
> > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
>

Thanks
Barry
diff mbox series

Patch

diff --git a/crypto/scompress.c b/crypto/scompress.c
index b108a30a7600..50a487eac792 100644
--- a/crypto/scompress.c
+++ b/crypto/scompress.c
@@ -117,6 +117,7 @@  static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 	struct crypto_scomp *scomp = *tfm_ctx;
 	void **ctx = acomp_request_ctx(req);
 	struct scomp_scratch *scratch;
+	void *src, *dst;
 	unsigned int dlen;
 	int ret;
 
@@ -134,13 +135,25 @@  static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 	scratch = raw_cpu_ptr(&scomp_scratch);
 	spin_lock(&scratch->lock);
 
-	scatterwalk_map_and_copy(scratch->src, req->src, 0, req->slen, 0);
+	if (sg_nents(req->src) == 1) {
+		src = kmap_local_page(sg_page(req->src)) + req->src->offset;
+	} else {
+		scatterwalk_map_and_copy(scratch->src, req->src, 0,
+					 req->slen, 0);
+		src = scratch->src;
+	}
+
+	if (req->dst && sg_nents(req->dst) == 1)
+		dst = kmap_local_page(sg_page(req->dst)) + req->dst->offset;
+	else
+		dst = scratch->dst;
+
 	if (dir)
-		ret = crypto_scomp_compress(scomp, scratch->src, req->slen,
-					    scratch->dst, &req->dlen, *ctx);
+		ret = crypto_scomp_compress(scomp, src, req->slen,
+					    dst, &req->dlen, *ctx);
 	else
-		ret = crypto_scomp_decompress(scomp, scratch->src, req->slen,
-					      scratch->dst, &req->dlen, *ctx);
+		ret = crypto_scomp_decompress(scomp, src, req->slen,
+					      dst, &req->dlen, *ctx);
 	if (!ret) {
 		if (!req->dst) {
 			req->dst = sgl_alloc(req->dlen, GFP_ATOMIC, NULL);
@@ -152,10 +165,19 @@  static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 			ret = -ENOSPC;
 			goto out;
 		}
-		scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen,
-					 1);
+		if (dst == scratch->dst) {
+			scatterwalk_map_and_copy(scratch->dst, req->dst, 0,
+						 req->dlen, 1);
+		} else {
+			flush_dcache_page(sg_page(req->dst));
+		}
 	}
 out:
+	if (src != scratch->src)
+		kunmap_local(src);
+	if (dst != scratch->dst)
+		kunmap_local(dst);
+
 	spin_unlock(&scratch->lock);
 	return ret;
 }