Message ID | 20240220064414.262582-4-21cnbao@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Herbert Xu |
Headers | show |
Series | mm/zswap & crypto/compress: remove a couple of memcpy | expand |
On Tue, Feb 20, 2024 at 07:44:14PM +1300, Barry Song wrote: > From: Barry Song <v-songbaohua@oppo.com> > > while sg_nents is 1 which is always true for the current kernel > as the only user - zswap is the case, we should remove two big > memcpy. > > Signed-off-by: Barry Song <v-songbaohua@oppo.com> > Tested-by: Chengming Zhou <zhouchengming@bytedance.com> > --- > crypto/scompress.c | 36 +++++++++++++++++++++++++++++------- > 1 file changed, 29 insertions(+), 7 deletions(-) This patch is independent of the other two. Please split it out so I can apply it directly. > @@ -134,13 +135,25 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir) > scratch = raw_cpu_ptr(&scomp_scratch); > spin_lock(&scratch->lock); > > - scatterwalk_map_and_copy(scratch->src, req->src, 0, req->slen, 0); > + if (sg_nents(req->src) == 1) { > + src = kmap_local_page(sg_page(req->src)) + req->src->offset; What if the SG entry is longer than PAGE_SIZE (or indeed crosses a page boundary)? I think the test needs to be strengthened. Thanks,
On Wed, Feb 21, 2024 at 6:35 PM Herbert Xu <herbert@gondor.apana.org.au> wrote: > > On Tue, Feb 20, 2024 at 07:44:14PM +1300, Barry Song wrote: > > From: Barry Song <v-songbaohua@oppo.com> > > > > while sg_nents is 1 which is always true for the current kernel > > as the only user - zswap is the case, we should remove two big > > memcpy. > > > > Signed-off-by: Barry Song <v-songbaohua@oppo.com> > > Tested-by: Chengming Zhou <zhouchengming@bytedance.com> > > --- > > crypto/scompress.c | 36 +++++++++++++++++++++++++++++------- > > 1 file changed, 29 insertions(+), 7 deletions(-) > > This patch is independent of the other two. Please split it > out so I can apply it directly. Ok. OTOH, patch 3/3 has no dependency with other patches. so patch 3/3 should be perfectly applicable to crypto :-) Hi Andrew, Would you please handle patch 1/3 and 2/3 in mm-tree given Herbert's ack on 1/3? > > > @@ -134,13 +135,25 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir) > > scratch = raw_cpu_ptr(&scomp_scratch); > > spin_lock(&scratch->lock); > > > > - scatterwalk_map_and_copy(scratch->src, req->src, 0, req->slen, 0); > > + if (sg_nents(req->src) == 1) { > > + src = kmap_local_page(sg_page(req->src)) + req->src->offset; > > What if the SG entry is longer than PAGE_SIZE (or indeed crosses a > page boundary)? I think the test needs to be strengthened. I don't understand what is the problem for a nents to cross two pages as anyway they are contiguous in both physical and virtual addresses. if they are not contiguous, they will be two nents. > > Thanks, > -- > Email: Herbert Xu <herbert@gondor.apana.org.au> > Home Page: http://gondor.apana.org.au/~herbert/ > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt Thanks Barry
On Wed, Feb 21, 2024 at 6:55 PM Barry Song <21cnbao@gmail.com> wrote: > > On Wed, Feb 21, 2024 at 6:35 PM Herbert Xu <herbert@gondor.apana.org.au> wrote: > > > > On Tue, Feb 20, 2024 at 07:44:14PM +1300, Barry Song wrote: > > > From: Barry Song <v-songbaohua@oppo.com> > > > > > > while sg_nents is 1 which is always true for the current kernel > > > as the only user - zswap is the case, we should remove two big > > > memcpy. > > > > > > Signed-off-by: Barry Song <v-songbaohua@oppo.com> > > > Tested-by: Chengming Zhou <zhouchengming@bytedance.com> > > > --- > > > crypto/scompress.c | 36 +++++++++++++++++++++++++++++------- > > > 1 file changed, 29 insertions(+), 7 deletions(-) > > > > This patch is independent of the other two. Please split it > > out so I can apply it directly. > > Ok. OTOH, patch 3/3 has no dependency with other patches. so patch > 3/3 should be perfectly applicable to crypto :-) > > Hi Andrew, > Would you please handle patch 1/3 and 2/3 in mm-tree given Herbert's ack on > 1/3? > > > > > > @@ -134,13 +135,25 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir) > > > scratch = raw_cpu_ptr(&scomp_scratch); > > > spin_lock(&scratch->lock); > > > > > > - scatterwalk_map_and_copy(scratch->src, req->src, 0, req->slen, 0); > > > + if (sg_nents(req->src) == 1) { > > > + src = kmap_local_page(sg_page(req->src)) + req->src->offset; > > > > What if the SG entry is longer than PAGE_SIZE (or indeed crosses a > > page boundary)? I think the test needs to be strengthened. > > I don't understand what is the problem for a nents to cross two pages > as anyway they are contiguous in both physical and virtual addresses. > if they are not contiguous, they will be two nents. second thought, you are right. sorry for my noise. The test was running on a platform like arm64 without HIGHMEM. thus, kmap_local_page always returns mapped page_address of normal zone. but for platforms with HIGHMEM for example arm32, x86_32 , we can't use the virtual address of the first page as the start address of two pages though they are physically contiguous. I will rework on this. ideally, we should still avoid the memcpy though two pages are within one nents :-) we are really this case for zswap as the dst is always two pages in case the compressed data is longer than the original data. > > > > > Thanks, > > -- > > Email: Herbert Xu <herbert@gondor.apana.org.au> > > Home Page: http://gondor.apana.org.au/~herbert/ > > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt > Thanks Barry
diff --git a/crypto/scompress.c b/crypto/scompress.c index b108a30a7600..50a487eac792 100644 --- a/crypto/scompress.c +++ b/crypto/scompress.c @@ -117,6 +117,7 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir) struct crypto_scomp *scomp = *tfm_ctx; void **ctx = acomp_request_ctx(req); struct scomp_scratch *scratch; + void *src, *dst; unsigned int dlen; int ret; @@ -134,13 +135,25 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir) scratch = raw_cpu_ptr(&scomp_scratch); spin_lock(&scratch->lock); - scatterwalk_map_and_copy(scratch->src, req->src, 0, req->slen, 0); + if (sg_nents(req->src) == 1) { + src = kmap_local_page(sg_page(req->src)) + req->src->offset; + } else { + scatterwalk_map_and_copy(scratch->src, req->src, 0, + req->slen, 0); + src = scratch->src; + } + + if (req->dst && sg_nents(req->dst) == 1) + dst = kmap_local_page(sg_page(req->dst)) + req->dst->offset; + else + dst = scratch->dst; + if (dir) - ret = crypto_scomp_compress(scomp, scratch->src, req->slen, - scratch->dst, &req->dlen, *ctx); + ret = crypto_scomp_compress(scomp, src, req->slen, + dst, &req->dlen, *ctx); else - ret = crypto_scomp_decompress(scomp, scratch->src, req->slen, - scratch->dst, &req->dlen, *ctx); + ret = crypto_scomp_decompress(scomp, src, req->slen, + dst, &req->dlen, *ctx); if (!ret) { if (!req->dst) { req->dst = sgl_alloc(req->dlen, GFP_ATOMIC, NULL); @@ -152,10 +165,19 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir) ret = -ENOSPC; goto out; } - scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen, - 1); + if (dst == scratch->dst) { + scatterwalk_map_and_copy(scratch->dst, req->dst, 0, + req->dlen, 1); + } else { + flush_dcache_page(sg_page(req->dst)); + } } out: + if (src != scratch->src) + kunmap_local(src); + if (dst != scratch->dst) + kunmap_local(dst); + spin_unlock(&scratch->lock); return ret; }