diff mbox series

[v3,net-next,01/21] iov_iter: Introduce new procedures for copy to iter/pages

Message ID 20210201100509.27351-2-borisp@mellanox.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series nvme-tcp receive offloads | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count fail Series longer than 15 patches
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success Link
netdev/cc_maintainers success CCed 1 of 1 maintainers
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 10846 this patch: 10846
netdev/kdoc success Errors and warnings before: 4 this patch: 4
netdev/verify_fixes success Link
netdev/checkpatch warning CHECK: Alignment should match open parenthesis WARNING: Missing a blank line after declarations WARNING: line length of 94 exceeds 80 columns
netdev/build_allmodconfig_warn success Errors and warnings before: 16986 this patch: 16986
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Boris Pismenny Feb. 1, 2021, 10:04 a.m. UTC
When using direct data placement the NIC writes some of the payload
directly to the destination buffer, and constructs the SKB such that it
points to this data. As a result, the skb_copy datagram_iter call will
attempt to copy data when it is not necessary.

Introduce new procedures for copy to iter/pages in case that the
source of the copy operation might be identical to the destination,
in such cases the copy is skipped only for bio_vec, later commits
uses those functions to introduce new skb copy(+hash) functions.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ben Ben-Ishay <benishay@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Yoray Zack <yorayz@mellanox.com>
---
 include/linux/uio.h | 12 ++++++++++++
 lib/iov_iter.c      | 47 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+)

Comments

Christoph Hellwig Feb. 1, 2021, 5:35 p.m. UTC | #1
On Mon, Feb 01, 2021 at 12:04:49PM +0200, Boris Pismenny wrote:
> +static __always_inline __must_check
> +size_t ddp_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
> +{
> +	if (unlikely(!check_copy_size(addr, bytes, true)))
> +		return 0;
> +	else
> +		return _ddp_copy_to_iter(addr, bytes, i);
> +}

No need for the else after a return, and the normal kernel convention
double underscores for magic internal functions.

But more importantly: does this belong into the generic header without
and comments what the ddp means and when it should be used?

> +static void ddp_memcpy_to_page(struct page *page, size_t offset, const char *from, size_t len)

Overly long line.  But we're also looking into generic helpers for
this kind of things, not sure if they made it to linux-next in the
meantime, but please check.

> +size_t _ddp_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
> +{
> +	const char *from = addr;
> +	if (unlikely(iov_iter_is_pipe(i)))
> +		return copy_pipe_to_iter(addr, bytes, i);
> +	if (iter_is_iovec(i))
> +		might_fault();
> +	iterate_and_advance(i, bytes, v,
> +		copyout(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len),
> +		ddp_memcpy_to_page(v.bv_page, v.bv_offset,
> +				   (from += v.bv_len) - v.bv_len, v.bv_len),
> +		memcpy(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len)
> +		)
> +
> +	return bytes;
> +}

This bloats every kernel build, so please move it into a conditionally
built file.  And please document the whole thing.
Or Gerlitz Feb. 2, 2021, 6 p.m. UTC | #2
On Mon, Feb 1, 2021 at 7:38 PM Christoph Hellwig <hch@lst.de> wrote:
> On Mon, Feb 01, 2021 at 12:04:49PM +0200, Boris Pismenny wrote:
> > +static __always_inline __must_check
> > +size_t ddp_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
> > +{
> > +     if (unlikely(!check_copy_size(addr, bytes, true)))
> > +             return 0;
> > +     else
> > +             return _ddp_copy_to_iter(addr, bytes, i);
> > +}
>
> No need for the else after a return, and the normal kernel convention
> double underscores for magic internal functions.

ack for the no-else-after-a-return

Re the double underscoring, I was not sure, e.g the non-ddp counterpart
(_copy_to_iter) is single underscored

> But more importantly: does this belong into the generic header without
> and comments what the ddp means and when it should be used?

will look into this, any idea for a more suitable location?

> > +static void ddp_memcpy_to_page(struct page *page, size_t offset, const char *from, size_t len)
>
> Overly long line.  But we're also looking into generic helpers for
> this kind of things, not sure if they made it to linux-next in the
> meantime, but please check.

This is what I found in linux-next - note sure if you were referring to it

commit 11432a3cc061c39475295be533c3674c4f8a6d0b
Author: David Howells <dhowells@redhat.com>

    iov_iter: Add ITER_XARRAY

> > +size_t _ddp_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
> > +{
> > +     const char *from = addr;
> > +     if (unlikely(iov_iter_is_pipe(i)))
> > +             return copy_pipe_to_iter(addr, bytes, i);
> > +     if (iter_is_iovec(i))
> > +             might_fault();
> > +     iterate_and_advance(i, bytes, v,
> > +             copyout(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len),
> > +             ddp_memcpy_to_page(v.bv_page, v.bv_offset,
> > +                                (from += v.bv_len) - v.bv_len, v.bv_len),
> > +             memcpy(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len)
> > +             )
> > +
> > +     return bytes;
> > +}
>
> This bloats every kernel build, so please move it into a conditionally built file.

ack

>  And please document the whole thing.

ok
Christoph Hellwig Feb. 3, 2021, 4:56 p.m. UTC | #3
On Tue, Feb 02, 2021 at 08:00:51PM +0200, Or Gerlitz wrote:
> will look into this, any idea for a more suitable location?

Maybe just a new file under lib/ for now?

> > Overly long line.  But we're also looking into generic helpers for
> > this kind of things, not sure if they made it to linux-next in the
> > meantime, but please check.
> 
> This is what I found in linux-next - note sure if you were referring to it
> 
> commit 11432a3cc061c39475295be533c3674c4f8a6d0b
> Author: David Howells <dhowells@redhat.com>
> 
>     iov_iter: Add ITER_XARRAY

No, that's not it.  Ira, what is the status of the memcpy_{to,from}_page
helpers?
Ira Weiny Feb. 3, 2021, 7:34 p.m. UTC | #4
On Wed, Feb 03, 2021 at 05:56:21PM +0100, Christoph Hellwig wrote:
> On Tue, Feb 02, 2021 at 08:00:51PM +0200, Or Gerlitz wrote:
> > will look into this, any idea for a more suitable location?
> 
> Maybe just a new file under lib/ for now?
> 
> > > Overly long line.  But we're also looking into generic helpers for
> > > this kind of things, not sure if they made it to linux-next in the
> > > meantime, but please check.
> > 
> > This is what I found in linux-next - note sure if you were referring to it
> > 
> > commit 11432a3cc061c39475295be533c3674c4f8a6d0b
> > Author: David Howells <dhowells@redhat.com>
> > 
> >     iov_iter: Add ITER_XARRAY
> 
> No, that's not it.  Ira, what is the status of the memcpy_{to,from}_page
> helpers?

Converting the entire kernel tree in one series has become unwieldy so I've
given up.

But I have a series to convert btrfs which I could release by the end of the
week.  That should be good enough to push the memcpy_*_page() support in.

I'm get it formatted and submitted,
Ira
Boris Pismenny Feb. 7, 2021, 2:13 p.m. UTC | #5
On 03/02/2021 21:34, Ira Weiny wrote:
> On Wed, Feb 03, 2021 at 05:56:21PM +0100, Christoph Hellwig wrote:
>> On Tue, Feb 02, 2021 at 08:00:51PM +0200, Or Gerlitz wrote:
>>> will look into this, any idea for a more suitable location?
>>
>> Maybe just a new file under lib/ for now?
>>
>>>> Overly long line.  But we're also looking into generic helpers for
>>>> this kind of things, not sure if they made it to linux-next in the
>>>> meantime, but please check.
>>>
>>> This is what I found in linux-next - note sure if you were referring to it
>>>
>>> commit 11432a3cc061c39475295be533c3674c4f8a6d0b
>>> Author: David Howells <dhowells@redhat.com>
>>>
>>>     iov_iter: Add ITER_XARRAY
>>
>> No, that's not it.  Ira, what is the status of the memcpy_{to,from}_page
>> helpers?
> 
> Converting the entire kernel tree in one series has become unwieldy so I've
> given up.
> 
> But I have a series to convert btrfs which I could release by the end of the
> week.  That should be good enough to push the memcpy_*_page() support in.
> 
> I'm get it formatted and submitted,
> Ira
> 

To conclude this discussion, there's nothing that we need to change here
as the relevant series is still WIP, right?
Boris Pismenny Feb. 7, 2021, 2:24 p.m. UTC | #6
On 03/02/2021 18:56, Christoph Hellwig wrote:
> On Tue, Feb 02, 2021 at 08:00:51PM +0200, Or Gerlitz wrote:
>> will look into this, any idea for a more suitable location?
> 
> Maybe just a new file under lib/ for now?
> 

That doesn't work unless we copy quite a lot of code. There are macros
here (in lib/iov_iter.c) that we rely on, e.g. iterate_and_advance and
friends.

Instead, I propose that we place all of the new code under an ifdef to
reduce the impact on object size if the code is unused. We'll also
improve documentation around this commit.
diff mbox series

Patch

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 72d88566694e..3c42125a7f24 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -123,6 +123,7 @@  size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i);
 
 size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i);
+size_t _ddp_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i);
 size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i);
 bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i);
 size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i);
@@ -137,6 +138,15 @@  size_t copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 		return _copy_to_iter(addr, bytes, i);
 }
 
+static __always_inline __must_check
+size_t ddp_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
+{
+	if (unlikely(!check_copy_size(addr, bytes, true)))
+		return 0;
+	else
+		return _ddp_copy_to_iter(addr, bytes, i);
+}
+
 static __always_inline __must_check
 size_t copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 {
@@ -265,6 +275,8 @@  size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, struct io
 bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i);
 size_t hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
 		struct iov_iter *i);
+size_t ddp_hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
+		struct iov_iter *i);
 
 struct iovec *iovec_from_user(const struct iovec __user *uvector,
 		unsigned long nr_segs, unsigned long fast_segs,
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index a21e6a5792c5..b8af1d3bbec0 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -473,6 +473,16 @@  static void memcpy_from_page(char *to, struct page *page, size_t offset, size_t
 	kunmap_atomic(from);
 }
 
+static void ddp_memcpy_to_page(struct page *page, size_t offset, const char *from, size_t len)
+{
+	char *to = kmap_atomic(page);
+
+	if (to + offset != from)
+		memcpy(to + offset, from, len);
+
+	kunmap_atomic(to);
+}
+
 static void memcpy_to_page(struct page *page, size_t offset, const char *from, size_t len)
 {
 	char *to = kmap_atomic(page);
@@ -625,6 +635,24 @@  static size_t csum_and_copy_to_pipe_iter(const void *addr, size_t bytes,
 	return bytes;
 }
 
+size_t _ddp_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
+{
+	const char *from = addr;
+	if (unlikely(iov_iter_is_pipe(i)))
+		return copy_pipe_to_iter(addr, bytes, i);
+	if (iter_is_iovec(i))
+		might_fault();
+	iterate_and_advance(i, bytes, v,
+		copyout(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len),
+		ddp_memcpy_to_page(v.bv_page, v.bv_offset,
+				   (from += v.bv_len) - v.bv_len, v.bv_len),
+		memcpy(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len)
+		)
+
+	return bytes;
+}
+EXPORT_SYMBOL(_ddp_copy_to_iter);
+
 size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 {
 	const char *from = addr;
@@ -1566,6 +1594,25 @@  size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
 }
 EXPORT_SYMBOL(csum_and_copy_to_iter);
 
+size_t ddp_hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
+		struct iov_iter *i)
+{
+#ifdef CONFIG_CRYPTO_HASH
+	struct ahash_request *hash = hashp;
+	struct scatterlist sg;
+	size_t copied;
+
+	copied = ddp_copy_to_iter(addr, bytes, i);
+	sg_init_one(&sg, addr, copied);
+	ahash_request_set_crypt(hash, &sg, NULL, copied);
+	crypto_ahash_update(hash);
+	return copied;
+#else
+	return 0;
+#endif
+}
+EXPORT_SYMBOL(ddp_hash_and_copy_to_iter);
+
 size_t hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
 		struct iov_iter *i)
 {