Message ID | d0d9e3fffcaba4ace1fb8f437bd4783928bb2d24.1712923998.git.asml.silence@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | implement io_uring notification (ubuf_info) stacking |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Guessing tree name failed - patch did not apply |
On 4/12/24 6:55 AM, Pavel Begunkov wrote:
> At the moment an skb can only have one ubuf_info associated with it,
> which might be a performance problem for zerocopy sends in cases like
> TCP via io_uring. Add a callback for assigning ubuf_info to skb, this
> way we will implement smarter assignment later like linking ubuf_info
> together.
>
> Note, it's an optional callback, which should be compatible with
> skb_zcopy_set(), that's because the net stack might potentially decide
> to clone an skb and take another reference to ubuf_info whenever it
> wishes. Also, a correct implementation should always be able to bind to
> an skb without prior ubuf_info, otherwise we could end up in a situation
> when the send would not be able to progress.
>
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  include/linux/skbuff.h |  2 ++
>  net/core/skbuff.c      | 20 ++++++++++++++------
>  2 files changed, 16 insertions(+), 6 deletions(-)
>

Reviewed-by: David Ahern <dsahern@kernel.org>
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a110e97e074a..ced69f37977f 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -530,6 +530,8 @@ enum {
 struct ubuf_info_ops {
 	void (*complete)(struct sk_buff *, struct ubuf_info *,
 			 bool zerocopy_success);
+	/* has to be compatible with skb_zcopy_set() */
+	int (*link_skb)(struct sk_buff *skb, struct ubuf_info *uarg);
 };
 
 /*
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 749abab23a67..1922e3d09c7f 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1881,11 +1881,18 @@ int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
 	struct ubuf_info *orig_uarg = skb_zcopy(skb);
 	int err, orig_len = skb->len;
 
-	/* An skb can only point to one uarg. This edge case happens when
-	 * TCP appends to an skb, but zerocopy_realloc triggered a new alloc.
-	 */
-	if (orig_uarg && uarg != orig_uarg)
-		return -EEXIST;
+	if (uarg->ops->link_skb) {
+		err = uarg->ops->link_skb(skb, uarg);
+		if (err)
+			return err;
+	} else {
+		/* An skb can only point to one uarg. This edge case happens
+		 * when TCP appends to an skb, but zerocopy_realloc triggered
+		 * a new alloc.
+		 */
+		if (orig_uarg && uarg != orig_uarg)
+			return -EEXIST;
+	}
 
 	err = __zerocopy_sg_from_iter(msg, sk, skb, &msg->msg_iter, len);
 	if (err == -EFAULT || (err == -EMSGSIZE && skb->len == orig_len)) {
@@ -1899,7 +1906,8 @@ int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
 		return err;
 	}
 
-	skb_zcopy_set(skb, uarg, NULL);
+	if (!uarg->ops->link_skb)
+		skb_zcopy_set(skb, uarg, NULL);
 	return skb->len - orig_len;
 }
 EXPORT_SYMBOL_GPL(skb_zerocopy_iter_stream);
At the moment an skb can only have one ubuf_info associated with it,
which might be a performance problem for zerocopy sends in cases like
TCP via io_uring. Add a callback for assigning ubuf_info to skb, this
way we will implement smarter assignment later like linking ubuf_info
together.

Note, it's an optional callback, which should be compatible with
skb_zcopy_set(), that's because the net stack might potentially decide
to clone an skb and take another reference to ubuf_info whenever it
wishes. Also, a correct implementation should always be able to bind to
an skb without prior ubuf_info, otherwise we could end up in a situation
when the send would not be able to progress.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/linux/skbuff.h |  2 ++
 net/core/skbuff.c      | 20 ++++++++++++++------
 2 files changed, 16 insertions(+), 6 deletions(-)
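
To illustrate the contract described in the commit message, here is a minimal userspace sketch of the dispatch added to skb_zerocopy_iter_stream(), plus one possible link_skb callback. It is not kernel code and not the io_uring implementation from this series: the structs are simplified stand-ins, and model_zcopy_set(), model_attach(), chain_link_skb() and the `next` field are hypothetical names used for illustration only. The data copy step (__zerocopy_sg_from_iter()) is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified userspace stand-ins, NOT the real kernel definitions. */
struct sk_buff;
struct ubuf_info;

struct ubuf_info_ops {
	/* has to be compatible with skb_zcopy_set() */
	int (*link_skb)(struct sk_buff *skb, struct ubuf_info *uarg);
};

struct ubuf_info {
	const struct ubuf_info_ops *ops;
	int refcnt;
	struct ubuf_info *next;	/* hypothetical chaining field */
};

struct sk_buff {
	struct ubuf_info *uarg;	/* models what skb_zcopy(skb) returns */
};

/* Model of skb_zcopy_set(): bind only if the skb has no uarg yet, take a ref. */
static void model_zcopy_set(struct sk_buff *skb, struct ubuf_info *uarg)
{
	if (skb && uarg && !skb->uarg) {
		uarg->refcnt++;
		skb->uarg = uarg;
	}
}

/* Hypothetical callback: chain uarg behind whatever the skb already holds. */
static int chain_link_skb(struct sk_buff *skb, struct ubuf_info *uarg)
{
	struct ubuf_info *cur = skb->uarg;

	if (!cur) {
		/* Must always succeed on an skb without a prior ubuf_info. */
		model_zcopy_set(skb, uarg);
		return 0;
	}
	if (cur->ops != uarg->ops)
		return -EEXIST;	/* foreign ubuf_info type, refuse to link */

	for (;;) {
		if (cur == uarg)
			return 0;	/* already linked */
		if (!cur->next)
			break;
		cur = cur->next;
	}
	uarg->refcnt++;
	cur->next = uarg;
	return 0;
}

/* Model of the patched dispatch in skb_zerocopy_iter_stream(). */
static int model_attach(struct sk_buff *skb, struct ubuf_info *uarg)
{
	struct ubuf_info *orig_uarg = skb->uarg;
	int err;

	if (uarg->ops->link_skb) {
		err = uarg->ops->link_skb(skb, uarg);
		if (err)
			return err;
	} else {
		/* Legacy rule: an skb can only point to one uarg. */
		if (orig_uarg && uarg != orig_uarg)
			return -EEXIST;
	}

	/* With a callback present, binding is the callback's job. */
	if (!uarg->ops->link_skb)
		model_zcopy_set(skb, uarg);
	return 0;
}

static const struct ubuf_info_ops plain_ops = { .link_skb = NULL };
static const struct ubuf_info_ops chain_ops = { .link_skb = chain_link_skb };
```

In this model, attaching a second ubuf_info that uses plain_ops to an skb returns -EEXIST as before the patch, while two ubuf_info objects that use chain_ops can share one skb. The first attach on an empty skb goes through the same path as skb_zcopy_set(), which is the "always able to bind to an skb without prior ubuf_info" requirement.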