Message ID | 20240303052408.310064-2-kuba@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Commit | b5a899154aa94cc573db3ae1f61dabe7bfe8b579 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | netlink: handle EMSGSIZE errors in the core | expand |
On Sat, Mar 02, 2024 at 09:24:06PM -0800, Jakub Kicinski wrote: > Eric points out that our current suggested way of handling > EMSGSIZE errors ((err == -EMSGSIZE) ? skb->len : err) will > break if we didn't fit even a single object into the buffer > provided by the user. This should not happen for well behaved > applications, but we can fix that, and free netlink families > from dealing with that completely by moving error handling > into the core. > > Let's assume from now on that all EMSGSIZE errors in dumps are > because we run out of skb space. Families can now propagate > the error nla_put_*() etc generated and not worry about any > return value magic. If some family really wants to send EMSGSIZE > to user space, assuming it generates the same error on the next > dump iteration the skb->len should be 0, and user space should > still see the EMSGSIZE. > > This should simplify families and prevent mistakes in return > values which lead to DONE being forced into a separate recv() > call as discovered by Ido some time ago. > > Reviewed-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> I read your comment here [1] and I believe this patch [2] is needed to avoid another pass in case of an error code other than EMSGSIZE. I can submit it after your series is accepted. [1] https://lore.kernel.org/netdev/20240229073750.6e59155e@kernel.org/ [2] diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c index 70509da4f080..b3a24b61f76b 100644 --- a/net/ipv4/nexthop.c +++ b/net/ipv4/nexthop.c @@ -3241,10 +3241,6 @@ static int rtm_dump_nexthop(struct sk_buff *skb, struct netlink_callback *cb) err = rtm_dump_walk_nexthops(skb, cb, root, ctx, &rtm_dump_nexthop_cb, &filter); - if (err < 0) { - if (likely(skb->len)) - err = skb->len; - } cb->seq = net->nexthop.seq; nl_dump_check_consistent(cb, nlmsg_hdr(skb)); @@ -3439,11 +3435,6 @@ static int rtm_dump_nexthop_bucket(struct sk_buff *skb, &rtm_dump_nexthop_bucket_cb, &dd); } - if (err < 0) { - if (likely(skb->len)) - err = skb->len; - } - cb->seq = net->nexthop.seq; nl_dump_check_consistent(cb, nlmsg_hdr(skb)); return err;
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index ad7b645e3ae7..da846212fb9b 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -2267,6 +2267,15 @@ static int netlink_dump(struct sock *sk, bool lock_taken) if (extra_mutex) mutex_unlock(extra_mutex); + /* EMSGSIZE plus something already in the skb means + * that there's more to dump but current skb has filled up. + * If the callback really wants to return EMSGSIZE to user space + * it needs to do so again, on the next cb->dump() call, + * without putting data in the skb. + */ + if (nlk->dump_done_errno == -EMSGSIZE && skb->len) + nlk->dump_done_errno = skb->len; + cb->extack = NULL; }