Message ID | 20250218194056.380647-1-sdf@fomichev.me (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] tcp: devmem: properly export MSG_CTRUNC to userspace | expand |
On Tue, Feb 18, 2025 at 11:40 AM Stanislav Fomichev <sdf@fomichev.me> wrote: > > Currently, we report -ETOOSMALL (err) only on the first iteration > (!sent). When we get put_cmsg error after a bunch of successful > put_cmsg calls, we don't signal the error at all. This might be > confusing on the userspace side which will see truncated CMSGs > but no MSG_CTRUNC signal. > > Consider the following case: > - sizeof(struct cmsghdr) = 16 > - sizeof(struct dmabuf_cmsg) = 24 > - total cmsg size (CMSG_LEN) = 40 (16+24) > > When calling recvmsg with msg_controllen=60, the userspace > will receive two(!) dmabuf_cmsg(s), the first one will The intended API in this scenario is that the user will receive *one* dmabuf_cmgs. The kernel will consider that data in that frag to be delivered to userspace, and subsequent recvmsg() calls will not re-deliver that data. The next recvmsg() call will deliver the data that we failed to put_cmsg() in the current call. If you receive two dmabuf_cmsgs in this scenario, that is indeed a bug. Exposing CMSG_CTRUNC could be a good fix. It may indicate to the user "ignore the last cmsg we put, because it got truncated, and you'll receive the full cmsg on the next recvmsg call". We do need to update the docs for this I think. However, I think a much much better fix is to modify put_cmsg() so that we only get one dmabuf_cmsgs in this scenario, if possible. We could add a strict flag to put_cmsg(). If (strict == true && msg->controlllen < cmlen), we return an error instead of putting a truncated cmsg, so that the user only sees one dmabuf_cmsg in this scenario. Is this doable? -- Thanks, Mina
On 02/18, Mina Almasry wrote: > On Tue, Feb 18, 2025 at 11:40 AM Stanislav Fomichev <sdf@fomichev.me> wrote: > > > > Currently, we report -ETOOSMALL (err) only on the first iteration > > (!sent). When we get put_cmsg error after a bunch of successful > > put_cmsg calls, we don't signal the error at all. This might be > > confusing on the userspace side which will see truncated CMSGs > > but no MSG_CTRUNC signal. > > > > Consider the following case: > > - sizeof(struct cmsghdr) = 16 > > - sizeof(struct dmabuf_cmsg) = 24 > > - total cmsg size (CMSG_LEN) = 40 (16+24) > > > > When calling recvmsg with msg_controllen=60, the userspace > > will receive two(!) dmabuf_cmsg(s), the first one will > > The intended API in this scenario is that the user will receive *one* > dmabuf_cmgs. The kernel will consider that data in that frag to be > delivered to userspace, and subsequent recvmsg() calls will not > re-deliver that data. The next recvmsg() call will deliver the data > that we failed to put_cmsg() in the current call. > > If you receive two dmabuf_cmsgs in this scenario, that is indeed a > bug. Exposing CMSG_CTRUNC could be a good fix. It may indicate to the > user "ignore the last cmsg we put, because it got truncated, and > you'll receive the full cmsg on the next recvmsg call". We do need to > update the docs for this I think. > > However, I think a much much better fix is to modify put_cmsg() so > that we only get one dmabuf_cmsgs in this scenario, if possible. We > could add a strict flag to put_cmsg(). If (strict == true && > msg->controlllen < cmlen), we return an error instead of putting a > truncated cmsg, so that the user only sees one dmabuf_cmsg in this > scenario. > > Is this doable? Instead of modifying put_cmsg(), I can have an extra check before calling it to make sure the full entry fits. Something like: --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2498,6 +2498,11 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb, offset += copy; remaining_len -= copy; + if (msg.msg_controllen < CMSG_LEN(sizeof(dmabuf_cmsg))) { + err = -ETOOSMALL; + goto out; + } + err = put_cmsg(msg, SOL_SOCKET, SO_DEVMEM_DMABUF, sizeof(dmabuf_cmsg), WDYT? I'll still probably remove '~MSG_CTRUNC' parts as well to avoid confusion.
On Tue, Feb 18, 2025 at 1:17 PM Stanislav Fomichev <stfomichev@gmail.com> wrote: > > On 02/18, Mina Almasry wrote: > > On Tue, Feb 18, 2025 at 11:40 AM Stanislav Fomichev <sdf@fomichev.me> wrote: > > > > > > Currently, we report -ETOOSMALL (err) only on the first iteration > > > (!sent). When we get put_cmsg error after a bunch of successful > > > put_cmsg calls, we don't signal the error at all. This might be > > > confusing on the userspace side which will see truncated CMSGs > > > but no MSG_CTRUNC signal. > > > > > > Consider the following case: > > > - sizeof(struct cmsghdr) = 16 > > > - sizeof(struct dmabuf_cmsg) = 24 > > > - total cmsg size (CMSG_LEN) = 40 (16+24) > > > > > > When calling recvmsg with msg_controllen=60, the userspace > > > will receive two(!) dmabuf_cmsg(s), the first one will > > > > The intended API in this scenario is that the user will receive *one* > > dmabuf_cmgs. The kernel will consider that data in that frag to be > > delivered to userspace, and subsequent recvmsg() calls will not > > re-deliver that data. The next recvmsg() call will deliver the data > > that we failed to put_cmsg() in the current call. > > > > If you receive two dmabuf_cmsgs in this scenario, that is indeed a > > bug. Exposing CMSG_CTRUNC could be a good fix. It may indicate to the > > user "ignore the last cmsg we put, because it got truncated, and > > you'll receive the full cmsg on the next recvmsg call". We do need to > > update the docs for this I think. > > > > However, I think a much much better fix is to modify put_cmsg() so > > that we only get one dmabuf_cmsgs in this scenario, if possible. We > > could add a strict flag to put_cmsg(). If (strict == true && > > msg->controlllen < cmlen), we return an error instead of putting a > > truncated cmsg, so that the user only sees one dmabuf_cmsg in this > > scenario. > > > > Is this doable? > > Instead of modifying put_cmsg(), I can have an extra check before > calling it to make sure the full entry fits. Something like: > Yes, that sounds perfect. I would add a new helper, maybe put_dmabuf_cmsg, that checks that we have enough space before calling the generic put_cmsg(). > --- a/net/ipv4/tcp.c > +++ b/net/ipv4/tcp.c > @@ -2498,6 +2498,11 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb, > offset += copy; > remaining_len -= copy; > > + if (msg.msg_controllen < CMSG_LEN(sizeof(dmabuf_cmsg))) { > + err = -ETOOSMALL; > + goto out; > + } > + > err = put_cmsg(msg, SOL_SOCKET, > SO_DEVMEM_DMABUF, > sizeof(dmabuf_cmsg), > > WDYT? I'll still probably remove '~MSG_CTRUNC' parts as well to avoid > confusion. Yes, since we check there is enough space before calling put_cmsg(), it should now become impossible for put_cmsg() to set MSG_CTRUNC anyway, so the check in tcp_recvmsg_dmabuf() becomes an unnecessary defensive check that should be removed. Thanks for catching this!
On 02/18, Mina Almasry wrote: > On Tue, Feb 18, 2025 at 1:17 PM Stanislav Fomichev <stfomichev@gmail.com> wrote: > > > > On 02/18, Mina Almasry wrote: > > > On Tue, Feb 18, 2025 at 11:40 AM Stanislav Fomichev <sdf@fomichev.me> wrote: > > > > > > > > Currently, we report -ETOOSMALL (err) only on the first iteration > > > > (!sent). When we get put_cmsg error after a bunch of successful > > > > put_cmsg calls, we don't signal the error at all. This might be > > > > confusing on the userspace side which will see truncated CMSGs > > > > but no MSG_CTRUNC signal. > > > > > > > > Consider the following case: > > > > - sizeof(struct cmsghdr) = 16 > > > > - sizeof(struct dmabuf_cmsg) = 24 > > > > - total cmsg size (CMSG_LEN) = 40 (16+24) > > > > > > > > When calling recvmsg with msg_controllen=60, the userspace > > > > will receive two(!) dmabuf_cmsg(s), the first one will > > > > > > The intended API in this scenario is that the user will receive *one* > > > dmabuf_cmgs. The kernel will consider that data in that frag to be > > > delivered to userspace, and subsequent recvmsg() calls will not > > > re-deliver that data. The next recvmsg() call will deliver the data > > > that we failed to put_cmsg() in the current call. > > > > > > If you receive two dmabuf_cmsgs in this scenario, that is indeed a > > > bug. Exposing CMSG_CTRUNC could be a good fix. It may indicate to the > > > user "ignore the last cmsg we put, because it got truncated, and > > > you'll receive the full cmsg on the next recvmsg call". We do need to > > > update the docs for this I think. > > > > > > However, I think a much much better fix is to modify put_cmsg() so > > > that we only get one dmabuf_cmsgs in this scenario, if possible. We > > > could add a strict flag to put_cmsg(). If (strict == true && > > > msg->controlllen < cmlen), we return an error instead of putting a > > > truncated cmsg, so that the user only sees one dmabuf_cmsg in this > > > scenario. > > > > > > Is this doable? > > > > Instead of modifying put_cmsg(), I can have an extra check before > > calling it to make sure the full entry fits. Something like: > > > > Yes, that sounds perfect. I would add a new helper, maybe > put_dmabuf_cmsg, that checks that we have enough space before calling > the generic put_cmsg(). > > > --- a/net/ipv4/tcp.c > > +++ b/net/ipv4/tcp.c > > @@ -2498,6 +2498,11 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb, > > offset += copy; > > remaining_len -= copy; > > > > + if (msg.msg_controllen < CMSG_LEN(sizeof(dmabuf_cmsg))) { > > + err = -ETOOSMALL; > > + goto out; > > + } > > + > > err = put_cmsg(msg, SOL_SOCKET, > > SO_DEVMEM_DMABUF, > > sizeof(dmabuf_cmsg), > > > > WDYT? I'll still probably remove '~MSG_CTRUNC' parts as well to avoid > > confusion. > > Yes, since we check there is enough space before calling put_cmsg(), > it should now become impossible for put_cmsg() to set MSG_CTRUNC > anyway, so the check in tcp_recvmsg_dmabuf() becomes an unnecessary > defensive check that should be removed. > > Thanks for catching this! Perfect, thanks for a quick review! --- pw-bot: cr
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 0d704bda6c41..bdc9ac648d83 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2441,7 +2441,6 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb, err = put_cmsg(msg, SOL_SOCKET, SO_DEVMEM_LINEAR, sizeof(dmabuf_cmsg), &dmabuf_cmsg); if (err || msg->msg_flags & MSG_CTRUNC) { - msg->msg_flags &= ~MSG_CTRUNC; if (!err) err = -ETOOSMALL; goto out; @@ -2504,7 +2503,6 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb, sizeof(dmabuf_cmsg), &dmabuf_cmsg); if (err || msg->msg_flags & MSG_CTRUNC) { - msg->msg_flags &= ~MSG_CTRUNC; if (!err) err = -ETOOSMALL; goto out;
Currently, we report -ETOOSMALL (err) only on the first iteration (!sent). When we get put_cmsg error after a bunch of successful put_cmsg calls, we don't signal the error at all. This might be confusing on the userspace side which will see truncated CMSGs but no MSG_CTRUNC signal. Consider the following case: - sizeof(struct cmsghdr) = 16 - sizeof(struct dmabuf_cmsg) = 24 - total cmsg size (CMSG_LEN) = 40 (16+24) When calling recvmsg with msg_controllen=60, the userspace will receive two(!) dmabuf_cmsg(s), the first one will be a valid one and the second one will be silently truncated. There is no easy way to discover the truncation besides doing something like "cm->cmsg_len != CMSG_LEN(sizeof(dmabuf_cmsg))". Do not mask MSG_CTRUNC to keep conventional cmsg-truncated signals in place. If there is a concern with breaking UAPI, I can document this case in devmem.rst instead. Fixes: 8f0b3cc9a4c1 ("tcp: RX path for devmem TCP") Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> --- net/ipv4/tcp.c | 2 -- 1 file changed, 2 deletions(-)