[v3,net-next,0/7] mptcp: implement TCP_NOTSENT_LOWAT support

Message ID cover.1708082765.git.pabeni@redhat.com (mailing list archive)

Message

Paolo Abeni Feb. 16, 2024, 11:28 a.m. UTC
Patch 6/7 does the magic; all the others are minor cleanups and fixes for
small bugs exposed by this feature. I'll push a paired packetdrill test.
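
For context, a minimal user-space sketch of what the feature is meant to
enable: setting TCP_NOTSENT_LOWAT on an MPTCP socket. The helper names
set_notsent_lowat() and open_mptcp_with_lowat() are hypothetical and the
16384-byte value used in the note below is arbitrary; only the socket(2)
and setsockopt(2) calls and the two constants come from the Linux UAPI.
On kernels without this series the setsockopt() call is expected to fail.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fallbacks for older libc headers; values match the Linux UAPI. */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif
#ifndef TCP_NOTSENT_LOWAT
#define TCP_NOTSENT_LOWAT 25
#endif

/* Hypothetical helper: set the not-sent low-water mark, in bytes.
 * Returns 0 on success, -1 with errno set on failure. */
static int set_notsent_lowat(int fd, unsigned int bytes)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT,
			  &bytes, sizeof(bytes));
}

/* Hypothetical helper: open an MPTCP socket and apply the low-water mark.
 * Returns the fd, or -1 with errno set if either step fails. */
static int open_mptcp_with_lowat(unsigned int bytes)
{
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);

	if (fd < 0)
		return -1;
	if (set_notsent_lowat(fd, bytes) < 0) {
		int saved = errno;	/* keep the setsockopt() error */

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

A caller would use open_mptcp_with_lowat(16384) and then rely on
poll()/EPOLLOUT to learn when the not-sent backlog drops below the mark.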

Note that this relies on the existing accounting for snd_nxt. As I
stated, such accounting is not 110% accurate, as it tracks the most recent
sequence number queued to any subflow, and not the actual sequence
number sent on the wire.
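
A rough user-space model of that accounting, as a sketch only: the
struct and function names are hypothetical and do not mirror the kernel
structures, and treating a zero low-water mark as "no limit" is an
assumption of this model. The idea is that the not-sent backlog is the
distance between what the application wrote and what was queued to the
subflows, and the writer is woken only while that backlog is below the
mark.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the MPTCP-level send state. */
struct msk_model {
	uint64_t write_seq;	/* next sequence number the application writes */
	uint64_t snd_nxt;	/* most recent sequence queued to any subflow */
	uint32_t notsent_lowat;	/* low-water mark; 0 means "no limit" here */
};

/* Bytes written by the application but not yet queued to a subflow. */
static uint64_t notsent_bytes(const struct msk_model *m)
{
	return m->write_seq - m->snd_nxt;
}

/* True when the writer may be woken: backlog below the low-water mark. */
static bool writer_may_wake(const struct msk_model *m)
{
	if (m->notsent_lowat == 0)
		return true;
	return notsent_bytes(m) < m->notsent_lowat;
}
```

Because snd_nxt advances when data is queued to a subflow rather than
when it hits the wire, this model can report a smaller backlog than a
wire-accurate counter would.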

I experimented a lot trying to implement the latter and in the end it
proved to be both "too complex" and "not necessary".

The complexity arises from the need for an additional lock and a lot of
refactoring to introduce such protection without adding significant
overhead. Additionally, snd_nxt is currently used and exposed with the
current semantics by both eBPF and the internal packet scheduler.
Introducing a different tracking would still require us to keep the old
one.

More interestingly, a more accurate tracking may not be strictly
necessary: as the MPTCP protocol enqueues data to the subflows only up
to the available send window, any enqueued data is sent on the wire
instantly, without any blocking operation short of a drop in the tx path
at the nft or TC layer.
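
The window-limited enqueue can be sketched with a small user-space
model. This is an illustration under stated assumptions, not the kernel
code: struct tx_model and push_pending() are hypothetical names, and a
single MPTCP-level window edge (wnd_end) stands in for the real
per-subflow logic.

```c
#include <stdint.h>

/* Hypothetical model: MPTCP-level sequence space and send window edge. */
struct tx_model {
	uint64_t write_seq;	/* next sequence number the application writes */
	uint64_t snd_nxt;	/* most recent sequence queued to any subflow */
	uint64_t wnd_end;	/* right edge of the available send window */
};

/* Queue as much pending data as the send window allows.
 * Returns the number of bytes moved to the subflows. */
static uint64_t push_pending(struct tx_model *t)
{
	uint64_t pending = t->write_seq - t->snd_nxt;
	uint64_t window = t->wnd_end > t->snd_nxt ?
			  t->wnd_end - t->snd_nxt : 0;
	uint64_t n = pending < window ? pending : window;

	t->snd_nxt += n;	/* never moves past wnd_end */
	return n;
}
```

Since snd_nxt never moves past wnd_end in this model, whatever has been
queued already fits in the send window, which is why the gap between
"queued" and "on the wire" is expected to stay small.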

The individual patch changelogs carry the gory details.

Still sending a single series to outline the functional requirements,
even if the first 3 patches could land on mptcp-net directly.

v2 -> v3:
  - typo in patch 1/7
  - dropped unused code in patch 3/7
  - dropped duplicate code in patch 6/7

v1 -> v2:
  - clarify commit message in patch 1/7
  - fix possible wake-up bug in patch 6/7

Paolo Abeni (7):
  mptcp: push at DSS boundaries
  mptcp: fix snd_wnd initialization for passive socket
  mptcp: fix potential wake-up event loss
  mptcp: cleanup writer wake-up
  mptcp: avoid some duplicate code in socket option handling
  mptcp: implement TCP_NOTSENT_LOWAT support.
  mptcp: cleanup SOL_TCP handling

 net/mptcp/protocol.c | 57 ++++++++++++++++++++++++-----------
 net/mptcp/protocol.h | 51 ++++++++++++++++++++++---------
 net/mptcp/sockopt.c  | 71 ++++++++++++++++++++------------------------
 3 files changed, 108 insertions(+), 71 deletions(-)

Comments

Mat Martineau Feb. 16, 2024, 7:48 p.m. UTC | #1
On Fri, 16 Feb 2024, Paolo Abeni wrote:

> Patch 6/7 does the magic; all the others are minor cleanups and fixes for
> small bugs exposed by this feature. I'll push a paired packetdrill test.
>
> Note that this relies on the existing accounting for snd_nxt. As I
> stated, such accounting is not 110% accurate, as it tracks the most recent
> sequence number queued to any subflow, and not the actual sequence
> number sent on the wire.
>
> I experimented a lot trying to implement the latter and in the end it
> proved to be both "too complex" and "not necessary".
>
> The complexity arises from the need for an additional lock and a lot of
> refactoring to introduce such protection without adding significant
> overhead. Additionally, snd_nxt is currently used and exposed with the
> current semantics by both eBPF and the internal packet scheduler.
> Introducing a different tracking would still require us to keep the old
> one.
>
> More interestingly, a more accurate tracking may not be strictly
> necessary: as the MPTCP protocol enqueues data to the subflows only up
> to the available send window, any enqueued data is sent on the wire
> instantly, without any blocking operation short of a drop in the tx path
> at the nft or TC layer.
>
> The individual patch changelogs carry the gory details.
>

v3 of the series LGTM, thanks Paolo.

Matthieu, note that this series is split between mptcp-net and mptcp-next:

> Still sending a single series to outline the functional requirements,
> even if the first 3 patches could land on mptcp-net directly.
>

Reviewed-by: Mat Martineau <martineau@kernel.org>


Matthieu Baerts Feb. 19, 2024, 9:44 a.m. UTC | #2
Hi Paolo, Mat,

On 16/02/2024 12:28, Paolo Abeni wrote:
> Patch 6/7 does the magic; all the others are minor cleanups and fixes for
> small bugs exposed by this feature. I'll push a paired packetdrill test.
> 
> Note that this relies on the existing accounting for snd_nxt. As I
> stated, such accounting is not 110% accurate, as it tracks the most recent
> sequence number queued to any subflow, and not the actual sequence
> number sent on the wire.
> 
> I experimented a lot trying to implement the latter and in the end it
> proved to be both "too complex" and "not necessary".
> 
> The complexity arises from the need for an additional lock and a lot of
> refactoring to introduce such protection without adding significant
> overhead. Additionally, snd_nxt is currently used and exposed with the
> current semantics by both eBPF and the internal packet scheduler.
> Introducing a different tracking would still require us to keep the old
> one.
> 
> More interestingly, a more accurate tracking may not be strictly
> necessary: as the MPTCP protocol enqueues data to the subflows only up
> to the available send window, any enqueued data is sent on the wire
> instantly, without any blocking operation short of a drop in the tx path
> at the nft or TC layer.
> 
> The individual patch changelogs carry the gory details.
> 
> Still sending a single series to outline the functional requirements,
> even if the first 3 patches could land on mptcp-net directly.

Thank you for the patches, and the reviews!

Now in our tree (fixes for -net, and feat. for net-next):


New patches for t/upstream-net and t/upstream:
- d9b01cb9adc7: mptcp: push at DSS boundaries
- 1ea124e2ca57: mptcp: fix snd_wnd initialization for passive socket
- c64351cfe29e: mptcp: fix potential wake-up event loss
- Results: 086d253b7038..9dcf86462a59 (export-net)

- efe1cf277159: conflict in top-bases/t/DO-NOT-MERGE-git-markup-net-next
- Results: 8de48203098c..4f7ce06a0f9e (export)

New patches for t/upstream (only):
- 232c91fcaf38: mptcp: cleanup writer wake-up
- 1e762225d9b2: mptcp: avoid some duplicate code in socket option handling
- 51913f3f330f: mptcp: implement TCP_NOTSENT_LOWAT support
- eb5db737fdac: mptcp: cleanup SOL_TCP handling
- Results: 4f7ce06a0f9e..f2fb9bec2195 (export)

Cheers,
Matt