[v2,net] inet: inet_defrag: prevent sk release while still in use

ip_local_out() and other functions can pass skb->sk as a function argument.

If the skb is a fragment and reassembly happens before such a function
call returns, the sk must not be released.

This affects skb fragments reassembled via netfilter or similar
modules, e.g. openvswitch or ct_act.c, when run as part of the tx pipeline.

Eric Dumazet made an initial analysis of this bug.  Quoting Eric:
  Calling ip_defrag() in output path is also implying skb_orphan(),
  which is buggy because output path relies on sk not disappearing.

  A relevant old patch about the issue was :
  8282f27449bf ("inet: frag: Always orphan skbs inside ip_defrag()")

  [..]

  net/ipv4/ip_output.c depends on skb->sk being set, and probably to an
  inet socket, not an arbitrary one.

  If we orphan the packet in ipvlan, then downstream things like FQ
  packet scheduler will not work properly.

  We need to change ip_defrag() to only use skb_orphan() when really
  needed, ie whenever frag_list is going to be used.

Eric suggested stashing sk in the fragment queue and made an initial patch.
However, there is a problem with this:

If the skb is refragmented right after, ip_do_fragment() will copy
head->sk to the new fragments and set up the destructor to sock_wfree.
IOW, we have no choice but to fix up sk_wmem accounting to reflect the
fully reassembled skb, else wmem will underflow.
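
The underflow can be modelled in a few lines of user-space C.  This is
only a sketch under simplifying assumptions, not kernel code: toy_sock,
toy_skb, toy_charge() and toy_wfree() are hypothetical stand-ins for
struct sock, struct sk_buff, skb_set_owner_w() and sock_wfree(), and the
counter is a signed integer so that an underflow shows up as a negative
value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sock and struct sk_buff. */
struct toy_sock { long long wmem_alloc; };	/* models sk_wmem_alloc */
struct toy_skb  { struct toy_sock *sk; unsigned int truesize; };

/* Charge the skb to the socket, in the spirit of skb_set_owner_w(). */
static void toy_charge(struct toy_skb *skb, struct toy_sock *sk)
{
	skb->sk = sk;
	sk->wmem_alloc += skb->truesize;
}

/* Uncharge on free, in the spirit of the sock_wfree() destructor. */
static void toy_wfree(struct toy_skb *skb)
{
	skb->sk->wmem_alloc -= skb->truesize;
	skb->sk = NULL;
}

/*
 * Merge n fragments of frag_truesize bytes each into the head fragment,
 * then free the head.  Only the head stays charged to the socket; the
 * other fragments are uncharged when they are queued.  If fixup is
 * non-zero, the socket is also charged for the truesize growth of the
 * head.  Returns the final value of the toy wmem counter.
 */
static long long toy_reasm_then_free(unsigned int n,
				     unsigned int frag_truesize, int fixup)
{
	struct toy_sock sk = { 0 };
	struct toy_skb head = { NULL, frag_truesize };
	struct toy_skb frag = { NULL, frag_truesize };
	unsigned int i, delta = 0;

	assert(n >= 1);
	toy_charge(&head, &sk);
	for (i = 1; i < n; i++) {
		toy_charge(&frag, &sk);	/* fragment i was sent by sk ... */
		toy_wfree(&frag);	/* ... and is uncharged when queued */
		delta += frag_truesize;
	}
	head.truesize += delta;		/* reassembly inflates truesize */
	if (fixup)
		sk.wmem_alloc += delta;
	toy_wfree(&head);		/* subtracts the inflated truesize */
	return sk.wmem_alloc;
}
```

With three fragments of 1000 bytes, toy_reasm_then_free(3, 1000, 0)
ends at -2000, while toy_reasm_then_free(3, 1000, 1) ends at 0.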

This change moves the orphan down into the core, to the last possible
moment.  As ip_defrag_offset is aliased with the sk_buff->sk member, we
must move the offset into the FRAG_CB, else skb->sk gets clobbered.
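
The aliasing problem can be illustrated with a small user-space sketch.
The struct and function names below (toy_skb_old, toy_skb_new,
toy_frag_cb and the two helpers) are hypothetical; the embedded cb member
merely stands in for the control-block area that FRAG_CB(skb) points to.
The anonymous union requires C11.

```c
#include <assert.h>
#include <stddef.h>

struct toy_sock { int dummy; };

/* Old layout: the fragment offset shares storage with the sk pointer. */
struct toy_skb_old {
	union {
		struct toy_sock *sk;
		int ip_defrag_offset;
	};
};

/* New layout: the offset lives in per-fragment control-block data. */
struct toy_frag_cb { int ip_defrag_offset; };

struct toy_skb_new {
	struct toy_sock *sk;
	struct toy_frag_cb cb;	/* stands in for FRAG_CB(skb) */
};

/*
 * Returns 1 if sk and ip_defrag_offset start at the same address in the
 * old layout, i.e. storing the offset overwrites part of the pointer.
 */
static int toy_old_layout_aliases(void)
{
	return offsetof(struct toy_skb_old, sk) ==
	       offsetof(struct toy_skb_old, ip_defrag_offset);
}

/*
 * Records the offset in the new layout and returns 1 if both the sk
 * pointer and the offset are intact afterwards.
 */
static int toy_new_layout_keeps_sk(int offset)
{
	struct toy_sock sock = { 0 };
	struct toy_skb_new skb = { &sock, { 0 } };

	assert(offset >= 0);	/* fragment offsets are non-negative */
	skb.cb.ip_defrag_offset = offset;
	return skb.sk == &sock && skb.cb.ip_defrag_offset == offset;
}
```

toy_old_layout_aliases() returns 1, which is why sk cannot be kept
while the offset sits in the union; toy_new_layout_keeps_sk(1480)
returns 1 because the two fields no longer overlap.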

This allows delaying the orphaning long enough to learn whether the skb
has to be queued or whether it completes the reasm queue.

In the former case, things work as before: the skb is orphaned.  This is
safe because the skb gets queued/stolen and won't continue past the reasm
engine.

In the latter case, we will steal the skb->sk reference, reattach it to
the head skb, and fix up wmem accounting when inet_frag inflates truesize.
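
The two cases can be sketched in user-space C as follows.  This is a
simplified model and not the code in the patch: toy_queue, toy_defrag()
and the other toy_* names are hypothetical, the queue only tracks how
many fragments are still missing, and the counter is signed so that
accounting errors would be visible.

```c
#include <assert.h>
#include <stddef.h>

struct toy_sock { long long wmem_alloc; };	/* models sk_wmem_alloc */
struct toy_skb  { struct toy_sock *sk; unsigned int truesize; };
struct toy_queue {
	struct toy_skb *head;
	unsigned int queued_truesize;	/* truesize of non-head fragments */
	unsigned int missing;		/* fragments not yet received */
};

static void toy_charge(struct toy_skb *skb, struct toy_sock *sk)
{
	skb->sk = sk;
	sk->wmem_alloc += skb->truesize;
}

/* Drop the socket reference and uncharge, like an orphan or free. */
static void toy_orphan(struct toy_skb *skb)
{
	if (skb->sk)
		skb->sk->wmem_alloc -= skb->truesize;
	skb->sk = NULL;
}

/* Returns the reassembled head, or NULL if skb was only queued. */
static struct toy_skb *toy_defrag(struct toy_queue *q, struct toy_skb *skb)
{
	struct toy_sock *sk;
	struct toy_skb *head;
	unsigned int charged;

	assert(q->missing >= 1);
	if (q->missing > 1) {
		/* Former case: skb stays in the queue, orphaning is safe. */
		toy_orphan(skb);
		if (!q->head)
			q->head = skb;
		else
			q->queued_truesize += skb->truesize;
		q->missing--;
		return NULL;
	}

	/* Latter case: steal the reference instead of dropping it. */
	sk = skb->sk;
	charged = sk ? skb->truesize : 0;
	skb->sk = NULL;

	head = q->head ? q->head : skb;
	if (head != skb)
		head->truesize += skb->truesize;
	head->truesize += q->queued_truesize;	/* truesize is inflated */

	if (sk) {
		/* Charge the growth so a later free cannot underflow. */
		sk->wmem_alloc += head->truesize - charged;
		head->sk = sk;
	}
	q->head = NULL;
	q->queued_truesize = 0;
	q->missing = 0;
	return head;
}

/* Three fragments from one socket; optionally free the result. */
static long long toy_scenario(int free_head)
{
	struct toy_sock sk = { 0 };
	struct toy_skb a = { NULL, 1000 }, b = { NULL, 1200 }, c = { NULL, 800 };
	struct toy_queue q = { NULL, 0, 3 };
	struct toy_skb *head;

	toy_charge(&a, &sk);
	toy_charge(&b, &sk);
	toy_charge(&c, &sk);
	toy_defrag(&q, &a);
	toy_defrag(&q, &b);
	head = toy_defrag(&q, &c);
	if (free_head)
		toy_orphan(head);
	return sk.wmem_alloc;
}
```

In this model toy_scenario(0) leaves the counter at 3000, matching the
truesize of the reassembled head, and toy_scenario(1) returns it to 0
once the head is freed.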

Fixes: 7026b1ddb6b8 ("netfilter: Pass socket pointer down through okfn().")
Diagnosed-by: Eric Dumazet <edumazet@google.com>
Reported-by: xingwei lee <xrivendell7@gmail.com>
Reported-by: yue sun <samsun1006219@gmail.com>
Reported-by: syzbot+e5167d7144a62715044c@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 Changes in v2:
  - add Fixes tag
  - remove obsolete tcp_wfree prototype declaration

 include/linux/skbuff.h                  |  7 +--
 net/ipv4/inet_fragment.c                | 70 ++++++++++++++++++++-----
 net/ipv4/ip_fragment.c                  |  2 +-
 net/ipv6/netfilter/nf_conntrack_reasm.c |  2 +-
 4 files changed, 60 insertions(+), 21 deletions(-)

Message ID	20240326101845.30836-1-fw@strlen.de (mailing list archive)
State	Accepted
Commit	18685451fc4e546fc0e718580d32df3c0e5c8272
Delegated to:	Netdev Maintainers
Headers
  From: Florian Westphal <fw@strlen.de>
  To: <netdev@vger.kernel.org>
  Cc: dsahern@kernel.org, kuba@kernel.org, pabeni@redhat.com, netfilter-devel@vger.kernel.org, edumazet@google.com, Florian Westphal <fw@strlen.de>, xingwei lee <xrivendell7@gmail.com>, yue sun <samsun1006219@gmail.com>, syzbot+e5167d7144a62715044c@syzkaller.appspotmail.com
  Subject: [PATCH v2 net] inet: inet_defrag: prevent sk release while still in use
  Date: Tue, 26 Mar 2024 11:18:41 +0100
  Message-ID: <20240326101845.30836-1-fw@strlen.de>
  X-Mailer: git-send-email 2.43.2
  X-Mailing-List: netdev@vger.kernel.org
  X-Patchwork-Delegate: kuba@kernel.org
Series	[v2,net] inet: inet_defrag: prevent sk release while still in use

Context	Check	Description
netdev/series_format	success	Single patches do not need cover letters
netdev/tree_selection	success	Clearly marked for net
netdev/ynl	success	Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present	success	Fixes tag present in non-next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 5799 this patch: 5799
netdev/build_tools	success	Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers	warning	3 maintainers not CCed: kadlec@netfilter.org pablo@netfilter.org coreteam@netfilter.org
netdev/build_clang	success	Errors and warnings before: 1077 this patch: 1077
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	Fixes tag looks correct
netdev/build_allmodconfig_warn	success	Errors and warnings before: 6083 this patch: 6083
netdev/checkpatch	warning	WARNING: Non-standard signature: Diagnosed-by:; WARNING: function definition argument 'struct sk_buff *' should also have an identifier name
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0
netdev/contest	success	net-next-2024-03-27--15-00 (tests: 952)
