diff mbox series

[nf] netfilter: nf_tables: unconditionally flush pending work before notifier

Message ID 20240702140841.3337-1-fw@strlen.de (mailing list archive)
State Awaiting Upstream
Delegated to: Netdev Maintainers
Headers show
Series [nf] netfilter: nf_tables: unconditionally flush pending work before notifier | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 839 this patch: 839
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers fail 1 blamed authors not CCed: pablo@netfilter.org; 6 maintainers not CCed: kuba@kernel.org coreteam@netfilter.org pabeni@redhat.com kadlec@netfilter.org edumazet@google.com pablo@netfilter.org
netdev/build_clang success Errors and warnings before: 846 this patch: 846
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 862 this patch: 862
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 5 this patch: 5
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-07-03--12-00 (tests: 666)

Commit Message

Florian Westphal July 2, 2024, 2:08 p.m. UTC
syzbot reports:

KASAN: slab-uaf in nft_ctx_update include/net/netfilter/nf_tables.h:1831
KASAN: slab-uaf in nft_commit_release net/netfilter/nf_tables_api.c:9530
KASAN: slab-uaf int nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597
Read of size 2 at addr ffff88802b0051c4 by task kworker/1:1/45
[..]
Workqueue: events nf_tables_trans_destroy_work
Call Trace:
 nft_ctx_update include/net/netfilter/nf_tables.h:1831 [inline]
 nft_commit_release net/netfilter/nf_tables_api.c:9530 [inline]
 nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597

Problem is that the notifier does a conditional flush, but its possible
that the table-to-be-removed is still referenced by transactions being
processed by the worker, so we need to flush unconditionally.

We could make the flush_work depend on whether we found a table to delete
in nf-next to avoid the flush for most cases.

AFAICS this problem is only exposed in nf-next, with
commit e169285f8c56 ("netfilter: nf_tables: do not store nft_ctx in transaction objects"),
with this commit applied there is an unconditional fetch of
table->family which is whats triggering the above splat.

Fixes: 2c9f0293280e ("netfilter: nf_tables: flush pending destroy work before netlink notifier")
Reported-and-tested-by: syzbot+4fd66a69358fc15ae2ad@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4fd66a69358fc15ae2ad
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_tables_api.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Hillf Danton July 3, 2024, 10:35 a.m. UTC | #1
On Tue,  2 Jul 2024 16:08:14 +0200 Florian Westphal <fw@strlen.de>
> syzbot reports:
> 
> KASAN: slab-uaf in nft_ctx_update include/net/netfilter/nf_tables.h:1831
> KASAN: slab-uaf in nft_commit_release net/netfilter/nf_tables_api.c:9530
> KASAN: slab-uaf int nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597
> Read of size 2 at addr ffff88802b0051c4 by task kworker/1:1/45
> [..]
> Workqueue: events nf_tables_trans_destroy_work
> Call Trace:
>  nft_ctx_update include/net/netfilter/nf_tables.h:1831 [inline]
>  nft_commit_release net/netfilter/nf_tables_api.c:9530 [inline]
>  nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597
> 
> Problem is that the notifier does a conditional flush, but its possible
> that the table-to-be-removed is still referenced by transactions being
> processed by the worker, so we need to flush unconditionally.
> 
Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
if trans outlives table.
Florian Westphal July 3, 2024, 10:52 a.m. UTC | #2
Hillf Danton <hdanton@sina.com> wrote:
> Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
> if trans outlives table.

trans must never outlive table.
Hillf Danton July 3, 2024, 12:09 p.m. UTC | #3
On Wed, 3 Jul 2024 12:52:15 +0200 Florian Westphal <fw@strlen.de>
> Hillf Danton <hdanton@sina.com> wrote:
> > Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
> > if trans outlives table.
> 
> trans must never outlive table.
> 
What is preventing trans from being freed after closing sock, given
trans is freed in workqueue?

	close sock
	queue work
Florian Westphal July 3, 2024, 1:01 p.m. UTC | #4
Hillf Danton <hdanton@sina.com> wrote:
> On Wed, 3 Jul 2024 12:52:15 +0200 Florian Westphal <fw@strlen.de>
> > Hillf Danton <hdanton@sina.com> wrote:
> > > Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
> > > if trans outlives table.
> > 
> > trans must never outlive table.
> > 
> What is preventing trans from being freed after closing sock, given
> trans is freed in workqueue?
> 
> 	close sock
> 	queue work

The notifier acquires the transaction mutex, locking out all other
transactions, so no further transactions requests referencing
the table can be queued.

The work queue is flushed before potentially ripping the table
out.  After this, no transactions referencing the table can exist
anymore; the only transactions than can still be queued are those
coming from a different netns, and tables are scoped per netns.

Table is torn down.  Transaction mutex is released.

Next transaction from userspace can't find the table anymore (its gone),
so no more transactions can be queued for this table.

As I wrote in the commit message, the flush is dumb, this should first
walk to see if there is a matching table to be torn down, and then flush
work queue once before tearing the table down.

But its better to clearly split bug fix and such a change.
Hillf Danton July 4, 2024, 10:35 a.m. UTC | #5
On Wed, 3 Jul 2024 15:01:07 +0200 Florian Westphal <fw@strlen.de>
> Hillf Danton <hdanton@sina.com> wrote:
> > On Wed, 3 Jul 2024 12:52:15 +0200 Florian Westphal <fw@strlen.de>
> > > Hillf Danton <hdanton@sina.com> wrote:
> > > > Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
> > > > if trans outlives table.
> > > 
> > > trans must never outlive table.
> > > 
> > What is preventing trans from being freed after closing sock, given
> > trans is freed in workqueue?
> > 
> > 	close sock
> > 	queue work
> 
> The notifier acquires the transaction mutex, locking out all other
> transactions, so no further transactions requests referencing
> the table can be queued.
> 
As per the syzbot report, trans->table could be instantiated before
notifier acquires the transaction mutex. And in fact the lock helps
trans outlive table even with your patch.

	cpu1			cpu2
	---			---
	transB->table = A
				lock trans mutex
				flush work
				free A
				unlock trans mutex

	queue work to free transB

> The work queue is flushed before potentially ripping the table
> out.  After this, no transactions referencing the table can exist
> anymore; the only transactions than can still be queued are those
> coming from a different netns, and tables are scoped per netns.
> 
> Table is torn down.  Transaction mutex is released.
> 
> Next transaction from userspace can't find the table anymore (its gone),
> so no more transactions can be queued for this table.
> 
> As I wrote in the commit message, the flush is dumb, this should first
> walk to see if there is a matching table to be torn down, and then flush
> work queue once before tearing the table down.
> 
> But its better to clearly split bug fix and such a change.
Florian Westphal July 4, 2024, 10:54 a.m. UTC | #6
Hillf Danton <hdanton@sina.com> wrote:
> On Wed, 3 Jul 2024 15:01:07 +0200 Florian Westphal <fw@strlen.de>
> > Hillf Danton <hdanton@sina.com> wrote:
> > > On Wed, 3 Jul 2024 12:52:15 +0200 Florian Westphal <fw@strlen.de>
> > > > Hillf Danton <hdanton@sina.com> wrote:
> > > > > Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
> > > > > if trans outlives table.
> > > > 
> > > > trans must never outlive table.
> > > > 
> > > What is preventing trans from being freed after closing sock, given
> > > trans is freed in workqueue?
> > > 
> > > 	close sock
> > > 	queue work
> > 
> > The notifier acquires the transaction mutex, locking out all other
> > transactions, so no further transactions requests referencing
> > the table can be queued.
> > 
> As per the syzbot report, trans->table could be instantiated before
> notifier acquires the transaction mutex. And in fact the lock helps
> trans outlive table even with your patch.
> 
> 	cpu1			cpu2
> 	---			---
> 	transB->table = A
> 				lock trans mutex
> 				flush work
> 				free A
> 				unlock trans mutex
> 
> 	queue work to free transB

Can you show a crash reproducer or explain how this assign
and queueing happens unordered wrt. cpu2?

This should look like this:

 	cpu1			cpu2
 	---			---
	lock trans mutex
  				lock trans mutex -> blocks
 	transB->table = A
  	queue work to free transB
	unlock trans mutex
				lock trans mutex returns
  				flush work
  				free A
  				unlock trans mutex
diff mbox series

Patch

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 02d75aefaa8e..683f6a4518ee 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -11552,8 +11552,7 @@  static int nft_rcv_nl_event(struct notifier_block *this, unsigned long event,
 
 	gc_seq = nft_gc_seq_begin(nft_net);
 
-	if (!list_empty(&nf_tables_destroy_list))
-		nf_tables_trans_destroy_flush_work();
+	nf_tables_trans_destroy_flush_work();
 again:
 	list_for_each_entry(table, &nft_net->tables, list) {
 		if (nft_table_has_owner(table) &&