diff mbox series

netfilter: Don't track counter updates of do_add_counters()

Message ID 20240822043609.141992-1-takakura@valinux.co.jp (mailing list archive)
State Not Applicable
Delegated to: Netdev Maintainers
Headers show
Series netfilter: Don't track counter updates of do_add_counters() | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 16 this patch: 16
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 9 of 9 maintainers
netdev/build_clang success Errors and warnings before: 16 this patch: 16
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 34 this patch: 34
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 64 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-08-23--15-00 (tests: 712)

Commit Message

takakura@valinux.co.jp Aug. 22, 2024, 4:36 a.m. UTC
From: Ryo Takakura <takakura@valinux.co.jp>

While adding counters in do_add_counters(), we call
xt_write_recseq_begin/end to indicate that counters are being updated.
Updates are being tracked so that the counters retrieved by get_counters()
will reflect concurrent updates.

However, there is no need to track the updates done by do_add_counters() as
both do_add_counters() and get_counters() acquire per ipv4,ipv6,arp mutex
beforehand which prevents concurrent update and retrieval between the two.

Moreover, as the xt_write_recseq_begin/end is shared among ipv4,ipv6,arp,
do_add_counters() called by one of ipv4,ipv6,arp can falsely delay the 
synchronization of concurrent get_counters() or xt_replace_table() called 
by any other than the one calling do_add_counters().

So remove xt_write_recseq_begin/end from do_add_counters() for ipv4,ipv6,arp.

Signed-off-by: Ryo Takakura <takakura@valinux.co.jp>
---
 net/ipv4/netfilter/arp_tables.c | 4 ----
 net/ipv4/netfilter/ip_tables.c  | 3 ---
 net/ipv6/netfilter/ip6_tables.c | 3 ---
 3 files changed, 10 deletions(-)

Comments

Eric Dumazet Aug. 22, 2024, 6:03 a.m. UTC | #1
On Thu, Aug 22, 2024 at 6:36 AM <takakura@valinux.co.jp> wrote:
>
> From: Ryo Takakura <takakura@valinux.co.jp>
>
> While adding counters in do_add_counters(), we call
> xt_write_recseq_begin/end to indicate that counters are being updated.
> Updates are being tracked so that the counters retrieved by get_counters()
> will reflect concurrent updates.
>
> However, there is no need to track the updates done by do_add_counters() as
> both do_add_counters() and get_counters() acquire per ipv4,ipv6,arp mutex
> beforehand which prevents concurrent update and retrieval between the two.
>
> Moreover, as the xt_write_recseq_begin/end is shared among ipv4,ipv6,arp,
> do_add_counters() called by one of ipv4,ipv6,arp can falsely delay the
> synchronization of concurrent get_counters() or xt_replace_table() called
> by any other than the one calling do_add_counters().
>
> So remove xt_write_recseq_begin/end from do_add_counters() for ipv4,ipv6,arp.

Completely wrong patch.

There is no way we can update pairs of 64bit counters without any
synchronization.

This is per cpu sequence, the 'shared among ipv4,ipv6,arp' part is moot.

We could use cmpxchg128 on 64bit arches, but I suspect there will be
no improvement.
takakura@valinux.co.jp Aug. 22, 2024, 10:37 a.m. UTC | #2
Hi Eric,

Thanks for taking a look at the patch! And sorry that I see that I was 
missing the point of the synchronization.

On 2024-08-22 6:03 Eric Dumazet wrote:
>On Thu, Aug 22, 2024 at 6:36 AM <takakura@valinux.co.jp> wrote:
>>
>> From: Ryo Takakura <takakura@valinux.co.jp>
>>
>> While adding counters in do_add_counters(), we call
>> xt_write_recseq_begin/end to indicate that counters are being updated.
>> Updates are being tracked so that the counters retrieved by get_counters()
>> will reflect concurrent updates.
>>
>> However, there is no need to track the updates done by do_add_counters() as
>> both do_add_counters() and get_counters() acquire per ipv4,ipv6,arp mutex
>> beforehand which prevents concurrent update and retrieval between the two.
>>
>> Moreover, as the xt_write_recseq_begin/end is shared among ipv4,ipv6,arp,
>> do_add_counters() called by one of ipv4,ipv6,arp can falsely delay the
>> synchronization of concurrent get_counters() or xt_replace_table() called
>> by any other than the one calling do_add_counters().
>>
>> So remove xt_write_recseq_begin/end from do_add_counters() for ipv4,ipv6,arp.
>
>Completely wrong patch.
>
>There is no way we can update pairs of 64bit counters without any
>synchronization.

Yes, I was completely wrong about why the synchronization is required...

>This is per cpu sequence, the 'shared among ipv4,ipv6,arp' part is moot.
>
>We could use cmpxchg128 on 64bit arches, but I suspect there will be
>no improvement.

I see. And if we were to use cmpxchg128, we would also need to come up with 
the way for xt_replace_table()'s synchronization which I guess the current 
per cpu sequence is more suited.

Sincerely,
Ryo Takakura
diff mbox series

Patch

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 14365b20f1c5..20de048d3e0c 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -1009,7 +1009,6 @@  static int do_add_counters(struct net *net, sockptr_t arg, unsigned int len)
 	const struct xt_table_info *private;
 	int ret = 0;
 	struct arpt_entry *iter;
-	unsigned int addend;
 
 	paddc = xt_copy_counters(arg, len, &tmp);
 	if (IS_ERR(paddc))
@@ -1029,8 +1028,6 @@  static int do_add_counters(struct net *net, sockptr_t arg, unsigned int len)
 	}
 
 	i = 0;
-
-	addend = xt_write_recseq_begin();
 	xt_entry_foreach(iter,  private->entries, private->size) {
 		struct xt_counters *tmp;
 
@@ -1038,7 +1035,6 @@  static int do_add_counters(struct net *net, sockptr_t arg, unsigned int len)
 		ADD_COUNTER(*tmp, paddc[i].bcnt, paddc[i].pcnt);
 		++i;
 	}
-	xt_write_recseq_end(addend);
  unlock_up_free:
 	local_bh_enable();
 	xt_table_unlock(t);
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index fe89a056eb06..f54dea2a8fcd 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -1162,7 +1162,6 @@  do_add_counters(struct net *net, sockptr_t arg, unsigned int len)
 	const struct xt_table_info *private;
 	int ret = 0;
 	struct ipt_entry *iter;
-	unsigned int addend;
 
 	paddc = xt_copy_counters(arg, len, &tmp);
 	if (IS_ERR(paddc))
@@ -1182,7 +1181,6 @@  do_add_counters(struct net *net, sockptr_t arg, unsigned int len)
 	}
 
 	i = 0;
-	addend = xt_write_recseq_begin();
 	xt_entry_foreach(iter, private->entries, private->size) {
 		struct xt_counters *tmp;
 
@@ -1190,7 +1188,6 @@  do_add_counters(struct net *net, sockptr_t arg, unsigned int len)
 		ADD_COUNTER(*tmp, paddc[i].bcnt, paddc[i].pcnt);
 		++i;
 	}
-	xt_write_recseq_end(addend);
  unlock_up_free:
 	local_bh_enable();
 	xt_table_unlock(t);
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 131f7bb2110d..f1d3bb74eb16 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -1179,7 +1179,6 @@  do_add_counters(struct net *net, sockptr_t arg, unsigned int len)
 	const struct xt_table_info *private;
 	int ret = 0;
 	struct ip6t_entry *iter;
-	unsigned int addend;
 
 	paddc = xt_copy_counters(arg, len, &tmp);
 	if (IS_ERR(paddc))
@@ -1198,7 +1197,6 @@  do_add_counters(struct net *net, sockptr_t arg, unsigned int len)
 	}
 
 	i = 0;
-	addend = xt_write_recseq_begin();
 	xt_entry_foreach(iter, private->entries, private->size) {
 		struct xt_counters *tmp;
 
@@ -1206,7 +1204,6 @@  do_add_counters(struct net *net, sockptr_t arg, unsigned int len)
 		ADD_COUNTER(*tmp, paddc[i].bcnt, paddc[i].pcnt);
 		++i;
 	}
-	xt_write_recseq_end(addend);
  unlock_up_free:
 	local_bh_enable();
 	xt_table_unlock(t);