| Message ID | 20210829221615.2057201-3-eric.dumazet@gmail.com (mailing list archive) |
| --- | --- |
| State | Accepted |
| Delegated to | Netdev Maintainers |
| Series | inet: make exception handling less predictible |
| Context | Check | Description |
| --- | --- | --- |
| netdev/cover_letter | success | Link |
| netdev/fixes_present | success | Link |
| netdev/patch_count | success | Link |
| netdev/tree_selection | success | Clearly marked for net |
| netdev/subject_prefix | success | Link |
| netdev/cc_maintainers | warning | 1 maintainers not CCed: yoshfuji@linux-ipv6.org |
| netdev/source_inline | success | Was 0 now: 0 |
| netdev/verify_signedoff | success | Link |
| netdev/module_param | success | Was 0 now: 0 |
| netdev/build_32bit | success | Errors and warnings before: 5 this patch: 5 |
| netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
| netdev/verify_fixes | success | Link |
| netdev/checkpatch | warning | WARNING: 'wont' may be misspelled - perhaps 'won't'? |
| netdev/build_allmodconfig_warn | success | Errors and warnings before: 5 this patch: 5 |
| netdev/header_inline | success | Link |
On 8/29/21 3:16 PM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Even after commit 6457378fe796 ("ipv4: use siphash instead of Jenkins in
> fnhe_hashfun()"), an attacker can still use brute force to learn
> some secrets from a victim Linux host.
>
> One way to defeat these attacks is to make the max depth of the hash
> table bucket a random value.
>
> Before this patch, each bucket of the hash table used to store exceptions
> could contain 6 items under attack.
>
> After the patch, each bucket would contain a random number of items,
> between 6 and 10. The attacker can no longer infer secrets.
>
> This slightly increases the memory used by the hash table (by 50% on
> average); we do not expect this to be a problem.
>
> This patch is more complex than the prior one (the IPv6 equivalent),
> because IPv4 was reusing the oldest entry. Since we need to be able
> to evict more than one entry per update_or_create_fnhe() call, I had
> to replace fnhe_oldest() with fnhe_remove_oldest().
>
> Also note that we will queue extra kfree_rcu() calls under stress,
> which hopefully won't be too big an issue.
>
> Fixes: 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions.")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Keyu Man <kman001@ucr.edu>
> Cc: Willy Tarreau <w@1wt.eu>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
>  net/ipv4/route.c | 44 +++++++++++++++++++++++++++++---------------
>  1 file changed, 29 insertions(+), 15 deletions(-)

Reviewed-by: David Ahern <dsahern@kernel.org>
Tested-by: David Ahern <dsahern@kernel.org>
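To make the eviction behavior concrete, here is a minimal userspace C sketch of the randomized bucket-depth idea, assuming a simplified version of the data structures in net/ipv4/route.c: plain `rand()` stands in for the kernel's `prandom_u32_max()`, and RCU protection and `fnhe_lock` are omitted entirely. It is an illustration of the technique, not kernel code.

```c
/* Sketch of randomized bucket-depth eviction (assumptions: simplified
 * from net/ipv4/route.c; rand() stands in for prandom_u32_max(); no
 * RCU, no locking). */
#include <stdio.h>
#include <stdlib.h>

#define FNHE_RECLAIM_DEPTH 5	/* same constant the kernel uses */

struct fnhe {
	struct fnhe *next;
	unsigned long stamp;	/* smaller stamp == older entry */
};

static struct fnhe *chain;	/* one hash-table bucket */

static void remove_oldest(struct fnhe **head)
{
	/* Walk with a pointer-to-pointer so the oldest entry can be
	 * unlinked in place, mirroring fnhe_remove_oldest(). */
	struct fnhe **p, **oldest_p = head;
	struct fnhe *victim;

	for (p = head; *p; p = &(*p)->next)
		if ((*p)->stamp < (*oldest_p)->stamp)
			oldest_p = p;
	victim = *oldest_p;
	*oldest_p = victim->next;
	free(victim);		/* the kernel defers this via kfree_rcu() */
}

static void insert(unsigned long stamp)
{
	unsigned int depth = 0;
	struct fnhe *f;

	for (f = chain; f; f = f->next)
		depth++;

	/* Threshold is 5..9 instead of a fixed 5, so a bucket ends up
	 * holding 6..10 entries and an attacker can no longer predict
	 * exactly when eviction fires. */
	unsigned int max_depth = FNHE_RECLAIM_DEPTH +
				 (unsigned int)rand() % FNHE_RECLAIM_DEPTH;

	while (depth > max_depth) {
		remove_oldest(&chain);
		depth--;
	}

	f = malloc(sizeof(*f));
	if (!f)
		return;
	f->stamp = stamp;
	f->next = chain;
	chain = f;
}

int main(void)
{
	unsigned long t;
	unsigned int depth = 0;
	struct fnhe *f;

	for (t = 0; t < 100; t++)
		insert(t);
	for (f = chain; f; f = f->next)
		depth++;
	printf("bucket depth after 100 inserts: %u (expected 6..10)\n", depth);
	return 0;
}
```

Seeding is left out deliberately; the point is only that the threshold is redrawn on every insertion, so the observable bucket size no longer leaks a fixed boundary.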
```diff
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index a6f20ee3533554b210d27c4ab6637ca7a05b148b..225714b5efc0b9c6bcd2d58a62d4656cdc5a1cde 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -586,18 +586,25 @@ static void fnhe_flush_routes(struct fib_nh_exception *fnhe)
 	}
 }
 
-static struct fib_nh_exception *fnhe_oldest(struct fnhe_hash_bucket *hash)
+static void fnhe_remove_oldest(struct fnhe_hash_bucket *hash)
 {
-	struct fib_nh_exception *fnhe, *oldest;
+	struct fib_nh_exception __rcu **fnhe_p, **oldest_p;
+	struct fib_nh_exception *fnhe, *oldest = NULL;
 
-	oldest = rcu_dereference(hash->chain);
-	for (fnhe = rcu_dereference(oldest->fnhe_next); fnhe;
-	     fnhe = rcu_dereference(fnhe->fnhe_next)) {
-		if (time_before(fnhe->fnhe_stamp, oldest->fnhe_stamp))
+	for (fnhe_p = &hash->chain; ; fnhe_p = &fnhe->fnhe_next) {
+		fnhe = rcu_dereference_protected(*fnhe_p,
+						 lockdep_is_held(&fnhe_lock));
+		if (!fnhe)
+			break;
+		if (!oldest ||
+		    time_before(fnhe->fnhe_stamp, oldest->fnhe_stamp)) {
 			oldest = fnhe;
+			oldest_p = fnhe_p;
+		}
 	}
 	fnhe_flush_routes(oldest);
-	return oldest;
+	*oldest_p = oldest->fnhe_next;
+	kfree_rcu(oldest, rcu);
 }
 
 static u32 fnhe_hashfun(__be32 daddr)
@@ -676,16 +683,21 @@ static void update_or_create_fnhe(struct fib_nh_common *nhc, __be32 daddr,
 		if (rt)
 			fill_route_from_fnhe(rt, fnhe);
 	} else {
-		if (depth > FNHE_RECLAIM_DEPTH)
-			fnhe = fnhe_oldest(hash);
-		else {
-			fnhe = kzalloc(sizeof(*fnhe), GFP_ATOMIC);
-			if (!fnhe)
-				goto out_unlock;
-
-			fnhe->fnhe_next = hash->chain;
-			rcu_assign_pointer(hash->chain, fnhe);
+		/* Randomize max depth to avoid some side channels attacks. */
+		int max_depth = FNHE_RECLAIM_DEPTH +
+				prandom_u32_max(FNHE_RECLAIM_DEPTH);
+
+		while (depth > max_depth) {
+			fnhe_remove_oldest(hash);
+			depth--;
 		}
+
+		fnhe = kzalloc(sizeof(*fnhe), GFP_ATOMIC);
+		if (!fnhe)
+			goto out_unlock;
+
+		fnhe->fnhe_next = hash->chain;
+
 		fnhe->fnhe_genid = genid;
 		fnhe->fnhe_daddr = daddr;
 		fnhe->fnhe_gw = gw;
@@ -693,6 +705,8 @@ static void update_or_create_fnhe(struct fib_nh_common *nhc, __be32 daddr,
 		fnhe->fnhe_mtu_locked = lock;
 		fnhe->fnhe_expires = max(1UL, expires);
 
+		rcu_assign_pointer(hash->chain, fnhe);
+
 		/* Exception created; mark the cached routes for the nexthop
 		 * stale, so anyone caching it rechecks if this exception
 		 * applies to them.
```
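A detail worth noting in the last hunk: `rcu_assign_pointer(hash->chain, fnhe)` now runs only after every field of the new exception has been written, whereas the old code published the entry first and filled it in afterwards. That is the canonical RCU publish pattern. The snippet below is an illustrative condensation of the new insertion path, not an exact quote of the patch:

```c
/* Illustrative condensation of the insertion path after this patch:
 * fully initialize the private object, then publish it. */
fnhe = kzalloc(sizeof(*fnhe), GFP_ATOMIC);
if (!fnhe)
	goto out_unlock;

fnhe->fnhe_next = hash->chain;		/* still invisible to readers */
fnhe->fnhe_genid = genid;
fnhe->fnhe_daddr = daddr;
/* ... remaining fields (gw, pmtu, expires, ...) ... */

/* rcu_assign_pointer() orders the stores above before the pointer
 * update, so a concurrent rcu_dereference(hash->chain) can never
 * observe a half-initialized exception. */
rcu_assign_pointer(hash->chain, fnhe);
```

Readers walking the chain under `rcu_read_lock()` therefore see either the old list or the fully formed new head, never an intermediate state.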