Message ID | 20220428124001.7428-4-w@1wt.eu (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | insufficient TCP source port randomness | expand |
Hi Eric, On Thu, Apr 28, 2022 at 02:39:57PM +0200, Willy Tarreau wrote: > From: Eric Dumazet <edumazet@google.com> > > In order to limit the ability for an observer to recognize the source > ports sequence used to contact a set of destinations, we should > periodically shuffle the secret. 10 seconds looks effective enough Nit: "periodically re-salt the input". > without causing particular issues. Just FYI, moving from siphash_3u32 to siphash_4u32 is not free, as it bumps us up from siphash_3u32 to siphash_2u64, which does two more siphash rounds. Maybe this doesn't matter much, but just FYI. I wonder, though, about your "10 seconds looks effective enough without causing particular issues." I surmise from that sentence that a lower value might cause particular issues, but that you found 10 seconds to be okay in practice. Fine. But what happens if one caller hits this at second 9 and the next caller hits it at second 0? In that case, the interval might have been 1 second, not 10. In other words, if you need a certain minimum quantization for this to not cause "particular issues", it might not work the way you wanted it to. Additionally, that problem aside, if you round EPHEMERAL_PORT_SHUFFLE_PERIOD to the nearest power of two, you can turn the expensive division into a bit shift right. Regards, Jason
On Thu, Apr 28, 2022 at 2:40 PM Willy Tarreau <w@1wt.eu> wrote: > @@ -101,10 +103,12 @@ u64 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr, > struct in6_addr saddr; > struct in6_addr daddr; > __be16 dport; > + unsigned int timeseed; Also, does the struct packing (or lack thereof) lead to problems here? Uninitialized bytes might not make a stable hash.
Hi Jason, On Fri, Apr 29, 2022 at 04:37:12PM +0200, Jason A. Donenfeld wrote: > Hi Eric, > > On Thu, Apr 28, 2022 at 02:39:57PM +0200, Willy Tarreau wrote: > > From: Eric Dumazet <edumazet@google.com> > > > > In order to limit the ability for an observer to recognize the source > > ports sequence used to contact a set of destinations, we should > > periodically shuffle the secret. 10 seconds looks effective enough > Nit: "periodically re-salt the input". > > without causing particular issues. > > Just FYI, moving from siphash_3u32 to siphash_4u32 is not free, as it > bumps us up from siphash_3u32 to siphash_2u64, which does two more > siphash rounds. Maybe this doesn't matter much, but just FYI. > > I wonder, though, about your "10 seconds looks effective enough without > causing particular issues." I surmise from that sentence that a lower > value might cause particular issues, but that you found 10 seconds to be > okay in practice. Fine. But what happens if one caller hits this at > second 9 and the next caller hits it at second 0? In that case, the > interval might have been 1 second, not 10. In other words, if you need > a certain minimum quantization for this to not cause "particular > issues", it might not work the way you wanted it to. I can partially respond on this one because as you've probably seen at first I had some reserves on this one, which vanished. The problem caused by reusing a source port too early almost essentially concerns clients closing first, either with a regular shutdown() where a FIN is sent first, or by forcing an RST by disabling lingering / breaking the association with connect(AF_UNSPEC). In both cases, the risk is that the packet supposed to finish to close the connection (e.g. last ACK that turns the socket to TIME_WAIT or the RST) is lost, releasing the local entry too early and making the client send a SYN from a port that's still seen as in use on the other side. For local TIME_WAIT, there's no issue in practice because 1) the port is not yet usable, and 2) clients closing first cannot sustain high connection rates anyway, they're limited to #ports/60s. For those dealing with high connection rates (e.g. proxies) which seldom have to abort connections, they have no other choice but closing using an RST to avoid blocking the port in TIME_WAIT. These ones already face the risk that the RST is lost on the path and that a subsequent SYN from the same port is delivered to a connection that's in ESTABLISHED or FIN_WAIT state anyway. This *does* happen due to losses, albeit at quite a low rate in general. In such cases, the server returns an out-of-window ACK, the client sends an RST and a SYN again (which roughly looks like a retransmit as the RST). Such cases may occasionally fail through firewalls if the RST is lost between the firewall and the server. The firewall will then block the response ACK from the server and the client will have to time out on the SYN and try again. This is the current situation even without the patch (hence why closing with RST must be avoided as much as possible). The change of the salting here can slightly increase the risk that this happens during the transition from the 9th to the 10th second as you said, and only for the case above. However this will only be a problem if the SYN arrives before the RST or if the RST is lost. The second case already exists and the difference is that before the patch, we could have hoped that the other side would finally have closed on timeout before sending a new SYN (which is already not granted, e.g. ESTABLISHED) or unlikely to happen above #ports/60s connections per second (e.g. FIN_WAIT with a longer timeout). The first case (SYN arrives before RST) may happen on randomly load-balanced links (where TCP already doesn't work well), and while this does increase the risk, the risk only concerns just a few SYN packets every 10s which each have a chance over #port of reusing the same port as the last one, or ~#port/N of reusing the same port as one of the last N ones for small values of N. This remains extremely low and is not higher than the risk of losing the RST before the patch anyway. As such all these risks are very low but they increase as the timer decreases. This would work well at 60s and could probably still work well enough at 1s. But 10s is enough to make Amit's attack impractical, so better not risk to degrade anything. And the tests I ran showed only 2 failed connections after 13 billion closed by the client, which is perfectly within what I'm used to observe in such cases. > Additionally, that problem aside, if you round EPHEMERAL_PORT_SHUFFLE_PERIOD > to the nearest power of two, you can turn the expensive division into a > bit shift right. I've been wondering if it would help to use 8 or 16 here, but that doesn't change anything, it's a constant so it's converted to a reciprocal divide by the compiler, i.e. just a 32-bit multiply and a shift. Thanks! Willy
On Fri, Apr 29, 2022 at 04:48:52PM +0200, Jason A. Donenfeld wrote: > On Thu, Apr 28, 2022 at 2:40 PM Willy Tarreau <w@1wt.eu> wrote: > > @@ -101,10 +103,12 @@ u64 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr, > > struct in6_addr saddr; > > struct in6_addr daddr; > > __be16 dport; > > + unsigned int timeseed; > > Also, does the struct packing (or lack thereof) lead to problems here? > Uninitialized bytes might not make a stable hash. Hmmm, I didn't notice, and I think you're right indeed. I did test in IPv6 without noticing any problem but it doesn't mean that the hash is perfectly stable. I'll send an update for this one, thank you! Willy
diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c index 55aa5cc258e3..2c76aab20403 100644 --- a/net/core/secure_seq.c +++ b/net/core/secure_seq.c @@ -22,6 +22,8 @@ static siphash_aligned_key_t net_secret; static siphash_aligned_key_t ts_secret; +#define EPHEMERAL_PORT_SHUFFLE_PERIOD (10 * HZ) + static __always_inline void net_secret_init(void) { net_get_random_once(&net_secret, sizeof(net_secret)); @@ -101,10 +103,12 @@ u64 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr, struct in6_addr saddr; struct in6_addr daddr; __be16 dport; + unsigned int timeseed; } __aligned(SIPHASH_ALIGNMENT) combined = { .saddr = *(struct in6_addr *)saddr, .daddr = *(struct in6_addr *)daddr, - .dport = dport + .dport = dport, + .timeseed = jiffies / EPHEMERAL_PORT_SHUFFLE_PERIOD, }; net_secret_init(); return siphash(&combined, offsetofend(typeof(combined), dport), @@ -145,8 +149,10 @@ EXPORT_SYMBOL_GPL(secure_tcp_seq); u64 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport) { net_secret_init(); - return siphash_3u32((__force u32)saddr, (__force u32)daddr, - (__force u16)dport, &net_secret); + return siphash_4u32((__force u32)saddr, (__force u32)daddr, + (__force u16)dport, + jiffies / EPHEMERAL_PORT_SHUFFLE_PERIOD, + &net_secret); } EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral); #endif