diff mbox

mac80211 drops packet with old IV after rekeying - workaround patch for CCMP

Message ID 555CF4C2.7040002@web.de (mailing list archive)
State Rejected
Delegated to: Johannes Berg
Headers show

Commit Message

Alexander Wetzel May 20, 2015, 8:55 p.m. UTC
I've verified that turning off hardware encryption on the AP and the STA
is indeed preventing the issue.
As soon as one of them is using the hardware encryption I can trigger
the problem. (In my setup it seems to be mostly caused by the AP, since
I needed sometimes as much as three rekeys to get the freeze when the AP
was using Software and the STA hardware encryption.)

So confident that we finally found the root of the evil I tried to write
some code catching the races, see the attachment.

It's probably not the best fix, but the only one I could think of and
deploy myself with the knowledge I gathered here and the last days.

It's only for CCMP for now and I've not checked on assumptions what some
parts of the code are for.
This is just "works for me". (It survived now 30 rekeys under my test
load, when previously nearly every rekey did freeze it.)

I think there is no need any longer to generate captures, but if you
want them anyway I can of course still work on that.


The decision to check if the pn is <= 10 as indicator to switch over to
the "correct" pn counter is at best questionable. (Using 1 does not
work. I have problems with the inital key here and the code was not
switching over to the correct rx_pn counter and I expect that we also
may lose some frames here.) I'm not even sure that it's safe to assume
that pn is always starting at one, since wpa_supplicant is dumping some
sequence number on rekey I would have assumed to be a more or less
random start value. But in the debug prints the real value is always
starting at one. So I'm using that for now.

I'll now try that in my environment, keeping the insane low rekey
interval at two minutes for at least some weeks.

What was really surprising me here is, that this is such a generic issue
and I'm finding that in my home environment. For my understanding that
should break many (all?) EAP Wlan's. (I'm using EAP-TLS and that did
make the WLAN basically unusable and any sane person would have switched
back to PSK...)

For completeness, here is the original openwrt bug report I opened:
https://dev.openwrt.org/ticket/18966


Alexander

Comments

Johannes Berg May 21, 2015, 7:11 a.m. UTC | #1
On Wed, 2015-05-20 at 22:55 +0200, Alexander Wetzel wrote:
> I've verified that turning off hardware encryption on the AP and the STA
> is indeed preventing the issue.
> As soon as one of them is using the hardware encryption I can trigger
> the problem. (In my setup it seems to be mostly caused by the AP, since
> I needed sometimes as much as three rekeys to get the freeze when the AP
> was using Software and the STA hardware encryption.)

Right, I did identify cases where both sides can have issues. I'm not
surprised that the AP-side issue is more likely.

> So confident that we finally found the root of the evil I tried to write
> some code catching the races, see the attachment.
> 
> It's probably not the best fix, but the only one I could think of and
> deploy myself with the knowledge I gathered here and the last days.

Your patch breaks the security properties of this code, so we cannot use
it :-)

> What was really surprising me here is, that this is such a generic issue
> and I'm finding that in my home environment. For my understanding that
> should break many (all?) EAP Wlan's. (I'm using EAP-TLS and that did
> make the WLAN basically unusable and any sane person would have switched
> back to PSK...)

Well, I think it's a matter of probabilities.

First of all, the AP bug seems to be more likely to cause an issue, so
anyone who deployed EAP-TLS with non-broken APs is far better off than
you are. Secondly, you really can only run into this while you do
rekeying in heavy traffic, so in production environments with large
rekey intervals it doesn't matter as much again. And then I guess the
windows driver reconnects on PTK rekey request, so there you wouldn't
see it either ... as a consequence the number of affected people must be
pretty low :)

johannes

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff -ur linux-4.0.4-gentoo/net/mac80211/key.h linux-4.0.4-gentoo_patched/net/mac80211/key.h
--- linux-4.0.4-gentoo/net/mac80211/key.h	2015-04-13 00:12:50.000000000 +0200
+++ linux-4.0.4-gentoo_patched/net/mac80211/key.h	2015-05-20 13:16:06.370256697 +0200
@@ -84,12 +84,14 @@ 
 			 * Management frames.
 			 */
 			u8 rx_pn[IEEE80211_NUM_TIDS + 1][IEEE80211_CCMP_PN_LEN];
+			u8 rx_pn_old[IEEE80211_NUM_TIDS + 1][IEEE80211_CCMP_PN_LEN];
 			struct crypto_aead *tfm;
 			u32 replays; /* dot11RSNAStatsCCMPReplays */
 		} ccmp;
 		struct {
 			atomic64_t tx_pn;
 			u8 rx_pn[IEEE80211_CMAC_PN_LEN];
+			u8 rx_pn_old[IEEE80211_CMAC_PN_LEN];
 			struct crypto_cipher *tfm;
 			u32 replays; /* dot11RSNAStatsCMACReplays */
 			u32 icverrors; /* dot11RSNAStatsCMACICVErrors */
@@ -97,6 +99,7 @@ 
 		struct {
 			atomic64_t tx_pn;
 			u8 rx_pn[IEEE80211_GMAC_PN_LEN];
+			u8 rx_pn_old[IEEE80211_GMAC_PN_LEN];
 			struct crypto_aead *tfm;
 			u32 replays; /* dot11RSNAStatsCMACReplays */
 			u32 icverrors; /* dot11RSNAStatsCMACICVErrors */
@@ -109,12 +112,14 @@ 
 			 * Management frames.
 			 */
 			u8 rx_pn[IEEE80211_NUM_TIDS + 1][IEEE80211_GCMP_PN_LEN];
+			u8 rx_pn_old[IEEE80211_NUM_TIDS + 1][IEEE80211_GCMP_PN_LEN];
 			struct crypto_aead *tfm;
 			u32 replays; /* dot11RSNAStatsGCMPReplays */
 		} gcmp;
 		struct {
 			/* generic cipher scheme */
 			u8 rx_pn[IEEE80211_NUM_TIDS + 1][MAX_PN_LEN];
+			u8 rx_pn_old[IEEE80211_NUM_TIDS + 1][MAX_PN_LEN];
 		} gen;
 	} u;
 
diff -ur linux-4.0.4-gentoo/net/mac80211/wpa.c linux-4.0.4-gentoo_patched/net/mac80211/wpa.c
--- linux-4.0.4-gentoo/net/mac80211/wpa.c	2015-04-13 00:12:50.000000000 +0200
+++ linux-4.0.4-gentoo_patched/net/mac80211/wpa.c	2015-05-20 21:43:25.529721066 +0200
@@ -495,6 +495,7 @@ 
 	struct sk_buff *skb = rx->skb;
 	struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb);
 	u8 pn[IEEE80211_CCMP_PN_LEN];
+	static u8 zero[IEEE80211_CCMP_PN_LEN];
 	int data_len;
 	int queue;
 
@@ -525,6 +526,31 @@ 
 		return RX_DROP_UNUSABLE;
 	}
 
+	/* HACK: try to work around race when replacing PSK with enabled hardware offload on AP or STA */
+	if (unlikely(memcmp(key->u.ccmp.rx_pn[queue], zero, IEEE80211_CCMP_PN_LEN) == 0 )) {
+
+		printk ("DDD - rx_pn is zero, virgin key detected! pl=%x\n", pn[IEEE80211_CCMP_PN_LEN-1]);
+		print_hex_dump_debug("cnt: ", DUMP_PREFIX_OFFSET, IEEE80211_CCMP_PN_LEN, 1, key->u.ccmp.rx_pn[queue], IEEE80211_CCMP_PN_LEN, false);
+		print_hex_dump_debug("pn : ", DUMP_PREFIX_OFFSET, IEEE80211_CCMP_PN_LEN, 1, pn, IEEE80211_CCMP_PN_LEN, false);
+
+		if ((memcmp(pn, zero, IEEE80211_CCMP_PN_LEN -1) == 0) && pn[IEEE80211_CCMP_PN_LEN-1] <= 10) {
+			/* pn is <=10 , we can start using the new counter */
+			printk ("DDD - set new pn\n");
+			memcpy(key->u.ccmp.rx_pn[queue], pn, IEEE80211_CCMP_PN_LEN);
+		} else if (memcmp(pn, key->u.ccmp.rx_pn_old[queue], IEEE80211_CCMP_PN_LEN) <= 0) {
+			printk ("DDD - attack!\n");
+			/* reply attack during rekey operation, guess it will really hard to do that... */
+			key->u.ccmp.replays++; /* we count it as an reply anyway...*/
+			return RX_DROP_UNUSABLE; 
+		} else {	
+			printk ("DDD - prevent poisening \n");
+			memcpy(key->u.ccmp.rx_pn_old[queue], pn, IEEE80211_CCMP_PN_LEN);
+		}
+	} else {
+			
+		memcpy(key->u.ccmp.rx_pn[queue], pn, IEEE80211_CCMP_PN_LEN);
+	}
+
 	if (!(status->flag & RX_FLAG_DECRYPTED)) {
 		u8 aad[2 * AES_BLOCK_SIZE];
 		u8 b_0[AES_BLOCK_SIZE];
@@ -539,8 +565,6 @@ 
 			return RX_DROP_UNUSABLE;
 	}
 
-	memcpy(key->u.ccmp.rx_pn[queue], pn, IEEE80211_CCMP_PN_LEN);
-
 	/* Remove CCMP header and MIC */
 	if (pskb_trim(skb, skb->len - mic_len))
 		return RX_DROP_UNUSABLE;