diff mbox series

[RFC,v3,05/12] mac80211: Mark A-MPDU keyid borders for drivers

Message ID 20190210210620.31181-6-alexander@wetzel-home.de (mailing list archive)
State RFC
Delegated to: Johannes Berg
Headers show
Series Draft for Extended Key ID support | expand

Commit Message

Alexander Wetzel Feb. 10, 2019, 9:06 p.m. UTC
IEEE 802.11-2016 "9.7.3 A-MPDU contents" forbids aggregating MPDUs with
different keyids.
Extended Key ID support breaks the assumption that all unicast MPDUs for
one station can always be aggregated.

Inform the drivers about a keyid border by dropping the
@IEEE80211_TX_CTL_AMPDU flag for the last MPDU using the current keyid.

Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
---

This patch here is totally unnecessary when we decide that IEEE 802.11 -
2016 is not meaning what is referenced in the commit message above:-)
The exact wording in the standard is:
"All protected MPDUs within an A-MPDU have the same Key ID."

The intent of the wording was probably written without considering
Extended Key IDs. At least it makes no sense for me to forbid mixing
MPDUs using keyid 0 and 1 in one A-MPDU. Nevertheless it says what it
says and there may now be cards/drivers depending on that and e.g. only
check it for the first MPDU. The lost MPDUs would then be our fault, for
aggregating together what according to the standard must not. 

But even with that view we still don't need the patch:
A-MPDU aggregation is the job for the driver. We simply can offload the
problem to the drivers.

On the other side mac80211 is in what I consider a better position to
determine and mark the MPDU keyid border, saving the driver the effort
to parse the MPDUs again and keep track of the "current" keyid. It also
allows the driver to terminate a running A-MPDU frame when it gets the
last packet for the old key instead one frame later.

To get around what looked like a nightmare of synchronisation problems
this patch puts the Tx key switch into the Tx path. Handling idle
connections when we don't want to accept that the first packet of a
rekey may still use the key-before-current makes it more complex. 

The code is assuming that the driver is not aggregating MPDUs more than
5s apart. We probably don't have wait nearly so long but I'm not sure
what is the minimum time.

The patch also brought up a interesting problem: We are out of sta_info
flags and skb CB also has no room left to add new ones. I worked around
that by redefining how IEEE80211_TX_CTL_AMPDU has to be handled for
Extended Key ID A-MPDU sessions. If you think that puts too much stain
on the API I would need a pointer what would be considered an acceptable
solution to the problem.

But I also fully understand when you think this patch goes too far and
want it thrown out and want it handled somehow else:-).


 net/mac80211/key.c      | 10 +++++--
 net/mac80211/key.h      |  1 +
 net/mac80211/sta_info.c |  1 +
 net/mac80211/sta_info.h |  2 ++
 net/mac80211/tx.c       | 64 +++++++++++++++++++++++++++++++++++------
 5 files changed, 68 insertions(+), 10 deletions(-)

Comments

Johannes Berg Feb. 15, 2019, 11:50 a.m. UTC | #1
> The intent of the wording was probably written without considering
> Extended Key IDs. At least it makes no sense for me to forbid mixing
> MPDUs using keyid 0 and 1 in one A-MPDU. 

I think it may make sense - reprogramming the hardware engines may take
some time, and doing that in the middle of the A-MPDU may not be
feasible? You don't just have to load the key (that you need to do
anyway) but also extract the status? I dunno, I'm more handwaving, but
it doesn't make sense to add such a requirement when only one key index
can be used to start with.

> The code is assuming that the driver is not aggregating MPDUs more than
> 5s apart. We probably don't have wait nearly so long but I'm not sure
> what is the minimum time.

OTOH, if you have a lot of BE/VI/VO traffic BK might be starved even
longer than that, technically indefinitely.

> +static struct ieee80211_key debug_noinline
> +*ieee80211_select_sta_key(struct ieee80211_tx_data *tx)
> +{
> +	struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)tx->skb->data;
> +	struct ieee80211_tx_info *info = IEEE80211_SKB_CB(tx->skb);
> +	struct sta_info *sta = tx->sta;
> +	struct ieee80211_key *key;
> +	struct ieee80211_key *next_key;
> +
> +	key = rcu_dereference(tx->sta->ptk[tx->sta->ptk_idx]);
> +
> +	if (likely(sta->ptk_idx_next == INVALID_PTK_KEYIDX))
> +		return key;
> +
> +	/* Only when using Extended Key ID the code below can be executed */
> +
> +	if (!ieee80211_is_data_present(hdr->frame_control))
> +		return key;
> +
> +	if (sta->ptk_idx_next == sta->ptk_idx) {
> +		/* First packet using new key with A-MPDU active*/
> +		sta->ptk_idx_next = INVALID_PTK_KEYIDX;
> +		ieee80211_check_fast_xmit(tx->sta);

I'm not convinced you can call this from this context? It looks safe
though, but it's really strange in a way.

> +	info->flags &= ~IEEE80211_TX_CTL_AMPDU;

Like you say above, I don't think this really makes a lot of sense. If
we don't have any free bits I guess we should try to find some ...

johannes
Alexander Wetzel Feb. 21, 2019, 9:20 p.m. UTC | #2
Am 15.02.19 um 12:50 schrieb Johannes Berg:
> 
>> The intent of the wording was probably written without considering
>> Extended Key IDs. At least it makes no sense for me to forbid mixing
>> MPDUs using keyid 0 and 1 in one A-MPDU.
> 
> I think it may make sense - reprogramming the hardware engines may take
> some time, and doing that in the middle of the A-MPDU may not be
> feasible? You don't just have to load the key (that you need to do
> anyway) but also extract the status? I dunno, I'm more handwaving, but
> it doesn't make sense to add such a requirement when only one key index
> can be used to start with.
> 

I'm pretty new to all that and I know I have still huge gaps everywhere.
But the card must be able to process MPDUs using both KeyIDs at that 
moment already. When receiving a A-MPDU it should not be very hard to 
check each MPDU and process it accordingly. (TX should even be simpler: 
The driver simply can decide if he want's to have a key border in the 
A-MPDU or not and switch to key when it wants.)

Forbidding to mix unicast with both keyIDs is adding an ugly overhead to 
an otherwise quite simple solution and adds complexity, at least for us 
here.
So while I would like to understand the reason for that rule it's still 
a rule and breaking it seems to be a bad idea.

>> The code is assuming that the driver is not aggregating MPDUs more than
>> 5s apart. We probably don't have wait nearly so long but I'm not sure
>> what is the minimum time.
> 
> OTOH, if you have a lot of BE/VI/VO traffic BK might be starved even
> longer than that, technically indefinitely.
> 

Hm, there is nothing preventing us to drop this "idle" switch as long as 
we also drop the 10s Rx accel offload when idle fallback in the COMPAT 
patch. (We have to drop the RX idle accel patch or get a small risk of 
dropping packets in unlikely but possible scenarios. If that are eapol 
packets things will become hairy, probably disconnecting the sta.)

It just feels strange to potentially still use the old key for one 
packet more without time limit. It could be, that the first EAPOL packet 
we send for the next rekey would still use the previous key, the second 
eapol packet the current.


>> +static struct ieee80211_key debug_noinline
>> +*ieee80211_select_sta_key(struct ieee80211_tx_data *tx)
>> +{
>> +	struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)tx->skb->data;
>> +	struct ieee80211_tx_info *info = IEEE80211_SKB_CB(tx->skb);
>> +	struct sta_info *sta = tx->sta;
>> +	struct ieee80211_key *key;
>> +	struct ieee80211_key *next_key;
>> +
>> +	key = rcu_dereference(tx->sta->ptk[tx->sta->ptk_idx]);
>> +
>> +	if (likely(sta->ptk_idx_next == INVALID_PTK_KEYIDX))
>> +		return key;
>> +
>> +	/* Only when using Extended Key ID the code below can be executed */
>> +
>> +	if (!ieee80211_is_data_present(hdr->frame_control))
>> +		return key;
>> +
>> +	if (sta->ptk_idx_next == sta->ptk_idx) {
>> +		/* First packet using new key with A-MPDU active*/
>> +		sta->ptk_idx_next = INVALID_PTK_KEYIDX;
>> +		ieee80211_check_fast_xmit(tx->sta);
> 
> I'm not convinced you can call this from this context? It looks safe
> though, but it's really strange in a way.
> 

Well, it's seems to work fine, no warnings or problems so far:-)

But I also had doubts and only after finding out that 
ieee80211_check_fast_xmit() is already being called from the same 
context I dared to use it, see ieee80211_tx_prepare().

>> +	info->flags &= ~IEEE80211_TX_CTL_AMPDU;
> 
> Like you say above, I don't think this really makes a lot of sense. If
> we don't have any free bits I guess we should try to find some ...

Well, all I can think of is quite invasive, so I hoped you would have a 
better idea:

Move the flags out from CB and use the freed space for a pointer to the 
new construct.
That will touch a ton of code and slow down things a bit.

The question here also would be, if we should use a struct able to 
handle other extensions or just a long long for flags.

If it comes to that I would propose to merge the other patches up to 
this one and then start looking into that...

Alexander
Johannes Berg Feb. 22, 2019, 8:51 a.m. UTC | #3
On Thu, 2019-02-21 at 22:20 +0100, Alexander Wetzel wrote:

> > I think it may make sense - reprogramming the hardware engines may take
> > some time, and doing that in the middle of the A-MPDU may not be
> > feasible? You don't just have to load the key (that you need to do
> > anyway) but also extract the status? I dunno, I'm more handwaving, but
> > it doesn't make sense to add such a requirement when only one key index
> > can be used to start with.
> > 
> 
> I'm pretty new to all that and I know I have still huge gaps everywhere.

And I'm just handwaving ;-)

But I know that for example we have A-MPDU spacing rules, ie. sometimes
padding must be inserted by the transmitter to give the receiver's HW or
FW enough time to program crypto engines. It thus stands to reason that
in order to minimise the spacing you'd want to keep the key material at
hand in the engine while processing the whole A-MPDU.

> But the card must be able to process MPDUs using both KeyIDs at that 
> moment already. When receiving a A-MPDU it should not be very hard to 
> check each MPDU and process it accordingly. (TX should even be simpler: 
> The driver simply can decide if he want's to have a key border in the 
> A-MPDU or not and switch to key when it wants.)

I tend to agree, but you never know in what surprising ways hardware
engines work :-)

> > > The code is assuming that the driver is not aggregating MPDUs more than
> > > 5s apart. We probably don't have wait nearly so long but I'm not sure
> > > what is the minimum time.
> > 
> > OTOH, if you have a lot of BE/VI/VO traffic BK might be starved even
> > longer than that, technically indefinitely.
> > 
> 
> Hm, there is nothing preventing us to drop this "idle" switch as long as 
> we also drop the 10s Rx accel offload when idle fallback in the COMPAT 
> patch. (We have to drop the RX idle accel patch or get a small risk of 
> dropping packets in unlikely but possible scenarios. If that are eapol 
> packets things will become hairy, probably disconnecting the sta.)
> 
> It just feels strange to potentially still use the old key for one 
> packet more without time limit. It could be, that the first EAPOL packet 
> we send for the next rekey would still use the previous key, the second 
> eapol packet the current.

Not sure I understand this. If we have no TX going on at all, then
surely we can switch with the next packet, before we encrypt it?

And if we have a packet sitting on hardware queues forever, then we
can't do anything about it anyway?

> > > +	if (sta->ptk_idx_next == sta->ptk_idx) {
> > > +		/* First packet using new key with A-MPDU active*/
> > > +		sta->ptk_idx_next = INVALID_PTK_KEYIDX;
> > > +		ieee80211_check_fast_xmit(tx->sta);
> > 
> > I'm not convinced you can call this from this context? It looks safe
> > though, but it's really strange in a way.
> > 
> 
> Well, it's seems to work fine, no warnings or problems so far:-)
> 
> But I also had doubts and only after finding out that 
> ieee80211_check_fast_xmit() is already being called from the same 
> context I dared to use it, see ieee80211_tx_prepare().

OK :-)
I guess I misremembered then.

> > > +	info->flags &= ~IEEE80211_TX_CTL_AMPDU;
> > 
> > Like you say above, I don't think this really makes a lot of sense. If
> > we don't have any free bits I guess we should try to find some ...
> 
> Well, all I can think of is quite invasive, so I hoped you would have a 
> better idea:

I think we have free bits in enum mac80211_tx_control_flags, so that
should be workable? They won't be preserved until TX status, but that's
OK for this purpose, no?

johannes
Alexander Wetzel Feb. 23, 2019, 9:47 p.m. UTC | #4
>>>> The code is assuming that the driver is not aggregating MPDUs more than
>>>> 5s apart. We probably don't have wait nearly so long but I'm not sure
>>>> what is the minimum time.
>>>
>>> OTOH, if you have a lot of BE/VI/VO traffic BK might be starved even
>>> longer than that, technically indefinitely.
>>>
>>
>> Hm, there is nothing preventing us to drop this "idle" switch as long as
>> we also drop the 10s Rx accel offload when idle fallback in the COMPAT
>> patch. (We have to drop the RX idle accel patch or get a small risk of
>> dropping packets in unlikely but possible scenarios. If that are eapol
>> packets things will become hairy, probably disconnecting the sta.)
>>
>> It just feels strange to potentially still use the old key for one
>> packet more without time limit. It could be, that the first EAPOL packet
>> we send for the next rekey would still use the previous key, the second
>> eapol packet the current.
> 
> Not sure I understand this. If we have no TX going on at all, then
> surely we can switch with the next packet, before we encrypt it?

Well, yes. But how can we figure out we are indeed idle?
(Chances are there is something and I missed it, I've not even look at 
many subsystems in mac80211 and probably even don't know some of them 
exist...)

The driver/card must have given up on waiting for more frames to 
aggregate and send out whatever it had in the queue.

You just told me above that a card may keep a partially assembled A-MPDU 
in the buffer for >5s if we have higher priority traffic... That was my 
idea to just handle it time-based with an insane but still ok long time.

Checking if all higher priority TIDs are idle sounds complicated and we 
still must know when the driver/card gives up and send out the bufferd 
frames even if the A-MPDU is not full. That does not make the idea to 
handle it time-based any simpler...

I'm not even sure it makes much sense to send out packets with 5s delay 
at all, tcp will already have queued retransmitts and for udp the frames 
are either also considered to be lost by the application or we don't 
care about them any more.

So yes, we can switch Tx to the new key immediately when we are sure the 
driver/card has no A-MPDU "in aggregation" at that time. I guess we can 
add a call the driver, asking it to tell us if a TID is idle, which also 
sounds complicated. So I guess I just would send the "key border" signal 
even when idle and accept that this can in corner cases use different 
keys for even and odd EAPOL frames.


>>>> +	info->flags &= ~IEEE80211_TX_CTL_AMPDU;
>>>
>>> Like you say above, I don't think this really makes a lot of sense. If
>>> we don't have any free bits I guess we should try to find some ...
>>
>> Well, all I can think of is quite invasive, so I hoped you would have a
>> better idea:
> 
> I think we have free bits in enum mac80211_tx_control_flags, so that
> should be workable? They won't be preserved until TX status, but that's
> OK for this purpose, no?

Wonderful idea:-)
I'll rewrite drop you a new patch revision with all the other changes we 
discussed (or are still discussing).

If nothing new pops up - either in our discussions or when rewriting the 
key border signal - I would drop RFC from the next patch series all the 
POC patches after this one so we can focus on the merge relevant defects.

Alexander
diff mbox series

Patch

diff --git a/net/mac80211/key.c b/net/mac80211/key.c
index 7f673887ec50..dee18f61fe33 100644
--- a/net/mac80211/key.c
+++ b/net/mac80211/key.c
@@ -309,6 +309,10 @@  int ieee80211_key_activate_tx(struct ieee80211_key *key)
 
 	assert_key_lock(local);
 
+	/* Two key activations must not overlap */
+	if (WARN_ON(sta->ptk_idx_next != INVALID_PTK_KEYIDX))
+		return -EOPNOTSUPP;
+
 	key->flags &= ~KEY_FLAG_RX_ONLY;
 
 	if (!(key->flags & KEY_FLAG_UPLOADED_TO_HARDWARE) ||
@@ -329,8 +333,9 @@  int ieee80211_key_activate_tx(struct ieee80211_key *key)
 	}
 
 	old = key_mtx_dereference(local, sta->ptk[sta->ptk_idx]);
-	sta->ptk_idx = key->conf.keyidx;
-	ieee80211_check_fast_xmit(sta);
+
+	sta->ptk_idx_next = key->conf.keyidx;
+	key->force_use_after = jiffies + 5 * HZ;
 
 	if (!ext_native && key->flags & KEY_FLAG_UPLOADED_TO_HARDWARE) {
 		key->flags |= KEY_FLAG_RX_SW_CRYPTO;
@@ -577,6 +582,7 @@  ieee80211_key_alloc(u32 cipher, int idx, size_t key_len,
 	 */
 	key->conf.flags = 0;
 	key->flags = 0;
+	key->force_use_after = 0;
 
 	key->conf.cipher = cipher;
 	key->conf.keyidx = idx;
diff --git a/net/mac80211/key.h b/net/mac80211/key.h
index d74c8c36491a..48975d56e792 100644
--- a/net/mac80211/key.h
+++ b/net/mac80211/key.h
@@ -65,6 +65,7 @@  struct ieee80211_key {
 	struct ieee80211_local *local;
 	struct ieee80211_sub_if_data *sdata;
 	struct sta_info *sta;
+	unsigned long force_use_after;
 
 	/* for sdata list */
 	struct list_head list;
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index a20e05439173..6fe40844f485 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -358,6 +358,7 @@  struct sta_info *sta_info_alloc(struct ieee80211_sub_if_data *sdata,
 	 */
 	BUILD_BUG_ON(ARRAY_SIZE(sta->ptk) <= INVALID_PTK_KEYIDX);
 	sta->ptk_idx = INVALID_PTK_KEYIDX;
+	sta->ptk_idx_next = INVALID_PTK_KEYIDX;
 
 	sta->local = local;
 	sta->sdata = sdata;
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index 1fd1a349a875..6eff946bc55a 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -450,6 +450,7 @@  struct ieee80211_sta_rx_stats {
  * @sdata: virtual interface this station belongs to
  * @ptk: peer keys negotiated with this station, if any
  * @ptk_idx: peer key index to use for transmissions
+ * @ptk_idx_next: peer key index in activation (Extended Key ID only)
  * @ext_key_compat_wk: supports PTK key installs when using EXT_KEY_ID_COMPAT
  * @gtk: group keys negotiated with this station, if any
  * @rate_ctrl: rate control algorithm reference
@@ -533,6 +534,7 @@  struct sta_info {
 	struct ieee80211_key __rcu *ptk[NUM_DEFAULT_KEYS];
 	struct delayed_work ext_key_compat_wk;
 	u8 ptk_idx;
+	u8 ptk_idx_next;
 	struct rate_control_ref *rate_ctrl;
 	void *rate_ctrl_priv;
 	spinlock_t rate_ctrl_lock;
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 111bd6c490a6..d3825eca9e64 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -586,6 +586,51 @@  ieee80211_tx_h_check_control_port_protocol(struct ieee80211_tx_data *tx)
 	return TX_CONTINUE;
 }
 
+static struct ieee80211_key debug_noinline
+*ieee80211_select_sta_key(struct ieee80211_tx_data *tx)
+{
+	struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)tx->skb->data;
+	struct ieee80211_tx_info *info = IEEE80211_SKB_CB(tx->skb);
+	struct sta_info *sta = tx->sta;
+	struct ieee80211_key *key;
+	struct ieee80211_key *next_key;
+
+	key = rcu_dereference(tx->sta->ptk[tx->sta->ptk_idx]);
+
+	if (likely(sta->ptk_idx_next == INVALID_PTK_KEYIDX))
+		return key;
+
+	/* Only when using Extended Key ID the code below can be executed */
+
+	if (!ieee80211_is_data_present(hdr->frame_control))
+		return key;
+
+	if (sta->ptk_idx_next == sta->ptk_idx) {
+		/* First packet using new key with A-MPDU active*/
+		sta->ptk_idx_next = INVALID_PTK_KEYIDX;
+		ieee80211_check_fast_xmit(tx->sta);
+		return key;
+	}
+
+	next_key = rcu_dereference(sta->ptk[sta->ptk_idx_next]);
+	if (!key || !(info->flags & IEEE80211_TX_CTL_AMPDU) ||
+	    (next_key->force_use_after &&
+	     time_is_before_jiffies(next_key->force_use_after))) {
+		/* nothing special to do, just start using the new key */
+		sta->ptk_idx = sta->ptk_idx_next;
+		sta->ptk_idx_next = INVALID_PTK_KEYIDX;
+		next_key->force_use_after = 0;
+		ieee80211_check_fast_xmit(tx->sta);
+		return next_key;
+	}
+
+	/* Last packet with old key with A-MPDU active */
+	sta->ptk_idx = sta->ptk_idx_next;
+	next_key->force_use_after = 0;
+	info->flags &= ~IEEE80211_TX_CTL_AMPDU;
+	return key;
+}
+
 static ieee80211_tx_result debug_noinline
 ieee80211_tx_h_select_key(struct ieee80211_tx_data *tx)
 {
@@ -595,9 +640,8 @@  ieee80211_tx_h_select_key(struct ieee80211_tx_data *tx)
 
 	if (unlikely(info->flags & IEEE80211_TX_INTFL_DONT_ENCRYPT))
 		tx->key = NULL;
-	else if (tx->sta &&
-		 (key = rcu_dereference(tx->sta->ptk[tx->sta->ptk_idx])))
-		tx->key = key;
+	else if (tx->sta)
+		tx->key = ieee80211_select_sta_key(tx);
 	else if (ieee80211_is_group_privacy_action(tx->skb) &&
 		(key = rcu_dereference(tx->sdata->default_multicast_key)))
 		tx->key = key;
@@ -3414,6 +3458,10 @@  static bool ieee80211_xmit_fast(struct ieee80211_sub_if_data *sdata,
 	if (skb->sk && skb_shinfo(skb)->tx_flags & SKBTX_WIFI_STATUS)
 		return false;
 
+	/* ieee80211_key_activate_tx() requests to change key */
+	if (unlikely(sta->ptk_idx_next != INVALID_PTK_KEYIDX))
+		return false;
+
 	if (hdr->frame_control & cpu_to_le16(IEEE80211_STYPE_QOS_DATA)) {
 		tid = skb->priority & IEEE80211_QOS_CTL_TAG1D_MASK;
 		tid_tx = rcu_dereference(sta->ampdu_mlme.tid_tx[tid]);
@@ -3556,6 +3604,11 @@  struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
 	if (txq->sta)
 		tx.sta = container_of(txq->sta, struct sta_info, sta);
 
+	if (test_bit(IEEE80211_TXQ_AMPDU, &txqi->flags))
+		info->flags |= IEEE80211_TX_CTL_AMPDU;
+	else
+		info->flags &= ~IEEE80211_TX_CTL_AMPDU;
+
 	/*
 	 * The key can be removed while the packet was queued, so need to call
 	 * this here to get the current key.
@@ -3566,11 +3619,6 @@  struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
 		goto begin;
 	}
 
-	if (test_bit(IEEE80211_TXQ_AMPDU, &txqi->flags))
-		info->flags |= IEEE80211_TX_CTL_AMPDU;
-	else
-		info->flags &= ~IEEE80211_TX_CTL_AMPDU;
-
 	if (info->control.flags & IEEE80211_TX_CTRL_FAST_XMIT) {
 		struct sta_info *sta = container_of(txq->sta, struct sta_info,
 						    sta);