diff mbox series

[RFC] mac80211: rework locking for txq scheduling / airtime fairness

Message ID 20190313181531.62539-1-nbd@nbd.name (mailing list archive)
State Superseded
Delegated to: Johannes Berg
Headers show
Series [RFC] mac80211: rework locking for txq scheduling / airtime fairness | expand

Commit Message

Felix Fietkau March 13, 2019, 6:15 p.m. UTC
Holding the lock around the entire duration of tx scheduling can create
some nasty lock contention, especially when processing airtime information
from the tx status or the rx path.
Improve locking by only holding the active_txq_lock for lookups / scheduling
list modifications.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
---
 include/net/mac80211.h | 48 ++++++++++++++++--------------------------
 net/mac80211/tx.c      | 44 ++++++++++++++------------------------
 2 files changed, 34 insertions(+), 58 deletions(-)

Comments

Toke Høiland-Jørgensen March 13, 2019, 10:55 p.m. UTC | #1
Felix Fietkau <nbd@nbd.name> writes:

> Holding the lock around the entire duration of tx scheduling can
> create some nasty lock contention, especially when processing airtime
> information from the tx status or the rx path.

Right, I can see how that could become an issue at higher loads than
what I tested with :)

> Improve locking by only holding the active_txq_lock for lookups /
> scheduling list modifications.

So the (potential) issue I see with this modification is that it
requires the driver to ensure that it will not interleave two different
scheduling rounds for the same acno. I.e., another call to
schedule_start() will reset the round counter and something needs to
guarantee that this doesn't happen until the driver has actually
finished the previous round.

I am not sure to what extent this would *actually* be a problem. For
ath9k, for instance, there's already the in-driver chan_lock (although
the call to ieee80211_txq_schedule_start() would then have to be moved
into the section covered by that lock). But it does introduce an
implicit dependency in the API, which should at least be documented.

If memory serves, avoiding this implicit dependency was the original
reason I had for adding the full lock around everything. I can't think
of any other reason right now, but if I do think of something I'll be
sure to let you know :)

-Toke
Felix Fietkau March 14, 2019, 5:23 p.m. UTC | #2
On 2019-03-13 23:55, Toke Høiland-Jørgensen wrote:
> Felix Fietkau <nbd@nbd.name> writes:
> 
>> Holding the lock around the entire duration of tx scheduling can
>> create some nasty lock contention, especially when processing airtime
>> information from the tx status or the rx path.
> 
> Right, I can see how that could become an issue at higher loads than
> what I tested with :)
I stumbled onto this when I was doing perf runs with mt76 (before even
adding support for this API) and I noticed that a visible amount of lock
contention was caused by mac80211 tx calls being blocked by mt76 tx
scheduling.
I wanted to fix these issues by switching to the mac80211 txq scheduling
API, but when reading the code I noticed that using this API had the
very same issue I was trying to fix. That's why I made this patch :)

>> Improve locking by only holding the active_txq_lock for lookups /
>> scheduling list modifications.
> 
> So the (potential) issue I see with this modification is that it
> requires the driver to ensure that it will not interleave two different
> scheduling rounds for the same acno. I.e., another call to
> schedule_start() will reset the round counter and something needs to
> guarantee that this doesn't happen until the driver has actually
> finished the previous round.
> 
> I am not sure to what extent this would *actually* be a problem. For
> ath9k, for instance, there's already the in-driver chan_lock (although
> the call to ieee80211_txq_schedule_start() would then have to be moved
> into the section covered by that lock). But it does introduce an
> implicit dependency in the API, which should at least be documented.
With ath9k it's also protected by the per-txq lock.
With ath10k it's protected by having scheduling calls come from the NAPI
poll function.

> If memory serves, avoiding this implicit dependency was the original
> reason I had for adding the full lock around everything. I can't think
> of any other reason right now, but if I do think of something I'll be
> sure to let you know :)
I'll change the patch to make this more explicit and resubmit.
Thanks for your comments.

- Felix
Toke Høiland-Jørgensen March 14, 2019, 10:17 p.m. UTC | #3
Felix Fietkau <nbd@nbd.name> writes:

> On 2019-03-13 23:55, Toke Høiland-Jørgensen wrote:
>> Felix Fietkau <nbd@nbd.name> writes:
>> 
>>> Holding the lock around the entire duration of tx scheduling can
>>> create some nasty lock contention, especially when processing airtime
>>> information from the tx status or the rx path.
>> 
>> Right, I can see how that could become an issue at higher loads than
>> what I tested with :)
> I stumbled onto this when I was doing perf runs with mt76 (before even
> adding support for this API) and I noticed that a visible amount of lock
> contention was caused by mac80211 tx calls being blocked by mt76 tx
> scheduling.
> I wanted to fix these issues by switching to the mac80211 txq scheduling
> API, but when reading the code I noticed that using this API had the
> very same issue I was trying to fix. That's why I made this patch :)

Ah, I see. Well I applaud you making fixing this and switching over mt76
your solution :)

>>> Improve locking by only holding the active_txq_lock for lookups /
>>> scheduling list modifications.
>> 
>> So the (potential) issue I see with this modification is that it
>> requires the driver to ensure that it will not interleave two different
>> scheduling rounds for the same acno. I.e., another call to
>> schedule_start() will reset the round counter and something needs to
>> guarantee that this doesn't happen until the driver has actually
>> finished the previous round.
>> 
>> I am not sure to what extent this would *actually* be a problem. For
>> ath9k, for instance, there's already the in-driver chan_lock (although
>> the call to ieee80211_txq_schedule_start() would then have to be moved
>> into the section covered by that lock). But it does introduce an
>> implicit dependency in the API, which should at least be documented.
> With ath9k it's also protected by the per-txq lock.

Ah, right, that was just moved not removed entirely. Great!

> With ath10k it's protected by having scheduling calls come from the NAPI
> poll function.

Cool.

>> If memory serves, avoiding this implicit dependency was the original
>> reason I had for adding the full lock around everything. I can't think
>> of any other reason right now, but if I do think of something I'll be
>> sure to let you know :)
> I'll change the patch to make this more explicit and resubmit.
> Thanks for your comments.

Sounds good!

-Toke
diff mbox series

Patch

diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index 3771625b7a9d..40fbd15e6c8c 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -6269,8 +6269,6 @@  struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
  * @hw: pointer as obtained from ieee80211_alloc_hw()
  * @ac: AC number to return packets from.
  *
- * Should only be called between calls to ieee80211_txq_schedule_start()
- * and ieee80211_txq_schedule_end().
  * Returns the next txq if successful, %NULL if no queue is eligible. If a txq
  * is returned, it should be returned with ieee80211_return_txq() after the
  * driver has finished scheduling it.
@@ -6278,51 +6276,41 @@  struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
 struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, u8 ac);
 
 /**
- * ieee80211_return_txq - return a TXQ previously acquired by ieee80211_next_txq()
- *
- * @hw: pointer as obtained from ieee80211_alloc_hw()
- * @txq: pointer obtained from station or virtual interface
- *
- * Should only be called between calls to ieee80211_txq_schedule_start()
- * and ieee80211_txq_schedule_end().
- */
-void ieee80211_return_txq(struct ieee80211_hw *hw, struct ieee80211_txq *txq);
-
-/**
- * ieee80211_txq_schedule_start - acquire locks for safe scheduling of an AC
+ * ieee80211_txq_schedule_start - start new scheduling round for TXQs
  *
  * @hw: pointer as obtained from ieee80211_alloc_hw()
  * @ac: AC number to acquire locks for
  *
- * Acquire locks needed to schedule TXQs from the given AC. Should be called
- * before ieee80211_next_txq() or ieee80211_return_txq().
+ * Should be called before ieee80211_next_txq() or ieee80211_return_txq().
  */
-void ieee80211_txq_schedule_start(struct ieee80211_hw *hw, u8 ac)
-	__acquires(txq_lock);
+void ieee80211_txq_schedule_start(struct ieee80211_hw *hw, u8 ac);
+
+/* (deprecated) */
+static inline void ieee80211_txq_schedule_end(struct ieee80211_hw *hw, u8 ac)
+{
+}
 
 /**
- * ieee80211_txq_schedule_end - release locks for safe scheduling of an AC
+ * ieee80211_schedule_txq - schedule a TXQ for transmission
  *
  * @hw: pointer as obtained from ieee80211_alloc_hw()
- * @ac: AC number to acquire locks for
+ * @txq: pointer obtained from station or virtual interface
  *
- * Release locks previously acquired by ieee80211_txq_schedule_end().
+ * Schedules a TXQ for transmission if it is not already scheduled.
  */
-void ieee80211_txq_schedule_end(struct ieee80211_hw *hw, u8 ac)
-	__releases(txq_lock);
+void ieee80211_schedule_txq(struct ieee80211_hw *hw, struct ieee80211_txq *txq)
 
 /**
- * ieee80211_schedule_txq - schedule a TXQ for transmission
+ * ieee80211_return_txq - return a TXQ previously acquired by ieee80211_next_txq()
  *
  * @hw: pointer as obtained from ieee80211_alloc_hw()
  * @txq: pointer obtained from station or virtual interface
- *
- * Schedules a TXQ for transmission if it is not already scheduled. Takes a
- * lock, which means it must *not* be called between
- * ieee80211_txq_schedule_start() and ieee80211_txq_schedule_end()
  */
-void ieee80211_schedule_txq(struct ieee80211_hw *hw, struct ieee80211_txq *txq)
-	__acquires(txq_lock) __releases(txq_lock);
+static inline void
+ieee80211_return_txq(struct ieee80211_hw *hw, struct ieee80211_txq *txq)
+{
+	ieee80211_schedule_txq(hw, txq);
+}
 
 /**
  * ieee80211_txq_may_transmit - check whether TXQ is allowed to transmit
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index ca23abbf5c0b..51cc37802439 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -3653,16 +3653,17 @@  EXPORT_SYMBOL(ieee80211_tx_dequeue);
 struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, u8 ac)
 {
 	struct ieee80211_local *local = hw_to_local(hw);
+	struct ieee80211_txq *ret = NULL;
 	struct txq_info *txqi = NULL;
 
-	lockdep_assert_held(&local->active_txq_lock[ac]);
+	spin_lock_bh(&local->active_txq_lock[ac]);
 
  begin:
 	txqi = list_first_entry_or_null(&local->active_txqs[ac],
 					struct txq_info,
 					schedule_order);
 	if (!txqi)
-		return NULL;
+		goto out;
 
 	if (txqi->txq.sta) {
 		struct sta_info *sta = container_of(txqi->txq.sta,
@@ -3679,21 +3680,25 @@  struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, u8 ac)
 
 
 	if (txqi->schedule_round == local->schedule_round[ac])
-		return NULL;
+		goto out;
 
 	list_del_init(&txqi->schedule_order);
 	txqi->schedule_round = local->schedule_round[ac];
-	return &txqi->txq;
+	ret = &txqi->txq;
+
+out:
+	spin_unlock_bh(&local->active_txq_lock[ac]);
+	return ret;
 }
 EXPORT_SYMBOL(ieee80211_next_txq);
 
-void ieee80211_return_txq(struct ieee80211_hw *hw,
-			  struct ieee80211_txq *txq)
+void ieee80211_schedule_txq(struct ieee80211_hw *hw,
+			    struct ieee80211_txq *txq)
 {
 	struct ieee80211_local *local = hw_to_local(hw);
 	struct txq_info *txqi = to_txq_info(txq);
 
-	lockdep_assert_held(&local->active_txq_lock[txq->ac]);
+	spin_lock_bh(&local->active_txq_lock[txq->ac]);
 
 	if (list_empty(&txqi->schedule_order) &&
 	    (!skb_queue_empty(&txqi->frags) || txqi->tin.backlog_packets)) {
@@ -3713,17 +3718,7 @@  void ieee80211_return_txq(struct ieee80211_hw *hw,
 			list_add_tail(&txqi->schedule_order,
 				      &local->active_txqs[txq->ac]);
 	}
-}
-EXPORT_SYMBOL(ieee80211_return_txq);
 
-void ieee80211_schedule_txq(struct ieee80211_hw *hw,
-			    struct ieee80211_txq *txq)
-	__acquires(txq_lock) __releases(txq_lock)
-{
-	struct ieee80211_local *local = hw_to_local(hw);
-
-	spin_lock_bh(&local->active_txq_lock[txq->ac]);
-	ieee80211_return_txq(hw, txq);
 	spin_unlock_bh(&local->active_txq_lock[txq->ac]);
 }
 EXPORT_SYMBOL(ieee80211_schedule_txq);
@@ -3736,7 +3731,7 @@  bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw,
 	struct sta_info *sta;
 	u8 ac = txq->ac;
 
-	lockdep_assert_held(&local->active_txq_lock[ac]);
+	spin_lock_bh(&local->active_txq_lock[ac]);
 
 	if (!txqi->txq.sta)
 		goto out;
@@ -3766,34 +3761,27 @@  bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw,
 
 	sta->airtime[ac].deficit += sta->airtime_weight;
 	list_move_tail(&txqi->schedule_order, &local->active_txqs[ac]);
+	spin_unlock_bh(&local->active_txq_lock[ac]);
 
 	return false;
 out:
 	if (!list_empty(&txqi->schedule_order))
 		list_del_init(&txqi->schedule_order);
+	spin_unlock_bh(&local->active_txq_lock[ac]);
 
 	return true;
 }
 EXPORT_SYMBOL(ieee80211_txq_may_transmit);
 
 void ieee80211_txq_schedule_start(struct ieee80211_hw *hw, u8 ac)
-	__acquires(txq_lock)
 {
 	struct ieee80211_local *local = hw_to_local(hw);
 
 	spin_lock_bh(&local->active_txq_lock[ac]);
 	local->schedule_round[ac]++;
-}
-EXPORT_SYMBOL(ieee80211_txq_schedule_start);
-
-void ieee80211_txq_schedule_end(struct ieee80211_hw *hw, u8 ac)
-	__releases(txq_lock)
-{
-	struct ieee80211_local *local = hw_to_local(hw);
-
 	spin_unlock_bh(&local->active_txq_lock[ac]);
 }
-EXPORT_SYMBOL(ieee80211_txq_schedule_end);
+EXPORT_SYMBOL(ieee80211_txq_schedule_start);
 
 void __ieee80211_subif_start_xmit(struct sk_buff *skb,
 				  struct net_device *dev,