
[net-next,v3,08/11] net: ethernet: ti: cpts: move rx timestamp processing to ptp worker only

Message ID 20200320194244.4703-9-grygorii.strashko@ti.com (mailing list archive)
State New, archived
Series net: ethernet: ti: cpts: add irq and HW_TS_PUSH events

Commit Message

Grygorii Strashko March 20, 2020, 7:42 p.m. UTC
Once the CPTS IRQ is enabled, the CPTS IRQ handler may compete with the netif
RX softirq path, so the RX timestamp might not be ready at the moment the
packet is processed. As a result, the packet has to be deferred and processed later.

This patch always moves RX timestamp processing to the PTP worker, the same
way as it is already done for TX timestamps.

  napi_rx->cpts_rx_timestamp
   if ptp_packet then
      push to rxq
      ptp_schedule_worker()

  do_aux_work->cpts_overflow_check
    cpts_process_events()

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
---
 drivers/net/ethernet/ti/cpsw.c     |  14 +--
 drivers/net/ethernet/ti/cpsw_new.c |  13 ++-
 drivers/net/ethernet/ti/cpts.c     | 132 ++++++++++++++++++-----------
 drivers/net/ethernet/ti/cpts.h     |   6 +-
 4 files changed, 106 insertions(+), 59 deletions(-)
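
For quick orientation, here is a condensed sketch of the new RX entry point,
trimmed from the patch below (debug prints dropped, names as in the diff);
returning 1 tells the caller that CPTS has taken the skb and will deliver it
from the PTP worker later:

/* NAPI RX path: classify and queue only, never wait for the event FIFO */
int cpts_rx_timestamp(struct cpts *cpts, struct sk_buff *skb)
{
	struct cpts_skb_cb_data *skb_cb = (struct cpts_skb_cb_data *)skb->cb;

	if (!cpts_skb_get_mtype_seqid(skb, &skb_cb->skb_mtype_seqid))
		return 0;	/* not a PTP event packet - deliver as usual */

	skb_cb->skb_mtype_seqid |= (CPTS_EV_RX << EVENT_TYPE_SHIFT);
	skb_cb->tmo = jiffies + msecs_to_jiffies(CPTS_SKB_RX_TX_TMO);

	skb_queue_tail(&cpts->rxq, skb);	/* defer to the PTP worker */
	ptp_schedule_worker(cpts->clock, 0);

	return 1;	/* PTP worker will set the hwtstamp and deliver */
}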

Comments

Richard Cochran March 24, 2020, 1:43 p.m. UTC | #1
On Fri, Mar 20, 2020 at 09:42:41PM +0200, Grygorii Strashko wrote:
> Once the CPTS IRQ is enabled, the CPTS IRQ handler may compete with the netif
> RX softirq path, so the RX timestamp might not be ready at the moment the
> packet is processed. As a result, the packet has to be deferred and processed later.

This change is not necessary.  The Rx path can simply take a spinlock,
check the event list and the HW queue.
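
For reference, a sketch of what that synchronous lookup looks like; it is
essentially the existing cpts_find_ts() path that this patch removes (see the
diff below), with the helper name and trimmed signature here purely
illustrative:

static u64 cpts_find_rx_ts(struct cpts *cpts, u32 skb_mtype_seqid)
{
	struct list_head *this, *next;
	struct cpts_event *event;
	unsigned long flags;
	u32 mtype_seqid;
	u64 ns = 0;

	cpts_fifo_read(cpts, -1);		/* check the HW queue */

	spin_lock_irqsave(&cpts->lock, flags);	/* check the event list */
	list_for_each_safe(this, next, &cpts->events) {
		event = list_entry(this, struct cpts_event, list);
		if (event_expired(event)) {
			list_del_init(&event->list);
			list_add(&event->list, &cpts->pool);
			continue;
		}

		mtype_seqid = event->high &
			      ((MESSAGE_TYPE_MASK << MESSAGE_TYPE_SHIFT) |
			       (SEQUENCE_ID_MASK << SEQUENCE_ID_SHIFT) |
			       (EVENT_TYPE_MASK << EVENT_TYPE_SHIFT));
		if (mtype_seqid == skb_mtype_seqid) {
			ns = event->timestamp;	/* found: stamp the skb now */
			list_del_init(&event->list);
			list_add(&event->list, &cpts->pool);
			break;
		}
	}
	spin_unlock_irqrestore(&cpts->lock, flags);

	return ns;	/* 0: no matching event (yet) */
}

The point being made is that this keeps the lookup in the RX softirq path but
delivers the stamped skb immediately, instead of bouncing it through the
kthread.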
 
> This patch always moves RX timestamp processing to the PTP worker, the same
> way as it is already done for TX timestamps.

There is no advantage to delaying Rx time stamp delivery.  In fact, it
can degrade synchronization performance.  The only reason the
implementation delays Tx time stamp delivery is because there is no
other way.

Thanks,
Richard
Grygorii Strashko March 24, 2020, 3:34 p.m. UTC | #2
On 24/03/2020 15:43, Richard Cochran wrote:
> On Fri, Mar 20, 2020 at 09:42:41PM +0200, Grygorii Strashko wrote:
>> Once the CPTS IRQ is enabled, the CPTS IRQ handler may compete with the netif
>> RX softirq path, so the RX timestamp might not be ready at the moment the
>> packet is processed. As a result, the packet has to be deferred and processed later.
> 
> This change is not necessary.  The Rx path can simply take a spinlock,
> check the event list and the HW queue.
>   
>> This patch always moves RX timestamp processing to the PTP worker, the same
>> way as it is already done for TX timestamps.
> 
> There is no advantage to delaying Rx time stamp delivery.  In fact, it
> can degrade synchronization performance.  The only reason the
> implementation delays Tx time stamp delivery is because there is no
> other way.

I tested both ways and kept this version, as I've not seen any degradation,
but, of course, I'll redo the test (or maybe you can advise what test to run).

My thoughts were: the network stack might not immediately deliver the packet to the application,
and the PTP worker can be tuned (priority and smp_affinity); the resulting code will be more structured,
with fewer locks and less time spent in softirq context.

I can also drop it.
Richard Cochran March 24, 2020, 4:54 p.m. UTC | #3
On Tue, Mar 24, 2020 at 05:34:34PM +0200, Grygorii Strashko wrote:
> I tested both ways and kept this version, as I've not seen any degradation,
> but, of course, I'll redo the test (or maybe you can advise what test to run).

Measure the time delay from when the frame arrives in the stack until
that frame+RxTimestamp arrives in the application.  I expect the
roundabout way via kthread takes longer.
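
One way to take that measurement is a small userspace receiver (a hypothetical
sketch, not from this thread) that enables SO_TIMESTAMPING on the PTP event
socket and compares the receive timestamps against the time recvmsg() returns.
It assumes hardware RX timestamping is already enabled on the interface (as
ptp4l does via SIOCSHWTSTAMP); the hardware delta is only an absolute latency
if the PHC is synchronized to CLOCK_REALTIME (e.g. via phc2sys):

#include <stdio.h>
#include <time.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>
#include <linux/errqueue.h>

/* Request software and raw hardware RX timestamps on an open socket */
static int enable_rx_tstamps(int sock)
{
	int flags = SOF_TIMESTAMPING_RX_SOFTWARE | SOF_TIMESTAMPING_SOFTWARE |
		    SOF_TIMESTAMPING_RX_HARDWARE | SOF_TIMESTAMPING_RAW_HARDWARE;

	return setsockopt(sock, SOL_SOCKET, SO_TIMESTAMPING,
			  &flags, sizeof(flags));
}

static long long ts_delta_ns(const struct timespec *a, const struct timespec *b)
{
	return (a->tv_sec - b->tv_sec) * 1000000000LL +
	       (a->tv_nsec - b->tv_nsec);
}

/* Receive one packet and print stack->app and hw->app deltas */
static void measure_one(int sock)
{
	char data[256], ctrl[512];
	struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) };
	struct msghdr msg = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = ctrl, .msg_controllen = sizeof(ctrl),
	};
	struct cmsghdr *cm;
	struct timespec now;

	if (recvmsg(sock, &msg, 0) < 0)
		return;
	clock_gettime(CLOCK_REALTIME, &now); /* "arrived in the application" */

	for (cm = CMSG_FIRSTHDR(&msg); cm; cm = CMSG_NXTHDR(&msg, cm)) {
		struct scm_timestamping *ts;

		if (cm->cmsg_level != SOL_SOCKET ||
		    cm->cmsg_type != SCM_TIMESTAMPING)
			continue;
		ts = (struct scm_timestamping *)CMSG_DATA(cm);
		/* ts[0]: software stamp taken when the frame entered the
		 * stack; ts[2]: raw hardware (CPTS) stamp.
		 */
		printf("sw->app %lld ns, hw->app %lld ns\n",
		       ts_delta_ns(&now, &ts->ts[0]),
		       ts_delta_ns(&now, &ts->ts[2]));
	}
}

Running the same receiver against both driver variants then gives a direct
comparison of the extra kthread round trip.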
 
> My thoughts were: the network stack might not immediately deliver the packet to the application,

The network stack always delivers the packet, but you artificially
delay that delivery by calling netif_receive_skb() later on from
cpts_match_rx_ts().

> and the PTP worker can be tuned (priority and smp_affinity);

That won't avoid the net softirq.

> the resulting code will be more structured,

I am afraid people will copy this pattern in new drivers.  It really
does not make much sense.

Thanks,
Richard
Grygorii Strashko March 26, 2020, 11:15 a.m. UTC | #4
Hi Richard

On 24/03/2020 18:54, Richard Cochran wrote:
> On Tue, Mar 24, 2020 at 05:34:34PM +0200, Grygorii Strashko wrote:
>> I tested both ways and kept this version, as I've not seen any degradation,
>> but, of course, I'll redo the test (or maybe you can advise what test to run).
> 
> Measure the time delay from when the frame arrives in the stack until
> that frame+RxTimestamp arrives in the application.  I expect the round
> about way via kthread takes longer.
>   
>> My thoughts were: the network stack might not immediately deliver the packet to the application,
> 
> The network stack always delivers the packet, but you artificially
> delay that delivery by calling netif_receive_skb() later on from
> cpts_match_rx_ts().
> 
>> and the PTP worker can be tuned (priority and smp_affinity);
> 
> That won't avoid the net softirq.
> 
>> the resulting code will be more structured,
> 
> I am afraid people will copy this pattern in new drivers.  It really
> does not make much sense.

I did additional testing and will drop this patch.
Any other comments from your side?

Thank you.
Richard Cochran March 27, 2020, 1:37 a.m. UTC | #5
On Thu, Mar 26, 2020 at 01:15:45PM +0200, Grygorii Strashko wrote:
> I did additional testing and will drop this patch.
> Any other comments from your side?

I made a few minor comments.  Overall the series looks good.
For the series:

Acked-by: Richard Cochran <richardcochran@gmail.com>

Patch

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index c2c5bf87da01..ce2155394830 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -433,17 +433,21 @@  static void cpsw_rx_handler(void *token, int len, int status)
 	skb->dev = ndev;
 	if (status & CPDMA_RX_VLAN_ENCAP)
 		cpsw_rx_vlan_encap(skb);
-	if (priv->rx_ts_enabled)
-		cpts_rx_timestamp(cpsw->cpts, skb);
-	skb->protocol = eth_type_trans(skb, ndev);
 
 	/* unmap page as no netstack skb page recycling */
 	page_pool_release_page(pool, page);
-	netif_receive_skb(skb);
-
 	ndev->stats.rx_bytes += len;
 	ndev->stats.rx_packets++;
 
+	ret = 0;
+	if (priv->rx_ts_enabled)
+		ret = cpts_rx_timestamp(cpsw->cpts, skb);
+
+	if (!ret) {
+		skb->protocol = eth_type_trans(skb, ndev);
+		netif_receive_skb(skb);
+	}
+
 requeue:
 	xmeta = page_address(new_page) + CPSW_XMETA_OFFSET;
 	xmeta->ndev = ndev;
diff --git a/drivers/net/ethernet/ti/cpsw_new.c b/drivers/net/ethernet/ti/cpsw_new.c
index 9209e613257d..8561f0e3b769 100644
--- a/drivers/net/ethernet/ti/cpsw_new.c
+++ b/drivers/net/ethernet/ti/cpsw_new.c
@@ -375,17 +375,22 @@  static void cpsw_rx_handler(void *token, int len, int status)
 	skb->dev = ndev;
 	if (status & CPDMA_RX_VLAN_ENCAP)
 		cpsw_rx_vlan_encap(skb);
-	if (priv->rx_ts_enabled)
-		cpts_rx_timestamp(cpsw->cpts, skb);
-	skb->protocol = eth_type_trans(skb, ndev);
 
 	/* unmap page as no netstack skb page recycling */
 	page_pool_release_page(pool, page);
-	netif_receive_skb(skb);
 
 	ndev->stats.rx_bytes += len;
 	ndev->stats.rx_packets++;
 
+	ret = 0;
+	if (priv->rx_ts_enabled)
+		ret = cpts_rx_timestamp(cpsw->cpts, skb);
+
+	if (!ret) {
+		skb->protocol = eth_type_trans(skb, ndev);
+		netif_receive_skb(skb);
+	}
+
 requeue:
 	xmeta = page_address(new_page) + CPSW_XMETA_OFFSET;
 	xmeta->ndev = ndev;
diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 3cfa0f78287b..fe70eb677b88 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -311,6 +311,66 @@  static bool cpts_match_tx_ts(struct cpts *cpts, struct cpts_event *event)
 	return found;
 }
 
+static bool cpts_match_rx_ts(struct cpts *cpts, struct cpts_event *event)
+{
+	struct sk_buff_head rxq_list;
+	struct sk_buff_head tempq;
+	struct sk_buff *skb, *tmp;
+	unsigned long flags;
+	bool found = false;
+	u32 mtype_seqid;
+
+	mtype_seqid = event->high &
+		      ((MESSAGE_TYPE_MASK << MESSAGE_TYPE_SHIFT) |
+		       (SEQUENCE_ID_MASK << SEQUENCE_ID_SHIFT) |
+		       (EVENT_TYPE_MASK << EVENT_TYPE_SHIFT));
+
+	__skb_queue_head_init(&rxq_list);
+	__skb_queue_head_init(&tempq);
+
+	spin_lock_irqsave(&cpts->rxq.lock, flags);
+	skb_queue_splice_init(&cpts->rxq, &rxq_list);
+	spin_unlock_irqrestore(&cpts->rxq.lock, flags);
+
+	skb_queue_walk_safe(&rxq_list, skb, tmp) {
+		struct skb_shared_hwtstamps *ssh;
+		struct cpts_skb_cb_data *skb_cb =
+					(struct cpts_skb_cb_data *)skb->cb;
+
+		if (mtype_seqid == skb_cb->skb_mtype_seqid) {
+			__skb_unlink(skb, &rxq_list);
+			ssh = skb_hwtstamps(skb);
+			memset(ssh, 0, sizeof(*ssh));
+			ssh->hwtstamp = ns_to_ktime(event->timestamp);
+			found = true;
+			dev_dbg(cpts->dev, "match rx timestamp mtype_seqid %08x\n",
+				mtype_seqid);
+			__skb_queue_tail(&tempq, skb);
+			break;
+		}
+
+		if (time_after(jiffies, skb_cb->tmo)) {
+			/* timeout any expired skbs */
+			dev_dbg(cpts->dev, "expiring rx timestamp\n");
+			__skb_unlink(skb, &rxq_list);
+			__skb_queue_tail(&tempq, skb);
+		}
+	}
+
+	spin_lock_irqsave(&cpts->rxq.lock, flags);
+	skb_queue_splice(&rxq_list, &cpts->rxq);
+	spin_unlock_irqrestore(&cpts->rxq.lock, flags);
+
+	local_bh_disable();
+	while ((skb = __skb_dequeue(&tempq))) {
+		skb->protocol = eth_type_trans(skb, skb->dev);
+		netif_receive_skb(skb);
+	}
+	local_bh_enable();
+
+	return found;
+}
+
 static void cpts_process_events(struct cpts *cpts)
 {
 	struct list_head *this, *next;
@@ -318,6 +378,7 @@  static void cpts_process_events(struct cpts *cpts)
 	LIST_HEAD(events_free);
 	unsigned long flags;
 	LIST_HEAD(events);
+	int type;
 
 	spin_lock_irqsave(&cpts->lock, flags);
 	list_splice_init(&cpts->events, &events);
@@ -325,8 +386,18 @@  static void cpts_process_events(struct cpts *cpts)
 
 	list_for_each_safe(this, next, &events) {
 		event = list_entry(this, struct cpts_event, list);
-		if (cpts_match_tx_ts(cpts, event) ||
-		    time_after(jiffies, event->tmo)) {
+		type = event_type(event);
+
+		if (type == CPTS_EV_TX &&
+		    (cpts_match_tx_ts(cpts, event) ||
+		     time_after(jiffies, event->tmo))) {
+			list_del_init(&event->list);
+			list_add(&event->list, &events_free);
+		}
+
+		if (type == CPTS_EV_RX &&
+		    (cpts_match_rx_ts(cpts, event) ||
+		     time_after(jiffies, event->tmo))) {
 			list_del_init(&event->list);
 			list_add(&event->list, &events_free);
 		}
@@ -422,64 +493,27 @@  static int cpts_skb_get_mtype_seqid(struct sk_buff *skb, u32 *mtype_seqid)
 	return 1;
 }
 
-static u64 cpts_find_ts(struct cpts *cpts, struct sk_buff *skb,
-			int ev_type, u32 skb_mtype_seqid)
-{
-	struct list_head *this, *next;
-	struct cpts_event *event;
-	unsigned long flags;
-	u32 mtype_seqid;
-	u64 ns = 0;
-
-	cpts_fifo_read(cpts, -1);
-	spin_lock_irqsave(&cpts->lock, flags);
-	list_for_each_safe(this, next, &cpts->events) {
-		event = list_entry(this, struct cpts_event, list);
-		if (event_expired(event)) {
-			list_del_init(&event->list);
-			list_add(&event->list, &cpts->pool);
-			continue;
-		}
-
-		mtype_seqid = event->high &
-			      ((MESSAGE_TYPE_MASK << MESSAGE_TYPE_SHIFT) |
-			       (SEQUENCE_ID_MASK << SEQUENCE_ID_SHIFT) |
-			       (EVENT_TYPE_MASK << EVENT_TYPE_SHIFT));
-
-		if (mtype_seqid == skb_mtype_seqid) {
-			ns = event->timestamp;
-			list_del_init(&event->list);
-			list_add(&event->list, &cpts->pool);
-			break;
-		}
-	}
-	spin_unlock_irqrestore(&cpts->lock, flags);
-
-	return ns;
-}
-
-void cpts_rx_timestamp(struct cpts *cpts, struct sk_buff *skb)
+int cpts_rx_timestamp(struct cpts *cpts, struct sk_buff *skb)
 {
 	struct cpts_skb_cb_data *skb_cb = (struct cpts_skb_cb_data *)skb->cb;
-	struct skb_shared_hwtstamps *ssh;
 	int ret;
-	u64 ns;
 
 	ret = cpts_skb_get_mtype_seqid(skb, &skb_cb->skb_mtype_seqid);
 	if (!ret)
-		return;
+		return 0;
 
 	skb_cb->skb_mtype_seqid |= (CPTS_EV_RX << EVENT_TYPE_SHIFT);
 
 	dev_dbg(cpts->dev, "%s mtype seqid %08x\n",
 		__func__, skb_cb->skb_mtype_seqid);
 
-	ns = cpts_find_ts(cpts, skb, CPTS_EV_RX, skb_cb->skb_mtype_seqid);
-	if (!ns)
-		return;
-	ssh = skb_hwtstamps(skb);
-	memset(ssh, 0, sizeof(*ssh));
-	ssh->hwtstamp = ns_to_ktime(ns);
+	/* Always defer RX TS processing to PTP worker */
+	/* get the timestamp for timeouts */
+	skb_cb->tmo = jiffies + msecs_to_jiffies(CPTS_SKB_RX_TX_TMO);
+	skb_queue_tail(&cpts->rxq, skb);
+	ptp_schedule_worker(cpts->clock, 0);
+
+	return 1;
 }
 EXPORT_SYMBOL_GPL(cpts_rx_timestamp);
 
@@ -514,6 +548,7 @@  int cpts_register(struct cpts *cpts)
 	int err, i;
 
 	skb_queue_head_init(&cpts->txq);
+	skb_queue_head_init(&cpts->rxq);
 	INIT_LIST_HEAD(&cpts->events);
 	INIT_LIST_HEAD(&cpts->pool);
 	for (i = 0; i < CPTS_MAX_EVENTS; i++)
@@ -556,6 +591,7 @@  void cpts_unregister(struct cpts *cpts)
 
 	/* Drop all packet */
 	skb_queue_purge(&cpts->txq);
+	skb_queue_purge(&cpts->rxq);
 
 	clk_disable(cpts->refclk);
 }
diff --git a/drivers/net/ethernet/ti/cpts.h b/drivers/net/ethernet/ti/cpts.h
index f16e14d67f5f..d6069198e059 100644
--- a/drivers/net/ethernet/ti/cpts.h
+++ b/drivers/net/ethernet/ti/cpts.h
@@ -115,12 +115,13 @@  struct cpts {
 	struct cpts_event pool_data[CPTS_MAX_EVENTS];
 	unsigned long ov_check_period;
 	struct sk_buff_head txq;
+	struct sk_buff_head rxq;
 	u64 cur_timestamp;
 	u32 mult_new;
 	struct mutex ptp_clk_mutex; /* sync PTP interface and worker */
 };
 
-void cpts_rx_timestamp(struct cpts *cpts, struct sk_buff *skb);
+int cpts_rx_timestamp(struct cpts *cpts, struct sk_buff *skb);
 void cpts_tx_timestamp(struct cpts *cpts, struct sk_buff *skb);
 int cpts_register(struct cpts *cpts);
 void cpts_unregister(struct cpts *cpts);
@@ -141,8 +142,9 @@  static inline bool cpts_can_timestamp(struct cpts *cpts, struct sk_buff *skb)
 #else
 struct cpts;
 
-static inline void cpts_rx_timestamp(struct cpts *cpts, struct sk_buff *skb)
+static inline int cpts_rx_timestamp(struct cpts *cpts, struct sk_buff *skb)
 {
+	return 0;
 }
 static inline void cpts_tx_timestamp(struct cpts *cpts, struct sk_buff *skb)
 {