diff mbox series

[spi,for-5.4,2/5] spi: Add a PTP system timestamp to the transfer structure

Message ID 20190818182600.3047-3-olteanv@gmail.com (mailing list archive)
State Superseded
Headers show
Series Deterministic SPI latency with NXP DSPI driver | expand

Commit Message

Vladimir Oltean Aug. 18, 2019, 6:25 p.m. UTC
SPI is one of the interfaces used to access devices which have a POSIX
clock driver (real time clocks, 1588 timers etc).

Since there are lots of sources of timing jitter when retrieving the
time from such a device (queuing delays, locking contention, running in
interruptible context etc), introduce a PTP system timestamp structure
in struct spi_transfer. This is to be used by SPI device drivers when
they need to know the exact time at which the underlying device's time
was snapshotted.

Because SPI controllers may have jitter even between frames, also
introduce a field which specifies to the controller driver specifically
which byte needs to be snapshotted.

Add a default implementation of the PTP system timestamping in the SPI
core. There are 3 entry points from the core towards the SPI controller
drivers:
- transfer_one: The driver is passed individual spi_transfers to
  execute. This is the easiest to timestamp.
- transfer_one_message: The core passes the driver an entire spi_message
  (a potential batch of spi_transfers). The core puts the same pre and
  post timestamp to all transfers within a message. This is not ideal,
  but nothing better can be done by default anyway, since the core has
  no insight into how the driver batches the transfers.
- transfer: Like transfer_one_message, but for unqueued drivers (i.e.
  the driver implements its own queue scheduling).

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/spi/spi.c       | 62 +++++++++++++++++++++++++++++++++++++++++
 include/linux/spi/spi.h | 38 +++++++++++++++++++++++++
 2 files changed, 100 insertions(+)

Comments

Mark Brown Aug. 22, 2019, 5:11 p.m. UTC | #1
On Sun, Aug 18, 2019 at 09:25:57PM +0300, Vladimir Oltean wrote:

> @@ -1391,6 +1402,13 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
>  		goto out;
>  	}
>  
> +	if (!ctlr->ptp_sts_supported) {
> +		list_for_each_entry(xfer, &mesg->transfers, transfer_list) {
> +			xfer->ptp_sts_word_pre = 0;
> +			ptp_read_system_prets(xfer->ptp_sts);
> +		}
> +	}
> +

We can do better than this for controllers which use transfer_one().

> +const void *spi_xfer_ptp_sts_word(struct spi_transfer *xfer, bool pre)
> +{

xfer can be const here too.

> + * @ptp_sts_supported: If the driver sets this to true, it must provide a
> + *	time snapshot in @spi_transfer->ptp_sts as close as possible to the
> + *	moment in time when @spi_transfer->ptp_sts_word_pre and
> + *	@spi_transfer->ptp_sts_word_post were transmitted.
> + *	If the driver does not set this, the SPI core takes the snapshot as
> + *	close to the driver hand-over as possible.

A couple of issues here.  The big one is that for PIO transfers
this is going to either complicate the code or introduce overhead
in individual drivers for an extremely niche use case.  I guess
most drivers won't implement it which makes this a bit moot but
then this is a concern that pushes back against the idea of
implementing the feature.

The other is that it's not 100% clear what you're looking to
timestamp here - is it when the data goes on the wire, is it when
the data goes on the FIFO (which could be relatively large)?  I'm
guessing you're looking for the physical transfer here, if that's
the case should there be some effort to compensate for the delays
in the controller?
Vladimir Oltean Aug. 24, 2019, 12:38 p.m. UTC | #2
On Thu, 22 Aug 2019 at 21:19, Mark Brown <broonie@kernel.org> wrote:
>
> On Sun, Aug 18, 2019 at 09:25:57PM +0300, Vladimir Oltean wrote:
>
> > @@ -1391,6 +1402,13 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
> >               goto out;
> >       }
> >
> > +     if (!ctlr->ptp_sts_supported) {
> > +             list_for_each_entry(xfer, &mesg->transfers, transfer_list) {
> > +                     xfer->ptp_sts_word_pre = 0;
> > +                     ptp_read_system_prets(xfer->ptp_sts);
> > +             }
> > +     }
> > +
>
> We can do better than this for controllers which use transfer_one().
>

You mean I should guard this "if", and the one below, with "&&
!ctlr->transfer_one"?

> > +const void *spi_xfer_ptp_sts_word(struct spi_transfer *xfer, bool pre)
> > +{
>
> xfer can be const here too.
>
> > + * @ptp_sts_supported: If the driver sets this to true, it must provide a
> > + *   time snapshot in @spi_transfer->ptp_sts as close as possible to the
> > + *   moment in time when @spi_transfer->ptp_sts_word_pre and
> > + *   @spi_transfer->ptp_sts_word_post were transmitted.
> > + *   If the driver does not set this, the SPI core takes the snapshot as
> > + *   close to the driver hand-over as possible.
>
> A couple of issues here.  The big one is that for PIO transfers
> this is going to either complicate the code or introduce overhead
> in individual drivers for an extremely niche use case.  I guess
> most drivers won't implement it which makes this a bit moot but
> then this is a concern that pushes back against the idea of
> implementing the feature.
>

The concern is the overhead in terms of code, or runtime performance?
Arguably the applications that require deterministic latency are
actually going to push for overall less overhead at runtime, even if
that comes at a cost in terms of code size. The spi-fsl-dspi driver
does not perform worse by any metric after this rework.

> The other is that it's not 100% clear what you're looking to
> timestamp here - is it when the data goes on the wire, is it when
> the data goes on the FIFO (which could be relatively large)?  I'm
> guessing you're looking for the physical transfer here, if that's
> the case should there be some effort to compensate for the delays
> in the controller?

The goal is to timestamp the moment when the SPI slave sees word N of
the data. Luckily the DSPI driver raises the TCF (Transfer Complete
Flag) once that word has been transmitted, which I used to my
advantage. The EOQ mode behaves similarly, but has a granularity of 4
words. The controller delays are hence implicitly included in the
software timestamp.
But the question is valid and I expect that such compensation might be
needed for some hardware, provided that it can be measured and
guaranteed. In fact Hubert did add such logic to the v3 of his MDIO
patch: https://lkml.org/lkml/2019/8/20/195 There were some objections
mainly related to the certainty of those offset corrections. I don't
want to "future-proof" the API now with features I have no use of, but
such compensation logic might come in the future.

Regards,
-Vladimir
Mark Brown Aug. 27, 2019, 7:01 p.m. UTC | #3
On Sat, Aug 24, 2019 at 03:38:16PM +0300, Vladimir Oltean wrote:
> On Thu, 22 Aug 2019 at 21:19, Mark Brown <broonie@kernel.org> wrote:
> > On Sun, Aug 18, 2019 at 09:25:57PM +0300, Vladimir Oltean wrote:

> > > +     if (!ctlr->ptp_sts_supported) {
> > > +             list_for_each_entry(xfer, &mesg->transfers, transfer_list) {
> > > +                     xfer->ptp_sts_word_pre = 0;
> > > +                     ptp_read_system_prets(xfer->ptp_sts);
> > > +             }
> > > +     }

> > We can do better than this for controllers which use transfer_one().

> You mean I should guard this "if", and the one below, with "&&
> !ctlr->transfer_one"?

Yes, that'd make it a bit more obvious that the better handling
is there.

> > > + * @ptp_sts_supported: If the driver sets this to true, it must provide a
> > > + *   time snapshot in @spi_transfer->ptp_sts as close as possible to the
> > > + *   moment in time when @spi_transfer->ptp_sts_word_pre and
> > > + *   @spi_transfer->ptp_sts_word_post were transmitted.
> > > + *   If the driver does not set this, the SPI core takes the snapshot as
> > > + *   close to the driver hand-over as possible.

> > A couple of issues here.  The big one is that for PIO transfers
> > this is going to either complicate the code or introduce overhead
> > in individual drivers for an extremely niche use case.  I guess
> > most drivers won't implement it which makes this a bit moot but
> > then this is a concern that pushes back against the idea of
> > implementing the feature.

> The concern is the overhead in terms of code, or runtime performance?

Both, yes.

> Arguably the applications that require deterministic latency are
> actually going to push for overall less overhead at runtime, even if
> that comes at a cost in terms of code size. The spi-fsl-dspi driver
> does not perform worse by any metric after this rework.

Determinalistic and fast are often note the same thing here,
sometimes it's better not to optimize if the optimization only
works some of the time for example.

> > The other is that it's not 100% clear what you're looking to
> > timestamp here - is it when the data goes on the wire, is it when
> > the data goes on the FIFO (which could be relatively large)?  I'm
> > guessing you're looking for the physical transfer here, if that's
> > the case should there be some effort to compensate for the delays
> > in the controller?

> The goal is to timestamp the moment when the SPI slave sees word N of
> the data. Luckily the DSPI driver raises the TCF (Transfer Complete
> Flag) once that word has been transmitted, which I used to my
> advantage. The EOQ mode behaves similarly, but has a granularity of 4
> words. The controller delays are hence implicitly included in the
> software timestamp.

The documentation should be clear on that, it'd be very natural
for someone to timestamp on entry to the FIFO.

> But the question is valid and I expect that such compensation might be
> needed for some hardware, provided that it can be measured and
> guaranteed. In fact Hubert did add such logic to the v3 of his MDIO
> patch: https://lkml.org/lkml/2019/8/20/195 There were some objections
> mainly related to the certainty of those offset corrections. I don't
> want to "future-proof" the API now with features I have no use of, but
> such compensation logic might come in the future.

I think it's mainly important that people know what the
expectations are so different drivers are consistent in how they
work, as you say the API can always be extended later.
diff mbox series

Patch

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index d96e04627982..cf5c5435edcd 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1171,6 +1171,11 @@  static int spi_transfer_one_message(struct spi_controller *ctlr,
 		spi_statistics_add_transfer_stats(statm, xfer, ctlr);
 		spi_statistics_add_transfer_stats(stats, xfer, ctlr);
 
+		if (!ctlr->ptp_sts_supported) {
+			xfer->ptp_sts_word_pre = 0;
+			ptp_read_system_prets(xfer->ptp_sts);
+		}
+
 		if (xfer->tx_buf || xfer->rx_buf) {
 			reinit_completion(&ctlr->xfer_completion);
 
@@ -1197,6 +1202,11 @@  static int spi_transfer_one_message(struct spi_controller *ctlr,
 					xfer->len);
 		}
 
+		if (!ctlr->ptp_sts_supported) {
+			ptp_read_system_postts(xfer->ptp_sts);
+			xfer->ptp_sts_word_post = xfer->len;
+		}
+
 		trace_spi_transfer_stop(msg, xfer);
 
 		if (msg->status != -EINPROGRESS)
@@ -1265,6 +1275,7 @@  EXPORT_SYMBOL_GPL(spi_finalize_current_transfer);
  */
 static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 {
+	struct spi_transfer *xfer;
 	struct spi_message *mesg;
 	bool was_busy = false;
 	unsigned long flags;
@@ -1391,6 +1402,13 @@  static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 		goto out;
 	}
 
+	if (!ctlr->ptp_sts_supported) {
+		list_for_each_entry(xfer, &mesg->transfers, transfer_list) {
+			xfer->ptp_sts_word_pre = 0;
+			ptp_read_system_prets(xfer->ptp_sts);
+		}
+	}
+
 	ret = ctlr->transfer_one_message(ctlr, mesg);
 	if (ret) {
 		dev_err(&ctlr->dev,
@@ -1418,6 +1436,34 @@  static void spi_pump_messages(struct kthread_work *work)
 	__spi_pump_messages(ctlr, true);
 }
 
+/**
+ * spi_xfer_ptp_sts_word - helper for drivers to retrieve the pointer to the
+ *			   word in the TX buffer which must be timestamped.
+ *			   The SPI slave does not provide a pointer directly
+ *			   because the TX and RX buffers may be reallocated
+ *			   (see @spi_map_msg).
+ * @xfer: Pointer to the transfer being timestamped
+ * @pre: If true, returns the pointer to @ptp_sts_word_pre, otherwise returns
+ *	the pointer to @ptp_sts_word_post.
+ */
+const void *spi_xfer_ptp_sts_word(struct spi_transfer *xfer, bool pre)
+{
+	unsigned int bytes_per_word;
+
+	if (xfer->bits_per_word <= 8)
+		bytes_per_word = 1;
+	else if (xfer->bits_per_word <= 16)
+		bytes_per_word = 2;
+	else
+		bytes_per_word = 4;
+
+	if (pre)
+		return xfer->tx_buf + xfer->ptp_sts_word_pre * bytes_per_word;
+
+	return xfer->tx_buf + xfer->ptp_sts_word_post * bytes_per_word;
+}
+EXPORT_SYMBOL_GPL(spi_xfer_ptp_sts_word);
+
 /**
  * spi_set_thread_rt - set the controller to pump at realtime priority
  * @ctlr: controller to boost priority of
@@ -1503,6 +1549,7 @@  EXPORT_SYMBOL_GPL(spi_get_next_queued_message);
  */
 void spi_finalize_current_message(struct spi_controller *ctlr)
 {
+	struct spi_transfer *xfer;
 	struct spi_message *mesg;
 	unsigned long flags;
 	int ret;
@@ -1511,6 +1558,13 @@  void spi_finalize_current_message(struct spi_controller *ctlr)
 	mesg = ctlr->cur_msg;
 	spin_unlock_irqrestore(&ctlr->queue_lock, flags);
 
+	if (!ctlr->ptp_sts_supported) {
+		list_for_each_entry(xfer, &mesg->transfers, transfer_list) {
+			ptp_read_system_postts(xfer->ptp_sts);
+			xfer->ptp_sts_word_post = xfer->len;
+		}
+	}
+
 	spi_unmap_msg(ctlr, mesg);
 
 	if (ctlr->cur_msg_prepared && ctlr->unprepare_message) {
@@ -3270,6 +3324,7 @@  static int __spi_validate(struct spi_device *spi, struct spi_message *message)
 static int __spi_async(struct spi_device *spi, struct spi_message *message)
 {
 	struct spi_controller *ctlr = spi->controller;
+	struct spi_transfer *xfer;
 
 	/*
 	 * Some controllers do not support doing regular SPI transfers. Return
@@ -3285,6 +3340,13 @@  static int __spi_async(struct spi_device *spi, struct spi_message *message)
 
 	trace_spi_message_submit(message);
 
+	if (!ctlr->ptp_sts_supported) {
+		list_for_each_entry(xfer, &message->transfers, transfer_list) {
+			xfer->ptp_sts_word_pre = 0;
+			ptp_read_system_prets(xfer->ptp_sts);
+		}
+	}
+
 	return ctlr->transfer(spi, message);
 }
 
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index af4f265d0f67..bb7553c6e5d0 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -13,6 +13,7 @@ 
 #include <linux/completion.h>
 #include <linux/scatterlist.h>
 #include <linux/gpio/consumer.h>
+#include <linux/ptp_clock_kernel.h>
 
 struct dma_chan;
 struct property_entry;
@@ -409,6 +410,12 @@  static inline void spi_unregister_driver(struct spi_driver *sdrv)
  * @fw_translate_cs: If the boot firmware uses different numbering scheme
  *	what Linux expects, this optional hook can be used to translate
  *	between the two.
+ * @ptp_sts_supported: If the driver sets this to true, it must provide a
+ *	time snapshot in @spi_transfer->ptp_sts as close as possible to the
+ *	moment in time when @spi_transfer->ptp_sts_word_pre and
+ *	@spi_transfer->ptp_sts_word_post were transmitted.
+ *	If the driver does not set this, the SPI core takes the snapshot as
+ *	close to the driver hand-over as possible.
  *
  * Each SPI controller can communicate with one or more @spi_device
  * children.  These make a small bus, sharing MOSI, MISO and SCK signals
@@ -604,6 +611,12 @@  struct spi_controller {
 	void			*dummy_tx;
 
 	int (*fw_translate_cs)(struct spi_controller *ctlr, unsigned cs);
+
+	/*
+	 * Driver sets this field to indicate it is able to snapshot SPI
+	 * transfers (needed e.g. for reading the time of POSIX clocks)
+	 */
+	bool			ptp_sts_supported;
 };
 
 static inline void *spi_controller_get_devdata(struct spi_controller *ctlr)
@@ -644,6 +657,9 @@  extern struct spi_message *spi_get_next_queued_message(struct spi_controller *ct
 extern void spi_finalize_current_message(struct spi_controller *ctlr);
 extern void spi_finalize_current_transfer(struct spi_controller *ctlr);
 
+/* Helper calls for driver to get which buffer pointer must be timestamped */
+extern const void *spi_xfer_ptp_sts_word(struct spi_transfer *xfer, bool pre);
+
 /* the spi driver core manages memory for the spi_controller classdev */
 extern struct spi_controller *__spi_alloc_controller(struct device *host,
 						unsigned int size, bool slave);
@@ -753,6 +769,24 @@  extern void spi_res_release(struct spi_controller *ctlr,
  * @transfer_list: transfers are sequenced through @spi_message.transfers
  * @tx_sg: Scatterlist for transmit, currently not for client use
  * @rx_sg: Scatterlist for receive, currently not for client use
+ * @ptp_sts_word_pre: The word (subject to bits_per_word semantics) offset
+ *	within @tx_buf for which the SPI device is requesting that the time
+ *	snapshot for this transfer begins. Upon completing the SPI transfer,
+ *	this value may have changed compared to what was requested, depending
+ *	on the available snapshotting resolution (DMA transfer,
+ *	@ptp_sts_supported is false, etc).
+ * @ptp_sts_word_post: See @ptp_sts_word_post. The two can be equal (meaning
+ *	that a single byte should be snapshotted). The core will set
+ *	@ptp_sts_word_pre to 0, and @ptp_sts_word_post to the length of the
+ *	transfer, if @ptp_sts_supported is false for this controller. This is
+ *	done purposefully (instead of setting to spi_transfer->len - 1) to
+ *	denote that a transfer-level snapshot taken from within the driver may
+ *	still be of higher quality.
+ * @ptp_sts: Pointer to a memory location held by the SPI slave device where a
+ *	PTP system timestamp structure may lie. If drivers use PIO or their
+ *	hardware has some sort of assist for retrieving exact transfer timing,
+ *	they can (and should) assert @ptp_sts_supported and populate this
+ *	structure using the ptp_read_system_*ts helper functions.
  *
  * SPI transfers always write the same number of bytes as they read.
  * Protocol drivers should always provide @rx_buf and/or @tx_buf.
@@ -842,6 +876,10 @@  struct spi_transfer {
 
 	u32		effective_speed_hz;
 
+	unsigned int	ptp_sts_word_pre;
+	unsigned int	ptp_sts_word_post;
+	struct ptp_system_timestamp *ptp_sts;
+
 	struct list_head transfer_list;
 };