diff mbox

[2/2] spi: add statistics gathering and reporting methods

Message ID 1430741564-2849-2-git-send-email-kernel@martin.sperl.org (mailing list archive)
State New, archived
Headers show

Commit Message

Martin Sperl May 4, 2015, 12:12 p.m. UTC
From: Martin Sperl <kernel@martin.sperl.org>

the statistics are available:
for each master via:
  /sys/class/spi_master/spi*/stat
for each device via:
  /sys/class/spi_master/spi32766/spi32766.4/stat
  /sys/bus/spi/devices/spi32766.*/stat

Example output:
messages: 78
transfers: 78
errors: 0
timeout: 0
bytes: 122990
bytes-tx: 122990
bytes-rx: 122990
transfer-len-log2-histogram: 0 33 3 9 1 2 0 0 0 0 0 0 0 30 0 0 0

Note that log2-histogram is defined for bin n as:
spi_transfer.len in the range: [ 2^n : 2^(n+1) [
with n=0 for spi_transfer.len = 0

bus-master drivers can include additional information via
spi_master.show_stats

Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
---
 drivers/spi/spi.c |  145 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 144 insertions(+), 1 deletion(-)

Comments

Mark Brown May 4, 2015, 1:42 p.m. UTC | #1
On Mon, May 04, 2015 at 12:12:43PM +0000, kernel@martin.sperl.org wrote:
> From: Martin Sperl <kernel@martin.sperl.org>
> 
> the statistics are available:
> for each master via:
>   /sys/class/spi_master/spi*/stat
> for each device via:
>   /sys/class/spi_master/spi32766/spi32766.4/stat
>   /sys/bus/spi/devices/spi32766.*/stat

This is unsuitable for sysfs as it is, sysfs is strictly one value per
file.  debugfs would be a more suitable place for a file like this, or
if they are going to go into sysfs they should go in as a directory of
files with one value per file like the network device statistics
directory.  I'm ambivalent about which is the best place, both have
their merits.

> transfer-len-log2-histogram: 0 33 3 9 1 2 0 0 0 0 0 0 0 30 0 0 0
> 
> Note that log2-histogram is defined for bin n as:
> spi_transfer.len in the range: [ 2^n : 2^(n+1) [
> with n=0 for spi_transfer.len = 0

It would be more friendly to print this out as an actual histogram with
pretty printing, that's definitely a debugfs thing though.

> +/* helper macros to help updating statistics (locked and unlocked) */
> +#define SPI_STATS_FIELD_ADD_NOLOCK(master, spi, key, value) \
> +	do {						    \
> +		master->stats.key += value;		    \
> +		spi->stats.key += value;		    \
> +	} while (0)

It's a bit unclear to me that we need to collect the stats on both a
per device and a per master basis, if we've got device statistics we can
add up the various devices to get the master values easily enough when
we need them.  It does mean that removing devices would discard some of
the statistics but it's not clear to me that this is actually going to
concern anyone.

It'd be good to have a better name too - _FIELD_ADD makes it sound like
it's adding a new field which is obviously not what we're doing here.
Perhaps just SPI_STATS_ADD?

> +#define SPI_STATS_FIELD_ADD(master, spi, key, value)			\
> +	do {								\
> +		unsigned long flags;					\
> +									\
> +		spin_lock_irqsave(&master->stats_spinlock, flags);	\
> +		SPI_STATS_FIELD_ADD_NOLOCK(master, spi, key, value);	\
> +		spin_unlock_irqrestore(&master->stats_spinlock, flags);	\
> +	} while (0)

Do we really need the locked version with a separate lock?  I know
you're very concerned about performance for some of your applications
and the fact that we're going to be spending a lot of time taking a
separate spinlock (with interrupt disabling... ) seems like it's going
to be adding overhead we could live without.

> +		/* update statistics */
> +		l2size = (xfer->len) ? fls(xfer->len - 1) + 1 : 0;
> +		l2size = min(l2size, SPI_STATISTICS_L2HISTO_SIZE - 1);
> +
> +		spin_lock_irqsave(&master->stats_spinlock, flags);
> +		SPI_STATS_FIELD_ADD_NOLOCK(master, msg->spi, transfers, 1);
> +		SPI_STATS_FIELD_ADD_NOLOCK(master, msg->spi,
> +					   bytes, xfer->len);
> +		if (xfer->tx_buf)
> +			SPI_STATS_FIELD_ADD_NOLOCK(master, msg->spi,
> +						   bytes_tx, xfer->len);
> +		if (xfer->rx_buf)
> +			SPI_STATS_FIELD_ADD_NOLOCK(master, msg->spi,
> +					           bytes_rx, xfer->len);
> +		SPI_STATS_FIELD_ADD_NOLOCK(master, msg->spi,
> +					   bytes_l2histo[l2size], 1);
> +		spin_unlock_irqrestore(&master->stats_spinlock, flags);

I can't help but feel that this would be a bit more legible without the
macro for adding to a field - just a function that takes the transfer
and a stats buffer to add to.
Martin Sperl May 4, 2015, 2:25 p.m. UTC | #2
> On 04.05.2015, at 15:42, Mark Brown <broonie@kernel.org> wrote:
> This is unsuitable for sysfs as it is, sysfs is strictly one value per
> file.  debugfs would be a more suitable place for a file like this, or
> if they are going to go into sysfs they should go in as a directory of
> files with one value per file like the network device statistics
> directory.  I'm ambivalent about which is the best place, both have
> their merits.
that makes it much more complicated to add things per spi_master
implementation (which was my original idea: have a framework that exposes
the distinct transfer modes of the spi-bcm2835 driver (polling, interrupt
and dma driven) and also to get some stats on how often we run into those
“corner” cases - that is also why I want to gather the histogram
statistics, as it may make it easier to understand the use cases of people
who complain about “performance” in case they share those infos…

In those cases a single file is the easiest to share, but that is the way
it is with sysfs...

> 
>> transfer-len-log2-histogram: 0 33 3 9 1 2 0 0 0 0 0 0 0 30 0 0 0
> It would be more friendly to print this out as an actual histogram with
> pretty printing, that's definitely a debugfs thing though.
I could make it more readable or - if split - run it with the same format
as /sys/devices/virtual/block/ram0/stat - just multiple columns...

> It's a bit unclear to me that we need to collect the stats on both a
> per device and a per master basis, if we've got device statistics we can
> add up the various devices to get the master values easily enough when
> we need them.  It does mean that removing devices would discard some of
> the statistics but it's not clear to me that this is actually going to
> concern anyone.
> 
well - I found that it sometimes may help identify what device is responsible
for an issue on a mixed device bus.

> Do we really need the locked version with a separate lock?  I know
> you're very concerned about performance for some of your applications
> and the fact that we're going to be spending a lot of time taking a
> separate spinlock (with interrupt disabling... ) seems like it's going
> to be adding overhead we could live without.
> 
I did not have it at first but was concerned that someone would complain
about missing locking. That is why I have minimized the spinlock in the
loop.

That is easy to remove again...

> I can't help but feel that this would be a bit more legible without the
> macro for adding to a field - just a function that takes the transfer
> and a stats buffer to add to.
just a bit more writing when counting twice…

OK, so I will rework the patch looking into the above and will now put
everything into one single patch...--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Brown May 4, 2015, 2:30 p.m. UTC | #3
On Mon, May 04, 2015 at 04:25:01PM +0200, Martin Sperl wrote:

> > It's a bit unclear to me that we need to collect the stats on both a
> > per device and a per master basis, if we've got device statistics we can
> > add up the various devices to get the master values easily enough when
> > we need them.  It does mean that removing devices would discard some of
> > the statistics but it's not clear to me that this is actually going to
> > concern anyone.

> well - I found that it sometimes may help identify what device is responsible
> for an issue on a mixed device bus.

We can do that by looking at the devices though?  I'm not clear what the
master statistics are telling us?

> > Do we really need the locked version with a separate lock?  I know
> > you're very concerned about performance for some of your applications
> > and the fact that we're going to be spending a lot of time taking a
> > separate spinlock (with interrupt disabling... ) seems like it's going
> > to be adding overhead we could live without.

> I did not have it at first but was concerned that someone would complain
> about missing locking. That is why I have minimized the spinlock in the
> loop.

> That is easy to remove again...

I'm not saying don't lock, I'm asking if adding this spinlock is the
best way of doing this.  We already lock the master most of the time,
can we not continue using the same lock?
Martin Sperl May 4, 2015, 3:50 p.m. UTC | #4
> On 04.05.2015, at 16:30, Mark Brown <broonie@kernel.org> wrote:
> 
> I'm not saying don't lock, I'm asking if adding this spinlock is the
> best way of doing this.  We already lock the master most of the time,
> can we not continue using the same lock?

Tell me the lock to use and I am happy to handle it.
but a quick look shows that transfer_one_message is running without
a lock held…

We could hold a lock while we run in spi_transfer_one_message
but that could mean that gathering statistics could get blocked
for long periods of time - thta is why I had used a separate lock
that is only held during the “bulk” update…

But for that to work we will need to allocate spi_master_class
(among other things) dynamically - Let us see how that works.--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Brown May 4, 2015, 5:29 p.m. UTC | #5
On Mon, May 04, 2015 at 05:50:11PM +0200, Martin Sperl wrote:
> > On 04.05.2015, at 16:30, Mark Brown <broonie@kernel.org> wrote:

> > I'm not saying don't lock, I'm asking if adding this spinlock is the
> > best way of doing this.  We already lock the master most of the time,
> > can we not continue using the same lock?

> Tell me the lock to use and I am happy to handle it.
> but a quick look shows that transfer_one_message is running without
> a lock held…

I didn't say don't take a lock, I said don't make a new lock.  

> But for that to work we will need to allocate spi_master_class
> (among other things) dynamically - Let us see how that works.

I'm not following this at all, sorry.
Jakub Kici?ski May 5, 2015, 12:20 p.m. UTC | #6
On Mon, 4 May 2015 16:25:01 +0200, Martin Sperl wrote:
> > On 04.05.2015, at 15:42, Mark Brown <broonie@kernel.org> wrote:
> > This is unsuitable for sysfs as it is, sysfs is strictly one value per
> > file.  debugfs would be a more suitable place for a file like this, or
> > if they are going to go into sysfs they should go in as a directory of
> > files with one value per file like the network device statistics
> > directory.  I'm ambivalent about which is the best place, both have
> > their merits.  
>
> that makes it much more complicated to add things per spi_master
> implementation (which was my original idea: have a framework that exposes
> the distinct transfer modes of the spi-bcm2835 driver (polling, interrupt
> and dma driven) and also to get some stats on how often we run into those
> “corner” cases - that is also why I want to gather the histogram
> statistics, as it may make it easier to understand the use cases of people
> who complain about “performance” in case they share those infos…

You can also consider adding few strategically placed tracepoints.  
It saves the amount of kernel code necessary and is generally more
flexible.
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Brown May 5, 2015, 3:02 p.m. UTC | #7
On Tue, May 05, 2015 at 02:20:08PM +0200, Jakub Kici?ski wrote:
> On Mon, 4 May 2015 16:25:01 +0200, Martin Sperl wrote:

> > that makes it much more complicated to add things per spi_master
> > implementation (which was my original idea: have a framework that exposes
> > the distinct transfer modes of the spi-bcm2835 driver (polling, interrupt
> > and dma driven) and also to get some stats on how often we run into those
> > “corner” cases - that is also why I want to gather the histogram
> > statistics, as it may make it easier to understand the use cases of people
> > who complain about “performance” in case they share those infos…

> You can also consider adding few strategically placed tracepoints.  
> It saves the amount of kernel code necessary and is generally more
> flexible.

We already have a bunch of tracepoints in there (probably sufficiently
detailed for what's here) but they solve a different problem to
statistics - while you can extract statistics at least as far back as
the tracepoints go but it's slightly cumbersome and for longer term
tests the buffer will roll round at some point and it's a bit more work
for people to set them up.

If we can gather a few statistics without too much overhead I don't see
a huge problem with doing both.
Jakub Kici?ski May 5, 2015, 3:48 p.m. UTC | #8
On Tue, 5 May 2015 16:02:46 +0100, Mark Brown wrote:
> On Tue, May 05, 2015 at 02:20:08PM +0200, Jakub Kici?ski wrote:
> > On Mon, 4 May 2015 16:25:01 +0200, Martin Sperl wrote:
> 
> > > that makes it much more complicated to add things per spi_master
> > > implementation (which was my original idea: have a framework that exposes
> > > the distinct transfer modes of the spi-bcm2835 driver (polling, interrupt
> > > and dma driven) and also to get some stats on how often we run into those
> > > “corner” cases - that is also why I want to gather the histogram
> > > statistics, as it may make it easier to understand the use cases of people
> > > who complain about “performance” in case they share those infos…
> 
> > You can also consider adding few strategically placed tracepoints.  
> > It saves the amount of kernel code necessary and is generally more
> > flexible.
> 
> We already have a bunch of tracepoints in there (probably sufficiently
> detailed for what's here) but they solve a different problem to
> statistics - while you can extract statistics at least as far back as
> the tracepoints go but it's slightly cumbersome and for longer term
> tests the buffer will roll round at some point and it's a bit more work
> for people to set them up.

Sure, I definitely agree with complicated setup especially that Martin
is targeting this at RPi users who are usually not the most seasoned
of kernel developers.  Perhaps one day someone will write a generic
tracepoint-like statistic subsystem which can be disabled at runtime
and easily compiled-out...
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 50910d8..615f60d 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -41,6 +41,114 @@ 
 #define CREATE_TRACE_POINTS
 #include <trace/events/spi.h>
 
+/* helper macros to help updating statistics (locked and unlocked) */
+#define SPI_STATS_FIELD_ADD_NOLOCK(master, spi, key, value) \
+	do {						    \
+		master->stats.key += value;		    \
+		spi->stats.key += value;		    \
+	} while (0)
+
+#define SPI_STATS_FIELD_ADD(master, spi, key, value)			\
+	do {								\
+		unsigned long flags;					\
+									\
+		spin_lock_irqsave(&master->stats_spinlock, flags);	\
+		SPI_STATS_FIELD_ADD_NOLOCK(master, spi, key, value);	\
+		spin_unlock_irqrestore(&master->stats_spinlock, flags);	\
+	} while (0)
+
+static ssize_t
+spi_show_statistics(struct spi_statistics *stat,
+		    char *buf, ssize_t buf_size)
+{
+	ssize_t len;
+	int i;
+
+	len = snprintf(buf,
+		       buf_size,
+		       "messages: %ld\n"
+		       "transfers: %ld\n"
+		       "errors: %ld\n"
+		       "timeout: %ld\n"
+		       "bytes: %lld\n"
+		       "bytes-tx: %lld\n"
+		       "bytes-rx: %lld\n"
+		       "transfer-len-log2-histogram:",
+		       stat->messages,
+		       stat->transfers,
+		       stat->errors,
+		       stat->timedout,
+		       stat->bytes,
+		       stat->bytes_tx,
+		       stat->bytes_rx
+		);
+
+	for (i = 0; i < SPI_STATISTICS_L2HISTO_SIZE; i++) {
+		len += snprintf(buf + len, buf_size - len,
+				" %ld", stat->bytes_l2histo[i]);
+	}
+
+	len += snprintf(buf + len, buf_size - len, "\n");
+
+	return len;
+}
+
+static ssize_t
+stat_show(struct device *dev, struct device_attribute *a, char *buf)
+{
+	struct spi_device *spi = to_spi_device(dev);
+	ssize_t len;
+	unsigned long flags;
+
+	spin_lock_irqsave(&spi->master->stats_spinlock, flags);
+
+	len = spi_show_statistics(&spi->stats, buf, PAGE_SIZE);
+
+	/* call show_stats */
+	if (spi->master->show_stats)
+		len += spi->master->show_stats(spi->master, spi,
+					       buf + len,
+					       PAGE_SIZE - len);
+
+	spin_unlock_irqrestore(&spi->master->stats_spinlock, flags);
+
+	return len;
+}
+static DEVICE_ATTR_RO(stat);
+
+static ssize_t
+master_stat_show(struct device *dev, struct device_attribute *a, char *buf)
+{
+	struct spi_master *master = to_spi_master(dev);
+	ssize_t len;
+	unsigned long flags;
+
+	spin_lock_irqsave(&master->stats_spinlock, flags);
+
+	len = spi_show_statistics(&master->stats, buf, PAGE_SIZE);
+
+	if (master->show_stats)
+		len += master->show_stats(master, NULL,
+					  buf + len,
+					  PAGE_SIZE - len);
+
+	spin_unlock_irqrestore(&master->stats_spinlock, flags);
+
+	return len;
+}
+
+struct device_attribute dev_attr_master_stat = {
+	.attr  = { .name = "stat", .mode = S_IRUGO },
+	.show  = master_stat_show,
+	.store = NULL
+};
+
+static struct attribute *spi_master_attrs[] = {
+	&dev_attr_master_stat.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(spi_master);
+
 static void spidev_release(struct device *dev)
 {
 	struct spi_device	*spi = to_spi_device(dev);
@@ -69,6 +177,7 @@  static DEVICE_ATTR_RO(modalias);
 
 static struct attribute *spi_dev_attrs[] = {
 	&dev_attr_modalias.attr,
+	&dev_attr_stat.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(spi_dev);
@@ -679,12 +788,34 @@  static int spi_transfer_one_message(struct spi_master *master,
 	bool keep_cs = false;
 	int ret = 0;
 	unsigned long ms = 1;
+	int l2size;
+	unsigned long flags;
 
 	spi_set_cs(msg->spi, true);
 
+	SPI_STATS_FIELD_ADD(master, msg->spi, messages, 1);
+
 	list_for_each_entry(xfer, &msg->transfers, transfer_list) {
 		trace_spi_transfer_start(msg, xfer);
 
+		/* update statistics */
+		l2size = (xfer->len) ? fls(xfer->len - 1) + 1 : 0;
+		l2size = min(l2size, SPI_STATISTICS_L2HISTO_SIZE - 1);
+
+		spin_lock_irqsave(&master->stats_spinlock, flags);
+		SPI_STATS_FIELD_ADD_NOLOCK(master, msg->spi, transfers, 1);
+		SPI_STATS_FIELD_ADD_NOLOCK(master, msg->spi,
+					   bytes, xfer->len);
+		if (xfer->tx_buf)
+			SPI_STATS_FIELD_ADD_NOLOCK(master, msg->spi,
+						   bytes_tx, xfer->len);
+		if (xfer->rx_buf)
+			SPI_STATS_FIELD_ADD_NOLOCK(master, msg->spi,
+					           bytes_rx, xfer->len);
+		SPI_STATS_FIELD_ADD_NOLOCK(master, msg->spi,
+					   bytes_l2histo[l2size], 1);
+		spin_unlock_irqrestore(&master->stats_spinlock, flags);
+
 		if (xfer->tx_buf || xfer->rx_buf) {
 			reinit_completion(&master->xfer_completion);
 
@@ -692,6 +823,8 @@  static int spi_transfer_one_message(struct spi_master *master,
 			if (ret < 0) {
 				dev_err(&msg->spi->dev,
 					"SPI transfer failed: %d\n", ret);
+				SPI_STATS_FIELD_ADD(master, msg->spi,
+						    errors, 1);
 				goto out;
 			}
 
@@ -707,6 +840,8 @@  static int spi_transfer_one_message(struct spi_master *master,
 			if (ms == 0) {
 				dev_err(&msg->spi->dev,
 					"SPI transfer timed out\n");
+				SPI_STATS_FIELD_ADD(master, msg->spi,
+						    timedout, 1);
 				msg->status = -ETIMEDOUT;
 			}
 		} else {
@@ -1405,6 +1540,7 @@  static struct class spi_master_class = {
 	.name		= "spi_master",
 	.owner		= THIS_MODULE,
 	.dev_release	= spi_master_release,
+	.dev_groups     = spi_master_groups,
 };
 
 
@@ -1928,6 +2064,8 @@  static int __spi_async(struct spi_device *spi, struct spi_message *message)
 
 	message->spi = spi;
 
+	SPI_STATS_FIELD_ADD(master, spi, spi_async, 1);
+
 	trace_spi_message_submit(message);
 
 	return master->transfer(spi, message);
@@ -2064,6 +2202,8 @@  static int __spi_sync(struct spi_device *spi, struct spi_message *message,
 	message->context = &done;
 	message->spi = spi;
 
+	SPI_STATS_FIELD_ADD(master, spi, spi_sync, 1);
+
 	if (!bus_locked)
 		mutex_lock(&master->bus_lock_mutex);
 
@@ -2091,8 +2231,11 @@  static int __spi_sync(struct spi_device *spi, struct spi_message *message,
 		/* Push out the messages in the calling context if we
 		 * can.
 		 */
-		if (master->transfer == spi_queued_transfer)
+		if (master->transfer == spi_queued_transfer) {
+			SPI_STATS_FIELD_ADD(master, spi,
+					    spi_sync_immediate, 1);
 			__spi_pump_messages(master, false);
+		}
 
 		wait_for_completion(&done);
 		status = message->status;