[RFC,i-g-t,1/1] intel-gpu-top: Support for client stats

Message ID 20191025142410.18465-2-tvrtko.ursulin@linux.intel.com
State New
Series
  • Per client engine busyness

Commit Message

Tvrtko Ursulin Oct. 25, 2019, 2:24 p.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Adds support for per-client engine busyness stats i915 exports in sysfs
and produces output like the below:

==========================================================================
intel-gpu-top -  935/ 935 MHz;    0% RC6; 14.73 Watts;     1097 irqs/s

      IMC reads:     1401 MiB/s
     IMC writes:        4 MiB/s

          ENGINE      BUSY                                 MI_SEMA MI_WAIT
     Render/3D/0   63.73% |███████████████████           |      3%      0%
       Blitter/0    9.53% |██▊                           |      6%      0%
         Video/0   39.32% |███████████▊                  |     16%      0%
         Video/1   15.62% |████▋                         |      0%      0%
  VideoEnhance/0    0.00% |                              |      0%      0%

  PID            NAME     RCS          BCS          VCS         VECS
 4084        gem_wsim |█████▌     ||█          ||           ||           |
 4086        gem_wsim |█▌         ||           ||███        ||           |
==========================================================================

Apart from the existing physical engine utilization it now also shows
utilization per client and per engine class.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tools/intel_gpu_top.c | 590 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 584 insertions(+), 6 deletions(-)
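
For reference, the per-client bars appear to be the busy-time delta over one
sampling period, divided by the number of engines in the class. Below is a
minimal sketch of that arithmetic. It assumes the sysfs busy counters are in
nanoseconds, which the patch suggests by dividing by period_us and then by 1e3.
client_busy_pct() is a hypothetical name, not a function in the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: convert a busy-time delta
 * (assumed to be nanoseconds) into a percentage of one sampling period,
 * normalised by the number of engines in the class.
 */
static double client_busy_pct(uint64_t busy_ns_delta, unsigned int period_us,
			      unsigned int num_engines)
{
	/* Avoid dividing by zero before the first sample or engine count. */
	if (!period_us || !num_engines)
		return 0.0;

	/* period_us * 1e3 converts the sampling period to nanoseconds. */
	return (double)busy_ns_delta / ((double)period_us * 1e3) * 100.0 /
	       num_engines;
}
```

With a one second period, a client that kept one of two video engines busy for
half of the period would be shown at a quarter of the class capacity.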

Comments

Chris Wilson Oct. 25, 2019, 3:13 p.m. UTC | #1
Quoting Tvrtko Ursulin (2019-10-25 15:24:10)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Adds support for per-client engine busyness stats i915 exports in sysfs
> and produces output like the below:
> 
> ==========================================================================
> intel-gpu-top -  935/ 935 MHz;    0% RC6; 14.73 Watts;     1097 irqs/s

Could we get "gpu / pkg Watts" pretty please?

Are irq/s interesting with execlists? Originally the idea was to say how
many times clients were sleeping and being woken up. Now we interrupt
to wipe the gpu's nose when it sneezes.

> 
>       IMC reads:     1401 MiB/s
>      IMC writes:        4 MiB/s
> 
>           ENGINE      BUSY                                 MI_SEMA MI_WAIT
>      Render/3D/0   63.73% |███████████████████           |      3%      0%
>        Blitter/0    9.53% |██▊                           |      6%      0%
>          Video/0   39.32% |███████████▊                  |     16%      0%
>          Video/1   15.62% |████▋                         |      0%      0%
>   VideoEnhance/0    0.00% |                              |      0%      0%
> 
>   PID            NAME     RCS          BCS          VCS         VECS
>  4084        gem_wsim |█████▌     ||█          ||           ||           |
>  4086        gem_wsim |█▌         ||           ||███        ||           |
> ==========================================================================
> 
> Apart from the existing physical engine utilization it now also shows
> utilization per client and per engine class.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---

> +#define SYSFS_CLIENTS "/sys/class/drm/card0/clients"

We need to somehow pull the right card.

Nothing shocking here. Where's the intel-gpu-overlay integration? ;)
-Chris

Tvrtko Ursulin Oct. 25, 2019, 3:38 p.m. UTC | #2
On 25/10/2019 16:13, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-10-25 15:24:10)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Adds support for per-client engine busyness stats i915 exports in sysfs
>> and produces output like the below:
>>
>> ==========================================================================
>> intel-gpu-top -  935/ 935 MHz;    0% RC6; 14.73 Watts;     1097 irqs/s
> 
> Could we get "gpu / pkg Watts" pretty please?

Sure, next week or so.

> Are irq/s interesting with execlists? Originally the idea was to say how
> many times clients were sleeping and being woken up. Now we interrupt
> to wipe the gpu's nose when it sneezes.
> 
>>
>>        IMC reads:     1401 MiB/s
>>       IMC writes:        4 MiB/s
>>
>>            ENGINE      BUSY                                 MI_SEMA MI_WAIT
>>       Render/3D/0   63.73% |███████████████████           |      3%      0%
>>         Blitter/0    9.53% |██▊                           |      6%      0%
>>           Video/0   39.32% |███████████▊                  |     16%      0%
>>           Video/1   15.62% |████▋                         |      0%      0%
>>    VideoEnhance/0    0.00% |                              |      0%      0%
>>
>>    PID            NAME     RCS          BCS          VCS         VECS
>>   4084        gem_wsim |█████▌     ||█          ||           ||           |
>>   4086        gem_wsim |█▌         ||           ||███        ||           |
>> ==========================================================================
>>
>> Apart from the existing physical engine utilization it now also shows
>> utilization per client and per engine class.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
> 
>> +#define SYSFS_CLIENTS "/sys/class/drm/card0/clients"
> 
> We need to somehow pull the right card.

Yeah, as I said RFC and reference only. :)

> Nothing shocking here. Where's the intel-gpu-overlay integration? ;)

Maybe intel-gpu-overlay should become an output plugin for intel_gpu_top. :)

Regards,

Tvrtko
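
On the hard-coded card0: one possible direction, purely as a sketch, is to
build the clients path from a card index instead of a fixed string.
clients_path() is a hypothetical helper that is not in the patch, and how the
right index gets chosen (command line, probing for the i915 device) is left
open here.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build the clients directory path for a given DRM
 * card index. Returns 0 on success and -1 if the path does not fit.
 */
static int clients_path(char *buf, size_t len, unsigned int card)
{
	int ret = snprintf(buf, len, "/sys/class/drm/card%u/clients", card);

	/* snprintf returns the untruncated length, so >= len is truncation. */
	if (ret < 0 || (size_t)ret >= len)
		return -1;

	return 0;
}
```

The truncation check uses >= rather than ==, since snprintf reports the length
the full string would have needed.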

Patch

diff --git a/tools/intel_gpu_top.c b/tools/intel_gpu_top.c
index cc8db7c539ed..50e9c153329a 100644
--- a/tools/intel_gpu_top.c
+++ b/tools/intel_gpu_top.c
@@ -659,8 +659,403 @@  static void pmu_sample(struct engines *engines)
 	}
 }
 
+enum client_status {
+	FREE = 0, /* mbz */
+	ALIVE,
+	PROBE
+};
+
+struct clients;
+
+struct client {
+	struct clients *clients;
+
+	enum client_status status;
+	unsigned int id;
+	unsigned int pid;
+	char name[128];
+	unsigned int samples;
+	unsigned long total;
+	struct engines *engines;
+	unsigned long *val;
+	uint64_t *last;
+};
+
+struct engine_class {
+	unsigned int class;
+	const char *name;
+	unsigned int num_engines;
+};
+
+struct clients {
+	unsigned int num_classes;
+	struct engine_class *class;
+
+	unsigned int num_clients;
+	struct client *client;
+};
+
+#define for_each_client(clients, c, tmp) \
+	for ((tmp) = (clients)->num_clients, c = (clients)->client; \
+	     (tmp > 0); (tmp)--, (c)++)
+
+#define SYSFS_ENABLE "/sys/class/drm/card0/clients/enable_stats"
+
+static bool __stats_enabled;
+
+static int __set_stats(bool val)
+{
+	int fd, ret;
+
+	fd = open(SYSFS_ENABLE, O_WRONLY);
+	if (fd < 0)
+		return -errno;
+
+	ret = write(fd, val ? "1" : "0", 2);
+	if (ret < 0)
+		ret = -errno;
+	else
+		ret = ret < 2;
+
+	close(fd);
+
+	return ret;
+}
+
+static void __restore_stats(void)
+{
+	int ret;
+
+	if (__stats_enabled)
+		return;
+
+	ret = __set_stats(false);
+	if (ret)
+		fprintf(stderr, "Failed to disable per-client stats! (%d)\n",
+			ret);
+}
+
+static void __restore_stats_signal(int sig)
+{
+	exit(0);
+}
+
+static int enable_stats(void)
+{
+	int fd, ret;
+
+	fd = open(SYSFS_ENABLE, O_RDONLY);
+	if (fd < 0)
+		return -errno;
+
+	close(fd);
+
+	__stats_enabled = filename_to_u64(SYSFS_ENABLE, 10);
+	if (__stats_enabled)
+		return 0;
+
+	ret = __set_stats(true);
+	if (!ret) {
+		if (atexit(__restore_stats))
+			fprintf(stderr, "Failed to register exit handler!\n");
+
+		if (signal(SIGINT, __restore_stats_signal) == SIG_ERR)
+			fprintf(stderr, "Failed to register signal handler!\n");
+	} else {
+		fprintf(stderr, "Failed to enable per-client stats! (%d)\n",
+			ret);
+	}
+
+	return ret;
+}
+
+static struct clients *init_clients(void)
+{
+	struct clients *clients = calloc(1, sizeof(*clients));
+
+	if (!clients || enable_stats()) {
+		free(clients);
+		return NULL;
+	}
+
+	return clients;
+}
+
+#define SYSFS_CLIENTS "/sys/class/drm/card0/clients"
+
+static uint64_t read_client_busy(unsigned int id, unsigned int class)
+{
+	char buf[256];
+	ssize_t ret;
+
+	ret = snprintf(buf, sizeof(buf),
+		       SYSFS_CLIENTS "/%u/busy/%u",
+		       id, class);
+	assert(ret > 0 && ret < sizeof(buf));
+	if (ret <= 0 || ret >= sizeof(buf))
+		return 0;
+
+	return filename_to_u64(buf, 10);
+}
+
+static struct client *
+find_client(struct clients *clients, enum client_status status, unsigned int id)
+{
+	struct client *c;
+	int tmp;
+
+	for_each_client(clients, c, tmp) {
+		if ((status == FREE && c->status == FREE) ||
+		    (status == c->status && c->id == id))
+			return c;
+	}
+
+	return NULL;
+}
+
+static void update_client(struct client *c, unsigned int pid, char *name)
+{
+	uint64_t val[c->clients->num_classes];
+	unsigned int i;
+
+	if (c->pid != pid)
+		c->pid = pid;
+
+	if (strncmp(c->name, name, sizeof(c->name)))
+		strncpy(c->name, name, sizeof(c->name) - 1);
+
+	for (i = 0; i < c->clients->num_classes; i++)
+		val[i] = read_client_busy(c->id, c->clients->class[i].class);
+
+	c->total = 0;
+
+	for (i = 0; i < c->clients->num_classes; i++) {
+		assert(val[i] >= c->last[i]);
+		c->val[i] = val[i] - c->last[i];
+		c->total += c->val[i];
+		c->last[i] = val[i];
+	}
+
+	c->samples++;
+	c->status = ALIVE;
+}
+
+static int class_cmp(const void *_a, const void *_b)
+{
+	const struct engine_class *a = _a;
+	const struct engine_class *b = _b;
+
+	return a->class - b->class;
+}
+
+static void scan_classes(struct clients *clients, unsigned int id)
+{
+	struct engine_class *classes;
+	unsigned int num, i;
+	struct dirent *dent;
+	char buf[256];
+	int ret;
+	DIR *d;
+
+	ret = snprintf(buf, sizeof(buf), SYSFS_CLIENTS "/%u/busy", id);
+	assert(ret > 0 && ret < sizeof(buf));
+	if (ret <= 0 || ret >= sizeof(buf))
+		return;
+
+	d = opendir(buf);
+	if (!d)
+		return;
+
+restart:
+	rewinddir(d);
+
+	num = 0;
+	while ((dent = readdir(d)) != NULL) {
+		if (dent->d_type != DT_REG)
+			continue;
+
+		num++;
+	}
+
+	rewinddir(d);
+
+	classes = calloc(num, sizeof(*classes));
+	assert(classes);
+
+	i = 0;
+	while ((dent = readdir(d)) != NULL) {
+		if (dent->d_type != DT_REG)
+			continue;
+
+		if (i >= num) {
+			// FIXME: free individual names
+			free(classes);
+			goto restart;
+		}
+
+		classes[i].class = atoi(dent->d_name);
+		classes[i].name = class_short_name(classes[i].class);
+		i++;
+	}
+
+	closedir(d);
+
+	qsort(classes, num, sizeof(*classes), class_cmp);
+
+	clients->num_classes = num;
+	clients->class = classes;
+}
+
+static void
+add_client(struct clients *clients, unsigned int id, unsigned int pid,
+	   char *name)
+{
+	struct client *c;
+
+	assert(!find_client(clients, ALIVE, id));
+
+	c = find_client(clients, FREE, 0);
+	if (!c) {
+		unsigned int idx = clients->num_clients;
+
+		clients->num_clients += (clients->num_clients + 2) / 2;
+		clients->client = realloc(clients->client,
+					  clients->num_clients * sizeof(*c));
+		assert(clients->client);
+
+		c = &clients->client[idx];
+		memset(c, 0, (clients->num_clients - idx) * sizeof(*c));
+	}
+
+	if (!clients->num_classes)
+		scan_classes(clients, id);
+
+	c->id = id;
+	c->clients = clients;
+	c->val = calloc(clients->num_classes, sizeof(*c->val));
+	c->last = calloc(clients->num_classes, sizeof(*c->last));
+	assert(c->val && c->last);
+
+	update_client(c, pid, name);
+}
+
+static void free_client(struct client *c)
+{
+	free(c->val);
+	free(c->last);
+	memset(c, 0, sizeof(*c));
+}
+
+static char *read_client_sysfs(unsigned int id, const char *field)
+{
+	char buf[256];
+	ssize_t ret;
+
+	ret = snprintf(buf, sizeof(buf), SYSFS_CLIENTS "/%u/%s", id, field);
+	assert(ret > 0 && ret < sizeof(buf));
+	if (ret <= 0 || ret >= sizeof(buf))
+		return NULL;
+
+	ret = filename_to_buf(buf, buf, sizeof(buf));
+	assert(ret == 0);
+	if (ret)
+		return NULL;
+
+	return strdup(buf);
+}
+
+static void scan_clients(struct clients *clients)
+{
+	struct dirent *dent;
+	struct client *c;
+	char *pid, *name;
+	unsigned int id;
+	int tmp;
+	DIR *d;
+
+	if (!clients)
+		return;
+
+	for_each_client(clients, c, tmp) {
+		if (c->status == ALIVE)
+			c->status = PROBE;
+	}
+
+	d = opendir(SYSFS_CLIENTS);
+	if (!d)
+		return;
+
+	while ((dent = readdir(d)) != NULL) {
+		if (dent->d_type != DT_DIR)
+			continue;
+		if (!isdigit(dent->d_name[0]))
+			continue;
+
+		id = atoi(dent->d_name);
+
+		name = read_client_sysfs(id, "name");
+		assert(name);
+		if (!name)
+			continue;
+
+		pid = read_client_sysfs(id, "pid");
+		assert(pid);
+		if (!pid) {
+			free(name);
+			continue;
+		}
+
+		c = find_client(clients, PROBE, id);
+		if (c) {
+			/* Known client: refresh its counters. */
+			update_client(c, atoi(pid), name);
+		} else {
+			add_client(clients, id, atoi(pid), name);
+		}
+
+		free(name);
+		free(pid);
+	}
+
+	closedir(d);
+
+	for_each_client(clients, c, tmp) {
+		if (c->status == PROBE)
+			free_client(c);
+	}
+}
+
+static int cmp(const void *_a, const void *_b)
+{
+	const struct client *a = _a;
+	const struct client *b = _b;
+	long tot_a = a->total;
+	long tot_b = b->total;
+
+	tot_a *= a->status == ALIVE && a->samples > 1;
+	tot_b *= b->status == ALIVE && b->samples > 1;
+
+	tot_b -= tot_a;
+
+	if (!tot_b)
+		return (int)b->id - a->id;
+
+	while (tot_b > INT_MAX || tot_b < INT_MIN)
+		tot_b /= 2;
+
+	return tot_b;
+}
+
 static const char *bars[] = { " ", "▏", "▎", "▍", "▌", "▋", "▊", "▉", "█" };
 
+static void n_spaces(const unsigned int n)
+{
+	unsigned int i;
+
+	for (i = 0; i < n; i++)
+		putchar(' ');
+}
+
 static void
 print_percentage_bar(double percent, int max_len)
 {
@@ -674,8 +1069,10 @@  print_percentage_bar(double percent, int max_len)
 	if (i)
 		printf("%s", bars[i]);
 
-	for (i = 0; i < (max_len - 2 - (bar_len + 7) / 8); i++)
-		putchar(' ');
+	bar_len = max_len - 2 - (bar_len + 7) / 8;
+	if (bar_len > max_len)
+		bar_len = max_len;
+	n_spaces(bar_len);
 
 	putchar('|');
 }
@@ -775,6 +1172,18 @@  json_close_struct(void)
 		fflush(stdout);
 }
 
+static void
+__json_add_member(const char *key, const char *val)
+{
+	assert(json_indent_level < ARRAY_SIZE(json_indent));
+
+	fprintf(out, "%s%s\"%s\": \"%s\"",
+		json_struct_members ? ",\n" : "",
+		json_indent[json_indent_level], key, val);
+
+	json_struct_members++;
+}
+
 static unsigned int
 json_add_member(const struct cnt_group *parent, struct cnt_item *item,
 		unsigned int headers)
@@ -1075,8 +1484,6 @@  print_header(struct engines *engines, double t,
 		memmove(&groups[0], &groups[1],
 			sizeof(groups) - sizeof(groups[0]));
 
-	pops->open_struct(NULL);
-
 	*consumed = print_groups(groups);
 
 	if (output_mode == INTERACTIVE) {
@@ -1232,7 +1639,6 @@  print_engines_footer(struct engines *engines, double t,
 		     int lines, int con_w, int con_h)
 {
 	pops->close_struct();
-	pops->close_struct();
 
 	if (output_mode == INTERACTIVE) {
 		if (lines++ < con_h)
@@ -1242,6 +1648,136 @@  print_engines_footer(struct engines *engines, double t,
 	return lines;
 }
 
+static int
+print_clients_header(struct clients *clients, int lines,
+		     int con_w, int con_h, unsigned int *class_w)
+{
+	int len;
+
+	if (output_mode == INTERACTIVE) {
+		if (lines++ >= con_h)
+			return lines;
+
+		printf("\033[7m");
+		len = printf("%5s%16s", "PID", "NAME");
+
+		if (lines++ >= con_h)
+			return lines;
+
+		if (clients->num_classes) {
+			unsigned int i;
+
+			*class_w = (con_w - len) / clients->num_classes;
+
+			for (i = 0; i < clients->num_classes; i++) {
+				unsigned int name_len =
+					strlen(clients->class[i].name);
+				unsigned int pad = (*class_w - name_len) / 2;
+
+				n_spaces(pad);
+				printf("%s", clients->class[i].name);
+				n_spaces(*class_w - pad - name_len);
+				len += pad + name_len +
+				       (*class_w - pad - name_len);
+			}
+		}
+
+		n_spaces(con_w - len);
+		printf("\033[0m\n");
+	} else {
+		if (clients->num_classes)
+			pops->open_struct("clients");
+	}
+
+	return lines;
+}
+
+static void count_engines(struct clients *clients, struct engines *engines)
+{
+	unsigned int i;
+
+	for (i = 0; i < engines->num_engines; i++) {
+		struct engine *engine = engine_ptr(engines, i);
+
+		clients->class[engine->class].num_engines++;
+	}
+}
+
+static int
+print_client(struct client *c, struct engines *engines, double t, int lines,
+	     int con_w, int con_h, unsigned int period_us,
+	     unsigned int *class_w)
+{
+	struct clients *clients = c->clients;
+	unsigned int i;
+
+	if (output_mode == INTERACTIVE) {
+		printf("%5u%16s ", c->pid, c->name);
+
+		for (i = 0; i < clients->num_classes; i++) {
+			double pct;
+
+			if (!clients->class[i].num_engines)
+				count_engines(clients, engines);
+
+			pct = (double)c->val[i] / period_us / 1e3 * 100 /
+			      clients->class[i].num_engines;
+
+			print_percentage_bar(pct, *class_w);
+		}
+
+		putchar('\n');
+	} else if (output_mode == JSON) {
+		char buf[64];
+
+		snprintf(buf, sizeof(buf), "%u", c->id);
+		pops->open_struct(buf);
+
+		__json_add_member("name", c->name);
+
+		snprintf(buf, sizeof(buf), "%u", c->pid);
+		__json_add_member("pid", buf);
+
+		pops->open_struct("engine-classes");
+
+		for (i = 0; i < clients->num_classes; i++) {
+			double pct;
+
+			snprintf(buf, sizeof(buf), "%s",
+				 clients->class[i].name);
+			pops->open_struct(buf);
+
+			pct = (double)c->val[i] / period_us / 1e3 * 100;
+			snprintf(buf, sizeof(buf), "%f", pct);
+			__json_add_member("busy", buf);
+
+			__json_add_member("unit", "%");
+
+			pops->close_struct();
+		}
+
+		pops->close_struct();
+		pops->close_struct();
+	}
+
+	return lines;
+}
+
+static int
+print_clients_footer(struct clients *clients, double t,
+		     int lines, int con_w, int con_h)
+{
+	if (output_mode == INTERACTIVE) {
+		if (lines++ < con_h)
+			printf("\n");
+	} else {
+		if (clients->num_classes)
+			pops->close_struct();
+	}
+
+	return lines;
+}
+
 static bool stop_top;
 
 static void sigint_handler(int  sig)
@@ -1252,6 +1788,7 @@  static void sigint_handler(int  sig)
 int main(int argc, char **argv)
 {
 	unsigned int period_us = DEFAULT_PERIOD_MS * 1000;
+	struct clients *clients = NULL;
 	int con_w = -1, con_h = -1;
 	char *output_path = NULL;
 	struct engines *engines;
@@ -1335,12 +1872,17 @@  int main(int argc, char **argv)
 		return 1;
 	}
 
+	clients = init_clients();
+
 	pmu_sample(engines);
+	scan_clients(clients);
 
 	while (!stop_top) {
 		bool consumed = false;
-		int lines = 0;
+		int j, lines = 0;
+		unsigned int class_w;
 		struct winsize ws;
+		struct client *c;
 		double t;
 
 		/* Update terminal size. */
@@ -1354,10 +1896,18 @@  int main(int argc, char **argv)
 		pmu_sample(engines);
 		t = (double)(engines->ts.cur - engines->ts.prev) / 1e9;
 
+		scan_clients(clients);
+		if (clients) {
+			qsort(clients->client, clients->num_clients,
+			      sizeof(*clients->client), cmp);
+		}
+
 		if (stop_top)
 			break;
 
 		while (!consumed) {
+			pops->open_struct(NULL);
+
 			lines = print_header(engines, t, lines, con_w, con_h,
 					     &consumed);
 
@@ -1376,6 +1926,34 @@  int main(int argc, char **argv)
 
 			lines = print_engines_footer(engines, t, lines, con_w,
 						     con_h);
+
+			if (clients) {
+				lines = print_clients_header(clients, lines,
+							     con_w, con_h,
+							     &class_w);
+
+				for_each_client(clients, c, j) {
+					if (lines++ > con_h)
+						break;
+
+					assert(c->status != PROBE);
+					if (c->status != ALIVE)
+						break;
+
+					if (c->samples < 2)
+						continue;
+
+					lines = print_client(c, engines, t,
+							     lines, con_w,
+							     con_h, period_us,
+							     &class_w);
+				}
+
+				lines = print_clients_footer(clients, t, lines,
+							     con_w, con_h);
+			}
+
+			pops->close_struct();
 		}
 
 		if (stop_top)