Message ID: 20200309183129.2296-2-tvrtko.ursulin@linux.intel.com (mailing list archive)
State: New, archived
Series: Per client engine busyness
Quoting Tvrtko Ursulin (2020-03-09 18:31:18)
> +struct i915_drm_client *
> +i915_drm_client_add(struct i915_drm_clients *clients, struct task_struct *task)
> +{
> +        struct i915_drm_client *client;
> +        int ret;
> +
> +        client = kzalloc(sizeof(*client), GFP_KERNEL);
> +        if (!client)
> +                return ERR_PTR(-ENOMEM);
> +
> +        kref_init(&client->kref);
> +        client->clients = clients;
> +
> +        ret = mutex_lock_interruptible(&clients->lock);
> +        if (ret)
> +                goto err_id;
> +        ret = xa_alloc_cyclic(&clients->xarray, &client->id, client,
> +                              xa_limit_32b, &clients->next_id, GFP_KERNEL);

So what's next_id used for that explains having the over-arching mutex?
-Chris
On 09/03/2020 21:34, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-03-09 18:31:18)
>> +struct i915_drm_client *
>> +i915_drm_client_add(struct i915_drm_clients *clients, struct task_struct *task)
>> +{
>> +        struct i915_drm_client *client;
>> +        int ret;
>> +
>> +        client = kzalloc(sizeof(*client), GFP_KERNEL);
>> +        if (!client)
>> +                return ERR_PTR(-ENOMEM);
>> +
>> +        kref_init(&client->kref);
>> +        client->clients = clients;
>> +
>> +        ret = mutex_lock_interruptible(&clients->lock);
>> +        if (ret)
>> +                goto err_id;
>> +        ret = xa_alloc_cyclic(&clients->xarray, &client->id, client,
>> +                              xa_limit_32b, &clients->next_id, GFP_KERNEL);
>
> So what's next_id used for that explains having the over-arching mutex?

It's to give out client ids "cyclically". I apparently misunderstood what
xa_alloc_cyclic is supposed to do - I thought after giving out id 1 it
would give out 2 next, even if 1 was returned to the pool in the
meantime. But it doesn't, so I need to track the start point for the next
search with "next".

I want this to make intel_gpu_top's life easier, so it doesn't have to
deal with id recycling for all practical purposes.

And a peek into the xa implementation told me the internal lock is not
protecting "next". I could stick with one lock and not use the internal
one if I used it on the release path as well.

Regards,

Tvrtko
Quoting Tvrtko Ursulin (2020-03-09 23:26:34)
>
> On 09/03/2020 21:34, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2020-03-09 18:31:18)
> >> +struct i915_drm_client *
> >> +i915_drm_client_add(struct i915_drm_clients *clients, struct task_struct *task)
> >> +{
> >> +        struct i915_drm_client *client;
> >> +        int ret;
> >> +
> >> +        client = kzalloc(sizeof(*client), GFP_KERNEL);
> >> +        if (!client)
> >> +                return ERR_PTR(-ENOMEM);
> >> +
> >> +        kref_init(&client->kref);
> >> +        client->clients = clients;
> >> +
> >> +        ret = mutex_lock_interruptible(&clients->lock);
> >> +        if (ret)
> >> +                goto err_id;
> >> +        ret = xa_alloc_cyclic(&clients->xarray, &client->id, client,
> >> +                              xa_limit_32b, &clients->next_id, GFP_KERNEL);
> >
> > So what's next_id used for that explains having the over-arching mutex?
>
> It's to give out client ids "cyclically" - I apparently misunderstood
> what xa_alloc_cyclic is supposed to do - I thought after giving out id 1
> it would give out 2 next, even if 1 was returned to the pool in the
> meantime. But it doesn't, I need to track the start point for the next
> search with "next".

Ok. A requirement of the API for the external counter.

> I want this to make intel_gpu_top's life easier, so it doesn't have to
> deal with id recycling for all practical purposes.

Fair enough. I only worry about the radix nodes and sparse ids :)

> And a peek into the xa implementation told me the internal lock is not
> protecting "next".

See xa_alloc_cyclic(), seems to cover __xa_alloc_cycle (where *next is
manipulated) under the xa_lock.
-Chris
On 10/03/2020 00:13, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-03-09 23:26:34)
>>
>> On 09/03/2020 21:34, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2020-03-09 18:31:18)
>>>> +struct i915_drm_client *
>>>> +i915_drm_client_add(struct i915_drm_clients *clients, struct task_struct *task)
>>>> +{
>>>> +        struct i915_drm_client *client;
>>>> +        int ret;
>>>> +
>>>> +        client = kzalloc(sizeof(*client), GFP_KERNEL);
>>>> +        if (!client)
>>>> +                return ERR_PTR(-ENOMEM);
>>>> +
>>>> +        kref_init(&client->kref);
>>>> +        client->clients = clients;
>>>> +
>>>> +        ret = mutex_lock_interruptible(&clients->lock);
>>>> +        if (ret)
>>>> +                goto err_id;
>>>> +        ret = xa_alloc_cyclic(&clients->xarray, &client->id, client,
>>>> +                              xa_limit_32b, &clients->next_id, GFP_KERNEL);
>>>
>>> So what's next_id used for that explains having the over-arching mutex?
>>
>> It's to give out client ids "cyclically" - I apparently misunderstood
>> what xa_alloc_cyclic is supposed to do - I thought after giving out id
>> 1 it would give out 2 next, even if 1 was returned to the pool in the
>> meantime. But it doesn't, I need to track the start point for the next
>> search with "next".
>
> Ok. A requirement of the API for the external counter.
>
>> I want this to make intel_gpu_top's life easier, so it doesn't have to
>> deal with id recycling for all practical purposes.
>
> Fair enough. I only worry about the radix nodes and sparse ids :)

I only found in the docs that it should be efficient when the data is
"densely clustered". And since it does appear to be based on a tree-like
structure, I thought a few clusters of ids should be okay. But maybe in
practice we would have more than a few clusters. I guess that could
indeed be the case... hm...

Maybe I could use a list and keep a pointer to the last entry. When the
u32 next wraps I reset to the list head. The downside is that any search
for the next free id potentially has to walk over one used-up cluster.
That may be passable, apart from IGT-type stress tests.

>> And a peek into the xa implementation told me the internal lock is not
>> protecting "next".
>
> See xa_alloc_cyclic(), seems to cover __xa_alloc_cycle (where *next is
> manipulated) under the xa_lock.

Ha, true, not sure how I went past the top level and forgot what's in
there. :)

Regards,

Tvrtko
Quoting Tvrtko Ursulin (2020-03-09 18:31:18)
> +static int
> +__i915_drm_client_register(struct i915_drm_client *client,
> +                           struct task_struct *task)
> +{
> +        struct i915_drm_clients *clients = client->clients;
> +        struct device_attribute *attr;
> +        int ret = -ENOMEM;
> +        char idstr[32];
> +
> +        client->pid = get_task_pid(task, PIDTYPE_PID);
> +
> +        client->name = kstrdup(task->comm, GFP_KERNEL);
> +        if (!client->name)
> +                goto err_name;
> +
> +        if (!clients->root)
> +                return 0; /* intel_fbdev_init registers a client before sysfs */
> +
> +        snprintf(idstr, sizeof(idstr), "%u", client->id);
> +        client->root = kobject_create_and_add(idstr, clients->root);
> +        if (!client->root)
> +                goto err_client;
> +
> +        attr = &client->attr.name;
> +        sysfs_attr_init(&attr->attr);
> +        attr->attr.name = "name";
> +        attr->attr.mode = 0444;
> +        attr->show = show_client_name;
> +
> +        ret = sysfs_create_file(client->root, (struct attribute *)attr);
> +        if (ret)
> +                goto err_attr;
> +
> +        attr = &client->attr.pid;
> +        sysfs_attr_init(&attr->attr);
> +        attr->attr.name = "pid";
> +        attr->attr.mode = 0444;
> +        attr->show = show_client_pid;
> +
> +        ret = sysfs_create_file(client->root, (struct attribute *)attr);
> +        if (ret)
> +                goto err_attr;

How do we think we will extend this (e.g. for client/1/(trace,debug))?

i915_drm_client_add_attr() ?

Or should we put all the attr here and make them known a priori?

I think I prefer i915_drm_client_add_attr, but that will also require a
notification chain? And that smells like overengineering.

At any rate we have 2 other definite users around the corner for the
client sysfs, so we should look at what API suits us.
-Chris
On 10/03/2020 11:41, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-03-09 18:31:18)
>> +static int
>> +__i915_drm_client_register(struct i915_drm_client *client,
>> +                           struct task_struct *task)
>> +{
>> +        struct i915_drm_clients *clients = client->clients;
>> +        struct device_attribute *attr;
>> +        int ret = -ENOMEM;
>> +        char idstr[32];
>> +
>> +        client->pid = get_task_pid(task, PIDTYPE_PID);
>> +
>> +        client->name = kstrdup(task->comm, GFP_KERNEL);
>> +        if (!client->name)
>> +                goto err_name;
>> +
>> +        if (!clients->root)
>> +                return 0; /* intel_fbdev_init registers a client before sysfs */
>> +
>> +        snprintf(idstr, sizeof(idstr), "%u", client->id);
>> +        client->root = kobject_create_and_add(idstr, clients->root);
>> +        if (!client->root)
>> +                goto err_client;
>> +
>> +        attr = &client->attr.name;
>> +        sysfs_attr_init(&attr->attr);
>> +        attr->attr.name = "name";
>> +        attr->attr.mode = 0444;
>> +        attr->show = show_client_name;
>> +
>> +        ret = sysfs_create_file(client->root, (struct attribute *)attr);
>> +        if (ret)
>> +                goto err_attr;
>> +
>> +        attr = &client->attr.pid;
>> +        sysfs_attr_init(&attr->attr);
>> +        attr->attr.name = "pid";
>> +        attr->attr.mode = 0444;
>> +        attr->show = show_client_pid;
>> +
>> +        ret = sysfs_create_file(client->root, (struct attribute *)attr);
>> +        if (ret)
>> +                goto err_attr;
>
> How do we think we will extend this (e.g. for client/1/(trace,debug))?
>
> i915_drm_client_add_attr() ?
>
> Or should we put all the attr here and make them known a priori?
>
> I think I prefer i915_drm_client_add_attr, but that will also require a
> notification chain? And that smells like overengineering.
>
> At any rate we have 2 other definite users around the corner for the
> client sysfs, so we should look at what API suits us.

It sounds acceptable to me to just call their setup from here.
__i915_drm_client_register sounds like a clear enough place.

We could potentially just split the function into a "client core" part
and "add-on users" for better readability:

__i915_drm_client_register
{
	...register_client();
	...register_client_busy(client, ...);
	...register_client_xxx(client, ...);
}

Regards,

Tvrtko
Quoting Tvrtko Ursulin (2020-03-09 18:31:18)
> +struct i915_drm_clients {
> +        struct mutex lock;
> +        struct xarray xarray;
> +        u32 next_id;
> +
> +        struct kobject *root;
> +};
> +
> +struct i915_drm_client {
> +        struct kref kref;
> +
> +        struct rcu_head rcu;
> +
> +        unsigned int id;
> +        struct pid *pid;
> +        char *name;
> +        bool closed;

After spending a couple of days with kcsan, I can predict you will want
to mark this up with WRITE_ONCE/READ_ONCE or switch to set_bit/test_bit.

I haven't spotted anything else to complain about, and you already
suggested splitting out the attr setup :)
-Chris
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 9f887a86e555..c6fc0f258ce3 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -36,7 +36,8 @@ subdir-ccflags-y += -I$(srctree)/$(src)
 # Please keep these build lists sorted!
 
 # core driver code
-i915-y += i915_drv.o \
+i915-y += i915_drm_client.o \
+	  i915_drv.o \
 	  i915_irq.o \
 	  i915_getparam.o \
 	  i915_params.o \
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
new file mode 100644
index 000000000000..fbba9667aa7e
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -0,0 +1,155 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+#include "i915_drm_client.h"
+#include "i915_gem.h"
+#include "i915_utils.h"
+
+static ssize_t
+show_client_name(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct i915_drm_client *client =
+		container_of(attr, typeof(*client), attr.name);
+
+	return snprintf(buf, PAGE_SIZE,
+			client->closed ? "<%s>" : "%s",
+			client->name);
+}
+
+static ssize_t
+show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct i915_drm_client *client =
+		container_of(attr, typeof(*client), attr.pid);
+
+	return snprintf(buf, PAGE_SIZE,
+			client->closed ? "<%u>" : "%u",
+			pid_nr(client->pid));
+}
+
+static int
+__i915_drm_client_register(struct i915_drm_client *client,
+			   struct task_struct *task)
+{
+	struct i915_drm_clients *clients = client->clients;
+	struct device_attribute *attr;
+	int ret = -ENOMEM;
+	char idstr[32];
+
+	client->pid = get_task_pid(task, PIDTYPE_PID);
+
+	client->name = kstrdup(task->comm, GFP_KERNEL);
+	if (!client->name)
+		goto err_name;
+
+	if (!clients->root)
+		return 0; /* intel_fbdev_init registers a client before sysfs */
+
+	snprintf(idstr, sizeof(idstr), "%u", client->id);
+	client->root = kobject_create_and_add(idstr, clients->root);
+	if (!client->root)
+		goto err_client;
+
+	attr = &client->attr.name;
+	sysfs_attr_init(&attr->attr);
+	attr->attr.name = "name";
+	attr->attr.mode = 0444;
+	attr->show = show_client_name;
+
+	ret = sysfs_create_file(client->root, (struct attribute *)attr);
+	if (ret)
+		goto err_attr;
+
+	attr = &client->attr.pid;
+	sysfs_attr_init(&attr->attr);
+	attr->attr.name = "pid";
+	attr->attr.mode = 0444;
+	attr->show = show_client_pid;
+
+	ret = sysfs_create_file(client->root, (struct attribute *)attr);
+	if (ret)
+		goto err_attr;
+
+	return 0;
+
+err_attr:
+	kobject_put(client->root);
+err_client:
+	kfree(client->name);
+err_name:
+	put_pid(client->pid);
+
+	return ret;
+}
+
+static void
+__i915_drm_client_unregister(struct i915_drm_client *client)
+{
+	put_pid(fetch_and_zero(&client->pid));
+	kfree(fetch_and_zero(&client->name));
+
+	if (!client->root)
+		return; /* fbdev client or error during drm open */
+
+	kobject_put(fetch_and_zero(&client->root));
+}
+
+struct i915_drm_client *
+i915_drm_client_add(struct i915_drm_clients *clients, struct task_struct *task)
+{
+	struct i915_drm_client *client;
+	int ret;
+
+	client = kzalloc(sizeof(*client), GFP_KERNEL);
+	if (!client)
+		return ERR_PTR(-ENOMEM);
+
+	kref_init(&client->kref);
+	client->clients = clients;
+
+	ret = mutex_lock_interruptible(&clients->lock);
+	if (ret)
+		goto err_id;
+	ret = xa_alloc_cyclic(&clients->xarray, &client->id, client,
+			      xa_limit_32b, &clients->next_id, GFP_KERNEL);
+	mutex_unlock(&clients->lock);
+	if (ret)
+		goto err_id;
+
+	ret = __i915_drm_client_register(client, task);
+	if (ret)
+		goto err_register;
+
+	return client;
+
+err_register:
+	xa_erase(&clients->xarray, client->id);
+err_id:
+	kfree(client);
+
+	return ERR_PTR(ret);
+}
+
+void __i915_drm_client_free(struct kref *kref)
+{
+	struct i915_drm_client *client =
+		container_of(kref, typeof(*client), kref);
+
+	__i915_drm_client_unregister(client);
+	xa_erase(&client->clients->xarray, client->id);
+	kfree_rcu(client, rcu);
+}
+
+void i915_drm_client_close(struct i915_drm_client *client)
+{
+	GEM_BUG_ON(client->closed);
+	client->closed = true;
+	i915_drm_client_put(client);
+}
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
new file mode 100644
index 000000000000..fb5979ec92d7
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -0,0 +1,64 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef __I915_DRM_CLIENT_H__
+#define __I915_DRM_CLIENT_H__
+
+#include <linux/device.h>
+#include <linux/kobject.h>
+#include <linux/kref.h>
+#include <linux/pid.h>
+#include <linux/rcupdate.h>
+#include <linux/sched.h>
+#include <linux/xarray.h>
+
+struct i915_drm_clients {
+	struct mutex lock;
+	struct xarray xarray;
+	u32 next_id;
+
+	struct kobject *root;
+};
+
+struct i915_drm_client {
+	struct kref kref;
+
+	struct rcu_head rcu;
+
+	unsigned int id;
+	struct pid *pid;
+	char *name;
+	bool closed;
+
+	struct i915_drm_clients *clients;
+
+	struct kobject *root;
+	struct {
+		struct device_attribute pid;
+		struct device_attribute name;
+	} attr;
+};
+
+static inline struct i915_drm_client *
+i915_drm_client_get(struct i915_drm_client *client)
+{
+	kref_get(&client->kref);
+	return client;
+}
+
+void __i915_drm_client_free(struct kref *kref);
+
+static inline void
+i915_drm_client_put(struct i915_drm_client *client)
+{
+	kref_put(&client->kref, __i915_drm_client_free);
+}
+
+void i915_drm_client_close(struct i915_drm_client *client);
+
+struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients,
+					    struct task_struct *task);
+
+#endif /* !__I915_DRM_CLIENT_H__ */
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c081f4ec87df..2eda21d730eb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -91,6 +91,7 @@
 #include "intel_wakeref.h"
 #include "intel_wopcm.h"
 
+#include "i915_drm_client.h"
 #include "i915_gem.h"
 #include "i915_gem_fence_reg.h"
 #include "i915_gem_gtt.h"
@@ -220,6 +221,8 @@ struct drm_i915_file_private {
 	/** ban_score: Accumulated score of all ctx bans and fast hangs. */
 	atomic_t ban_score;
 	unsigned long hang_timestamp;
+
+	struct i915_drm_client *client;
 };
 
 /* Interface history:
@@ -1193,6 +1196,8 @@ struct drm_i915_private {
 
 	struct i915_pmu pmu;
 
+	struct i915_drm_clients clients;
+
 	struct i915_hdcp_comp_master *hdcp_master;
 	bool hdcp_comp_added;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ca5420012a22..ec8734eb54eb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1218,12 +1218,16 @@ static void i915_gem_init__mm(struct drm_i915_private *i915)
 	i915_gem_init__objects(i915);
 }
 
-void i915_gem_init_early(struct drm_i915_private *dev_priv)
+void i915_gem_init_early(struct drm_i915_private *i915)
 {
-	i915_gem_init__mm(dev_priv);
-	i915_gem_init__contexts(dev_priv);
+	i915_gem_init__mm(i915);
+	i915_gem_init__contexts(i915);
 
-	spin_lock_init(&dev_priv->fb_tracking.lock);
+	spin_lock_init(&i915->fb_tracking.lock);
+
+	mutex_init(&i915->clients.lock);
+	i915->clients.next_id = 0;
+	xa_init_flags(&i915->clients.xarray, XA_FLAGS_ALLOC);
 }
 
 void i915_gem_cleanup_early(struct drm_i915_private *dev_priv)
@@ -1232,6 +1236,8 @@ void i915_gem_cleanup_early(struct drm_i915_private *dev_priv)
 	GEM_BUG_ON(!llist_empty(&dev_priv->mm.free_list));
 	GEM_BUG_ON(atomic_read(&dev_priv->mm.free_count));
 	drm_WARN_ON(&dev_priv->drm, dev_priv->mm.shrink_count);
+	drm_WARN_ON(&dev_priv->drm, !xa_empty(&dev_priv->clients.xarray));
+	xa_destroy(&dev_priv->clients.xarray);
 }
 
 int i915_gem_freeze(struct drm_i915_private *dev_priv)
@@ -1286,6 +1292,8 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct i915_request *request;
 
+	i915_drm_client_close(file_priv->client);
+
 	/* Clean up our request list when the client is going away, so that
 	 * later retire_requests won't dereference our soon-to-be-gone
 	 * file_priv.
@@ -1299,17 +1307,25 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv;
-	int ret;
+	struct i915_drm_client *client;
+	int ret = -ENOMEM;
 
 	DRM_DEBUG("\n");
 
 	file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
 	if (!file_priv)
-		return -ENOMEM;
+		goto err_alloc;
+
+	client = i915_drm_client_add(&i915->clients, current);
+	if (IS_ERR(client)) {
+		ret = PTR_ERR(client);
+		goto err_client;
+	}
 
 	file->driver_priv = file_priv;
 	file_priv->dev_priv = i915;
 	file_priv->file = file;
+	file_priv->client = client;
 
 	spin_lock_init(&file_priv->mm.lock);
 	INIT_LIST_HEAD(&file_priv->mm.request_list);
@@ -1319,8 +1335,15 @@ int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file)
 
 	ret = i915_gem_context_open(i915, file);
 	if (ret)
-		kfree(file_priv);
+		goto err_context;
+
+	return 0;
 
+err_context:
+	i915_drm_client_close(client);
+err_client:
+	kfree(file_priv);
+err_alloc:
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 45d32ef42787..b7d4a6d2dd5c 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -560,6 +560,11 @@ void i915_setup_sysfs(struct drm_i915_private *dev_priv)
 	struct device *kdev = dev_priv->drm.primary->kdev;
 	int ret;
 
+	dev_priv->clients.root =
+		kobject_create_and_add("clients", &kdev->kobj);
+	if (!dev_priv->clients.root)
+		DRM_ERROR("Per-client sysfs setup failed\n");
+
 #ifdef CONFIG_PM
 	if (HAS_RC6(dev_priv)) {
 		ret = sysfs_merge_group(&kdev->kobj,
@@ -627,4 +632,7 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
 	sysfs_unmerge_group(&kdev->kobj, &rc6_attr_group);
 	sysfs_unmerge_group(&kdev->kobj, &rc6p_attr_group);
 #endif
+
+	if (dev_priv->clients.root)
+		kobject_put(dev_priv->clients.root);
 }