diff mbox

[RFC,v2,1/7] cgroup: Allow drivers to store data associated with a cgroup

Message ID 20180201195315.4956-2-matthew.d.roper@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Matt Roper Feb. 1, 2018, 7:53 p.m. UTC
There are cases where drivers need to adjust behavior or control
device-specific resources according to the type of clients (processes)
submitting requests.  Linux cgroups are a natural fit for this type of
resource control and policy management, so we need a way for drivers to
associate their own driver-specific and device-specific data with
individual cgroups.

This is different than the cgroup controller support that exists today
in several important ways:
 * Drivers may be built as modules (and unloaded/reloaded) which is not
   something cgroup controllers support today.
 * Drivers may wish to provide their own interface to allow userspace to
   adjust driver-specific settings (e.g., via a driver ioctl rather than
   via the kernfs filesystem).
 * A single driver may be managing multiple devices and wish to maintain
   different driver-specific cgroup data for each.

To use this mechanism, drivers should call cgroup_driver_init() to
register themselves and provide provide handler functions for allocating
driver-specific data structures; this call will return a handle that can
be used to lookup the driver-specific data associated with the device.
Drivers managing multiple devices that wish to track separate data for
each device may register themselves multiple times (e.g., a graphics
driver might call cgroup_driver_init() for each drm_device it manages).

At runtime, drivers may call cgroup_driver_get_data() to fetch the
driver-specific data associated with a cgroup.  The driver-specific data
(which should be a driver-defined subclass of 'struct
cgroup_driver_data') will be allocated if one doesn't already exist.
Management of driver-specific data by the cgroups framework is protected
by cgroup_mutex, but drivers are responsible for performing their own
synchronization on the per-cgroup data they receive, if necessary.  Note
that driver-specific data for a cgroup will only be allocated if/when
the driver first requests data for that cgroup.  The driver data will
also be automatically destroyed if the cgroup it belongs to is removed.

Finally, drivers should cgroup_driver_release() on driver unload to
destroy all of their driver-specific data.

Note that technically these interfaces aren't restricted to drivers
(other non-driver parts of the kernel could make use of them as well).
I expect drivers to be the primary consumers of this interface and
couldn't think of a more appropriate generic name (the term "subsystem"
would probably be more accurate, but that's already used by cgroup
controllers).

Cc: Tejun Heo <tj@kernel.org>
Cc: cgroups@vger.kernel.org
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 include/linux/cgroup-defs.h   |  37 ++++++++++++
 include/linux/cgroup.h        |  27 +++++++++
 init/Kconfig                  |   4 ++
 kernel/cgroup/Makefile        |   1 +
 kernel/cgroup/cgroup.c        |   1 +
 kernel/cgroup/cgroup_driver.c | 130 ++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 200 insertions(+)
 create mode 100644 kernel/cgroup/cgroup_driver.c

Comments

Tejun Heo Feb. 7, 2018, 10:11 p.m. UTC | #1
Hello,

On Thu, Feb 01, 2018 at 11:53:09AM -0800, Matt Roper wrote:
>  * Drivers may be built as modules (and unloaded/reloaded) which is not
>    something cgroup controllers support today.

As discussed in the other subthread, this shouldn't be a concern.

>  * Drivers may wish to provide their own interface to allow userspace to
>    adjust driver-specific settings (e.g., via a driver ioctl rather than
>    via the kernfs filesystem).
>  * A single driver may be managing multiple devices and wish to maintain
>    different driver-specific cgroup data for each.

If you look at io and rdma controllers, they already do this.

> Note that technically these interfaces aren't restricted to drivers
> (other non-driver parts of the kernel could make use of them as well).
> I expect drivers to be the primary consumers of this interface and
> couldn't think of a more appropriate generic name (the term "subsystem"
> would probably be more accurate, but that's already used by cgroup
> controllers).

Let's please not do "driver", it's really confusing.  Just coming up
with a made-up word would be fine as long as the connection can be
made and the word is easily identifiable.  e.g. cgroup cdata / pdata for
cgroup custom / priv data.

> +/*
> + * Driver-specific cgroup data.  Drivers should subclass this structure with
> + * their own fields for data that should be stored alongside individual
> + * cgroups.
> + */
> +struct cgroup_driver_data {
> +	/* Driver this data structure is associated with */
> +	struct cgroup_driver *drv;
> +
> +	/* Node in cgroup's data hashtable */
> +	struct hlist_node cgroupnode;
> +
> +	/* Node in driver's data list; used to cleanup on driver unload */
> +	struct list_head drivernode;
> +};
...
> +struct cgroup_driver {
> +	/* Functions this driver uses to manage its data */
> +	struct cgroup_driver_funcs *funcs;
> +
> +	/*
> +	 * List of driver-specific data structures that need to be cleaned up
> +	 * if driver is unloaded.
> +	 */
> +	struct list_head datalist;
> +};

It generally looks great but can we do something like the following in
terms of interface?

  struct cgroup_cdata {
	  const void *key;
	  void (*free)(struct cgroup_cdata *cdata);
	  /* whatever other necessary fields */
	  char data[];
  };

  int cgroup_cdata_install(struct cgroup *cgrp, struct cgroup_cdata *cdata);
  struct cgroup_cdata *cgroup_cdata_lookup(struct cgroup *cgrp, const void *key);
  int cgroup_cdata_free(struct cgroup *cgrp, const void *key);
  /* free is also automatically called when the cgroup is released */

And please use a separate lock or mutex for managing them.

Thanks.
diff mbox

Patch

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 8b7fd8eeccee..5728e3afc95f 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -8,6 +8,7 @@ 
 #ifndef _LINUX_CGROUP_DEFS_H
 #define _LINUX_CGROUP_DEFS_H
 
+#include <linux/hashtable.h>
 #include <linux/limits.h>
 #include <linux/list.h>
 #include <linux/idr.h>
@@ -27,6 +28,7 @@  struct cgroup;
 struct cgroup_root;
 struct cgroup_subsys;
 struct cgroup_taskset;
+struct cgroup_driver;
 struct kernfs_node;
 struct kernfs_ops;
 struct kernfs_open_file;
@@ -307,6 +309,22 @@  struct cgroup_stat {
 	struct prev_cputime prev_cputime;
 };
 
+/*
+ * Driver-specific cgroup data.  Drivers should subclass this structure with
+ * their own fields for data that should be stored alongside individual
+ * cgroups.
+ */
+struct cgroup_driver_data {
+	/* Driver this data structure is associated with */
+	struct cgroup_driver *drv;
+
+	/* Node in cgroup's data hashtable */
+	struct hlist_node cgroupnode;
+
+	/* Node in driver's data list; used to cleanup on driver unload */
+	struct list_head drivernode;
+};
+
 struct cgroup {
 	/* self css with NULL ->ss, points back to this cgroup */
 	struct cgroup_subsys_state self;
@@ -427,6 +445,12 @@  struct cgroup {
 	/* used to store eBPF programs */
 	struct cgroup_bpf bpf;
 
+	/*
+	 * list of cgroup_driver_data structures; used to manage
+	 * driver-specific data associated with individual cgroups
+	 */
+	DECLARE_HASHTABLE(driver_data, 4);
+
 	/* ids of the ancestors at each level including self */
 	int ancestor_ids[];
 };
@@ -662,6 +686,19 @@  struct cgroup_subsys {
 	unsigned int depends_on;
 };
 
+/* Function table for handling driver-specific cgroup data */
+struct cgroup_driver_funcs {
+	/* Allocates driver-specific data for a cgroup */
+	struct cgroup_driver_data *(*alloc_data)(struct cgroup_driver *drv);
+
+	/*
+	 * Frees a driver-specific datastructure.
+	 *
+	 * This function is optional; if NULL, data will be kvfree()'d.
+	 */
+	void (*free_data)(struct cgroup_driver_data *data);
+};
+
 extern struct percpu_rw_semaphore cgroup_threadgroup_rwsem;
 
 /**
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 473e0c0abb86..0ba1374122c7 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -833,4 +833,31 @@  static inline void put_cgroup_ns(struct cgroup_namespace *ns)
 		free_cgroup_ns(ns);
 }
 
+struct cgroup_driver_funcs;
+
+#ifdef CONFIG_CGROUP_DRIVER
+
+struct cgroup_driver *cgroup_driver_init(struct cgroup_driver_funcs *funcs);
+void cgroup_driver_release(struct cgroup_driver *drv);
+struct cgroup_driver_data * cgroup_driver_get_data(struct cgroup_driver *drv,
+						   struct cgroup *cgrp,
+						   bool *is_new);
+
+#else /* !CONFIG_CGROUP_DRIVER */
+
+static inline struct cgroup_driver *
+cgroup_driver_init(struct cgroup_driver_funcs *funcs) {
+	return NULL;
+}
+static inline void cgroup_driver_release(struct cgroup_driver *drv) {}
+static inline struct cgroup_driver_data *
+cgroup_driver_get_data(struct cgroup_driver *drv,
+		       struct cgroup *cgrp,
+		       bool *is_new)
+{
+	return ERR_PTR(-EINVAL);
+}
+
+#endif /* !CONFIG_CGROUP_DRIVER */
+
 #endif /* _LINUX_CGROUP_H */
diff --git a/init/Kconfig b/init/Kconfig
index a9a2e2c86671..5f942ed8d248 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -873,6 +873,10 @@  config SOCK_CGROUP_DATA
 	bool
 	default n
 
+config CGROUP_DRIVER
+	bool
+	default n
+
 endif # CGROUPS
 
 menuconfig NAMESPACES
diff --git a/kernel/cgroup/Makefile b/kernel/cgroup/Makefile
index 2be89a003185..669ef66da3a2 100644
--- a/kernel/cgroup/Makefile
+++ b/kernel/cgroup/Makefile
@@ -6,3 +6,4 @@  obj-$(CONFIG_CGROUP_PIDS) += pids.o
 obj-$(CONFIG_CGROUP_RDMA) += rdma.o
 obj-$(CONFIG_CPUSETS) += cpuset.o
 obj-$(CONFIG_CGROUP_DEBUG) += debug.o
+obj-$(CONFIG_CGROUP_DRIVER) += cgroup_driver.o
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 7e4c44538119..b926713a1860 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1838,6 +1838,7 @@  static void init_cgroup_housekeeping(struct cgroup *cgrp)
 	INIT_LIST_HEAD(&cgrp->self.children);
 	INIT_LIST_HEAD(&cgrp->cset_links);
 	INIT_LIST_HEAD(&cgrp->pidlists);
+	hash_init(cgrp->driver_data);
 	mutex_init(&cgrp->pidlist_mutex);
 	cgrp->self.cgroup = cgrp;
 	cgrp->self.flags |= CSS_ONLINE;
diff --git a/kernel/cgroup/cgroup_driver.c b/kernel/cgroup/cgroup_driver.c
new file mode 100644
index 000000000000..0d893395dc7b
--- /dev/null
+++ b/kernel/cgroup/cgroup_driver.c
@@ -0,0 +1,130 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include "cgroup-internal.h"
+
+#include <linux/hashtable.h>
+#include <linux/mm.h>
+
+/*
+ * General data structure returned by cgroup_driver_init() and used as a
+ * hashtable key to lookup driver-specific data.
+ */
+struct cgroup_driver {
+	/* Functions this driver uses to manage its data */
+	struct cgroup_driver_funcs *funcs;
+
+	/*
+	 * List of driver-specific data structures that need to be cleaned up
+	 * if driver is unloaded.
+	 */
+	struct list_head datalist;
+};
+
+/**
+ * cgroup_driver_init - initialize cgroups driver-specific data management
+ * @funcs: driver-specific data management functions
+ *
+ * Drivers that wish to store driver-specific data alongside individual
+ * cgroups should call this and provide a function table of driver-specific
+ * data operations.
+ *
+ * RETURNS:
+ * An instance of 'struct cgroup_driver' that will be used to help manage
+ * data storage for the invoking driver.  If an error occurs, a negative
+ * error code will be returned.  If CONFIG_CGROUP_DRIVER is not set, NULL
+ * will be returned.
+ */
+struct cgroup_driver *
+cgroup_driver_init(struct cgroup_driver_funcs *funcs)
+{
+	struct cgroup_driver *drv;
+
+	drv = kzalloc(sizeof *drv, GFP_KERNEL);
+	if (!drv)
+		return ERR_PTR(-ENOMEM);
+
+	drv->funcs = funcs;
+	INIT_LIST_HEAD(&drv->datalist);
+
+	return drv;
+}
+EXPORT_SYMBOL(cgroup_driver_init);
+
+/**
+ * cgroup_driver_release - release all driver-specific data for a driver
+ * @drv: driver to release data for
+ *
+ * Drivers storing their own data alongside cgroups should call this function
+ * when unloaded to ensure all driver-specific data is released.
+ */
+void
+cgroup_driver_release(struct cgroup_driver *drv)
+{
+	struct cgroup_driver_data *data, *tmp;
+
+	mutex_lock(&cgroup_mutex);
+	list_for_each_entry_safe(data, tmp, &drv->datalist, drivernode) {
+		hlist_del(&data->cgroupnode);
+		list_del(&data->drivernode);
+		if (drv->funcs && drv->funcs->free_data)
+			drv->funcs->free_data(data);
+		else
+			kvfree(data);
+	}
+	mutex_unlock(&cgroup_mutex);
+
+	kfree(drv);
+}
+EXPORT_SYMBOL(cgroup_driver_release);
+
+/**
+ * cgroup_driver_get_data - retrieve/allocate driver-specific data for a cgroup
+ * @drv: driver wishing to fetch data
+ * @cgrp: cgroup to fetch data for
+ * @is_new: will be set to true if a new structure is allocated
+ *
+ * Fetches the driver-specific data structure associated with a cgroup, if one
+ * has previously been set.  If no driver data has been associated with this
+ * cgroup, a new driver-specific data structure is allocated and returned.
+ *
+ * RETURNS:
+ * The driver data previously associated with this cgroup, or a fresh data
+ * structure allocated via drv->funcs->alloc_data() if no data has previously
+ * been associated.  On error, a negative error code is returned.
+ */
+struct cgroup_driver_data *
+cgroup_driver_get_data(struct cgroup_driver *drv,
+		       struct cgroup *cgrp,
+		       bool *is_new)
+{
+	struct cgroup_driver_data *data;
+
+	/* We only support driver-specific data on the cgroup-v2 hierarchy */
+	if (!cgroup_on_dfl(cgrp))
+		return ERR_PTR(-EINVAL);
+
+	mutex_lock(&cgroup_mutex);
+
+	if (is_new)
+		*is_new = false;
+	hash_for_each_possible(cgrp->driver_data, data, cgroupnode,
+			       (unsigned long)drv)
+		if (data->drv == drv)
+			goto out;
+
+	/* First time for this cgroup; alloc and store new data */
+	data = drv->funcs->alloc_data(drv);
+	if (!IS_ERR(data)) {
+		data->drv = drv;
+		hash_add(cgrp->driver_data, &data->cgroupnode,
+			 (unsigned long)drv);
+		list_add(&data->drivernode, &drv->datalist);
+		if (is_new)
+			*is_new = true;
+	}
+
+out:
+	mutex_unlock(&cgroup_mutex);
+	return data;
+}
+EXPORT_SYMBOL(cgroup_driver_get_data);