
[RFC,v3,04/11] drm, cgroup: Add total GEM buffer allocation limit

Message ID 20190626150522.11618-5-Kenny.Ho@amd.com (mailing list archive)
State New, archived
Series: new cgroup controller for gpu/drm subsystem

Commit Message

Ho, Kenny June 26, 2019, 3:05 p.m. UTC
The drm resources being measured and limited here are GEM buffer
objects.  User applications allocate and free these buffers.  In
addition, a process can allocate a buffer and share it with another
process.  The consumer of a shared buffer can also outlive the
allocator of the buffer.

For the purpose of cgroup accounting and limiting, ownership of the
buffer is deemed to be the cgroup to which the allocating process
belongs.  There is one cgroup limit per drm device.

In order to prevent a buffer from outliving the cgroup that owns it, a
process is prevented from importing buffers that are not owned by the
process' cgroup or one of its ancestors.  In other words, in order for
a buffer to be shared between two cgroups, the buffer must be created
by a common ancestor of the cgroups.

drm.buffer.total.stats
        A read-only flat-keyed file which exists on all cgroups.  Each
        entry is keyed by the drm device's major:minor.

        Total GEM buffer allocation in bytes.

drm.buffer.total.default
        A read-only flat-keyed file which exists on the root cgroup.
        Each entry is keyed by the drm device's major:minor.

        Default limits on the total GEM buffer allocation in bytes.

drm.buffer.total.max
        A read-write flat-keyed file which exists on all cgroups.  Each
        entry is keyed by the drm device's major:minor.

        Per device limits on the total GEM buffer allocation in bytes.
        This is a hard limit.  Attempts to allocate beyond the cgroup
        limit will result in ENOMEM.  Shorthand understood by memparse
        (such as k, m, g) can be used.

        Set allocation limit for /dev/dri/card1 to 1GB
        echo "226:1 1g" > drm.buffer.total.max

        Set allocation limit for /dev/dri/card0 to 512MB
        echo "226:0 512m" > drm.buffer.total.max

Change-Id: I4c249d06d45ec709d6481d4cbe87c5168545c5d0
Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |   4 +
 drivers/gpu/drm/drm_gem.c                  |   8 +
 drivers/gpu/drm/drm_prime.c                |   9 +
 include/drm/drm_cgroup.h                   |  34 ++-
 include/drm/drm_gem.h                      |  11 +
 include/linux/cgroup_drm.h                 |   2 +
 kernel/cgroup/drm.c                        | 321 +++++++++++++++++++++
 7 files changed, 387 insertions(+), 2 deletions(-)

Comments

Daniel Vetter June 26, 2019, 4:05 p.m. UTC | #1
On Wed, Jun 26, 2019 at 11:05:15AM -0400, Kenny Ho wrote:
> The drm resource being measured and limited here is the GEM buffer
> objects.  User applications allocate and free these buffers.  In
> addition, a process can allocate a buffer and share it with another
> process.  The consumer of a shared buffer can also outlive the
> allocator of the buffer.
> 
> For the purpose of cgroup accounting and limiting, ownership of the
> buffer is deemed to be the cgroup for which the allocating process
> belongs to.  There is one cgroup limit per drm device.
> 
> In order to prevent the buffer outliving the cgroup that owns it, a
> process is prevented from importing buffers that are not own by the
> process' cgroup or the ancestors of the process' cgroup.  In other
> words, in order for a buffer to be shared between two cgroups, the
> buffer must be created by the common ancestors of the cgroups.
> 
> drm.buffer.stats
>         A read-only flat-keyed file which exists on all cgroups.  Each
>         entry is keyed by the drm device's major:minor.
> 
>         Total GEM buffer allocation in bytes.
> 
> drm.buffer.default
>         A read-only flat-keyed file which exists on the root cgroup.
>         Each entry is keyed by the drm device's major:minor.
> 
>         Default limits on the total GEM buffer allocation in bytes.

Don't we need a "0 means no limit" semantics here?

> drm.buffer.max
>         A read-write flat-keyed file which exists on all cgroups.  Each
>         entry is keyed by the drm device's major:minor.
> 
>         Per device limits on the total GEM buffer allocation in byte.
>         This is a hard limit.  Attempts in allocating beyond the cgroup
>         limit will result in ENOMEM.  Shorthand understood by memparse
>         (such as k, m, g) can be used.
> 
>         Set allocation limit for /dev/dri/card1 to 1GB
>         echo "226:1 1g" > drm.buffer.total.max
> 
>         Set allocation limit for /dev/dri/card0 to 512MB
>         echo "226:0 512m" > drm.buffer.total.max

I think we need a new drm-cgroup.rst which contains all this
documentation.

With multiple GPUs, do we need an overall GEM bo limit, across all gpus?
For other stuff later on like vram/tt/... and all that it needs to be
per-device, but I think one overall limit could be useful.

> 
> Change-Id: I4c249d06d45ec709d6481d4cbe87c5168545c5d0
> Signed-off-by: Kenny Ho <Kenny.Ho@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |   4 +
>  drivers/gpu/drm/drm_gem.c                  |   8 +
>  drivers/gpu/drm/drm_prime.c                |   9 +
>  include/drm/drm_cgroup.h                   |  34 ++-
>  include/drm/drm_gem.h                      |  11 +
>  include/linux/cgroup_drm.h                 |   2 +
>  kernel/cgroup/drm.c                        | 321 +++++++++++++++++++++
>  7 files changed, 387 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 93b2c5a48a71..b4c078b7ad63 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -34,6 +34,7 @@
>  #include <drm/drmP.h>
>  #include <drm/amdgpu_drm.h>
>  #include <drm/drm_cache.h>
> +#include <drm/drm_cgroup.h>
>  #include "amdgpu.h"
>  #include "amdgpu_trace.h"
>  #include "amdgpu_amdkfd.h"
> @@ -446,6 +447,9 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>  	if (!amdgpu_bo_validate_size(adev, size, bp->domain))
>  		return -ENOMEM;
>  
> +	if (!drmcgrp_bo_can_allocate(current, adev->ddev, size))
> +		return -ENOMEM;

So what happens when you start a lot of threads all at the same time,
allocating gem bo? Also would be nice if we could roll out at least the
accounting part of this cgroup to all GEM drivers.

> +
>  	*bo_ptr = NULL;
>  
>  	acc_size = ttm_bo_dma_acc_size(&adev->mman.bdev, size,
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 6a80db077dc6..e20c1034bf2b 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -37,10 +37,12 @@
>  #include <linux/shmem_fs.h>
>  #include <linux/dma-buf.h>
>  #include <linux/mem_encrypt.h>
> +#include <linux/cgroup_drm.h>
>  #include <drm/drmP.h>
>  #include <drm/drm_vma_manager.h>
>  #include <drm/drm_gem.h>
>  #include <drm/drm_print.h>
> +#include <drm/drm_cgroup.h>
>  #include "drm_internal.h"
>  
>  /** @file drm_gem.c
> @@ -154,6 +156,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
>  	obj->handle_count = 0;
>  	obj->size = size;
>  	drm_vma_node_reset(&obj->vma_node);
> +
> +	obj->drmcgrp = get_drmcgrp(current);
> +	drmcgrp_chg_bo_alloc(obj->drmcgrp, dev, size);
>  }
>  EXPORT_SYMBOL(drm_gem_private_object_init);
>  
> @@ -804,6 +809,9 @@ drm_gem_object_release(struct drm_gem_object *obj)
>  	if (obj->filp)
>  		fput(obj->filp);
>  
> +	drmcgrp_unchg_bo_alloc(obj->drmcgrp, obj->dev, obj->size);
> +	put_drmcgrp(obj->drmcgrp);
> +
>  	drm_gem_free_mmap_offset(obj);
>  }
>  EXPORT_SYMBOL(drm_gem_object_release);
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 231e3f6d5f41..eeb612116810 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -32,6 +32,7 @@
>  #include <drm/drm_prime.h>
>  #include <drm/drm_gem.h>
>  #include <drm/drmP.h>
> +#include <drm/drm_cgroup.h>
>  
>  #include "drm_internal.h"
>  
> @@ -794,6 +795,7 @@ int drm_gem_prime_fd_to_handle(struct drm_device *dev,
>  {
>  	struct dma_buf *dma_buf;
>  	struct drm_gem_object *obj;
> +	struct drmcgrp *drmcgrp = drmcgrp_from(current);
>  	int ret;
>  
>  	dma_buf = dma_buf_get(prime_fd);
> @@ -818,6 +820,13 @@ int drm_gem_prime_fd_to_handle(struct drm_device *dev,
>  		goto out_unlock;
>  	}
>  
> +	/* only allow bo from the same cgroup or its ancestor to be imported */
> +	if (drmcgrp != NULL &&

Quite a serious limitation here ...

> +			!drmcgrp_is_self_or_ancestor(drmcgrp, obj->drmcgrp)) {

Also what happens if you actually share across devices? Then importing in
the 2nd group is suddenly possible, and I think will be double-counted.

What's the underlying technical reason for not allowing sharing across
cgroups?

> +		ret = -EACCES;
> +		goto out_unlock;
> +	}
> +
>  	if (obj->dma_buf) {
>  		WARN_ON(obj->dma_buf != dma_buf);
>  	} else {
> diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
> index ddb9eab64360..8711b7c5f7bf 100644
> --- a/include/drm/drm_cgroup.h
> +++ b/include/drm/drm_cgroup.h
> @@ -4,12 +4,20 @@
>  #ifndef __DRM_CGROUP_H__
>  #define __DRM_CGROUP_H__
>  
> +#include <linux/cgroup_drm.h>
> +
>  #ifdef CONFIG_CGROUP_DRM
>  
>  int drmcgrp_register_device(struct drm_device *device);
> -
>  int drmcgrp_unregister_device(struct drm_device *device);
> -
> +bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
> +		struct drmcgrp *relative);
> +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> +		size_t size);
> +void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> +		size_t size);
> +bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
> +		size_t size);
>  #else
>  static inline int drmcgrp_register_device(struct drm_device *device)
>  {
> @@ -20,5 +28,27 @@ static inline int drmcgrp_unregister_device(struct drm_device *device)
>  {
>  	return 0;
>  }
> +
> +static inline bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
> +		struct drmcgrp *relative)
> +{
> +	return false;
> +}
> +
> +static inline void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp,
> +		struct drm_device *dev,	size_t size)
> +{
> +}
> +
> +static inline void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp,
> +		struct drm_device *dev,	size_t size)
> +{
> +}
> +
> +static inline bool drmcgrp_bo_can_allocate(struct task_struct *task,
> +		struct drm_device *dev,	size_t size)
> +{
> +	return true;
> +}
>  #endif /* CONFIG_CGROUP_DRM */
>  #endif /* __DRM_CGROUP_H__ */
> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> index c95727425284..09d1c69a3f0c 100644
> --- a/include/drm/drm_gem.h
> +++ b/include/drm/drm_gem.h
> @@ -272,6 +272,17 @@ struct drm_gem_object {
>  	 *
>  	 */
>  	const struct drm_gem_object_funcs *funcs;
> +
> +	/**
> +	 * @drmcgrp:
> +	 *
> +	 * DRM cgroup this GEM object belongs to.
> +	 *
> +	 * This is used to track and limit the amount of GEM objects a user
> +	 * can allocate.  Since GEM objects can be shared, this is also used
> +	 * to ensure GEM objects are only shared within the same cgroup.
> +	 */
> +	struct drmcgrp *drmcgrp;
>  };
>  
>  /**
> diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
> index 27497f786c93..efa019666f1c 100644
> --- a/include/linux/cgroup_drm.h
> +++ b/include/linux/cgroup_drm.h
> @@ -15,6 +15,8 @@
>  
>  struct drmcgrp_device_resource {
>  	/* for per device stats */
> +	s64			bo_stats_total_allocated;
> +	s64			bo_limits_total_allocated;
>  };
>  
>  struct drmcgrp {
> diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
> index 7da6e0d93991..cfc1fe74dca3 100644
> --- a/kernel/cgroup/drm.c
> +++ b/kernel/cgroup/drm.c
> @@ -9,6 +9,7 @@
>  #include <linux/cgroup_drm.h>
>  #include <linux/kernel.h>
>  #include <drm/drm_device.h>
> +#include <drm/drm_ioctl.h>
>  #include <drm/drm_cgroup.h>
>  
>  static DEFINE_MUTEX(drmcgrp_mutex);
> @@ -16,6 +17,26 @@ static DEFINE_MUTEX(drmcgrp_mutex);
>  struct drmcgrp_device {
>  	struct drm_device	*dev;
>  	struct mutex		mutex;
> +
> +	s64			bo_limits_total_allocated_default;
> +};
> +
> +#define DRMCG_CTF_PRIV_SIZE 3
> +#define DRMCG_CTF_PRIV_MASK GENMASK((DRMCG_CTF_PRIV_SIZE - 1), 0)
> +#define DRMCG_CTF_PRIV(res_type, f_type)  ((res_type) <<\
> +		DRMCG_CTF_PRIV_SIZE | (f_type))
> +#define DRMCG_CTF_PRIV2RESTYPE(priv) ((priv) >> DRMCG_CTF_PRIV_SIZE)
> +#define DRMCG_CTF_PRIV2FTYPE(priv) ((priv) & DRMCG_CTF_PRIV_MASK)
> +
> +
> +enum drmcgrp_res_type {
> +	DRMCGRP_TYPE_BO_TOTAL,
> +};
> +
> +enum drmcgrp_file_type {
> +	DRMCGRP_FTYPE_STATS,
> +	DRMCGRP_FTYPE_LIMIT,
> +	DRMCGRP_FTYPE_DEFAULT,
>  };
>  
>  /* indexed by drm_minor for access speed */
> @@ -54,6 +75,10 @@ static inline int init_drmcgrp_single(struct drmcgrp *drmcgrp, int minor)
>  	}
>  
>  	/* set defaults here */
> +	if (known_drmcgrp_devs[minor] != NULL) {
> +		ddr->bo_limits_total_allocated =
> +		  known_drmcgrp_devs[minor]->bo_limits_total_allocated_default;
> +	}
>  
>  	return 0;
>  }
> @@ -100,7 +125,225 @@ drmcgrp_css_alloc(struct cgroup_subsys_state *parent_css)
>  	return &drmcgrp->css;
>  }
>  
> +static inline void drmcgrp_print_stats(struct drmcgrp_device_resource *ddr,
> +		struct seq_file *sf, enum drmcgrp_res_type type)
> +{
> +	if (ddr == NULL) {
> +		seq_puts(sf, "\n");
> +		return;
> +	}
> +
> +	switch (type) {
> +	case DRMCGRP_TYPE_BO_TOTAL:
> +		seq_printf(sf, "%lld\n", ddr->bo_stats_total_allocated);
> +		break;
> +	default:
> +		seq_puts(sf, "\n");
> +		break;
> +	}
> +}
> +
> +static inline void drmcgrp_print_limits(struct drmcgrp_device_resource *ddr,
> +		struct seq_file *sf, enum drmcgrp_res_type type)
> +{
> +	if (ddr == NULL) {
> +		seq_puts(sf, "\n");
> +		return;
> +	}
> +
> +	switch (type) {
> +	case DRMCGRP_TYPE_BO_TOTAL:
> +		seq_printf(sf, "%lld\n", ddr->bo_limits_total_allocated);
> +		break;
> +	default:
> +		seq_puts(sf, "\n");
> +		break;
> +	}
> +}
> +
> +static inline void drmcgrp_print_default(struct drmcgrp_device *ddev,
> +		struct seq_file *sf, enum drmcgrp_res_type type)
> +{
> +	if (ddev == NULL) {
> +		seq_puts(sf, "\n");
> +		return;
> +	}
> +
> +	switch (type) {
> +	case DRMCGRP_TYPE_BO_TOTAL:
> +		seq_printf(sf, "%lld\n",
> +				ddev->bo_limits_total_allocated_default);
> +		break;
> +	default:
> +		seq_puts(sf, "\n");
> +		break;
> +	}
> +}
> +
> +int drmcgrp_bo_show(struct seq_file *sf, void *v)
> +{
> +	struct drmcgrp *drmcgrp = css_drmcgrp(seq_css(sf));
> +	struct drmcgrp_device_resource *ddr = NULL;
> +	enum drmcgrp_file_type f_type =
> +		DRMCG_CTF_PRIV2FTYPE(seq_cft(sf)->private);
> +	enum drmcgrp_res_type type =
> +		DRMCG_CTF_PRIV2RESTYPE(seq_cft(sf)->private);
> +	struct drmcgrp_device *ddev;
> +	int i;
> +
> +	for (i = 0; i <= max_minor; i++) {
> +		ddr = drmcgrp->dev_resources[i];
> +		ddev = known_drmcgrp_devs[i];
> +
> +		seq_printf(sf, "%d:%d ", DRM_MAJOR, i);
> +
> +		switch (f_type) {
> +		case DRMCGRP_FTYPE_STATS:
> +			drmcgrp_print_stats(ddr, sf, type);
> +			break;
> +		case DRMCGRP_FTYPE_LIMIT:
> +			drmcgrp_print_limits(ddr, sf, type);
> +			break;
> +		case DRMCGRP_FTYPE_DEFAULT:
> +			drmcgrp_print_default(ddev, sf, type);
> +			break;
> +		default:
> +			seq_puts(sf, "\n");
> +			break;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static inline void drmcgrp_pr_cft_err(const struct drmcgrp *drmcgrp,
> +		const char *cft_name, int minor)
> +{
> +	pr_err("drmcgrp: error parsing %s, minor %d ",
> +			cft_name, minor);
> +	pr_cont_cgroup_name(drmcgrp->css.cgroup);
> +	pr_cont("\n");
> +}
> +
> +static inline int drmcgrp_process_limit_val(char *sval, bool is_mem,
> +			s64 def_val, s64 max_val, s64 *ret_val)
> +{
> +	int rc = strcmp("max", sval);
> +
> +
> +	if (!rc)
> +		*ret_val = max_val;
> +	else {
> +		rc = strcmp("default", sval);
> +
> +		if (!rc)
> +			*ret_val = def_val;
> +	}
> +
> +	if (rc) {
> +		if (is_mem) {
> +			*ret_val = memparse(sval, NULL);
> +			rc = 0;
> +		} else {
> +			rc = kstrtoll(sval, 0, ret_val);
> +		}
> +	}
> +
> +	if (*ret_val > max_val)
> +		*ret_val = max_val;
> +
> +	return rc;
> +}
> +
> +ssize_t drmcgrp_bo_limit_write(struct kernfs_open_file *of, char *buf,
> +		size_t nbytes, loff_t off)
> +{
> +	struct drmcgrp *drmcgrp = css_drmcgrp(of_css(of));
> +	struct drmcgrp *parent = parent_drmcgrp(drmcgrp);
> +	enum drmcgrp_res_type type =
> +		DRMCG_CTF_PRIV2RESTYPE(of_cft(of)->private);
> +	char *cft_name = of_cft(of)->name;
> +	char *limits = strstrip(buf);
> +	struct drmcgrp_device *ddev;
> +	struct drmcgrp_device_resource *ddr;
> +	char *line;
> +	char sattr[256];
> +	s64 val;
> +	s64 p_max;
> +	int rc;
> +	int minor;
> +
> +	while (limits != NULL) {
> +		line =  strsep(&limits, "\n");
> +
> +		if (sscanf(line,
> +			__stringify(DRM_MAJOR)":%u %255[^\t\n]",
> +							&minor, sattr) != 2) {
> +			pr_err("drmcgrp: error parsing %s ", cft_name);
> +			pr_cont_cgroup_name(drmcgrp->css.cgroup);
> +			pr_cont("\n");
> +
> +			continue;
> +		}
> +
> +		if (minor < 0 || minor > max_minor) {
> +			pr_err("drmcgrp: invalid minor %d for %s ",
> +					minor, cft_name);
> +			pr_cont_cgroup_name(drmcgrp->css.cgroup);
> +			pr_cont("\n");
> +
> +			continue;
> +		}
> +
> +		ddr = drmcgrp->dev_resources[minor];
> +		ddev = known_drmcgrp_devs[minor];
> +		switch (type) {
> +		case DRMCGRP_TYPE_BO_TOTAL:
> +			p_max = parent == NULL ? S64_MAX :
> +				parent->dev_resources[minor]->
> +				bo_limits_total_allocated;
> +
> +			rc = drmcgrp_process_limit_val(sattr, true,
> +				ddev->bo_limits_total_allocated_default,
> +				p_max,
> +				&val);
> +
> +			if (rc || val < 0) {
> +				drmcgrp_pr_cft_err(drmcgrp, cft_name, minor);
> +				continue;
> +			}
> +
> +			ddr->bo_limits_total_allocated = val;
> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +
> +	return nbytes;
> +}
> +
>  struct cftype files[] = {
> +	{
> +		.name = "buffer.total.stats",
> +		.seq_show = drmcgrp_bo_show,
> +		.private = DRMCG_CTF_PRIV(DRMCGRP_TYPE_BO_TOTAL,
> +						DRMCGRP_FTYPE_STATS),
> +	},
> +	{
> +		.name = "buffer.total.default",
> +		.seq_show = drmcgrp_bo_show,
> +		.flags = CFTYPE_ONLY_ON_ROOT,
> +		.private = DRMCG_CTF_PRIV(DRMCGRP_TYPE_BO_TOTAL,
> +						DRMCGRP_FTYPE_DEFAULT),
> +	},
> +	{
> +		.name = "buffer.total.max",
> +		.write = drmcgrp_bo_limit_write,
> +		.seq_show = drmcgrp_bo_show,
> +		.private = DRMCG_CTF_PRIV(DRMCGRP_TYPE_BO_TOTAL,
> +						DRMCGRP_FTYPE_LIMIT),
> +	},
>  	{ }	/* terminate */
>  };
>  
> @@ -121,6 +364,8 @@ int drmcgrp_register_device(struct drm_device *dev)
>  		return -ENOMEM;
>  
>  	ddev->dev = dev;
> +	ddev->bo_limits_total_allocated_default = S64_MAX;
> +
>  	mutex_init(&ddev->mutex);
>  
>  	mutex_lock(&drmcgrp_mutex);
> @@ -156,3 +401,79 @@ int drmcgrp_unregister_device(struct drm_device *dev)
>  	return 0;
>  }
>  EXPORT_SYMBOL(drmcgrp_unregister_device);
> +
> +bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self, struct drmcgrp *relative)
> +{
> +	for (; self != NULL; self = parent_drmcgrp(self))
> +		if (self == relative)
> +			return true;
> +
> +	return false;
> +}
> +EXPORT_SYMBOL(drmcgrp_is_self_or_ancestor);
> +
> +bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
> +		size_t size)
> +{
> +	struct drmcgrp *drmcgrp = drmcgrp_from(task);
> +	struct drmcgrp_device_resource *ddr;
> +	struct drmcgrp_device_resource *d;
> +	int devIdx = dev->primary->index;
> +	bool result = true;
> +	s64 delta = 0;
> +
> +	if (drmcgrp == NULL || drmcgrp == root_drmcgrp)
> +		return true;
> +
> +	ddr = drmcgrp->dev_resources[devIdx];
> +	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> +	for ( ; drmcgrp != root_drmcgrp; drmcgrp = parent_drmcgrp(drmcgrp)) {
> +		d = drmcgrp->dev_resources[devIdx];
> +		delta = d->bo_limits_total_allocated -
> +				d->bo_stats_total_allocated;
> +
> +		if (delta <= 0 || size > delta) {
> +			result = false;
> +			break;
> +		}
> +	}
> +	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> +
> +	return result;
> +}
> +EXPORT_SYMBOL(drmcgrp_bo_can_allocate);
> +
> +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> +		size_t size)
> +{
> +	struct drmcgrp_device_resource *ddr;
> +	int devIdx = dev->primary->index;
> +
> +	if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
> +		return;
> +
> +	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> +	for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp)) {
> +		ddr = drmcgrp->dev_resources[devIdx];
> +
> +		ddr->bo_stats_total_allocated += (s64)size;
> +	}
> +	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> +}
> +EXPORT_SYMBOL(drmcgrp_chg_bo_alloc);
> +
> +void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
> +		size_t size)
> +{
> +	int devIdx = dev->primary->index;
> +
> +	if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
> +		return;
> +
> +	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
> +	for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp))
> +		drmcgrp->dev_resources[devIdx]->bo_stats_total_allocated
> +			-= (s64)size;
> +	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
> +}
> +EXPORT_SYMBOL(drmcgrp_unchg_bo_alloc);
> -- 
> 2.21.0
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Kenny Ho June 26, 2019, 9:27 p.m. UTC | #2
On Wed, Jun 26, 2019 at 12:05 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> > drm.buffer.default
> >         A read-only flat-keyed file which exists on the root cgroup.
> >         Each entry is keyed by the drm device's major:minor.
> >
> >         Default limits on the total GEM buffer allocation in bytes.
>
> Don't we need a "0 means no limit" semantics here?

I believe the convention is to use the 'max' keyword.

>
> I think we need a new drm-cgroup.rst which contains all this
> documentation.

Yes I planned to do that when things are more finalized.  I am
actually writing the commit message following the current doc format
so I can reuse it in the rst.

>
> With multiple GPUs, do we need an overall GEM bo limit, across all gpus?
> For other stuff later on like vram/tt/... and all that it needs to be
> per-device, but I think one overall limit could be useful.

This one I am not sure about, but it should be fairly straightforward
to add.  I'd love to hear more feedback on this as well.

> >       if (!amdgpu_bo_validate_size(adev, size, bp->domain))
> >               return -ENOMEM;
> >
> > +     if (!drmcgrp_bo_can_allocate(current, adev->ddev, size))
> > +             return -ENOMEM;
>
> So what happens when you start a lot of threads all at the same time,
> allocating gem bo? Also would be nice if we could roll out at least the
> accounting part of this cgroup to all GEM drivers.

When there is a large number of allocations, the allocations will be
checked in sequence within a device (since I used a per-device mutex
in the check.)  Are you suggesting the overhead here is significant
enough to be a bottleneck?  The accounting part should be available to
all GEM drivers (unless I missed something) since the chg and unchg
functions are called via the generic drm_gem_private_object_init and
drm_gem_object_release.

> > +     /* only allow bo from the same cgroup or its ancestor to be imported */
> > +     if (drmcgrp != NULL &&
>
> Quite a serious limitation here ...
>
> > +                     !drmcgrp_is_self_or_ancestor(drmcgrp, obj->drmcgrp)) {
>
> Also what happens if you actually share across devices? Then importing in
> the 2nd group is suddenly possible, and I think will be double-counted.
>
> What's the underlying technical reason for not allowing sharing across
> cgroups?

With the current implementation, there shouldn't be double counting
since the counting is done during buffer init.

To be clear, sharing across cgroups is allowed; the buffer just needs
to be allocated by a process whose cgroup is an ancestor of the
importing cgroup.  So in the case of Xorg allocating buffers for
clients, Xorg would be in the root cgroup and the buffers can be
passed around by different clients (in the root or other cgroups.)
The idea here is to establish some form of ownership, otherwise there
wouldn't be a way to account for or limit the usage.

Regards,
Kenny
Daniel Vetter June 26, 2019, 9:41 p.m. UTC | #3
On Wed, Jun 26, 2019 at 05:27:48PM -0400, Kenny Ho wrote:
> On Wed, Jun 26, 2019 at 12:05 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > > drm.buffer.default
> > >         A read-only flat-keyed file which exists on the root cgroup.
> > >         Each entry is keyed by the drm device's major:minor.
> > >
> > >         Default limits on the total GEM buffer allocation in bytes.
> >
> > Don't we need a "0 means no limit" semantics here?
> 
> I believe the convention is to use the 'max' keyword.
> 
> >
> > I think we need a new drm-cgroup.rst which contains all this
> > documentation.
> 
> Yes I planned to do that when things are more finalized.  I am
> actually writing the commit message following the current doc format
> so I can reuse it in the rst.

Awesome.

> > With multiple GPUs, do we need an overall GEM bo limit, across all gpus?
> > For other stuff later on like vram/tt/... and all that it needs to be
> > per-device, but I think one overall limit could be useful.
> 
> This one I am not sure but should be fairly straightforward to add.
> I'd love to hear more feedbacks on this as well.
> 
> > >       if (!amdgpu_bo_validate_size(adev, size, bp->domain))
> > >               return -ENOMEM;
> > >
> > > +     if (!drmcgrp_bo_can_allocate(current, adev->ddev, size))
> > > +             return -ENOMEM;
> >
> > So what happens when you start a lot of threads all at the same time,
> > allocating gem bo? Also would be nice if we could roll out at least the
> > accounting part of this cgroup to all GEM drivers.
> 
> When there is a large number of allocation, the allocation will be
> checked in sequence within a device (since I used a per device mutex
> in the check.)  Are you suggesting the overhead here is significant
> enough to be a bottleneck?  The accounting part should be available to
> all GEM drivers (unless I missed something) since the chg and unchg
> function is called via the generic drm_gem_private_object_init and
> drm_gem_object_release.

thread 1: checks limits, still under the total

thread 2: checks limits, still under the total

thread 1: allocates, still under

thread 2: allocates, now over the limit

I think the check and chg need to be one step, or this won't work. Or I'm
missing something somewhere.
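
Something like the following would close that race — a rough sketch
only, reusing the structures this patch adds to kernel/cgroup/drm.c;
the helper name drmcgrp_try_chg_bo_alloc is hypothetical and not part
of the series.  The limit walk and the charge happen under the same
per-device mutex, so two racing allocations cannot both pass the check:

/* Hypothetical combined check-and-charge (sketch, not in this patch). */
bool drmcgrp_try_chg_bo_alloc(struct drmcgrp *drmcgrp,
		struct drm_device *dev, size_t size)
{
	int devIdx = dev->primary->index;
	struct drmcgrp_device *known_dev = known_drmcgrp_devs[devIdx];
	struct drmcgrp_device_resource *ddr;
	struct drmcgrp *cg;
	bool ok = true;

	if (drmcgrp == NULL || known_dev == NULL)
		return true;

	mutex_lock(&known_dev->mutex);

	/* check every level of the hierarchy before touching any counter */
	for (cg = drmcgrp; cg != root_drmcgrp; cg = parent_drmcgrp(cg)) {
		ddr = cg->dev_resources[devIdx];
		if (ddr->bo_stats_total_allocated + (s64)size >
				ddr->bo_limits_total_allocated) {
			ok = false;
			break;
		}
	}

	/* only charge once the whole hierarchy is known to have room */
	if (ok)
		for (cg = drmcgrp; cg != NULL; cg = parent_drmcgrp(cg))
			cg->dev_resources[devIdx]->bo_stats_total_allocated +=
				(s64)size;

	mutex_unlock(&known_dev->mutex);

	return ok;
}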

Wrt rolling out the accounting for all drivers: Since you also roll out
enforcement in this patch I'm not sure whether the accounting part is
fully stand-alone. And as discussed a bit on an earlier patch, I think for
DRIVER_GEM we should set up the accounting cgroup automatically.

> > > +     /* only allow bo from the same cgroup or its ancestor to be imported */
> > > +     if (drmcgrp != NULL &&
> >
> > Quite a serious limitation here ...
> >
> > > +                     !drmcgrp_is_self_or_ancestor(drmcgrp, obj->drmcgrp)) {
> >
> > Also what happens if you actually share across devices? Then importing in
> > the 2nd group is suddenly possible, and I think will be double-counted.
> >
> > What's the underlying technical reason for not allowing sharing across
> > cgroups?
> 
> With the current implementation, there shouldn't be double counting as
> the counting is done during the buffer init.

If you share across devices there will be two drm_gem_object structures,
one on each device. But only one underlying bo.

Now the bo limit is per-device too, so that's all fine, but for a global
bo limit we'd need to make sure we count these only once.

> To be clear, sharing across cgroup is allowed, the buffer just needs
> to be allocated by a process that is parent to the cgroup.  So in the
> case of xorg allocating buffer for client, the xorg would be in the
> root cgroup and the buffer can be passed around by different clients
> (in root or other cgroup.)  The idea here is to establish some form of
> ownership, otherwise there wouldn't be a way to account for or limit
> the usage.

But why? What's the problem if I allocate something and then hand it to
someone else?  E.g. one popular use of cgroups is to isolate clients, so
maybe you'd do a cgroup + namespace for each X11 client (ok wayland, with
X11 this is probably pointless).

But with your current limitation those clients can't pass buffers to the
compositor anymore, making cgroups useless. Your example here only works
if Xorg is in the root and allocates all the buffers. That's not even true
for DRI3 anymore.

So pretty serious limitation on cgroups, and I'm not really understanding
why we need this. I think if we want to prevent buffer sharing, what we
need are some selinux hooks and stuff so you can prevent an import/access
by someone who's not allowed to touch a buffer. But that kind of access
right management should be separate from resource control imo.
-Daniel
Kenny Ho June 26, 2019, 10:41 p.m. UTC | #4
On Wed, Jun 26, 2019 at 5:41 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> On Wed, Jun 26, 2019 at 05:27:48PM -0400, Kenny Ho wrote:
> > On Wed, Jun 26, 2019 at 12:05 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > So what happens when you start a lot of threads all at the same time,
> > > allocating gem bo? Also would be nice if we could roll out at least the
> > > accounting part of this cgroup to all GEM drivers.
> >
> > When there is a large number of allocation, the allocation will be
> > checked in sequence within a device (since I used a per device mutex
> > in the check.)  Are you suggesting the overhead here is significant
> > enough to be a bottleneck?  The accounting part should be available to
> > all GEM drivers (unless I missed something) since the chg and unchg
> > function is called via the generic drm_gem_private_object_init and
> > drm_gem_object_release.
>
> thread 1: checks limits, still under the total
>
> thread 2: checks limits, still under the total
>
> thread 1: allocates, still under
>
> thread 2: allocates, now over the limit
>
> I think the check and chg need to be one step, or this wont work. Or I'm
> missing something somewhere.

Ok, I see what you are saying.

> Wrt rolling out the accounting for all drivers: Since you also roll out
> enforcement in this patch I'm not sure whether the accounting part is
> fully stand-alone. And as discussed a bit on an earlier patch, I think for
> DRIVER_GEM we should set up the accounting cgroup automatically.
I think I should be able to split the commit and restructure things a bit.

> > > What's the underlying technical reason for not allowing sharing across
> > > cgroups?
> > To be clear, sharing across cgroup is allowed, the buffer just needs
> > to be allocated by a process that is parent to the cgroup.  So in the
> > case of xorg allocating buffer for client, the xorg would be in the
> > root cgroup and the buffer can be passed around by different clients
> > (in root or other cgroup.)  The idea here is to establish some form of
> > ownership, otherwise there wouldn't be a way to account for or limit
> > the usage.
>
> But why? What's the problem if I allocate something and then hand it to
> someone else. E.g. one popular use of cgroups is to isolate clients, so
> maybe you'd do a cgroup + namespace for each X11 client (ok wayland, with
> X11 this is probably pointless).
>
> But with your current limitation those clients can't pass buffers to the
> compositor anymore, making cgroups useless. Your example here only works
> if Xorg is in the root and allocates all the buffers. That's not even true
> for DRI3 anymore.
>
> So pretty serious limitation on cgroups, and I'm not really understanding
> why we need this. I think if we want to prevent buffer sharing, what we
> need are some selinux hooks and stuff so you can prevent an import/access
> by someone who's not allowed to touch a buffer. But that kind of access
> right management should be separate from resource control imo.
So without the sharing restriction and some kind of ownership
structure, we will have to migrate/change the owner of the buffer when
the cgroup that created the buffer dies before the receiving cgroup(s),
and I am not sure how to do that properly at the moment.  1) Should
each cgroup keep track of all the buffers that belong to it and
migrate them?  (Is that efficient?)  2) Which cgroup should be the new
owner (and therefore have the limit applied?)  Having the creator be
the owner is kind of natural, but let's say the buffer is shared with
5 other cgroups; which of these 5 cgroups should be the new owner of
the buffer?

Regards,
Kenny
Daniel Vetter June 27, 2019, 5:43 a.m. UTC | #5
On Wed, Jun 26, 2019 at 06:41:32PM -0400, Kenny Ho wrote:
> On Wed, Jun 26, 2019 at 5:41 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Wed, Jun 26, 2019 at 05:27:48PM -0400, Kenny Ho wrote:
> > > On Wed, Jun 26, 2019 at 12:05 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > So what happens when you start a lot of threads all at the same time,
> > > > allocating gem bo? Also would be nice if we could roll out at least the
> > > > accounting part of this cgroup to all GEM drivers.
> > >
> > > When there is a large number of allocation, the allocation will be
> > > checked in sequence within a device (since I used a per device mutex
> > > in the check.)  Are you suggesting the overhead here is significant
> > > enough to be a bottleneck?  The accounting part should be available to
> > > all GEM drivers (unless I missed something) since the chg and unchg
> > > function is called via the generic drm_gem_private_object_init and
> > > drm_gem_object_release.
> >
> > thread 1: checks limits, still under the total
> >
> > thread 2: checks limits, still under the total
> >
> > thread 1: allocates, still under
> >
> > thread 2: allocates, now over the limit
> >
> > I think the check and chg need to be one step, or this wont work. Or I'm
> > missing something somewhere.
> 
> Ok, I see what you are saying.
> 
> > Wrt rolling out the accounting for all drivers: Since you also roll out
> > enforcement in this patch I'm not sure whether the accounting part is
> > fully stand-alone. And as discussed a bit on an earlier patch, I think for
> > DRIVER_GEM we should set up the accounting cgroup automatically.
> I think I should be able to split the commit and restructure things a bit.
> 
> > > > What's the underlying technical reason for not allowing sharing across
> > > > cgroups?
> > > To be clear, sharing across cgroup is allowed, the buffer just needs
> > > to be allocated by a process that is parent to the cgroup.  So in the
> > > case of xorg allocating buffer for client, the xorg would be in the
> > > root cgroup and the buffer can be passed around by different clients
> > > (in root or other cgroup.)  The idea here is to establish some form of
> > > ownership, otherwise there wouldn't be a way to account for or limit
> > > the usage.
> >
> > But why? What's the problem if I allocate something and then hand it to
> > someone else. E.g. one popular use of cgroups is to isolate clients, so
> > maybe you'd do a cgroup + namespace for each X11 client (ok wayland, with
> > X11 this is probably pointless).
> >
> > But with your current limitation those clients can't pass buffers to the
> > compositor anymore, making cgroups useless. Your example here only works
> > if Xorg is in the root and allocates all the buffers. That's not even true
> > for DRI3 anymore.
> >
> > So pretty serious limitation on cgroups, and I'm not really understanding
> > why we need this. I think if we want to prevent buffer sharing, what we
> > need are some selinux hooks and stuff so you can prevent an import/access
> > by someone who's not allowed to touch a buffer. But that kind of access
> > right management should be separate from resource control imo.
> So without the sharing restriction and some kind of ownership
> structure, we will have to migrate/change the owner of the buffer when
> the cgroup that created the buffer die before the receiving cgroup(s)
> and I am not sure how to do that properly at the moment.  1) Should
> each cgroup keep track of all the buffers that belongs to it and
> migrate?  (Is that efficient?)  2) which cgroup should be the new
> owner (and therefore have the limit applied?)  Having the creator
> being the owner is kind of natural, but let say the buffer is shared
> with 5 other cgroups, which of these 5 cgroups should be the new owner
> of the buffer?

Different answers:

- Do we care if we leak bos like this in a cgroup, if the cgroup
  disappears before all the bo are cleaned up?

- Just charge the bo to each cgroup it's in? Will be quite a bit more
  tracking needed to get that done ...

- Also, there's the legacy way of sharing a bo, with the FLINK and
  GEM_OPEN ioctls. We need to plug these holes too.

Just feels like your current solution is technically well-justified, but
it completely defeats the point of cgroups/containers and buffer sharing
...

Cheers, Daniel
Kenny Ho June 27, 2019, 6:42 p.m. UTC | #6
On Thu, Jun 27, 2019 at 1:43 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Wed, Jun 26, 2019 at 06:41:32PM -0400, Kenny Ho wrote:
> > So without the sharing restriction and some kind of ownership
> > structure, we will have to migrate/change the owner of the buffer when
> > the cgroup that created the buffer die before the receiving cgroup(s)
> > and I am not sure how to do that properly at the moment.  1) Should
> > each cgroup keep track of all the buffers that belongs to it and
> > migrate?  (Is that efficient?)  2) which cgroup should be the new
> > owner (and therefore have the limit applied?)  Having the creator
> > being the owner is kind of natural, but let say the buffer is shared
> > with 5 other cgroups, which of these 5 cgroups should be the new owner
> > of the buffer?
>
> Different answers:
>
> - Do we care if we leak bos like this in a cgroup, if the cgroup
>   disappears before all the bo are cleaned up?
>
> - Just charge the bo to each cgroup it's in? Will be quite a bit more
>   tracking needed to get that done ...
That seems to be the approach memcg takes, but as shown by the lwn
link you sent me from the last rfc (talk from Roman Gushchin), that
approach is not problem-free either.  And wouldn't this approach
disconnect resource management from the underlying resource one would
like to control?  For example, if you have 5 MB of memory, you can
have 5 users using 1 MB each.  But in the charge-everybody approach, a
1 MB buffer shared 4 times will make it look like 5 MB is used.  So the
resource being controlled is no longer 'real' since the amount of
resource you have is now dynamic and depends on the amount of sharing
one does.

> - Also, there's the legacy way of sharing a bo, with the FLINK and
>   GEM_OPEN ioctls. We need to plug these holes too.
>
> Just feels like your current solution is technically well-justified, but
> it completely defeats the point of cgroups/containers and buffer sharing
> ...
Um... I am going to get a bit philosophical here and suggest that the
idea of sharing (especially uncontrolled sharing) is inherently at odds
with containment.  It's like, if everybody is special, no one is
special.  Perhaps an alternative is to make this configurable so that
people can allow sharing knowing the caveat?  And just to be clear,
the current solution allows for sharing, even between cgroup.

Regards,
Kenny
Daniel Vetter June 27, 2019, 9:24 p.m. UTC | #7
On Thu, Jun 27, 2019 at 02:42:43PM -0400, Kenny Ho wrote:
> On Thu, Jun 27, 2019 at 1:43 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Wed, Jun 26, 2019 at 06:41:32PM -0400, Kenny Ho wrote:
> > > So without the sharing restriction and some kind of ownership
> > > structure, we will have to migrate/change the owner of the buffer when
> > > the cgroup that created the buffer die before the receiving cgroup(s)
> > > and I am not sure how to do that properly at the moment.  1) Should
> > > each cgroup keep track of all the buffers that belongs to it and
> > > migrate?  (Is that efficient?)  2) which cgroup should be the new
> > > owner (and therefore have the limit applied?)  Having the creator
> > > being the owner is kind of natural, but let say the buffer is shared
> > > with 5 other cgroups, which of these 5 cgroups should be the new owner
> > > of the buffer?
> >
> > Different answers:
> >
> > - Do we care if we leak bos like this in a cgroup, if the cgroup
> >   disappears before all the bo are cleaned up?
> >
> > - Just charge the bo to each cgroup it's in? Will be quite a bit more
> >   tracking needed to get that done ...
> That seems to be the approach memcg takes, but as shown by the lwn
> link you sent me from the last rfc (talk from Roman Gushchin), that
> approach is not problem free either.  And wouldn't this approach
> disconnect resource management from the underlying resource one would
> like to control?  For example, if you have 5 MB of memory, you can
> have 5 users using 1 MB each.  But in the charge-everybody approach, a
> 1 MB usage shared 4 times will make it looks like 5MB is used.  So the
> resource being control is no longer 'real' since the amount of
> resource you have is now dynamic and depends on the amount of sharing
> one does.

The problem with memcg is that it's not just the allocation, but a ton of
memory allocated to track these allocations. At least that's my
understanding of the nature of the memcg leak. Made a lot worse by pages
being small and plentiful and shared extremely widely (e.g. it's really
hard to control who gets charged for pagecache allocations, so those
pagecache entries might outlive the memcg forever if you're unlucky).

For us it's just a counter, plus bo sharing is a lot more controlled: On
any reasonable system if you do kill the compositor, then all the clients
go down. And when you do kill a client, the compositor will release all
the shared buffers (and any other resources).

So I think for drmcg we won't have anything near the same resource leak
problem even in theory, and in practice I think the issue is none.

> > - Also, there's the legacy way of sharing a bo, with the FLINK and
> >   GEM_OPEN ioctls. We need to plug these holes too.
> >
> > Just feels like your current solution is technically well-justified, but
> > it completely defeats the point of cgroups/containers and buffer sharing
> > ...
> Um... I am going to get a bit philosophical here and suggest that the
> idea of sharing (especially uncontrolled sharing) is inherently at odd
> with containment.  It's like, if everybody is special, no one is
> special.  Perhaps an alternative is to make this configurable so that
> people can allow sharing knowing the caveat?  And just to be clear,
> the current solution allows for sharing, even between cgroup.

The thing is, why shouldn't we just allow it (with some documented
caveat)?

I mean if all people do is share it as your current patches allow, then
there's nothing funny going on (at least if we go with just leaking the
allocations). If we allow additional sharing, then that's a plus.

And if you want additional containment, that's a different thing: The
entire linux architecture for containers is that a container doesn't
exist. Instead you get a pile of building blocks that all solve different
aspects of what a container needs to do:
- cgroups for resource limits
- namespaces for resource visibility
- selinux/secomp/lsm for resource isolation and access rights

Let's not try to build a drm cgroup control that tries to do more than
what cgroups are meant to solve. If you have a need to restrict the
sharing, imo that should be done with an lsm security hook.

btw for bo sharing, I've found a 3rd sharing path (besides dma-buf and gem
flink): GETCRTC ioctl can also be used (it's the intended goal actually) to
share buffers across processes.
-Daniel
Kenny Ho June 28, 2019, 6:43 p.m. UTC | #8
On Thu, Jun 27, 2019 at 5:24 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> On Thu, Jun 27, 2019 at 02:42:43PM -0400, Kenny Ho wrote:
> > Um... I am going to get a bit philosophical here and suggest that the
> > idea of sharing (especially uncontrolled sharing) is inherently at odd
> > with containment.  It's like, if everybody is special, no one is
> > special.  Perhaps an alternative is to make this configurable so that
> > people can allow sharing knowing the caveat?  And just to be clear,
> > the current solution allows for sharing, even between cgroup.
>
> The thing is, why shouldn't we just allow it (with some documented
> caveat)?
>
> I mean if all people do is share it as your current patches allow, then
> there's nothing funny going on (at least if we go with just leaking the
> allocations). If we allow additional sharing, then that's a plus.
Um... perhaps I was being overly conservative :).  So let me
illustrate with an example to add more clarity and get more comments
on it.

Let say we have the following cgroup hierarchy (The letters are
cgroups with R being the root cgroup.  The numbers in brackets are
processes.  The processes are placed with the 'No Internal Process
Constraint' in mind.)
R (4, 5) ------ A (6)
  \
    B ---- C (7,8)
     \
       D (9)

Here is a list of operations and the associated effect on the size
tracked by the cgroups (for simplicity, each buffer is 1 unit in size.)
With the current implementation (charge on buffer creation with
restriction on sharing):
R   A   B   C   D   |Ops
================
1   0   0   0   0   |4 allocated a buffer
1   0   0   0   0   |4 shared a buffer with 5
1   0   0   0   0   |4 shared a buffer with 9
2   0   1   0   1   |9 allocated a buffer
3   0   2   1   1   |7 allocated a buffer
3   0   2   1   1   |7 shared a buffer with 8
3   0   2   1   1   |7 sharing with 9 (not allowed)
3   0   2   1   1   |7 sharing with 4 (not allowed)
3   0   2   1   1   |7 release a buffer
2   0   1   0   1   |8 release a buffer from 7

The suggestion as I understand it (charge per buffer reference with
unrestricted sharing):
R   A   B   C   D   |Ops
================
1   0   0   0   0   |4 allocated a buffer
2   0   0   0   0   |4 shared a buffer with 5
3   0   0   0   1   |4 shared a buffer with 9
4   0   1   0   2   |9 allocated a buffer
5   0   2   1   1   |7 allocated a buffer
6   0   3   2   1   |7 shared a buffer with 8
7   0   4   2   2   |7 sharing with 9
8   0   4   2   2   |7 sharing with 4
7   0   3   1   2   |7 release a buffer
6   0   2   0   2   |8 release a buffer from 7

Is this a correct understanding of the suggestion?
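
If so, here is a rough sketch of what per-reference charging could
look like at import time, built on the structures in this patch; the
helper name drmcgrp_chg_bo_ref is hypothetical, and a matching
uncharge would be needed when the importer drops its handle:

/* Hypothetical per-reference charge (sketch only): each importing
 * cgroup, and its ancestors, are charged for the full buffer size,
 * mirroring the allocation path in drmcgrp_chg_bo_alloc().
 */
void drmcgrp_chg_bo_ref(struct drmcgrp *importer, struct drm_device *dev,
		size_t size)
{
	struct drmcgrp *cg;
	int devIdx = dev->primary->index;

	if (importer == NULL || known_drmcgrp_devs[devIdx] == NULL)
		return;

	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
	for (cg = importer; cg != NULL; cg = parent_drmcgrp(cg))
		cg->dev_resources[devIdx]->bo_stats_total_allocated +=
			(s64)size;
	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
}

drm_gem_prime_fd_to_handle would then call something like this instead
of rejecting the import with -EACCES.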

Regards,
Kenny
Daniel Vetter July 2, 2019, 1:16 p.m. UTC | #9
On Fri, Jun 28, 2019 at 02:43:18PM -0400, Kenny Ho wrote:
> On Thu, Jun 27, 2019 at 5:24 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Thu, Jun 27, 2019 at 02:42:43PM -0400, Kenny Ho wrote:
> > > Um... I am going to get a bit philosophical here and suggest that the
> > > idea of sharing (especially uncontrolled sharing) is inherently at odd
> > > with containment.  It's like, if everybody is special, no one is
> > > special.  Perhaps an alternative is to make this configurable so that
> > > people can allow sharing knowing the caveat?  And just to be clear,
> > > the current solution allows for sharing, even between cgroup.
> >
> > The thing is, why shouldn't we just allow it (with some documented
> > caveat)?
> >
> > I mean if all people do is share it as your current patches allow, then
> > there's nothing funny going on (at least if we go with just leaking the
> > allocations). If we allow additional sharing, then that's a plus.
> Um... perhaps I was being overly conservative :).  So let me
> illustrate with an example to add more clarity and get more comments
> on it.
> 
> Let say we have the following cgroup hierarchy (The letters are
> cgroups with R being the root cgroup.  The numbers in brackets are
> processes.  The processes are placed with the 'No Internal Process
> Constraint' in mind.)
> R (4, 5) ------ A (6)
>   \
>     B ---- C (7,8)
>      \
>        D (9)
> 
> Here is a list of operation and the associated effect on the size
> track by the cgroups (for simplicity, each buffer is 1 unit in size.)
> With current implementation (charge on buffer creation with
> restriction on sharing.)
> R   A   B   C   D   |Ops
> ================
> 1   0   0   0   0   |4 allocated a buffer
> 1   0   0   0   0   |4 shared a buffer with 5
> 1   0   0   0   0   |4 shared a buffer with 9
> 2   0   1   0   1   |9 allocated a buffer
> 3   0   2   1   1   |7 allocated a buffer
> 3   0   2   1   1   |7 shared a buffer with 8
> 3   0   2   1   1   |7 sharing with 9 (not allowed)
> 3   0   2   1   1   |7 sharing with 4 (not allowed)
> 3   0   2   1   1   |7 release a buffer
> 2   0   1   0   1   |8 release a buffer from 7

This is your current implementation, right? Let's call it A.

> The suggestion as I understand it (charge per buffer reference with
> unrestricted sharing.)
> R   A   B   C   D   |Ops
> ================
> 1   0   0   0   0   |4 allocated a buffer
> 2   0   0   0   0   |4 shared a buffer with 5
> 3   0   0   0   1   |4 shared a buffer with 9
> 4   0   1   0   2   |9 allocated a buffer
> 5   0   2   1   1   |7 allocated a buffer
> 6   0   3   2   1   |7 shared a buffer with 8
> 7   0   4   2   2   |7 sharing with 9
> 8   0   4   2   2   |7 sharing with 4
> 7   0   3   1   2   |7 release a buffer
> 6   0   2   0   2   |8 release a buffer from 7
> 
> Is this a correct understanding of the suggestion?

Yup that's one option I think. The other option (and it's probably
simpler), is to go with your current accounting, but drop the sharing
restriction. I.e. buffers are accounted to whoever allocates them first,
not to everyone using them. For memcg this has some serious trouble with
cgroups not getting cleaned up due to leaked references. But for gem bo we
spread the references in a lot more controlled manner, and all the
long-lived references are under control of userspace.

E.g. if Xorg fails to clean up bo references of clients that are dead, that's
clearly an Xorg bug and needs to be fixed there. But not something we need
to allow as a valid use-case. For page references/accounting in memcg
this is totally different, since pages can survive in the pagecache
forever. No such bo-cache or anything similar exists for gem_bo.

Personally I prefer option A, but without the sharing restriction. If you want that
sharing restriction, we need to figure out how to implement it using
something else. Plus we need to make sure all possible ways to share a bo
are covered (and there are many).
-Daniel

Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 93b2c5a48a71..b4c078b7ad63 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -34,6 +34,7 @@ 
 #include <drm/drmP.h>
 #include <drm/amdgpu_drm.h>
 #include <drm/drm_cache.h>
+#include <drm/drm_cgroup.h>
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 #include "amdgpu_amdkfd.h"
@@ -446,6 +447,9 @@  static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 	if (!amdgpu_bo_validate_size(adev, size, bp->domain))
 		return -ENOMEM;
 
+	if (!drmcgrp_bo_can_allocate(current, adev->ddev, size))
+		return -ENOMEM;
+
 	*bo_ptr = NULL;
 
 	acc_size = ttm_bo_dma_acc_size(&adev->mman.bdev, size,
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 6a80db077dc6..e20c1034bf2b 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -37,10 +37,12 @@ 
 #include <linux/shmem_fs.h>
 #include <linux/dma-buf.h>
 #include <linux/mem_encrypt.h>
+#include <linux/cgroup_drm.h>
 #include <drm/drmP.h>
 #include <drm/drm_vma_manager.h>
 #include <drm/drm_gem.h>
 #include <drm/drm_print.h>
+#include <drm/drm_cgroup.h>
 #include "drm_internal.h"
 
 /** @file drm_gem.c
@@ -154,6 +156,9 @@  void drm_gem_private_object_init(struct drm_device *dev,
 	obj->handle_count = 0;
 	obj->size = size;
 	drm_vma_node_reset(&obj->vma_node);
+
+	obj->drmcgrp = get_drmcgrp(current);
+	drmcgrp_chg_bo_alloc(obj->drmcgrp, dev, size);
 }
 EXPORT_SYMBOL(drm_gem_private_object_init);
 
@@ -804,6 +809,9 @@  drm_gem_object_release(struct drm_gem_object *obj)
 	if (obj->filp)
 		fput(obj->filp);
 
+	drmcgrp_unchg_bo_alloc(obj->drmcgrp, obj->dev, obj->size);
+	put_drmcgrp(obj->drmcgrp);
+
 	drm_gem_free_mmap_offset(obj);
 }
 EXPORT_SYMBOL(drm_gem_object_release);
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 231e3f6d5f41..eeb612116810 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -32,6 +32,7 @@ 
 #include <drm/drm_prime.h>
 #include <drm/drm_gem.h>
 #include <drm/drmP.h>
+#include <drm/drm_cgroup.h>
 
 #include "drm_internal.h"
 
@@ -794,6 +795,7 @@  int drm_gem_prime_fd_to_handle(struct drm_device *dev,
 {
 	struct dma_buf *dma_buf;
 	struct drm_gem_object *obj;
+	struct drmcgrp *drmcgrp = drmcgrp_from(current);
 	int ret;
 
 	dma_buf = dma_buf_get(prime_fd);
@@ -818,6 +820,13 @@  int drm_gem_prime_fd_to_handle(struct drm_device *dev,
 		goto out_unlock;
 	}
 
+	/* only allow bo from the same cgroup or its ancestor to be imported */
+	if (drmcgrp != NULL &&
+			!drmcgrp_is_self_or_ancestor(drmcgrp, obj->drmcgrp)) {
+		ret = -EACCES;
+		goto out_unlock;
+	}
+
 	if (obj->dma_buf) {
 		WARN_ON(obj->dma_buf != dma_buf);
 	} else {
diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
index ddb9eab64360..8711b7c5f7bf 100644
--- a/include/drm/drm_cgroup.h
+++ b/include/drm/drm_cgroup.h
@@ -4,12 +4,20 @@ 
 #ifndef __DRM_CGROUP_H__
 #define __DRM_CGROUP_H__
 
+#include <linux/cgroup_drm.h>
+
 #ifdef CONFIG_CGROUP_DRM
 
 int drmcgrp_register_device(struct drm_device *device);
-
 int drmcgrp_unregister_device(struct drm_device *device);
-
+bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
+		struct drmcgrp *relative);
+void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
+		size_t size);
+void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
+		size_t size);
+bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
+		size_t size);
 #else
 static inline int drmcgrp_register_device(struct drm_device *device)
 {
@@ -20,5 +28,27 @@  static inline int drmcgrp_unregister_device(struct drm_device *device)
 {
 	return 0;
 }
+
+static inline bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self,
+		struct drmcgrp *relative)
+{
+	return false;
+}
+
+static inline void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp,
+		struct drm_device *dev,	size_t size)
+{
+}
+
+static inline void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp,
+		struct drm_device *dev,	size_t size)
+{
+}
+
+static inline bool drmcgrp_bo_can_allocate(struct task_struct *task,
+		struct drm_device *dev,	size_t size)
+{
+	return true;
+}
 #endif /* CONFIG_CGROUP_DRM */
 #endif /* __DRM_CGROUP_H__ */
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index c95727425284..09d1c69a3f0c 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -272,6 +272,17 @@  struct drm_gem_object {
 	 *
 	 */
 	const struct drm_gem_object_funcs *funcs;
+
+	/**
+	 * @drmcgrp:
+	 *
+	 * DRM cgroup this GEM object belongs to.
+	 *
+	 * This is used to track and limit the amount of GEM objects a user
+	 * can allocate.  Since GEM objects can be shared, this is also used
+	 * to ensure GEM objects are only shared within the same cgroup.
+	 */
+	struct drmcgrp *drmcgrp;
 };
 
 /**
diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
index 27497f786c93..efa019666f1c 100644
--- a/include/linux/cgroup_drm.h
+++ b/include/linux/cgroup_drm.h
@@ -15,6 +15,8 @@ 
 
 struct drmcgrp_device_resource {
 	/* for per device stats */
+	s64			bo_stats_total_allocated;
+	s64			bo_limits_total_allocated;
 };
 
 struct drmcgrp {
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
index 7da6e0d93991..cfc1fe74dca3 100644
--- a/kernel/cgroup/drm.c
+++ b/kernel/cgroup/drm.c
@@ -9,6 +9,7 @@ 
 #include <linux/cgroup_drm.h>
 #include <linux/kernel.h>
 #include <drm/drm_device.h>
+#include <drm/drm_ioctl.h>
 #include <drm/drm_cgroup.h>
 
 static DEFINE_MUTEX(drmcgrp_mutex);
@@ -16,6 +17,26 @@  static DEFINE_MUTEX(drmcgrp_mutex);
 struct drmcgrp_device {
 	struct drm_device	*dev;
 	struct mutex		mutex;
+
+	s64			bo_limits_total_allocated_default;
+};
+
+#define DRMCG_CTF_PRIV_SIZE 3
+#define DRMCG_CTF_PRIV_MASK GENMASK((DRMCG_CTF_PRIV_SIZE - 1), 0)
+#define DRMCG_CTF_PRIV(res_type, f_type)  ((res_type) <<\
+		DRMCG_CTF_PRIV_SIZE | (f_type))
+#define DRMCG_CTF_PRIV2RESTYPE(priv) ((priv) >> DRMCG_CTF_PRIV_SIZE)
+#define DRMCG_CTF_PRIV2FTYPE(priv) ((priv) & DRMCG_CTF_PRIV_MASK)
+
+
+enum drmcgrp_res_type {
+	DRMCGRP_TYPE_BO_TOTAL,
+};
+
+enum drmcgrp_file_type {
+	DRMCGRP_FTYPE_STATS,
+	DRMCGRP_FTYPE_LIMIT,
+	DRMCGRP_FTYPE_DEFAULT,
 };
 
 /* indexed by drm_minor for access speed */
@@ -54,6 +75,10 @@  static inline int init_drmcgrp_single(struct drmcgrp *drmcgrp, int minor)
 	}
 
 	/* set defaults here */
+	if (known_drmcgrp_devs[minor] != NULL) {
+		ddr->bo_limits_total_allocated =
+		  known_drmcgrp_devs[minor]->bo_limits_total_allocated_default;
+	}
 
 	return 0;
 }
@@ -100,7 +125,225 @@  drmcgrp_css_alloc(struct cgroup_subsys_state *parent_css)
 	return &drmcgrp->css;
 }
 
+static inline void drmcgrp_print_stats(struct drmcgrp_device_resource *ddr,
+		struct seq_file *sf, enum drmcgrp_res_type type)
+{
+	if (ddr == NULL) {
+		seq_puts(sf, "\n");
+		return;
+	}
+
+	switch (type) {
+	case DRMCGRP_TYPE_BO_TOTAL:
+		seq_printf(sf, "%lld\n", ddr->bo_stats_total_allocated);
+		break;
+	default:
+		seq_puts(sf, "\n");
+		break;
+	}
+}
+
+static inline void drmcgrp_print_limits(struct drmcgrp_device_resource *ddr,
+		struct seq_file *sf, enum drmcgrp_res_type type)
+{
+	if (ddr == NULL) {
+		seq_puts(sf, "\n");
+		return;
+	}
+
+	switch (type) {
+	case DRMCGRP_TYPE_BO_TOTAL:
+		seq_printf(sf, "%lld\n", ddr->bo_limits_total_allocated);
+		break;
+	default:
+		seq_puts(sf, "\n");
+		break;
+	}
+}
+
+static inline void drmcgrp_print_default(struct drmcgrp_device *ddev,
+		struct seq_file *sf, enum drmcgrp_res_type type)
+{
+	if (ddev == NULL) {
+		seq_puts(sf, "\n");
+		return;
+	}
+
+	switch (type) {
+	case DRMCGRP_TYPE_BO_TOTAL:
+		seq_printf(sf, "%lld\n",
+				ddev->bo_limits_total_allocated_default);
+		break;
+	default:
+		seq_puts(sf, "\n");
+		break;
+	}
+}
+
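+/*
+ * seq_show handler shared by the stats, limit and default files.  The
+ * resource type and file type are recovered from cftype->private and
+ * one "major:minor value" line is printed for each DRM minor.
+ */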
+int drmcgrp_bo_show(struct seq_file *sf, void *v)
+{
+	struct drmcgrp *drmcgrp = css_drmcgrp(seq_css(sf));
+	struct drmcgrp_device_resource *ddr = NULL;
+	enum drmcgrp_file_type f_type =
+		DRMCG_CTF_PRIV2FTYPE(seq_cft(sf)->private);
+	enum drmcgrp_res_type type =
+		DRMCG_CTF_PRIV2RESTYPE(seq_cft(sf)->private);
+	struct drmcgrp_device *ddev;
+	int i;
+
+	for (i = 0; i <= max_minor; i++) {
+		ddr = drmcgrp->dev_resources[i];
+		ddev = known_drmcgrp_devs[i];
+
+		seq_printf(sf, "%d:%d ", DRM_MAJOR, i);
+
+		switch (f_type) {
+		case DRMCGRP_FTYPE_STATS:
+			drmcgrp_print_stats(ddr, sf, type);
+			break;
+		case DRMCGRP_FTYPE_LIMIT:
+			drmcgrp_print_limits(ddr, sf, type);
+			break;
+		case DRMCGRP_FTYPE_DEFAULT:
+			drmcgrp_print_default(ddev, sf, type);
+			break;
+		default:
+			seq_puts(sf, "\n");
+			break;
+		}
+	}
+
+	return 0;
+}
+
+static inline void drmcgrp_pr_cft_err(const struct drmcgrp *drmcgrp,
+		const char *cft_name, int minor)
+{
+	pr_err("drmcgrp: error parsing %s, minor %d ",
+			cft_name, minor);
+	pr_cont_cgroup_name(drmcgrp->css.cgroup);
+	pr_cont("\n");
+}
+
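+/*
+ * Parses a single limit value: "max" maps to @max_val, "default" maps
+ * to @def_val, anything else is parsed as a number (memparse suffixes
+ * such as k, m and g are accepted when @is_mem is true).  The result
+ * is clamped to @max_val.
+ */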
+static inline int drmcgrp_process_limit_val(char *sval, bool is_mem,
+			s64 def_val, s64 max_val, s64 *ret_val)
+{
+	int rc = strcmp("max", sval);
+
+	if (!rc)
+		*ret_val = max_val;
+	else {
+		rc = strcmp("default", sval);
+
+		if (!rc)
+			*ret_val = def_val;
+	}
+
+	if (rc) {
+		if (is_mem) {
+			*ret_val = memparse(sval, NULL);
+			rc = 0;
+		} else {
+			rc = kstrtoll(sval, 0, ret_val);
+		}
+	}
+
+	if (*ret_val > max_val)
+		*ret_val = max_val;
+
+	return rc;
+}
+
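+/*
+ * Accepts one limit per line in the form "<DRM_MAJOR>:<minor> <value>",
+ * where <value> is "max", "default" or a memparse-style size.  Limits
+ * are clamped to the parent cgroup's limit; lines that fail to parse
+ * are reported and skipped.
+ */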
+ssize_t drmcgrp_bo_limit_write(struct kernfs_open_file *of, char *buf,
+		size_t nbytes, loff_t off)
+{
+	struct drmcgrp *drmcgrp = css_drmcgrp(of_css(of));
+	struct drmcgrp *parent = parent_drmcgrp(drmcgrp);
+	enum drmcgrp_res_type type =
+		DRMCG_CTF_PRIV2RESTYPE(of_cft(of)->private);
+	char *cft_name = of_cft(of)->name;
+	char *limits = strstrip(buf);
+	struct drmcgrp_device *ddev;
+	struct drmcgrp_device_resource *ddr;
+	char *line;
+	char sattr[256];
+	s64 val;
+	s64 p_max;
+	int rc;
+	int minor;
+
+	while (limits != NULL) {
+		line = strsep(&limits, "\n");
+
+		if (sscanf(line,
+			__stringify(DRM_MAJOR)":%d %255[^\t\n]",
+							&minor, sattr) != 2) {
+			pr_err("drmcgrp: error parsing %s ", cft_name);
+			pr_cont_cgroup_name(drmcgrp->css.cgroup);
+			pr_cont("\n");
+
+			continue;
+		}
+
+		if (minor < 0 || minor > max_minor) {
+			pr_err("drmcgrp: invalid minor %d for %s ",
+					minor, cft_name);
+			pr_cont_cgroup_name(drmcgrp->css.cgroup);
+			pr_cont("\n");
+
+			continue;
+		}
+
+		ddr = drmcgrp->dev_resources[minor];
+		ddev = known_drmcgrp_devs[minor];
+		switch (type) {
+		case DRMCGRP_TYPE_BO_TOTAL:
+			p_max = parent == NULL ? S64_MAX :
+				parent->dev_resources[minor]->
+				bo_limits_total_allocated;
+
+			rc = drmcgrp_process_limit_val(sattr, true,
+				ddev->bo_limits_total_allocated_default,
+				p_max,
+				&val);
+
+			if (rc || val < 0) {
+				drmcgrp_pr_cft_err(drmcgrp, cft_name, minor);
+				continue;
+			}
+
+			ddr->bo_limits_total_allocated = val;
+			break;
+		default:
+			break;
+		}
+	}
+
+	return nbytes;
+}
+
 struct cftype files[] = {
+	{
+		.name = "buffer.total.stats",
+		.seq_show = drmcgrp_bo_show,
+		.private = DRMCG_CTF_PRIV(DRMCGRP_TYPE_BO_TOTAL,
+						DRMCGRP_FTYPE_STATS),
+	},
+	{
+		.name = "buffer.total.default",
+		.seq_show = drmcgrp_bo_show,
+		.flags = CFTYPE_ONLY_ON_ROOT,
+		.private = DRMCG_CTF_PRIV(DRMCGRP_TYPE_BO_TOTAL,
+						DRMCGRP_FTYPE_DEFAULT),
+	},
+	{
+		.name = "buffer.total.max",
+		.write = drmcgrp_bo_limit_write,
+		.seq_show = drmcgrp_bo_show,
+		.private = DRMCG_CTF_PRIV(DRMCGRP_TYPE_BO_TOTAL,
+						DRMCGRP_FTYPE_LIMIT),
+	},
 	{ }	/* terminate */
 };
 
@@ -121,6 +364,8 @@  int drmcgrp_register_device(struct drm_device *dev)
 		return -ENOMEM;
 
 	ddev->dev = dev;
+	ddev->bo_limits_total_allocated_default = S64_MAX;
+
 	mutex_init(&ddev->mutex);
 
 	mutex_lock(&drmcgrp_mutex);
@@ -156,3 +401,79 @@  int drmcgrp_unregister_device(struct drm_device *dev)
 	return 0;
 }
 EXPORT_SYMBOL(drmcgrp_unregister_device);
+
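+/*
+ * Returns true if @relative is @self or one of @self's ancestors in
+ * the drm cgroup hierarchy.
+ */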
+bool drmcgrp_is_self_or_ancestor(struct drmcgrp *self, struct drmcgrp *relative)
+{
+	for (; self != NULL; self = parent_drmcgrp(self))
+		if (self == relative)
+			return true;
+
+	return false;
+}
+EXPORT_SYMBOL(drmcgrp_is_self_or_ancestor);
+
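+/*
+ * Returns true if @task's cgroup, and every ancestor below the root,
+ * still has at least @size bytes of headroom below its total GEM
+ * buffer limit for @dev.  The root cgroup is never limited.
+ */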
+bool drmcgrp_bo_can_allocate(struct task_struct *task, struct drm_device *dev,
+		size_t size)
+{
+	struct drmcgrp *drmcgrp = drmcgrp_from(task);
+	struct drmcgrp_device_resource *d;
+	int devIdx = dev->primary->index;
+	bool result = true;
+	s64 delta = 0;
+
+	if (drmcgrp == NULL || drmcgrp == root_drmcgrp)
+		return true;
+
+	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
+	for ( ; drmcgrp != root_drmcgrp; drmcgrp = parent_drmcgrp(drmcgrp)) {
+		d = drmcgrp->dev_resources[devIdx];
+		delta = d->bo_limits_total_allocated -
+				d->bo_stats_total_allocated;
+
+		if (delta <= 0 || size > delta) {
+			result = false;
+			break;
+		}
+	}
+	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
+
+	return result;
+}
+EXPORT_SYMBOL(drmcgrp_bo_can_allocate);
+
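+/*
+ * Charges @size bytes of GEM buffer allocation on @dev against
+ * @drmcgrp and all of its ancestors, up to and including the root.
+ */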
+void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
+		size_t size)
+{
+	struct drmcgrp_device_resource *ddr;
+	int devIdx = dev->primary->index;
+
+	if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
+		return;
+
+	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
+	for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp)) {
+		ddr = drmcgrp->dev_resources[devIdx];
+
+		ddr->bo_stats_total_allocated += (s64)size;
+	}
+	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
+}
+EXPORT_SYMBOL(drmcgrp_chg_bo_alloc);
+
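+/*
+ * Reverses drmcgrp_chg_bo_alloc(): uncharges @size bytes from
+ * @drmcgrp and all of its ancestors.
+ */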
+void drmcgrp_unchg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
+		size_t size)
+{
+	int devIdx = dev->primary->index;
+
+	if (drmcgrp == NULL || known_drmcgrp_devs[devIdx] == NULL)
+		return;
+
+	mutex_lock(&known_drmcgrp_devs[devIdx]->mutex);
+	for ( ; drmcgrp != NULL; drmcgrp = parent_drmcgrp(drmcgrp))
+		drmcgrp->dev_resources[devIdx]->bo_stats_total_allocated
+			-= (s64)size;
+	mutex_unlock(&known_drmcgrp_devs[devIdx]->mutex);
+}
+EXPORT_SYMBOL(drmcgrp_unchg_bo_alloc);