diff mbox

[7/7] drm/i915: add i915_get_reset_stats_ioctl

Message ID 1372861332-6308-8-git-send-email-mika.kuoppala@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Mika Kuoppala July 3, 2013, 2:22 p.m. UTC
This ioctl returns reset stats for specified context.

The struct returned contains context loss counters.

reset_count:    all resets across all contexts
batch_active:   active batches lost on resets
batch_pending:  pending batches lost on resets

v2: get rid of state tracking completely and deliver only counts. Idea
    from Chris Wilson.

v3: fix commit message

v4: default context handled inside i915_gem_contest_get_hang_stats

v5: reset_count only for priviledged process

v6: ctx=0 needs CAP_SYS_ADMIN for batch_* counters (Chris Wilson)

v7: context hang stats never returns NULL

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ian Romanick <idr@freedesktop.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_dma.c |    1 +
 drivers/gpu/drm/i915/i915_drv.c |   34 ++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h |    2 ++
 include/uapi/drm/i915_drm.h     |   17 +++++++++++++++++
 4 files changed, 54 insertions(+)

Comments

Chris Wilson July 3, 2013, 3:14 p.m. UTC | #1
On Wed, Jul 03, 2013 at 05:22:12PM +0300, Mika Kuoppala wrote:
>  int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls);
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 33cb973..0d4e3a8 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1350,3 +1350,37 @@ int i915_reg_read_ioctl(struct drm_device *dev,
>  
>  	return 0;
>  }
> +
> +int i915_get_reset_stats_ioctl(struct drm_device *dev,
> +			       void *data, struct drm_file *file)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_reset_stats *args = data;
> +	struct i915_ctx_hang_stats *hs;
> +	int ret;
> +
> +	if (args->ctx_id == 0 && !capable(CAP_SYS_ADMIN))
> +		return -EPERM;

When per-file default contexts land, there will not be any global
information leak and so we can drop the capability check here.

Ben, will we be informing userspace about the ABI change?
-Chris
Ben Widawsky July 3, 2013, 9:23 p.m. UTC | #2
On Wed, Jul 03, 2013 at 04:14:31PM +0100, Chris Wilson wrote:
> On Wed, Jul 03, 2013 at 05:22:12PM +0300, Mika Kuoppala wrote:
> >  int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 33cb973..0d4e3a8 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1350,3 +1350,37 @@ int i915_reg_read_ioctl(struct drm_device *dev,
> >  
> >  	return 0;
> >  }
> > +
> > +int i915_get_reset_stats_ioctl(struct drm_device *dev,
> > +			       void *data, struct drm_file *file)
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct drm_i915_reset_stats *args = data;
> > +	struct i915_ctx_hang_stats *hs;
> > +	int ret;
> > +
> > +	if (args->ctx_id == 0 && !capable(CAP_SYS_ADMIN))
> > +		return -EPERM;
> 
> When per-file default contexts land, there will not be any global
> information leak and so we can drop the capability check here.
> 
> Ben, will we be informing userspace about the ABI change?
> -Chris

Hmm. I'm not convinced we ever want to reset the default context stats.

> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Chris Wilson July 3, 2013, 9:41 p.m. UTC | #3
On Wed, Jul 03, 2013 at 02:23:23PM -0700, Ben Widawsky wrote:
> On Wed, Jul 03, 2013 at 04:14:31PM +0100, Chris Wilson wrote:
> > Ben, will we be informing userspace about the ABI change?
> 
> Hmm. I'm not convinced we ever want to reset the default context stats.

I meant: will userspace be able to find out that the kernel now uses
per-fd default contexts. i.e. will I know if the kernel will
automatically allocate me a separate context and switch every time my
batches execute?
-Chris
Ben Widawsky July 3, 2013, 9:44 p.m. UTC | #4
On Wed, Jul 03, 2013 at 10:41:11PM +0100, Chris Wilson wrote:
> On Wed, Jul 03, 2013 at 02:23:23PM -0700, Ben Widawsky wrote:
> > On Wed, Jul 03, 2013 at 04:14:31PM +0100, Chris Wilson wrote:
> > > Ben, will we be informing userspace about the ABI change?
> > 
> > Hmm. I'm not convinced we ever want to reset the default context stats.
> 
> I meant: will userspace be able to find out that the kernel now uses
> per-fd default contexts. i.e. will I know if the kernel will
> automatically allocate me a separate context and switch every time my
> batches execute?
> -Chris
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

Yes. It won't be a requirement for mesa to know, but since I want to
kill aliasing ppgtt as much as possible, I intend to update getparam to
both set 0 to aliasing, and a new one for ppgtt.
Daniel Vetter July 3, 2013, 9:58 p.m. UTC | #5
On Wed, Jul 3, 2013 at 11:44 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> Yes. It won't be a requirement for mesa to know, but since I want to
> kill aliasing ppgtt as much as possible, I intend to update getparam to
> both set 0 to aliasing, and a new one for ppgtt.

We use that one essentially to decide whether the hw uses ppgtt or
not, aliasing kinda doesn't matter. So to keep api I guess we need to
keep that one enabled. Although only i-g-t really cares I think, but
keeping it working would still be nice. Maybe just add a comment to
the #define explaining what's going on?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Ben Widawsky July 3, 2013, 10 p.m. UTC | #6
On Wed, Jul 03, 2013 at 11:58:31PM +0200, Daniel Vetter wrote:
> On Wed, Jul 3, 2013 at 11:44 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > Yes. It won't be a requirement for mesa to know, but since I want to
> > kill aliasing ppgtt as much as possible, I intend to update getparam to
> > both set 0 to aliasing, and a new one for ppgtt.
> 
> We use that one essentially to decide whether the hw uses ppgtt or
> not, aliasing kinda doesn't matter. So to keep api I guess we need to
> keep that one enabled. Although only i-g-t really cares I think, but
> keeping it working would still be nice. Maybe just add a comment to
> the #define explaining what's going on?
> -Daniel

It's a bit off topic for this series, but I don't see any reason to keep
it working. If it doesn't end up being too much trouble, I can - but in
actuality on the kernel would be able to use it. Exposing the param as
true to user space would be a lie.

I was planning to update the IGT tests to support both things FWIW.

> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Ville Syrjala July 4, 2013, 7:54 a.m. UTC | #7
On Wed, Jul 03, 2013 at 03:00:39PM -0700, Ben Widawsky wrote:
> On Wed, Jul 03, 2013 at 11:58:31PM +0200, Daniel Vetter wrote:
> > On Wed, Jul 3, 2013 at 11:44 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > > Yes. It won't be a requirement for mesa to know, but since I want to
> > > kill aliasing ppgtt as much as possible, I intend to update getparam to
> > > both set 0 to aliasing, and a new one for ppgtt.
> > 
> > We use that one essentially to decide whether the hw uses ppgtt or
> > not, aliasing kinda doesn't matter. So to keep api I guess we need to
> > keep that one enabled. Although only i-g-t really cares I think, but
> > keeping it working would still be nice. Maybe just add a comment to
> > the #define explaining what's going on?
> > -Daniel
> 
> It's a bit off topic for this series, but I don't see any reason to keep
> it working. If it doesn't end up being too much trouble, I can - but in
> actuality on the kernel would be able to use it. Exposing the param as
> true to user space would be a lie.

But won't old Mesa refuse to work if you tell it aliasing ppgtt is not
enabled?
Ben Widawsky July 4, 2013, 4:39 p.m. UTC | #8
On Thu, Jul 04, 2013 at 10:54:58AM +0300, Ville Syrjälä wrote:
> On Wed, Jul 03, 2013 at 03:00:39PM -0700, Ben Widawsky wrote:
> > On Wed, Jul 03, 2013 at 11:58:31PM +0200, Daniel Vetter wrote:
> > > On Wed, Jul 3, 2013 at 11:44 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> > > > Yes. It won't be a requirement for mesa to know, but since I want to
> > > > kill aliasing ppgtt as much as possible, I intend to update getparam to
> > > > both set 0 to aliasing, and a new one for ppgtt.
> > > 
> > > We use that one essentially to decide whether the hw uses ppgtt or
> > > not, aliasing kinda doesn't matter. So to keep api I guess we need to
> > > keep that one enabled. Although only i-g-t really cares I think, but
> > > keeping it working would still be nice. Maybe just add a comment to
> > > the #define explaining what's going on?
> > > -Daniel
> > 
> > It's a bit off topic for this series, but I don't see any reason to keep
> > it working. If it doesn't end up being too much trouble, I can - but in
> > actuality on the kernel would be able to use it. Exposing the param as
> > true to user space would be a lie.
> 
> But won't old Mesa refuse to work if you tell it aliasing ppgtt is not
> enabled?

AFAIK, mesa doesn't check... I can update mesa too if so.

> 
> -- 
> Ville Syrjälä
> Intel OTC
Ian Romanick Oct. 26, 2013, 1:42 a.m. UTC | #9
Since the Mesa merge window is closing soon, I'm finally getting back on
this.  I've pushed a rebase of my old Mesa branch to my fd.o repo

http://cgit.freedesktop.org/~idr/mesa/log/?h=robustness3

I have a couple questions...

1. Has any of this landed an a kernel tree anywhere?

2. Has any support code landed in a libdrm tree anywhere?

3. What method should I use to detect that the kernel has support?  In
early discussions, reset notification was only going to be available on
some GPUs, so there was a getparam to detect actual availability.  I
guess now it's just based on kernel version?

It looks like I should just need to update df87cdd and 61dad8e in my
Mesa tree.

On 07/03/2013 07:22 AM, Mika Kuoppala wrote:
> This ioctl returns reset stats for specified context.
> 
> The struct returned contains context loss counters.
> 
> reset_count:    all resets across all contexts
> batch_active:   active batches lost on resets
> batch_pending:  pending batches lost on resets
> 
> v2: get rid of state tracking completely and deliver only counts. Idea
>     from Chris Wilson.
> 
> v3: fix commit message
> 
> v4: default context handled inside i915_gem_contest_get_hang_stats
> 
> v5: reset_count only for priviledged process
> 
> v6: ctx=0 needs CAP_SYS_ADMIN for batch_* counters (Chris Wilson)
> 
> v7: context hang stats never returns NULL
> 
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Ian Romanick <idr@freedesktop.org>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_dma.c |    1 +
>  drivers/gpu/drm/i915/i915_drv.c |   34 ++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_drv.h |    2 ++
>  include/uapi/drm/i915_drm.h     |   17 +++++++++++++++++
>  4 files changed, 54 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 0e22142..d1a006f 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1889,6 +1889,7 @@ struct drm_ioctl_desc i915_ioctls[] = {
>  	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE, i915_gem_context_create_ioctl, DRM_UNLOCKED),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_DESTROY, i915_gem_context_destroy_ioctl, DRM_UNLOCKED),
>  	DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_UNLOCKED),
> +	DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_get_reset_stats_ioctl, DRM_UNLOCKED),
>  };
>  
>  int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls);
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 33cb973..0d4e3a8 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1350,3 +1350,37 @@ int i915_reg_read_ioctl(struct drm_device *dev,
>  
>  	return 0;
>  }
> +
> +int i915_get_reset_stats_ioctl(struct drm_device *dev,
> +			       void *data, struct drm_file *file)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_reset_stats *args = data;
> +	struct i915_ctx_hang_stats *hs;
> +	int ret;
> +
> +	if (args->ctx_id == 0 && !capable(CAP_SYS_ADMIN))
> +		return -EPERM;
> +
> +	ret = mutex_lock_interruptible(&dev->struct_mutex);
> +	if (ret)
> +		return ret;
> +
> +	hs = i915_gem_context_get_hang_stats(dev, file, args->ctx_id);
> +	if (IS_ERR(hs)) {
> +		mutex_unlock(&dev->struct_mutex);
> +		return PTR_ERR(hs);
> +	}
> +
> +	if (capable(CAP_SYS_ADMIN))
> +		args->reset_count = i915_reset_count(&dev_priv->gpu_error);
> +	else
> +		args->reset_count = 0;
> +
> +	args->batch_active = hs->batch_active;
> +	args->batch_pending = hs->batch_pending;
> +
> +	mutex_unlock(&dev->struct_mutex);
> +
> +	return 0;
> +}
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1def049..0ca98fa 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2021,6 +2021,8 @@ extern int intel_enable_rc6(const struct drm_device *dev);
>  extern bool i915_semaphore_is_enabled(struct drm_device *dev);
>  int i915_reg_read_ioctl(struct drm_device *dev, void *data,
>  			struct drm_file *file);
> +int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
> +			       struct drm_file *file);
>  
>  /* overlay */
>  #ifdef CONFIG_DEBUG_FS
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 923ed7f..29b07fd 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -198,6 +198,7 @@ typedef struct _drm_i915_sarea {
>  #define DRM_I915_GEM_SET_CACHING	0x2f
>  #define DRM_I915_GEM_GET_CACHING	0x30
>  #define DRM_I915_REG_READ		0x31
> +#define DRM_I915_GET_RESET_STATS	0x32
>  
>  #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
>  #define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
> @@ -247,6 +248,7 @@ typedef struct _drm_i915_sarea {
>  #define DRM_IOCTL_I915_GEM_CONTEXT_CREATE	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create)
>  #define DRM_IOCTL_I915_GEM_CONTEXT_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_DESTROY, struct drm_i915_gem_context_destroy)
>  #define DRM_IOCTL_I915_REG_READ			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_REG_READ, struct drm_i915_reg_read)
> +#define DRM_IOCTL_I915_GET_RESET_STATS		DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GET_RESET_STATS, struct drm_i915_reset_stats)
>  
>  /* Allow drivers to submit batchbuffers directly to hardware, relying
>   * on the security mechanisms provided by hardware.
> @@ -981,4 +983,19 @@ struct drm_i915_reg_read {
>  	__u64 offset;
>  	__u64 val; /* Return value */
>  };
> +
> +struct drm_i915_reset_stats {
> +	__u32 ctx_id;
> +	__u32 flags;
> +
> +	/* For all contexts */
> +	__u32 reset_count;
> +
> +	/* For this context */
> +	__u32 batch_active;
> +	__u32 batch_pending;
> +
> +	__u32 pad;
> +};
> +
>  #endif /* _UAPI_I915_DRM_H_ */
>
Daniel Vetter Oct. 27, 2013, 12:30 p.m. UTC | #10
On Fri, Oct 25, 2013 at 06:42:35PM -0700, Ian Romanick wrote:
> Since the Mesa merge window is closing soon, I'm finally getting back on
> this.  I've pushed a rebase of my old Mesa branch to my fd.o repo
> 
> http://cgit.freedesktop.org/~idr/mesa/log/?h=robustness3
> 
> I have a couple questions...
> 
> 1. Has any of this landed an a kernel tree anywhere?

Afaik everything but the actual ioctl and i-g-t testcase has landed.

> 2. Has any support code landed in a libdrm tree anywhere?

Dunno whether Mika has libdrm patches. Since mesa is the only expected
user I'd just go with putting the ioctl wrapper (using the drmIoctl
helper) into mesa itself, that get rids of a dep for merging this support.

> 3. What method should I use to detect that the kernel has support?  In
> early discussions, reset notification was only going to be available on
> some GPUs, so there was a getparam to detect actual availability.  I
> guess now it's just based on kernel version?

Usually we add a new feature flag to get get_param ioctl if there's no
natural way otherwise for userspace to figure this out (usually by calling
the new ioctl and disabling the feature if that doesn't work).
-Daniel

> 
> It looks like I should just need to update df87cdd and 61dad8e in my
> Mesa tree.
> 
> On 07/03/2013 07:22 AM, Mika Kuoppala wrote:
> > This ioctl returns reset stats for specified context.
> > 
> > The struct returned contains context loss counters.
> > 
> > reset_count:    all resets across all contexts
> > batch_active:   active batches lost on resets
> > batch_pending:  pending batches lost on resets
> > 
> > v2: get rid of state tracking completely and deliver only counts. Idea
> >     from Chris Wilson.
> > 
> > v3: fix commit message
> > 
> > v4: default context handled inside i915_gem_contest_get_hang_stats
> > 
> > v5: reset_count only for priviledged process
> > 
> > v6: ctx=0 needs CAP_SYS_ADMIN for batch_* counters (Chris Wilson)
> > 
> > v7: context hang stats never returns NULL
> > 
> > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> > Cc: Ian Romanick <idr@freedesktop.org>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> >  drivers/gpu/drm/i915/i915_dma.c |    1 +
> >  drivers/gpu/drm/i915/i915_drv.c |   34 ++++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/i915_drv.h |    2 ++
> >  include/uapi/drm/i915_drm.h     |   17 +++++++++++++++++
> >  4 files changed, 54 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 0e22142..d1a006f 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1889,6 +1889,7 @@ struct drm_ioctl_desc i915_ioctls[] = {
> >  	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE, i915_gem_context_create_ioctl, DRM_UNLOCKED),
> >  	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_DESTROY, i915_gem_context_destroy_ioctl, DRM_UNLOCKED),
> >  	DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_UNLOCKED),
> > +	DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_get_reset_stats_ioctl, DRM_UNLOCKED),
> >  };
> >  
> >  int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 33cb973..0d4e3a8 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1350,3 +1350,37 @@ int i915_reg_read_ioctl(struct drm_device *dev,
> >  
> >  	return 0;
> >  }
> > +
> > +int i915_get_reset_stats_ioctl(struct drm_device *dev,
> > +			       void *data, struct drm_file *file)
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct drm_i915_reset_stats *args = data;
> > +	struct i915_ctx_hang_stats *hs;
> > +	int ret;
> > +
> > +	if (args->ctx_id == 0 && !capable(CAP_SYS_ADMIN))
> > +		return -EPERM;
> > +
> > +	ret = mutex_lock_interruptible(&dev->struct_mutex);
> > +	if (ret)
> > +		return ret;
> > +
> > +	hs = i915_gem_context_get_hang_stats(dev, file, args->ctx_id);
> > +	if (IS_ERR(hs)) {
> > +		mutex_unlock(&dev->struct_mutex);
> > +		return PTR_ERR(hs);
> > +	}
> > +
> > +	if (capable(CAP_SYS_ADMIN))
> > +		args->reset_count = i915_reset_count(&dev_priv->gpu_error);
> > +	else
> > +		args->reset_count = 0;
> > +
> > +	args->batch_active = hs->batch_active;
> > +	args->batch_pending = hs->batch_pending;
> > +
> > +	mutex_unlock(&dev->struct_mutex);
> > +
> > +	return 0;
> > +}
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 1def049..0ca98fa 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -2021,6 +2021,8 @@ extern int intel_enable_rc6(const struct drm_device *dev);
> >  extern bool i915_semaphore_is_enabled(struct drm_device *dev);
> >  int i915_reg_read_ioctl(struct drm_device *dev, void *data,
> >  			struct drm_file *file);
> > +int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
> > +			       struct drm_file *file);
> >  
> >  /* overlay */
> >  #ifdef CONFIG_DEBUG_FS
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index 923ed7f..29b07fd 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -198,6 +198,7 @@ typedef struct _drm_i915_sarea {
> >  #define DRM_I915_GEM_SET_CACHING	0x2f
> >  #define DRM_I915_GEM_GET_CACHING	0x30
> >  #define DRM_I915_REG_READ		0x31
> > +#define DRM_I915_GET_RESET_STATS	0x32
> >  
> >  #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
> >  #define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
> > @@ -247,6 +248,7 @@ typedef struct _drm_i915_sarea {
> >  #define DRM_IOCTL_I915_GEM_CONTEXT_CREATE	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create)
> >  #define DRM_IOCTL_I915_GEM_CONTEXT_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_DESTROY, struct drm_i915_gem_context_destroy)
> >  #define DRM_IOCTL_I915_REG_READ			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_REG_READ, struct drm_i915_reg_read)
> > +#define DRM_IOCTL_I915_GET_RESET_STATS		DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GET_RESET_STATS, struct drm_i915_reset_stats)
> >  
> >  /* Allow drivers to submit batchbuffers directly to hardware, relying
> >   * on the security mechanisms provided by hardware.
> > @@ -981,4 +983,19 @@ struct drm_i915_reg_read {
> >  	__u64 offset;
> >  	__u64 val; /* Return value */
> >  };
> > +
> > +struct drm_i915_reset_stats {
> > +	__u32 ctx_id;
> > +	__u32 flags;
> > +
> > +	/* For all contexts */
> > +	__u32 reset_count;
> > +
> > +	/* For this context */
> > +	__u32 batch_active;
> > +	__u32 batch_pending;
> > +
> > +	__u32 pad;
> > +};
> > +
> >  #endif /* _UAPI_I915_DRM_H_ */
> > 
>
Ian Romanick Oct. 29, 2013, 10:29 p.m. UTC | #11
On 10/27/2013 05:30 AM, Daniel Vetter wrote:
> On Fri, Oct 25, 2013 at 06:42:35PM -0700, Ian Romanick wrote:
>> Since the Mesa merge window is closing soon, I'm finally getting back on
>> this.  I've pushed a rebase of my old Mesa branch to my fd.o repo
>>
>> http://cgit.freedesktop.org/~idr/mesa/log/?h=robustness3
>>
>> I have a couple questions...
>>
>> 1. Has any of this landed an a kernel tree anywhere?
> 
> Afaik everything but the actual ioctl and i-g-t testcase has landed.

And that stuff will land once my patches hit the Mesa list or ... ?

>> 2. Has any support code landed in a libdrm tree anywhere?
> 
> Dunno whether Mika has libdrm patches. Since mesa is the only expected
> user I'd just go with putting the ioctl wrapper (using the drmIoctl
> helper) into mesa itself, that get rids of a dep for merging this support.

What's the right way to get the ctx_id out of the drm_intel_context?
That struct is private to libdrm, but the ioctl needs it.

>> 3. What method should I use to detect that the kernel has support?  In
>> early discussions, reset notification was only going to be available on
>> some GPUs, so there was a getparam to detect actual availability.  I
>> guess now it's just based on kernel version?
> 
> Usually we add a new feature flag to get get_param ioctl if there's no
> natural way otherwise for userspace to figure this out (usually by calling
> the new ioctl and disabling the feature if that doesn't work).
> -Daniel
> 
>>
>> It looks like I should just need to update df87cdd and 61dad8e in my
>> Mesa tree.
>>
>> On 07/03/2013 07:22 AM, Mika Kuoppala wrote:
>>> This ioctl returns reset stats for specified context.
>>>
>>> The struct returned contains context loss counters.
>>>
>>> reset_count:    all resets across all contexts
>>> batch_active:   active batches lost on resets
>>> batch_pending:  pending batches lost on resets
>>>
>>> v2: get rid of state tracking completely and deliver only counts. Idea
>>>     from Chris Wilson.
>>>
>>> v3: fix commit message
>>>
>>> v4: default context handled inside i915_gem_contest_get_hang_stats
>>>
>>> v5: reset_count only for priviledged process
>>>
>>> v6: ctx=0 needs CAP_SYS_ADMIN for batch_* counters (Chris Wilson)
>>>
>>> v7: context hang stats never returns NULL
>>>
>>> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
>>> Cc: Ian Romanick <idr@freedesktop.org>
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> ---
>>>  drivers/gpu/drm/i915/i915_dma.c |    1 +
>>>  drivers/gpu/drm/i915/i915_drv.c |   34 ++++++++++++++++++++++++++++++++++
>>>  drivers/gpu/drm/i915/i915_drv.h |    2 ++
>>>  include/uapi/drm/i915_drm.h     |   17 +++++++++++++++++
>>>  4 files changed, 54 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
>>> index 0e22142..d1a006f 100644
>>> --- a/drivers/gpu/drm/i915/i915_dma.c
>>> +++ b/drivers/gpu/drm/i915/i915_dma.c
>>> @@ -1889,6 +1889,7 @@ struct drm_ioctl_desc i915_ioctls[] = {
>>>  	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE, i915_gem_context_create_ioctl, DRM_UNLOCKED),
>>>  	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_DESTROY, i915_gem_context_destroy_ioctl, DRM_UNLOCKED),
>>>  	DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_UNLOCKED),
>>> +	DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_get_reset_stats_ioctl, DRM_UNLOCKED),
>>>  };
>>>  
>>>  int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls);
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>>> index 33cb973..0d4e3a8 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>> @@ -1350,3 +1350,37 @@ int i915_reg_read_ioctl(struct drm_device *dev,
>>>  
>>>  	return 0;
>>>  }
>>> +
>>> +int i915_get_reset_stats_ioctl(struct drm_device *dev,
>>> +			       void *data, struct drm_file *file)
>>> +{
>>> +	struct drm_i915_private *dev_priv = dev->dev_private;
>>> +	struct drm_i915_reset_stats *args = data;
>>> +	struct i915_ctx_hang_stats *hs;
>>> +	int ret;
>>> +
>>> +	if (args->ctx_id == 0 && !capable(CAP_SYS_ADMIN))
>>> +		return -EPERM;
>>> +
>>> +	ret = mutex_lock_interruptible(&dev->struct_mutex);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	hs = i915_gem_context_get_hang_stats(dev, file, args->ctx_id);
>>> +	if (IS_ERR(hs)) {
>>> +		mutex_unlock(&dev->struct_mutex);
>>> +		return PTR_ERR(hs);
>>> +	}
>>> +
>>> +	if (capable(CAP_SYS_ADMIN))
>>> +		args->reset_count = i915_reset_count(&dev_priv->gpu_error);
>>> +	else
>>> +		args->reset_count = 0;
>>> +
>>> +	args->batch_active = hs->batch_active;
>>> +	args->batch_pending = hs->batch_pending;
>>> +
>>> +	mutex_unlock(&dev->struct_mutex);
>>> +
>>> +	return 0;
>>> +}
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index 1def049..0ca98fa 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -2021,6 +2021,8 @@ extern int intel_enable_rc6(const struct drm_device *dev);
>>>  extern bool i915_semaphore_is_enabled(struct drm_device *dev);
>>>  int i915_reg_read_ioctl(struct drm_device *dev, void *data,
>>>  			struct drm_file *file);
>>> +int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
>>> +			       struct drm_file *file);
>>>  
>>>  /* overlay */
>>>  #ifdef CONFIG_DEBUG_FS
>>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>>> index 923ed7f..29b07fd 100644
>>> --- a/include/uapi/drm/i915_drm.h
>>> +++ b/include/uapi/drm/i915_drm.h
>>> @@ -198,6 +198,7 @@ typedef struct _drm_i915_sarea {
>>>  #define DRM_I915_GEM_SET_CACHING	0x2f
>>>  #define DRM_I915_GEM_GET_CACHING	0x30
>>>  #define DRM_I915_REG_READ		0x31
>>> +#define DRM_I915_GET_RESET_STATS	0x32
>>>  
>>>  #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
>>>  #define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
>>> @@ -247,6 +248,7 @@ typedef struct _drm_i915_sarea {
>>>  #define DRM_IOCTL_I915_GEM_CONTEXT_CREATE	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create)
>>>  #define DRM_IOCTL_I915_GEM_CONTEXT_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_DESTROY, struct drm_i915_gem_context_destroy)
>>>  #define DRM_IOCTL_I915_REG_READ			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_REG_READ, struct drm_i915_reg_read)
>>> +#define DRM_IOCTL_I915_GET_RESET_STATS		DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GET_RESET_STATS, struct drm_i915_reset_stats)
>>>  
>>>  /* Allow drivers to submit batchbuffers directly to hardware, relying
>>>   * on the security mechanisms provided by hardware.
>>> @@ -981,4 +983,19 @@ struct drm_i915_reg_read {
>>>  	__u64 offset;
>>>  	__u64 val; /* Return value */
>>>  };
>>> +
>>> +struct drm_i915_reset_stats {
>>> +	__u32 ctx_id;
>>> +	__u32 flags;
>>> +
>>> +	/* For all contexts */
>>> +	__u32 reset_count;
>>> +
>>> +	/* For this context */
>>> +	__u32 batch_active;
>>> +	__u32 batch_pending;
>>> +
>>> +	__u32 pad;
>>> +};
>>> +
>>>  #endif /* _UAPI_I915_DRM_H_ */
>>>
Daniel Vetter Oct. 30, 2013, 8:21 a.m. UTC | #12
On Tue, Oct 29, 2013 at 11:29 PM, Ian Romanick <idr@freedesktop.org> wrote:
> On 10/27/2013 05:30 AM, Daniel Vetter wrote:
>> On Fri, Oct 25, 2013 at 06:42:35PM -0700, Ian Romanick wrote:
>>> Since the Mesa merge window is closing soon, I'm finally getting back on
>>> this.  I've pushed a rebase of my old Mesa branch to my fd.o repo
>>>
>>> http://cgit.freedesktop.org/~idr/mesa/log/?h=robustness3
>>>
>>> I have a couple questions...
>>>
>>> 1. Has any of this landed an a kernel tree anywhere?
>>
>> Afaik everything but the actual ioctl and i-g-t testcase has landed.
>
> And that stuff will land once my patches hit the Mesa list or ... ?

Yup.

>>> 2. Has any support code landed in a libdrm tree anywhere?
>>
>> Dunno whether Mika has libdrm patches. Since mesa is the only expected
>> user I'd just go with putting the ioctl wrapper (using the drmIoctl
>> helper) into mesa itself, that get rids of a dep for merging this support.
>
> What's the right way to get the ctx_id out of the drm_intel_context?
> That struct is private to libdrm, but the ioctl needs it.

Hm, I guess then we need something in libdrm. I've just figured for
stuff that only mesa really cares about like the hw contexts here
libdrm wrappers are a bit pointless since it adds a 2nd ABI barrier
with no real gain.
-Daniel
Dave Airlie Nov. 8, 2013, 5:48 a.m. UTC | #13
On Wed, Oct 30, 2013 at 6:21 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Oct 29, 2013 at 11:29 PM, Ian Romanick <idr@freedesktop.org> wrote:
>> On 10/27/2013 05:30 AM, Daniel Vetter wrote:
>>> On Fri, Oct 25, 2013 at 06:42:35PM -0700, Ian Romanick wrote:
>>>> Since the Mesa merge window is closing soon, I'm finally getting back on
>>>> this.  I've pushed a rebase of my old Mesa branch to my fd.o repo
>>>>
>>>> http://cgit.freedesktop.org/~idr/mesa/log/?h=robustness3
>>>>
>>>> I have a couple questions...
>>>>
>>>> 1. Has any of this landed an a kernel tree anywhere?
>>>
>>> Afaik everything but the actual ioctl and i-g-t testcase has landed.
>>
>> And that stuff will land once my patches hit the Mesa list or ... ?
>
> Yup.

Hey kernel first, then upstream projects, at the moment libdrm has
ioctls in it that I have no upstream solid kernel commit for,

Either in the next 24 hrs I have this in my tree or the libdrm commits
need to be reverted,

and if someone releases libdrm in that time span then I'm going to be
quite pissed.

Dave.
Daniel Vetter Nov. 8, 2013, 6:32 a.m. UTC | #14
On Fri, Nov 8, 2013 at 6:48 AM, Dave Airlie <airlied@gmail.com> wrote:
> On Wed, Oct 30, 2013 at 6:21 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Tue, Oct 29, 2013 at 11:29 PM, Ian Romanick <idr@freedesktop.org> wrote:
>>> On 10/27/2013 05:30 AM, Daniel Vetter wrote:
>>>> On Fri, Oct 25, 2013 at 06:42:35PM -0700, Ian Romanick wrote:
>>>>> Since the Mesa merge window is closing soon, I'm finally getting back on
>>>>> this.  I've pushed a rebase of my old Mesa branch to my fd.o repo
>>>>>
>>>>> http://cgit.freedesktop.org/~idr/mesa/log/?h=robustness3
>>>>>
>>>>> I have a couple questions...
>>>>>
>>>>> 1. Has any of this landed an a kernel tree anywhere?
>>>>
>>>> Afaik everything but the actual ioctl and i-g-t testcase has landed.
>>>
>>> And that stuff will land once my patches hit the Mesa list or ... ?
>>
>> Yup.
>
> Hey kernel first, then upstream projects, at the moment libdrm has
> ioctls in it that I have no upstream solid kernel commit for,
>
> Either in the next 24 hrs I have this in my tree or the libdrm commits
> need to be reverted,
>
> and if someone releases libdrm in that time span then I'm going to be
> quite pissed.

It's kinda too late imo for 3.13 (and there's an open question whether
we need one more flag or not), so I wanted to pull it in into 3.14.
Which also gives us plenty of time to add or not add that optional
flag. So I guess time to revert. Can you do that pls?
-Daniel
Ian Romanick Nov. 8, 2013, 5:21 p.m. UTC | #15
On 11/07/2013 10:32 PM, Daniel Vetter wrote:
> On Fri, Nov 8, 2013 at 6:48 AM, Dave Airlie <airlied@gmail.com> wrote:
>> On Wed, Oct 30, 2013 at 6:21 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>> On Tue, Oct 29, 2013 at 11:29 PM, Ian Romanick <idr@freedesktop.org> wrote:
>>>> On 10/27/2013 05:30 AM, Daniel Vetter wrote:
>>>>> On Fri, Oct 25, 2013 at 06:42:35PM -0700, Ian Romanick wrote:
>>>>>> Since the Mesa merge window is closing soon, I'm finally getting back on
>>>>>> this.  I've pushed a rebase of my old Mesa branch to my fd.o repo
>>>>>>
>>>>>> http://cgit.freedesktop.org/~idr/mesa/log/?h=robustness3
>>>>>>
>>>>>> I have a couple questions...
>>>>>>
>>>>>> 1. Has any of this landed an a kernel tree anywhere?
>>>>>
>>>>> Afaik everything but the actual ioctl and i-g-t testcase has landed.
>>>>
>>>> And that stuff will land once my patches hit the Mesa list or ... ?
>>>
>>> Yup.
>>
>> Hey kernel first, then upstream projects, at the moment libdrm has
>> ioctls in it that I have no upstream solid kernel commit for,
>>
>> Either in the next 24 hrs I have this in my tree or the libdrm commits
>> need to be reverted,
>>
>> and if someone releases libdrm in that time span then I'm going to be
>> quite pissed.
> 
> It's kinda too late imo for 3.13 (and there's an open question whether
> we need one more flag or not), so I wanted to pull it in into 3.14.
> Which also gives us plenty of time to add or not add that optional
> flag. So I guess time to revert. Can you do that pls?

Reverting has completely broken Mesa builds, and was the wrong choice.
Thanks for giving me opportunity to reply before breaking my stuff.

> -Daniel
Dave Airlie Nov. 8, 2013, 7 p.m. UTC | #16
On Sat, Nov 9, 2013 at 3:21 AM, Ian Romanick <idr@freedesktop.org> wrote:
> On 11/07/2013 10:32 PM, Daniel Vetter wrote:
>> On Fri, Nov 8, 2013 at 6:48 AM, Dave Airlie <airlied@gmail.com> wrote:
>>> On Wed, Oct 30, 2013 at 6:21 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>>> On Tue, Oct 29, 2013 at 11:29 PM, Ian Romanick <idr@freedesktop.org> wrote:
>>>>> On 10/27/2013 05:30 AM, Daniel Vetter wrote:
>>>>>> On Fri, Oct 25, 2013 at 06:42:35PM -0700, Ian Romanick wrote:
>>>>>>> Since the Mesa merge window is closing soon, I'm finally getting back on
>>>>>>> this.  I've pushed a rebase of my old Mesa branch to my fd.o repo
>>>>>>>
>>>>>>> http://cgit.freedesktop.org/~idr/mesa/log/?h=robustness3
>>>>>>>
>>>>>>> I have a couple questions...
>>>>>>>
>>>>>>> 1. Has any of this landed an a kernel tree anywhere?
>>>>>>
>>>>>> Afaik everything but the actual ioctl and i-g-t testcase has landed.
>>>>>
>>>>> And that stuff will land once my patches hit the Mesa list or ... ?
>>>>
>>>> Yup.
>>>
>>> Hey kernel first, then upstream projects, at the moment libdrm has
>>> ioctls in it that I have no upstream solid kernel commit for,
>>>
>>> Either in the next 24 hrs I have this in my tree or the libdrm commits
>>> need to be reverted,
>>>
>>> and if someone releases libdrm in that time span then I'm going to be
>>> quite pissed.
>>
>> It's kinda too late imo for 3.13 (and there's an open question whether
>> we need one more flag or not), so I wanted to pull it in into 3.14.
>> Which also gives us plenty of time to add or not add that optional
>> flag. So I guess time to revert. Can you do that pls?
>
> Reverting has completely broken Mesa builds, and was the wrong choice.
> Thanks for giving me opportunity to reply before breaking my stuff.
>

Hey Ian,

stop merging incomplete shit 5 mins before the branch point,

you should know better, Mesa is meant to be moving to 3 mth release
cycle to avoid this kinda merge everything crap,

there was an open thread on the api for this feature, there was
questions of whether a drm cap was needed, you failed to address any
of these and you merged the feature, the rules aren't different than
before, stuff goes in the kernel first and userspace second, we went
down this road before and it always gets screwed up.

Dave.
Jesse Barnes Nov. 8, 2013, 7:22 p.m. UTC | #17
On Fri, 8 Nov 2013 07:32:05 +0100
Daniel Vetter <daniel@ffwll.ch> wrote:

> On Fri, Nov 8, 2013 at 6:48 AM, Dave Airlie <airlied@gmail.com> wrote:
> > On Wed, Oct 30, 2013 at 6:21 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> >> On Tue, Oct 29, 2013 at 11:29 PM, Ian Romanick <idr@freedesktop.org> wrote:
> >>> On 10/27/2013 05:30 AM, Daniel Vetter wrote:
> >>>> On Fri, Oct 25, 2013 at 06:42:35PM -0700, Ian Romanick wrote:
> >>>>> Since the Mesa merge window is closing soon, I'm finally getting back on
> >>>>> this.  I've pushed a rebase of my old Mesa branch to my fd.o repo
> >>>>>
> >>>>> http://cgit.freedesktop.org/~idr/mesa/log/?h=robustness3
> >>>>>
> >>>>> I have a couple questions...
> >>>>>
> >>>>> 1. Has any of this landed an a kernel tree anywhere?
> >>>>
> >>>> Afaik everything but the actual ioctl and i-g-t testcase has landed.
> >>>
> >>> And that stuff will land once my patches hit the Mesa list or ... ?
> >>
> >> Yup.
> >
> > Hey kernel first, then upstream projects, at the moment libdrm has
> > ioctls in it that I have no upstream solid kernel commit for,
> >
> > Either in the next 24 hrs I have this in my tree or the libdrm commits
> > need to be reverted,
> >
> > and if someone releases libdrm in that time span then I'm going to be
> > quite pissed.
> 
> It's kinda too late imo for 3.13 (and there's an open question whether
> we need one more flag or not), so I wanted to pull it in into 3.14.
> Which also gives us plenty of time to add or not add that optional
> flag. So I guess time to revert. Can you do that pls?

This ioctl is tiny and self-contained.  So it seems like as long as the
Mesa team is good with it (i.e. using it successfully), there shouldn't
be a problem pushing it now.  There's no risk of regression as it's a
new feature and isolated.

Dave?
Ian Romanick Nov. 8, 2013, 9:20 p.m. UTC | #18
On 11/08/2013 11:00 AM, Dave Airlie wrote:
> On Sat, Nov 9, 2013 at 3:21 AM, Ian Romanick <idr@freedesktop.org> wrote:
>> On 11/07/2013 10:32 PM, Daniel Vetter wrote:
>>> On Fri, Nov 8, 2013 at 6:48 AM, Dave Airlie <airlied@gmail.com> wrote:
>>>> On Wed, Oct 30, 2013 at 6:21 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>> On Tue, Oct 29, 2013 at 11:29 PM, Ian Romanick <idr@freedesktop.org> wrote:
>>>>>> On 10/27/2013 05:30 AM, Daniel Vetter wrote:
>>>>>>> On Fri, Oct 25, 2013 at 06:42:35PM -0700, Ian Romanick wrote:
>>>>>>>> Since the Mesa merge window is closing soon, I'm finally getting back on
>>>>>>>> this.  I've pushed a rebase of my old Mesa branch to my fd.o repo
>>>>>>>>
>>>>>>>> http://cgit.freedesktop.org/~idr/mesa/log/?h=robustness3
>>>>>>>>
>>>>>>>> I have a couple questions...
>>>>>>>>
>>>>>>>> 1. Has any of this landed an a kernel tree anywhere?
>>>>>>>
>>>>>>> Afaik everything but the actual ioctl and i-g-t testcase has landed.
>>>>>>
>>>>>> And that stuff will land once my patches hit the Mesa list or ... ?
>>>>>
>>>>> Yup.
>>>>
>>>> Hey kernel first, then upstream projects, at the moment libdrm has
>>>> ioctls in it that I have no upstream solid kernel commit for,
>>>>
>>>> Either in the next 24 hrs I have this in my tree or the libdrm commits
>>>> need to be reverted,
>>>>
>>>> and if someone releases libdrm in that time span then I'm going to be
>>>> quite pissed.
>>>
>>> It's kinda too late imo for 3.13 (and there's an open question whether
>>> we need one more flag or not), so I wanted to pull it in into 3.14.
>>> Which also gives us plenty of time to add or not add that optional
>>> flag. So I guess time to revert. Can you do that pls?
>>
>> Reverting has completely broken Mesa builds, and was the wrong choice.
>> Thanks for giving me opportunity to reply before breaking my stuff.
> 
> Hey Ian,
> 
> stop merging incomplete shit 5 mins before the branch point,

Stop calling other people's hard work shit.  It just makes you look like
an asshole.  Seriously.

> you should know better, Mesa is meant to be moving to 3 mth release
> cycle to avoid this kinda merge everything crap,

That's not the reason for the 3 month cycle.  The reason for the 3 month
cycle is because all of the distros, including Fedora, were pulling
random points from master because waiting 6 months for features and
performance improvements was too long.

> there was an open thread on the api for this feature, there was
> questions of whether a drm cap was needed, you failed to address any

I don't know where you're getting "drm cap" nonsense.  That was closed
two weeks ago:

http://lists.freedesktop.org/archives/intel-gfx/2013-October/035016.html

> of these and you merged the feature, the rules aren't different than
> before, stuff goes in the kernel first and userspace second, we went
> down this road before and it always gets screwed up.

I thought Daniel and I had closed that, and I Acked-by the kernel patch.
 He and I talked about this on IRC.  ***IF*** we need an extra flag for
global resets, there's already a place for it in the (currently) pad
field returned by the ioctl.

This isn't some last minute, half-baked interface.  We've been working
on this for nearly a year.  After going through several revisions,
Mika's kernel patch has been on the intel-gfx list since July.  That
people are exploding about this now just boggles my mind.

> Dave.
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 0e22142..d1a006f 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1889,6 +1889,7 @@  struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE, i915_gem_context_create_ioctl, DRM_UNLOCKED),
 	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_DESTROY, i915_gem_context_destroy_ioctl, DRM_UNLOCKED),
 	DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_UNLOCKED),
+	DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_get_reset_stats_ioctl, DRM_UNLOCKED),
 };
 
 int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls);
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 33cb973..0d4e3a8 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1350,3 +1350,37 @@  int i915_reg_read_ioctl(struct drm_device *dev,
 
 	return 0;
 }
+
+int i915_get_reset_stats_ioctl(struct drm_device *dev,
+			       void *data, struct drm_file *file)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_reset_stats *args = data;
+	struct i915_ctx_hang_stats *hs;
+	int ret;
+
+	if (args->ctx_id == 0 && !capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		return ret;
+
+	hs = i915_gem_context_get_hang_stats(dev, file, args->ctx_id);
+	if (IS_ERR(hs)) {
+		mutex_unlock(&dev->struct_mutex);
+		return PTR_ERR(hs);
+	}
+
+	if (capable(CAP_SYS_ADMIN))
+		args->reset_count = i915_reset_count(&dev_priv->gpu_error);
+	else
+		args->reset_count = 0;
+
+	args->batch_active = hs->batch_active;
+	args->batch_pending = hs->batch_pending;
+
+	mutex_unlock(&dev->struct_mutex);
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1def049..0ca98fa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2021,6 +2021,8 @@  extern int intel_enable_rc6(const struct drm_device *dev);
 extern bool i915_semaphore_is_enabled(struct drm_device *dev);
 int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file);
+int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
+			       struct drm_file *file);
 
 /* overlay */
 #ifdef CONFIG_DEBUG_FS
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 923ed7f..29b07fd 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -198,6 +198,7 @@  typedef struct _drm_i915_sarea {
 #define DRM_I915_GEM_SET_CACHING	0x2f
 #define DRM_I915_GEM_GET_CACHING	0x30
 #define DRM_I915_REG_READ		0x31
+#define DRM_I915_GET_RESET_STATS	0x32
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
 #define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
@@ -247,6 +248,7 @@  typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_GEM_CONTEXT_CREATE	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create)
 #define DRM_IOCTL_I915_GEM_CONTEXT_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_DESTROY, struct drm_i915_gem_context_destroy)
 #define DRM_IOCTL_I915_REG_READ			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_REG_READ, struct drm_i915_reg_read)
+#define DRM_IOCTL_I915_GET_RESET_STATS		DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GET_RESET_STATS, struct drm_i915_reset_stats)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -981,4 +983,19 @@  struct drm_i915_reg_read {
 	__u64 offset;
 	__u64 val; /* Return value */
 };
+
+struct drm_i915_reset_stats {
+	__u32 ctx_id;
+	__u32 flags;
+
+	/* For all contexts */
+	__u32 reset_count;
+
+	/* For this context */
+	__u32 batch_active;
+	__u32 batch_pending;
+
+	__u32 pad;
+};
+
 #endif /* _UAPI_I915_DRM_H_ */