Message ID | 20210423223131.879208-12-jason@jlekstrand.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915/gem: ioctl clean-ups | expand |
On Fri, Apr 23, 2021 at 05:31:21PM -0500, Jason Ekstrand wrote: > As far as I can tell, the only real reason for this is to avoid taking a > reference to the i915_gem_context. The cost of those two atomics > probably pales in comparison to the cost of the ioctl itself so we're > really not buying ourselves anything here. We're about to make context > lookup a tiny bit more complicated, so let's get rid of the one hand- > rolled case. I think the historical reason here is that i965_brw checks this before every execbuf call, at least for arb_robustness contexts with the right flag. But we've fixed that hotpath problem by adding non-recoverable contexts. The kernel will tell you now automatically, for proper userspace at least (I checked iris and anv, assuming I got it correct), and reset_stats ioctl isn't a hot path worth micro-optimizing anymore. With that bit of more context added to the commit message: Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> > --- > drivers/gpu/drm/i915/gem/i915_gem_context.c | 13 ++++--------- > drivers/gpu/drm/i915/i915_drv.h | 8 +------- > 2 files changed, 5 insertions(+), 16 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c > index ecb3bf5369857..941fbf78267b4 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > @@ -2090,16 +2090,13 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, > struct drm_i915_private *i915 = to_i915(dev); > struct drm_i915_reset_stats *args = data; > struct i915_gem_context *ctx; > - int ret; > > if (args->flags || args->pad) > return -EINVAL; > > - ret = -ENOENT; > - rcu_read_lock(); > - ctx = __i915_gem_context_lookup_rcu(file->driver_priv, args->ctx_id); > + ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id); > if (!ctx) > - goto out; > + return -ENOENT; > > /* > * We opt for unserialised reads here. This may result in tearing > @@ -2116,10 +2113,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, > args->batch_active = atomic_read(&ctx->guilty_count); > args->batch_pending = atomic_read(&ctx->active_count); > > - ret = 0; > -out: > - rcu_read_unlock(); > - return ret; > + i915_gem_context_put(ctx); > + return 0; > } > > /* GEM context-engines iterator: for_each_gem_engine() */ > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 0b44333eb7033..8571c5c1509a7 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1840,19 +1840,13 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > > struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags); > > -static inline struct i915_gem_context * > -__i915_gem_context_lookup_rcu(struct drm_i915_file_private *file_priv, u32 id) > -{ > - return xa_load(&file_priv->context_xa, id); > -} > - > static inline struct i915_gem_context * > i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) > { > struct i915_gem_context *ctx; > > rcu_read_lock(); > - ctx = __i915_gem_context_lookup_rcu(file_priv, id); > + ctx = xa_load(&file_priv->context_xa, id); > if (ctx && !kref_get_unless_zero(&ctx->ref)) > ctx = NULL; > rcu_read_unlock(); > -- > 2.31.1 > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Wed, Apr 28, 2021 at 5:27 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > On Fri, Apr 23, 2021 at 05:31:21PM -0500, Jason Ekstrand wrote: > > As far as I can tell, the only real reason for this is to avoid taking a > > reference to the i915_gem_context. The cost of those two atomics > > probably pales in comparison to the cost of the ioctl itself so we're > > really not buying ourselves anything here. We're about to make context > > lookup a tiny bit more complicated, so let's get rid of the one hand- > > rolled case. > > I think the historical reason here is that i965_brw checks this before > every execbuf call, at least for arb_robustness contexts with the right > flag. But we've fixed that hotpath problem by adding non-recoverable > contexts. The kernel will tell you now automatically, for proper userspace > at least (I checked iris and anv, assuming I got it correct), and > reset_stats ioctl isn't a hot path worth micro-optimizing anymore. I'm not sure I agree with that bit. I don't think it was ever worth micro-optimizing like this. What does it gain us? Two fewer atomics? It's not like the bad old days when it took a lock. ANV still calls reset_stats before every set of execbuf (sometimes more than one) but I've never once seen it show up on a perf trace. execbuf, on the other hand, that does show up and pretty heavy sometimes. > With that bit of more context added to the commit message: I'd like to agree on what to add before adding something --Jason > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > > > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> > > --- > > drivers/gpu/drm/i915/gem/i915_gem_context.c | 13 ++++--------- > > drivers/gpu/drm/i915/i915_drv.h | 8 +------- > > 2 files changed, 5 insertions(+), 16 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c > > index ecb3bf5369857..941fbf78267b4 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > > @@ -2090,16 +2090,13 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, > > struct drm_i915_private *i915 = to_i915(dev); > > struct drm_i915_reset_stats *args = data; > > struct i915_gem_context *ctx; > > - int ret; > > > > if (args->flags || args->pad) > > return -EINVAL; > > > > - ret = -ENOENT; > > - rcu_read_lock(); > > - ctx = __i915_gem_context_lookup_rcu(file->driver_priv, args->ctx_id); > > + ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id); > > if (!ctx) > > - goto out; > > + return -ENOENT; > > > > /* > > * We opt for unserialised reads here. This may result in tearing > > @@ -2116,10 +2113,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, > > args->batch_active = atomic_read(&ctx->guilty_count); > > args->batch_pending = atomic_read(&ctx->active_count); > > > > - ret = 0; > > -out: > > - rcu_read_unlock(); > > - return ret; > > + i915_gem_context_put(ctx); > > + return 0; > > } > > > > /* GEM context-engines iterator: for_each_gem_engine() */ > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > index 0b44333eb7033..8571c5c1509a7 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.h > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > @@ -1840,19 +1840,13 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > > > > struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags); > > > > -static inline struct i915_gem_context * > > -__i915_gem_context_lookup_rcu(struct drm_i915_file_private *file_priv, u32 id) > > -{ > > - return xa_load(&file_priv->context_xa, id); > > -} > > - > > static inline struct i915_gem_context * > > i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) > > { > > struct i915_gem_context *ctx; > > > > rcu_read_lock(); > > - ctx = __i915_gem_context_lookup_rcu(file_priv, id); > > + ctx = xa_load(&file_priv->context_xa, id); > > if (ctx && !kref_get_unless_zero(&ctx->ref)) > > ctx = NULL; > > rcu_read_unlock(); > > -- > > 2.31.1 > > > > _______________________________________________ > > dri-devel mailing list > > dri-devel@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/dri-devel > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
On Wed, Apr 28, 2021 at 01:22:14PM -0500, Jason Ekstrand wrote: > On Wed, Apr 28, 2021 at 5:27 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > On Fri, Apr 23, 2021 at 05:31:21PM -0500, Jason Ekstrand wrote: > > > As far as I can tell, the only real reason for this is to avoid taking a > > > reference to the i915_gem_context. The cost of those two atomics > > > probably pales in comparison to the cost of the ioctl itself so we're > > > really not buying ourselves anything here. We're about to make context > > > lookup a tiny bit more complicated, so let's get rid of the one hand- > > > rolled case. > > > > I think the historical reason here is that i965_brw checks this before > > every execbuf call, at least for arb_robustness contexts with the right > > flag. But we've fixed that hotpath problem by adding non-recoverable > > contexts. The kernel will tell you now automatically, for proper userspace > > at least (I checked iris and anv, assuming I got it correct), and > > reset_stats ioctl isn't a hot path worth micro-optimizing anymore. > > I'm not sure I agree with that bit. I don't think it was ever worth > micro-optimizing like this. What does it gain us? Two fewer atomics? > It's not like the bad old days when it took a lock. > > ANV still calls reset_stats before every set of execbuf (sometimes > more than one) but I've never once seen it show up on a perf trace. > execbuf, on the other hand, that does show up and pretty heavy > sometimes. Huh I thought I checked, but I guess got lost. > > With that bit of more context added to the commit message: > > I'd like to agree on what to add before adding something Yeah in this case maybe just mention that with non-recoverable ctx there's no need for userspace to check before every execbuf, so if this ever shows up there's a proper fix which avoids the ioctl entirely. Like iris does. Or something like that. I just want to make it clear that if this ever does show up (once we've made execbuf faster with vm_bind and all that) then the correct fix isn't to make this ioctl faster. But to just not call it :-) Cheers, Daniel > > --Jason > > > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > > > > > > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> > > > --- > > > drivers/gpu/drm/i915/gem/i915_gem_context.c | 13 ++++--------- > > > drivers/gpu/drm/i915/i915_drv.h | 8 +------- > > > 2 files changed, 5 insertions(+), 16 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c > > > index ecb3bf5369857..941fbf78267b4 100644 > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > > > @@ -2090,16 +2090,13 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, > > > struct drm_i915_private *i915 = to_i915(dev); > > > struct drm_i915_reset_stats *args = data; > > > struct i915_gem_context *ctx; > > > - int ret; > > > > > > if (args->flags || args->pad) > > > return -EINVAL; > > > > > > - ret = -ENOENT; > > > - rcu_read_lock(); > > > - ctx = __i915_gem_context_lookup_rcu(file->driver_priv, args->ctx_id); > > > + ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id); > > > if (!ctx) > > > - goto out; > > > + return -ENOENT; > > > > > > /* > > > * We opt for unserialised reads here. This may result in tearing > > > @@ -2116,10 +2113,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, > > > args->batch_active = atomic_read(&ctx->guilty_count); > > > args->batch_pending = atomic_read(&ctx->active_count); > > > > > > - ret = 0; > > > -out: > > > - rcu_read_unlock(); > > > - return ret; > > > + i915_gem_context_put(ctx); > > > + return 0; > > > } > > > > > > /* GEM context-engines iterator: for_each_gem_engine() */ > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > > index 0b44333eb7033..8571c5c1509a7 100644 > > > --- a/drivers/gpu/drm/i915/i915_drv.h > > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > > @@ -1840,19 +1840,13 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > > > > > > struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags); > > > > > > -static inline struct i915_gem_context * > > > -__i915_gem_context_lookup_rcu(struct drm_i915_file_private *file_priv, u32 id) > > > -{ > > > - return xa_load(&file_priv->context_xa, id); > > > -} > > > - > > > static inline struct i915_gem_context * > > > i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) > > > { > > > struct i915_gem_context *ctx; > > > > > > rcu_read_lock(); > > > - ctx = __i915_gem_context_lookup_rcu(file_priv, id); > > > + ctx = xa_load(&file_priv->context_xa, id); > > > if (ctx && !kref_get_unless_zero(&ctx->ref)) > > > ctx = NULL; > > > rcu_read_unlock(); > > > -- > > > 2.31.1 > > > > > > _______________________________________________ > > > dri-devel mailing list > > > dri-devel@lists.freedesktop.org > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index ecb3bf5369857..941fbf78267b4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -2090,16 +2090,13 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, struct drm_i915_private *i915 = to_i915(dev); struct drm_i915_reset_stats *args = data; struct i915_gem_context *ctx; - int ret; if (args->flags || args->pad) return -EINVAL; - ret = -ENOENT; - rcu_read_lock(); - ctx = __i915_gem_context_lookup_rcu(file->driver_priv, args->ctx_id); + ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id); if (!ctx) - goto out; + return -ENOENT; /* * We opt for unserialised reads here. This may result in tearing @@ -2116,10 +2113,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, args->batch_active = atomic_read(&ctx->guilty_count); args->batch_pending = atomic_read(&ctx->active_count); - ret = 0; -out: - rcu_read_unlock(); - return ret; + i915_gem_context_put(ctx); + return 0; } /* GEM context-engines iterator: for_each_gem_engine() */ diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 0b44333eb7033..8571c5c1509a7 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1840,19 +1840,13 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags); -static inline struct i915_gem_context * -__i915_gem_context_lookup_rcu(struct drm_i915_file_private *file_priv, u32 id) -{ - return xa_load(&file_priv->context_xa, id); -} - static inline struct i915_gem_context * i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) { struct i915_gem_context *ctx; rcu_read_lock(); - ctx = __i915_gem_context_lookup_rcu(file_priv, id); + ctx = xa_load(&file_priv->context_xa, id); if (ctx && !kref_get_unless_zero(&ctx->ref)) ctx = NULL; rcu_read_unlock();
As far as I can tell, the only real reason for this is to avoid taking a reference to the i915_gem_context. The cost of those two atomics probably pales in comparison to the cost of the ioctl itself so we're really not buying ourselves anything here. We're about to make context lookup a tiny bit more complicated, so let's get rid of the one hand- rolled case. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 13 ++++--------- drivers/gpu/drm/i915/i915_drv.h | 8 +------- 2 files changed, 5 insertions(+), 16 deletions(-)