Message ID | 20210423223131.879208-17-jason@jlekstrand.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915/gem: ioctl clean-ups | expand |
Hi Jason, Thank you for the patch! Yet something to improve: [auto build test ERROR on drm-intel/for-linux-next] [also build test ERROR on drm-tip/drm-tip drm-exynos/exynos-drm-next next-20210423] [cannot apply to tegra-drm/drm/tegra/for-next drm/drm-next v5.12-rc8] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Jason-Ekstrand/drm-i915-gem-ioctl-clean-ups/20210424-063511 base: git://anongit.freedesktop.org/drm-intel for-linux-next config: x86_64-randconfig-a001-20210423 (attached as .config) compiler: gcc-9 (Debian 9.3.0-22) 9.3.0 reproduce (this is a W=1 build): # https://github.com/0day-ci/linux/commit/e00622bd8a3f3eccbb22721c2f8857bdfb7d5d9d git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Jason-Ekstrand/drm-i915-gem-ioctl-clean-ups/20210424-063511 git checkout e00622bd8a3f3eccbb22721c2f8857bdfb7d5d9d # save the attached .config to linux build tree make W=1 W=1 ARCH=x86_64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): >> drivers/gpu/drm/i915/gem/i915_gem_context.c:2439:1: error: no previous prototype for 'lazy_create_context_locked' [-Werror=missing-prototypes] 2439 | lazy_create_context_locked(struct drm_i915_file_private *file_priv, | ^~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors vim +/lazy_create_context_locked +2439 drivers/gpu/drm/i915/gem/i915_gem_context.c 2437 2438 struct i915_gem_context * > 2439 lazy_create_context_locked(struct drm_i915_file_private *file_priv, 2440 struct i915_gem_proto_context *pc, u32 id) 2441 { 2442 struct i915_gem_context *ctx; 2443 void *old; 2444 2445 ctx = i915_gem_create_context(file_priv->dev_priv, pc); 2446 if (IS_ERR(ctx)) 2447 return ctx; 2448 2449 gem_context_register(ctx, file_priv, id); 2450 2451 old = xa_erase(&file_priv->proto_context_xa, id); 2452 GEM_BUG_ON(old != pc); 2453 proto_context_close(pc); 2454 2455 /* One for the xarray and one for the caller */ 2456 return i915_gem_context_get(ctx); 2457 } 2458 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Hi Jason, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on drm-intel/for-linux-next] [also build test WARNING on drm-tip/drm-tip drm-exynos/exynos-drm-next next-20210423] [cannot apply to tegra-drm/drm/tegra/for-next drm/drm-next v5.12-rc8] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Jason-Ekstrand/drm-i915-gem-ioctl-clean-ups/20210424-063511 base: git://anongit.freedesktop.org/drm-intel for-linux-next config: x86_64-rhel-8.3 (attached as .config) compiler: gcc-9 (Debian 9.3.0-22) 9.3.0 reproduce (this is a W=1 build): # https://github.com/0day-ci/linux/commit/e00622bd8a3f3eccbb22721c2f8857bdfb7d5d9d git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Jason-Ekstrand/drm-i915-gem-ioctl-clean-ups/20210424-063511 git checkout e00622bd8a3f3eccbb22721c2f8857bdfb7d5d9d # save the attached .config to linux build tree make W=1 W=1 ARCH=x86_64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): >> drivers/gpu/drm/i915/gem/i915_gem_context.c:2439:1: warning: no previous prototype for 'lazy_create_context_locked' [-Wmissing-prototypes] 2439 | lazy_create_context_locked(struct drm_i915_file_private *file_priv, | ^~~~~~~~~~~~~~~~~~~~~~~~~~ vim +/lazy_create_context_locked +2439 drivers/gpu/drm/i915/gem/i915_gem_context.c 2437 2438 struct i915_gem_context * > 2439 lazy_create_context_locked(struct drm_i915_file_private *file_priv, 2440 struct i915_gem_proto_context *pc, u32 id) 2441 { 2442 struct i915_gem_context *ctx; 2443 void *old; 2444 2445 ctx = i915_gem_create_context(file_priv->dev_priv, pc); 2446 if (IS_ERR(ctx)) 2447 return ctx; 2448 2449 gem_context_register(ctx, file_priv, id); 2450 2451 old = xa_erase(&file_priv->proto_context_xa, id); 2452 GEM_BUG_ON(old != pc); 2453 proto_context_close(pc); 2454 2455 /* One for the xarray and one for the caller */ 2456 return i915_gem_context_get(ctx); 2457 } 2458 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Yeah this needs some text to explain what/why you're doing this, and maybe some rough sketch of the locking design. On Fri, Apr 23, 2021 at 05:31:26PM -0500, Jason Ekstrand wrote: > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> > --- > drivers/gpu/drm/i915/gem/i915_gem_context.c | 657 ++++++++++++++++-- > drivers/gpu/drm/i915/gem/i915_gem_context.h | 3 + > .../gpu/drm/i915/gem/i915_gem_context_types.h | 26 + > .../gpu/drm/i915/gem/selftests/mock_context.c | 5 +- > drivers/gpu/drm/i915/i915_drv.h | 17 +- > 5 files changed, 648 insertions(+), 60 deletions(-) So I think the patch split here is a bit unfortunate, because you're adding the new vm/engine validation code for proto context here, but the old stuff is only removed in the next patches that make vm/engines immutable after first use. I think a better split would be if this patch here only has all the scaffolding. You already have the EOPNOTSUPP fallback (which I hope gets removed), so moving the conversion entirely to later patches should be all fine. Or do I miss something? I think the only concern I'm seeing is that bisectability might be a bit lost, because we finalize the context in some cases in setparam. And if we do the conversion in a different order than the one media uses for its setparam, then later setparam might fail because the context is finalized already. But also - it's just bisectability of media functionality I think - just check which order media calls CTX_SETPARAM and use that to do the conversion And we should be fine ... I think? Some more thoughts below, but the proto ctx stuff itself looks fine. > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c > index db9153e0f85a7..aa8e61211924f 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > @@ -193,8 +193,15 @@ static int validate_priority(struct drm_i915_private *i915, > > static void proto_context_close(struct i915_gem_proto_context *pc) > { > + int i; > + > if (pc->vm) > i915_vm_put(pc->vm); > + if (pc->user_engines) { > + for (i = 0; i < pc->num_user_engines; i++) > + kfree(pc->user_engines[i].siblings); > + kfree(pc->user_engines); > + } > kfree(pc); > } > > @@ -274,12 +281,417 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags) > proto_context_set_persistence(i915, pc, true); > pc->sched.priority = I915_PRIORITY_NORMAL; > > + pc->num_user_engines = -1; > + pc->user_engines = NULL; > + > if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) > pc->single_timeline = true; > > return pc; > } > > +static int proto_context_register_locked(struct drm_i915_file_private *fpriv, > + struct i915_gem_proto_context *pc, > + u32 *id) > +{ > + int ret; > + void *old; assert_lock_held just for consistency. > + > + ret = xa_alloc(&fpriv->context_xa, id, NULL, xa_limit_32b, GFP_KERNEL); > + if (ret) > + return ret; > + > + old = xa_store(&fpriv->proto_context_xa, *id, pc, GFP_KERNEL); > + if (xa_is_err(old)) { > + xa_erase(&fpriv->context_xa, *id); > + return xa_err(old); > + } > + GEM_BUG_ON(old); > + > + return 0; > +} > + > +static int proto_context_register(struct drm_i915_file_private *fpriv, > + struct i915_gem_proto_context *pc, > + u32 *id) > +{ > + int ret; > + > + mutex_lock(&fpriv->proto_context_lock); > + ret = proto_context_register_locked(fpriv, pc, id); > + mutex_unlock(&fpriv->proto_context_lock); > + > + return ret; > +} > + > +static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv, > + struct i915_gem_proto_context *pc, > + const struct drm_i915_gem_context_param *args) > +{ > + struct i915_address_space *vm; > + > + if (args->size) > + return -EINVAL; > + > + if (!pc->vm) > + return -ENODEV; > + > + if (upper_32_bits(args->value)) > + return -ENOENT; > + > + rcu_read_lock(); > + vm = xa_load(&fpriv->vm_xa, args->value); > + if (vm && !kref_get_unless_zero(&vm->ref)) > + vm = NULL; > + rcu_read_unlock(); > + if (!vm) > + return -ENOENT; > + > + i915_vm_put(pc->vm); > + pc->vm = vm; > + > + return 0; > +} > + > +struct set_proto_ctx_engines { > + struct drm_i915_private *i915; > + unsigned num_engines; > + struct i915_gem_proto_engine *engines; > +}; > + > +static int > +set_proto_ctx_engines_balance(struct i915_user_extension __user *base, > + void *data) > +{ > + struct i915_context_engines_load_balance __user *ext = > + container_of_user(base, typeof(*ext), base); > + const struct set_proto_ctx_engines *set = data; > + struct drm_i915_private *i915 = set->i915; > + struct intel_engine_cs **siblings; > + u16 num_siblings, idx; > + unsigned int n; > + int err; > + > + if (!HAS_EXECLISTS(i915)) > + return -ENODEV; > + > + if (intel_uc_uses_guc_submission(&i915->gt.uc)) > + return -ENODEV; /* not implement yet */ > + > + if (get_user(idx, &ext->engine_index)) > + return -EFAULT; > + > + if (idx >= set->num_engines) { > + drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n", > + idx, set->num_engines); > + return -EINVAL; > + } > + > + idx = array_index_nospec(idx, set->num_engines); > + if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_INVALID) { > + drm_dbg(&i915->drm, > + "Invalid placement[%d], already occupied\n", idx); > + return -EEXIST; > + } > + > + if (get_user(num_siblings, &ext->num_siblings)) > + return -EFAULT; > + > + err = check_user_mbz(&ext->flags); > + if (err) > + return err; > + > + err = check_user_mbz(&ext->mbz64); > + if (err) > + return err; > + > + if (num_siblings == 0) > + return 0; > + > + siblings = kmalloc_array(num_siblings, sizeof(*siblings), GFP_KERNEL); > + if (!siblings) > + return -ENOMEM; > + > + for (n = 0; n < num_siblings; n++) { > + struct i915_engine_class_instance ci; > + > + if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) { > + err = -EFAULT; > + goto err_siblings; > + } > + > + siblings[n] = intel_engine_lookup_user(i915, > + ci.engine_class, > + ci.engine_instance); > + if (!siblings[n]) { > + drm_dbg(&i915->drm, > + "Invalid sibling[%d]: { class:%d, inst:%d }\n", > + n, ci.engine_class, ci.engine_instance); > + err = -EINVAL; > + goto err_siblings; > + } > + } > + > + if (num_siblings == 1) { > + set->engines[idx].type = I915_GEM_ENGINE_TYPE_PHYSICAL; > + set->engines[idx].engine = siblings[0]; > + kfree(siblings); > + } else { > + set->engines[idx].type = I915_GEM_ENGINE_TYPE_BALANCED; > + set->engines[idx].num_siblings = num_siblings; > + set->engines[idx].siblings = siblings; > + } > + > + return 0; > + > +err_siblings: > + kfree(siblings); > + > + return err; > +} > + > +static int > +set_proto_ctx_engines_bond(struct i915_user_extension __user *base, void *data) > +{ > + struct i915_context_engines_bond __user *ext = > + container_of_user(base, typeof(*ext), base); > + const struct set_proto_ctx_engines *set = data; > + struct drm_i915_private *i915 = set->i915; > + struct i915_engine_class_instance ci; > + struct intel_engine_cs *master; > + u16 idx, num_bonds; > + int err, n; > + > + if (get_user(idx, &ext->virtual_index)) > + return -EFAULT; > + > + if (idx >= set->num_engines) { > + drm_dbg(&i915->drm, > + "Invalid index for virtual engine: %d >= %d\n", > + idx, set->num_engines); > + return -EINVAL; > + } > + > + idx = array_index_nospec(idx, set->num_engines); > + if (set->engines[idx].type == I915_GEM_ENGINE_TYPE_INVALID) { > + drm_dbg(&i915->drm, "Invalid engine at %d\n", idx); > + return -EINVAL; > + } > + > + if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_PHYSICAL) { > + drm_dbg(&i915->drm, > + "Bonding with virtual engines not allowed\n"); > + return -EINVAL; > + } > + > + err = check_user_mbz(&ext->flags); > + if (err) > + return err; > + > + for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) { > + err = check_user_mbz(&ext->mbz64[n]); > + if (err) > + return err; > + } > + > + if (copy_from_user(&ci, &ext->master, sizeof(ci))) > + return -EFAULT; > + > + master = intel_engine_lookup_user(i915, > + ci.engine_class, > + ci.engine_instance); > + if (!master) { > + drm_dbg(&i915->drm, > + "Unrecognised master engine: { class:%u, instance:%u }\n", > + ci.engine_class, ci.engine_instance); > + return -EINVAL; > + } > + > + if (get_user(num_bonds, &ext->num_bonds)) > + return -EFAULT; > + > + for (n = 0; n < num_bonds; n++) { > + struct intel_engine_cs *bond; > + > + if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) > + return -EFAULT; > + > + bond = intel_engine_lookup_user(i915, > + ci.engine_class, > + ci.engine_instance); > + if (!bond) { > + drm_dbg(&i915->drm, > + "Unrecognised engine[%d] for bonding: { class:%d, instance: %d }\n", > + n, ci.engine_class, ci.engine_instance); > + return -EINVAL; > + } > + } > + > + return 0; > +} > + > +static const i915_user_extension_fn set_proto_ctx_engines_extensions[] = { > + [I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_proto_ctx_engines_balance, > + [I915_CONTEXT_ENGINES_EXT_BOND] = set_proto_ctx_engines_bond, > +}; > + > +static int set_proto_ctx_engines(struct drm_i915_file_private *fpriv, > + struct i915_gem_proto_context *pc, > + const struct drm_i915_gem_context_param *args) > +{ > + struct drm_i915_private *i915 = fpriv->dev_priv; > + struct set_proto_ctx_engines set = { .i915 = i915 }; > + struct i915_context_param_engines __user *user = > + u64_to_user_ptr(args->value); > + unsigned int n; > + u64 extensions; > + int err; > + > + if (!args->size) { > + kfree(pc->user_engines); > + pc->num_user_engines = -1; > + pc->user_engines = NULL; > + return 0; > + } > + > + BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->engines))); > + if (args->size < sizeof(*user) || > + !IS_ALIGNED(args->size, sizeof(*user->engines))) { > + drm_dbg(&i915->drm, "Invalid size for engine array: %d\n", > + args->size); > + return -EINVAL; > + } > + > + set.num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines); > + if (set.num_engines > I915_EXEC_RING_MASK + 1) > + return -EINVAL; > + > + set.engines = kmalloc_array(set.num_engines, sizeof(*set.engines), GFP_KERNEL); > + if (!set.engines) > + return -ENOMEM; > + > + for (n = 0; n < set.num_engines; n++) { > + struct i915_engine_class_instance ci; > + struct intel_engine_cs *engine; > + > + if (copy_from_user(&ci, &user->engines[n], sizeof(ci))) { > + kfree(set.engines); > + return -EFAULT; > + } > + > + memset(&set.engines[n], 0, sizeof(set.engines[n])); > + > + if (ci.engine_class == (u16)I915_ENGINE_CLASS_INVALID && > + ci.engine_instance == (u16)I915_ENGINE_CLASS_INVALID_NONE) > + continue; > + > + engine = intel_engine_lookup_user(i915, > + ci.engine_class, > + ci.engine_instance); > + if (!engine) { > + drm_dbg(&i915->drm, > + "Invalid engine[%d]: { class:%d, instance:%d }\n", > + n, ci.engine_class, ci.engine_instance); > + kfree(set.engines); > + return -ENOENT; > + } > + > + set.engines[n].type = I915_GEM_ENGINE_TYPE_PHYSICAL; > + set.engines[n].engine = engine; > + } > + > + err = -EFAULT; > + if (!get_user(extensions, &user->extensions)) > + err = i915_user_extensions(u64_to_user_ptr(extensions), > + set_proto_ctx_engines_extensions, > + ARRAY_SIZE(set_proto_ctx_engines_extensions), > + &set); > + if (err) { > + kfree(set.engines); > + return err; > + } > + > + kfree(pc->user_engines); > + pc->num_user_engines = set.num_engines; > + pc->user_engines = set.engines; > + > + return 0; > +} > + > +static int set_proto_ctx_param(struct drm_i915_file_private *fpriv, > + struct i915_gem_proto_context *pc, > + struct drm_i915_gem_context_param *args) > +{ > + int ret = 0; > + > + switch (args->param) { > + case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE: > + if (args->size) > + ret = -EINVAL; > + else if (args->value) > + set_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags); Atomic bitops like in previous patches: Pls no :-) > + else > + clear_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags); > + break; > + > + case I915_CONTEXT_PARAM_BANNABLE: > + if (args->size) > + ret = -EINVAL; > + else if (!capable(CAP_SYS_ADMIN) && !args->value) > + ret = -EPERM; > + else if (args->value) > + set_bit(UCONTEXT_BANNABLE, &pc->user_flags); > + else > + clear_bit(UCONTEXT_BANNABLE, &pc->user_flags); > + break; > + > + case I915_CONTEXT_PARAM_RECOVERABLE: > + if (args->size) > + ret = -EINVAL; > + else if (args->value) > + set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags); > + else > + clear_bit(UCONTEXT_RECOVERABLE, &pc->user_flags); > + break; > + > + case I915_CONTEXT_PARAM_PRIORITY: > + ret = validate_priority(fpriv->dev_priv, args); > + if (!ret) > + pc->sched.priority = args->value; > + break; > + > + case I915_CONTEXT_PARAM_SSEU: > + ret = -ENOTSUPP; > + break; > + > + case I915_CONTEXT_PARAM_VM: > + ret = set_proto_ctx_vm(fpriv, pc, args); > + break; > + > + case I915_CONTEXT_PARAM_ENGINES: > + ret = set_proto_ctx_engines(fpriv, pc, args); > + break; > + > + case I915_CONTEXT_PARAM_PERSISTENCE: > + if (args->size) > + ret = -EINVAL; > + else if (args->value) > + set_bit(UCONTEXT_PERSISTENCE, &pc->user_flags); > + else > + clear_bit(UCONTEXT_PERSISTENCE, &pc->user_flags); > + break; > + > + case I915_CONTEXT_PARAM_NO_ZEROMAP: > + case I915_CONTEXT_PARAM_BAN_PERIOD: > + case I915_CONTEXT_PARAM_RINGSIZE: > + default: > + ret = -EINVAL; > + break; > + } > + > + return ret; > +} > + > static struct i915_address_space * > context_get_vm_rcu(struct i915_gem_context *ctx) > { > @@ -450,6 +862,47 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx) > return e; > } > > +static struct i915_gem_engines *user_engines(struct i915_gem_context *ctx, > + unsigned int num_engines, > + struct i915_gem_proto_engine *pe) > +{ > + struct i915_gem_engines *e; > + unsigned int n; > + > + e = alloc_engines(num_engines); > + for (n = 0; n < num_engines; n++) { > + struct intel_context *ce; > + > + switch (pe[n].type) { > + case I915_GEM_ENGINE_TYPE_PHYSICAL: > + ce = intel_context_create(pe[n].engine); > + break; > + > + case I915_GEM_ENGINE_TYPE_BALANCED: > + ce = intel_execlists_create_virtual(pe[n].siblings, > + pe[n].num_siblings); > + break; > + > + case I915_GEM_ENGINE_TYPE_INVALID: > + default: > + GEM_WARN_ON(pe[n].type != I915_GEM_ENGINE_TYPE_INVALID); > + continue; > + } > + > + if (IS_ERR(ce)) { > + __free_engines(e, n); > + return ERR_CAST(ce); > + } > + > + intel_context_set_gem(ce, ctx); > + > + e->engines[n] = ce; > + } > + e->num_engines = num_engines; > + > + return e; > +} > + > void i915_gem_context_release(struct kref *ref) > { > struct i915_gem_context *ctx = container_of(ref, typeof(*ctx), ref); > @@ -890,6 +1343,24 @@ i915_gem_create_context(struct drm_i915_private *i915, > mutex_unlock(&ctx->mutex); > } > > + if (pc->num_user_engines >= 0) { > + struct i915_gem_engines *engines; > + > + engines = user_engines(ctx, pc->num_user_engines, > + pc->user_engines); > + if (IS_ERR(engines)) { > + context_close(ctx); > + return ERR_CAST(engines); > + } > + > + mutex_lock(&ctx->engines_mutex); > + i915_gem_context_set_user_engines(ctx); > + engines = rcu_replace_pointer(ctx->engines, engines, 1); > + mutex_unlock(&ctx->engines_mutex); > + > + free_engines(engines); > + } > + > if (pc->single_timeline) { > ret = drm_syncobj_create(&ctx->syncobj, > DRM_SYNCOBJ_CREATE_SIGNALED, > @@ -916,12 +1387,12 @@ void i915_gem_init__contexts(struct drm_i915_private *i915) > init_contexts(&i915->gem.contexts); > } > > -static int gem_context_register(struct i915_gem_context *ctx, > - struct drm_i915_file_private *fpriv, > - u32 *id) > +static void gem_context_register(struct i915_gem_context *ctx, > + struct drm_i915_file_private *fpriv, > + u32 id) > { > struct drm_i915_private *i915 = ctx->i915; > - int ret; > + void *old; > > ctx->file_priv = fpriv; > > @@ -930,19 +1401,12 @@ static int gem_context_register(struct i915_gem_context *ctx, > current->comm, pid_nr(ctx->pid)); > > /* And finally expose ourselves to userspace via the idr */ > - ret = xa_alloc(&fpriv->context_xa, id, ctx, xa_limit_32b, GFP_KERNEL); > - if (ret) > - goto err_pid; > + old = xa_store(&fpriv->context_xa, id, ctx, GFP_KERNEL); > + GEM_BUG_ON(old); > > spin_lock(&i915->gem.contexts.lock); > list_add_tail(&ctx->link, &i915->gem.contexts.list); > spin_unlock(&i915->gem.contexts.lock); > - > - return 0; > - > -err_pid: > - put_pid(fetch_and_zero(&ctx->pid)); > - return ret; > } > > int i915_gem_context_open(struct drm_i915_private *i915, > @@ -952,9 +1416,12 @@ int i915_gem_context_open(struct drm_i915_private *i915, > struct i915_gem_proto_context *pc; > struct i915_gem_context *ctx; > int err; > - u32 id; > > - xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC); > + mutex_init(&file_priv->proto_context_lock); > + xa_init_flags(&file_priv->proto_context_xa, XA_FLAGS_ALLOC); > + > + /* 0 reserved for the default context */ > + xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC1); > > /* 0 reserved for invalid/unassigned ppgtt */ > xa_init_flags(&file_priv->vm_xa, XA_FLAGS_ALLOC1); > @@ -972,28 +1439,31 @@ int i915_gem_context_open(struct drm_i915_private *i915, > goto err; > } > > - err = gem_context_register(ctx, file_priv, &id); > - if (err < 0) > - goto err_ctx; > + gem_context_register(ctx, file_priv, 0); > > - GEM_BUG_ON(id); > return 0; > > -err_ctx: > - context_close(ctx); > err: > xa_destroy(&file_priv->vm_xa); > xa_destroy(&file_priv->context_xa); > + xa_destroy(&file_priv->proto_context_xa); > + mutex_destroy(&file_priv->proto_context_lock); > return err; > } > > void i915_gem_context_close(struct drm_file *file) > { > struct drm_i915_file_private *file_priv = file->driver_priv; > + struct i915_gem_proto_context *pc; > struct i915_address_space *vm; > struct i915_gem_context *ctx; > unsigned long idx; > > + xa_for_each(&file_priv->proto_context_xa, idx, pc) > + proto_context_close(pc); > + xa_destroy(&file_priv->proto_context_xa); > + mutex_destroy(&file_priv->proto_context_lock); > + > xa_for_each(&file_priv->context_xa, idx, ctx) > context_close(ctx); > xa_destroy(&file_priv->context_xa); > @@ -1918,7 +2388,7 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv, > } > > struct create_ext { > - struct i915_gem_context *ctx; > + struct i915_gem_proto_context *pc; > struct drm_i915_file_private *fpriv; > }; > > @@ -1933,7 +2403,7 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data) > if (local.param.ctx_id) > return -EINVAL; > > - return ctx_setparam(arg->fpriv, arg->ctx, &local.param); > + return set_proto_ctx_param(arg->fpriv, arg->pc, &local.param); > } > > static int invalid_ext(struct i915_user_extension __user *ext, void *data) > @@ -1951,12 +2421,71 @@ static bool client_is_banned(struct drm_i915_file_private *file_priv) > return atomic_read(&file_priv->ban_score) >= I915_CLIENT_SCORE_BANNED; > } > > +static inline struct i915_gem_context * > +__context_lookup(struct drm_i915_file_private *file_priv, u32 id) > +{ > + struct i915_gem_context *ctx; > + > + rcu_read_lock(); > + ctx = xa_load(&file_priv->context_xa, id); > + if (ctx && !kref_get_unless_zero(&ctx->ref)) > + ctx = NULL; > + rcu_read_unlock(); > + > + return ctx; > +} > + > +struct i915_gem_context * > +lazy_create_context_locked(struct drm_i915_file_private *file_priv, > + struct i915_gem_proto_context *pc, u32 id) > +{ > + struct i915_gem_context *ctx; > + void *old; assert_lock_held is alwasy nice in all _locked functions. It entirely compiles out without CONFIG_PROVE_LOCKING enabled. > + > + ctx = i915_gem_create_context(file_priv->dev_priv, pc); I think we need a prep patch which changes the calling convetion of this and anything it calls to only return a NULL pointer. Then i915_gem_context_lookup below can return the ERR_PTR(-ENOMEM) below for that case, and we know that we're never returning a wrong error pointer. > + if (IS_ERR(ctx)) > + return ctx; > + > + gem_context_register(ctx, file_priv, id); > + > + old = xa_erase(&file_priv->proto_context_xa, id); > + GEM_BUG_ON(old != pc); > + proto_context_close(pc); > + > + /* One for the xarray and one for the caller */ > + return i915_gem_context_get(ctx); > +} > + > +struct i915_gem_context * > +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) > +{ > + struct i915_gem_proto_context *pc; > + struct i915_gem_context *ctx; > + > + ctx = __context_lookup(file_priv, id); > + if (ctx) > + return ctx; > + > + mutex_lock(&file_priv->proto_context_lock); > + /* Try one more time under the lock */ > + ctx = __context_lookup(file_priv, id); > + if (!ctx) { > + pc = xa_load(&file_priv->proto_context_xa, id); > + if (!pc) > + ctx = ERR_PTR(-ENOENT); > + else > + ctx = lazy_create_context_locked(file_priv, pc, id); > + } > + mutex_unlock(&file_priv->proto_context_lock); > + > + return ctx; > +} > + > int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, > struct drm_file *file) > { > struct drm_i915_private *i915 = to_i915(dev); > struct drm_i915_gem_context_create_ext *args = data; > - struct i915_gem_proto_context *pc; > struct create_ext ext_data; > int ret; > u32 id; > @@ -1979,14 +2508,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, > return -EIO; > } > > - pc = proto_context_create(i915, args->flags); > - if (IS_ERR(pc)) > - return PTR_ERR(pc); > - > - ext_data.ctx = i915_gem_create_context(i915, pc); > - proto_context_close(pc); > - if (IS_ERR(ext_data.ctx)) > - return PTR_ERR(ext_data.ctx); > + ext_data.pc = proto_context_create(i915, args->flags); > + if (IS_ERR(ext_data.pc)) > + return PTR_ERR(ext_data.pc); > > if (args->flags & I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS) { > ret = i915_user_extensions(u64_to_user_ptr(args->extensions), > @@ -1994,20 +2518,20 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, > ARRAY_SIZE(create_extensions), > &ext_data); > if (ret) > - goto err_ctx; > + goto err_pc; > } > > - ret = gem_context_register(ext_data.ctx, ext_data.fpriv, &id); > + ret = proto_context_register(ext_data.fpriv, ext_data.pc, &id); > if (ret < 0) > - goto err_ctx; > + goto err_pc; > > args->ctx_id = id; > drm_dbg(&i915->drm, "HW context %d created\n", args->ctx_id); > > return 0; > > -err_ctx: > - context_close(ext_data.ctx); > +err_pc: > + proto_context_close(ext_data.pc); > return ret; > } > > @@ -2016,6 +2540,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, > { > struct drm_i915_gem_context_destroy *args = data; > struct drm_i915_file_private *file_priv = file->driver_priv; > + struct i915_gem_proto_context *pc; > struct i915_gem_context *ctx; > > if (args->pad != 0) > @@ -2024,11 +2549,21 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, > if (!args->ctx_id) > return -ENOENT; > > + mutex_lock(&file_priv->proto_context_lock); > ctx = xa_erase(&file_priv->context_xa, args->ctx_id); > - if (!ctx) > + pc = xa_erase(&file_priv->proto_context_xa, args->ctx_id); > + mutex_unlock(&file_priv->proto_context_lock); > + > + if (!ctx && !pc) > return -ENOENT; > + GEM_WARN_ON(ctx && pc); > + > + if (pc) > + proto_context_close(pc); > + > + if (ctx) > + context_close(ctx); > > - context_close(ctx); > return 0; > } > > @@ -2161,16 +2696,48 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, > { > struct drm_i915_file_private *file_priv = file->driver_priv; > struct drm_i915_gem_context_param *args = data; > + struct i915_gem_proto_context *pc; > struct i915_gem_context *ctx; > - int ret; > + int ret = 0; > > - ctx = i915_gem_context_lookup(file_priv, args->ctx_id); > - if (IS_ERR(ctx)) > - return PTR_ERR(ctx); > + ctx = __context_lookup(file_priv, args->ctx_id); > + if (ctx) > + goto set_ctx_param; > > - ret = ctx_setparam(file_priv, ctx, args); > + mutex_lock(&file_priv->proto_context_lock); > + ctx = __context_lookup(file_priv, args->ctx_id); > + if (ctx) > + goto unlock; > + > + pc = xa_load(&file_priv->proto_context_xa, args->ctx_id); > + if (!pc) { > + ret = -ENOENT; > + goto unlock; > + } > + > + ret = set_proto_ctx_param(file_priv, pc, args); I think we should have a FIXME here of not allowing this on some future platforms because just use CTX_CREATE_EXT. > + if (ret == -ENOTSUPP) { > + /* Some params, specifically SSEU, can only be set on fully I think this needs a FIXME: that this only holds during the conversion? Otherwise we kinda have a bit a problem me thinks ... > + * created contexts. > + */ > + ret = 0; > + ctx = lazy_create_context_locked(file_priv, pc, args->ctx_id); > + if (IS_ERR(ctx)) { > + ret = PTR_ERR(ctx); > + ctx = NULL; > + } > + } > + > +unlock: > + mutex_unlock(&file_priv->proto_context_lock); > + > +set_ctx_param: > + if (!ret && ctx) > + ret = ctx_setparam(file_priv, ctx, args); > + > + if (ctx) > + i915_gem_context_put(ctx); > > - i915_gem_context_put(ctx); > return ret; > } > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h > index b5c908f3f4f22..20411db84914a 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h > @@ -133,6 +133,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, > int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data, > struct drm_file *file); > > +struct i915_gem_context * > +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id); > + > static inline struct i915_gem_context * > i915_gem_context_get(struct i915_gem_context *ctx) > { > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h > index a42c429f94577..067ea3030ac91 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h > @@ -46,6 +46,26 @@ struct i915_gem_engines_iter { > const struct i915_gem_engines *engines; > }; > > +enum i915_gem_engine_type { > + I915_GEM_ENGINE_TYPE_INVALID = 0, > + I915_GEM_ENGINE_TYPE_PHYSICAL, > + I915_GEM_ENGINE_TYPE_BALANCED, > +}; > + Some kerneldoc missing? > +struct i915_gem_proto_engine { > + /** @type: Type of this engine */ > + enum i915_gem_engine_type type; > + > + /** @num_siblings: Engine, for physical */ > + struct intel_engine_cs *engine; > + > + /** @num_siblings: Number of balanced siblings */ > + unsigned int num_siblings; > + > + /** @num_siblings: Balanced siblings */ > + struct intel_engine_cs **siblings; I guess you're stuffing both balanced and siblings into one? > +}; > + > /** > * struct i915_gem_proto_context - prototype context > * > @@ -64,6 +84,12 @@ struct i915_gem_proto_context { > /** @sched: See i915_gem_context::sched */ > struct i915_sched_attr sched; > > + /** @num_user_engines: Number of user-specified engines or -1 */ > + int num_user_engines; > + > + /** @num_user_engines: User-specified engines */ > + struct i915_gem_proto_engine *user_engines; > + > bool single_timeline; > }; > > diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c > index e0f512ef7f3c6..32cf2103828f9 100644 > --- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c > +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c > @@ -80,6 +80,7 @@ void mock_init_contexts(struct drm_i915_private *i915) > struct i915_gem_context * > live_context(struct drm_i915_private *i915, struct file *file) > { > + struct drm_i915_file_private *fpriv = to_drm_file(file)->driver_priv; > struct i915_gem_proto_context *pc; > struct i915_gem_context *ctx; > int err; > @@ -96,10 +97,12 @@ live_context(struct drm_i915_private *i915, struct file *file) > > i915_gem_context_set_no_error_capture(ctx); > > - err = gem_context_register(ctx, to_drm_file(file)->driver_priv, &id); > + err = xa_alloc(&fpriv->context_xa, &id, NULL, xa_limit_32b, GFP_KERNEL); > if (err < 0) > goto err_ctx; > > + gem_context_register(ctx, fpriv, id); > + > return ctx; > > err_ctx: > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 004ed0e59c999..365c042529d72 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -200,6 +200,9 @@ struct drm_i915_file_private { > struct rcu_head rcu; > }; > > + struct mutex proto_context_lock; > + struct xarray proto_context_xa; Kerneldoc here please. Ideally also for the context_xa below (but maybe that's for later). Also please add a hint to the proto context struct that it's all fully protected by proto_context_lock above and is never visible outside of that. > + > struct xarray context_xa; > struct xarray vm_xa; > > @@ -1840,20 +1843,6 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > > struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags); > > -static inline struct i915_gem_context * > -i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) > -{ > - struct i915_gem_context *ctx; > - > - rcu_read_lock(); > - ctx = xa_load(&file_priv->context_xa, id); > - if (ctx && !kref_get_unless_zero(&ctx->ref)) > - ctx = NULL; > - rcu_read_unlock(); > - > - return ctx ? ctx : ERR_PTR(-ENOENT); > -} > - > /* i915_gem_evict.c */ > int __must_check i915_gem_evict_something(struct i915_address_space *vm, > u64 min_size, u64 alignment, I think I'll check details when I'm not getting distracted by the vm/engines validation code that I think shouldn't be here :-) -Daniel
On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > Yeah this needs some text to explain what/why you're doing this, and maybe > some rough sketch of the locking design. Yup. Will add. > > On Fri, Apr 23, 2021 at 05:31:26PM -0500, Jason Ekstrand wrote: > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> > > --- > > drivers/gpu/drm/i915/gem/i915_gem_context.c | 657 ++++++++++++++++-- > > drivers/gpu/drm/i915/gem/i915_gem_context.h | 3 + > > .../gpu/drm/i915/gem/i915_gem_context_types.h | 26 + > > .../gpu/drm/i915/gem/selftests/mock_context.c | 5 +- > > drivers/gpu/drm/i915/i915_drv.h | 17 +- > > 5 files changed, 648 insertions(+), 60 deletions(-) > > So I think the patch split here is a bit unfortunate, because you're > adding the new vm/engine validation code for proto context here, but the > old stuff is only removed in the next patches that make vm/engines > immutable after first use. Yes, it's very unfortunate. I'm reworking things now to have a different split which I think makes more sense but actually separates the add from the remove even further. :-( > I think a better split would be if this patch here only has all the > scaffolding. You already have the EOPNOTSUPP fallback (which I hope gets > removed), so moving the conversion entirely to later patches should be all > fine. > > Or do I miss something? > > I think the only concern I'm seeing is that bisectability might be a bit > lost, because we finalize the context in some cases in setparam. And if we > do the conversion in a different order than the one media uses for its > setparam, then later setparam might fail because the context is finalized > already. But also > - it's just bisectability of media functionality I think > - just check which order media calls CTX_SETPARAM and use that to do the > conversion > > And we should be fine ... I think? Before we go down that path, let's what you think of my new ordering. > Some more thoughts below, but the proto ctx stuff itself looks fine. > > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c > > index db9153e0f85a7..aa8e61211924f 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > > @@ -193,8 +193,15 @@ static int validate_priority(struct drm_i915_private *i915, > > > > static void proto_context_close(struct i915_gem_proto_context *pc) > > { > > + int i; > > + > > if (pc->vm) > > i915_vm_put(pc->vm); > > + if (pc->user_engines) { > > + for (i = 0; i < pc->num_user_engines; i++) > > + kfree(pc->user_engines[i].siblings); > > + kfree(pc->user_engines); > > + } > > kfree(pc); > > } > > > > @@ -274,12 +281,417 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags) > > proto_context_set_persistence(i915, pc, true); > > pc->sched.priority = I915_PRIORITY_NORMAL; > > > > + pc->num_user_engines = -1; > > + pc->user_engines = NULL; > > + > > if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) > > pc->single_timeline = true; > > > > return pc; > > } > > > > +static int proto_context_register_locked(struct drm_i915_file_private *fpriv, > > + struct i915_gem_proto_context *pc, > > + u32 *id) > > +{ > > + int ret; > > + void *old; > > assert_lock_held just for consistency. Done. > > + > > + ret = xa_alloc(&fpriv->context_xa, id, NULL, xa_limit_32b, GFP_KERNEL); > > + if (ret) > > + return ret; > > + > > + old = xa_store(&fpriv->proto_context_xa, *id, pc, GFP_KERNEL); > > + if (xa_is_err(old)) { > > + xa_erase(&fpriv->context_xa, *id); > > + return xa_err(old); > > + } > > + GEM_BUG_ON(old); > > + > > + return 0; > > +} > > + > > +static int proto_context_register(struct drm_i915_file_private *fpriv, > > + struct i915_gem_proto_context *pc, > > + u32 *id) > > +{ > > + int ret; > > + > > + mutex_lock(&fpriv->proto_context_lock); > > + ret = proto_context_register_locked(fpriv, pc, id); > > + mutex_unlock(&fpriv->proto_context_lock); > > + > > + return ret; > > +} > > + > > +static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv, > > + struct i915_gem_proto_context *pc, > > + const struct drm_i915_gem_context_param *args) > > +{ > > + struct i915_address_space *vm; > > + > > + if (args->size) > > + return -EINVAL; > > + > > + if (!pc->vm) > > + return -ENODEV; > > + > > + if (upper_32_bits(args->value)) > > + return -ENOENT; > > + > > + rcu_read_lock(); > > + vm = xa_load(&fpriv->vm_xa, args->value); > > + if (vm && !kref_get_unless_zero(&vm->ref)) > > + vm = NULL; > > + rcu_read_unlock(); > > + if (!vm) > > + return -ENOENT; > > + > > + i915_vm_put(pc->vm); > > + pc->vm = vm; > > + > > + return 0; > > +} > > + > > +struct set_proto_ctx_engines { > > + struct drm_i915_private *i915; > > + unsigned num_engines; > > + struct i915_gem_proto_engine *engines; > > +}; > > + > > +static int > > +set_proto_ctx_engines_balance(struct i915_user_extension __user *base, > > + void *data) > > +{ > > + struct i915_context_engines_load_balance __user *ext = > > + container_of_user(base, typeof(*ext), base); > > + const struct set_proto_ctx_engines *set = data; > > + struct drm_i915_private *i915 = set->i915; > > + struct intel_engine_cs **siblings; > > + u16 num_siblings, idx; > > + unsigned int n; > > + int err; > > + > > + if (!HAS_EXECLISTS(i915)) > > + return -ENODEV; > > + > > + if (intel_uc_uses_guc_submission(&i915->gt.uc)) > > + return -ENODEV; /* not implement yet */ > > + > > + if (get_user(idx, &ext->engine_index)) > > + return -EFAULT; > > + > > + if (idx >= set->num_engines) { > > + drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n", > > + idx, set->num_engines); > > + return -EINVAL; > > + } > > + > > + idx = array_index_nospec(idx, set->num_engines); > > + if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_INVALID) { > > + drm_dbg(&i915->drm, > > + "Invalid placement[%d], already occupied\n", idx); > > + return -EEXIST; > > + } > > + > > + if (get_user(num_siblings, &ext->num_siblings)) > > + return -EFAULT; > > + > > + err = check_user_mbz(&ext->flags); > > + if (err) > > + return err; > > + > > + err = check_user_mbz(&ext->mbz64); > > + if (err) > > + return err; > > + > > + if (num_siblings == 0) > > + return 0; > > + > > + siblings = kmalloc_array(num_siblings, sizeof(*siblings), GFP_KERNEL); > > + if (!siblings) > > + return -ENOMEM; > > + > > + for (n = 0; n < num_siblings; n++) { > > + struct i915_engine_class_instance ci; > > + > > + if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) { > > + err = -EFAULT; > > + goto err_siblings; > > + } > > + > > + siblings[n] = intel_engine_lookup_user(i915, > > + ci.engine_class, > > + ci.engine_instance); > > + if (!siblings[n]) { > > + drm_dbg(&i915->drm, > > + "Invalid sibling[%d]: { class:%d, inst:%d }\n", > > + n, ci.engine_class, ci.engine_instance); > > + err = -EINVAL; > > + goto err_siblings; > > + } > > + } > > + > > + if (num_siblings == 1) { > > + set->engines[idx].type = I915_GEM_ENGINE_TYPE_PHYSICAL; > > + set->engines[idx].engine = siblings[0]; > > + kfree(siblings); > > + } else { > > + set->engines[idx].type = I915_GEM_ENGINE_TYPE_BALANCED; > > + set->engines[idx].num_siblings = num_siblings; > > + set->engines[idx].siblings = siblings; > > + } > > + > > + return 0; > > + > > +err_siblings: > > + kfree(siblings); > > + > > + return err; > > +} > > + > > +static int > > +set_proto_ctx_engines_bond(struct i915_user_extension __user *base, void *data) > > +{ > > + struct i915_context_engines_bond __user *ext = > > + container_of_user(base, typeof(*ext), base); > > + const struct set_proto_ctx_engines *set = data; > > + struct drm_i915_private *i915 = set->i915; > > + struct i915_engine_class_instance ci; > > + struct intel_engine_cs *master; > > + u16 idx, num_bonds; > > + int err, n; > > + > > + if (get_user(idx, &ext->virtual_index)) > > + return -EFAULT; > > + > > + if (idx >= set->num_engines) { > > + drm_dbg(&i915->drm, > > + "Invalid index for virtual engine: %d >= %d\n", > > + idx, set->num_engines); > > + return -EINVAL; > > + } > > + > > + idx = array_index_nospec(idx, set->num_engines); > > + if (set->engines[idx].type == I915_GEM_ENGINE_TYPE_INVALID) { > > + drm_dbg(&i915->drm, "Invalid engine at %d\n", idx); > > + return -EINVAL; > > + } > > + > > + if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_PHYSICAL) { > > + drm_dbg(&i915->drm, > > + "Bonding with virtual engines not allowed\n"); > > + return -EINVAL; > > + } > > + > > + err = check_user_mbz(&ext->flags); > > + if (err) > > + return err; > > + > > + for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) { > > + err = check_user_mbz(&ext->mbz64[n]); > > + if (err) > > + return err; > > + } > > + > > + if (copy_from_user(&ci, &ext->master, sizeof(ci))) > > + return -EFAULT; > > + > > + master = intel_engine_lookup_user(i915, > > + ci.engine_class, > > + ci.engine_instance); > > + if (!master) { > > + drm_dbg(&i915->drm, > > + "Unrecognised master engine: { class:%u, instance:%u }\n", > > + ci.engine_class, ci.engine_instance); > > + return -EINVAL; > > + } > > + > > + if (get_user(num_bonds, &ext->num_bonds)) > > + return -EFAULT; > > + > > + for (n = 0; n < num_bonds; n++) { > > + struct intel_engine_cs *bond; > > + > > + if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) > > + return -EFAULT; > > + > > + bond = intel_engine_lookup_user(i915, > > + ci.engine_class, > > + ci.engine_instance); > > + if (!bond) { > > + drm_dbg(&i915->drm, > > + "Unrecognised engine[%d] for bonding: { class:%d, instance: %d }\n", > > + n, ci.engine_class, ci.engine_instance); > > + return -EINVAL; > > + } > > + } > > + > > + return 0; > > +} > > + > > +static const i915_user_extension_fn set_proto_ctx_engines_extensions[] = { > > + [I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_proto_ctx_engines_balance, > > + [I915_CONTEXT_ENGINES_EXT_BOND] = set_proto_ctx_engines_bond, > > +}; > > + > > +static int set_proto_ctx_engines(struct drm_i915_file_private *fpriv, > > + struct i915_gem_proto_context *pc, > > + const struct drm_i915_gem_context_param *args) > > +{ > > + struct drm_i915_private *i915 = fpriv->dev_priv; > > + struct set_proto_ctx_engines set = { .i915 = i915 }; > > + struct i915_context_param_engines __user *user = > > + u64_to_user_ptr(args->value); > > + unsigned int n; > > + u64 extensions; > > + int err; > > + > > + if (!args->size) { > > + kfree(pc->user_engines); > > + pc->num_user_engines = -1; > > + pc->user_engines = NULL; > > + return 0; > > + } > > + > > + BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->engines))); > > + if (args->size < sizeof(*user) || > > + !IS_ALIGNED(args->size, sizeof(*user->engines))) { > > + drm_dbg(&i915->drm, "Invalid size for engine array: %d\n", > > + args->size); > > + return -EINVAL; > > + } > > + > > + set.num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines); > > + if (set.num_engines > I915_EXEC_RING_MASK + 1) > > + return -EINVAL; > > + > > + set.engines = kmalloc_array(set.num_engines, sizeof(*set.engines), GFP_KERNEL); > > + if (!set.engines) > > + return -ENOMEM; > > + > > + for (n = 0; n < set.num_engines; n++) { > > + struct i915_engine_class_instance ci; > > + struct intel_engine_cs *engine; > > + > > + if (copy_from_user(&ci, &user->engines[n], sizeof(ci))) { > > + kfree(set.engines); > > + return -EFAULT; > > + } > > + > > + memset(&set.engines[n], 0, sizeof(set.engines[n])); > > + > > + if (ci.engine_class == (u16)I915_ENGINE_CLASS_INVALID && > > + ci.engine_instance == (u16)I915_ENGINE_CLASS_INVALID_NONE) > > + continue; > > + > > + engine = intel_engine_lookup_user(i915, > > + ci.engine_class, > > + ci.engine_instance); > > + if (!engine) { > > + drm_dbg(&i915->drm, > > + "Invalid engine[%d]: { class:%d, instance:%d }\n", > > + n, ci.engine_class, ci.engine_instance); > > + kfree(set.engines); > > + return -ENOENT; > > + } > > + > > + set.engines[n].type = I915_GEM_ENGINE_TYPE_PHYSICAL; > > + set.engines[n].engine = engine; > > + } > > + > > + err = -EFAULT; > > + if (!get_user(extensions, &user->extensions)) > > + err = i915_user_extensions(u64_to_user_ptr(extensions), > > + set_proto_ctx_engines_extensions, > > + ARRAY_SIZE(set_proto_ctx_engines_extensions), > > + &set); > > + if (err) { > > + kfree(set.engines); > > + return err; > > + } > > + > > + kfree(pc->user_engines); > > + pc->num_user_engines = set.num_engines; > > + pc->user_engines = set.engines; > > + > > + return 0; > > +} > > + > > +static int set_proto_ctx_param(struct drm_i915_file_private *fpriv, > > + struct i915_gem_proto_context *pc, > > + struct drm_i915_gem_context_param *args) > > +{ > > + int ret = 0; > > + > > + switch (args->param) { > > + case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE: > > + if (args->size) > > + ret = -EINVAL; > > + else if (args->value) > > + set_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags); > > Atomic bitops like in previous patches: Pls no :-) Yup. Fixed. > > + else > > + clear_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags); > > + break; > > + > > + case I915_CONTEXT_PARAM_BANNABLE: > > + if (args->size) > > + ret = -EINVAL; > > + else if (!capable(CAP_SYS_ADMIN) && !args->value) > > + ret = -EPERM; > > + else if (args->value) > > + set_bit(UCONTEXT_BANNABLE, &pc->user_flags); > > + else > > + clear_bit(UCONTEXT_BANNABLE, &pc->user_flags); > > + break; > > + > > + case I915_CONTEXT_PARAM_RECOVERABLE: > > + if (args->size) > > + ret = -EINVAL; > > + else if (args->value) > > + set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags); > > + else > > + clear_bit(UCONTEXT_RECOVERABLE, &pc->user_flags); > > + break; > > + > > + case I915_CONTEXT_PARAM_PRIORITY: > > + ret = validate_priority(fpriv->dev_priv, args); > > + if (!ret) > > + pc->sched.priority = args->value; > > + break; > > + > > + case I915_CONTEXT_PARAM_SSEU: > > + ret = -ENOTSUPP; > > + break; > > + > > + case I915_CONTEXT_PARAM_VM: > > + ret = set_proto_ctx_vm(fpriv, pc, args); > > + break; > > + > > + case I915_CONTEXT_PARAM_ENGINES: > > + ret = set_proto_ctx_engines(fpriv, pc, args); > > + break; > > + > > + case I915_CONTEXT_PARAM_PERSISTENCE: > > + if (args->size) > > + ret = -EINVAL; > > + else if (args->value) > > + set_bit(UCONTEXT_PERSISTENCE, &pc->user_flags); > > + else > > + clear_bit(UCONTEXT_PERSISTENCE, &pc->user_flags); > > + break; > > + > > + case I915_CONTEXT_PARAM_NO_ZEROMAP: > > + case I915_CONTEXT_PARAM_BAN_PERIOD: > > + case I915_CONTEXT_PARAM_RINGSIZE: > > + default: > > + ret = -EINVAL; > > + break; > > + } > > + > > + return ret; > > +} > > + > > static struct i915_address_space * > > context_get_vm_rcu(struct i915_gem_context *ctx) > > { > > @@ -450,6 +862,47 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx) > > return e; > > } > > > > +static struct i915_gem_engines *user_engines(struct i915_gem_context *ctx, > > + unsigned int num_engines, > > + struct i915_gem_proto_engine *pe) > > +{ > > + struct i915_gem_engines *e; > > + unsigned int n; > > + > > + e = alloc_engines(num_engines); > > + for (n = 0; n < num_engines; n++) { > > + struct intel_context *ce; > > + > > + switch (pe[n].type) { > > + case I915_GEM_ENGINE_TYPE_PHYSICAL: > > + ce = intel_context_create(pe[n].engine); > > + break; > > + > > + case I915_GEM_ENGINE_TYPE_BALANCED: > > + ce = intel_execlists_create_virtual(pe[n].siblings, > > + pe[n].num_siblings); > > + break; > > + > > + case I915_GEM_ENGINE_TYPE_INVALID: > > + default: > > + GEM_WARN_ON(pe[n].type != I915_GEM_ENGINE_TYPE_INVALID); > > + continue; > > + } > > + > > + if (IS_ERR(ce)) { > > + __free_engines(e, n); > > + return ERR_CAST(ce); > > + } > > + > > + intel_context_set_gem(ce, ctx); > > + > > + e->engines[n] = ce; > > + } > > + e->num_engines = num_engines; > > + > > + return e; > > +} > > + > > void i915_gem_context_release(struct kref *ref) > > { > > struct i915_gem_context *ctx = container_of(ref, typeof(*ctx), ref); > > @@ -890,6 +1343,24 @@ i915_gem_create_context(struct drm_i915_private *i915, > > mutex_unlock(&ctx->mutex); > > } > > > > + if (pc->num_user_engines >= 0) { > > + struct i915_gem_engines *engines; > > + > > + engines = user_engines(ctx, pc->num_user_engines, > > + pc->user_engines); > > + if (IS_ERR(engines)) { > > + context_close(ctx); > > + return ERR_CAST(engines); > > + } > > + > > + mutex_lock(&ctx->engines_mutex); > > + i915_gem_context_set_user_engines(ctx); > > + engines = rcu_replace_pointer(ctx->engines, engines, 1); > > + mutex_unlock(&ctx->engines_mutex); > > + > > + free_engines(engines); > > + } > > + > > if (pc->single_timeline) { > > ret = drm_syncobj_create(&ctx->syncobj, > > DRM_SYNCOBJ_CREATE_SIGNALED, > > @@ -916,12 +1387,12 @@ void i915_gem_init__contexts(struct drm_i915_private *i915) > > init_contexts(&i915->gem.contexts); > > } > > > > -static int gem_context_register(struct i915_gem_context *ctx, > > - struct drm_i915_file_private *fpriv, > > - u32 *id) > > +static void gem_context_register(struct i915_gem_context *ctx, > > + struct drm_i915_file_private *fpriv, > > + u32 id) > > { > > struct drm_i915_private *i915 = ctx->i915; > > - int ret; > > + void *old; > > > > ctx->file_priv = fpriv; > > > > @@ -930,19 +1401,12 @@ static int gem_context_register(struct i915_gem_context *ctx, > > current->comm, pid_nr(ctx->pid)); > > > > /* And finally expose ourselves to userspace via the idr */ > > - ret = xa_alloc(&fpriv->context_xa, id, ctx, xa_limit_32b, GFP_KERNEL); > > - if (ret) > > - goto err_pid; > > + old = xa_store(&fpriv->context_xa, id, ctx, GFP_KERNEL); > > + GEM_BUG_ON(old); > > > > spin_lock(&i915->gem.contexts.lock); > > list_add_tail(&ctx->link, &i915->gem.contexts.list); > > spin_unlock(&i915->gem.contexts.lock); > > - > > - return 0; > > - > > -err_pid: > > - put_pid(fetch_and_zero(&ctx->pid)); > > - return ret; > > } > > > > int i915_gem_context_open(struct drm_i915_private *i915, > > @@ -952,9 +1416,12 @@ int i915_gem_context_open(struct drm_i915_private *i915, > > struct i915_gem_proto_context *pc; > > struct i915_gem_context *ctx; > > int err; > > - u32 id; > > > > - xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC); > > + mutex_init(&file_priv->proto_context_lock); > > + xa_init_flags(&file_priv->proto_context_xa, XA_FLAGS_ALLOC); > > + > > + /* 0 reserved for the default context */ > > + xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC1); > > > > /* 0 reserved for invalid/unassigned ppgtt */ > > xa_init_flags(&file_priv->vm_xa, XA_FLAGS_ALLOC1); > > @@ -972,28 +1439,31 @@ int i915_gem_context_open(struct drm_i915_private *i915, > > goto err; > > } > > > > - err = gem_context_register(ctx, file_priv, &id); > > - if (err < 0) > > - goto err_ctx; > > + gem_context_register(ctx, file_priv, 0); > > > > - GEM_BUG_ON(id); > > return 0; > > > > -err_ctx: > > - context_close(ctx); > > err: > > xa_destroy(&file_priv->vm_xa); > > xa_destroy(&file_priv->context_xa); > > + xa_destroy(&file_priv->proto_context_xa); > > + mutex_destroy(&file_priv->proto_context_lock); > > return err; > > } > > > > void i915_gem_context_close(struct drm_file *file) > > { > > struct drm_i915_file_private *file_priv = file->driver_priv; > > + struct i915_gem_proto_context *pc; > > struct i915_address_space *vm; > > struct i915_gem_context *ctx; > > unsigned long idx; > > > > + xa_for_each(&file_priv->proto_context_xa, idx, pc) > > + proto_context_close(pc); > > + xa_destroy(&file_priv->proto_context_xa); > > + mutex_destroy(&file_priv->proto_context_lock); > > + > > xa_for_each(&file_priv->context_xa, idx, ctx) > > context_close(ctx); > > xa_destroy(&file_priv->context_xa); > > @@ -1918,7 +2388,7 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv, > > } > > > > struct create_ext { > > - struct i915_gem_context *ctx; > > + struct i915_gem_proto_context *pc; > > struct drm_i915_file_private *fpriv; > > }; > > > > @@ -1933,7 +2403,7 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data) > > if (local.param.ctx_id) > > return -EINVAL; > > > > - return ctx_setparam(arg->fpriv, arg->ctx, &local.param); > > + return set_proto_ctx_param(arg->fpriv, arg->pc, &local.param); > > } > > > > static int invalid_ext(struct i915_user_extension __user *ext, void *data) > > @@ -1951,12 +2421,71 @@ static bool client_is_banned(struct drm_i915_file_private *file_priv) > > return atomic_read(&file_priv->ban_score) >= I915_CLIENT_SCORE_BANNED; > > } > > > > +static inline struct i915_gem_context * > > +__context_lookup(struct drm_i915_file_private *file_priv, u32 id) > > +{ > > + struct i915_gem_context *ctx; > > + > > + rcu_read_lock(); > > + ctx = xa_load(&file_priv->context_xa, id); > > + if (ctx && !kref_get_unless_zero(&ctx->ref)) > > + ctx = NULL; > > + rcu_read_unlock(); > > + > > + return ctx; > > +} > > + > > +struct i915_gem_context * > > +lazy_create_context_locked(struct drm_i915_file_private *file_priv, > > + struct i915_gem_proto_context *pc, u32 id) > > +{ > > + struct i915_gem_context *ctx; > > + void *old; > > assert_lock_held is alwasy nice in all _locked functions. It entirely > compiles out without CONFIG_PROVE_LOCKING enabled. Done. > > + > > + ctx = i915_gem_create_context(file_priv->dev_priv, pc); > > I think we need a prep patch which changes the calling convetion of this > and anything it calls to only return a NULL pointer. Then > i915_gem_context_lookup below can return the ERR_PTR(-ENOMEM) below for > that case, and we know that we're never returning a wrong error pointer. > > > + if (IS_ERR(ctx)) > > + return ctx; > > + > > + gem_context_register(ctx, file_priv, id); > > + > > + old = xa_erase(&file_priv->proto_context_xa, id); > > + GEM_BUG_ON(old != pc); > > + proto_context_close(pc); > > + > > + /* One for the xarray and one for the caller */ > > + return i915_gem_context_get(ctx); > > +} > > + > > +struct i915_gem_context * > > +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) > > +{ > > + struct i915_gem_proto_context *pc; > > + struct i915_gem_context *ctx; > > + > > + ctx = __context_lookup(file_priv, id); > > + if (ctx) > > + return ctx; > > + > > + mutex_lock(&file_priv->proto_context_lock); > > + /* Try one more time under the lock */ > > + ctx = __context_lookup(file_priv, id); > > + if (!ctx) { > > + pc = xa_load(&file_priv->proto_context_xa, id); > > + if (!pc) > > + ctx = ERR_PTR(-ENOENT); > > + else > > + ctx = lazy_create_context_locked(file_priv, pc, id); > > + } > > + mutex_unlock(&file_priv->proto_context_lock); > > + > > + return ctx; > > +} > > + > > int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, > > struct drm_file *file) > > { > > struct drm_i915_private *i915 = to_i915(dev); > > struct drm_i915_gem_context_create_ext *args = data; > > - struct i915_gem_proto_context *pc; > > struct create_ext ext_data; > > int ret; > > u32 id; > > @@ -1979,14 +2508,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, > > return -EIO; > > } > > > > - pc = proto_context_create(i915, args->flags); > > - if (IS_ERR(pc)) > > - return PTR_ERR(pc); > > - > > - ext_data.ctx = i915_gem_create_context(i915, pc); > > - proto_context_close(pc); > > - if (IS_ERR(ext_data.ctx)) > > - return PTR_ERR(ext_data.ctx); > > + ext_data.pc = proto_context_create(i915, args->flags); > > + if (IS_ERR(ext_data.pc)) > > + return PTR_ERR(ext_data.pc); > > > > if (args->flags & I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS) { > > ret = i915_user_extensions(u64_to_user_ptr(args->extensions), > > @@ -1994,20 +2518,20 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, > > ARRAY_SIZE(create_extensions), > > &ext_data); > > if (ret) > > - goto err_ctx; > > + goto err_pc; > > } > > > > - ret = gem_context_register(ext_data.ctx, ext_data.fpriv, &id); > > + ret = proto_context_register(ext_data.fpriv, ext_data.pc, &id); > > if (ret < 0) > > - goto err_ctx; > > + goto err_pc; > > > > args->ctx_id = id; > > drm_dbg(&i915->drm, "HW context %d created\n", args->ctx_id); > > > > return 0; > > > > -err_ctx: > > - context_close(ext_data.ctx); > > +err_pc: > > + proto_context_close(ext_data.pc); > > return ret; > > } > > > > @@ -2016,6 +2540,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, > > { > > struct drm_i915_gem_context_destroy *args = data; > > struct drm_i915_file_private *file_priv = file->driver_priv; > > + struct i915_gem_proto_context *pc; > > struct i915_gem_context *ctx; > > > > if (args->pad != 0) > > @@ -2024,11 +2549,21 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, > > if (!args->ctx_id) > > return -ENOENT; > > > > + mutex_lock(&file_priv->proto_context_lock); > > ctx = xa_erase(&file_priv->context_xa, args->ctx_id); > > - if (!ctx) > > + pc = xa_erase(&file_priv->proto_context_xa, args->ctx_id); > > + mutex_unlock(&file_priv->proto_context_lock); > > + > > + if (!ctx && !pc) > > return -ENOENT; > > + GEM_WARN_ON(ctx && pc); > > + > > + if (pc) > > + proto_context_close(pc); > > + > > + if (ctx) > > + context_close(ctx); > > > > - context_close(ctx); > > return 0; > > } > > > > @@ -2161,16 +2696,48 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, > > { > > struct drm_i915_file_private *file_priv = file->driver_priv; > > struct drm_i915_gem_context_param *args = data; > > + struct i915_gem_proto_context *pc; > > struct i915_gem_context *ctx; > > - int ret; > > + int ret = 0; > > > > - ctx = i915_gem_context_lookup(file_priv, args->ctx_id); > > - if (IS_ERR(ctx)) > > - return PTR_ERR(ctx); > > + ctx = __context_lookup(file_priv, args->ctx_id); > > + if (ctx) > > + goto set_ctx_param; > > > > - ret = ctx_setparam(file_priv, ctx, args); > > + mutex_lock(&file_priv->proto_context_lock); > > + ctx = __context_lookup(file_priv, args->ctx_id); > > + if (ctx) > > + goto unlock; > > + > > + pc = xa_load(&file_priv->proto_context_xa, args->ctx_id); > > + if (!pc) { > > + ret = -ENOENT; > > + goto unlock; > > + } > > + > > + ret = set_proto_ctx_param(file_priv, pc, args); > > I think we should have a FIXME here of not allowing this on some future > platforms because just use CTX_CREATE_EXT. Done. > > + if (ret == -ENOTSUPP) { > > + /* Some params, specifically SSEU, can only be set on fully > > I think this needs a FIXME: that this only holds during the conversion? > Otherwise we kinda have a bit a problem me thinks ... I'm not sure what you mean by that. > > + * created contexts. > > + */ > > + ret = 0; > > + ctx = lazy_create_context_locked(file_priv, pc, args->ctx_id); > > + if (IS_ERR(ctx)) { > > + ret = PTR_ERR(ctx); > > + ctx = NULL; > > + } > > + } > > + > > +unlock: > > + mutex_unlock(&file_priv->proto_context_lock); > > + > > +set_ctx_param: > > + if (!ret && ctx) > > + ret = ctx_setparam(file_priv, ctx, args); > > + > > + if (ctx) > > + i915_gem_context_put(ctx); > > > > - i915_gem_context_put(ctx); > > return ret; > > } > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h > > index b5c908f3f4f22..20411db84914a 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h > > @@ -133,6 +133,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, > > int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data, > > struct drm_file *file); > > > > +struct i915_gem_context * > > +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id); > > + > > static inline struct i915_gem_context * > > i915_gem_context_get(struct i915_gem_context *ctx) > > { > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h > > index a42c429f94577..067ea3030ac91 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h > > @@ -46,6 +46,26 @@ struct i915_gem_engines_iter { > > const struct i915_gem_engines *engines; > > }; > > > > +enum i915_gem_engine_type { > > + I915_GEM_ENGINE_TYPE_INVALID = 0, > > + I915_GEM_ENGINE_TYPE_PHYSICAL, > > + I915_GEM_ENGINE_TYPE_BALANCED, > > +}; > > + > > Some kerneldoc missing? Yup. Fixed. > > +struct i915_gem_proto_engine { > > + /** @type: Type of this engine */ > > + enum i915_gem_engine_type type; > > + > > + /** @num_siblings: Engine, for physical */ > > + struct intel_engine_cs *engine; > > + > > + /** @num_siblings: Number of balanced siblings */ > > + unsigned int num_siblings; > > + > > + /** @num_siblings: Balanced siblings */ > > + struct intel_engine_cs **siblings; > > I guess you're stuffing both balanced and siblings into one? Nope. Thanks to the patch to disable balance+bonded, we just throw the bonding info away. :-D > > +}; > > + > > /** > > * struct i915_gem_proto_context - prototype context > > * > > @@ -64,6 +84,12 @@ struct i915_gem_proto_context { > > /** @sched: See i915_gem_context::sched */ > > struct i915_sched_attr sched; > > > > + /** @num_user_engines: Number of user-specified engines or -1 */ > > + int num_user_engines; > > + > > + /** @num_user_engines: User-specified engines */ > > + struct i915_gem_proto_engine *user_engines; > > + > > bool single_timeline; > > }; > > > > diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c > > index e0f512ef7f3c6..32cf2103828f9 100644 > > --- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c > > +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c > > @@ -80,6 +80,7 @@ void mock_init_contexts(struct drm_i915_private *i915) > > struct i915_gem_context * > > live_context(struct drm_i915_private *i915, struct file *file) > > { > > + struct drm_i915_file_private *fpriv = to_drm_file(file)->driver_priv; > > struct i915_gem_proto_context *pc; > > struct i915_gem_context *ctx; > > int err; > > @@ -96,10 +97,12 @@ live_context(struct drm_i915_private *i915, struct file *file) > > > > i915_gem_context_set_no_error_capture(ctx); > > > > - err = gem_context_register(ctx, to_drm_file(file)->driver_priv, &id); > > + err = xa_alloc(&fpriv->context_xa, &id, NULL, xa_limit_32b, GFP_KERNEL); > > if (err < 0) > > goto err_ctx; > > > > + gem_context_register(ctx, fpriv, id); > > + > > return ctx; > > > > err_ctx: > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > index 004ed0e59c999..365c042529d72 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.h > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > @@ -200,6 +200,9 @@ struct drm_i915_file_private { > > struct rcu_head rcu; > > }; > > > > + struct mutex proto_context_lock; > > + struct xarray proto_context_xa; > > Kerneldoc here please. Ideally also for the context_xa below (but maybe > that's for later). > > Also please add a hint to the proto context struct that it's all fully > protected by proto_context_lock above and is never visible outside of > that. Both done. > > + > > struct xarray context_xa; > > struct xarray vm_xa; > > > > @@ -1840,20 +1843,6 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > > > > struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags); > > > > -static inline struct i915_gem_context * > > -i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) > > -{ > > - struct i915_gem_context *ctx; > > - > > - rcu_read_lock(); > > - ctx = xa_load(&file_priv->context_xa, id); > > - if (ctx && !kref_get_unless_zero(&ctx->ref)) > > - ctx = NULL; > > - rcu_read_unlock(); > > - > > - return ctx ? ctx : ERR_PTR(-ENOENT); > > -} > > - > > /* i915_gem_evict.c */ > > int __must_check i915_gem_evict_something(struct i915_address_space *vm, > > u64 min_size, u64 alignment, > > I think I'll check details when I'm not getting distracted by the > vm/engines validation code that I think shouldn't be here :-) No worries. I should be sending out a new version of the series shortly that's hopefully easier to read. --Jason
On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > + ret = set_proto_ctx_param(file_priv, pc, args); > > > > I think we should have a FIXME here of not allowing this on some future > > platforms because just use CTX_CREATE_EXT. > > Done. > > > > + if (ret == -ENOTSUPP) { > > > + /* Some params, specifically SSEU, can only be set on fully > > > > I think this needs a FIXME: that this only holds during the conversion? > > Otherwise we kinda have a bit a problem me thinks ... > > I'm not sure what you mean by that. Well I'm at least assuming that we wont have this case anymore, i.e. there's only two kinds of parameters: - those which are valid only on proto context - those which are valid on both (like priority) This SSEU thing looks like a 3rd parameter, which is only valid on finalized context. That feels all kinds of wrong. Will it stay? If yes *ugh* and why? -Daniel
On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > + ret = set_proto_ctx_param(file_priv, pc, args); > > > > > > I think we should have a FIXME here of not allowing this on some future > > > platforms because just use CTX_CREATE_EXT. > > > > Done. > > > > > > + if (ret == -ENOTSUPP) { > > > > + /* Some params, specifically SSEU, can only be set on fully > > > > > > I think this needs a FIXME: that this only holds during the conversion? > > > Otherwise we kinda have a bit a problem me thinks ... > > > > I'm not sure what you mean by that. > > Well I'm at least assuming that we wont have this case anymore, i.e. > there's only two kinds of parameters: > - those which are valid only on proto context > - those which are valid on both (like priority) > > This SSEU thing looks like a 3rd parameter, which is only valid on > finalized context. That feels all kinds of wrong. Will it stay? If yes > *ugh* and why? Because I was being lazy. The SSEU stuff is a fairly complex param to parse and it's always set live. I can factor out the SSEU parsing code if you want and it shouldn't be too bad in the end. --Jason
On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote: > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > + ret = set_proto_ctx_param(file_priv, pc, args); > > > > > > > > I think we should have a FIXME here of not allowing this on some future > > > > platforms because just use CTX_CREATE_EXT. > > > > > > Done. > > > > > > > > + if (ret == -ENOTSUPP) { > > > > > + /* Some params, specifically SSEU, can only be set on fully > > > > > > > > I think this needs a FIXME: that this only holds during the conversion? > > > > Otherwise we kinda have a bit a problem me thinks ... > > > > > > I'm not sure what you mean by that. > > > > Well I'm at least assuming that we wont have this case anymore, i.e. > > there's only two kinds of parameters: > > - those which are valid only on proto context > > - those which are valid on both (like priority) > > > > This SSEU thing looks like a 3rd parameter, which is only valid on > > finalized context. That feels all kinds of wrong. Will it stay? If yes > > *ugh* and why? > > Because I was being lazy. The SSEU stuff is a fairly complex param to > parse and it's always set live. I can factor out the SSEU parsing > code if you want and it shouldn't be too bad in the end. Yeah I think the special case here is a bit too jarring. -Daniel
On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote: > > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: > > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > + ret = set_proto_ctx_param(file_priv, pc, args); > > > > > > > > > > I think we should have a FIXME here of not allowing this on some future > > > > > platforms because just use CTX_CREATE_EXT. > > > > > > > > Done. > > > > > > > > > > + if (ret == -ENOTSUPP) { > > > > > > + /* Some params, specifically SSEU, can only be set on fully > > > > > > > > > > I think this needs a FIXME: that this only holds during the conversion? > > > > > Otherwise we kinda have a bit a problem me thinks ... > > > > > > > > I'm not sure what you mean by that. > > > > > > Well I'm at least assuming that we wont have this case anymore, i.e. > > > there's only two kinds of parameters: > > > - those which are valid only on proto context > > > - those which are valid on both (like priority) > > > > > > This SSEU thing looks like a 3rd parameter, which is only valid on > > > finalized context. That feels all kinds of wrong. Will it stay? If yes > > > *ugh* and why? > > > > Because I was being lazy. The SSEU stuff is a fairly complex param to > > parse and it's always set live. I can factor out the SSEU parsing > > code if you want and it shouldn't be too bad in the end. > > Yeah I think the special case here is a bit too jarring. I rolled a v5 that allows you to set SSEU as a create param. I'm not a huge fan of that much code duplication for the SSEU set but I guess that's what we get for deciding to "unify" our context creation parameter path with our on-the-fly parameter path.... You can look at it here: https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588 I'm also going to send it to trybot. --Jason
On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote: > > On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote: > > > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: > > > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > + ret = set_proto_ctx_param(file_priv, pc, args); > > > > > > > > > > > > I think we should have a FIXME here of not allowing this on some future > > > > > > platforms because just use CTX_CREATE_EXT. > > > > > > > > > > Done. > > > > > > > > > > > > + if (ret == -ENOTSUPP) { > > > > > > > + /* Some params, specifically SSEU, can only be set on fully > > > > > > > > > > > > I think this needs a FIXME: that this only holds during the conversion? > > > > > > Otherwise we kinda have a bit a problem me thinks ... > > > > > > > > > > I'm not sure what you mean by that. > > > > > > > > Well I'm at least assuming that we wont have this case anymore, i.e. > > > > there's only two kinds of parameters: > > > > - those which are valid only on proto context > > > > - those which are valid on both (like priority) > > > > > > > > This SSEU thing looks like a 3rd parameter, which is only valid on > > > > finalized context. That feels all kinds of wrong. Will it stay? If yes > > > > *ugh* and why? > > > > > > Because I was being lazy. The SSEU stuff is a fairly complex param to > > > parse and it's always set live. I can factor out the SSEU parsing > > > code if you want and it shouldn't be too bad in the end. > > > > Yeah I think the special case here is a bit too jarring. > > I rolled a v5 that allows you to set SSEU as a create param. I'm not > a huge fan of that much code duplication for the SSEU set but I guess > that's what we get for deciding to "unify" our context creation > parameter path with our on-the-fly parameter path.... > > You can look at it here: > > https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588 Hm yeah the duplication of the render engine check is a bit annoying. What's worse, if you tthrow another set_engines on top it's probably all wrong then. The old thing solved that by just throwing that intel_context away. You're also not keeping the engine id in the proto ctx for this, so there's probably some gaps there. We'd need to clear the SSEU if userspace puts another context there. But also no userspace does that. Plus cursory review of userspace show - mesa doesn't set this - compute sets its right before running the batch - media sets it as the last thing of context creation So it's kinda not needed. But also we're asking umd to switch over to CTX_CREATE_EXT, and if sseu doesn't work for that media team will be puzzled. And we've confused them enough already with our uapis. Another idea: proto_set_sseu just stores the uapi struct and a note that it's set, and checks nothing. To validate sseu on proto context we do (but only when an sseu parameter is set): 1. finalize the context 2. call the real set_sseu for validation 3. throw the finalized context away again, it was just for validating the overall thing That way we don't have to consider all the interactions of setting sseu and engines in any order on proto context, validation code is guaranteed shared. Only downside is that there's a slight chance in behaviour: SSEU, then setting another engine in that slot will fail instead of throwing the sseu parameters away. That's the right thing for CTX_CREATE_EXT anyway, and current userspace doesn't care. Thoughts? > I'm also going to send it to trybot. If you resend pls include all my r-b, I think some got lost in v4. Also, in the kernel at least we expect minimal commit message with a bit of context, there's no Part-of: link pointing at the entire MR with overview and discussion, the patchwork Link: we add is a pretty bad substitute. Some of the new patches in v4 are a bit too terse on that. And finally I'm still not a big fan of the add/remove split over patches, but oh well. -Daniel
On 30/04/2021 07:53, Daniel Vetter wrote: > On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote: >> >> On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote: >>> >>> On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote: >>>> On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote: >>>>> On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: >>>>>> On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: >>>>>>>> + ret = set_proto_ctx_param(file_priv, pc, args); >>>>>>> >>>>>>> I think we should have a FIXME here of not allowing this on some future >>>>>>> platforms because just use CTX_CREATE_EXT. >>>>>> >>>>>> Done. >>>>>> >>>>>>>> + if (ret == -ENOTSUPP) { >>>>>>>> + /* Some params, specifically SSEU, can only be set on fully >>>>>>> >>>>>>> I think this needs a FIXME: that this only holds during the conversion? >>>>>>> Otherwise we kinda have a bit a problem me thinks ... >>>>>> >>>>>> I'm not sure what you mean by that. >>>>> >>>>> Well I'm at least assuming that we wont have this case anymore, i.e. >>>>> there's only two kinds of parameters: >>>>> - those which are valid only on proto context >>>>> - those which are valid on both (like priority) >>>>> >>>>> This SSEU thing looks like a 3rd parameter, which is only valid on >>>>> finalized context. That feels all kinds of wrong. Will it stay? If yes >>>>> *ugh* and why? >>>> >>>> Because I was being lazy. The SSEU stuff is a fairly complex param to >>>> parse and it's always set live. I can factor out the SSEU parsing >>>> code if you want and it shouldn't be too bad in the end. >>> >>> Yeah I think the special case here is a bit too jarring. >> >> I rolled a v5 that allows you to set SSEU as a create param. I'm not >> a huge fan of that much code duplication for the SSEU set but I guess >> that's what we get for deciding to "unify" our context creation >> parameter path with our on-the-fly parameter path.... >> >> You can look at it here: >> >> https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588 > > Hm yeah the duplication of the render engine check is a bit annoying. > What's worse, if you tthrow another set_engines on top it's probably > all wrong then. The old thing solved that by just throwing that > intel_context away. > > You're also not keeping the engine id in the proto ctx for this, so > there's probably some gaps there. We'd need to clear the SSEU if > userspace puts another context there. But also no userspace does that. > > Plus cursory review of userspace show > - mesa doesn't set this > - compute sets its right before running the batch > - media sets it as the last thing of context creation Noticed a long sub-thread so looked inside.. SSEU is a really an interesting one. For current userspace limiting to context creation is fine, since it is only allowed for Icelake/VME use case. But if you notice the comment inside: /* ABI restriction - VME use case only. */ It is a hint there was, or could be, more to this uapi than that. And from memory I think limiting to creation time will nip the hopes media had to use this dynamically on other platforms in the bud. So not that good really. They had convincing numbers what gets significantly better if we allowed dynamic control to this, just that as always, open source userspace was not there so we never allowed it. However if you come up with a new world order where it can only be done at context creation, as said already, the possibility for that improvement (aka further improving the competitive advantage) is most likely dashed. Regards, Tvrtko
On Fri, Apr 30, 2021 at 1:58 PM Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote: > > > On 30/04/2021 07:53, Daniel Vetter wrote: > > On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote: > >> > >> On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote: > >>> > >>> On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote: > >>>> On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote: > >>>>> On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: > >>>>>> On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: > >>>>>>>> + ret = set_proto_ctx_param(file_priv, pc, args); > >>>>>>> > >>>>>>> I think we should have a FIXME here of not allowing this on some future > >>>>>>> platforms because just use CTX_CREATE_EXT. > >>>>>> > >>>>>> Done. > >>>>>> > >>>>>>>> + if (ret == -ENOTSUPP) { > >>>>>>>> + /* Some params, specifically SSEU, can only be set on fully > >>>>>>> > >>>>>>> I think this needs a FIXME: that this only holds during the conversion? > >>>>>>> Otherwise we kinda have a bit a problem me thinks ... > >>>>>> > >>>>>> I'm not sure what you mean by that. > >>>>> > >>>>> Well I'm at least assuming that we wont have this case anymore, i.e. > >>>>> there's only two kinds of parameters: > >>>>> - those which are valid only on proto context > >>>>> - those which are valid on both (like priority) > >>>>> > >>>>> This SSEU thing looks like a 3rd parameter, which is only valid on > >>>>> finalized context. That feels all kinds of wrong. Will it stay? If yes > >>>>> *ugh* and why? > >>>> > >>>> Because I was being lazy. The SSEU stuff is a fairly complex param to > >>>> parse and it's always set live. I can factor out the SSEU parsing > >>>> code if you want and it shouldn't be too bad in the end. > >>> > >>> Yeah I think the special case here is a bit too jarring. > >> > >> I rolled a v5 that allows you to set SSEU as a create param. I'm not > >> a huge fan of that much code duplication for the SSEU set but I guess > >> that's what we get for deciding to "unify" our context creation > >> parameter path with our on-the-fly parameter path.... > >> > >> You can look at it here: > >> > >> https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588 > > > > Hm yeah the duplication of the render engine check is a bit annoying. > > What's worse, if you tthrow another set_engines on top it's probably > > all wrong then. The old thing solved that by just throwing that > > intel_context away. > > > > You're also not keeping the engine id in the proto ctx for this, so > > there's probably some gaps there. We'd need to clear the SSEU if > > userspace puts another context there. But also no userspace does that. > > > > Plus cursory review of userspace show > > - mesa doesn't set this > > - compute sets its right before running the batch > > - media sets it as the last thing of context creation > > Noticed a long sub-thread so looked inside.. > > SSEU is a really an interesting one. > > For current userspace limiting to context creation is fine, since it is > only allowed for Icelake/VME use case. But if you notice the comment inside: > > /* ABI restriction - VME use case only. */ > > It is a hint there was, or could be, more to this uapi than that. > > And from memory I think limiting to creation time will nip the hopes > media had to use this dynamically on other platforms in the bud. So not > that good really. They had convincing numbers what gets significantly > better if we allowed dynamic control to this, just that as always, open > source userspace was not there so we never allowed it. However if you > come up with a new world order where it can only be done at context > creation, as said already, the possibility for that improvement (aka > further improving the competitive advantage) is most likely dashed. Hm are you sure that this is create-time only? media-driver uses it like that, but from my checking compute-runtime updates SSEU mode before every execbuf call. So it very much looked like we have to keep this dynamic. Or do you mean this is defacto dead code? this = compute setting it before every batch I mean here. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
On 30/04/2021 13:30, Daniel Vetter wrote: > On Fri, Apr 30, 2021 at 1:58 PM Tvrtko Ursulin > <tvrtko.ursulin@linux.intel.com> wrote: >> On 30/04/2021 07:53, Daniel Vetter wrote: >>> On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote: >>>> >>>> On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote: >>>>> >>>>> On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote: >>>>>> On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote: >>>>>>> On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: >>>>>>>> On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: >>>>>>>>>> + ret = set_proto_ctx_param(file_priv, pc, args); >>>>>>>>> >>>>>>>>> I think we should have a FIXME here of not allowing this on some future >>>>>>>>> platforms because just use CTX_CREATE_EXT. >>>>>>>> >>>>>>>> Done. >>>>>>>> >>>>>>>>>> + if (ret == -ENOTSUPP) { >>>>>>>>>> + /* Some params, specifically SSEU, can only be set on fully >>>>>>>>> >>>>>>>>> I think this needs a FIXME: that this only holds during the conversion? >>>>>>>>> Otherwise we kinda have a bit a problem me thinks ... >>>>>>>> >>>>>>>> I'm not sure what you mean by that. >>>>>>> >>>>>>> Well I'm at least assuming that we wont have this case anymore, i.e. >>>>>>> there's only two kinds of parameters: >>>>>>> - those which are valid only on proto context >>>>>>> - those which are valid on both (like priority) >>>>>>> >>>>>>> This SSEU thing looks like a 3rd parameter, which is only valid on >>>>>>> finalized context. That feels all kinds of wrong. Will it stay? If yes >>>>>>> *ugh* and why? >>>>>> >>>>>> Because I was being lazy. The SSEU stuff is a fairly complex param to >>>>>> parse and it's always set live. I can factor out the SSEU parsing >>>>>> code if you want and it shouldn't be too bad in the end. >>>>> >>>>> Yeah I think the special case here is a bit too jarring. >>>> >>>> I rolled a v5 that allows you to set SSEU as a create param. I'm not >>>> a huge fan of that much code duplication for the SSEU set but I guess >>>> that's what we get for deciding to "unify" our context creation >>>> parameter path with our on-the-fly parameter path.... >>>> >>>> You can look at it here: >>>> >>>> https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588 >>> >>> Hm yeah the duplication of the render engine check is a bit annoying. >>> What's worse, if you tthrow another set_engines on top it's probably >>> all wrong then. The old thing solved that by just throwing that >>> intel_context away. >>> >>> You're also not keeping the engine id in the proto ctx for this, so >>> there's probably some gaps there. We'd need to clear the SSEU if >>> userspace puts another context there. But also no userspace does that. >>> >>> Plus cursory review of userspace show >>> - mesa doesn't set this >>> - compute sets its right before running the batch >>> - media sets it as the last thing of context creation >> >> Noticed a long sub-thread so looked inside.. >> >> SSEU is a really an interesting one. >> >> For current userspace limiting to context creation is fine, since it is >> only allowed for Icelake/VME use case. But if you notice the comment inside: >> >> /* ABI restriction - VME use case only. */ >> >> It is a hint there was, or could be, more to this uapi than that. >> >> And from memory I think limiting to creation time will nip the hopes >> media had to use this dynamically on other platforms in the bud. So not >> that good really. They had convincing numbers what gets significantly >> better if we allowed dynamic control to this, just that as always, open >> source userspace was not there so we never allowed it. However if you >> come up with a new world order where it can only be done at context >> creation, as said already, the possibility for that improvement (aka >> further improving the competitive advantage) is most likely dashed. > > Hm are you sure that this is create-time only? media-driver uses it > like that, but from my checking compute-runtime updates SSEU mode > before every execbuf call. So it very much looked like we have to keep > this dynamic. Ah okay, I assumed it's more of the overall drive to eliminate set_param. If sseu set_param stays then it's fine for what I had in mind. > Or do you mean this is defacto dead code? this = compute setting it > before every batch I mean here. No idea, wasn't aware of the compute usage. Before every execbuf is not very ideal though since we have to inject a foreign context operation to update context image, which means stream of work belonging to the context cannot be coalesced (assuming it could to start with). There is also a hw cost to reconfigure the sseu which adds latency on top. Anyway, I was only aware of the current media usage, which is static as you say, and future/wishlist media usage, which would be dynamic, but a complicated story to get right (partly due downsides mentioned in the previous paragraph mean balancing benefit vs cost of dynamic sseu is not easy). Regards, Tvrtko
On Fri, Apr 30, 2021 at 2:44 PM Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote: > > > > On 30/04/2021 13:30, Daniel Vetter wrote: > > On Fri, Apr 30, 2021 at 1:58 PM Tvrtko Ursulin > > <tvrtko.ursulin@linux.intel.com> wrote: > >> On 30/04/2021 07:53, Daniel Vetter wrote: > >>> On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote: > >>>> > >>>> On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote: > >>>>> > >>>>> On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote: > >>>>>> On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote: > >>>>>>> On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: > >>>>>>>> On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: > >>>>>>>>>> + ret = set_proto_ctx_param(file_priv, pc, args); > >>>>>>>>> > >>>>>>>>> I think we should have a FIXME here of not allowing this on some future > >>>>>>>>> platforms because just use CTX_CREATE_EXT. > >>>>>>>> > >>>>>>>> Done. > >>>>>>>> > >>>>>>>>>> + if (ret == -ENOTSUPP) { > >>>>>>>>>> + /* Some params, specifically SSEU, can only be set on fully > >>>>>>>>> > >>>>>>>>> I think this needs a FIXME: that this only holds during the conversion? > >>>>>>>>> Otherwise we kinda have a bit a problem me thinks ... > >>>>>>>> > >>>>>>>> I'm not sure what you mean by that. > >>>>>>> > >>>>>>> Well I'm at least assuming that we wont have this case anymore, i.e. > >>>>>>> there's only two kinds of parameters: > >>>>>>> - those which are valid only on proto context > >>>>>>> - those which are valid on both (like priority) > >>>>>>> > >>>>>>> This SSEU thing looks like a 3rd parameter, which is only valid on > >>>>>>> finalized context. That feels all kinds of wrong. Will it stay? If yes > >>>>>>> *ugh* and why? > >>>>>> > >>>>>> Because I was being lazy. The SSEU stuff is a fairly complex param to > >>>>>> parse and it's always set live. I can factor out the SSEU parsing > >>>>>> code if you want and it shouldn't be too bad in the end. > >>>>> > >>>>> Yeah I think the special case here is a bit too jarring. > >>>> > >>>> I rolled a v5 that allows you to set SSEU as a create param. I'm not > >>>> a huge fan of that much code duplication for the SSEU set but I guess > >>>> that's what we get for deciding to "unify" our context creation > >>>> parameter path with our on-the-fly parameter path.... > >>>> > >>>> You can look at it here: > >>>> > >>>> https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588 > >>> > >>> Hm yeah the duplication of the render engine check is a bit annoying. > >>> What's worse, if you tthrow another set_engines on top it's probably > >>> all wrong then. The old thing solved that by just throwing that > >>> intel_context away. > >>> > >>> You're also not keeping the engine id in the proto ctx for this, so > >>> there's probably some gaps there. We'd need to clear the SSEU if > >>> userspace puts another context there. But also no userspace does that. > >>> > >>> Plus cursory review of userspace show > >>> - mesa doesn't set this > >>> - compute sets its right before running the batch > >>> - media sets it as the last thing of context creation > >> > >> Noticed a long sub-thread so looked inside.. > >> > >> SSEU is a really an interesting one. > >> > >> For current userspace limiting to context creation is fine, since it is > >> only allowed for Icelake/VME use case. But if you notice the comment inside: > >> > >> /* ABI restriction - VME use case only. */ > >> > >> It is a hint there was, or could be, more to this uapi than that. > >> > >> And from memory I think limiting to creation time will nip the hopes > >> media had to use this dynamically on other platforms in the bud. So not > >> that good really. They had convincing numbers what gets significantly > >> better if we allowed dynamic control to this, just that as always, open > >> source userspace was not there so we never allowed it. However if you > >> come up with a new world order where it can only be done at context > >> creation, as said already, the possibility for that improvement (aka > >> further improving the competitive advantage) is most likely dashed. > > > > Hm are you sure that this is create-time only? media-driver uses it > > like that, but from my checking compute-runtime updates SSEU mode > > before every execbuf call. So it very much looked like we have to keep > > this dynamic. > > Ah okay, I assumed it's more of the overall drive to eliminate > set_param. If sseu set_param stays then it's fine for what I had in mind. > > > Or do you mean this is defacto dead code? this = compute setting it > > before every batch I mean here. > > No idea, wasn't aware of the compute usage. > > Before every execbuf is not very ideal though since we have to inject a > foreign context operation to update context image, which means stream of > work belonging to the context cannot be coalesced (assuming it could to > start with). There is also a hw cost to reconfigure the sseu which adds > latency on top. They filter out no-op changes. I just meant that from look at compute-runtime, it seems like sseu can change whenever. -Daniel > Anyway, I was only aware of the current media usage, which is static as > you say, and future/wishlist media usage, which would be dynamic, but a > complicated story to get right (partly due downsides mentioned in the > previous paragraph mean balancing benefit vs cost of dynamic sseu is not > easy). > > Regards, > > Tvrtko
On 30/04/2021 14:07, Daniel Vetter wrote: > On Fri, Apr 30, 2021 at 2:44 PM Tvrtko Ursulin > <tvrtko.ursulin@linux.intel.com> wrote: >> On 30/04/2021 13:30, Daniel Vetter wrote: >>> On Fri, Apr 30, 2021 at 1:58 PM Tvrtko Ursulin >>> <tvrtko.ursulin@linux.intel.com> wrote: >>>> On 30/04/2021 07:53, Daniel Vetter wrote: >>>>> On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote: >>>>>> >>>>>> On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote: >>>>>>> >>>>>>> On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote: >>>>>>>> On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote: >>>>>>>>> On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: >>>>>>>>>> On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: >>>>>>>>>>>> + ret = set_proto_ctx_param(file_priv, pc, args); >>>>>>>>>>> >>>>>>>>>>> I think we should have a FIXME here of not allowing this on some future >>>>>>>>>>> platforms because just use CTX_CREATE_EXT. >>>>>>>>>> >>>>>>>>>> Done. >>>>>>>>>> >>>>>>>>>>>> + if (ret == -ENOTSUPP) { >>>>>>>>>>>> + /* Some params, specifically SSEU, can only be set on fully >>>>>>>>>>> >>>>>>>>>>> I think this needs a FIXME: that this only holds during the conversion? >>>>>>>>>>> Otherwise we kinda have a bit a problem me thinks ... >>>>>>>>>> >>>>>>>>>> I'm not sure what you mean by that. >>>>>>>>> >>>>>>>>> Well I'm at least assuming that we wont have this case anymore, i.e. >>>>>>>>> there's only two kinds of parameters: >>>>>>>>> - those which are valid only on proto context >>>>>>>>> - those which are valid on both (like priority) >>>>>>>>> >>>>>>>>> This SSEU thing looks like a 3rd parameter, which is only valid on >>>>>>>>> finalized context. That feels all kinds of wrong. Will it stay? If yes >>>>>>>>> *ugh* and why? >>>>>>>> >>>>>>>> Because I was being lazy. The SSEU stuff is a fairly complex param to >>>>>>>> parse and it's always set live. I can factor out the SSEU parsing >>>>>>>> code if you want and it shouldn't be too bad in the end. >>>>>>> >>>>>>> Yeah I think the special case here is a bit too jarring. >>>>>> >>>>>> I rolled a v5 that allows you to set SSEU as a create param. I'm not >>>>>> a huge fan of that much code duplication for the SSEU set but I guess >>>>>> that's what we get for deciding to "unify" our context creation >>>>>> parameter path with our on-the-fly parameter path.... >>>>>> >>>>>> You can look at it here: >>>>>> >>>>>> https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588 >>>>> >>>>> Hm yeah the duplication of the render engine check is a bit annoying. >>>>> What's worse, if you tthrow another set_engines on top it's probably >>>>> all wrong then. The old thing solved that by just throwing that >>>>> intel_context away. >>>>> >>>>> You're also not keeping the engine id in the proto ctx for this, so >>>>> there's probably some gaps there. We'd need to clear the SSEU if >>>>> userspace puts another context there. But also no userspace does that. >>>>> >>>>> Plus cursory review of userspace show >>>>> - mesa doesn't set this >>>>> - compute sets its right before running the batch >>>>> - media sets it as the last thing of context creation >>>> >>>> Noticed a long sub-thread so looked inside.. >>>> >>>> SSEU is a really an interesting one. >>>> >>>> For current userspace limiting to context creation is fine, since it is >>>> only allowed for Icelake/VME use case. But if you notice the comment inside: >>>> >>>> /* ABI restriction - VME use case only. */ >>>> >>>> It is a hint there was, or could be, more to this uapi than that. >>>> >>>> And from memory I think limiting to creation time will nip the hopes >>>> media had to use this dynamically on other platforms in the bud. So not >>>> that good really. They had convincing numbers what gets significantly >>>> better if we allowed dynamic control to this, just that as always, open >>>> source userspace was not there so we never allowed it. However if you >>>> come up with a new world order where it can only be done at context >>>> creation, as said already, the possibility for that improvement (aka >>>> further improving the competitive advantage) is most likely dashed. >>> >>> Hm are you sure that this is create-time only? media-driver uses it >>> like that, but from my checking compute-runtime updates SSEU mode >>> before every execbuf call. So it very much looked like we have to keep >>> this dynamic. >> >> Ah okay, I assumed it's more of the overall drive to eliminate >> set_param. If sseu set_param stays then it's fine for what I had in mind. >> >>> Or do you mean this is defacto dead code? this = compute setting it >>> before every batch I mean here. >> >> No idea, wasn't aware of the compute usage. >> >> Before every execbuf is not very ideal though since we have to inject a >> foreign context operation to update context image, which means stream of >> work belonging to the context cannot be coalesced (assuming it could to >> start with). There is also a hw cost to reconfigure the sseu which adds >> latency on top. > > They filter out no-op changes. I just meant that from look at > compute-runtime, it seems like sseu can change whenever. i915 does it as well for good measure - since the penalty is global we have to. So I guess they don't know when VME block will be used ie it is per batch. Regards, Tvrtko
On Fri, Apr 30, 2021 at 1:53 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote: > > > > On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote: > > > > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: > > > > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > + ret = set_proto_ctx_param(file_priv, pc, args); > > > > > > > > > > > > > > I think we should have a FIXME here of not allowing this on some future > > > > > > > platforms because just use CTX_CREATE_EXT. > > > > > > > > > > > > Done. > > > > > > > > > > > > > > + if (ret == -ENOTSUPP) { > > > > > > > > + /* Some params, specifically SSEU, can only be set on fully > > > > > > > > > > > > > > I think this needs a FIXME: that this only holds during the conversion? > > > > > > > Otherwise we kinda have a bit a problem me thinks ... > > > > > > > > > > > > I'm not sure what you mean by that. > > > > > > > > > > Well I'm at least assuming that we wont have this case anymore, i.e. > > > > > there's only two kinds of parameters: > > > > > - those which are valid only on proto context > > > > > - those which are valid on both (like priority) > > > > > > > > > > This SSEU thing looks like a 3rd parameter, which is only valid on > > > > > finalized context. That feels all kinds of wrong. Will it stay? If yes > > > > > *ugh* and why? > > > > > > > > Because I was being lazy. The SSEU stuff is a fairly complex param to > > > > parse and it's always set live. I can factor out the SSEU parsing > > > > code if you want and it shouldn't be too bad in the end. > > > > > > Yeah I think the special case here is a bit too jarring. > > > > I rolled a v5 that allows you to set SSEU as a create param. I'm not > > a huge fan of that much code duplication for the SSEU set but I guess > > that's what we get for deciding to "unify" our context creation > > parameter path with our on-the-fly parameter path.... > > > > You can look at it here: > > > > https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588 > > Hm yeah the duplication of the render engine check is a bit annoying. > What's worse, if you tthrow another set_engines on top it's probably > all wrong then. The old thing solved that by just throwing that > intel_context away. I think that's already mostly taken care of. When set_engines happens, we throw away the old array of engines and start with a new one where everything has been memset to 0. The one remaining problem is that, if userspace resets the engine set, we need to memset legacy_rcs_sseu to 0. I've added that. > You're also not keeping the engine id in the proto ctx for this, so > there's probably some gaps there. We'd need to clear the SSEU if > userspace puts another context there. But also no userspace does that. Again, I think that's handled. See above. > Plus cursory review of userspace show > - mesa doesn't set this > - compute sets its right before running the batch > - media sets it as the last thing of context creation > > So it's kinda not needed. But also we're asking umd to switch over to > CTX_CREATE_EXT, and if sseu doesn't work for that media team will be > puzzled. And we've confused them enough already with our uapis. > > Another idea: proto_set_sseu just stores the uapi struct and a note > that it's set, and checks nothing. To validate sseu on proto context > we do (but only when an sseu parameter is set): > 1. finalize the context > 2. call the real set_sseu for validation > 3. throw the finalized context away again, it was just for validating > the overall thing > > That way we don't have to consider all the interactions of setting > sseu and engines in any order on proto context, validation code is > guaranteed shared. Only downside is that there's a slight chance in > behaviour: SSEU, then setting another engine in that slot will fail > instead of throwing the sseu parameters away. That's the right thing > for CTX_CREATE_EXT anyway, and current userspace doesn't care. > > Thoughts? I thought about that. The problem is that they can set_sseu multiple times on different engines. This means we'd have to effectively build up an arbitrary list of SSEU set operations and replay it. I'm not sure how I feel about building up a big data structure. > > I'm also going to send it to trybot. > > If you resend pls include all my r-b, I think some got lost in v4. I'll try and dig those up. > Also, in the kernel at least we expect minimal commit message with a > bit of context, there's no Part-of: link pointing at the entire MR > with overview and discussion, the patchwork Link: we add is a pretty > bad substitute. Some of the new patches in v4 are a bit too terse on > that. Yup. I can try to expand things a bit more. > And finally I'm still not a big fan of the add/remove split over > patches, but oh well. I'm not either but working through all this reminded me of why I didn't do it more gradual. The problem is ordering. If add and remove at the same time and do it one param at a time, we'll end up with a situation in the middle where some params will only be allowed to be set on the proto-ctx and others will force a proto-ctx -> context conversion. If, for instance, one UMD sets engines first and then VMs and another sets VMs first and then engines, there's no way to do a gradual transition without breaking one of them. Also, we need to handle basically all the setparam complexity in order to handle creation structs and, again, those can come in any order. I hate it, I just don't see another way. :-( --Jason
On Fri, Apr 30, 2021 at 6:27 PM Jason Ekstrand <jason@jlekstrand.net> wrote: > > On Fri, Apr 30, 2021 at 1:53 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote: > > > > > > On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote: > > > > > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: > > > > > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > + ret = set_proto_ctx_param(file_priv, pc, args); > > > > > > > > > > > > > > > > I think we should have a FIXME here of not allowing this on some future > > > > > > > > platforms because just use CTX_CREATE_EXT. > > > > > > > > > > > > > > Done. > > > > > > > > > > > > > > > > + if (ret == -ENOTSUPP) { > > > > > > > > > + /* Some params, specifically SSEU, can only be set on fully > > > > > > > > > > > > > > > > I think this needs a FIXME: that this only holds during the conversion? > > > > > > > > Otherwise we kinda have a bit a problem me thinks ... > > > > > > > > > > > > > > I'm not sure what you mean by that. > > > > > > > > > > > > Well I'm at least assuming that we wont have this case anymore, i.e. > > > > > > there's only two kinds of parameters: > > > > > > - those which are valid only on proto context > > > > > > - those which are valid on both (like priority) > > > > > > > > > > > > This SSEU thing looks like a 3rd parameter, which is only valid on > > > > > > finalized context. That feels all kinds of wrong. Will it stay? If yes > > > > > > *ugh* and why? > > > > > > > > > > Because I was being lazy. The SSEU stuff is a fairly complex param to > > > > > parse and it's always set live. I can factor out the SSEU parsing > > > > > code if you want and it shouldn't be too bad in the end. > > > > > > > > Yeah I think the special case here is a bit too jarring. > > > > > > I rolled a v5 that allows you to set SSEU as a create param. I'm not > > > a huge fan of that much code duplication for the SSEU set but I guess > > > that's what we get for deciding to "unify" our context creation > > > parameter path with our on-the-fly parameter path.... > > > > > > You can look at it here: > > > > > > https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588 > > > > Hm yeah the duplication of the render engine check is a bit annoying. > > What's worse, if you tthrow another set_engines on top it's probably > > all wrong then. The old thing solved that by just throwing that > > intel_context away. > > I think that's already mostly taken care of. When set_engines > happens, we throw away the old array of engines and start with a new > one where everything has been memset to 0. The one remaining problem > is that, if userspace resets the engine set, we need to memset > legacy_rcs_sseu to 0. I've added that. > > > You're also not keeping the engine id in the proto ctx for this, so > > there's probably some gaps there. We'd need to clear the SSEU if > > userspace puts another context there. But also no userspace does that. > > Again, I think that's handled. See above. > > > Plus cursory review of userspace show > > - mesa doesn't set this > > - compute sets its right before running the batch > > - media sets it as the last thing of context creation > > > > So it's kinda not needed. But also we're asking umd to switch over to > > CTX_CREATE_EXT, and if sseu doesn't work for that media team will be > > puzzled. And we've confused them enough already with our uapis. > > > > Another idea: proto_set_sseu just stores the uapi struct and a note > > that it's set, and checks nothing. To validate sseu on proto context > > we do (but only when an sseu parameter is set): > > 1. finalize the context > > 2. call the real set_sseu for validation > > 3. throw the finalized context away again, it was just for validating > > the overall thing > > > > That way we don't have to consider all the interactions of setting > > sseu and engines in any order on proto context, validation code is > > guaranteed shared. Only downside is that there's a slight chance in > > behaviour: SSEU, then setting another engine in that slot will fail > > instead of throwing the sseu parameters away. That's the right thing > > for CTX_CREATE_EXT anyway, and current userspace doesn't care. > > > > Thoughts? > > I thought about that. The problem is that they can set_sseu multiple > times on different engines. This means we'd have to effectively build > up an arbitrary list of SSEU set operations and replay it. I'm not > sure how I feel about building up a big data structure. Hm, but how does this work with proto ctx then? I've only seen a single sseu param set in the patch you linked. > > > I'm also going to send it to trybot. > > > > If you resend pls include all my r-b, I think some got lost in v4. > > I'll try and dig those up. > > > Also, in the kernel at least we expect minimal commit message with a > > bit of context, there's no Part-of: link pointing at the entire MR > > with overview and discussion, the patchwork Link: we add is a pretty > > bad substitute. Some of the new patches in v4 are a bit too terse on > > that. > > Yup. I can try to expand things a bit more. > > > And finally I'm still not a big fan of the add/remove split over > > patches, but oh well. > > I'm not either but working through all this reminded me of why I > didn't do it more gradual. The problem is ordering. If add and > remove at the same time and do it one param at a time, we'll end up > with a situation in the middle where some params will only be allowed > to be set on the proto-ctx and others will force a proto-ctx -> > context conversion. If, for instance, one UMD sets engines first and > then VMs and another sets VMs first and then engines, there's no way > to do a gradual transition without breaking one of them. Also, we > need to handle basically all the setparam complexity in order to > handle creation structs and, again, those can come in any order. Yeah I know, but I considered that. I think compute-runtime uses CTX_CREATE_EXT, it's only media. So we need to order the patches in exactly the order media calls setparam. And then we're good. Worst case it's exactly as useful in bisecting as your approach here (you add dead code first, then use it, so might as well just squash it all down to one), but if we get the ordering right it's substantially better. But maybe "clever ordering of the conversion" is too clever. End result is the same anyway. -Daniel > I hate it, I just don't see another way. :-(
On Fri, Apr 30, 2021 at 11:33 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > On Fri, Apr 30, 2021 at 6:27 PM Jason Ekstrand <jason@jlekstrand.net> wrote: > > > > On Fri, Apr 30, 2021 at 1:53 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote: > > > > > > > > On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote: > > > > > > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: > > > > > > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > + ret = set_proto_ctx_param(file_priv, pc, args); > > > > > > > > > > > > > > > > > > I think we should have a FIXME here of not allowing this on some future > > > > > > > > > platforms because just use CTX_CREATE_EXT. > > > > > > > > > > > > > > > > Done. > > > > > > > > > > > > > > > > > > + if (ret == -ENOTSUPP) { > > > > > > > > > > + /* Some params, specifically SSEU, can only be set on fully > > > > > > > > > > > > > > > > > > I think this needs a FIXME: that this only holds during the conversion? > > > > > > > > > Otherwise we kinda have a bit a problem me thinks ... > > > > > > > > > > > > > > > > I'm not sure what you mean by that. > > > > > > > > > > > > > > Well I'm at least assuming that we wont have this case anymore, i.e. > > > > > > > there's only two kinds of parameters: > > > > > > > - those which are valid only on proto context > > > > > > > - those which are valid on both (like priority) > > > > > > > > > > > > > > This SSEU thing looks like a 3rd parameter, which is only valid on > > > > > > > finalized context. That feels all kinds of wrong. Will it stay? If yes > > > > > > > *ugh* and why? > > > > > > > > > > > > Because I was being lazy. The SSEU stuff is a fairly complex param to > > > > > > parse and it's always set live. I can factor out the SSEU parsing > > > > > > code if you want and it shouldn't be too bad in the end. > > > > > > > > > > Yeah I think the special case here is a bit too jarring. > > > > > > > > I rolled a v5 that allows you to set SSEU as a create param. I'm not > > > > a huge fan of that much code duplication for the SSEU set but I guess > > > > that's what we get for deciding to "unify" our context creation > > > > parameter path with our on-the-fly parameter path.... > > > > > > > > You can look at it here: > > > > > > > > https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588 > > > > > > Hm yeah the duplication of the render engine check is a bit annoying. > > > What's worse, if you tthrow another set_engines on top it's probably > > > all wrong then. The old thing solved that by just throwing that > > > intel_context away. > > > > I think that's already mostly taken care of. When set_engines > > happens, we throw away the old array of engines and start with a new > > one where everything has been memset to 0. The one remaining problem > > is that, if userspace resets the engine set, we need to memset > > legacy_rcs_sseu to 0. I've added that. > > > > > You're also not keeping the engine id in the proto ctx for this, so > > > there's probably some gaps there. We'd need to clear the SSEU if > > > userspace puts another context there. But also no userspace does that. > > > > Again, I think that's handled. See above. > > > > > Plus cursory review of userspace show > > > - mesa doesn't set this > > > - compute sets its right before running the batch > > > - media sets it as the last thing of context creation > > > > > > So it's kinda not needed. But also we're asking umd to switch over to > > > CTX_CREATE_EXT, and if sseu doesn't work for that media team will be > > > puzzled. And we've confused them enough already with our uapis. > > > > > > Another idea: proto_set_sseu just stores the uapi struct and a note > > > that it's set, and checks nothing. To validate sseu on proto context > > > we do (but only when an sseu parameter is set): > > > 1. finalize the context > > > 2. call the real set_sseu for validation > > > 3. throw the finalized context away again, it was just for validating > > > the overall thing > > > > > > That way we don't have to consider all the interactions of setting > > > sseu and engines in any order on proto context, validation code is > > > guaranteed shared. Only downside is that there's a slight chance in > > > behaviour: SSEU, then setting another engine in that slot will fail > > > instead of throwing the sseu parameters away. That's the right thing > > > for CTX_CREATE_EXT anyway, and current userspace doesn't care. > > > > > > Thoughts? > > > > I thought about that. The problem is that they can set_sseu multiple > > times on different engines. This means we'd have to effectively build > > up an arbitrary list of SSEU set operations and replay it. I'm not > > sure how I feel about building up a big data structure. > > Hm, but how does this work with proto ctx then? I've only seen a > single sseu param set in the patch you linked. It works roughly the same as it works now: - If set_sseu is called, it always overwrites whatever was there before. If it's called for a legacy (no user-specified engines) context, it overwrites legacy_rcs_sseu. If it's called on a user engine context, it overwrites the sseu on the given engine. - When set_engines is called, it throws away all the user engine data (if any) and memsets legacy_rcu_sseu to 0. The end result is that everything gets reset. > > > > I'm also going to send it to trybot. > > > > > > If you resend pls include all my r-b, I think some got lost in v4. > > > > I'll try and dig those up. > > > > > Also, in the kernel at least we expect minimal commit message with a > > > bit of context, there's no Part-of: link pointing at the entire MR > > > with overview and discussion, the patchwork Link: we add is a pretty > > > bad substitute. Some of the new patches in v4 are a bit too terse on > > > that. > > > > Yup. I can try to expand things a bit more. > > > > > And finally I'm still not a big fan of the add/remove split over > > > patches, but oh well. > > > > I'm not either but working through all this reminded me of why I > > didn't do it more gradual. The problem is ordering. If add and > > remove at the same time and do it one param at a time, we'll end up > > with a situation in the middle where some params will only be allowed > > to be set on the proto-ctx and others will force a proto-ctx -> > > context conversion. If, for instance, one UMD sets engines first and > > then VMs and another sets VMs first and then engines, there's no way > > to do a gradual transition without breaking one of them. Also, we > > need to handle basically all the setparam complexity in order to > > handle creation structs and, again, those can come in any order. > > Yeah I know, but I considered that. I think compute-runtime uses > CTX_CREATE_EXT, it's only media. That doesn't really matter because both go through the same path. Anything that uses CONTEXT_CREATE_EXT is identical to something which creates the context and then calls SET_CONTEXT_PARAM in the same order as the structs in the extension chain. Incidentally, this also means that if we do it gradually, we have to handle finalizing the proto-ctx mid-way through handling the chain of create extensions. That should be possible to handle if a bit tricky. It'll also mean we'll have a (small) range of kernels where the CONTEXT_CREATE_EXT method is broken if you get it in the wrong order. > So we need to order the patches in > exactly the order media calls setparam. And then we're good. Mesa only ever sets engines. Upstream compute only ever sets the VM. Media always sets the VM first. So, if we handle VM first, we should be good-to-go, I think. > Worst case it's exactly as useful in bisecting as your approach here > (you add dead code first, then use it, It's not dead. At the time it's added, it's used for all CONTEXT_CREATE_EXT. Then, later, it becomes used for everything. > so might as well just squash it > all down to one), but if we get the ordering right it's substantially > better. I can try to spin a v5 and see how bad it ends up being. I don't really like breaking CONTEXT_CREATE_EXT in the middle, though. --Jason
On Fri, Apr 30, 2021 at 6:57 PM Jason Ekstrand <jason@jlekstrand.net> wrote: > > On Fri, Apr 30, 2021 at 11:33 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > On Fri, Apr 30, 2021 at 6:27 PM Jason Ekstrand <jason@jlekstrand.net> wrote: > > > > > > On Fri, Apr 30, 2021 at 1:53 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote: > > > > > > > > > > On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > > On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote: > > > > > > > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote: > > > > > > > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote: > > > > > > > > > > > + ret = set_proto_ctx_param(file_priv, pc, args); > > > > > > > > > > > > > > > > > > > > I think we should have a FIXME here of not allowing this on some future > > > > > > > > > > platforms because just use CTX_CREATE_EXT. > > > > > > > > > > > > > > > > > > Done. > > > > > > > > > > > > > > > > > > > > + if (ret == -ENOTSUPP) { > > > > > > > > > > > + /* Some params, specifically SSEU, can only be set on fully > > > > > > > > > > > > > > > > > > > > I think this needs a FIXME: that this only holds during the conversion? > > > > > > > > > > Otherwise we kinda have a bit a problem me thinks ... > > > > > > > > > > > > > > > > > > I'm not sure what you mean by that. > > > > > > > > > > > > > > > > Well I'm at least assuming that we wont have this case anymore, i.e. > > > > > > > > there's only two kinds of parameters: > > > > > > > > - those which are valid only on proto context > > > > > > > > - those which are valid on both (like priority) > > > > > > > > > > > > > > > > This SSEU thing looks like a 3rd parameter, which is only valid on > > > > > > > > finalized context. That feels all kinds of wrong. Will it stay? If yes > > > > > > > > *ugh* and why? > > > > > > > > > > > > > > Because I was being lazy. The SSEU stuff is a fairly complex param to > > > > > > > parse and it's always set live. I can factor out the SSEU parsing > > > > > > > code if you want and it shouldn't be too bad in the end. > > > > > > > > > > > > Yeah I think the special case here is a bit too jarring. > > > > > > > > > > I rolled a v5 that allows you to set SSEU as a create param. I'm not > > > > > a huge fan of that much code duplication for the SSEU set but I guess > > > > > that's what we get for deciding to "unify" our context creation > > > > > parameter path with our on-the-fly parameter path.... > > > > > > > > > > You can look at it here: > > > > > > > > > > https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588 > > > > > > > > Hm yeah the duplication of the render engine check is a bit annoying. > > > > What's worse, if you tthrow another set_engines on top it's probably > > > > all wrong then. The old thing solved that by just throwing that > > > > intel_context away. > > > > > > I think that's already mostly taken care of. When set_engines > > > happens, we throw away the old array of engines and start with a new > > > one where everything has been memset to 0. The one remaining problem > > > is that, if userspace resets the engine set, we need to memset > > > legacy_rcs_sseu to 0. I've added that. > > > > > > > You're also not keeping the engine id in the proto ctx for this, so > > > > there's probably some gaps there. We'd need to clear the SSEU if > > > > userspace puts another context there. But also no userspace does that. > > > > > > Again, I think that's handled. See above. > > > > > > > Plus cursory review of userspace show > > > > - mesa doesn't set this > > > > - compute sets its right before running the batch > > > > - media sets it as the last thing of context creation > > > > > > > > So it's kinda not needed. But also we're asking umd to switch over to > > > > CTX_CREATE_EXT, and if sseu doesn't work for that media team will be > > > > puzzled. And we've confused them enough already with our uapis. > > > > > > > > Another idea: proto_set_sseu just stores the uapi struct and a note > > > > that it's set, and checks nothing. To validate sseu on proto context > > > > we do (but only when an sseu parameter is set): > > > > 1. finalize the context > > > > 2. call the real set_sseu for validation > > > > 3. throw the finalized context away again, it was just for validating > > > > the overall thing > > > > > > > > That way we don't have to consider all the interactions of setting > > > > sseu and engines in any order on proto context, validation code is > > > > guaranteed shared. Only downside is that there's a slight chance in > > > > behaviour: SSEU, then setting another engine in that slot will fail > > > > instead of throwing the sseu parameters away. That's the right thing > > > > for CTX_CREATE_EXT anyway, and current userspace doesn't care. > > > > > > > > Thoughts? > > > > > > I thought about that. The problem is that they can set_sseu multiple > > > times on different engines. This means we'd have to effectively build > > > up an arbitrary list of SSEU set operations and replay it. I'm not > > > sure how I feel about building up a big data structure. > > > > Hm, but how does this work with proto ctx then? I've only seen a > > single sseu param set in the patch you linked. > > It works roughly the same as it works now: > > - If set_sseu is called, it always overwrites whatever was there > before. If it's called for a legacy (no user-specified engines) > context, it overwrites legacy_rcs_sseu. If it's called on a user > engine context, it overwrites the sseu on the given engine. > - When set_engines is called, it throws away all the user engine data > (if any) and memsets legacy_rcu_sseu to 0. The end result is that > everything gets reset. I think I need to review this carefully in the new version. Definitely too much w/e here already for tricky stuff :-) > > > > > I'm also going to send it to trybot. > > > > > > > > If you resend pls include all my r-b, I think some got lost in v4. > > > > > > I'll try and dig those up. > > > > > > > Also, in the kernel at least we expect minimal commit message with a > > > > bit of context, there's no Part-of: link pointing at the entire MR > > > > with overview and discussion, the patchwork Link: we add is a pretty > > > > bad substitute. Some of the new patches in v4 are a bit too terse on > > > > that. > > > > > > Yup. I can try to expand things a bit more. > > > > > > > And finally I'm still not a big fan of the add/remove split over > > > > patches, but oh well. > > > > > > I'm not either but working through all this reminded me of why I > > > didn't do it more gradual. The problem is ordering. If add and > > > remove at the same time and do it one param at a time, we'll end up > > > with a situation in the middle where some params will only be allowed > > > to be set on the proto-ctx and others will force a proto-ctx -> > > > context conversion. If, for instance, one UMD sets engines first and > > > then VMs and another sets VMs first and then engines, there's no way > > > to do a gradual transition without breaking one of them. Also, we > > > need to handle basically all the setparam complexity in order to > > > handle creation structs and, again, those can come in any order. > > > > Yeah I know, but I considered that. I think compute-runtime uses > > CTX_CREATE_EXT, it's only media. > > That doesn't really matter because both go through the same path. > Anything that uses CONTEXT_CREATE_EXT is identical to something which > creates the context and then calls SET_CONTEXT_PARAM in the same order > as the structs in the extension chain. > > Incidentally, this also means that if we do it gradually, we have to > handle finalizing the proto-ctx mid-way through handling the chain of > create extensions. That should be possible to handle if a bit tricky. > It'll also mean we'll have a (small) range of kernels where the > CONTEXT_CREATE_EXT method is broken if you get it in the wrong order. > > > So we need to order the patches in > > exactly the order media calls setparam. And then we're good. > > Mesa only ever sets engines. Upstream compute only ever sets the VM. > Media always sets the VM first. So, if we handle VM first, we should > be good-to-go, I think. > > > Worst case it's exactly as useful in bisecting as your approach here > > (you add dead code first, then use it, > > It's not dead. At the time it's added, it's used for all > CONTEXT_CREATE_EXT. Then, later, it becomes used for everything. > > > so might as well just squash it > > all down to one), but if we get the ordering right it's substantially > > better. > > I can try to spin a v5 and see how bad it ends up being. I don't > really like breaking CONTEXT_CREATE_EXT in the middle, though. Hm right, I forgot that we also de-proto in the middle of CONTEXT_CREATE_EXT while the conversion is going on. This really is annoying. -Daniel
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index db9153e0f85a7..aa8e61211924f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -193,8 +193,15 @@ static int validate_priority(struct drm_i915_private *i915, static void proto_context_close(struct i915_gem_proto_context *pc) { + int i; + if (pc->vm) i915_vm_put(pc->vm); + if (pc->user_engines) { + for (i = 0; i < pc->num_user_engines; i++) + kfree(pc->user_engines[i].siblings); + kfree(pc->user_engines); + } kfree(pc); } @@ -274,12 +281,417 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags) proto_context_set_persistence(i915, pc, true); pc->sched.priority = I915_PRIORITY_NORMAL; + pc->num_user_engines = -1; + pc->user_engines = NULL; + if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) pc->single_timeline = true; return pc; } +static int proto_context_register_locked(struct drm_i915_file_private *fpriv, + struct i915_gem_proto_context *pc, + u32 *id) +{ + int ret; + void *old; + + ret = xa_alloc(&fpriv->context_xa, id, NULL, xa_limit_32b, GFP_KERNEL); + if (ret) + return ret; + + old = xa_store(&fpriv->proto_context_xa, *id, pc, GFP_KERNEL); + if (xa_is_err(old)) { + xa_erase(&fpriv->context_xa, *id); + return xa_err(old); + } + GEM_BUG_ON(old); + + return 0; +} + +static int proto_context_register(struct drm_i915_file_private *fpriv, + struct i915_gem_proto_context *pc, + u32 *id) +{ + int ret; + + mutex_lock(&fpriv->proto_context_lock); + ret = proto_context_register_locked(fpriv, pc, id); + mutex_unlock(&fpriv->proto_context_lock); + + return ret; +} + +static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv, + struct i915_gem_proto_context *pc, + const struct drm_i915_gem_context_param *args) +{ + struct i915_address_space *vm; + + if (args->size) + return -EINVAL; + + if (!pc->vm) + return -ENODEV; + + if (upper_32_bits(args->value)) + return -ENOENT; + + rcu_read_lock(); + vm = xa_load(&fpriv->vm_xa, args->value); + if (vm && !kref_get_unless_zero(&vm->ref)) + vm = NULL; + rcu_read_unlock(); + if (!vm) + return -ENOENT; + + i915_vm_put(pc->vm); + pc->vm = vm; + + return 0; +} + +struct set_proto_ctx_engines { + struct drm_i915_private *i915; + unsigned num_engines; + struct i915_gem_proto_engine *engines; +}; + +static int +set_proto_ctx_engines_balance(struct i915_user_extension __user *base, + void *data) +{ + struct i915_context_engines_load_balance __user *ext = + container_of_user(base, typeof(*ext), base); + const struct set_proto_ctx_engines *set = data; + struct drm_i915_private *i915 = set->i915; + struct intel_engine_cs **siblings; + u16 num_siblings, idx; + unsigned int n; + int err; + + if (!HAS_EXECLISTS(i915)) + return -ENODEV; + + if (intel_uc_uses_guc_submission(&i915->gt.uc)) + return -ENODEV; /* not implement yet */ + + if (get_user(idx, &ext->engine_index)) + return -EFAULT; + + if (idx >= set->num_engines) { + drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n", + idx, set->num_engines); + return -EINVAL; + } + + idx = array_index_nospec(idx, set->num_engines); + if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_INVALID) { + drm_dbg(&i915->drm, + "Invalid placement[%d], already occupied\n", idx); + return -EEXIST; + } + + if (get_user(num_siblings, &ext->num_siblings)) + return -EFAULT; + + err = check_user_mbz(&ext->flags); + if (err) + return err; + + err = check_user_mbz(&ext->mbz64); + if (err) + return err; + + if (num_siblings == 0) + return 0; + + siblings = kmalloc_array(num_siblings, sizeof(*siblings), GFP_KERNEL); + if (!siblings) + return -ENOMEM; + + for (n = 0; n < num_siblings; n++) { + struct i915_engine_class_instance ci; + + if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) { + err = -EFAULT; + goto err_siblings; + } + + siblings[n] = intel_engine_lookup_user(i915, + ci.engine_class, + ci.engine_instance); + if (!siblings[n]) { + drm_dbg(&i915->drm, + "Invalid sibling[%d]: { class:%d, inst:%d }\n", + n, ci.engine_class, ci.engine_instance); + err = -EINVAL; + goto err_siblings; + } + } + + if (num_siblings == 1) { + set->engines[idx].type = I915_GEM_ENGINE_TYPE_PHYSICAL; + set->engines[idx].engine = siblings[0]; + kfree(siblings); + } else { + set->engines[idx].type = I915_GEM_ENGINE_TYPE_BALANCED; + set->engines[idx].num_siblings = num_siblings; + set->engines[idx].siblings = siblings; + } + + return 0; + +err_siblings: + kfree(siblings); + + return err; +} + +static int +set_proto_ctx_engines_bond(struct i915_user_extension __user *base, void *data) +{ + struct i915_context_engines_bond __user *ext = + container_of_user(base, typeof(*ext), base); + const struct set_proto_ctx_engines *set = data; + struct drm_i915_private *i915 = set->i915; + struct i915_engine_class_instance ci; + struct intel_engine_cs *master; + u16 idx, num_bonds; + int err, n; + + if (get_user(idx, &ext->virtual_index)) + return -EFAULT; + + if (idx >= set->num_engines) { + drm_dbg(&i915->drm, + "Invalid index for virtual engine: %d >= %d\n", + idx, set->num_engines); + return -EINVAL; + } + + idx = array_index_nospec(idx, set->num_engines); + if (set->engines[idx].type == I915_GEM_ENGINE_TYPE_INVALID) { + drm_dbg(&i915->drm, "Invalid engine at %d\n", idx); + return -EINVAL; + } + + if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_PHYSICAL) { + drm_dbg(&i915->drm, + "Bonding with virtual engines not allowed\n"); + return -EINVAL; + } + + err = check_user_mbz(&ext->flags); + if (err) + return err; + + for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) { + err = check_user_mbz(&ext->mbz64[n]); + if (err) + return err; + } + + if (copy_from_user(&ci, &ext->master, sizeof(ci))) + return -EFAULT; + + master = intel_engine_lookup_user(i915, + ci.engine_class, + ci.engine_instance); + if (!master) { + drm_dbg(&i915->drm, + "Unrecognised master engine: { class:%u, instance:%u }\n", + ci.engine_class, ci.engine_instance); + return -EINVAL; + } + + if (get_user(num_bonds, &ext->num_bonds)) + return -EFAULT; + + for (n = 0; n < num_bonds; n++) { + struct intel_engine_cs *bond; + + if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) + return -EFAULT; + + bond = intel_engine_lookup_user(i915, + ci.engine_class, + ci.engine_instance); + if (!bond) { + drm_dbg(&i915->drm, + "Unrecognised engine[%d] for bonding: { class:%d, instance: %d }\n", + n, ci.engine_class, ci.engine_instance); + return -EINVAL; + } + } + + return 0; +} + +static const i915_user_extension_fn set_proto_ctx_engines_extensions[] = { + [I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_proto_ctx_engines_balance, + [I915_CONTEXT_ENGINES_EXT_BOND] = set_proto_ctx_engines_bond, +}; + +static int set_proto_ctx_engines(struct drm_i915_file_private *fpriv, + struct i915_gem_proto_context *pc, + const struct drm_i915_gem_context_param *args) +{ + struct drm_i915_private *i915 = fpriv->dev_priv; + struct set_proto_ctx_engines set = { .i915 = i915 }; + struct i915_context_param_engines __user *user = + u64_to_user_ptr(args->value); + unsigned int n; + u64 extensions; + int err; + + if (!args->size) { + kfree(pc->user_engines); + pc->num_user_engines = -1; + pc->user_engines = NULL; + return 0; + } + + BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->engines))); + if (args->size < sizeof(*user) || + !IS_ALIGNED(args->size, sizeof(*user->engines))) { + drm_dbg(&i915->drm, "Invalid size for engine array: %d\n", + args->size); + return -EINVAL; + } + + set.num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines); + if (set.num_engines > I915_EXEC_RING_MASK + 1) + return -EINVAL; + + set.engines = kmalloc_array(set.num_engines, sizeof(*set.engines), GFP_KERNEL); + if (!set.engines) + return -ENOMEM; + + for (n = 0; n < set.num_engines; n++) { + struct i915_engine_class_instance ci; + struct intel_engine_cs *engine; + + if (copy_from_user(&ci, &user->engines[n], sizeof(ci))) { + kfree(set.engines); + return -EFAULT; + } + + memset(&set.engines[n], 0, sizeof(set.engines[n])); + + if (ci.engine_class == (u16)I915_ENGINE_CLASS_INVALID && + ci.engine_instance == (u16)I915_ENGINE_CLASS_INVALID_NONE) + continue; + + engine = intel_engine_lookup_user(i915, + ci.engine_class, + ci.engine_instance); + if (!engine) { + drm_dbg(&i915->drm, + "Invalid engine[%d]: { class:%d, instance:%d }\n", + n, ci.engine_class, ci.engine_instance); + kfree(set.engines); + return -ENOENT; + } + + set.engines[n].type = I915_GEM_ENGINE_TYPE_PHYSICAL; + set.engines[n].engine = engine; + } + + err = -EFAULT; + if (!get_user(extensions, &user->extensions)) + err = i915_user_extensions(u64_to_user_ptr(extensions), + set_proto_ctx_engines_extensions, + ARRAY_SIZE(set_proto_ctx_engines_extensions), + &set); + if (err) { + kfree(set.engines); + return err; + } + + kfree(pc->user_engines); + pc->num_user_engines = set.num_engines; + pc->user_engines = set.engines; + + return 0; +} + +static int set_proto_ctx_param(struct drm_i915_file_private *fpriv, + struct i915_gem_proto_context *pc, + struct drm_i915_gem_context_param *args) +{ + int ret = 0; + + switch (args->param) { + case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE: + if (args->size) + ret = -EINVAL; + else if (args->value) + set_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags); + else + clear_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags); + break; + + case I915_CONTEXT_PARAM_BANNABLE: + if (args->size) + ret = -EINVAL; + else if (!capable(CAP_SYS_ADMIN) && !args->value) + ret = -EPERM; + else if (args->value) + set_bit(UCONTEXT_BANNABLE, &pc->user_flags); + else + clear_bit(UCONTEXT_BANNABLE, &pc->user_flags); + break; + + case I915_CONTEXT_PARAM_RECOVERABLE: + if (args->size) + ret = -EINVAL; + else if (args->value) + set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags); + else + clear_bit(UCONTEXT_RECOVERABLE, &pc->user_flags); + break; + + case I915_CONTEXT_PARAM_PRIORITY: + ret = validate_priority(fpriv->dev_priv, args); + if (!ret) + pc->sched.priority = args->value; + break; + + case I915_CONTEXT_PARAM_SSEU: + ret = -ENOTSUPP; + break; + + case I915_CONTEXT_PARAM_VM: + ret = set_proto_ctx_vm(fpriv, pc, args); + break; + + case I915_CONTEXT_PARAM_ENGINES: + ret = set_proto_ctx_engines(fpriv, pc, args); + break; + + case I915_CONTEXT_PARAM_PERSISTENCE: + if (args->size) + ret = -EINVAL; + else if (args->value) + set_bit(UCONTEXT_PERSISTENCE, &pc->user_flags); + else + clear_bit(UCONTEXT_PERSISTENCE, &pc->user_flags); + break; + + case I915_CONTEXT_PARAM_NO_ZEROMAP: + case I915_CONTEXT_PARAM_BAN_PERIOD: + case I915_CONTEXT_PARAM_RINGSIZE: + default: + ret = -EINVAL; + break; + } + + return ret; +} + static struct i915_address_space * context_get_vm_rcu(struct i915_gem_context *ctx) { @@ -450,6 +862,47 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx) return e; } +static struct i915_gem_engines *user_engines(struct i915_gem_context *ctx, + unsigned int num_engines, + struct i915_gem_proto_engine *pe) +{ + struct i915_gem_engines *e; + unsigned int n; + + e = alloc_engines(num_engines); + for (n = 0; n < num_engines; n++) { + struct intel_context *ce; + + switch (pe[n].type) { + case I915_GEM_ENGINE_TYPE_PHYSICAL: + ce = intel_context_create(pe[n].engine); + break; + + case I915_GEM_ENGINE_TYPE_BALANCED: + ce = intel_execlists_create_virtual(pe[n].siblings, + pe[n].num_siblings); + break; + + case I915_GEM_ENGINE_TYPE_INVALID: + default: + GEM_WARN_ON(pe[n].type != I915_GEM_ENGINE_TYPE_INVALID); + continue; + } + + if (IS_ERR(ce)) { + __free_engines(e, n); + return ERR_CAST(ce); + } + + intel_context_set_gem(ce, ctx); + + e->engines[n] = ce; + } + e->num_engines = num_engines; + + return e; +} + void i915_gem_context_release(struct kref *ref) { struct i915_gem_context *ctx = container_of(ref, typeof(*ctx), ref); @@ -890,6 +1343,24 @@ i915_gem_create_context(struct drm_i915_private *i915, mutex_unlock(&ctx->mutex); } + if (pc->num_user_engines >= 0) { + struct i915_gem_engines *engines; + + engines = user_engines(ctx, pc->num_user_engines, + pc->user_engines); + if (IS_ERR(engines)) { + context_close(ctx); + return ERR_CAST(engines); + } + + mutex_lock(&ctx->engines_mutex); + i915_gem_context_set_user_engines(ctx); + engines = rcu_replace_pointer(ctx->engines, engines, 1); + mutex_unlock(&ctx->engines_mutex); + + free_engines(engines); + } + if (pc->single_timeline) { ret = drm_syncobj_create(&ctx->syncobj, DRM_SYNCOBJ_CREATE_SIGNALED, @@ -916,12 +1387,12 @@ void i915_gem_init__contexts(struct drm_i915_private *i915) init_contexts(&i915->gem.contexts); } -static int gem_context_register(struct i915_gem_context *ctx, - struct drm_i915_file_private *fpriv, - u32 *id) +static void gem_context_register(struct i915_gem_context *ctx, + struct drm_i915_file_private *fpriv, + u32 id) { struct drm_i915_private *i915 = ctx->i915; - int ret; + void *old; ctx->file_priv = fpriv; @@ -930,19 +1401,12 @@ static int gem_context_register(struct i915_gem_context *ctx, current->comm, pid_nr(ctx->pid)); /* And finally expose ourselves to userspace via the idr */ - ret = xa_alloc(&fpriv->context_xa, id, ctx, xa_limit_32b, GFP_KERNEL); - if (ret) - goto err_pid; + old = xa_store(&fpriv->context_xa, id, ctx, GFP_KERNEL); + GEM_BUG_ON(old); spin_lock(&i915->gem.contexts.lock); list_add_tail(&ctx->link, &i915->gem.contexts.list); spin_unlock(&i915->gem.contexts.lock); - - return 0; - -err_pid: - put_pid(fetch_and_zero(&ctx->pid)); - return ret; } int i915_gem_context_open(struct drm_i915_private *i915, @@ -952,9 +1416,12 @@ int i915_gem_context_open(struct drm_i915_private *i915, struct i915_gem_proto_context *pc; struct i915_gem_context *ctx; int err; - u32 id; - xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC); + mutex_init(&file_priv->proto_context_lock); + xa_init_flags(&file_priv->proto_context_xa, XA_FLAGS_ALLOC); + + /* 0 reserved for the default context */ + xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC1); /* 0 reserved for invalid/unassigned ppgtt */ xa_init_flags(&file_priv->vm_xa, XA_FLAGS_ALLOC1); @@ -972,28 +1439,31 @@ int i915_gem_context_open(struct drm_i915_private *i915, goto err; } - err = gem_context_register(ctx, file_priv, &id); - if (err < 0) - goto err_ctx; + gem_context_register(ctx, file_priv, 0); - GEM_BUG_ON(id); return 0; -err_ctx: - context_close(ctx); err: xa_destroy(&file_priv->vm_xa); xa_destroy(&file_priv->context_xa); + xa_destroy(&file_priv->proto_context_xa); + mutex_destroy(&file_priv->proto_context_lock); return err; } void i915_gem_context_close(struct drm_file *file) { struct drm_i915_file_private *file_priv = file->driver_priv; + struct i915_gem_proto_context *pc; struct i915_address_space *vm; struct i915_gem_context *ctx; unsigned long idx; + xa_for_each(&file_priv->proto_context_xa, idx, pc) + proto_context_close(pc); + xa_destroy(&file_priv->proto_context_xa); + mutex_destroy(&file_priv->proto_context_lock); + xa_for_each(&file_priv->context_xa, idx, ctx) context_close(ctx); xa_destroy(&file_priv->context_xa); @@ -1918,7 +2388,7 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv, } struct create_ext { - struct i915_gem_context *ctx; + struct i915_gem_proto_context *pc; struct drm_i915_file_private *fpriv; }; @@ -1933,7 +2403,7 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data) if (local.param.ctx_id) return -EINVAL; - return ctx_setparam(arg->fpriv, arg->ctx, &local.param); + return set_proto_ctx_param(arg->fpriv, arg->pc, &local.param); } static int invalid_ext(struct i915_user_extension __user *ext, void *data) @@ -1951,12 +2421,71 @@ static bool client_is_banned(struct drm_i915_file_private *file_priv) return atomic_read(&file_priv->ban_score) >= I915_CLIENT_SCORE_BANNED; } +static inline struct i915_gem_context * +__context_lookup(struct drm_i915_file_private *file_priv, u32 id) +{ + struct i915_gem_context *ctx; + + rcu_read_lock(); + ctx = xa_load(&file_priv->context_xa, id); + if (ctx && !kref_get_unless_zero(&ctx->ref)) + ctx = NULL; + rcu_read_unlock(); + + return ctx; +} + +struct i915_gem_context * +lazy_create_context_locked(struct drm_i915_file_private *file_priv, + struct i915_gem_proto_context *pc, u32 id) +{ + struct i915_gem_context *ctx; + void *old; + + ctx = i915_gem_create_context(file_priv->dev_priv, pc); + if (IS_ERR(ctx)) + return ctx; + + gem_context_register(ctx, file_priv, id); + + old = xa_erase(&file_priv->proto_context_xa, id); + GEM_BUG_ON(old != pc); + proto_context_close(pc); + + /* One for the xarray and one for the caller */ + return i915_gem_context_get(ctx); +} + +struct i915_gem_context * +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) +{ + struct i915_gem_proto_context *pc; + struct i915_gem_context *ctx; + + ctx = __context_lookup(file_priv, id); + if (ctx) + return ctx; + + mutex_lock(&file_priv->proto_context_lock); + /* Try one more time under the lock */ + ctx = __context_lookup(file_priv, id); + if (!ctx) { + pc = xa_load(&file_priv->proto_context_xa, id); + if (!pc) + ctx = ERR_PTR(-ENOENT); + else + ctx = lazy_create_context_locked(file_priv, pc, id); + } + mutex_unlock(&file_priv->proto_context_lock); + + return ctx; +} + int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { struct drm_i915_private *i915 = to_i915(dev); struct drm_i915_gem_context_create_ext *args = data; - struct i915_gem_proto_context *pc; struct create_ext ext_data; int ret; u32 id; @@ -1979,14 +2508,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, return -EIO; } - pc = proto_context_create(i915, args->flags); - if (IS_ERR(pc)) - return PTR_ERR(pc); - - ext_data.ctx = i915_gem_create_context(i915, pc); - proto_context_close(pc); - if (IS_ERR(ext_data.ctx)) - return PTR_ERR(ext_data.ctx); + ext_data.pc = proto_context_create(i915, args->flags); + if (IS_ERR(ext_data.pc)) + return PTR_ERR(ext_data.pc); if (args->flags & I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS) { ret = i915_user_extensions(u64_to_user_ptr(args->extensions), @@ -1994,20 +2518,20 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, ARRAY_SIZE(create_extensions), &ext_data); if (ret) - goto err_ctx; + goto err_pc; } - ret = gem_context_register(ext_data.ctx, ext_data.fpriv, &id); + ret = proto_context_register(ext_data.fpriv, ext_data.pc, &id); if (ret < 0) - goto err_ctx; + goto err_pc; args->ctx_id = id; drm_dbg(&i915->drm, "HW context %d created\n", args->ctx_id); return 0; -err_ctx: - context_close(ext_data.ctx); +err_pc: + proto_context_close(ext_data.pc); return ret; } @@ -2016,6 +2540,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, { struct drm_i915_gem_context_destroy *args = data; struct drm_i915_file_private *file_priv = file->driver_priv; + struct i915_gem_proto_context *pc; struct i915_gem_context *ctx; if (args->pad != 0) @@ -2024,11 +2549,21 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, if (!args->ctx_id) return -ENOENT; + mutex_lock(&file_priv->proto_context_lock); ctx = xa_erase(&file_priv->context_xa, args->ctx_id); - if (!ctx) + pc = xa_erase(&file_priv->proto_context_xa, args->ctx_id); + mutex_unlock(&file_priv->proto_context_lock); + + if (!ctx && !pc) return -ENOENT; + GEM_WARN_ON(ctx && pc); + + if (pc) + proto_context_close(pc); + + if (ctx) + context_close(ctx); - context_close(ctx); return 0; } @@ -2161,16 +2696,48 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, { struct drm_i915_file_private *file_priv = file->driver_priv; struct drm_i915_gem_context_param *args = data; + struct i915_gem_proto_context *pc; struct i915_gem_context *ctx; - int ret; + int ret = 0; - ctx = i915_gem_context_lookup(file_priv, args->ctx_id); - if (IS_ERR(ctx)) - return PTR_ERR(ctx); + ctx = __context_lookup(file_priv, args->ctx_id); + if (ctx) + goto set_ctx_param; - ret = ctx_setparam(file_priv, ctx, args); + mutex_lock(&file_priv->proto_context_lock); + ctx = __context_lookup(file_priv, args->ctx_id); + if (ctx) + goto unlock; + + pc = xa_load(&file_priv->proto_context_xa, args->ctx_id); + if (!pc) { + ret = -ENOENT; + goto unlock; + } + + ret = set_proto_ctx_param(file_priv, pc, args); + if (ret == -ENOTSUPP) { + /* Some params, specifically SSEU, can only be set on fully + * created contexts. + */ + ret = 0; + ctx = lazy_create_context_locked(file_priv, pc, args->ctx_id); + if (IS_ERR(ctx)) { + ret = PTR_ERR(ctx); + ctx = NULL; + } + } + +unlock: + mutex_unlock(&file_priv->proto_context_lock); + +set_ctx_param: + if (!ret && ctx) + ret = ctx_setparam(file_priv, ctx, args); + + if (ctx) + i915_gem_context_put(ctx); - i915_gem_context_put(ctx); return ret; } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h index b5c908f3f4f22..20411db84914a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h @@ -133,6 +133,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data, struct drm_file *file); +struct i915_gem_context * +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id); + static inline struct i915_gem_context * i915_gem_context_get(struct i915_gem_context *ctx) { diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h index a42c429f94577..067ea3030ac91 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h @@ -46,6 +46,26 @@ struct i915_gem_engines_iter { const struct i915_gem_engines *engines; }; +enum i915_gem_engine_type { + I915_GEM_ENGINE_TYPE_INVALID = 0, + I915_GEM_ENGINE_TYPE_PHYSICAL, + I915_GEM_ENGINE_TYPE_BALANCED, +}; + +struct i915_gem_proto_engine { + /** @type: Type of this engine */ + enum i915_gem_engine_type type; + + /** @num_siblings: Engine, for physical */ + struct intel_engine_cs *engine; + + /** @num_siblings: Number of balanced siblings */ + unsigned int num_siblings; + + /** @num_siblings: Balanced siblings */ + struct intel_engine_cs **siblings; +}; + /** * struct i915_gem_proto_context - prototype context * @@ -64,6 +84,12 @@ struct i915_gem_proto_context { /** @sched: See i915_gem_context::sched */ struct i915_sched_attr sched; + /** @num_user_engines: Number of user-specified engines or -1 */ + int num_user_engines; + + /** @num_user_engines: User-specified engines */ + struct i915_gem_proto_engine *user_engines; + bool single_timeline; }; diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c index e0f512ef7f3c6..32cf2103828f9 100644 --- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c @@ -80,6 +80,7 @@ void mock_init_contexts(struct drm_i915_private *i915) struct i915_gem_context * live_context(struct drm_i915_private *i915, struct file *file) { + struct drm_i915_file_private *fpriv = to_drm_file(file)->driver_priv; struct i915_gem_proto_context *pc; struct i915_gem_context *ctx; int err; @@ -96,10 +97,12 @@ live_context(struct drm_i915_private *i915, struct file *file) i915_gem_context_set_no_error_capture(ctx); - err = gem_context_register(ctx, to_drm_file(file)->driver_priv, &id); + err = xa_alloc(&fpriv->context_xa, &id, NULL, xa_limit_32b, GFP_KERNEL); if (err < 0) goto err_ctx; + gem_context_register(ctx, fpriv, id); + return ctx; err_ctx: diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 004ed0e59c999..365c042529d72 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -200,6 +200,9 @@ struct drm_i915_file_private { struct rcu_head rcu; }; + struct mutex proto_context_lock; + struct xarray proto_context_xa; + struct xarray context_xa; struct xarray vm_xa; @@ -1840,20 +1843,6 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags); -static inline struct i915_gem_context * -i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) -{ - struct i915_gem_context *ctx; - - rcu_read_lock(); - ctx = xa_load(&file_priv->context_xa, id); - if (ctx && !kref_get_unless_zero(&ctx->ref)) - ctx = NULL; - rcu_read_unlock(); - - return ctx ? ctx : ERR_PTR(-ENOENT); -} - /* i915_gem_evict.c */ int __must_check i915_gem_evict_something(struct i915_address_space *vm, u64 min_size, u64 alignment,
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 657 ++++++++++++++++-- drivers/gpu/drm/i915/gem/i915_gem_context.h | 3 + .../gpu/drm/i915/gem/i915_gem_context_types.h | 26 + .../gpu/drm/i915/gem/selftests/mock_context.c | 5 +- drivers/gpu/drm/i915/i915_drv.h | 17 +- 5 files changed, 648 insertions(+), 60 deletions(-)