Message ID | 554212BF.1040309@zoho.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
I've now done some testing (on an i5-3230M, in Debian 8), and this patch doesn't *appear* to break anything: both with and without it (starting from linux-next 20150430 (fa94df1) + commit 245054a drm/i915: Enable cmd parser to do secure batch promotion for aliasing ppgtt), -libva (said in earlier discussion to use chained batches): all basic tests pass except test_07 (which doesn't work under 3.16 either); putsurface works -video (file playback and live camera) in vlc works -beignet (OpenCL) test suite: all pass except builtin_powr_* (long-standing known issue) and builtin_tgamma (it appears that linux-next puts the *C*PU in denormals-flushed-to-0 floating point mode, which breaks this test's checking mechanism: not sure if that's a bug or just a difference between Debian's and your defaults, but as it happens both with and without the patch, it's nothing to do with this) The one problem I did see only with the patch was that one session had all its windows open in the top left of the screen, un-movable, and missing their title bar, but this was not reproducible, so I can't tell if it was a result of the patch or a coincidence. However, plain linux-next 20150430 (without 245054a) has a lot of problems ("GPU HANG" in the kernel log on startup but the Xfce desktop does come up), glxgears segfaults, beignet gives a few wrong (all-0) results then throws CL_OUT_OF_RESOURCES, video doesn't play; probably https://bugs.freedesktop.org/show_bug.cgi?id=90190), and given that all 245054a does is enable secure batch promotion, that suggests that the driver no longer handles non-promoted batches properly, making this patch a risky move. I tried the intel-gpu-tools tests (1.10, running in recovery mode to avoid loading X), but found that most (not all) of the tests reported "GPU HANG" in all three linux-next cases (but worked under 3.16). Note that I will be away from email for the next few days.
Repeating those tests[1] on linux-next 20150505 gave the same results, but having seen the intermittent "no window title bar" problem twice more, I now have to consider it a result of my patch[0], and hence withdraw it. However, that doesn't mean there isn't a security problem here, only that we need a different way to fix it. [0] http://lists.freedesktop.org/archives/intel-gfx/2015-April/065627.html [1] http://lists.freedesktop.org/archives/intel-gfx/2015-May/065705.html
"Rebecca N. Palmer" <rebecca_palmer@zoho.com> writes: Hi, > i915_parse_cmds returns -EACCES on chained batches, which "tells the > caller to abort and dispatch the workload as a non-secure batch", > but the mechanism implementing that was broken when > flags |= I915_DISPATCH_SECURE was moved from i915_gem_execbuffer_parse > to i915_gem_do_execbuffer (17cabf571e50677d980e9ab2a43c5f11213003ae): > i915_gem_execbuffer_parse returns the original batch_obj in this case, > and i915_gem_do_execbuffer doesn't check for that. > Is this being made secure some other way (in which case the obsolete > comments should probably be removed), or is this a security hole? > > Warning: this is my first kernel patch, and has not been tested yet. > Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com> > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > @@ -1398,7 +1398,7 @@ i915_gem_do_execbuffer(struct drm_device > { > struct drm_i915_private *dev_priv = dev->dev_private; > struct eb_vmas *eb; > - struct drm_i915_gem_object *batch_obj; > + struct drm_i915_gem_object *batch_obj, *orig_batch_obj; > struct drm_i915_gem_exec_object2 shadow_exec_entry; > struct intel_engine_cs *ring; > struct intel_context *ctx; > @@ -1511,7 +1511,7 @@ i915_gem_do_execbuffer(struct drm_device > goto err; > > /* take note of the batch buffer before we might reorder the lists */ > - batch_obj = eb_get_batch(eb); > + orig_batch_obj = eb_get_batch(eb); > > /* Move the objects en-masse into the GTT, evicting if necessary. */ > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > @@ -1533,7 +1533,7 @@ i915_gem_do_execbuffer(struct drm_device > } > > /* Set the pending read domains for the batch buffer to COMMAND */ > - if (batch_obj->base.pending_write_domain) { > + if (orig_batch_obj->base.pending_write_domain) { > DRM_DEBUG("Attempting to use self-modifying batch buffer\n"); > ret = -EINVAL; > goto err; > @@ -1543,7 +1543,7 @@ i915_gem_do_execbuffer(struct drm_device > batch_obj = i915_gem_execbuffer_parse(ring, > &shadow_exec_entry, > eb, > - batch_obj, > + orig_batch_obj, > args->batch_start_offset, > args->batch_len, > file->is_master); > @@ -1559,7 +1559,7 @@ i915_gem_do_execbuffer(struct drm_device > * don't want that set when the command parser is > * enabled. > */ > - if (USES_PPGTT(dev)) USES_PPGTT(dev) has been removed in the latest nightly, so you can remove it here. > + if (USES_PPGTT(dev) && batch_obj!=orig_batch_obj) Coding convention needs spaces around the != check. (see scripts/checkpatch.pl). Also please consider adding comment above parsed_obj != batch_obj check about the parser ignoring the batch. Like /* Skip the promotion if the parser ignored the patch */ > dispatch_flags |= I915_DISPATCH_SECURE; On other gens where cmdparser is disabled, batch_obj is left dangling as the 'if (i915_needs_cmd_parser(ring) && args->batch_len)' branch is never taken on other than gen == 7. I suggest that you introduce a *parsed_obj in the branch scope, give original batch_obj to execbuffer_parse() and and do the parsed_obj != batch_obj and batch_obj reassignment inside the scope. -Mika > exec_start = 0; > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Thu, Apr 30, 2015 at 12:32:15PM +0100, Rebecca N. Palmer wrote: > i915_parse_cmds returns -EACCES on chained batches, which "tells the > caller to abort and dispatch the workload as a non-secure batch", > but the mechanism implementing that was broken when > flags |= I915_DISPATCH_SECURE was moved from i915_gem_execbuffer_parse > to i915_gem_do_execbuffer (17cabf571e50677d980e9ab2a43c5f11213003ae): > i915_gem_execbuffer_parse returns the original batch_obj in this case, > and i915_gem_do_execbuffer doesn't check for that. > > Is this being made secure some other way (in which case the obsolete > comments should probably be removed), or is this a security hole? > > Warning: this is my first kernel patch, and has not been tested yet. Looks really nice tbh and seems to fix a regression that the igt testsuite caught (gem_cmd_parse/chained-batches). Thanks a lot. Mika has some minor review comments, with those address I'll pull this in. Thanks, Daniel > Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com> > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > @@ -1398,7 +1398,7 @@ i915_gem_do_execbuffer(struct drm_device > { > struct drm_i915_private *dev_priv = dev->dev_private; > struct eb_vmas *eb; > - struct drm_i915_gem_object *batch_obj; > + struct drm_i915_gem_object *batch_obj, *orig_batch_obj; > struct drm_i915_gem_exec_object2 shadow_exec_entry; > struct intel_engine_cs *ring; > struct intel_context *ctx; > @@ -1511,7 +1511,7 @@ i915_gem_do_execbuffer(struct drm_device > goto err; > > /* take note of the batch buffer before we might reorder the lists */ > - batch_obj = eb_get_batch(eb); > + orig_batch_obj = eb_get_batch(eb); > > /* Move the objects en-masse into the GTT, evicting if necessary. */ > need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; > @@ -1533,7 +1533,7 @@ i915_gem_do_execbuffer(struct drm_device > } > > /* Set the pending read domains for the batch buffer to COMMAND */ > - if (batch_obj->base.pending_write_domain) { > + if (orig_batch_obj->base.pending_write_domain) { > DRM_DEBUG("Attempting to use self-modifying batch buffer\n"); > ret = -EINVAL; > goto err; > @@ -1543,7 +1543,7 @@ i915_gem_do_execbuffer(struct drm_device > batch_obj = i915_gem_execbuffer_parse(ring, > &shadow_exec_entry, > eb, > - batch_obj, > + orig_batch_obj, > args->batch_start_offset, > args->batch_len, > file->is_master); > @@ -1559,7 +1559,7 @@ i915_gem_do_execbuffer(struct drm_device > * don't want that set when the command parser is > * enabled. > */ > - if (USES_PPGTT(dev)) > + if (USES_PPGTT(dev) && batch_obj!=orig_batch_obj) > dispatch_flags |= I915_DISPATCH_SECURE; > > exec_start = 0; > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Since this has been publicly reported, I'm moving security@kernel.org to Bcc] On Tue, May 5, 2015 at 2:39 PM, Rebecca N. Palmer <rebecca_palmer@zoho.com> wrote: > Repeating those tests[1] on linux-next 20150505 gave the same results, > but having seen the intermittent "no window title bar" problem twice > more, I now have to consider it a result of my patch[0], and hence > withdraw it. > > However, that doesn't mean there isn't a security problem here, > only that we need a different way to fix it. > > [0] http://lists.freedesktop.org/archives/intel-gfx/2015-April/065627.html > [1] http://lists.freedesktop.org/archives/intel-gfx/2015-May/065705.html Has this issue been resolved? (I have no idea what "secure" means in this graphics context.) Thanks! -Kees
In next, it was fixed by http://cgit.freedesktop.org/drm-intel/commit/drivers/gpu/drm/i915/i915_gem_execbuffer.c?h=for-linux-next&id=c7c7372edc4ebc173ad359aeb5752e9ce09f2741 In 4.1, the problem only exists with the marked-as-unsafe parameter i915.enable_ppgtt=2. (It could be fixed there with http://lists.freedesktop.org/archives/intel-gfx/2015-May/066259.html , but doing so at this late stage would be risky.)
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1398,7 +1398,7 @@ i915_gem_do_execbuffer(struct drm_device { struct drm_i915_private *dev_priv = dev->dev_private; struct eb_vmas *eb; - struct drm_i915_gem_object *batch_obj; + struct drm_i915_gem_object *batch_obj, *orig_batch_obj; struct drm_i915_gem_exec_object2 shadow_exec_entry; struct intel_engine_cs *ring; struct intel_context *ctx; @@ -1511,7 +1511,7 @@ i915_gem_do_execbuffer(struct drm_device goto err; /* take note of the batch buffer before we might reorder the lists */ - batch_obj = eb_get_batch(eb); + orig_batch_obj = eb_get_batch(eb); /* Move the objects en-masse into the GTT, evicting if necessary. */ need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0; @@ -1533,7 +1533,7 @@ i915_gem_do_execbuffer(struct drm_device } /* Set the pending read domains for the batch buffer to COMMAND */ - if (batch_obj->base.pending_write_domain) { + if (orig_batch_obj->base.pending_write_domain) { DRM_DEBUG("Attempting to use self-modifying batch buffer\n"); ret = -EINVAL; goto err; @@ -1543,7 +1543,7 @@ i915_gem_do_execbuffer(struct drm_device batch_obj = i915_gem_execbuffer_parse(ring, &shadow_exec_entry, eb, - batch_obj, + orig_batch_obj, args->batch_start_offset, args->batch_len, file->is_master); @@ -1559,7 +1559,7 @@ i915_gem_do_execbuffer(struct drm_device * don't want that set when the command parser is * enabled. */ - if (USES_PPGTT(dev)) + if (USES_PPGTT(dev) && batch_obj!=orig_batch_obj) dispatch_flags |= I915_DISPATCH_SECURE; exec_start = 0;
i915_parse_cmds returns -EACCES on chained batches, which "tells the caller to abort and dispatch the workload as a non-secure batch", but the mechanism implementing that was broken when flags |= I915_DISPATCH_SECURE was moved from i915_gem_execbuffer_parse to i915_gem_do_execbuffer (17cabf571e50677d980e9ab2a43c5f11213003ae): i915_gem_execbuffer_parse returns the original batch_obj in this case, and i915_gem_do_execbuffer doesn't check for that. Is this being made secure some other way (in which case the obsolete comments should probably be removed), or is this a security hole? Warning: this is my first kernel patch, and has not been tested yet. Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com>