[v3,03/11] drm/i915: return EACCES for check_cmd() failures
diff mbox

Message ID 20160815144128.7847-4-robert@sixbynine.org
State New
Headers show

Commit Message

Robert Bragg Aug. 15, 2016, 2:41 p.m. UTC
check_cmd() is checking whether a command adheres to certain
restrictions that ensure it's safe to execute within a privileged batch
buffer. Returning false implies a privilege problem, not that the
command is invalid.

The distinction makes the difference between allowing the buffer to be
executed as an unprivileged batch buffer or returning an EINVAL error to
userspace without executing anything.

In a case where userspace may want to test whether it can successfully
write to a register that needs privileges the distinction may be
important and an EINVAL error may be considered fatal.

In particular this is currently true for Mesa, which includes a test for
whether OACONTROL can be written too, but Mesa treats any error when
flushing a batch buffer as fatal, calling exit(1).

As it is currently Mesa can gracefully handle a failure to write to
OACONTROL if the command parser is disabled, but if we were to remove
OACONTROL from the parser's whitelist then the returned EINVAL would
break Mesa applications as they attempt an OACONTROL write.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Chris Wilson Aug. 15, 2016, 3:04 p.m. UTC | #1
On Mon, Aug 15, 2016 at 03:41:20PM +0100, Robert Bragg wrote:
> check_cmd() is checking whether a command adheres to certain
> restrictions that ensure it's safe to execute within a privileged batch
> buffer. Returning false implies a privilege problem, not that the
> command is invalid.
> 
> The distinction makes the difference between allowing the buffer to be
> executed as an unprivileged batch buffer or returning an EINVAL error to
> userspace without executing anything.

Ah, but you choose to actually execute it instead. We can't allow that
either.
-Chris
Robert Bragg Aug. 18, 2016, 9:18 p.m. UTC | #2
On Mon, Aug 15, 2016 at 4:04 PM, Chris Wilson <chris@chris-wilson.co.uk>
wrote:

> On Mon, Aug 15, 2016 at 03:41:20PM +0100, Robert Bragg wrote:
> > check_cmd() is checking whether a command adheres to certain
> > restrictions that ensure it's safe to execute within a privileged batch
> > buffer. Returning false implies a privilege problem, not that the
> > command is invalid.
> >
> > The distinction makes the difference between allowing the buffer to be
> > executed as an unprivileged batch buffer or returning an EINVAL error to
> > userspace without executing anything.
>
> Ah, but you choose to actually execute it instead. We can't allow that
> either.
>

Okey, I've got myself a bit confused over this, going in circles a few
times now... Initially I took this to imply the cmd parser is not only
there to enable more than the HW policy allows, and there must be some
stuff we're blocking that the HW policy otherwise allows (and therefore
it's bad to return -EACCES here and fallback to the HW policy).

One of the things that's confusing me is that looking at the command parser
code, it looks like it's possible to bail early with an
MI_BATCH_BUFFER_START command, returning -EACCES and falling back to the
non-privileged HW policy. If the HW policy isn't strict enough in some
cases then presumably we wouldn't ever allow an early fallback to the HW
policy?

oacontrol does look to be an example where it seems a little dubious that
the HW considers it ok to write from a non-privileged buffer for gen8+, but
for gen7 all LRIs are considered privileged afik and the -EACCES fallback
should result in an oacontrol LRI becoming a NOOP.

The change appears to be having the desired effect with respect to mesa's
check for oacontrol writes failing gracefully with AMD_performance_monitor
not being advertised. The fallback via -EACCES to the non-privileged HW
policy seems to be NOOPing the LRIs based on running gputop to capture
system-wide metrics (oacontrol enabled via i915 perf interface) and then
running Mesa/GL applications that that would otherwise clobber oacontrol
with test LRIs when deciding to advertise AMD_performance_monitor. This
would interfere with gputop by disabling the OA unit if the LRIs were
successful.

Unless I've got the wrong end of the stick, I think this is ok for Haswell.

- Robert


> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
>

Patch
diff mbox

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index cfe3e7a..71e778b 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1261,7 +1261,7 @@  int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 
 		if (!check_cmd(engine, desc, cmd, length, is_master,
 			       &oacontrol_set)) {
-			ret = -EINVAL;
+			ret = -EACCES;
 			break;
 		}