Message ID | 1391032514-19136-11-git-send-email-bradley.d.volkin@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Jan 29, 2014 at 01:55:11PM -0800, bradley.d.volkin@intel.com wrote: > From: Brad Volkin <bradley.d.volkin@intel.com> > > Various commands that access memory have a bit to determine whether > the graphics address specified in the command should use the GGTT or > PPGTT for translation. These checks ensure that the bit indicates > PPGTT translation. > > Most of these checks use the existing bit-checking infrastructure. > The PIPE_CONTROL and MI_FLUSH_DW commands, however, are multi-function > commands. The GGTT/PPGTT bit is only relevant for certain uses of the > command. As such, this change also extends the bit-checking code to > include a "condition" mask and offset. If the condition mask is non-zero > then the parser only performs the bit check when the bits specified by > the condition mask/offset are also non-zero. > > NOTE: At this point in the series PPGTT must be enabled for the parser > to work correctly. If it's not enabled, userspace will not be setting > the PPGTT bits the way the parser requires. VLV is the only platform > where this is a problem, so at this point, we disable parsing for VLV. That doesn't make sense. Are we not verifying that userspace has set the bits as appropriate for the hardware setup? So the value we expect depends upon how we have enabled ppgtt (or not). -Chris
On Wed, Jan 29, 2014 at 02:33:55PM -0800, Chris Wilson wrote: > On Wed, Jan 29, 2014 at 01:55:11PM -0800, bradley.d.volkin@intel.com wrote: > > From: Brad Volkin <bradley.d.volkin@intel.com> > > > > Various commands that access memory have a bit to determine whether > > the graphics address specified in the command should use the GGTT or > > PPGTT for translation. These checks ensure that the bit indicates > > PPGTT translation. > > > > Most of these checks use the existing bit-checking infrastructure. > > The PIPE_CONTROL and MI_FLUSH_DW commands, however, are multi-function > > commands. The GGTT/PPGTT bit is only relevant for certain uses of the > > command. As such, this change also extends the bit-checking code to > > include a "condition" mask and offset. If the condition mask is non-zero > > then the parser only performs the bit check when the bits specified by > > the condition mask/offset are also non-zero. > > > > NOTE: At this point in the series PPGTT must be enabled for the parser > > to work correctly. If it's not enabled, userspace will not be setting > > the PPGTT bits the way the parser requires. VLV is the only platform > > where this is a problem, so at this point, we disable parsing for VLV. > > That doesn't make sense. Are we not verifying that userspace has set the > bits as appropriate for the hardware setup? So the value we expect > depends upon how we have enabled ppgtt (or not). We could but don't currently. I was under the impression the parser wasn't seen as having as much benefit without ppgtt and that we're generally moving towards ppgtt as the default for all relevant platforms. - Brad > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre
On Wed, Jan 29, 2014 at 03:00:11PM -0800, Volkin, Bradley D wrote: > On Wed, Jan 29, 2014 at 02:33:55PM -0800, Chris Wilson wrote: > > On Wed, Jan 29, 2014 at 01:55:11PM -0800, bradley.d.volkin@intel.com wrote: > > > From: Brad Volkin <bradley.d.volkin@intel.com> > > > > > > Various commands that access memory have a bit to determine whether > > > the graphics address specified in the command should use the GGTT or > > > PPGTT for translation. These checks ensure that the bit indicates > > > PPGTT translation. > > > > > > Most of these checks use the existing bit-checking infrastructure. > > > The PIPE_CONTROL and MI_FLUSH_DW commands, however, are multi-function > > > commands. The GGTT/PPGTT bit is only relevant for certain uses of the > > > command. As such, this change also extends the bit-checking code to > > > include a "condition" mask and offset. If the condition mask is non-zero > > > then the parser only performs the bit check when the bits specified by > > > the condition mask/offset are also non-zero. > > > > > > NOTE: At this point in the series PPGTT must be enabled for the parser > > > to work correctly. If it's not enabled, userspace will not be setting > > > the PPGTT bits the way the parser requires. VLV is the only platform > > > where this is a problem, so at this point, we disable parsing for VLV. > > > > That doesn't make sense. Are we not verifying that userspace has set the > > bits as appropriate for the hardware setup? So the value we expect > > depends upon how we have enabled ppgtt (or not). > > We could but don't currently. I was under the impression the parser wasn't > seen as having as much benefit without ppgtt and that we're generally moving > towards ppgtt as the default for all relevant platforms. Oh, I remember that argument. It's just the way you phrased the note made me think that it was a limitation of the patch. Personally I would implement the checks against the hardware state as we know it. It's a nice pedalogical example, and removes a buried assumption from the code. -Chris
On Wed, 29 Jan 2014, bradley.d.volkin@intel.com wrote: > From: Brad Volkin <bradley.d.volkin@intel.com> > > Various commands that access memory have a bit to determine whether > the graphics address specified in the command should use the GGTT or > PPGTT for translation. These checks ensure that the bit indicates > PPGTT translation. > > Most of these checks use the existing bit-checking infrastructure. > The PIPE_CONTROL and MI_FLUSH_DW commands, however, are multi-function > commands. The GGTT/PPGTT bit is only relevant for certain uses of the > command. As such, this change also extends the bit-checking code to > include a "condition" mask and offset. If the condition mask is non-zero > then the parser only performs the bit check when the bits specified by > the condition mask/offset are also non-zero. > > NOTE: At this point in the series PPGTT must be enabled for the parser > to work correctly. If it's not enabled, userspace will not be setting > the PPGTT bits the way the parser requires. VLV is the only platform > where this is a problem, so at this point, we disable parsing for VLV. > > OTC-Tracker: AXIA-4631 > Change-Id: I3f4c76b6734f1956ec47e698230f97d0998ff92b > Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> > --- > drivers/gpu/drm/i915/i915_cmd_parser.c | 147 +++++++++++++++++++++++++++++---- > drivers/gpu/drm/i915/i915_drv.h | 6 ++ > drivers/gpu/drm/i915/i915_reg.h | 6 ++ > 3 files changed, 144 insertions(+), 15 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c > index 7de7c6a..26072a2 100644 > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > @@ -65,10 +65,22 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = { > CMD( MI_STORE_DWORD_INDEX, SMI, !F, 0xFF, R ), > CMD( MI_LOAD_REGISTER_IMM(1), SMI, !F, 0xFF, W, > .reg = { .offset = 1, .mask = 0x007FFFFC } ), > - CMD( MI_STORE_REGISTER_MEM(1), SMI, !F, 0xFF, W, > - .reg = { .offset = 1, .mask = 0x007FFFFC } ), > - CMD( MI_LOAD_REGISTER_MEM, SMI, !F, 0xFF, W, > - .reg = { .offset = 1, .mask = 0x007FFFFC } ), > + CMD( MI_STORE_REGISTER_MEM(1), SMI, !F, 0xFF, W | B, > + .reg = { .offset = 1, .mask = 0x007FFFFC }, > + .bits = {{ > + .offset = 0, > + .mask = MI_GLOBAL_GTT, > + .expected = 0 Not specific to this patch or this field, but all around I think you should add the comma to the last line too. It's a pretty universal way of doing things in the kernel, both for array and struct initialization. > + }}, > + .bits_count = 1 ), > + CMD( MI_LOAD_REGISTER_MEM, SMI, !F, 0xFF, W | B, > + .reg = { .offset = 1, .mask = 0x007FFFFC }, > + .bits = {{ > + .offset = 0, > + .mask = MI_GLOBAL_GTT, > + .expected = 0 > + }}, > + .bits_count = 1 ), > CMD( MI_BATCH_BUFFER_START, SMI, !F, 0xFF, S ), > }; > > @@ -80,9 +92,35 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = { > CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ), > CMD( MI_SET_CONTEXT, SMI, !F, 0xFF, R ), > CMD( MI_URB_CLEAR, SMI, !F, 0xFF, S ), > + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3F, B, > + .bits = {{ > + .offset = 0, > + .mask = MI_GLOBAL_GTT, > + .expected = 0 > + }}, > + .bits_count = 1 ), > CMD( MI_UPDATE_GTT, SMI, !F, 0xFF, R ), > - CMD( MI_CLFLUSH, SMI, !F, 0x3FF, S ), > - CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, S ), > + CMD( MI_CLFLUSH, SMI, !F, 0x3FF, B, > + .bits = {{ > + .offset = 0, > + .mask = MI_GLOBAL_GTT, > + .expected = 0 > + }}, > + .bits_count = 1 ), > + CMD( MI_REPORT_PERF_COUNT, SMI, !F, 0x3F, B, > + .bits = {{ > + .offset = 1, > + .mask = MI_REPORT_PERF_COUNT_GGTT, > + .expected = 0 > + }}, > + .bits_count = 1 ), > + CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, B, > + .bits = {{ > + .offset = 0, > + .mask = MI_GLOBAL_GTT, > + .expected = 0 > + }}, > + .bits_count = 1 ), > CMD( GFX_OP_3DSTATE_VF_STATISTICS, S3D, F, 1, S ), > CMD( PIPELINE_SELECT, S3D, F, 1, S ), > CMD( MEDIA_VFE_STATE, S3D, !F, 0xFFFF, B, > @@ -100,8 +138,15 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = { > .offset = 1, > .mask = (PIPE_CONTROL_MMIO_WRITE | PIPE_CONTROL_NOTIFY), > .expected = 0 > + }, > + { > + .offset = 1, > + .mask = PIPE_CONTROL_GLOBAL_GTT_IVB, > + .expected = 0, > + .condition_offset = 1, > + .condition_mask = PIPE_CONTROL_POST_SYNC_OP_MASK > }}, > - .bits_count = 1 ), > + .bits_count = 2 ), > }; > > static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = { > @@ -127,16 +172,35 @@ static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = { > > static const struct drm_i915_cmd_descriptor video_cmds[] = { > CMD( MI_ARB_ON_OFF, SMI, F, 1, R ), > - CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, S ), > + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B, > + .bits = {{ > + .offset = 0, > + .mask = MI_GLOBAL_GTT, > + .expected = 0 > + }}, > + .bits_count = 1 ), > CMD( MI_UPDATE_GTT, SMI, !F, 0x3F, R ), > CMD( MI_FLUSH_DW, SMI, !F, 0x3F, B, > .bits = {{ > .offset = 0, > .mask = MI_FLUSH_DW_NOTIFY, > .expected = 0 > + }, > + { > + .offset = 1, > + .mask = MI_FLUSH_DW_USE_GTT, > + .expected = 0, > + .condition_offset = 0, > + .condition_mask = MI_FLUSH_DW_OP_MASK > }}, > - .bits_count = 1 ), > - CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, S ), > + .bits_count = 2 ), > + CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, B, > + .bits = {{ > + .offset = 0, > + .mask = MI_GLOBAL_GTT, > + .expected = 0 > + }}, > + .bits_count = 1 ), > /* > * MFX_WAIT doesn't fit the way we handle length for most commands. > * It has a length field but it uses a non-standard length bias. > @@ -147,29 +211,61 @@ static const struct drm_i915_cmd_descriptor video_cmds[] = { > > static const struct drm_i915_cmd_descriptor vecs_cmds[] = { > CMD( MI_ARB_ON_OFF, SMI, F, 1, R ), > - CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, S ), > + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B, > + .bits = {{ > + .offset = 0, > + .mask = MI_GLOBAL_GTT, > + .expected = 0 > + }}, > + .bits_count = 1 ), > CMD( MI_UPDATE_GTT, SMI, !F, 0x3F, R ), > CMD( MI_FLUSH_DW, SMI, !F, 0x3F, B, > .bits = {{ > .offset = 0, > .mask = MI_FLUSH_DW_NOTIFY, > .expected = 0 > + }, > + { > + .offset = 1, > + .mask = MI_FLUSH_DW_USE_GTT, > + .expected = 0, > + .condition_offset = 0, > + .condition_mask = MI_FLUSH_DW_OP_MASK > }}, > - .bits_count = 1 ), > - CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, S ), > + .bits_count = 2 ), > + CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, B, > + .bits = {{ > + .offset = 0, > + .mask = MI_GLOBAL_GTT, > + .expected = 0 > + }}, > + .bits_count = 1 ), > }; > > static const struct drm_i915_cmd_descriptor blt_cmds[] = { > CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ), > - CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, S ), > + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, B, > + .bits = {{ > + .offset = 0, > + .mask = MI_GLOBAL_GTT, > + .expected = 0 > + }}, > + .bits_count = 1 ), > CMD( MI_UPDATE_GTT, SMI, !F, 0x3F, R ), > CMD( MI_FLUSH_DW, SMI, !F, 0x3F, B, > .bits = {{ > .offset = 0, > .mask = MI_FLUSH_DW_NOTIFY, > .expected = 0 > + }, > + { > + .offset = 1, > + .mask = MI_FLUSH_DW_USE_GTT, > + .expected = 0, > + .condition_offset = 0, > + .condition_mask = MI_FLUSH_DW_OP_MASK > }}, > - .bits_count = 1 ), > + .bits_count = 2 ), > CMD( COLOR_BLT, S2D, !F, 0x3F, S ), > CMD( SRC_COPY_BLT, S2D, !F, 0x3F, S ), > }; > @@ -569,10 +665,21 @@ finish: > > int i915_needs_cmd_parser(struct intel_ring_buffer *ring) > { > + drm_i915_private_t *dev_priv = > + (drm_i915_private_t *)ring->dev->dev_private; > + > /* No command tables indicates a platform without parsing */ > if (!ring->cmd_tables) > return 0; > > + /* > + * XXX: VLV is Gen7 and therefore has cmd_tables, but has PPGTT > + * disabled. That will cause all of the parser's PPGTT checks to > + * fail. For now, disable parsing when PPGTT is off. > + */ > + if(!dev_priv->mm.aliasing_ppgtt) ^ missing space. > + return 0; > + Hmm, shouldn't this belong to some other patch, much earlier in the series? Like patch 2 or 3? > return i915.enable_cmd_parser; > } > > @@ -675,6 +782,16 @@ int i915_parse_cmds(struct intel_ring_buffer *ring, > u32 dword = cmd[desc->bits[i].offset] & > desc->bits[i].mask; > > + if (desc->bits[i].condition_mask != 0) { > + u32 offset = > + desc->bits[i].condition_offset; > + u32 condition = cmd[offset] & > + desc->bits[i].condition_mask; > + > + if (condition == 0) > + continue; > + } > + > if (dword != desc->bits[i].expected) { > DRM_DEBUG_DRIVER("CMD: Rejected command 0x%08X for bitmask 0x%08X (exp=0x%08X act=0x%08X) (ring=%d)\n", > *cmd, > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 8aed80f..2d1d2ef 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1829,11 +1829,17 @@ struct drm_i915_cmd_descriptor { > * compared against an expected value. If the command does not match > * the expected value, the parser rejects it. Only valid if flags has > * the CMD_DESC_BITMASK bit set. > + * > + * If the check specifies a non-zero condition_mask then the parser > + * only performs the check when the bits specified by condition_mask > + * are non-zero. > */ > struct { > u32 offset; > u32 mask; > u32 expected; > + u32 condition_offset; > + u32 condition_mask; > } bits[MAX_CMD_DESC_BITMASKS]; > /** Number of valid entries in the bits array */ > int bits_count; > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > index c2e4898..ff263f4 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -179,6 +179,8 @@ > * Memory interface instructions used by the kernel > */ > #define MI_INSTR(opcode, flags) (((opcode) << 23) | (flags)) > +/* Many MI commands use bit 22 of the header dword for GGTT vs PPGTT */ > +#define MI_GLOBAL_GTT (1<<22) > > #define MI_NOOP MI_INSTR(0, 0) > #define MI_USER_INTERRUPT MI_INSTR(0x02, 0) > @@ -258,6 +260,7 @@ > #define MI_FLUSH_DW_STORE_INDEX (1<<21) > #define MI_INVALIDATE_TLB (1<<18) > #define MI_FLUSH_DW_OP_STOREDW (1<<14) > +#define MI_FLUSH_DW_OP_MASK (3<<14) > #define MI_FLUSH_DW_NOTIFY (1<<8) > #define MI_INVALIDATE_BSD (1<<7) > #define MI_FLUSH_DW_USE_GTT (1<<2) > @@ -324,6 +327,7 @@ > #define PIPE_CONTROL_CS_STALL (1<<20) > #define PIPE_CONTROL_TLB_INVALIDATE (1<<18) > #define PIPE_CONTROL_QW_WRITE (1<<14) > +#define PIPE_CONTROL_POST_SYNC_OP_MASK (3<<14) > #define PIPE_CONTROL_DEPTH_STALL (1<<13) > #define PIPE_CONTROL_WRITE_FLUSH (1<<12) > #define PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH (1<<12) /* gen6+ */ > @@ -352,6 +356,8 @@ > #define MI_URB_CLEAR MI_INSTR(0x19, 0) > #define MI_UPDATE_GTT MI_INSTR(0x23, 0) > #define MI_CLFLUSH MI_INSTR(0x27, 0) > +#define MI_REPORT_PERF_COUNT MI_INSTR(0x28, 0) > +#define MI_REPORT_PERF_COUNT_GGTT (1<<0) > #define MI_LOAD_REGISTER_MEM MI_INSTR(0x29, 0) > #define MI_LOAD_REGISTER_REG MI_INSTR(0x2A, 0) > #define MI_RS_STORE_DATA_IMM MI_INSTR(0x2B, 0) > -- > 1.8.5.2 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[snip] On Wed, Feb 05, 2014 at 07:37:51AM -0800, Jani Nikula wrote: > On Wed, 29 Jan 2014, bradley.d.volkin@intel.com wrote: > > int i915_needs_cmd_parser(struct intel_ring_buffer *ring) > > { > > + drm_i915_private_t *dev_priv = > > + (drm_i915_private_t *)ring->dev->dev_private; > > + > > /* No command tables indicates a platform without parsing */ > > if (!ring->cmd_tables) > > return 0; > > > > + /* > > + * XXX: VLV is Gen7 and therefore has cmd_tables, but has PPGTT > > + * disabled. That will cause all of the parser's PPGTT checks to > > + * fail. For now, disable parsing when PPGTT is off. > > + */ > > + if(!dev_priv->mm.aliasing_ppgtt) > ^ missing space. Oops > > > + return 0; > > + > > Hmm, shouldn't this belong to some other patch, much earlier in the > series? Like patch 2 or 3? Not necessarily. It's only because we've added the PPGTT checks without somehow making them conditional on aliasing_ppgtt==true that we have a problem, and that only happens with this patch. The parser works, though is less useful, on !aliasing_ppgtt platforms up to this point. Chris suggested that we just fix it up so that the PPGTT checks are conditional on PPGTT actually enabled, so I'm going to look at that. - Brad > > > return i915.enable_cmd_parser; > > } > > > > @@ -675,6 +782,16 @@ int i915_parse_cmds(struct intel_ring_buffer *ring, > > u32 dword = cmd[desc->bits[i].offset] & > > desc->bits[i].mask; > > > > + if (desc->bits[i].condition_mask != 0) { > > + u32 offset = > > + desc->bits[i].condition_offset; > > + u32 condition = cmd[offset] & > > + desc->bits[i].condition_mask; > > + > > + if (condition == 0) > > + continue; > > + } > > + > > if (dword != desc->bits[i].expected) { > > DRM_DEBUG_DRIVER("CMD: Rejected command 0x%08X for bitmask 0x%08X (exp=0x%08X act=0x%08X) (ring=%d)\n", > > *cmd, > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > index 8aed80f..2d1d2ef 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.h > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > @@ -1829,11 +1829,17 @@ struct drm_i915_cmd_descriptor { > > * compared against an expected value. If the command does not match > > * the expected value, the parser rejects it. Only valid if flags has > > * the CMD_DESC_BITMASK bit set. > > + * > > + * If the check specifies a non-zero condition_mask then the parser > > + * only performs the check when the bits specified by condition_mask > > + * are non-zero. > > */ > > struct { > > u32 offset; > > u32 mask; > > u32 expected; > > + u32 condition_offset; > > + u32 condition_mask; > > } bits[MAX_CMD_DESC_BITMASKS]; > > /** Number of valid entries in the bits array */ > > int bits_count; > > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > > index c2e4898..ff263f4 100644 > > --- a/drivers/gpu/drm/i915/i915_reg.h > > +++ b/drivers/gpu/drm/i915/i915_reg.h > > @@ -179,6 +179,8 @@ > > * Memory interface instructions used by the kernel > > */ > > #define MI_INSTR(opcode, flags) (((opcode) << 23) | (flags)) > > +/* Many MI commands use bit 22 of the header dword for GGTT vs PPGTT */ > > +#define MI_GLOBAL_GTT (1<<22) > > > > #define MI_NOOP MI_INSTR(0, 0) > > #define MI_USER_INTERRUPT MI_INSTR(0x02, 0) > > @@ -258,6 +260,7 @@ > > #define MI_FLUSH_DW_STORE_INDEX (1<<21) > > #define MI_INVALIDATE_TLB (1<<18) > > #define MI_FLUSH_DW_OP_STOREDW (1<<14) > > +#define MI_FLUSH_DW_OP_MASK (3<<14) > > #define MI_FLUSH_DW_NOTIFY (1<<8) > > #define MI_INVALIDATE_BSD (1<<7) > > #define MI_FLUSH_DW_USE_GTT (1<<2) > > @@ -324,6 +327,7 @@ > > #define PIPE_CONTROL_CS_STALL (1<<20) > > #define PIPE_CONTROL_TLB_INVALIDATE (1<<18) > > #define PIPE_CONTROL_QW_WRITE (1<<14) > > +#define PIPE_CONTROL_POST_SYNC_OP_MASK (3<<14) > > #define PIPE_CONTROL_DEPTH_STALL (1<<13) > > #define PIPE_CONTROL_WRITE_FLUSH (1<<12) > > #define PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH (1<<12) /* gen6+ */ > > @@ -352,6 +356,8 @@ > > #define MI_URB_CLEAR MI_INSTR(0x19, 0) > > #define MI_UPDATE_GTT MI_INSTR(0x23, 0) > > #define MI_CLFLUSH MI_INSTR(0x27, 0) > > +#define MI_REPORT_PERF_COUNT MI_INSTR(0x28, 0) > > +#define MI_REPORT_PERF_COUNT_GGTT (1<<0) > > #define MI_LOAD_REGISTER_MEM MI_INSTR(0x29, 0) > > #define MI_LOAD_REGISTER_REG MI_INSTR(0x2A, 0) > > #define MI_RS_STORE_DATA_IMM MI_INSTR(0x2B, 0) > > -- > > 1.8.5.2 > > > > _______________________________________________ > > Intel-gfx mailing list > > Intel-gfx@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx > > -- > Jani Nikula, Intel Open Source Technology Center
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 7de7c6a..26072a2 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -65,10 +65,22 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = { CMD( MI_STORE_DWORD_INDEX, SMI, !F, 0xFF, R ), CMD( MI_LOAD_REGISTER_IMM(1), SMI, !F, 0xFF, W, .reg = { .offset = 1, .mask = 0x007FFFFC } ), - CMD( MI_STORE_REGISTER_MEM(1), SMI, !F, 0xFF, W, - .reg = { .offset = 1, .mask = 0x007FFFFC } ), - CMD( MI_LOAD_REGISTER_MEM, SMI, !F, 0xFF, W, - .reg = { .offset = 1, .mask = 0x007FFFFC } ), + CMD( MI_STORE_REGISTER_MEM(1), SMI, !F, 0xFF, W | B, + .reg = { .offset = 1, .mask = 0x007FFFFC }, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), + CMD( MI_LOAD_REGISTER_MEM, SMI, !F, 0xFF, W | B, + .reg = { .offset = 1, .mask = 0x007FFFFC }, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( MI_BATCH_BUFFER_START, SMI, !F, 0xFF, S ), }; @@ -80,9 +92,35 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = { CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ), CMD( MI_SET_CONTEXT, SMI, !F, 0xFF, R ), CMD( MI_URB_CLEAR, SMI, !F, 0xFF, S ), + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3F, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( MI_UPDATE_GTT, SMI, !F, 0xFF, R ), - CMD( MI_CLFLUSH, SMI, !F, 0x3FF, S ), - CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, S ), + CMD( MI_CLFLUSH, SMI, !F, 0x3FF, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), + CMD( MI_REPORT_PERF_COUNT, SMI, !F, 0x3F, B, + .bits = {{ + .offset = 1, + .mask = MI_REPORT_PERF_COUNT_GGTT, + .expected = 0 + }}, + .bits_count = 1 ), + CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( GFX_OP_3DSTATE_VF_STATISTICS, S3D, F, 1, S ), CMD( PIPELINE_SELECT, S3D, F, 1, S ), CMD( MEDIA_VFE_STATE, S3D, !F, 0xFFFF, B, @@ -100,8 +138,15 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = { .offset = 1, .mask = (PIPE_CONTROL_MMIO_WRITE | PIPE_CONTROL_NOTIFY), .expected = 0 + }, + { + .offset = 1, + .mask = PIPE_CONTROL_GLOBAL_GTT_IVB, + .expected = 0, + .condition_offset = 1, + .condition_mask = PIPE_CONTROL_POST_SYNC_OP_MASK }}, - .bits_count = 1 ), + .bits_count = 2 ), }; static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = { @@ -127,16 +172,35 @@ static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = { static const struct drm_i915_cmd_descriptor video_cmds[] = { CMD( MI_ARB_ON_OFF, SMI, F, 1, R ), - CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, S ), + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( MI_UPDATE_GTT, SMI, !F, 0x3F, R ), CMD( MI_FLUSH_DW, SMI, !F, 0x3F, B, .bits = {{ .offset = 0, .mask = MI_FLUSH_DW_NOTIFY, .expected = 0 + }, + { + .offset = 1, + .mask = MI_FLUSH_DW_USE_GTT, + .expected = 0, + .condition_offset = 0, + .condition_mask = MI_FLUSH_DW_OP_MASK }}, - .bits_count = 1 ), - CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, S ), + .bits_count = 2 ), + CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), /* * MFX_WAIT doesn't fit the way we handle length for most commands. * It has a length field but it uses a non-standard length bias. @@ -147,29 +211,61 @@ static const struct drm_i915_cmd_descriptor video_cmds[] = { static const struct drm_i915_cmd_descriptor vecs_cmds[] = { CMD( MI_ARB_ON_OFF, SMI, F, 1, R ), - CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, S ), + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( MI_UPDATE_GTT, SMI, !F, 0x3F, R ), CMD( MI_FLUSH_DW, SMI, !F, 0x3F, B, .bits = {{ .offset = 0, .mask = MI_FLUSH_DW_NOTIFY, .expected = 0 + }, + { + .offset = 1, + .mask = MI_FLUSH_DW_USE_GTT, + .expected = 0, + .condition_offset = 0, + .condition_mask = MI_FLUSH_DW_OP_MASK }}, - .bits_count = 1 ), - CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, S ), + .bits_count = 2 ), + CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), }; static const struct drm_i915_cmd_descriptor blt_cmds[] = { CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ), - CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, S ), + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( MI_UPDATE_GTT, SMI, !F, 0x3F, R ), CMD( MI_FLUSH_DW, SMI, !F, 0x3F, B, .bits = {{ .offset = 0, .mask = MI_FLUSH_DW_NOTIFY, .expected = 0 + }, + { + .offset = 1, + .mask = MI_FLUSH_DW_USE_GTT, + .expected = 0, + .condition_offset = 0, + .condition_mask = MI_FLUSH_DW_OP_MASK }}, - .bits_count = 1 ), + .bits_count = 2 ), CMD( COLOR_BLT, S2D, !F, 0x3F, S ), CMD( SRC_COPY_BLT, S2D, !F, 0x3F, S ), }; @@ -569,10 +665,21 @@ finish: int i915_needs_cmd_parser(struct intel_ring_buffer *ring) { + drm_i915_private_t *dev_priv = + (drm_i915_private_t *)ring->dev->dev_private; + /* No command tables indicates a platform without parsing */ if (!ring->cmd_tables) return 0; + /* + * XXX: VLV is Gen7 and therefore has cmd_tables, but has PPGTT + * disabled. That will cause all of the parser's PPGTT checks to + * fail. For now, disable parsing when PPGTT is off. + */ + if(!dev_priv->mm.aliasing_ppgtt) + return 0; + return i915.enable_cmd_parser; } @@ -675,6 +782,16 @@ int i915_parse_cmds(struct intel_ring_buffer *ring, u32 dword = cmd[desc->bits[i].offset] & desc->bits[i].mask; + if (desc->bits[i].condition_mask != 0) { + u32 offset = + desc->bits[i].condition_offset; + u32 condition = cmd[offset] & + desc->bits[i].condition_mask; + + if (condition == 0) + continue; + } + if (dword != desc->bits[i].expected) { DRM_DEBUG_DRIVER("CMD: Rejected command 0x%08X for bitmask 0x%08X (exp=0x%08X act=0x%08X) (ring=%d)\n", *cmd, diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 8aed80f..2d1d2ef 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1829,11 +1829,17 @@ struct drm_i915_cmd_descriptor { * compared against an expected value. If the command does not match * the expected value, the parser rejects it. Only valid if flags has * the CMD_DESC_BITMASK bit set. + * + * If the check specifies a non-zero condition_mask then the parser + * only performs the check when the bits specified by condition_mask + * are non-zero. */ struct { u32 offset; u32 mask; u32 expected; + u32 condition_offset; + u32 condition_mask; } bits[MAX_CMD_DESC_BITMASKS]; /** Number of valid entries in the bits array */ int bits_count; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index c2e4898..ff263f4 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -179,6 +179,8 @@ * Memory interface instructions used by the kernel */ #define MI_INSTR(opcode, flags) (((opcode) << 23) | (flags)) +/* Many MI commands use bit 22 of the header dword for GGTT vs PPGTT */ +#define MI_GLOBAL_GTT (1<<22) #define MI_NOOP MI_INSTR(0, 0) #define MI_USER_INTERRUPT MI_INSTR(0x02, 0) @@ -258,6 +260,7 @@ #define MI_FLUSH_DW_STORE_INDEX (1<<21) #define MI_INVALIDATE_TLB (1<<18) #define MI_FLUSH_DW_OP_STOREDW (1<<14) +#define MI_FLUSH_DW_OP_MASK (3<<14) #define MI_FLUSH_DW_NOTIFY (1<<8) #define MI_INVALIDATE_BSD (1<<7) #define MI_FLUSH_DW_USE_GTT (1<<2) @@ -324,6 +327,7 @@ #define PIPE_CONTROL_CS_STALL (1<<20) #define PIPE_CONTROL_TLB_INVALIDATE (1<<18) #define PIPE_CONTROL_QW_WRITE (1<<14) +#define PIPE_CONTROL_POST_SYNC_OP_MASK (3<<14) #define PIPE_CONTROL_DEPTH_STALL (1<<13) #define PIPE_CONTROL_WRITE_FLUSH (1<<12) #define PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH (1<<12) /* gen6+ */ @@ -352,6 +356,8 @@ #define MI_URB_CLEAR MI_INSTR(0x19, 0) #define MI_UPDATE_GTT MI_INSTR(0x23, 0) #define MI_CLFLUSH MI_INSTR(0x27, 0) +#define MI_REPORT_PERF_COUNT MI_INSTR(0x28, 0) +#define MI_REPORT_PERF_COUNT_GGTT (1<<0) #define MI_LOAD_REGISTER_MEM MI_INSTR(0x29, 0) #define MI_LOAD_REGISTER_REG MI_INSTR(0x2A, 0) #define MI_RS_STORE_DATA_IMM MI_INSTR(0x2B, 0)