Message ID | 1365427720-10926-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 04/08/2013 06:28 AM, Chris Wilson wrote: > Enabling context support increases SwapBuffers latency by about 20% > (measured on an i7-3720qm). We can offset that loss slightly by enabling > faster caching for the contexts. As they are not backed by any > particular cache (such as the sampler or render caches) our only option > is to select the generic mid-level cache. This reduces the latency of > the swap by about 5%. > > Oddly this effect can be observed running smokin-guns on IVB at > 1280x1024: > Using BLT copies for swaps: 151.67 fps > Using Render copies for swaps (unpatched): 141.70 fps > With contexts disabled: 150.23 fps > With contexts in L3$: 150.77 fps > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Ben Widawsky <ben@bwidawsk.net> > Cc: Kenneth Graunke <kenneth@whitecape.org> > --- > drivers/gpu/drm/i915/i915_gem_context.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > index 94d873a..a1e8ecb 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -152,6 +152,13 @@ create_hw_context(struct drm_device *dev, > return ERR_PTR(-ENOMEM); > } > > + if (INTEL_INFO(dev)->gen >= 7) { > + ret = i915_gem_object_set_cache_level(ctx->obj, > + I915_CACHE_LLC_MLC); > + if (ret) > + goto err_out; > + } > + > /* The ring associated with the context object is handled by the normal > * object tracking code. We give an initial ring value simple to pass an > * assertion in the context switch code. Sounds good to me, Chris. Is this also useful on Sandybridge? I don't see why we couldn't use both L3 & LLC there as well. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> On a tangent, it would be nice to clean up the I915_CACHE_* enums. The term "MLC" isn't really used in the modern documentation, and there's even text that says "Ivybridge doesn't have MLC". But what it really means here is LLC + L3. It's probably not worth reworking until we can send out the new Haswell bits, though. Thanks for doing this! --Ken
On Mon, Apr 08, 2013 at 07:57:40AM -0700, Kenneth Graunke wrote: > On 04/08/2013 06:28 AM, Chris Wilson wrote: > >Enabling context support increases SwapBuffers latency by about 20% > >(measured on an i7-3720qm). We can offset that loss slightly by enabling > >faster caching for the contexts. As they are not backed by any > >particular cache (such as the sampler or render caches) our only option > >is to select the generic mid-level cache. This reduces the latency of > >the swap by about 5%. > > > >Oddly this effect can be observed running smokin-guns on IVB at > >1280x1024: > >Using BLT copies for swaps: 151.67 fps > >Using Render copies for swaps (unpatched): 141.70 fps > >With contexts disabled: 150.23 fps > >With contexts in L3$: 150.77 fps > > > >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > >Cc: Ben Widawsky <ben@bwidawsk.net> > >Cc: Kenneth Graunke <kenneth@whitecape.org> > >--- > > drivers/gpu/drm/i915/i915_gem_context.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > >diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > >index 94d873a..a1e8ecb 100644 > >--- a/drivers/gpu/drm/i915/i915_gem_context.c > >+++ b/drivers/gpu/drm/i915/i915_gem_context.c > >@@ -152,6 +152,13 @@ create_hw_context(struct drm_device *dev, > > return ERR_PTR(-ENOMEM); > > } > > > >+ if (INTEL_INFO(dev)->gen >= 7) { > >+ ret = i915_gem_object_set_cache_level(ctx->obj, > >+ I915_CACHE_LLC_MLC); > >+ if (ret) > >+ goto err_out; > >+ } > >+ > > /* The ring associated with the context object is handled by the normal > > * object tracking code. We give an initial ring value simple to pass an > > * assertion in the context switch code. > > Sounds good to me, Chris. Is this also useful on Sandybridge? I > don't see why we couldn't use both L3 & LLC there as well. Snb doesn't have gpu l3 afaik. > > Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> > > On a tangent, it would be nice to clean up the I915_CACHE_* enums. > The term "MLC" isn't really used in the modern documentation, and > there's even text that says "Ivybridge doesn't have MLC". But what > it really means here is LLC + L3. It's probably not worth reworking > until we can send out the new Haswell bits, though. Yeah, my idea is that iternally we split this into cache_level + flags, where cache_level controls how coherency works (i.e. where we have to clflush and do similar magic) and the flags do the fine-tuning (like enabling l3 caching). But like you've said, can wait for more hsw magic (or gfdt) landing. Queued for -next, thanks for the patch. -Daniel
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 94d873a..a1e8ecb 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -152,6 +152,13 @@ create_hw_context(struct drm_device *dev, return ERR_PTR(-ENOMEM); } + if (INTEL_INFO(dev)->gen >= 7) { + ret = i915_gem_object_set_cache_level(ctx->obj, + I915_CACHE_LLC_MLC); + if (ret) + goto err_out; + } + /* The ring associated with the context object is handled by the normal * object tracking code. We give an initial ring value simple to pass an * assertion in the context switch code.
Enabling context support increases SwapBuffers latency by about 20% (measured on an i7-3720qm). We can offset that loss slightly by enabling faster caching for the contexts. As they are not backed by any particular cache (such as the sampler or render caches) our only option is to select the generic mid-level cache. This reduces the latency of the swap by about 5%. Oddly this effect can be observed running smokin-guns on IVB at 1280x1024: Using BLT copies for swaps: 151.67 fps Using Render copies for swaps (unpatched): 141.70 fps With contexts disabled: 150.23 fps With contexts in L3$: 150.77 fps Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Kenneth Graunke <kenneth@whitecape.org> --- drivers/gpu/drm/i915/i915_gem_context.c | 7 +++++++ 1 file changed, 7 insertions(+)