Message ID | 1371477978-25440-1-git-send-email-ajax@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Jun 17, 2013 at 10:06 AM, Adam Jackson <ajax@redhat.com> wrote: > At least on an IBM Power 720, this check passes, but several piglit > tests will reliably trigger GPU resets due to the ring buffer pointers > not being updated. There's probably a better way to limit this to just > affected machines though. What radeon chips are you seeing this on? wb is more or less required on r6xx and newer and I'm not sure those generations will even work properly without writeback enabled these days. We force it to always be enabled on APUs and NI and newer asics. With KMS, wb encompasses more than just rptr writeback; it covers pretty much everything involving the GPU writing any status information to memory. Alex > > Signed-off-by: Adam Jackson <ajax@redhat.com> > --- > drivers/gpu/drm/radeon/r600_cp.c | 7 +++++++ > drivers/gpu/drm/radeon/radeon_cp.c | 7 +++++++ > drivers/gpu/drm/radeon/radeon_device.c | 4 ++-- > drivers/gpu/drm/radeon/radeon_drv.c | 2 +- > 4 files changed, 17 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/radeon/r600_cp.c b/drivers/gpu/drm/radeon/r600_cp.c > index 1c51c08..ef28532 100644 > --- a/drivers/gpu/drm/radeon/r600_cp.c > +++ b/drivers/gpu/drm/radeon/r600_cp.c > @@ -552,6 +552,13 @@ static void r600_test_writeback(drm_radeon_private_t *dev_priv) > dev_priv->writeback_works = 0; > DRM_INFO("writeback test failed\n"); > } > +#if defined(__ppc__) || defined(__ppc64__) > + /* the test might succeed on ppc, but it's usually not reliable */ > + if (radeon_no_wb == -1) { > + radeon_no_wb = 1; > + DRM_INFO("not trusting writeback test due to arch quirk\n"); > + } > +#endif > if (radeon_no_wb == 1) { > dev_priv->writeback_works = 0; > DRM_INFO("writeback forced off\n"); > diff --git a/drivers/gpu/drm/radeon/radeon_cp.c b/drivers/gpu/drm/radeon/radeon_cp.c > index efc4f64..a967b33 100644 > --- a/drivers/gpu/drm/radeon/radeon_cp.c > +++ b/drivers/gpu/drm/radeon/radeon_cp.c > @@ -892,6 +892,13 @@ static void radeon_test_writeback(drm_radeon_private_t * dev_priv) > dev_priv->writeback_works = 0; > DRM_INFO("writeback test failed\n"); > } > +#if defined(__ppc__) || defined(__ppc64__) > + /* the test might succeed on ppc, but it's usually not reliable */ > + if (radeon_no_wb == -1) { > + radeon_no_wb = 1; > + DRM_INFO("not trusting writeback test due to arch quirk\n"); > + } > +#endif > if (radeon_no_wb == 1) { > dev_priv->writeback_works = 0; > DRM_INFO("writeback forced off\n"); > diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c > index 1899738..524046e 100644 > --- a/drivers/gpu/drm/radeon/radeon_device.c > +++ b/drivers/gpu/drm/radeon/radeon_device.c > @@ -322,8 +322,8 @@ int radeon_wb_init(struct radeon_device *rdev) > /* disable event_write fences */ > rdev->wb.use_event = false; > /* disabled via module param */ > - if (radeon_no_wb == 1) { > - rdev->wb.enabled = false; > + if (radeon_no_wb != -1) { > + rdev->wb.enabled = !!radeon_no_wb; > } else { > if (rdev->flags & RADEON_IS_AGP) { > /* often unreliable on AGP */ > diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c > index 094e7e5..04809d4 100644 > --- a/drivers/gpu/drm/radeon/radeon_drv.c > +++ b/drivers/gpu/drm/radeon/radeon_drv.c > @@ -146,7 +146,7 @@ static inline void radeon_register_atpx_handler(void) {} > static inline void radeon_unregister_atpx_handler(void) {} > #endif > > -int radeon_no_wb; > +int radeon_no_wb = -1; > int radeon_modeset = -1; > int radeon_dynclks = -1; > int radeon_r4xx_atom = 0; > -- > 1.8.2.1 > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel
On Mon, 2013-06-17 at 11:04 -0400, Alex Deucher wrote: > On Mon, Jun 17, 2013 at 10:06 AM, Adam Jackson <ajax@redhat.com> wrote: > > At least on an IBM Power 720, this check passes, but several piglit > > tests will reliably trigger GPU resets due to the ring buffer pointers > > not being updated. There's probably a better way to limit this to just > > affected machines though. > > What radeon chips are you seeing this on? wb is more or less required > on r6xx and newer and I'm not sure those generations will even work > properly without writeback enabled these days. We force it to always > be enabled on APUs and NI and newer asics. With KMS, wb encompasses > more than just rptr writeback; it covers pretty much everything > involving the GPU writing any status information to memory. FirePro 2270, at least. Booting with no_wb=1, piglit runs to completion with no GPU resets or IB submission failures. Booting without no_wb, the following piglits go from pass to fail, all complaining that the kernel rejected CS: no-wb wb (info) (info) All 7598/9587 7403/9553 glean 362/385 361/385 pointAtten pass fail texture_srgb pass fail shaders 477/533 441/533 glsl-algebraic-add-add-1 pass fail glsl-algebraic-add-add-2 pass fail glsl-algebraic-add-add-3 pass fail glsl-algebraic-sub-zero-3 pass fail glsl-algebraic-sub-zero-4 pass fail glsl-complex-subscript pass fail glsl-copy-propagation-if-1 pass fail glsl-copy-propagation-if-2 pass fail glsl-copy-propagation-if-3 pass fail glsl-copy-propagation-vector-indexing pass fail glsl-fs-atan-2 pass fail glsl-fs-dot-vec2-2 pass fail glsl-fs-log2 pass fail glsl-fs-main-return pass fail glsl-fs-max-3 pass fail glsl-fs-min-2 pass fail glsl-fs-min-3 pass fail glsl-fs-statevar-call pass fail glsl-fs-struct-equal pass fail glsl-function-chain16 pass fail glsl-implicit-conversion-02 pass fail glsl-inout-struct-01 pass fail glsl-inout-struct-02 pass fail glsl-link-varying-TexCoord pass fail glsl-link-varyings-2 pass fail glsl-uniform-initializer-4 pass fail glsl-uniform-initializer-6 pass fail glsl-uniform-initializer-7 pass fail glsl-vs-abs-neg-with-intermediate pass fail glsl-vs-clamp-1 pass fail glsl-vs-deadcode-1 pass fail glsl-vs-deadcode-2 pass fail glsl-vs-f2b pass fail glsl-vs-position-outval pass fail link-uniform-array-size pass fail loopfunc pass fail spec 5921/7801 5763/7767 !OpenGL 1.1 118/229 101/219 depthstencil-default_fb-clear pass fail getteximage-simple pass fail texwrap formats 27/50 12/40 GL_LUMINANCE12 pass fail GL_LUMINANCE16 pass fail GL_LUMINANCE4 pass fail GL_LUMINANCE4_ALPHA4 pass fail GL_LUMINANCE8 pass fail !OpenGL 1.4 11/13 9/13 triangle-rasterization pass fail triangle-rasterization-fbo pass fail ARB_depth_buffer_float 9/63 6/63 fbo-depth-GL_DEPTH32F_STENCIL8-clear pass fail fbo-depth-GL_DEPTH_COMPONENT32F-clear pass fail fbo-stencil-GL_DEPTH32F_STENCIL8-clear pass fail ARB_depth_texture 11/53 10/53 fbo-depth-GL_DEPTH_COMPONENT32-clear pass fail ARB_fragment_program 23/28 22/28 kil-swizzle pass fail ARB_framebuffer_object 14/26 13/26 fbo-deriv pass fail ARB_framebuffer_sRGB 80/81 65/81 blit renderbuffer linear downsample disabled pass fail blit renderbuffer linear msaa enabled pass fail blit renderbuffer linear single_sampled enabled pass fail blit renderbuffer linear_to_srgb scaled disabled pass fail blit renderbuffer srgb msaa disabled pass fail blit renderbuffer srgb msaa enabled pass fail blit renderbuffer srgb scaled enabled pass fail blit renderbuffer srgb single_sampled enabled pass fail blit renderbuffer srgb upsample enabled pass fail blit renderbuffer srgb_to_linear msaa disabled pass fail blit renderbuffer srgb_to_linear msaa enabled pass fail blit texture linear scaled enabled pass fail blit texture linear_to_srgb upsample disabled pass fail blit texture srgb msaa disabled pass fail blit texture srgb msaa enabled pass fail ARB_map_buffer_range 8/8 6/8 MAP_INVALIDATE_RANGE_BIT decrement-offset pass fail MAP_INVALIDATE_RANGE_BIT offset=0 pass fail ARB_sampler_objects 1/4 0/4 sampler-objects pass fail ARB_shading_language_packing 10/10 7/10 execution 10/10 7/10 built-in-functions 10/10 7/10 const-packSnorm4x8 pass fail const-unpackSnorm2x16 pass fail const-unpackUnorm2x16 pass fail ARB_texture_compression 25/40 22/40 texwrap formats bordercolor-swizzled 3/6 0/6 GL_COMPRESSED_INTENSITY, swizzled, border color only pass fail GL_COMPRESSED_LUMINANCE, swizzled, border color only pass fail GL_COMPRESSED_LUMINANCE_ALPHA, swizzled, border color pass fail only ARB_texture_cube_map 7/8 6/8 cubemap pass fail ARB_texture_rectangle 11/21 10/21 texrect_simple_arb_texrect pass fail ARB_texture_rg 42/113 19/97 texwrap formats-int 24/28 0/12 GL_R16UI pass fail GL_R32UI pass fail GL_R8I pass fail GL_R8UI pass fail GL_RG16UI pass fail GL_RG32UI pass fail GL_RG8I pass fail GL_RG8UI pass fail ARB_texture_storage 1/1 0/1 texture-storage pass fail ARB_timer_query 1/3 0/3 query GL_TIMESTAMP pass fail ATI_draw_buffers 3/3 2/3 arbfp-no-option pass fail EXT_framebuffer_multisample 267/287 259/287 bitmap 4 pass fail bitmap 6 pass fail bitmap 8 pass fail clear 2 depth pass fail clear 6 color pass fail clear 6 depth pass fail clear 6 stencil pass fail draw-buffers-alpha-to-one 4 pass fail EXT_framebuffer_object 175/289 173/289 fbo-fragcoord2 pass fail fbo-generatemipmap-filtering pass fail fbo-generatemipmap-formats 2/76 1/76 GL_LUMINANCE4_ALPHA4 pass fail fbo-readpixels-depth-formats 14/24 16/24 GL_DEPTH24_STENCIL8_EXT 3/4 2/4 GL_UNSIGNED_INT pass fail fbo-stencil-GL_STENCIL_INDEX16-clear pass fail EXT_packed_depth_stencil 13/58 11/58 fbo-depthstencil-GL_DEPTH24_STENCIL8-clear pass fail fbo-stencil-GL_DEPTH24_STENCIL8-clear pass fail EXT_texture_compression_rgtc 31/39 11/31 fbo-generatemipmap-formats 8/8 4/8 GL_COMPRESSED_RED pass fail GL_COMPRESSED_RED_GREEN_RGTC2_EXT pass fail GL_COMPRESSED_RED_RGTC1_EXT pass fail GL_COMPRESSED_RG pass fail texwrap formats 12/12 0/4 GL_COMPRESSED_RED_RGTC1 pass fail GL_COMPRESSED_RG_RGTC2 pass fail GL_COMPRESSED_SIGNED_RED_RGTC1 pass fail GL_COMPRESSED_SIGNED_RG_RGTC2 pass fail texwrap formats bordercolor 4/4 0/4 GL_COMPRESSED_RED_RGTC1, border color only pass fail GL_COMPRESSED_RG_RGTC2, border color only pass fail GL_COMPRESSED_SIGNED_RED_RGTC1, border color only pass fail GL_COMPRESSED_SIGNED_RG_RGTC2, border color only pass fail EXT_transform_feedback 45/279 44/279 discard-bitmap pass fail glsl-1.10 1540/1636 1529/1636 execution 1295/1391 1286/1391 fs-inline-notequal pass fail fs-saturate-pow pass fail maximums 12/12 8/12 gl_MaxCombinedTextureImageUnits pass fail gl_MaxFragmentUniformComponents pass fail gl_MaxTextureImageUnits pass fail gl_MaxVertexAttribs pass fail vs-mat2-array-assignment pass fail vs-saturate-exp2 pass fail vs-saturate-pow pass fail linker 18/18 16/18 access-builtin-global-from-fn-unknown-to-main pass fail override-builtin-uniform-01 pass fail glsl-1.20 2124/2183 2097/2183 execution 788/845 761/845 maximums 12/12 7/12 gl_MaxClipPlanes pass fail gl_MaxCombinedTextureImageUnits pass fail gl_MaxTextureCoords pass fail gl_MaxVaryingFloats pass fail gl_MaxVertexTextureImageUnits pass fail uniform-initializer 64/64 44/64 fs-bool-set-by-other-stage pass fail fs-float-array pass fail fs-float-from-const pass fail fs-float-set-by-other-stage pass fail fs-int pass fail fs-int-from-const pass fail fs-int-set-by-other-stage pass fail fs-mat3-set-by-other-stage pass fail fs-mat4-from-const pass fail fs-mat4-set-by-API pass fail fs-mat4-set-by-other-stage pass fail fs-structure-array pass fail vs-bool-from-const pass fail vs-float pass fail vs-float-array pass fail vs-mat2 pass fail vs-mat3 pass fail vs-mat4 pass fail vs-mat4-from-const pass fail vs-mat4-set-by-API pass fail vs-all-equal-bool-array pass fail vs-assign-varied-struct pass fail glsl-1.30 468/604 455/604 execution 465/601 452/601 clipping 20/20 18/20 fs-clip-distance-interpolated pass fail vs-clip-distance-explicitly-sized pass fail fs-discard-exit-1 pass fail fs-discard-exit-2 pass fail fs-increment-int pass fail fs-multiply-const-ivec4 pass fail maximums 13/13 9/13 gl_MaxClipPlanes pass fail gl_MaxCombinedTextureImageUnits pass fail gl_MaxTextureImageUnits pass fail gl_MaxVaryingFloats pass fail qualifiers 1/1 0/1 vs-out-conversion-ivec4-to-vec4 pass fail uniform-initializer 8/8 6/8 fs-uint-from-const pass fail vs-uint pass fail - ajax
On Mon, Jun 17, 2013 at 12:07 PM, Adam Jackson <ajax@redhat.com> wrote: > On Mon, 2013-06-17 at 11:04 -0400, Alex Deucher wrote: >> On Mon, Jun 17, 2013 at 10:06 AM, Adam Jackson <ajax@redhat.com> wrote: >> > At least on an IBM Power 720, this check passes, but several piglit >> > tests will reliably trigger GPU resets due to the ring buffer pointers >> > not being updated. There's probably a better way to limit this to just >> > affected machines though. >> >> What radeon chips are you seeing this on? wb is more or less required >> on r6xx and newer and I'm not sure those generations will even work >> properly without writeback enabled these days. We force it to always >> be enabled on APUs and NI and newer asics. With KMS, wb encompasses >> more than just rptr writeback; it covers pretty much everything >> involving the GPU writing any status information to memory. > > FirePro 2270, at least. Booting with no_wb=1, piglit runs to completion > with no GPU resets or IB submission failures. Booting without no_wb, > the following piglits go from pass to fail, all complaining that the > kernel rejected CS: Weird. I wonder if there is an issue with cache snoops on PPC. We currently use the gart in cached mode (GPU snoops CPU cache) with cached pages. I wonder if we need to use uncached pages on PPC. Alex
On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote: > Weird. I wonder if there is an issue with cache snoops on PPC. We > currently use the gart in cached mode (GPU snoops CPU cache) with > cached pages. I wonder if we need to use uncached pages on PPC. There is no such issue and no known bugs with DMA writes on those PCIe host bridges (and they do get hammered pretty bad here). This needs further investigation by the lab/hw guys to find out what's actually happening on the bus and the host bridge. Thadeu, Kleber: Jerome suggested writing a test case in userspace that continuously writes to a spare scratch register (thus triggering the corresponding writeback DMA) and checks the memory location to compare the writeback value (using a debugfs file for example, or mmap). Can you guys do something like that ? Then we need the analyzer on and/or the lab guys to look at the fabric trace & PHB trace. Ben.
On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote: > On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote: > >> Weird. I wonder if there is an issue with cache snoops on PPC. We >> currently use the gart in cached mode (GPU snoops CPU cache) with >> cached pages. I wonder if we need to use uncached pages on PPC. > > There is no such issue and no known bugs with DMA writes on those > PCIe host bridges (and they do get hammered pretty bad here). > > This needs further investigation by the lab/hw guys to find out what's > actually happening on the bus and the host bridge. > > Thadeu, Kleber: Jerome suggested writing a test case in userspace that > continuously writes to a spare scratch register (thus triggering the > corresponding writeback DMA) and checks the memory location to compare > the writeback value (using a debugfs file for example, or mmap). > I can look into that. Thanks, Kleber > Can you guys do something like that ? Then we need the analyzer on > and/or the lab guys to look at the fabric trace & PHB trace. > > Ben. > >
On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote: > On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote: > > On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote: > > > >> Weird. I wonder if there is an issue with cache snoops on PPC. We > >> currently use the gart in cached mode (GPU snoops CPU cache) with > >> cached pages. I wonder if we need to use uncached pages on PPC. > > > > There is no such issue and no known bugs with DMA writes on those > > PCIe host bridges (and they do get hammered pretty bad here). > > > > This needs further investigation by the lab/hw guys to find out what's > > actually happening on the bus and the host bridge. > > > > Thadeu, Kleber: Jerome suggested writing a test case in userspace that > > continuously writes to a spare scratch register (thus triggering the > > corresponding writeback DMA) and checks the memory location to compare > > the writeback value (using a debugfs file for example, or mmap). > > > > I can look into that. Any news ? Ben. > > Thanks, > > Kleber > > > Can you guys do something like that ? Then we need the analyzer on > > and/or the lab guys to look at the fabric trace & PHB trace. > > > > Ben. > > > > > >
On 11/24/2013 09:15 PM, Benjamin Herrenschmidt wrote: > On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote: >> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote: >>> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote: >>> >>>> Weird. I wonder if there is an issue with cache snoops on PPC. We >>>> currently use the gart in cached mode (GPU snoops CPU cache) with >>>> cached pages. I wonder if we need to use uncached pages on PPC. >>> There is no such issue and no known bugs with DMA writes on those >>> PCIe host bridges (and they do get hammered pretty bad here). >>> >>> This needs further investigation by the lab/hw guys to find out what's >>> actually happening on the bus and the host bridge. >>> >>> Thadeu, Kleber: Jerome suggested writing a test case in userspace that >>> continuously writes to a spare scratch register (thus triggering the >>> corresponding writeback DMA) and checks the memory location to compare >>> the writeback value (using a debugfs file for example, or mmap). >>> >> I can look into that. > Any news ? Only now I've got the bandwidth to actually take a look at this. I will start to write some code and let you guys know of the results. Thanks, Kleber > > Ben. > >> Thanks, >> >> Kleber >> >>> Can you guys do something like that ? Then we need the analyzer on >>> and/or the lab guys to look at the fabric trace & PHB trace. >>> >>> Ben. >>> >>> >> >
On 11/25/2013 10:11 PM, Kleber Sacilotto de Souza wrote: > On 11/24/2013 09:15 PM, Benjamin Herrenschmidt wrote: >> On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote: >>> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote: >>>> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote: >>>> >>>>> Weird. I wonder if there is an issue with cache snoops on PPC. We >>>>> currently use the gart in cached mode (GPU snoops CPU cache) with >>>>> cached pages. I wonder if we need to use uncached pages on PPC. >>>> There is no such issue and no known bugs with DMA writes on those >>>> PCIe host bridges (and they do get hammered pretty bad here). >>>> >>>> This needs further investigation by the lab/hw guys to find out what's >>>> actually happening on the bus and the host bridge. >>>> >>>> Thadeu, Kleber: Jerome suggested writing a test case in userspace that >>>> continuously writes to a spare scratch register (thus triggering the >>>> corresponding writeback DMA) and checks the memory location to compare >>>> the writeback value (using a debugfs file for example, or mmap). I was not able to reproduce the issue with this method, even after a weekend run. However, doing some more investigation it seems the problem is here, where we read the ring rptr: u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev, struct radeon_ring *ring) { u32 rptr; if (rdev->wb.enabled) rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]); else rptr = RREG32(ring->rptr_reg); return rptr; } I realized that the DMA'ed rptr value has always the opposite byte order from the MMIO value. Since RREG32 already returns the register value on the CPU byte order, it seems we don't need to byte-swap the DMA'ed value. If I remove the le32_to_cpu() call and use the DMA'ed value directly, I don't get the IB scheduling failures and piglit results are the same as with writeback disabled. Is the adapter chipset swapping the bytes before doing the DMA to a big-endian host?
On Wed, Dec 4, 2013 at 5:16 PM, Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> wrote: > On 11/25/2013 10:11 PM, Kleber Sacilotto de Souza wrote: >> >> On 11/24/2013 09:15 PM, Benjamin Herrenschmidt wrote: >>> >>> On Fri, 2013-11-08 at 11:43 -0200, Kleber Sacilotto de Souza wrote: >>>> >>>> On 11/07/2013 08:29 PM, Benjamin Herrenschmidt wrote: >>>>> >>>>> On Mon, 2013-06-17 at 18:57 -0400, Alex Deucher wrote: >>>>> >>>>>> Weird. I wonder if there is an issue with cache snoops on PPC. We >>>>>> currently use the gart in cached mode (GPU snoops CPU cache) with >>>>>> cached pages. I wonder if we need to use uncached pages on PPC. >>>>> >>>>> There is no such issue and no known bugs with DMA writes on those >>>>> PCIe host bridges (and they do get hammered pretty bad here). >>>>> >>>>> This needs further investigation by the lab/hw guys to find out what's >>>>> actually happening on the bus and the host bridge. >>>>> >>>>> Thadeu, Kleber: Jerome suggested writing a test case in userspace that >>>>> continuously writes to a spare scratch register (thus triggering the >>>>> corresponding writeback DMA) and checks the memory location to compare >>>>> the writeback value (using a debugfs file for example, or mmap). > > > I was not able to reproduce the issue with this method, even after a weekend > run. > > However, doing some more investigation it seems the problem is here, where > we read the ring rptr: > > u32 radeon_ring_generic_get_rptr(struct radeon_device *rdev, > struct radeon_ring *ring) > { > u32 rptr; > > if (rdev->wb.enabled) > rptr = le32_to_cpu(rdev->wb.wb[ring->rptr_offs/4]); > else > rptr = RREG32(ring->rptr_reg); > > return rptr; > } > > I realized that the DMA'ed rptr value has always the opposite byte order > from the MMIO value. Since RREG32 already returns the register value on the > CPU byte order, it seems we don't need to byte-swap the DMA'ed value. If I > remove the le32_to_cpu() call and use the DMA'ed value directly, I don't get > the IB scheduling failures and piglit results are the same as with writeback > disabled. > > Is the adapter chipset swapping the bytes before doing the DMA to a > big-endian host? Setting CP_RB_CNTL.BUF_SWAP causes the CP to use the selected byte swapping for just about everything accessed by the CP (rptr writeback, indirect buffers, etc.). Looks like the DMA ring supports and enables rptr writeback as well (DMA_RB_CNTL.DMA_RPTR_WRITEBACK_SWAP_ENABLE) so I think we can drop the swapping of the rptr writeback. Alex > > > > -- > Kleber Sacilotto de Souza > IBM Linux Technology Center >
diff --git a/drivers/gpu/drm/radeon/r600_cp.c b/drivers/gpu/drm/radeon/r600_cp.c index 1c51c08..ef28532 100644 --- a/drivers/gpu/drm/radeon/r600_cp.c +++ b/drivers/gpu/drm/radeon/r600_cp.c @@ -552,6 +552,13 @@ static void r600_test_writeback(drm_radeon_private_t *dev_priv) dev_priv->writeback_works = 0; DRM_INFO("writeback test failed\n"); } +#if defined(__ppc__) || defined(__ppc64__) + /* the test might succeed on ppc, but it's usually not reliable */ + if (radeon_no_wb == -1) { + radeon_no_wb = 1; + DRM_INFO("not trusting writeback test due to arch quirk\n"); + } +#endif if (radeon_no_wb == 1) { dev_priv->writeback_works = 0; DRM_INFO("writeback forced off\n"); diff --git a/drivers/gpu/drm/radeon/radeon_cp.c b/drivers/gpu/drm/radeon/radeon_cp.c index efc4f64..a967b33 100644 --- a/drivers/gpu/drm/radeon/radeon_cp.c +++ b/drivers/gpu/drm/radeon/radeon_cp.c @@ -892,6 +892,13 @@ static void radeon_test_writeback(drm_radeon_private_t * dev_priv) dev_priv->writeback_works = 0; DRM_INFO("writeback test failed\n"); } +#if defined(__ppc__) || defined(__ppc64__) + /* the test might succeed on ppc, but it's usually not reliable */ + if (radeon_no_wb == -1) { + radeon_no_wb = 1; + DRM_INFO("not trusting writeback test due to arch quirk\n"); + } +#endif if (radeon_no_wb == 1) { dev_priv->writeback_works = 0; DRM_INFO("writeback forced off\n"); diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index 1899738..524046e 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -322,8 +322,8 @@ int radeon_wb_init(struct radeon_device *rdev) /* disable event_write fences */ rdev->wb.use_event = false; /* disabled via module param */ - if (radeon_no_wb == 1) { - rdev->wb.enabled = false; + if (radeon_no_wb != -1) { + rdev->wb.enabled = !!radeon_no_wb; } else { if (rdev->flags & RADEON_IS_AGP) { /* often unreliable on AGP */ diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 094e7e5..04809d4 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -146,7 +146,7 @@ static inline void radeon_register_atpx_handler(void) {} static inline void radeon_unregister_atpx_handler(void) {} #endif -int radeon_no_wb; +int radeon_no_wb = -1; int radeon_modeset = -1; int radeon_dynclks = -1; int radeon_r4xx_atom = 0;
At least on an IBM Power 720, this check passes, but several piglit tests will reliably trigger GPU resets due to the ring buffer pointers not being updated. There's probably a better way to limit this to just affected machines though. Signed-off-by: Adam Jackson <ajax@redhat.com> --- drivers/gpu/drm/radeon/r600_cp.c | 7 +++++++ drivers/gpu/drm/radeon/radeon_cp.c | 7 +++++++ drivers/gpu/drm/radeon/radeon_device.c | 4 ++-- drivers/gpu/drm/radeon/radeon_drv.c | 2 +- 4 files changed, 17 insertions(+), 3 deletions(-)