[radeon,r100] when ring test fails, provide users with option to test

Message ID 20151128205840.GA26757@amd (mailing list archive)
State New, archived

Commit Message

Pavel Machek Nov. 28, 2015, 8:58 p.m. UTC
Ring test failure is often caused by too high agpmode. Tell the user
what to try.

Signed-off-by: Pavel Machek <pavel@ucw.cz>

Comments

Christian König Nov. 29, 2015, 7:48 p.m. UTC | #1
On 28.11.2015 21:58, Pavel Machek wrote:
> Ring test failure is often caused by too high agpmode. Tell the user
> what to try.
>
> Signed-off-by: Pavel Machek <pavel@ucw.cz>

NAK, the ring test can fail for any number of reasons and the agpmode is 
actually rather unlikely to be the cause.

Regards,
Christian.

>
> diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
> index 238b13f..32b1917 100644
> --- a/drivers/gpu/drm/radeon/r100.c
> +++ b/drivers/gpu/drm/radeon/r100.c
> @@ -3665,7 +3665,7 @@ int r100_ring_test(struct radeon_device *rdev, struct radeon_ring *ring)
>   	if (i < rdev->usec_timeout) {
>   		DRM_INFO("ring test succeeded in %d usecs\n", i);
>   	} else {
> -		DRM_ERROR("radeon: ring test failed (scratch(0x%04X)=0x%08X)\n",
> +		DRM_ERROR("radeon: ring test failed (scratch(0x%04X)=0x%08X), try radeon.agpmode=1?\n",
>   			  scratch, tmp);
>   		r = -EINVAL;
>   	}
>
Pavel Machek Nov. 29, 2015, 10:22 p.m. UTC | #2
On Sun 2015-11-29 20:48:53, Christian König wrote:
> On 28.11.2015 21:58, Pavel Machek wrote:
> >Ring test failure is often caused by too high agpmode. Tell the user
> >what to try.
> >
> >Signed-off-by: Pavel Machek <pavel@ucw.cz>
> 
> NAK, the ring test can fail for any number of reasons and the agpmode is
> actually rather unlikely to be the cause.

Well, when I asked on the list "why did this happen" I got an "umm,
no one knows" response that was not exactly helpful. And then someone
told me about agpmode.

If you know about the reasons it can fail, could you list them near
the DRM_ERROR, at least as a comment?

Thanks,
									Pavel

> Regards,
> Christian.
> 
> >
> >diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
> >index 238b13f..32b1917 100644
> >--- a/drivers/gpu/drm/radeon/r100.c
> >+++ b/drivers/gpu/drm/radeon/r100.c
> >@@ -3665,7 +3665,7 @@ int r100_ring_test(struct radeon_device *rdev, struct radeon_ring *ring)
> >  	if (i < rdev->usec_timeout) {
> >  		DRM_INFO("ring test succeeded in %d usecs\n", i);
> >  	} else {
> >-		DRM_ERROR("radeon: ring test failed (scratch(0x%04X)=0x%08X)\n",
> >+		DRM_ERROR("radeon: ring test failed (scratch(0x%04X)=0x%08X), try radeon.agpmode=1?\n",
> >  			  scratch, tmp);
> >  		r = -EINVAL;
> >  	}
> >
Christian König Nov. 30, 2015, 8:39 a.m. UTC | #3
On 29.11.2015 23:22, Pavel Machek wrote:
> On Sun 2015-11-29 20:48:53, Christian König wrote:
>> On 28.11.2015 21:58, Pavel Machek wrote:
>>> Ring test failure is often caused by too high agpmode. Tell the user
>>> what to try.
>>>
>>> Signed-off-by: Pavel Machek <pavel@ucw.cz>
>> NAK, the ring test can fail for any number of reasons and the agpmode is
>> actually rather unlikely to be the cause.
> Well, when I asked on the list "why this is happened" I got "umm,
> noone knows" response that was not exactly helpful. And then someone
> told me about agpmode.
>
> If you know about the reasons it can fail, could you list them near
> the DRM_ERROR, at least as a comment?

Well as I said, that could be any number of reasons. Some of them even 
completely unrelated to the driver itself.

E.g. BIOS setting, faulty hardware, problems with the writeback etc... 
There is really not a list you could give here.

Lowering the agpmode usually helps more to prevent random corruptions 
and problems under load.

Regards,
Christian.

>
> Thanks,
> 									Pavel
>
>> Regards,
>> Christian.
>>
>>> diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
>>> index 238b13f..32b1917 100644
>>> --- a/drivers/gpu/drm/radeon/r100.c
>>> +++ b/drivers/gpu/drm/radeon/r100.c
>>> @@ -3665,7 +3665,7 @@ int r100_ring_test(struct radeon_device *rdev, struct radeon_ring *ring)
>>>   	if (i < rdev->usec_timeout) {
>>>   		DRM_INFO("ring test succeeded in %d usecs\n", i);
>>>   	} else {
>>> -		DRM_ERROR("radeon: ring test failed (scratch(0x%04X)=0x%08X)\n",
>>> +		DRM_ERROR("radeon: ring test failed (scratch(0x%04X)=0x%08X), try radeon.agpmode=1?\n",
>>>   			  scratch, tmp);
>>>   		r = -EINVAL;
>>>   	}
>>>
Pavel Machek Dec. 1, 2015, 10:01 a.m. UTC | #4
On Mon 2015-11-30 09:39:54, Christian König wrote:
> On 29.11.2015 23:22, Pavel Machek wrote:
> >On Sun 2015-11-29 20:48:53, Christian König wrote:
> >>On 28.11.2015 21:58, Pavel Machek wrote:
> >>>Ring test failure is often caused by too high agpmode. Tell the user
> >>>what to try.
> >>>
> >>>Signed-off-by: Pavel Machek <pavel@ucw.cz>
> >>NAK, the ring test can fail for any number of reasons and the agpmode is
> >>actually rather unlikely to be the cause.
> >Well, when I asked on the list "why this is happened" I got "umm,
> >noone knows" response that was not exactly helpful. And then someone
> >told me about agpmode.
> >
> >If you know about the reasons it can fail, could you list them near
> >the DRM_ERROR, at least as a comment?
> 
> Well as I said, that could be any number of reasons. Some of them even
> completely unrelated to the driver itself.
> 
> E.g. BIOS setting, faulty hardware, problems with the writeback etc... There
> is really not a list you could give here.
> 
> Lowering the agpmode usually helps more to prevent random corruptions and
> problems under load.

Take a look at

http://www.gossamer-threads.com/lists/linux/kernel/2197183

. I had a problem, you did not know how to debug it, but it already
happened to pebolle at tiscali ... and yes, it was agpmode. That
problem is clearly more common than you realize... So this should go
in.

									Pavel

> >>>--- a/drivers/gpu/drm/radeon/r100.c
> >>>+++ b/drivers/gpu/drm/radeon/r100.c
> >>>@@ -3665,7 +3665,7 @@ int r100_ring_test(struct radeon_device *rdev, struct radeon_ring *ring)
> >>>  	if (i < rdev->usec_timeout) {
> >>>  		DRM_INFO("ring test succeeded in %d usecs\n", i);
> >>>  	} else {
> >>>-		DRM_ERROR("radeon: ring test failed (scratch(0x%04X)=0x%08X)\n",
> >>>+		DRM_ERROR("radeon: ring test failed (scratch(0x%04X)=0x%08X), try radeon.agpmode=1?\n",
> >>>  			  scratch, tmp);
> >>>  		r = -EINVAL;
> >>>  	}
> >>>
Michel Dänzer Dec. 2, 2015, 3:14 a.m. UTC | #5
On 01.12.2015 19:01, Pavel Machek wrote:
> On Mon 2015-11-30 09:39:54, Christian König wrote:
>> On 29.11.2015 23:22, Pavel Machek wrote:
>>> On Sun 2015-11-29 20:48:53, Christian König wrote:
>>>> On 28.11.2015 21:58, Pavel Machek wrote:
>>>>> Ring test failure is often caused by too high agpmode. Tell the user
>>>>> what to try.
>>>>>
>>>>> Signed-off-by: Pavel Machek <pavel@ucw.cz>
>>>> NAK, the ring test can fail for any number of reasons and the agpmode is
>>>> actually rather unlikely to be the cause.
>>> Well, when I asked on the list "why this is happened" I got "umm,
>>> noone knows" response that was not exactly helpful. And then someone
>>> told me about agpmode.
>>>
>>> If you know about the reasons it can fail, could you list them near
>>> the DRM_ERROR, at least as a comment?
>>
>> Well as I said, that could be any number of reasons. Some of them even
>> completely unrelated to the driver itself.
>>
>> E.g. BIOS setting, faulty hardware, problems with the writeback etc... There
>> is really not a list you could give here.
>>
>> Lowering the agpmode usually helps more to prevent random corruptions and
>> problems under load.
> 
> Take a look at
> 
> http://www.gossamer-threads.com/lists/linux/kernel/2197183
> 
> . I had a problem, you did not know how to debug it, but it already
> happened to pebolle at tiscali ... and yes, it was agpmode. That
> problem is clearly more common then you realize... So this should go
> in.

I agree with Christian, but at the very least, agpmode must not be
mentioned if AGP isn't being used in the first place, i.e. either the
GPU isn't AGP or is being forced to PCI(e) mode.
Christian König Dec. 2, 2015, 8:57 a.m. UTC | #6
On 02.12.2015 04:14, Michel Dänzer wrote:
> On 01.12.2015 19:01, Pavel Machek wrote:
>> On Mon 2015-11-30 09:39:54, Christian König wrote:
>>> On 29.11.2015 23:22, Pavel Machek wrote:
>>>> On Sun 2015-11-29 20:48:53, Christian König wrote:
>>>>> On 28.11.2015 21:58, Pavel Machek wrote:
>>>>>> Ring test failure is often caused by too high agpmode. Tell the user
>>>>>> what to try.
>>>>>>
>>>>>> Signed-off-by: Pavel Machek <pavel@ucw.cz>
>>>>> NAK, the ring test can fail for any number of reasons and the agpmode is
>>>>> actually rather unlikely to be the cause.
>>>> Well, when I asked on the list "why this is happened" I got "umm,
>>>> noone knows" response that was not exactly helpful. And then someone
>>>> told me about agpmode.
>>>>
>>>> If you know about the reasons it can fail, could you list them near
>>>> the DRM_ERROR, at least as a comment?
>>> Well as I said, that could be any number of reasons. Some of them even
>>> completely unrelated to the driver itself.
>>>
>>> E.g. BIOS setting, faulty hardware, problems with the writeback etc... There
>>> is really not a list you could give here.
>>>
>>> Lowering the agpmode usually helps more to prevent random corruptions and
>>> problems under load.
>> Take a look at
>>
>> http://www.gossamer-threads.com/lists/linux/kernel/2197183
>>
>> . I had a problem, you did not know how to debug it, but it already
>> happened to pebolle at tiscali ... and yes, it was agpmode. That
>> problem is clearly more common then you realize... So this should go
>> in.
> I agree with Christian, but at the very least, agpmode must not be
> mentioned if AGP isn't being used in the first place, i.e. either the
> GPU isn't AGP or is being forced to PCI(e) mode.

Well maybe to explain the background, r100_ring_test() is used for a 
whole bunch of different hardware generations.

Most of them don't even have AGP, so mentioning this here would even be
confusing for the majority of users.

Regards,
Christian.

Patch

diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index 238b13f..32b1917 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -3665,7 +3665,7 @@  int r100_ring_test(struct radeon_device *rdev, struct radeon_ring *ring)
 	if (i < rdev->usec_timeout) {
 		DRM_INFO("ring test succeeded in %d usecs\n", i);
 	} else {
-		DRM_ERROR("radeon: ring test failed (scratch(0x%04X)=0x%08X)\n",
+		DRM_ERROR("radeon: ring test failed (scratch(0x%04X)=0x%08X), try radeon.agpmode=1?\n",
 			  scratch, tmp);
 		r = -EINVAL;
 	}
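
The review objection was that r100_ring_test() runs on many hardware generations, most of them without AGP, so an unconditional agpmode hint would mislead most users. A minimal sketch of the conditional variant Michel describes could look like the code below. It is a standalone illustration and not driver code: struct fake_radeon_device, its is_agp field, and format_ring_test_failure() are hypothetical names invented here. In the real driver the condition would presumably be derived from the device flags (something like RADEON_IS_AGP), which is an assumption and not something the thread confirms.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's device state.
 * This is NOT the real struct radeon_device. */
struct fake_radeon_device {
	bool is_agp;	/* true only if the GPU is actually being driven over AGP */
};

/*
 * Format the ring test failure message into buf.
 * The agpmode hint is appended only when AGP is in use, so users of
 * PCI(e) parts, or of AGP parts forced to PCI mode, never see it.
 * Returns the value of snprintf().
 */
int format_ring_test_failure(const struct fake_radeon_device *rdev,
			     unsigned int scratch, unsigned int tmp,
			     char *buf, size_t len)
{
	return snprintf(buf, len,
			"radeon: ring test failed (scratch(0x%04X)=0x%08X)%s\n",
			scratch, tmp,
			rdev->is_agp ? ", try radeon.agpmode=1?" : "");
}
```

This only addresses the "do not mention AGP on non-AGP hardware" point. It does not answer Christian's objection that agpmode is an unlikely cause of ring test failures in the first place.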