mbox series

[0/2] drm/radeon: have the callers of set_memory_*() check the return value

Message ID 20200107192555.20606-1-tli@digitalocean.com (mailing list archive)
Headers show
Series drm/radeon: have the callers of set_memory_*() check the return value | expand

Message

Tianlin Li Jan. 7, 2020, 7:25 p.m. UTC
Right now several architectures allow their set_memory_*() family of  
functions to fail, but callers may not be checking the return values.
If set_memory_*() returns with an error, call-site assumptions may be
infact wrong to assume that it would either succeed or not succeed at  
all. Ideally, the failure of set_memory_*() should be passed up the 
call stack, and callers should examine the failure and deal with it. 

Need to fix the callers and add the __must_check attribute. They also 
may not provide any level of atomicity, in the sense that the memory 
protections may be left incomplete on failure. This issue likely has a 
few steps on effects architectures:
1)Have all callers of set_memory_*() helpers check the return value.
2)Add __must_check to all set_memory_*() helpers so that new uses do  
not ignore the return value.
3)Add atomicity to the calls so that the memory protections aren't left 
in a partial state.

This series is part of step 1. Make drm/radeon check the return value of  
set_memory_*().

Tianlin Li (2):
  drm/radeon: have the callers of set_memory_*() check the return value
  drm/radeon: change call sites to handle return value properly.

 drivers/gpu/drm/radeon/r100.c        |  3 ++-
 drivers/gpu/drm/radeon/radeon.h      |  2 +-
 drivers/gpu/drm/radeon/radeon_gart.c | 22 ++++++++++++++++++----
 drivers/gpu/drm/radeon/rs400.c       |  3 ++-
 4 files changed, 23 insertions(+), 7 deletions(-)

Comments

Christian König Jan. 8, 2020, 12:56 p.m. UTC | #1
Am 07.01.20 um 20:25 schrieb Tianlin Li:
> Right now several architectures allow their set_memory_*() family of
> functions to fail, but callers may not be checking the return values.
> If set_memory_*() returns with an error, call-site assumptions may be
> infact wrong to assume that it would either succeed or not succeed at
> all. Ideally, the failure of set_memory_*() should be passed up the
> call stack, and callers should examine the failure and deal with it.
>
> Need to fix the callers and add the __must_check attribute. They also
> may not provide any level of atomicity, in the sense that the memory
> protections may be left incomplete on failure. This issue likely has a
> few steps on effects architectures:
> 1)Have all callers of set_memory_*() helpers check the return value.
> 2)Add __must_check to all set_memory_*() helpers so that new uses do
> not ignore the return value.
> 3)Add atomicity to the calls so that the memory protections aren't left
> in a partial state.
>
> This series is part of step 1. Make drm/radeon check the return value of
> set_memory_*().

I'm a little hesitate merge that. This hardware is >15 years old and 
nobody of the developers have any system left to test this change on.

Would it be to much of a problem to just add something like: r = 
set_memory_*(); (void)r; /* Intentionally ignored */.

Apart from that certainly a good idea to add __must_check to the functions.

Regards,
Christian.

>
> Tianlin Li (2):
>    drm/radeon: have the callers of set_memory_*() check the return value
>    drm/radeon: change call sites to handle return value properly.
>
>   drivers/gpu/drm/radeon/r100.c        |  3 ++-
>   drivers/gpu/drm/radeon/radeon.h      |  2 +-
>   drivers/gpu/drm/radeon/radeon_gart.c | 22 ++++++++++++++++++----
>   drivers/gpu/drm/radeon/rs400.c       |  3 ++-
>   4 files changed, 23 insertions(+), 7 deletions(-)
>
Tianlin Li Jan. 8, 2020, 4:04 p.m. UTC | #2
> On Jan 8, 2020, at 6:56 AM, Christian König <christian.koenig@amd.com> wrote:
> 
> Am 07.01.20 um 20:25 schrieb Tianlin Li:
>> Right now several architectures allow their set_memory_*() family of
>> functions to fail, but callers may not be checking the return values.
>> If set_memory_*() returns with an error, call-site assumptions may be
>> infact wrong to assume that it would either succeed or not succeed at
>> all. Ideally, the failure of set_memory_*() should be passed up the
>> call stack, and callers should examine the failure and deal with it.
>> 
>> Need to fix the callers and add the __must_check attribute. They also
>> may not provide any level of atomicity, in the sense that the memory
>> protections may be left incomplete on failure. This issue likely has a
>> few steps on effects architectures:
>> 1)Have all callers of set_memory_*() helpers check the return value.
>> 2)Add __must_check to all set_memory_*() helpers so that new uses do
>> not ignore the return value.
>> 3)Add atomicity to the calls so that the memory protections aren't left
>> in a partial state.
>> 
>> This series is part of step 1. Make drm/radeon check the return value of
>> set_memory_*().
> 
> I'm a little hesitate merge that. This hardware is >15 years old and nobody of the developers have any system left to test this change on.
> 
> Would it be to much of a problem to just add something like: r = set_memory_*(); (void)r; /* Intentionally ignored */.

Thank you. I will fix that in patch 1 and remove patch 2 (since no need to fix the call sites to handle the retval). 

Best regards,
Tianlin
> Apart from that certainly a good idea to add __must_check to the functions.
> 
> Regards,
> Christian.
> 
>> 
>> Tianlin Li (2):
>>   drm/radeon: have the callers of set_memory_*() check the return value
>>   drm/radeon: change call sites to handle return value properly.
>> 
>>  drivers/gpu/drm/radeon/r100.c        |  3 ++-
>>  drivers/gpu/drm/radeon/radeon.h      |  2 +-
>>  drivers/gpu/drm/radeon/radeon_gart.c | 22 ++++++++++++++++++----
>>  drivers/gpu/drm/radeon/rs400.c       |  3 ++-
>>  4 files changed, 23 insertions(+), 7 deletions(-)
Kees Cook Jan. 8, 2020, 5:39 p.m. UTC | #3
On Wed, Jan 08, 2020 at 01:56:47PM +0100, Christian König wrote:
> Am 07.01.20 um 20:25 schrieb Tianlin Li:
> > Right now several architectures allow their set_memory_*() family of
> > functions to fail, but callers may not be checking the return values.
> > If set_memory_*() returns with an error, call-site assumptions may be
> > infact wrong to assume that it would either succeed or not succeed at
> > all. Ideally, the failure of set_memory_*() should be passed up the
> > call stack, and callers should examine the failure and deal with it.
> > 
> > Need to fix the callers and add the __must_check attribute. They also
> > may not provide any level of atomicity, in the sense that the memory
> > protections may be left incomplete on failure. This issue likely has a
> > few steps on effects architectures:
> > 1)Have all callers of set_memory_*() helpers check the return value.
> > 2)Add __must_check to all set_memory_*() helpers so that new uses do
> > not ignore the return value.
> > 3)Add atomicity to the calls so that the memory protections aren't left
> > in a partial state.
> > 
> > This series is part of step 1. Make drm/radeon check the return value of
> > set_memory_*().
> 
> I'm a little hesitate merge that. This hardware is >15 years old and nobody
> of the developers have any system left to test this change on.

If that's true it should be removed from the tree. We need to be able to
correctly make these kinds of changes in the kernel.

> Would it be to much of a problem to just add something like: r =
> set_memory_*(); (void)r; /* Intentionally ignored */.

This seems like a bad idea -- we shouldn't be papering over failures
like this when there is logic available to deal with it.

> Apart from that certainly a good idea to add __must_check to the functions.

Agreed!

-Kees
Alex Deucher Jan. 8, 2020, 5:51 p.m. UTC | #4
On Wed, Jan 8, 2020 at 12:39 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Wed, Jan 08, 2020 at 01:56:47PM +0100, Christian König wrote:
> > Am 07.01.20 um 20:25 schrieb Tianlin Li:
> > > Right now several architectures allow their set_memory_*() family of
> > > functions to fail, but callers may not be checking the return values.
> > > If set_memory_*() returns with an error, call-site assumptions may be
> > > infact wrong to assume that it would either succeed or not succeed at
> > > all. Ideally, the failure of set_memory_*() should be passed up the
> > > call stack, and callers should examine the failure and deal with it.
> > >
> > > Need to fix the callers and add the __must_check attribute. They also
> > > may not provide any level of atomicity, in the sense that the memory
> > > protections may be left incomplete on failure. This issue likely has a
> > > few steps on effects architectures:
> > > 1)Have all callers of set_memory_*() helpers check the return value.
> > > 2)Add __must_check to all set_memory_*() helpers so that new uses do
> > > not ignore the return value.
> > > 3)Add atomicity to the calls so that the memory protections aren't left
> > > in a partial state.
> > >
> > > This series is part of step 1. Make drm/radeon check the return value of
> > > set_memory_*().
> >
> > I'm a little hesitate merge that. This hardware is >15 years old and nobody
> > of the developers have any system left to test this change on.
>
> If that's true it should be removed from the tree. We need to be able to
> correctly make these kinds of changes in the kernel.

This driver supports about 15 years of hardware generations.  Newer
cards are still prevalent, but the older stuff is less so.  It still
works and people use it based on feedback I've seen, but the older
stuff has no active development at this point.  This change just
happens to target those older chips.

Alex

>
> > Would it be to much of a problem to just add something like: r =
> > set_memory_*(); (void)r; /* Intentionally ignored */.
>
> This seems like a bad idea -- we shouldn't be papering over failures
> like this when there is logic available to deal with it.
>
> > Apart from that certainly a good idea to add __must_check to the functions.
>
> Agreed!
>
> -Kees
>
> --
> Kees Cook
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Christian König Jan. 9, 2020, 10:15 a.m. UTC | #5
Am 08.01.20 um 18:51 schrieb Alex Deucher:
> On Wed, Jan 8, 2020 at 12:39 PM Kees Cook <keescook@chromium.org> wrote:
>> On Wed, Jan 08, 2020 at 01:56:47PM +0100, Christian König wrote:
>>> Am 07.01.20 um 20:25 schrieb Tianlin Li:
>>>> Right now several architectures allow their set_memory_*() family of
>>>> functions to fail, but callers may not be checking the return values.
>>>> If set_memory_*() returns with an error, call-site assumptions may be
>>>> infact wrong to assume that it would either succeed or not succeed at
>>>> all. Ideally, the failure of set_memory_*() should be passed up the
>>>> call stack, and callers should examine the failure and deal with it.
>>>>
>>>> Need to fix the callers and add the __must_check attribute. They also
>>>> may not provide any level of atomicity, in the sense that the memory
>>>> protections may be left incomplete on failure. This issue likely has a
>>>> few steps on effects architectures:
>>>> 1)Have all callers of set_memory_*() helpers check the return value.
>>>> 2)Add __must_check to all set_memory_*() helpers so that new uses do
>>>> not ignore the return value.
>>>> 3)Add atomicity to the calls so that the memory protections aren't left
>>>> in a partial state.
>>>>
>>>> This series is part of step 1. Make drm/radeon check the return value of
>>>> set_memory_*().
>>> I'm a little hesitate merge that. This hardware is >15 years old and nobody
>>> of the developers have any system left to test this change on.
>> If that's true it should be removed from the tree. We need to be able to
>> correctly make these kinds of changes in the kernel.
> This driver supports about 15 years of hardware generations.  Newer
> cards are still prevalent, but the older stuff is less so.  It still
> works and people use it based on feedback I've seen, but the older
> stuff has no active development at this point.  This change just
> happens to target those older chips.

Just a few weeks back we've got a mail from somebody using an integrated 
R128 in a laptop.

After a few mails back and force we figured out that his nearly 20 years 
old hardware was finally failing.

Up till that he was still successfully updating his kernel from time to 
time and the driver still worked. I find that pretty impressive.

>
> Alex
>
>>> Would it be to much of a problem to just add something like: r =
>>> set_memory_*(); (void)r; /* Intentionally ignored */.
>> This seems like a bad idea -- we shouldn't be papering over failures
>> like this when there is logic available to deal with it.

Well I certainly agree to that, but we are talking about a call which 
happens only once during driver load/unload. If necessary we could also 
print an error when something goes wrong, but please no larger 
refactoring of return values and call paths.

It is perfectly possible that this call actually failed on somebodies 
hardware, but we never noticed because the driver still works fine. If 
we now handle the error it is possible that the module never loads and 
the user gets a black screen instead.

Regards,
Christian.

>>
>>> Apart from that certainly a good idea to add __must_check to the functions.
>> Agreed!
>>
>> -Kees
>>
>> --
>> Kees Cook
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ca542d384d54040b5b0b708d794636df1%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637141027080080147&amp;sdata=EHFl6YOHmNp7gOqWsVmfoeD0jNirBTOGHcCP4efC%2FvE%3D&amp;reserved=0
Thomas Zimmermann Jan. 9, 2020, 10:49 a.m. UTC | #6
Hi

Am 09.01.20 um 11:15 schrieb Christian König:
> Am 08.01.20 um 18:51 schrieb Alex Deucher:
>> On Wed, Jan 8, 2020 at 12:39 PM Kees Cook <keescook@chromium.org> wrote:
>>> On Wed, Jan 08, 2020 at 01:56:47PM +0100, Christian König wrote:
>>>> Am 07.01.20 um 20:25 schrieb Tianlin Li:
>>>>> Right now several architectures allow their set_memory_*() family of
>>>>> functions to fail, but callers may not be checking the return values.
>>>>> If set_memory_*() returns with an error, call-site assumptions may be
>>>>> infact wrong to assume that it would either succeed or not succeed at
>>>>> all. Ideally, the failure of set_memory_*() should be passed up the
>>>>> call stack, and callers should examine the failure and deal with it.
>>>>>
>>>>> Need to fix the callers and add the __must_check attribute. They also
>>>>> may not provide any level of atomicity, in the sense that the memory
>>>>> protections may be left incomplete on failure. This issue likely has a
>>>>> few steps on effects architectures:
>>>>> 1)Have all callers of set_memory_*() helpers check the return value.
>>>>> 2)Add __must_check to all set_memory_*() helpers so that new uses do
>>>>> not ignore the return value.
>>>>> 3)Add atomicity to the calls so that the memory protections aren't
>>>>> left
>>>>> in a partial state.
>>>>>
>>>>> This series is part of step 1. Make drm/radeon check the return
>>>>> value of
>>>>> set_memory_*().
>>>> I'm a little hesitate merge that. This hardware is >15 years old and
>>>> nobody
>>>> of the developers have any system left to test this change on.
>>> If that's true it should be removed from the tree. We need to be able to
>>> correctly make these kinds of changes in the kernel.
>> This driver supports about 15 years of hardware generations.  Newer
>> cards are still prevalent, but the older stuff is less so.  It still
>> works and people use it based on feedback I've seen, but the older
>> stuff has no active development at this point.  This change just
>> happens to target those older chips.
> 
> Just a few weeks back we've got a mail from somebody using an integrated
> R128 in a laptop.
> 
> After a few mails back and force we figured out that his nearly 20 years
> old hardware was finally failing.
> 
> Up till that he was still successfully updating his kernel from time to
> time and the driver still worked. I find that pretty impressive.
> 
>>
>> Alex
>>
>>>> Would it be to much of a problem to just add something like: r =
>>>> set_memory_*(); (void)r; /* Intentionally ignored */.
>>> This seems like a bad idea -- we shouldn't be papering over failures
>>> like this when there is logic available to deal with it.
> 
> Well I certainly agree to that, but we are talking about a call which
> happens only once during driver load/unload. If necessary we could also
> print an error when something goes wrong, but please no larger
> refactoring of return values and call paths.
> 

IMHO radeon should be marked as orphaned or obsolete then.

Best regards
Thomas

> It is perfectly possible that this call actually failed on somebodies
> hardware, but we never noticed because the driver still works fine. If
> we now handle the error it is possible that the module never loads and
> the user gets a black screen instead.
> 
> Regards,
> Christian.
> 
>>>
>>>> Apart from that certainly a good idea to add __must_check to the
>>>> functions.
>>> Agreed!
>>>
>>> -Kees
>>>
>>> -- 
>>> Kees Cook
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ca542d384d54040b5b0b708d794636df1%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637141027080080147&amp;sdata=EHFl6YOHmNp7gOqWsVmfoeD0jNirBTOGHcCP4efC%2FvE%3D&amp;reserved=0
>>>
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Alex Deucher Jan. 9, 2020, 8:16 p.m. UTC | #7
On Thu, Jan 9, 2020 at 5:49 AM Thomas Zimmermann <tzimmermann@suse.de> wrote:
>
> Hi
>
> Am 09.01.20 um 11:15 schrieb Christian König:
> > Am 08.01.20 um 18:51 schrieb Alex Deucher:
> >> On Wed, Jan 8, 2020 at 12:39 PM Kees Cook <keescook@chromium.org> wrote:
> >>> On Wed, Jan 08, 2020 at 01:56:47PM +0100, Christian König wrote:
> >>>> Am 07.01.20 um 20:25 schrieb Tianlin Li:
> >>>>> Right now several architectures allow their set_memory_*() family of
> >>>>> functions to fail, but callers may not be checking the return values.
> >>>>> If set_memory_*() returns with an error, call-site assumptions may be
> >>>>> infact wrong to assume that it would either succeed or not succeed at
> >>>>> all. Ideally, the failure of set_memory_*() should be passed up the
> >>>>> call stack, and callers should examine the failure and deal with it.
> >>>>>
> >>>>> Need to fix the callers and add the __must_check attribute. They also
> >>>>> may not provide any level of atomicity, in the sense that the memory
> >>>>> protections may be left incomplete on failure. This issue likely has a
> >>>>> few steps on effects architectures:
> >>>>> 1)Have all callers of set_memory_*() helpers check the return value.
> >>>>> 2)Add __must_check to all set_memory_*() helpers so that new uses do
> >>>>> not ignore the return value.
> >>>>> 3)Add atomicity to the calls so that the memory protections aren't
> >>>>> left
> >>>>> in a partial state.
> >>>>>
> >>>>> This series is part of step 1. Make drm/radeon check the return
> >>>>> value of
> >>>>> set_memory_*().
> >>>> I'm a little hesitate merge that. This hardware is >15 years old and
> >>>> nobody
> >>>> of the developers have any system left to test this change on.
> >>> If that's true it should be removed from the tree. We need to be able to
> >>> correctly make these kinds of changes in the kernel.
> >> This driver supports about 15 years of hardware generations.  Newer
> >> cards are still prevalent, but the older stuff is less so.  It still
> >> works and people use it based on feedback I've seen, but the older
> >> stuff has no active development at this point.  This change just
> >> happens to target those older chips.
> >
> > Just a few weeks back we've got a mail from somebody using an integrated
> > R128 in a laptop.
> >
> > After a few mails back and force we figured out that his nearly 20 years
> > old hardware was finally failing.
> >
> > Up till that he was still successfully updating his kernel from time to
> > time and the driver still worked. I find that pretty impressive.
> >
> >>
> >> Alex
> >>
> >>>> Would it be to much of a problem to just add something like: r =
> >>>> set_memory_*(); (void)r; /* Intentionally ignored */.
> >>> This seems like a bad idea -- we shouldn't be papering over failure
> >>> like this when there is logic available to deal with it.
> >
> > Well I certainly agree to that, but we are talking about a call which
> > happens only once during driver load/unload. If necessary we could also
> > print an error when something goes wrong, but please no larger
> > refactoring of return values and call paths.
> >
>
> IMHO radeon should be marked as orphaned or obsolete then.

As I said this covers about 15-17 years of GPUs (~60 asic families).
The older stuff is hard to test these days because it's PCI or AGP
hardware.  So far it works for most people.  The newer stuff is still
tested as used regularly.

Alex

>
> Best regards
> Thomas
>
> > It is perfectly possible that this call actually failed on somebodies
> > hardware, but we never noticed because the driver still works fine. If
> > we now handle the error it is possible that the module never loads and
> > the user gets a black screen instead.
> >
> > Regards,
> > Christian.
> >
> >>>
> >>>> Apart from that certainly a good idea to add __must_check to the
> >>>> functions.
> >>> Agreed!
> >>>
> >>> -Kees
> >>>
> >>> --
> >>> Kees Cook
> >>> _______________________________________________
> >>> dri-devel mailing list
> >>> dri-devel@lists.freedesktop.org
> >>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ca542d384d54040b5b0b708d794636df1%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637141027080080147&amp;sdata=EHFl6YOHmNp7gOqWsVmfoeD0jNirBTOGHcCP4efC%2FvE%3D&amp;reserved=0
> >>>
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
>