diff mbox

[11/31] dma: add channel request API that supports deferred probe

Message ID CAPcyv4jTUVjhFFXP8RL2jCqFj1MqxSCKQYvfdLHTj+1PRDDL3Q@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Dan Williams Nov. 22, 2013, 6:54 a.m. UTC
On Thu, Nov 21, 2013 at 10:22 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 11/20/2013 08:22 PM, Dan Williams wrote:
>> On Wed, Nov 20, 2013 at 1:24 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>> On 11/20/2013 01:23 PM, Williams, Dan J wrote:
>>> ...
>>>> Why do the drivers that call dma_request_channel need to convert it to
>>>> an ERR value?  i.e. what's problematic about the below (not compile
>>>> tested)?
>>> ...
>>>> diff --git a/arch/arm/plat-samsung/dma-ops.c b/arch/arm/plat-samsung/dma-ops.c
>>> ...
>>>> @@ -22,16 +22,20 @@ static unsigned samsung_dmadev_request(enum dma_ch dma_ch,
>>>>                               struct samsung_dma_req *param,
>>>>                               struct device *dev, char *ch_name)
>>> ...
>>>> +     if (dev->of_node) {
>>>> +             chan = dma_request_slave_channel(dev, ch_name);
>>>> +             return IS_ERR(chan) ? (unsigned) NULL : (unsigned) chan;
>>>> +     } else {
>>>>               return (unsigned)dma_request_channel(mask, pl330_filter,
>>>>                                                       (void *)dma_ch);
>>>> +     }
>>>
>>> The argument is that if a function returns errors encoded as an ERR
>>> pointer, then callers must assume that any non-IS_ERR value that the
>>> function returns is valid. NULL is one of those values. As such, callers
>>> can no longer check the value against NULL, but must use IS_ERR().
>>> Converting any IS_ERR() returns to NULL theoretically is the act of
>>> converting one valid return value to some other completely random return
>>> value.
>>
>> You describe how IS_ERR() works, but you didn't address the patch.
>> There's nothing random about the changes to samsung_dmadev_request().
>> It still returns NULL or a valid channel just as before.
>
> I was addressing the patch. I guess I should have explained as follows.
>
> First, the following code is technically buggy:

No, it's not, but I think we have different implementations in mind.

>
> +             chan = dma_request_slave_channel(dev, ch_name);
> +             return IS_ERR(chan) ? (unsigned) NULL : (unsigned) chan;
>
> ... since it assumes that dma_request_slave_channel() never returns NULL
> as a valid non-error value. This is specifically prohibited by the fact
> that dma_request_slave_channel() returns either an ERR value or a valid
> value; in that case, NULL is not an ERR value, and hence must be
> considered valid.

Let's stop there and be clear we are talking about the same proposal.

The proposal is dma_request_slave_channel only returns errors or valid
pointers, never NULL.

In the above the assumption is that of_dma_request_slave_channel() is
modified to guarantee it only returns ERR_PTRs or valid pointers never
NULL.  acpi_dma_request_slave_chan_by_name() can continue returning
NULL and dma_request_slave_channel will translate it to an ERR_PTR, or
you can convert it as you do in your patch.  Not much value in
converting acpi_dma_request_slave_chan_by_name as it does not return
EPROBE_DEFER and nothing currently cares about the other error values.

Comments

Stephen Warren Nov. 22, 2013, 5:34 p.m. UTC | #1
On 11/21/2013 11:54 PM, Dan Williams wrote:
> On Thu, Nov 21, 2013 at 10:22 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 11/20/2013 08:22 PM, Dan Williams wrote:
>>> On Wed, Nov 20, 2013 at 1:24 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>> On 11/20/2013 01:23 PM, Williams, Dan J wrote:
>>>> ...
>>>>> Why do the drivers that call dma_request_channel need to convert it to
>>>>> an ERR value?  i.e. what's problematic about the below (not compile
>>>>> tested)?
>>>> ...
>>>>> diff --git a/arch/arm/plat-samsung/dma-ops.c b/arch/arm/plat-samsung/dma-ops.c
>>>> ...
>>>>> @@ -22,16 +22,20 @@ static unsigned samsung_dmadev_request(enum dma_ch dma_ch,
>>>>>                               struct samsung_dma_req *param,
>>>>>                               struct device *dev, char *ch_name)
>>>> ...
>>>>> +     if (dev->of_node) {
>>>>> +             chan = dma_request_slave_channel(dev, ch_name);
>>>>> +             return IS_ERR(chan) ? (unsigned) NULL : (unsigned) chan;
>>>>> +     } else {
>>>>>               return (unsigned)dma_request_channel(mask, pl330_filter,
>>>>>                                                       (void *)dma_ch);
>>>>> +     }
>>>>
>>>> The argument is that if a function returns errors encoded as an ERR
>>>> pointer, then callers must assume that any non-IS_ERR value that the
>>>> function returns is valid. NULL is one of those values. As such, callers
>>>> can no longer check the value against NULL, but must use IS_ERR().
>>>> Converting any IS_ERR() returns to NULL theoretically is the act of
>>>> converting one valid return value to some other completely random return
>>>> value.
>>>
>>> You describe how IS_ERR() works, but you didn't address the patch.
>>> There's nothing random about the changes to samsung_dmadev_request().
>>> It still returns NULL or a valid channel just as before.
>>
>> I was addressing the patch. I guess I should have explained as follows.
>>
>> First, the following code is technically buggy:
> 
> No, it's not, but I think we have different implementations in mind.
> 
>>
>> +             chan = dma_request_slave_channel(dev, ch_name);
>> +             return IS_ERR(chan) ? (unsigned) NULL : (unsigned) chan;
>>
>> ... since it assumes that dma_request_slave_channel() never returns NULL
>> as a valid non-error value. This is specifically prohibited by the fact
>> that dma_request_slave_channel() returns either an ERR value or a valid
>> value; in that case, NULL is not an ERR value, and hence must be
>> considered valid.
> 
> Let's stop there and be clear we are talking about the same proposal.
> 
> The proposal is dma_request_slave_channel only returns errors or valid
> pointers, never NULL.

OK, so if you make that assumption, I guess it's safe. However, I
believe that's a new class of return value. To date, we have had two
classes:

a) Returns a valid value (which could include NULL), or an ERR value.

b) Returns a valid value (which doesn't include ERR values), or NULL.

You're talking about adding a third class:

c) Returns a valid value (which doesn't include NULL or ERR values), or
an ERR value.

Russell at least has argued in the past that APIs that return ERR values
for errors by definition can return NULL as a valid return value.
However, if we go against that and explicitly define the API the way you
propose, and nobody objects to defining it that way, then yes, that
would work out OK.
Dan Williams Nov. 22, 2013, 6:04 p.m. UTC | #2
On Fri, Nov 22, 2013 at 9:34 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 11/21/2013 11:54 PM, Dan Williams wrote:
>> On Thu, Nov 21, 2013 at 10:22 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>> On 11/20/2013 08:22 PM, Dan Williams wrote:
>>>> On Wed, Nov 20, 2013 at 1:24 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>> On 11/20/2013 01:23 PM, Williams, Dan J wrote:
>>>>> ...
>>>>>> Why do the drivers that call dma_request_channel need to convert it to
>>>>>> an ERR value?  i.e. what's problematic about the below (not compile
>>>>>> tested)?
>>>>> ...
>>>>>> diff --git a/arch/arm/plat-samsung/dma-ops.c b/arch/arm/plat-samsung/dma-ops.c
>>>>> ...
>>>>>> @@ -22,16 +22,20 @@ static unsigned samsung_dmadev_request(enum dma_ch dma_ch,
>>>>>>                               struct samsung_dma_req *param,
>>>>>>                               struct device *dev, char *ch_name)
>>>>> ...
>>>>>> +     if (dev->of_node) {
>>>>>> +             chan = dma_request_slave_channel(dev, ch_name);
>>>>>> +             return IS_ERR(chan) ? (unsigned) NULL : (unsigned) chan;
>>>>>> +     } else {
>>>>>>               return (unsigned)dma_request_channel(mask, pl330_filter,
>>>>>>                                                       (void *)dma_ch);
>>>>>> +     }
>>>>>
>>>>> The argument is that if a function returns errors encoded as an ERR
>>>>> pointer, then callers must assume that any non-IS_ERR value that the
>>>>> function returns is valid. NULL is one of those values. As such, callers
>>>>> can no longer check the value against NULL, but must use IS_ERR().
>>>>> Converting any IS_ERR() returns to NULL theoretically is the act of
>>>>> converting one valid return value to some other completely random return
>>>>> value.
>>>>
>>>> You describe how IS_ERR() works, but you didn't address the patch.
>>>> There's nothing random about the changes to samsung_dmadev_request().
>>>> It still returns NULL or a valid channel just as before.
>>>
>>> I was addressing the patch. I guess I should have explained as follows.
>>>
>>> First, the following code is technically buggy:
>>
>> No, it's not, but I think we have different implementations in mind.
>>
>>>
>>> +             chan = dma_request_slave_channel(dev, ch_name);
>>> +             return IS_ERR(chan) ? (unsigned) NULL : (unsigned) chan;
>>>
>>> ... since it assumes that dma_request_slave_channel() never returns NULL
>>> as a valid non-error value. This is specifically prohibited by the fact
>>> that dma_request_slave_channel() returns either an ERR value or a valid
>>> value; in that case, NULL is not an ERR value, and hence must be
>>> considered valid.
>>
>> Let's stop there and be clear we are talking about the same proposal.
>>
>> The proposal is dma_request_slave_channel only returns errors or valid
>> pointers, never NULL.
>
> OK, so if you make that assumption, I guess it's safe.

I made that assumption because that is what your original patch proposed:

+/**
+ * dma_request_slave_channel_or_err - try to allocate an exclusive
slave channel
+ * @dev:       pointer to client device structure
+ * @name:      slave channel name
+ *
+ * Returns pointer to appropriate dma channel on success or an error pointer.
+ */

What's the benefit of leaking NULL values to callers?  If they already
need to check for err, why force them to check for NULL too?
Stephen Warren Nov. 22, 2013, 6:10 p.m. UTC | #3
On 11/22/2013 11:04 AM, Dan Williams wrote:
> On Fri, Nov 22, 2013 at 9:34 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 11/21/2013 11:54 PM, Dan Williams wrote:
>>> On Thu, Nov 21, 2013 at 10:22 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>> On 11/20/2013 08:22 PM, Dan Williams wrote:
>>>>> On Wed, Nov 20, 2013 at 1:24 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>>> On 11/20/2013 01:23 PM, Williams, Dan J wrote:
>>>>>> ...
>>>>>>> Why do the drivers that call dma_request_channel need to convert it to
>>>>>>> an ERR value?  i.e. what's problematic about the below (not compile
>>>>>>> tested)?
>>>>>> ...
>>>>>>> diff --git a/arch/arm/plat-samsung/dma-ops.c b/arch/arm/plat-samsung/dma-ops.c
>>>>>> ...
>>>>>>> @@ -22,16 +22,20 @@ static unsigned samsung_dmadev_request(enum dma_ch dma_ch,
>>>>>>>                               struct samsung_dma_req *param,
>>>>>>>                               struct device *dev, char *ch_name)
>>>>>> ...
>>>>>>> +     if (dev->of_node) {
>>>>>>> +             chan = dma_request_slave_channel(dev, ch_name);
>>>>>>> +             return IS_ERR(chan) ? (unsigned) NULL : (unsigned) chan;
>>>>>>> +     } else {
>>>>>>>               return (unsigned)dma_request_channel(mask, pl330_filter,
>>>>>>>                                                       (void *)dma_ch);
>>>>>>> +     }
>>>>>>
>>>>>> The argument is that if a function returns errors encoded as an ERR
>>>>>> pointer, then callers must assume that any non-IS_ERR value that the
>>>>>> function returns is valid. NULL is one of those values. As such, callers
>>>>>> can no longer check the value against NULL, but must use IS_ERR().
>>>>>> Converting any IS_ERR() returns to NULL theoretically is the act of
>>>>>> converting one valid return value to some other completely random return
>>>>>> value.
>>>>>
>>>>> You describe how IS_ERR() works, but you didn't address the patch.
>>>>> There's nothing random about the changes to samsung_dmadev_request().
>>>>> It still returns NULL or a valid channel just as before.
>>>>
>>>> I was addressing the patch. I guess I should have explained as follows.
>>>>
>>>> First, the following code is technically buggy:
>>>
>>> No, it's not, but I think we have different implementations in mind.
>>>
>>>>
>>>> +             chan = dma_request_slave_channel(dev, ch_name);
>>>> +             return IS_ERR(chan) ? (unsigned) NULL : (unsigned) chan;
>>>>
>>>> ... since it assumes that dma_request_slave_channel() never returns NULL
>>>> as a valid non-error value. This is specifically prohibited by the fact
>>>> that dma_request_slave_channel() returns either an ERR value or a valid
>>>> value; in that case, NULL is not an ERR value, and hence must be
>>>> considered valid.
>>>
>>> Let's stop there and be clear we are talking about the same proposal.
>>>
>>> The proposal is dma_request_slave_channel only returns errors or valid
>>> pointers, never NULL.
>>
>> OK, so if you make that assumption, I guess it's safe.
> 
> I made that assumption because that is what your original patch proposed:
> 
> +/**
> + * dma_request_slave_channel_or_err - try to allocate an exclusive
> slave channel
> + * @dev:       pointer to client device structure
> + * @name:      slave channel name
> + *
> + * Returns pointer to appropriate dma channel on success or an error pointer.
> + */
> 
> What's the benefit of leaking NULL values to callers?  If they already
> need to check for err, why force them to check for NULL too?

"Returns pointer to appropriate dma channel on success or an error
pointer." means that callers only have to check for an ERR value. If the
function returns NULL, then other DMA-related functions must treat that
as a valid channel ID. This is case (a) in my previous email.
Dan Williams Nov. 22, 2013, 7:49 p.m. UTC | #4
On Fri, Nov 22, 2013 at 10:10 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>> The proposal is dma_request_slave_channel only returns errors or valid
>>>> pointers, never NULL.
>>>
>>> OK, so if you make that assumption, I guess it's safe.
>>
>> I made that assumption because that is what your original patch proposed:
>>
>> +/**
>> + * dma_request_slave_channel_or_err - try to allocate an exclusive
>> slave channel
>> + * @dev:       pointer to client device structure
>> + * @name:      slave channel name
>> + *
>> + * Returns pointer to appropriate dma channel on success or an error pointer.
>> + */
>>
>> What's the benefit of leaking NULL values to callers?  If they already
>> need to check for err, why force them to check for NULL too?
>
> "Returns pointer to appropriate dma channel on success or an error
> pointer." means that callers only have to check for an ERR value. If the
> function returns NULL, then other DMA-related functions must treat that
> as a valid channel ID. This is case (a) in my previous email.

How can a channel be "valid" and NULL at the same time?  Without the
guarantee that dma_request_channel always returns a non-null-channel
pointer or an error pointer you're forcing clients to use or open-code
IS_ERR_OR_NULL.  Make the caller's life easier and just turn success
or failure like before.
Stephen Warren Nov. 22, 2013, 7:53 p.m. UTC | #5
On 11/22/2013 12:49 PM, Dan Williams wrote:
> On Fri, Nov 22, 2013 at 10:10 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>> The proposal is dma_request_slave_channel only returns errors or valid
>>>>> pointers, never NULL.
>>>>
>>>> OK, so if you make that assumption, I guess it's safe.
>>>
>>> I made that assumption because that is what your original patch proposed:
>>>
>>> +/**
>>> + * dma_request_slave_channel_or_err - try to allocate an exclusive
>>> slave channel
>>> + * @dev:       pointer to client device structure
>>> + * @name:      slave channel name
>>> + *
>>> + * Returns pointer to appropriate dma channel on success or an error pointer.
>>> + */
>>>
>>> What's the benefit of leaking NULL values to callers?  If they already
>>> need to check for err, why force them to check for NULL too?
>>
>> "Returns pointer to appropriate dma channel on success or an error
>> pointer." means that callers only have to check for an ERR value. If the
>> function returns NULL, then other DMA-related functions must treat that
>> as a valid channel ID. This is case (a) in my previous email.
> 
> How can a channel be "valid" and NULL at the same time?  Without the
> guarantee that dma_request_channel always returns a non-null-channel
> pointer or an error pointer you're forcing clients to use or open-code
> IS_ERR_OR_NULL.

No, callers should just follow the documentation. If all error cases are
indicated by an ERR pointer, then there is no need to check for NULL. In
fact, client must not check anything beyond whether the value is an ERR
value or not. So, there's no need to use IS_ERR_OR_NULL.

It's up to the API to make sure that it returns values that are valid
for other calls to related APIs. If that doesn't include NULL, it won't
return NULL. If it does, it might. But, that's an internal
implementation detail of the API (and associated APIs), not something
that clients should know about.

One situation where a NULL might be valid is where the return value
isn't really a pointer, but an integer index or ID cast to a pointer.
Dan Williams Nov. 22, 2013, 8:46 p.m. UTC | #6
On Fri, Nov 22, 2013 at 11:53 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 11/22/2013 12:49 PM, Dan Williams wrote:
>> On Fri, Nov 22, 2013 at 10:10 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>>> The proposal is dma_request_slave_channel only returns errors or valid
>>>>>> pointers, never NULL.
>>>>>
>>>>> OK, so if you make that assumption, I guess it's safe.
>>>>
>>>> I made that assumption because that is what your original patch proposed:
>>>>
>>>> +/**
>>>> + * dma_request_slave_channel_or_err - try to allocate an exclusive
>>>> slave channel
>>>> + * @dev:       pointer to client device structure
>>>> + * @name:      slave channel name
>>>> + *
>>>> + * Returns pointer to appropriate dma channel on success or an error pointer.
>>>> + */
>>>>
>>>> What's the benefit of leaking NULL values to callers?  If they already
>>>> need to check for err, why force them to check for NULL too?
>>>
>>> "Returns pointer to appropriate dma channel on success or an error
>>> pointer." means that callers only have to check for an ERR value. If the
>>> function returns NULL, then other DMA-related functions must treat that
>>> as a valid channel ID. This is case (a) in my previous email.
>>
>> How can a channel be "valid" and NULL at the same time?  Without the
>> guarantee that dma_request_channel always returns a non-null-channel
>> pointer or an error pointer you're forcing clients to use or open-code
>> IS_ERR_OR_NULL.
>
> No, callers should just follow the documentation. If all error cases are
> indicated by an ERR pointer, then there is no need to check for NULL. In
> fact, client must not check anything beyond whether the value is an ERR
> value or not. So, there's no need to use IS_ERR_OR_NULL.
>
> It's up to the API to make sure that it returns values that are valid
> for other calls to related APIs. If that doesn't include NULL, it won't
> return NULL. If it does, it might. But, that's an internal
> implementation detail of the API (and associated APIs), not something
> that clients should know about.
>
> One situation where a NULL might be valid is where the return value
> isn't really a pointer, but an integer index or ID cast to a pointer.

Ok that's the piece I am missing, and maybe explains why
samsung_dmadev_request() looks so broken.  Are there really
implementations out there that somehow know that the return value from
dma_request_slave channel is not a (struct dma_chan *)??

At that point just change the prototype of dma_request_slave_channel to:

MAGIC_t dma_request_slave_channel(struct device *dev, const char *name)

Those clients need to be killed or fixed, otherwise how do you
guarantee that the 'integer index or ID' does not collide with the
ERR_PTR() number space?
Stephen Warren Nov. 22, 2013, 9:50 p.m. UTC | #7
On 11/22/2013 01:46 PM, Dan Williams wrote:
> On Fri, Nov 22, 2013 at 11:53 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 11/22/2013 12:49 PM, Dan Williams wrote:
>>> On Fri, Nov 22, 2013 at 10:10 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>>>> The proposal is dma_request_slave_channel only returns errors or valid
>>>>>>> pointers, never NULL.
>>>>>>
>>>>>> OK, so if you make that assumption, I guess it's safe.
>>>>>
>>>>> I made that assumption because that is what your original patch proposed:
>>>>>
>>>>> +/**
>>>>> + * dma_request_slave_channel_or_err - try to allocate an exclusive
>>>>> slave channel
>>>>> + * @dev:       pointer to client device structure
>>>>> + * @name:      slave channel name
>>>>> + *
>>>>> + * Returns pointer to appropriate dma channel on success or an error pointer.
>>>>> + */
>>>>>
>>>>> What's the benefit of leaking NULL values to callers?  If they already
>>>>> need to check for err, why force them to check for NULL too?
>>>>
>>>> "Returns pointer to appropriate dma channel on success or an error
>>>> pointer." means that callers only have to check for an ERR value. If the
>>>> function returns NULL, then other DMA-related functions must treat that
>>>> as a valid channel ID. This is case (a) in my previous email.
>>>
>>> How can a channel be "valid" and NULL at the same time?  Without the
>>> guarantee that dma_request_channel always returns a non-null-channel
>>> pointer or an error pointer you're forcing clients to use or open-code
>>> IS_ERR_OR_NULL.
>>
>> No, callers should just follow the documentation. If all error cases are
>> indicated by an ERR pointer, then there is no need to check for NULL. In
>> fact, client must not check anything beyond whether the value is an ERR
>> value or not. So, there's no need to use IS_ERR_OR_NULL.
>>
>> It's up to the API to make sure that it returns values that are valid
>> for other calls to related APIs. If that doesn't include NULL, it won't
>> return NULL. If it does, it might. But, that's an internal
>> implementation detail of the API (and associated APIs), not something
>> that clients should know about.
>>
>> One situation where a NULL might be valid is where the return value
>> isn't really a pointer, but an integer index or ID cast to a pointer.
> 
> Ok that's the piece I am missing, and maybe explains why
> samsung_dmadev_request() looks so broken.  Are there really
> implementations out there that somehow know that the return value from
> dma_request_slave channel is not a (struct dma_chan *)??

No client of the API should know that; it'd be more like an agreement
between multiple functions in the subsystem:

handle = subsystemx_allocate_something();
...
subsystemx_use_handle(handle);

Where subsystemx_allocate_something() casts from ID to "pointer", and
subsystemx_use_handle() casts back from "pointer" to ID. The callers
would have no idea this was happening.

I'm not actually aware of any specific cases where that actually happens
right now, it's just that given the way subsystemx_allocate_something()
is documented (valid handle/cookie return or ERR value) it's legal for
"subsystemx" to work that way if it wants, and it should be able to
change between this cast-a-handle style and actual pointer returns
without clients being affected.

> At that point just change the prototype of dma_request_slave_channel to:
> 
> MAGIC_t dma_request_slave_channel(struct device *dev, const char *name)
> 
> Those clients need to be killed or fixed, otherwise how do you
> guarantee that the 'integer index or ID' does not collide with the
> ERR_PTR() number space?

subsystemx_allocate_something() would have to ensure that. Probably just
by imposing a maximum limit on the handle/ID values.

Anyway, your proposal can certainly /work/. I simply wanted to point out
that it was different to the two currently accepted styles of return
value. If you're sure e.g. Russell isn't going to shout at me or you for
introducing an API that works as you describe, we certainly could go
ahead with it. Should we explicitly ping him to confirm that?
Dan Williams Nov. 22, 2013, 11:13 p.m. UTC | #8
On Fri, Nov 22, 2013 at 1:50 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 11/22/2013 01:46 PM, Dan Williams wrote:
>> On Fri, Nov 22, 2013 at 11:53 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>> On 11/22/2013 12:49 PM, Dan Williams wrote:
>>>> On Fri, Nov 22, 2013 at 10:10 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>>>>> The proposal is dma_request_slave_channel only returns errors or valid
>>>>>>>> pointers, never NULL.
>>>>>>>
>>>>>>> OK, so if you make that assumption, I guess it's safe.
>>>>>>
>>>>>> I made that assumption because that is what your original patch proposed:
>>>>>>
>>>>>> +/**
>>>>>> + * dma_request_slave_channel_or_err - try to allocate an exclusive
>>>>>> slave channel
>>>>>> + * @dev:       pointer to client device structure
>>>>>> + * @name:      slave channel name
>>>>>> + *
>>>>>> + * Returns pointer to appropriate dma channel on success or an error pointer.
>>>>>> + */
>>>>>>
>>>>>> What's the benefit of leaking NULL values to callers?  If they already
>>>>>> need to check for err, why force them to check for NULL too?
>>>>>
>>>>> "Returns pointer to appropriate dma channel on success or an error
>>>>> pointer." means that callers only have to check for an ERR value. If the
>>>>> function returns NULL, then other DMA-related functions must treat that
>>>>> as a valid channel ID. This is case (a) in my previous email.
>>>>
>>>> How can a channel be "valid" and NULL at the same time?  Without the
>>>> guarantee that dma_request_channel always returns a non-null-channel
>>>> pointer or an error pointer you're forcing clients to use or open-code
>>>> IS_ERR_OR_NULL.
>>>
>>> No, callers should just follow the documentation. If all error cases are
>>> indicated by an ERR pointer, then there is no need to check for NULL. In
>>> fact, client must not check anything beyond whether the value is an ERR
>>> value or not. So, there's no need to use IS_ERR_OR_NULL.
>>>
>>> It's up to the API to make sure that it returns values that are valid
>>> for other calls to related APIs. If that doesn't include NULL, it won't
>>> return NULL. If it does, it might. But, that's an internal
>>> implementation detail of the API (and associated APIs), not something
>>> that clients should know about.
>>>
>>> One situation where a NULL might be valid is where the return value
>>> isn't really a pointer, but an integer index or ID cast to a pointer.
>>
>> Ok that's the piece I am missing, and maybe explains why
>> samsung_dmadev_request() looks so broken.  Are there really
>> implementations out there that somehow know that the return value from
>> dma_request_slave channel is not a (struct dma_chan *)??
>
> No client of the API should know that; it'd be more like an agreement
> between multiple functions in the subsystem:
>
> handle = subsystemx_allocate_something();
> ...
> subsystemx_use_handle(handle);
>
> Where subsystemx_allocate_something() casts from ID to "pointer", and
> subsystemx_use_handle() casts back from "pointer" to ID. The callers
> would have no idea this was happening.

That's a bug not a feature.  That's someone abusing an api and
breaking type safety to pass arbitrary data.  But since we're talking
in abstract 'buggy_subsytemx' terms why worry?

> I'm not actually aware of any specific cases where that actually happens
> right now, it's just that given the way subsystemx_allocate_something()
> is documented (valid handle/cookie return or ERR value) it's legal for
> "subsystemx" to work that way if it wants, and it should be able to
> change between this cast-a-handle style and actual pointer returns
> without clients being affected.

Wait, this busted way of doing things is documented?

>> At that point just change the prototype of dma_request_slave_channel to:
>>
>> MAGIC_t dma_request_slave_channel(struct device *dev, const char *name)
>>
>> Those clients need to be killed or fixed, otherwise how do you
>> guarantee that the 'integer index or ID' does not collide with the
>> ERR_PTR() number space?
>
> subsystemx_allocate_something() would have to ensure that. Probably just
> by imposing a maximum limit on the handle/ID values.
>
> Anyway, your proposal can certainly /work/. I simply wanted to point out
> that it was different to the two currently accepted styles of return
> value. If you're sure e.g. Russell isn't going to shout at me or you for
> introducing an API that works as you describe, we certainly could go
> ahead with it. Should we explicitly ping him to confirm that?

He already has in other threads IS_ERR_OR_NULL() must die, this
proposal is not that.  Let me go back to your note about "case 2" and
clarify.
Stephen Warren Nov. 22, 2013, 11:45 p.m. UTC | #9
On 11/22/2013 04:13 PM, Dan Williams wrote:
> On Fri, Nov 22, 2013 at 1:50 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 11/22/2013 01:46 PM, Dan Williams wrote:
>>> On Fri, Nov 22, 2013 at 11:53 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>> On 11/22/2013 12:49 PM, Dan Williams wrote:
>>>>> On Fri, Nov 22, 2013 at 10:10 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>>>>>>>> The proposal is dma_request_slave_channel only returns errors or valid
>>>>>>>>> pointers, never NULL.
>>>>>>>>
>>>>>>>> OK, so if you make that assumption, I guess it's safe.
>>>>>>>
>>>>>>> I made that assumption because that is what your original patch proposed:
>>>>>>>
>>>>>>> +/**
>>>>>>> + * dma_request_slave_channel_or_err - try to allocate an exclusive
>>>>>>> slave channel
>>>>>>> + * @dev:       pointer to client device structure
>>>>>>> + * @name:      slave channel name
>>>>>>> + *
>>>>>>> + * Returns pointer to appropriate dma channel on success or an error pointer.
>>>>>>> + */
>>>>>>>
>>>>>>> What's the benefit of leaking NULL values to callers?  If they already
>>>>>>> need to check for err, why force them to check for NULL too?
>>>>>>
>>>>>> "Returns pointer to appropriate dma channel on success or an error
>>>>>> pointer." means that callers only have to check for an ERR value. If the
>>>>>> function returns NULL, then other DMA-related functions must treat that
>>>>>> as a valid channel ID. This is case (a) in my previous email.
>>>>>
>>>>> How can a channel be "valid" and NULL at the same time?  Without the
>>>>> guarantee that dma_request_channel always returns a non-null-channel
>>>>> pointer or an error pointer you're forcing clients to use or open-code
>>>>> IS_ERR_OR_NULL.
>>>>
>>>> No, callers should just follow the documentation. If all error cases are
>>>> indicated by an ERR pointer, then there is no need to check for NULL. In
>>>> fact, client must not check anything beyond whether the value is an ERR
>>>> value or not. So, there's no need to use IS_ERR_OR_NULL.
>>>>
>>>> It's up to the API to make sure that it returns values that are valid
>>>> for other calls to related APIs. If that doesn't include NULL, it won't
>>>> return NULL. If it does, it might. But, that's an internal
>>>> implementation detail of the API (and associated APIs), not something
>>>> that clients should know about.
>>>>
>>>> One situation where a NULL might be valid is where the return value
>>>> isn't really a pointer, but an integer index or ID cast to a pointer.
>>>
>>> Ok that's the piece I am missing, and maybe explains why
>>> samsung_dmadev_request() looks so broken.  Are there really
>>> implementations out there that somehow know that the return value from
>>> dma_request_slave channel is not a (struct dma_chan *)??
>>
>> No client of the API should know that; it'd be more like an agreement
>> between multiple functions in the subsystem:
>>
>> handle = subsystemx_allocate_something();
>> ...
>> subsystemx_use_handle(handle);
>>
>> Where subsystemx_allocate_something() casts from ID to "pointer", and
>> subsystemx_use_handle() casts back from "pointer" to ID. The callers
>> would have no idea this was happening.
> 
> That's a bug not a feature.  That's someone abusing an api and
> breaking type safety to pass arbitrary data.  But since we're talking
> in abstract 'buggy_subsytemx' terms why worry?
> 
>> I'm not actually aware of any specific cases where that actually happens
>> right now, it's just that given the way subsystemx_allocate_something()
>> is documented (valid handle/cookie return or ERR value) it's legal for
>> "subsystemx" to work that way if it wants, and it should be able to
>> change between this cast-a-handle style and actual pointer returns
>> without clients being affected.
> 
> Wait, this busted way of doing things is documented?

I should have said: s/is documented/would be documented/. Or perhaps
s/documented/discussed/. IIRC, in previous discussions of
IS_ERR_OR_NULL, this came up as a specific (perhaps hypothetical) thing
that APIs could legitimately do that made it important the API clients
didn't impose restrictions on return values beyond what APIs documented.
Russell King - ARM Linux Nov. 23, 2013, 12:34 a.m. UTC | #10
On Fri, Nov 22, 2013 at 11:10:01AM -0700, Stephen Warren wrote:
> On 11/22/2013 11:04 AM, Dan Williams wrote:
> > I made that assumption because that is what your original patch proposed:
> > 
> > +/**
> > + * dma_request_slave_channel_or_err - try to allocate an exclusive
> > slave channel
> > + * @dev:       pointer to client device structure
> > + * @name:      slave channel name
> > + *
> > + * Returns pointer to appropriate dma channel on success or an error pointer.
> > + */
> > 
> > What's the benefit of leaking NULL values to callers?  If they already
> > need to check for err, why force them to check for NULL too?
> 
> "Returns pointer to appropriate dma channel on success or an error
> pointer." means that callers only have to check for an ERR value. If the
> function returns NULL, then other DMA-related functions must treat that
> as a valid channel ID. This is case (a) in my previous email.

Stephen, Dan,

My point (which you quoted) is a fine one - read the definition above.

"Returns pointer to appropriate DMA channel on success or an error pointer."

This defines the range of values which are considered successful, and
those which are considered to be errors.  Error pointers are defined
by IS_ERR(ptr) being true (not by IS_ERR_OR_NULL(ptr) being true.
Conversely, it defines what are considered to be non-errors.

Therefore, users of such a function _must_ check the return value using
IS_ERR() and not the IS_ERR_OR_NULL() abomination.

The question about NULL is unanswered, but with nothing specified, users
must assume that if a subsystem returns NULL, it's fine to pass that back
to the subsystem.  If the subsystem didn't intend for NULL to be valid,
it shouldn't be returning NULL from such a defined function.

It's not up to the user of the interface to dream up an error code if
the subsystem happens to return NULL, or do other crap stuff like this:

	if (IS_ERR_OR_NULL(ptr))
		return PTR_ERR(ptr);

which we already see cropping up from time to time.

So, my argument is that if you define an API to be "pointers on success,
or error pointers" then your API better handle any cookie you return
which satisfies IS_ERR(ptr) = false - which by definition includes NULL.
Russell King - ARM Linux Nov. 23, 2013, 12:40 a.m. UTC | #11
On Fri, Nov 22, 2013 at 11:49:55AM -0800, Dan Williams wrote:
> On Fri, Nov 22, 2013 at 10:10 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> >>>> The proposal is dma_request_slave_channel only returns errors or valid
> >>>> pointers, never NULL.
> >>>
> >>> OK, so if you make that assumption, I guess it's safe.
> >>
> >> I made that assumption because that is what your original patch proposed:
> >>
> >> +/**
> >> + * dma_request_slave_channel_or_err - try to allocate an exclusive
> >> slave channel
> >> + * @dev:       pointer to client device structure
> >> + * @name:      slave channel name
> >> + *
> >> + * Returns pointer to appropriate dma channel on success or an error pointer.
> >> + */
> >>
> >> What's the benefit of leaking NULL values to callers?  If they already
> >> need to check for err, why force them to check for NULL too?
> >
> > "Returns pointer to appropriate dma channel on success or an error
> > pointer." means that callers only have to check for an ERR value. If the
> > function returns NULL, then other DMA-related functions must treat that
> > as a valid channel ID. This is case (a) in my previous email.
> 
> How can a channel be "valid" and NULL at the same time?  Without the
> guarantee that dma_request_channel always returns a non-null-channel
> pointer or an error pointer you're forcing clients to use or open-code
> IS_ERR_OR_NULL.  Make the caller's life easier and just turn success
> or failure like before.

It's absolutely fine for an API to return "valid pointers or an error
pointer" and not have to care about the NULL condition.  Really.

Think about it.

	chan = dma_request_blah();
	if (IS_ERR(chan))
		return PTR_ERR(chan);

	dma_do_something(chan);

Now, if the API is defined to be "valid pointers or error pointers" the
above code is quite sane.  The user is doing everything required to use
the API safely.

However, if dma_request_blah() happens to return NULL, that's a failure
of dma_request_blah() - not of the user to check for it.  If
dma_request_blah() is never supposed to return NULL, then there won't be
any checks done to handle a NULL pointer into dma_do_something(), and we
get to find out about this failure by a kernel NULL pointer deref oops.

That's a good thing.  We have a condition which can be debugged and the
bug which caused the NULL pointer to be returned can be found and fixed.
Stephen Warren Nov. 25, 2013, 5:26 p.m. UTC | #12
On 11/22/2013 05:34 PM, Russell King - ARM Linux wrote:
[... various discussions re: IS_ERR, IS_ERR_OR_NULL, etc.]

Russell, so if we document dma_request_slave_channel() as follows:

> /*
>  * dma_request_slave_channel - try to allocate an exclusive slave channel
> ...
>  * Returns pointer to appropriate DMA channel on success or an error pointer.
>  * In order to ease compatibility with other DMA request APIs, this function
>  * guarantees NEVER to return NULL.
>  */

Are you then OK with clients doing either of e.g.:

chan = dma_request_slave_channel(...);
if (IS_ERR(chan))
	// This returns NULL or a valid handle/pointer
	chan = dma_request_channel()
if (!chan)
	Here, chan is invalid;
	return;
Here, chan is valid

or:

if (xxx) {
	chan = dma_request_slave_channel(...);
	// Convert to same error value as else{} block generates
	if (IS_ERR(chan))
		chan = NULL
} else {
	// This returns NULL or a valid handle/pointer
	chan = dma_request_channel()
}
if (!chan)
	Here, chan is invalid
	return;
Here, chan is valid
Russell King - ARM Linux Nov. 25, 2013, 5:53 p.m. UTC | #13
On Mon, Nov 25, 2013 at 10:26:18AM -0700, Stephen Warren wrote:
> On 11/22/2013 05:34 PM, Russell King - ARM Linux wrote:
> [... various discussions re: IS_ERR, IS_ERR_OR_NULL, etc.]
> 
> Russell, so if we document dma_request_slave_channel() as follows:
> 
> > /*
> >  * dma_request_slave_channel - try to allocate an exclusive slave channel
> > ...
> >  * Returns pointer to appropriate DMA channel on success or an error pointer.
> >  * In order to ease compatibility with other DMA request APIs, this function
> >  * guarantees NEVER to return NULL.
> >  */
> 
> Are you then OK with clients doing either of e.g.:
> 
> chan = dma_request_slave_channel(...);
> if (IS_ERR(chan))
> 	// This returns NULL or a valid handle/pointer
> 	chan = dma_request_channel()
> if (!chan)
> 	Here, chan is invalid;
> 	return;
> Here, chan is valid
> 
> or:
> 
> if (xxx) {
> 	chan = dma_request_slave_channel(...);
> 	// Convert to same error value as else{} block generates
> 	if (IS_ERR(chan))
> 		chan = NULL
> } else {
> 	// This returns NULL or a valid handle/pointer
> 	chan = dma_request_channel()
> }
> if (!chan)
> 	Here, chan is invalid
> 	return;
> Here, chan is valid

The cleaner way to write this is:

	chan = dma_request_slave_channel(...);
	if (IS_ERR(chan)) {
		chan = dma_request_channel();
		if (!chan)
			return;
	}

So, if you make it to this point, chan must be valid according to the
conditions on the API - you've applied the test individally to each
return function according to its documented behaviour.  If it does
happen to be NULL, that's not *your* problem as a user of the API.
Stephen Warren Nov. 25, 2013, 5:57 p.m. UTC | #14
On 11/25/2013 10:53 AM, Russell King - ARM Linux wrote:
> On Mon, Nov 25, 2013 at 10:26:18AM -0700, Stephen Warren wrote:
>> On 11/22/2013 05:34 PM, Russell King - ARM Linux wrote:
>> [... various discussions re: IS_ERR, IS_ERR_OR_NULL, etc.]
>>
>> Russell, so if we document dma_request_slave_channel() as follows:
>>
>>> /*
>>>  * dma_request_slave_channel - try to allocate an exclusive slave channel
>>> ...
>>>  * Returns pointer to appropriate DMA channel on success or an error pointer.
>>>  * In order to ease compatibility with other DMA request APIs, this function
>>>  * guarantees NEVER to return NULL.
>>>  */
>>
>> Are you then OK with clients doing either of e.g.:
>>
>> chan = dma_request_slave_channel(...);
>> if (IS_ERR(chan))
>> 	// This returns NULL or a valid handle/pointer
>> 	chan = dma_request_channel()
>> if (!chan)
>> 	Here, chan is invalid;
>> 	return;
>> Here, chan is valid
>>
>> or:
>>
>> if (xxx) {
>> 	chan = dma_request_slave_channel(...);
>> 	// Convert to same error value as else{} block generates
>> 	if (IS_ERR(chan))
>> 		chan = NULL
>> } else {
>> 	// This returns NULL or a valid handle/pointer
>> 	chan = dma_request_channel()
>> }
>> if (!chan)
>> 	Here, chan is invalid
>> 	return;
>> Here, chan is valid
> 
> The cleaner way to write this is:
> 
> 	chan = dma_request_slave_channel(...);
> 	if (IS_ERR(chan)) {
> 		chan = dma_request_channel();
> 		if (!chan)
> 			return;
> 	}
> 
> So, if you make it to this point, chan must be valid according to the
> conditions on the API - you've applied the test individally to each
> return function according to its documented behaviour.  If it does
> happen to be NULL, that's not *your* problem as a user of the API.

As Dan pointed out, there are many drivers where DMA is optional, so
there's a lot of this sprinkled through the body of the driver:

if (chan) ...
if (!chan) ...

If we don't convert IS_ERR() values to NULL like I showed above, then
all those tests would need to be converted to something else. Can we
please avoid that?
Gerhard Sittig Nov. 25, 2013, 8:28 p.m. UTC | #15
On Mon, Nov 25, 2013 at 10:57 -0700, Stephen Warren wrote:
> 
> As Dan pointed out, there are many drivers where DMA is optional, so
> there's a lot of this sprinkled through the body of the driver:
> 
> if (chan) ...
> if (!chan) ...
> 
> If we don't convert IS_ERR() values to NULL like I showed above, then
> all those tests would need to be converted to something else. Can we
> please avoid that?

Recently I had a similar situation with clocks.  It turned out to
be cumbersome to call allocation routines to assign the result
into state tracking variables, and to adjust the pointer to
become NULL in hindsight upon failure.  And I received feedback
that this feels dirty and somehow wrong.

What I did instead was to assign the allocation/lookup results to
local variables, then test them, and only assign to state
tracking variables when they are not error pointers (or expected
or acceptable errors which should go silently).

The shutdown logic then only had to check for the state tracking
variables against NULL, and release what was allocated as these
never could be errors.  Other references during the driver's
lifetime would be similar.

See 2771399a "fs_enet: cleanup clock API use" or b3bfce2bc "i2c:
mpc: cleanup clock API use" for an example.

Does this help in your case?  The "normalization" would be
concentrated in the acquisition spot, all error situations are
handled appropriately, and all other references in subsequent
code paths remain simple, and identical if there already are any.


virtually yours
Gerhard Sittig
Russell King - ARM Linux Nov. 25, 2013, 8:52 p.m. UTC | #16
On Mon, Nov 25, 2013 at 09:28:08PM +0100, Gerhard Sittig wrote:
> See 2771399a "fs_enet: cleanup clock API use" or b3bfce2bc "i2c:
> mpc: cleanup clock API use" for an example.

And had I seen that change I'd have commented thusly:

+       /* make clock lookup non-fatal (the driver is shared among platforms),
+        * but require enable to succeed when a clock was specified/found,
+        * keep a reference to the clock upon successful acquisition
+        */
+       clk = devm_clk_get(&ofdev->dev, "per");
+       if (!IS_ERR(clk)) {
+               err = clk_prepare_enable(clk);
+               if (err) {
+                       ret = err;
+                       goto out_free_fpi;
+               }
+               fpi->clk_per = clk;
+       }

 out_put:
        of_node_put(fpi->phy_node);
+       if (fpi->clk_per)
+               clk_disable_unprepare(fpi->clk_per);

        of_node_put(fep->fpi->phy_node);
+       if (fep->fpi->clk_per)
+               clk_disable_unprepare(fep->fpi->clk_per);

So, lets consider what happens if clk_get() inside devm_clk_get() returns
NULL.

* devm_clk_get() allocates its tracking structure, and sets the clk pointer
  to be freed to NULL.
* !IS_ERR(NULL) is true, so we call clk_prepare_enable(NULL).  This succeeds.
* We store NULL into fpi->clk_per.
* The error cleanup/teardown paths check for a NULL pointer, and fail to
  call the CLK API in that case.

This is not very nice.  Better solution:

	fpi->clk_per = PTR_ERR(-EINVAL);
	clk = devm_clk_get(&ofdev->dev, "per");
	if (!IS_ERR(clk)) {
		err = clk_prepare_enable(clk);
		if (err) {
			ret = err;
			goto out_free_fpi;
		}
		fpi->clk_per = clk;
	}

...

        of_node_put(fpi->phy_node);
        if (!IS_ERR(fpi->clk_per))
                clk_disable_unprepare(fpi->clk_per);
Gerhard Sittig Nov. 28, 2013, 9:20 p.m. UTC | #17
On Mon, Nov 25, 2013 at 20:52 +0000, Russell King - ARM Linux wrote:
> 
> [ ... 2771399a "fs_enet: cleanup clock API use" example ... ]
> [ ... NULL clocks won't get released, calls might get skipped ... ]

I agree that the NULL clock reference is not handled correctly in
that code.  Part of the issue is that NULL is both an
initialization value and a valid reference to something that was
acquired (and quite counter intuitive so).

This inability to tell the two reasons for NULL apart only
vanishes when each clock variable explicitly gets initialized or
pre-set to some ERR_PTR() value (or when the allocation and
assignment is unconditional, no code path depends on some other
condition).  I'm afraid that the source tree currently is not
there yet, and it may be quite some churn (with a lot of
potential for conflicts) to touch each driver, if it's at all
possible to identify all spots where those variables are NULL
because of
- static declarations without initializers
- static declarations with incomplete initializers
- memory allocation with "zeroes please" flags
- memset() calls
- clock acquisition code paths were not taken


Which platform or clock provider or specific clock driver exactly
is this mysterious instance which returns NULL for a valid clock?
Is this one or the very few instances easier to adjust and thus
(re-)gain certainty about correct operation with less effort and
much less risk of missing something?


> This is not very nice.  Better solution:
> 
> 	fpi->clk_per = PTR_ERR(-EINVAL);
> 	clk = devm_clk_get(&ofdev->dev, "per");
> 	if (!IS_ERR(clk)) {
> 		err = clk_prepare_enable(clk);
> 		if (err) {
> 			ret = err;
> 			goto out_free_fpi;
> 		}
> 		fpi->clk_per = clk;
> 	}
> 
> ...
> 
>         of_node_put(fpi->phy_node);
>         if (!IS_ERR(fpi->clk_per))
>                 clk_disable_unprepare(fpi->clk_per);

Assume in the above example that the fpi->clk_per assignment
would be in a conditional code path.  In that case the code is as
wrong (unpreparing a not acquired NULL reference) as leaking an
acquired NULL reference is.


virtually yours
Gerhard Sittig
diff mbox

Patch

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 9162ac80c18f..64c163664b9d 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -593,15 +593,20 @@  EXPORT_SYMBOL_GPL(__dma_request_channel);
  */
 struct dma_chan *dma_request_slave_channel(struct device *dev, const
char *name)
 {
+       struct dma_chan *chan = ERR_PTR(-ENODEV);
+
        /* If device-tree is present get slave info from here */
        if (dev->of_node)
                return of_dma_request_slave_channel(dev->of_node, name);

        /* If device was enumerated by ACPI get slave info from here */
-       if (ACPI_HANDLE(dev))
-               return acpi_dma_request_slave_chan_by_name(dev, name);
+       if (ACPI_HANDLE(dev)) {
+               chan = acpi_dma_request_slave_chan_by_name(dev, name);
+               if (!chan)
+                       chan = ERR_PTR(-ENODEV);
+       }

-       return NULL;
+       return chan;
 }