dmaengine: virt-dma: fix completion list manipulation

Message ID 1425331196-7895-1-git-send-email-robert.jarzmik@free.fr (mailing list archive)
State Rejected

Commit Message

Robert Jarzmik March 2, 2015, 9:19 p.m. UTC
When a transfer is completed, the descriptor is moved from issued list
to completed list. Fix the list manipulation, from list_add to
list_move_tail.

The bug was seen with multiple descriptors on the issued and completed lists,
where the issued list chaining was corrupted.

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
---
 drivers/dma/virt-dma.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Lars-Peter Clausen March 2, 2015, 9:29 p.m. UTC | #1
On 03/02/2015 10:19 PM, Robert Jarzmik wrote:
> diff --git a/drivers/dma/virt-dma.h b/drivers/dma/virt-dma.h
> index 3772032..2a3da22 100644
> --- a/drivers/dma/virt-dma.h
> +++ b/drivers/dma/virt-dma.h
> @@ -91,7 +91,7 @@ static inline void vchan_cookie_complete(struct virt_dma_desc *vd)
>   	dma_cookie_complete(&vd->tx);
>   	dev_vdbg(vc->chan.device->dev, "txd %p[%x]: marked complete\n",
>   		 vd, cookie);
> -	list_add_tail(&vd->node, &vc->desc_completed);
> +	list_move_tail(&vd->node, &vc->desc_completed);

That will break all drivers which handle this currently correctly and remove 
the descriptor from any list before calling vchan_cookie_complete.

--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Robert Jarzmik March 2, 2015, 10:03 p.m. UTC | #2
Lars-Peter Clausen <lars@metafoo.de> writes:

> On 03/02/2015 10:19 PM, Robert Jarzmik wrote:
>> diff --git a/drivers/dma/virt-dma.h b/drivers/dma/virt-dma.h
>> index 3772032..2a3da22 100644
>> --- a/drivers/dma/virt-dma.h
>> +++ b/drivers/dma/virt-dma.h
>> @@ -91,7 +91,7 @@ static inline void vchan_cookie_complete(struct virt_dma_desc *vd)
>>   	dma_cookie_complete(&vd->tx);
>>   	dev_vdbg(vc->chan.device->dev, "txd %p[%x]: marked complete\n",
>>   		 vd, cookie);
>> -	list_add_tail(&vd->node, &vc->desc_completed);
>> +	list_move_tail(&vd->node, &vc->desc_completed);
>
> That will break all drivers which handle this currently correctly and remove the
> descriptor from any list before calling vchan_cookie_complete.
Ah, well, I don't agree.

First, let's split the drivers into those which remove the descriptors and
those which don't:

Those which remove the descriptor:
dma-jz4740.c
fsl-edma.c

Those which don't remove the descriptor:
amba-pl08x.c
edma.c
img-mdc-dma.c
k3dma.c
moxart-dma.c
omap-dma.c
qcom_bam_dma.c
s3c24xx-dma.c
sa11x0-dma.c
sun6i-dma.c

That settles the correctness question, I think: the correct behavior is to not
remove the descriptor and to let vchan_cookie_complete() do it.

Now for the remaining 2 drivers, we'll have :
 - list_del(&vd->node) => vd becomes a singleton
 - list_move_tail(&vd->node, &...desc_completed)
   => list_del(&vd->node) : nothing changes, it's a nop
   => list_add_tail(&vd->node, &...desc_completed)

And the behavior remains correct after the patch, only one "list_del()" is done
twice for nothing. Where do you see any breakage ?

Cheers.
Lars-Peter Clausen March 3, 2015, 7:23 a.m. UTC | #3
On 03/02/2015 11:03 PM, Robert Jarzmik wrote:
> Lars-Peter Clausen <lars@metafoo.de> writes:
>
>> On 03/02/2015 10:19 PM, Robert Jarzmik wrote:
>>> diff --git a/drivers/dma/virt-dma.h b/drivers/dma/virt-dma.h
>>> index 3772032..2a3da22 100644
>>> --- a/drivers/dma/virt-dma.h
>>> +++ b/drivers/dma/virt-dma.h
>>> @@ -91,7 +91,7 @@ static inline void vchan_cookie_complete(struct virt_dma_desc *vd)
>>>    	dma_cookie_complete(&vd->tx);
>>>    	dev_vdbg(vc->chan.device->dev, "txd %p[%x]: marked complete\n",
>>>    		 vd, cookie);
>>> -	list_add_tail(&vd->node, &vc->desc_completed);
>>> +	list_move_tail(&vd->node, &vc->desc_completed);
>>
>> That will break all drivers which handle this currently correctly and remove the
>> descriptor from any list before calling vchan_cookie_complete.
> Ah, well well I don't agree.
>
> First, let's split the drivers which remove the descriptors and these which
> don't :
>
> These which remove the descriptor:
> dma-jz4740.c
> fsl-edma.c
>
> These which don't remove the descriptor:
> amba-pl08x.c
> edma.c
> img-mdc-dma.c
> k3dma.c
> moxart-dma.c
> omap-dma.c
> qcom_bam_dma.c
> s3c24xx-dma.c
> sa11x0-dma.c
> sun6i-dma.c

All of those remove the descriptor from the list when they start the transfer.

>
> That settles the correctness I think, the correct behavior is to not remove the
> descriptor and let it be done by vchan_cookie_complete().
>
> Now for the remaining 2 drivers, we'll have :
>   - list_del(&vd->node) => vd becomes a singleton
>   - list_move_tail(&vd->node, &...desc_completed)
>     => list_del(&vd->node) : nothing changes, it's a nop
>     => list_add_tail(&vd->node, &...desc_completed)
>
> And the behavior remains correct after the patch, only one "list_del()" is done
> twice for nothing. Where do you see any breakage ?

Calling list_del() on a list item that is not on a list causes undefined 
behavior, which can result in memory corruption, segfaults, etc...
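
The point can be illustrated with a minimal userspace sketch. It is not the
kernel's code: struct list_head and the helpers below are simplified
re-implementations, and POISON1/POISON2 are illustrative stand-ins for the
kernel's LIST_POISON1/LIST_POISON2 values. It only inspects the node's
pointers after removal and never dereferences them. After a plain list_del()
the node does not point to itself, so a second unlink (which is what
list_move_tail() starts with) would write through the poison pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Illustrative stand-ins for the kernel's poison values (assumption). */
#define POISON1 ((struct list_head *)0x100)
#define POISON2 ((struct list_head *)0x200)

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink and poison: the node must not be unlinked again afterwards. */
static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = POISON1;
	e->prev = POISON2;
}

/* Unlink and re-initialise: the node becomes a valid empty list. */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

/* True when unlinking e again would only touch e itself. */
static bool safe_to_unlink_again(const struct list_head *e)
{
	return e->next == e && e->prev == e;
}

static bool demo_after_list_del(void)
{
	struct list_head issued, node;

	init_list_head(&issued);
	list_add_tail(&node, &issued);
	assert(issued.next == &node);
	list_del(&node);
	return safe_to_unlink_again(&node);
}

static bool demo_after_list_del_init(void)
{
	struct list_head issued, node;

	init_list_head(&issued);
	list_add_tail(&node, &issued);
	assert(issued.next == &node);
	list_del_init(&node);
	return safe_to_unlink_again(&node);
}
```

In this sketch only the list_del_init() variant leaves the node in a state
where a later unlink is harmless.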

Robert Jarzmik March 3, 2015, 10:27 a.m. UTC | #4
Lars-Peter Clausen <lars@metafoo.de> writes:

>>> That will break all drivers which handle this currently correctly and remove the
>>> descriptor from any list before calling vchan_cookie_complete.
>> Ah, well well I don't agree.
>>
>> First, let's split the drivers which remove the descriptors and these which
>> don't :
>>
>> These which remove the descriptor:
>> dma-jz4740.c
>> fsl-edma.c
>>
>> These which don't remove the descriptor:
>> amba-pl08x.c
>> edma.c
>> img-mdc-dma.c
>> k3dma.c
>> moxart-dma.c
>> omap-dma.c
>> qcom_bam_dma.c
>> s3c24xx-dma.c
>> sa11x0-dma.c
>> sun6i-dma.c
>
> All of those remove the descriptor from the list when they start the transfer.
Ah, I didn't see that.
Isn't the descriptor lost if terminate_all() is called and relies on
vchan_get_all_descriptors() and vchan_dma_desc_free_list()?

>>
>> That settles the correctness I think, the correct behavior is to not remove the
>> descriptor and let it be done by vchan_cookie_complete().
>>
>> Now for the remaining 2 drivers, we'll have :
>>   - list_del(&vd->node) => vd becomes a singleton
>>   - list_move_tail(&vd->node, &...desc_completed)
>>     => list_del(&vd->node) : nothing changes, it's a nop
>>     => list_add_tail(&vd->node, &...desc_completed)
>>
>> And the behavior remains correct after the patch, only one "list_del()" is done
>> twice for nothing. Where do you see any breakage ?
>
> Calling list_del() on a list item that is not on a list causes undefined
> behavior, which can result in memory corruption, segfaults, etc...
Ah yes, you must be thinking about the "poisoning" after the __list_del() call,
I forgot about that.

Do you think amending all these drivers and changing their list_del() into
list_del_init() would at least prevent the "undefined behavior" ?

I still think that their use of virt-dma is incorrect, i.e. that at any point
in time a virtual descriptor has to be on exactly one virt-dma list (except in
transient critical sections).

Cheers.
Lars-Peter Clausen March 3, 2015, 11:21 a.m. UTC | #5
On 03/03/2015 11:27 AM, Robert Jarzmik wrote:
> Lars-Peter Clausen <lars@metafoo.de> writes:
>
>>>> That will break all drivers which handle this currently correctly and remove the
>>>> descriptor from any list before calling vchan_cookie_complete.
>>> Ah, well well I don't agree.
>>>
>>> First, let's split the drivers which remove the descriptors and these which
>>> don't :
>>>
>>> These which remove the descriptor:
>>> dma-jz4740.c
>>> fsl-edma.c
>>>
>>> These which don't remove the descriptor:
>>> amba-pl08x.c
>>> edma.c
>>> img-mdc-dma.c
>>> k3dma.c
>>> moxart-dma.c
>>> omap-dma.c
>>> qcom_bam_dma.c
>>> s3c24xx-dma.c
>>> sa11x0-dma.c
>>> sun6i-dma.c
>>
>> All of those remove the descriptor from the list when they start the transfer.
> Ah, I didn't see that.
> Isn't the descriptor lost if a terminate_all(), relying on
> vchan_get_all_descriptors() and vchan_dma_desc_free_list() is used ?

Yes, if the driver doesn't manually add it to the list of descriptors to be
freed.

>
>>>
>>> That settles the correctness I think, the correct behavior is to not remove the
>>> descriptor and let it be done by vchan_cookie_complete().
>>>
>>> Now for the remaining 2 drivers, we'll have :
>>>    - list_del(&vd->node) => vd becomes a singleton
>>>    - list_move_tail(&vd->node, &...desc_completed)
>>>      => list_del(&vd->node) : nothing changes, it's a nop
>>>      => list_add_tail(&vd->node, &...desc_completed)
>>>
>>> And the behavior remains correct after the patch, only one "list_del()" is done
>>> twice for nothing. Where do you see any breakage ?
>>
>> Calling list_del() on a list item that is not on a list causes undefined
>> behavior, which can result in memory corruption, segfaults, etc...
> Ah yes, you must be thinking about the "poisoning" after the __list_del() call,
> I forgot about that.
>
> Do you think amending all these drivers and changing their list_del() into
> list_del_init() would at least prevent the "undefined behavior" ?

Yes.
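
As a rough userspace sketch of why that combination would work (simplified
re-implementations of the list helpers, not the kernel's code; the
demo_result struct and the demo function are hypothetical names used only
here): once the driver uses list_del_init(), the unlink performed inside
list_move_tail() operates on a self-pointing node and changes nothing, and
the node then lands on the completed list.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Detach e from its neighbours without touching e's own pointers. */
static void unlink_entry(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

static void list_del_init(struct list_head *e)
{
	unlink_entry(e);
	init_list_head(e);
}

static void list_move_tail(struct list_head *e, struct list_head *head)
{
	unlink_entry(e);
	list_add_tail(e, head);
}

static size_t list_len(const struct list_head *head)
{
	size_t n = 0;

	for (const struct list_head *p = head->next; p != head; p = p->next)
		n++;
	return n;
}

struct demo_result { size_t issued; size_t completed; };

static struct demo_result demo_del_init_then_move(void)
{
	struct list_head issued, completed, a, b;
	struct demo_result r;

	init_list_head(&issued);
	init_list_head(&completed);
	list_add_tail(&a, &issued);
	list_add_tail(&b, &issued);

	list_del_init(&a);              /* driver: taken off the list at start */
	assert(a.next == &a && a.prev == &a);
	list_move_tail(&a, &completed); /* completion with the proposed patch */

	r.issued = list_len(&issued);
	r.completed = list_len(&completed);
	return r;
}
```

Here the issued list still holds b and the completed list holds a, so
neither list is corrupted.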

>
> I still think that their use of virt-dma is incorrect, ie. that at one point in
> time a virtual descriptor has to be on exactly one list of virt-dma (excepting
> transient critical sections).

Well, the drivers conform to the current expected behavior. It might be worth
changing that, but you need to modify all the drivers to conform to the new
semantics, rather than just changing the API.

Requiring that the descriptor is always on one of the virt-dma lists is too
restrictive. Some DMA controllers are able to submit multiple descriptors at
the same time; these typically have a separate list to manage the active
descriptors.
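
A userspace sketch of that pattern, under stated assumptions: the list
helpers are simplified re-implementations, and struct chan, its hw_active
list, start_transfer() and complete_transfer() are hypothetical names, not
taken from any driver. The descriptor leaves desc_issued when the hardware
takes it, sits on a driver-private list while active, and is removed from
that list by the driver before being appended to desc_completed with
list_add_tail(), as vchan_cookie_complete() does today.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Illustrative stand-ins for the kernel's poison values (assumption). */
#define POISON1 ((struct list_head *)0x100)
#define POISON2 ((struct list_head *)0x200)

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = POISON1;
	e->prev = POISON2;
}

static int list_len(const struct list_head *head)
{
	int n = 0;

	for (const struct list_head *p = head->next; p != head; p = p->next)
		n++;
	return n;
}

/* hw_active is a hypothetical driver-private list, not part of virt-dma. */
struct chan { struct list_head desc_issued, desc_completed, hw_active; };

static void start_transfer(struct chan *c, struct list_head *vd)
{
	list_del(vd);                      /* leaves desc_issued */
	list_add_tail(vd, &c->hw_active);  /* tracked by the driver */
}

static void complete_transfer(struct chan *c, struct list_head *vd)
{
	list_del(vd);                          /* driver unlinks it first */
	list_add_tail(vd, &c->desc_completed); /* current completion path */
}

struct counts { int issued, active, completed; };

static struct counts demo_multi_active(void)
{
	struct chan c;
	struct list_head a, b, d;
	struct counts r;

	init_list_head(&c.desc_issued);
	init_list_head(&c.desc_completed);
	init_list_head(&c.hw_active);
	list_add_tail(&a, &c.desc_issued);
	list_add_tail(&b, &c.desc_issued);
	list_add_tail(&d, &c.desc_issued);

	start_transfer(&c, &a);   /* two descriptors active at once */
	start_transfer(&c, &b);
	assert(list_len(&c.hw_active) == 2);
	complete_transfer(&c, &a);

	r.issued = list_len(&c.desc_issued);
	r.active = list_len(&c.hw_active);
	r.completed = list_len(&c.desc_completed);
	return r;
}
```

With this flow the node is always unlinked exactly once before each add, so
list_add_tail() in the completion path is sufficient.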

- Lars

Robert Jarzmik March 3, 2015, 11:55 a.m. UTC | #6
Lars-Peter Clausen <lars@metafoo.de> writes:

> On 03/03/2015 11:27 AM, Robert Jarzmik wrote:
>> I still think that their use of virt-dma is incorrect, ie. that at one point in
>> time a virtual descriptor has to be on exactly one list of virt-dma (excepting
>> transient critical sections).
>
> Well the drivers conform to the current expected behavior. It might be worth
> changing that, but you need to modify all the driver to conform to the new
> semantics, rather than just changing the API.
>
> Requiring that the descriptor is always on one of the virt-dma lists is too
> restrictive. Some DMA controllers are able to submit multiple descriptors at the
> same time; these typically have a separate list to manage the active
> descriptors.

Ah, so your understanding of the virt-dma API is that virtual descriptors may
be off the virt-dma linked lists. In that case I must think this over again,
as I assumed the "multiple active descriptors" case relied on all these
descriptors staying on the desc_issued list, with the "multiple simultaneous
active" list kept not in the virtual descriptor but in the dmaengine driver's
own one.

Anyway, I'll drop this patch for now, as nobody else seems to care about
virt-dma semantics, and I won't modify the API if the pros and cons I see do
look equivalent (which is the case right now).

Cheers.

Patch

diff --git a/drivers/dma/virt-dma.h b/drivers/dma/virt-dma.h
index 3772032..2a3da22 100644
--- a/drivers/dma/virt-dma.h
+++ b/drivers/dma/virt-dma.h
@@ -91,7 +91,7 @@  static inline void vchan_cookie_complete(struct virt_dma_desc *vd)
 	dma_cookie_complete(&vd->tx);
 	dev_vdbg(vc->chan.device->dev, "txd %p[%x]: marked complete\n",
 		 vd, cookie);
-	list_add_tail(&vd->node, &vc->desc_completed);
+	list_move_tail(&vd->node, &vc->desc_completed);
 
 	tasklet_schedule(&vc->task);
 }