diff mbox

net: mediatek: use dma_zalloc_coherent instead of allocator/memset

Message ID 20180719140955.19444-1-yuehaibing@huawei.com (mailing list archive)
State New, archived
Headers show

Commit Message

Yue Haibing July 19, 2018, 2:09 p.m. UTC
Use dma_zalloc_coherent instead of dma_alloc_coherent
followed by memset 0.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

Comments

Russell King (Oracle) July 19, 2018, 2:17 p.m. UTC | #1
On Thu, Jul 19, 2018 at 10:09:55PM +0800, YueHaibing wrote:
> Use dma_zalloc_coherent instead of dma_alloc_coherent
> followed by memset 0.
> 
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index d8ebf0a..fbdb3e3 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -1221,14 +1221,11 @@ static int mtk_tx_alloc(struct mtk_eth *eth)
>  	if (!ring->buf)
>  		goto no_tx_mem;
>  
> -	ring->dma = dma_alloc_coherent(eth->dev,
> -					  MTK_DMA_SIZE * sz,
> -					  &ring->phys,
> -					  GFP_ATOMIC | __GFP_ZERO);
> +	ring->dma = dma_zalloc_coherent(eth->dev, MTK_DMA_SIZE * sz,
> +					&ring->phys, GFP_ATOMIC | __GFP_ZERO);
>  	if (!ring->dma)
>  		goto no_tx_mem;
>  
> -	memset(ring->dma, 0, MTK_DMA_SIZE * sz);

I have to wonder whether this code needs two forms of zeroing... in
the original code, __GFP_ZERO _and_ a call to memset() just in case
__GFP_ZERO failed to do its job, and in the replacement code, just
in case dma_zalloc_coherent() hasn't got the idea...

I think you can drop the __GFP_ZERO. ;)
Sean Wang July 19, 2018, 5:02 p.m. UTC | #2
On Thu, 2018-07-19 at 15:17 +0100, Russell King - ARM Linux wrote:
> On Thu, Jul 19, 2018 at 10:09:55PM +0800, YueHaibing wrote:
> > Use dma_zalloc_coherent instead of dma_alloc_coherent
> > followed by memset 0.
> > 
> > Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> > ---
> >  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 7 ++-----
> >  1 file changed, 2 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> > index d8ebf0a..fbdb3e3 100644
> > --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> > +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> > @@ -1221,14 +1221,11 @@ static int mtk_tx_alloc(struct mtk_eth *eth)
> >  	if (!ring->buf)
> >  		goto no_tx_mem;
> >  
> > -	ring->dma = dma_alloc_coherent(eth->dev,
> > -					  MTK_DMA_SIZE * sz,
> > -					  &ring->phys,
> > -					  GFP_ATOMIC | __GFP_ZERO);
> > +	ring->dma = dma_zalloc_coherent(eth->dev, MTK_DMA_SIZE * sz,
> > +					&ring->phys, GFP_ATOMIC | __GFP_ZERO);
> >  	if (!ring->dma)
> >  		goto no_tx_mem;
> >  
> > -	memset(ring->dma, 0, MTK_DMA_SIZE * sz);
> 
> I have to wonder whether this code needs two forms of zeroing... in
> the original code, __GFP_ZERO _and_ a call to memset() just in case
> __GFP_ZERO failed to do its job, and in the replacement code, just
> in case dma_zalloc_coherent() hasn't got the idea...
> 
> I think you can drop the __GFP_ZERO. ;)
> 

Just now I did an experiment on 4.14.56 on armv7. I found that
dma_zalloc_coherent does not guarantee that the buffer we get
is all filled with 0.


I really think it's a little bit weird OR what was I missing something
for enabling dma_zalloc_coherent ? The result seems to tell that we
can't remove freely the memset with 0 at this moment until we get a
cause.


my test code is

         ring->dma = dma_zalloc_coherent(eth->dev,
                                          MTK_DMA_SIZE * sz,
                                          &ring->phys,
                                          GFP_ATOMIC);
        if (!ring->dma)
                goto no_tx_mem;

        print_hex_dump(KERN_INFO, "mtk_tx_alloc:",
                       DUMP_PREFIX_OFFSET, 16, 1,
                       ring->dma, MTK_DMA_SIZE * sz, false);

        memset(ring->dma, 0, MTK_DMA_SIZE * sz);

        print_hex_dump(KERN_INFO, "mtk_tx_alloc2:",
                       DUMP_PREFIX_OFFSET, 16, 1,
                       ring->dma, MTK_DMA_SIZE * sz, false);


and the output 

...

[  259.610413] mtk_tx_alloc:00000f40: 00 00 00 00 50 1f 00 bc 00 00 00 c0 00 00 00 00
[  259.617934] mtk_tx_alloc:00000f50: 00 00 00 00 60 1f 00 bc 00 00 00 c0 00 00 00 00
[  259.625470] mtk_tx_alloc:00000f60: 00 00 00 00 70 1f 00 bc 00 00 00 c0 00 00 00 00
[  259.633005] mtk_tx_alloc:00000f70: 00 00 00 00 80 1f 00 bc 00 00 00 c0 00 00 00 00
[  259.640539] mtk_tx_alloc:00000f80: 00 00 00 00 90 1f 00 bc 00 00 00 c0 00 00 00 00
[  259.648073] mtk_tx_alloc:00000f90: 00 00 00 00 a0 1f 00 bc 00 00 00 c0 00 00 00 00
[  259.655590] mtk_tx_alloc:00000fa0: 00 00 00 00 b0 1f 00 bc 00 00 00 c0 00 00 00 00
[  259.663124] mtk_tx_alloc:00000fb0: 00 00 00 00 c0 1f 00 bc 00 00 00 c0 00 00 00 00
[  259.670660] mtk_tx_alloc:00000fc0: 00 00 00 00 d0 1f 00 bc 00 00 00 c0 00 00 00 00
[  259.678196] mtk_tx_alloc:00000fd0: 00 00 00 00 e0 1f 00 bc 00 00 00 c0 00 00 00 00
[  259.685713] mtk_tx_alloc:00000fe0: 00 00 00 00 f0 1f 00 bc 00 00 00 c0 00 00 00 00
[  259.693247] mtk_tx_alloc:00000ff0: 02 c0 90 ad 00 10 00 bc 00 40 5a c0 00 00 00 02
[  259.700782] mtk_tx_alloc2:00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  259.708405] mtk_tx_alloc2:00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  259.716013] mtk_tx_alloc2:00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  259.723634] mtk_tx_alloc2:00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  259.731253] mtk_tx_alloc2:00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  259.738875] mtk_tx_alloc2:00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  259.746481] mtk_tx_alloc2:00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  259.754103] mtk_tx_alloc2:00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  259.761723] mtk_tx_alloc2:00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  259.769344] mtk_tx_alloc2:00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  259.776951] mtk_tx_alloc2:000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

....


	Sean
Yue Haibing July 20, 2018, 6:30 a.m. UTC | #3
On 2018/7/20 1:02, Sean Wang wrote:
> On Thu, 2018-07-19 at 15:17 +0100, Russell King - ARM Linux wrote:
>> On Thu, Jul 19, 2018 at 10:09:55PM +0800, YueHaibing wrote:
>>> Use dma_zalloc_coherent instead of dma_alloc_coherent
>>> followed by memset 0.
>>>
>>> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
>>> ---
>>>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 7 ++-----
>>>  1 file changed, 2 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
>>> index d8ebf0a..fbdb3e3 100644
>>> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
>>> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
>>> @@ -1221,14 +1221,11 @@ static int mtk_tx_alloc(struct mtk_eth *eth)
>>>  	if (!ring->buf)
>>>  		goto no_tx_mem;
>>>  
>>> -	ring->dma = dma_alloc_coherent(eth->dev,
>>> -					  MTK_DMA_SIZE * sz,
>>> -					  &ring->phys,
>>> -					  GFP_ATOMIC | __GFP_ZERO);
>>> +	ring->dma = dma_zalloc_coherent(eth->dev, MTK_DMA_SIZE * sz,
>>> +					&ring->phys, GFP_ATOMIC | __GFP_ZERO);
>>>  	if (!ring->dma)
>>>  		goto no_tx_mem;
>>>  
>>> -	memset(ring->dma, 0, MTK_DMA_SIZE * sz);
>>
>> I have to wonder whether this code needs two forms of zeroing... in
>> the original code, __GFP_ZERO _and_ a call to memset() just in case
>> __GFP_ZERO failed to do its job, and in the replacement code, just
>> in case dma_zalloc_coherent() hasn't got the idea...
>>
>> I think you can drop the __GFP_ZERO. ;)
>>
> 
> Just now I did an experiment on 4.14.56 on armv7. I found that
> dma_zalloc_coherent does not guarantee that the buffer we get
> is all filled with 0.
> 
> 
> I really think it's a little bit weird OR what was I missing something
> for enabling dma_zalloc_coherent ? The result seems to tell that we
> can't remove freely the memset with 0 at this moment until we get a
> cause.
> 

That means dma_zalloc_coherent doesn't work as expect on armv7?

> 
> my test code is
> 
>          ring->dma = dma_zalloc_coherent(eth->dev,
>                                           MTK_DMA_SIZE * sz,
>                                           &ring->phys,
>                                           GFP_ATOMIC);
>         if (!ring->dma)
>                 goto no_tx_mem;
> 
>         print_hex_dump(KERN_INFO, "mtk_tx_alloc:",
>                        DUMP_PREFIX_OFFSET, 16, 1,
>                        ring->dma, MTK_DMA_SIZE * sz, false);
> 
>         memset(ring->dma, 0, MTK_DMA_SIZE * sz);
> 
>         print_hex_dump(KERN_INFO, "mtk_tx_alloc2:",
>                        DUMP_PREFIX_OFFSET, 16, 1,
>                        ring->dma, MTK_DMA_SIZE * sz, false);
> 
> 
> and the output 
> 
> ...
> 
> [  259.610413] mtk_tx_alloc:00000f40: 00 00 00 00 50 1f 00 bc 00 00 00 c0 00 00 00 00
> [  259.617934] mtk_tx_alloc:00000f50: 00 00 00 00 60 1f 00 bc 00 00 00 c0 00 00 00 00
> [  259.625470] mtk_tx_alloc:00000f60: 00 00 00 00 70 1f 00 bc 00 00 00 c0 00 00 00 00
> [  259.633005] mtk_tx_alloc:00000f70: 00 00 00 00 80 1f 00 bc 00 00 00 c0 00 00 00 00
> [  259.640539] mtk_tx_alloc:00000f80: 00 00 00 00 90 1f 00 bc 00 00 00 c0 00 00 00 00
> [  259.648073] mtk_tx_alloc:00000f90: 00 00 00 00 a0 1f 00 bc 00 00 00 c0 00 00 00 00
> [  259.655590] mtk_tx_alloc:00000fa0: 00 00 00 00 b0 1f 00 bc 00 00 00 c0 00 00 00 00
> [  259.663124] mtk_tx_alloc:00000fb0: 00 00 00 00 c0 1f 00 bc 00 00 00 c0 00 00 00 00
> [  259.670660] mtk_tx_alloc:00000fc0: 00 00 00 00 d0 1f 00 bc 00 00 00 c0 00 00 00 00
> [  259.678196] mtk_tx_alloc:00000fd0: 00 00 00 00 e0 1f 00 bc 00 00 00 c0 00 00 00 00
> [  259.685713] mtk_tx_alloc:00000fe0: 00 00 00 00 f0 1f 00 bc 00 00 00 c0 00 00 00 00
> [  259.693247] mtk_tx_alloc:00000ff0: 02 c0 90 ad 00 10 00 bc 00 40 5a c0 00 00 00 02
> [  259.700782] mtk_tx_alloc2:00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  259.708405] mtk_tx_alloc2:00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  259.716013] mtk_tx_alloc2:00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  259.723634] mtk_tx_alloc2:00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  259.731253] mtk_tx_alloc2:00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  259.738875] mtk_tx_alloc2:00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  259.746481] mtk_tx_alloc2:00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  259.754103] mtk_tx_alloc2:00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  259.761723] mtk_tx_alloc2:00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  259.769344] mtk_tx_alloc2:00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  259.776951] mtk_tx_alloc2:000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 
> ....
> 
> 
> 	Sean
> 
> 
> 
> .
>
Sean Wang July 20, 2018, 6:54 a.m. UTC | #4
On Fri, 2018-07-20 at 14:30 +0800, YueHaibing wrote:
> On 2018/7/20 1:02, Sean Wang wrote:
> > On Thu, 2018-07-19 at 15:17 +0100, Russell King - ARM Linux wrote:
> >> On Thu, Jul 19, 2018 at 10:09:55PM +0800, YueHaibing wrote:
> >>> Use dma_zalloc_coherent instead of dma_alloc_coherent
> >>> followed by memset 0.
> >>>
> >>> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> >>> ---
> >>>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 7 ++-----
> >>>  1 file changed, 2 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> >>> index d8ebf0a..fbdb3e3 100644
> >>> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> >>> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> >>> @@ -1221,14 +1221,11 @@ static int mtk_tx_alloc(struct mtk_eth *eth)
> >>>  	if (!ring->buf)
> >>>  		goto no_tx_mem;
> >>>  
> >>> -	ring->dma = dma_alloc_coherent(eth->dev,
> >>> -					  MTK_DMA_SIZE * sz,
> >>> -					  &ring->phys,
> >>> -					  GFP_ATOMIC | __GFP_ZERO);
> >>> +	ring->dma = dma_zalloc_coherent(eth->dev, MTK_DMA_SIZE * sz,
> >>> +					&ring->phys, GFP_ATOMIC | __GFP_ZERO);
> >>>  	if (!ring->dma)
> >>>  		goto no_tx_mem;
> >>>  
> >>> -	memset(ring->dma, 0, MTK_DMA_SIZE * sz);
> >>
> >> I have to wonder whether this code needs two forms of zeroing... in
> >> the original code, __GFP_ZERO _and_ a call to memset() just in case
> >> __GFP_ZERO failed to do its job, and in the replacement code, just
> >> in case dma_zalloc_coherent() hasn't got the idea...
> >>
> >> I think you can drop the __GFP_ZERO. ;)
> >>
> > 
> > Just now I did an experiment on 4.14.56 on armv7. I found that
> > dma_zalloc_coherent does not guarantee that the buffer we get
> > is all filled with 0.
> > 
> > 
> > I really think it's a little bit weird OR what was I missing something
> > for enabling dma_zalloc_coherent ? The result seems to tell that we
> > can't remove freely the memset with 0 at this moment until we get a
> > cause.
> > 
> 
> That means dma_zalloc_coherent doesn't work as expect on armv7?
> 

I'm not sure if it's true for every armv7. or it's only happening on my
device.

anyway, i think we can replace all occurrences in the driver for
dma_alloc_coherent with __GFP_ZERO by dma_zalloc_coherent, and but
keep the extra memset as is.
	
	Sean

> > 
> > my test code is
> > 
> >          ring->dma = dma_zalloc_coherent(eth->dev,
> >                                           MTK_DMA_SIZE * sz,
> >                                           &ring->phys,
> >                                           GFP_ATOMIC);
> >         if (!ring->dma)
> >                 goto no_tx_mem;
> > 
> >         print_hex_dump(KERN_INFO, "mtk_tx_alloc:",
> >                        DUMP_PREFIX_OFFSET, 16, 1,
> >                        ring->dma, MTK_DMA_SIZE * sz, false);
> > 
> >         memset(ring->dma, 0, MTK_DMA_SIZE * sz);
> > 
> >         print_hex_dump(KERN_INFO, "mtk_tx_alloc2:",
> >                        DUMP_PREFIX_OFFSET, 16, 1,
> >                        ring->dma, MTK_DMA_SIZE * sz, false);
> > 
> > 
> > and the output 
> > 
> > ...
> > 
> > [  259.610413] mtk_tx_alloc:00000f40: 00 00 00 00 50 1f 00 bc 00 00 00 c0 00 00 00 00
> > [  259.617934] mtk_tx_alloc:00000f50: 00 00 00 00 60 1f 00 bc 00 00 00 c0 00 00 00 00
> > [  259.625470] mtk_tx_alloc:00000f60: 00 00 00 00 70 1f 00 bc 00 00 00 c0 00 00 00 00
> > [  259.633005] mtk_tx_alloc:00000f70: 00 00 00 00 80 1f 00 bc 00 00 00 c0 00 00 00 00
> > [  259.640539] mtk_tx_alloc:00000f80: 00 00 00 00 90 1f 00 bc 00 00 00 c0 00 00 00 00
> > [  259.648073] mtk_tx_alloc:00000f90: 00 00 00 00 a0 1f 00 bc 00 00 00 c0 00 00 00 00
> > [  259.655590] mtk_tx_alloc:00000fa0: 00 00 00 00 b0 1f 00 bc 00 00 00 c0 00 00 00 00
> > [  259.663124] mtk_tx_alloc:00000fb0: 00 00 00 00 c0 1f 00 bc 00 00 00 c0 00 00 00 00
> > [  259.670660] mtk_tx_alloc:00000fc0: 00 00 00 00 d0 1f 00 bc 00 00 00 c0 00 00 00 00
> > [  259.678196] mtk_tx_alloc:00000fd0: 00 00 00 00 e0 1f 00 bc 00 00 00 c0 00 00 00 00
> > [  259.685713] mtk_tx_alloc:00000fe0: 00 00 00 00 f0 1f 00 bc 00 00 00 c0 00 00 00 00
> > [  259.693247] mtk_tx_alloc:00000ff0: 02 c0 90 ad 00 10 00 bc 00 40 5a c0 00 00 00 02
> > [  259.700782] mtk_tx_alloc2:00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [  259.708405] mtk_tx_alloc2:00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [  259.716013] mtk_tx_alloc2:00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [  259.723634] mtk_tx_alloc2:00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [  259.731253] mtk_tx_alloc2:00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [  259.738875] mtk_tx_alloc2:00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [  259.746481] mtk_tx_alloc2:00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [  259.754103] mtk_tx_alloc2:00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [  259.761723] mtk_tx_alloc2:00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [  259.769344] mtk_tx_alloc2:00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [  259.776951] mtk_tx_alloc2:000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > 
> > ....
> > 
> > 
> > 	Sean
> > 
> > 
> > 
> > .
> > 
>
Russell King (Oracle) July 20, 2018, 8:13 a.m. UTC | #5
On Fri, Jul 20, 2018 at 02:30:53PM +0800, YueHaibing wrote:
> On 2018/7/20 1:02, Sean Wang wrote:
> > On Thu, 2018-07-19 at 15:17 +0100, Russell King - ARM Linux wrote:
> >> On Thu, Jul 19, 2018 at 10:09:55PM +0800, YueHaibing wrote:
> >>> Use dma_zalloc_coherent instead of dma_alloc_coherent
> >>> followed by memset 0.
> >>>
> >>> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> >>> ---
> >>>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 7 ++-----
> >>>  1 file changed, 2 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> >>> index d8ebf0a..fbdb3e3 100644
> >>> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> >>> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> >>> @@ -1221,14 +1221,11 @@ static int mtk_tx_alloc(struct mtk_eth *eth)
> >>>  	if (!ring->buf)
> >>>  		goto no_tx_mem;
> >>>  
> >>> -	ring->dma = dma_alloc_coherent(eth->dev,
> >>> -					  MTK_DMA_SIZE * sz,
> >>> -					  &ring->phys,
> >>> -					  GFP_ATOMIC | __GFP_ZERO);
> >>> +	ring->dma = dma_zalloc_coherent(eth->dev, MTK_DMA_SIZE * sz,
> >>> +					&ring->phys, GFP_ATOMIC | __GFP_ZERO);
> >>>  	if (!ring->dma)
> >>>  		goto no_tx_mem;
> >>>  
> >>> -	memset(ring->dma, 0, MTK_DMA_SIZE * sz);
> >>
> >> I have to wonder whether this code needs two forms of zeroing... in
> >> the original code, __GFP_ZERO _and_ a call to memset() just in case
> >> __GFP_ZERO failed to do its job, and in the replacement code, just
> >> in case dma_zalloc_coherent() hasn't got the idea...
> >>
> >> I think you can drop the __GFP_ZERO. ;)
> >>
> > 
> > Just now I did an experiment on 4.14.56 on armv7. I found that
> > dma_zalloc_coherent does not guarantee that the buffer we get
> > is all filled with 0.
> > 
> > 
> > I really think it's a little bit weird OR what was I missing something
> > for enabling dma_zalloc_coherent ? The result seems to tell that we
> > can't remove freely the memset with 0 at this moment until we get a
> > cause.
> > 
> 
> That means dma_zalloc_coherent doesn't work as expect on armv7?

Can someone work out which underlying allocator is being used - the
possibilities are:

- dma_alloc_from_dev_coherent
- cma
- simple
- remap
- pool

Looking at the code, I'd guess it's the pool allocator, as I don't see
anything which zeros memory there, and it doesn't honor the __GFP_ZERO
flag.  This is definitely an allocator bug.
Russell King (Oracle) July 20, 2018, 9:25 a.m. UTC | #6
On Fri, Jul 20, 2018 at 02:54:23PM +0800, Sean Wang wrote:
> On Fri, 2018-07-20 at 14:30 +0800, YueHaibing wrote:
> > On 2018/7/20 1:02, Sean Wang wrote:
> > > On Thu, 2018-07-19 at 15:17 +0100, Russell King - ARM Linux wrote:
> > >> On Thu, Jul 19, 2018 at 10:09:55PM +0800, YueHaibing wrote:
> > >>> Use dma_zalloc_coherent instead of dma_alloc_coherent
> > >>> followed by memset 0.
> > >>>
> > >>> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> > >>> ---
> > >>>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 7 ++-----
> > >>>  1 file changed, 2 insertions(+), 5 deletions(-)
> > >>>
> > >>> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> > >>> index d8ebf0a..fbdb3e3 100644
> > >>> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> > >>> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> > >>> @@ -1221,14 +1221,11 @@ static int mtk_tx_alloc(struct mtk_eth *eth)
> > >>>  	if (!ring->buf)
> > >>>  		goto no_tx_mem;
> > >>>  
> > >>> -	ring->dma = dma_alloc_coherent(eth->dev,
> > >>> -					  MTK_DMA_SIZE * sz,
> > >>> -					  &ring->phys,
> > >>> -					  GFP_ATOMIC | __GFP_ZERO);
> > >>> +	ring->dma = dma_zalloc_coherent(eth->dev, MTK_DMA_SIZE * sz,
> > >>> +					&ring->phys, GFP_ATOMIC | __GFP_ZERO);
> > >>>  	if (!ring->dma)
> > >>>  		goto no_tx_mem;
> > >>>  
> > >>> -	memset(ring->dma, 0, MTK_DMA_SIZE * sz);
> > >>
> > >> I have to wonder whether this code needs two forms of zeroing... in
> > >> the original code, __GFP_ZERO _and_ a call to memset() just in case
> > >> __GFP_ZERO failed to do its job, and in the replacement code, just
> > >> in case dma_zalloc_coherent() hasn't got the idea...
> > >>
> > >> I think you can drop the __GFP_ZERO. ;)
> > >>
> > > 
> > > Just now I did an experiment on 4.14.56 on armv7. I found that
> > > dma_zalloc_coherent does not guarantee that the buffer we get
> > > is all filled with 0.
> > > 
> > > 
> > > I really think it's a little bit weird OR what was I missing something
> > > for enabling dma_zalloc_coherent ? The result seems to tell that we
> > > can't remove freely the memset with 0 at this moment until we get a
> > > cause.
> > > 
> > 
> > That means dma_zalloc_coherent doesn't work as expect on armv7?
> > 
> 
> I'm not sure if it's true for every armv7. or it's only happening on my
> device.
> 
> anyway, i think we can replace all occurrences in the driver for
> dma_alloc_coherent with __GFP_ZERO by dma_zalloc_coherent, and but
> keep the extra memset as is.

No, a bug in the allocator has been found that needs fixing.  The
right solution is to fix the allocator and remove what should be
unnecessary memset()s.

This should have been reported when the __GFP_ZERO flag was not
being honoured and memset() was initially found to be required -
all that dma_zalloc_coherent() does is set the __GFP_ZERO flag
before calling dma_alloc_coherent().
Yue Haibing July 20, 2018, 12:13 p.m. UTC | #7
On 2018/7/20 17:25, Russell King - ARM Linux wrote:
> On Fri, Jul 20, 2018 at 02:54:23PM +0800, Sean Wang wrote:
>> On Fri, 2018-07-20 at 14:30 +0800, YueHaibing wrote:
>>> On 2018/7/20 1:02, Sean Wang wrote:
>>>> On Thu, 2018-07-19 at 15:17 +0100, Russell King - ARM Linux wrote:
>>>>> On Thu, Jul 19, 2018 at 10:09:55PM +0800, YueHaibing wrote:
>>>>>> Use dma_zalloc_coherent instead of dma_alloc_coherent
>>>>>> followed by memset 0.
>>>>>>
>>>>>> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
>>>>>> ---
>>>>>>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 7 ++-----
>>>>>>  1 file changed, 2 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
>>>>>> index d8ebf0a..fbdb3e3 100644
>>>>>> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
>>>>>> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
>>>>>> @@ -1221,14 +1221,11 @@ static int mtk_tx_alloc(struct mtk_eth *eth)
>>>>>>  	if (!ring->buf)
>>>>>>  		goto no_tx_mem;
>>>>>>  
>>>>>> -	ring->dma = dma_alloc_coherent(eth->dev,
>>>>>> -					  MTK_DMA_SIZE * sz,
>>>>>> -					  &ring->phys,
>>>>>> -					  GFP_ATOMIC | __GFP_ZERO);
>>>>>> +	ring->dma = dma_zalloc_coherent(eth->dev, MTK_DMA_SIZE * sz,
>>>>>> +					&ring->phys, GFP_ATOMIC | __GFP_ZERO);
>>>>>>  	if (!ring->dma)
>>>>>>  		goto no_tx_mem;
>>>>>>  
>>>>>> -	memset(ring->dma, 0, MTK_DMA_SIZE * sz);
>>>>>
>>>>> I have to wonder whether this code needs two forms of zeroing... in
>>>>> the original code, __GFP_ZERO _and_ a call to memset() just in case
>>>>> __GFP_ZERO failed to do its job, and in the replacement code, just
>>>>> in case dma_zalloc_coherent() hasn't got the idea...
>>>>>
>>>>> I think you can drop the __GFP_ZERO. ;)
>>>>>
>>>>
>>>> Just now I did an experiment on 4.14.56 on armv7. I found that
>>>> dma_zalloc_coherent does not guarantee that the buffer we get
>>>> is all filled with 0.
>>>>
>>>>
>>>> I really think it's a little bit weird OR what was I missing something
>>>> for enabling dma_zalloc_coherent ? The result seems to tell that we
>>>> can't remove freely the memset with 0 at this moment until we get a
>>>> cause.
>>>>
>>>
>>> That means dma_zalloc_coherent doesn't work as expect on armv7?
>>>
>>
>> I'm not sure if it's true for every armv7. or it's only happening on my
>> device.
>>
>> anyway, i think we can replace all occurrences in the driver for
>> dma_alloc_coherent with __GFP_ZERO by dma_zalloc_coherent, and but
>> keep the extra memset as is.
> 
> No, a bug in the allocator has been found that needs fixing.  The
> right solution is to fix the allocator and remove what should be
> unnecessary memset()s.

Agree, will send v2.

> 
> This should have been reported when the __GFP_ZERO flag was not
> being honoured and memset() was initially found to be required -
> all that dma_zalloc_coherent() does is set the __GFP_ZERO flag
> before calling dma_alloc_coherent().
>
Sean Wang July 20, 2018, 1:48 p.m. UTC | #8
On Fri, 2018-07-20 at 09:13 +0100, Russell King - ARM Linux wrote:
> On Fri, Jul 20, 2018 at 02:30:53PM +0800, YueHaibing wrote:
> > On 2018/7/20 1:02, Sean Wang wrote:
> > > On Thu, 2018-07-19 at 15:17 +0100, Russell King - ARM Linux wrote:
> > >> On Thu, Jul 19, 2018 at 10:09:55PM +0800, YueHaibing wrote:
> > >>> Use dma_zalloc_coherent instead of dma_alloc_coherent
> > >>> followed by memset 0.
> > >>>
> > >>> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> > >>> ---
> > >>>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 7 ++-----
> > >>>  1 file changed, 2 insertions(+), 5 deletions(-)
> > >>>
> > >>> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> > >>> index d8ebf0a..fbdb3e3 100644
> > >>> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> > >>> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> > >>> @@ -1221,14 +1221,11 @@ static int mtk_tx_alloc(struct mtk_eth *eth)
> > >>>  	if (!ring->buf)
> > >>>  		goto no_tx_mem;
> > >>>  
> > >>> -	ring->dma = dma_alloc_coherent(eth->dev,
> > >>> -					  MTK_DMA_SIZE * sz,
> > >>> -					  &ring->phys,
> > >>> -					  GFP_ATOMIC | __GFP_ZERO);
> > >>> +	ring->dma = dma_zalloc_coherent(eth->dev, MTK_DMA_SIZE * sz,
> > >>> +					&ring->phys, GFP_ATOMIC | __GFP_ZERO);
> > >>>  	if (!ring->dma)
> > >>>  		goto no_tx_mem;
> > >>>  
> > >>> -	memset(ring->dma, 0, MTK_DMA_SIZE * sz);
> > >>
> > >> I have to wonder whether this code needs two forms of zeroing... in
> > >> the original code, __GFP_ZERO _and_ a call to memset() just in case
> > >> __GFP_ZERO failed to do its job, and in the replacement code, just
> > >> in case dma_zalloc_coherent() hasn't got the idea...
> > >>
> > >> I think you can drop the __GFP_ZERO. ;)
> > >>
> > > 
> > > Just now I did an experiment on 4.14.56 on armv7. I found that
> > > dma_zalloc_coherent does not guarantee that the buffer we get
> > > is all filled with 0.
> > > 
> > > 
> > > I really think it's a little bit weird OR what was I missing something
> > > for enabling dma_zalloc_coherent ? The result seems to tell that we
> > > can't remove freely the memset with 0 at this moment until we get a
> > > cause.
> > > 
> > 
> > That means dma_zalloc_coherent doesn't work as expect on armv7?
> 
> Can someone work out which underlying allocator is being used - the
> possibilities are:
> 
> - dma_alloc_from_dev_coherent
> - cma
> - simple
> - remap
> - pool
> 
> Looking at the code, I'd guess it's the pool allocator, as I don't see
> anything which zeros memory there, and it doesn't honor the __GFP_ZERO
> flag.  This is definitely an allocator bug.
> 

Yes, your guess is right. the allocator is from pool. I show the full
stack as below when calling dma_zalloc_coherent

[   54.358310] [<c0113ca8>] (unwind_backtrace) from [<c010e558>] (show_stack+0x20/0x24)
[   54.366012] [<c010e558>] (show_stack) from [<c0927f58>] (dump_stack+0x98/0xac)
[   54.373196] [<c0927f58>] (dump_stack) from [<c011c1f8>] (pool_allocator_alloc+0x20/0x30)
[   54.381238] [<c011c1f8>] (pool_allocator_alloc) from [<c011a238>] (__dma_alloc+0x1b8/0x344)
[   54.389538] [<c011a238>] (__dma_alloc) from [<c011a45c>] (arm_dma_alloc+0x50/0x58)
[   54.397063] [<c011a45c>] (arm_dma_alloc) from [<c05dc208>] (mtk_open+0xf4/0x710)
[   54.404416] [<c05dc208>] (mtk_open) from [<c076d110>] (__dev_open+0xdc/0x160)
[   54.411507] [<c076d110>] (__dev_open) from [<c076d530>] (__dev_change_flags+0x178/0x1c4)
[   54.419547] [<c076d530>] (__dev_change_flags) from [<c076d5a4>] (dev_change_flags+0x28/0x58)
[   54.427934] [<c076d5a4>] (dev_change_flags) from [<c07ea7c0>] (devinet_ioctl+0x630/0x720)
[   54.436062] [<c07ea7c0>] (devinet_ioctl) from [<c07ed070>] (inet_ioctl+0x210/0x3a8)
[   54.443674] [<c07ed070>] (inet_ioctl) from [<c0747094>] (sock_ioctl+0x234/0x4dc)
[   54.451029] [<c0747094>] (sock_ioctl) from [<c028206c>] (do_vfs_ioctl+0xc0/0x914)
[   54.458468] [<c028206c>] (do_vfs_ioctl) from [<c0282904>] (SyS_ioctl+0x44/0x6c)
[   54.465733] [<c0282904>] (SyS_ioctl) from [<c0101000>] (ret_fast_syscall+0x0/0x54)
diff mbox

Patch

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index d8ebf0a..fbdb3e3 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1221,14 +1221,11 @@  static int mtk_tx_alloc(struct mtk_eth *eth)
 	if (!ring->buf)
 		goto no_tx_mem;
 
-	ring->dma = dma_alloc_coherent(eth->dev,
-					  MTK_DMA_SIZE * sz,
-					  &ring->phys,
-					  GFP_ATOMIC | __GFP_ZERO);
+	ring->dma = dma_zalloc_coherent(eth->dev, MTK_DMA_SIZE * sz,
+					&ring->phys, GFP_ATOMIC | __GFP_ZERO);
 	if (!ring->dma)
 		goto no_tx_mem;
 
-	memset(ring->dma, 0, MTK_DMA_SIZE * sz);
 	for (i = 0; i < MTK_DMA_SIZE; i++) {
 		int next = (i + 1) % MTK_DMA_SIZE;
 		u32 next_ptr = ring->phys + next * sz;