diff mbox

[v3] dma-debug: fix incorrect pfn calculation

Message ID 1506484087-1177-1-git-send-email-miles.chen@mediatek.com (mailing list archive)
State New, archived
Headers show

Commit Message

Miles Chen Sept. 27, 2017, 3:48 a.m. UTC
From: Miles Chen <miles.chen@mediatek.com>

dma-debug reports the following warning:

[name:panic&]WARNING: CPU: 3 PID: 298 at kernel-4.4/lib/dma-debug.c:604
debug _dma_assert_idle+0x1a8/0x230()
DMA-API: cpu touching an active dma mapped cacheline [cln=0x00000882300]
CPU: 3 PID: 298 Comm: vold Tainted: G        W  O    4.4.22+ #1
Hardware name: MT6739 (DT)
Call trace:
[<ffffff800808acd0>] dump_backtrace+0x0/0x1d4
[<ffffff800808affc>] show_stack+0x14/0x1c
[<ffffff800838019c>] dump_stack+0xa8/0xe0
[<ffffff80080a0594>] warn_slowpath_common+0xf4/0x11c
[<ffffff80080a061c>] warn_slowpath_fmt+0x60/0x80
[<ffffff80083afe24>] debug_dma_assert_idle+0x1a8/0x230
[<ffffff80081dca9c>] wp_page_copy.isra.96+0x118/0x520
[<ffffff80081de114>] do_wp_page+0x4fc/0x534
[<ffffff80081e0a14>] handle_mm_fault+0xd4c/0x1310
[<ffffff8008098798>] do_page_fault+0x1c8/0x394
[<ffffff800808231c>] do_mem_abort+0x50/0xec

I found that debug_dma_alloc_coherent() and debug_dma_free_coherent()
assume that dma_alloc_coherent() always returns a linear address.
However it's possible that dma_alloc_coherent() returns a non-linear
address. In this case, page_to_pfn(virt_to_page(virt)) will return an
incorrect pfn. If the pfn is valid and mapped as a COW page,
we will hit the warning when doing wp_page_copy().

Fix this by calculating pfn for linear and non-linear addresses.

Signed-off-by: Miles Chen <miles.chen@mediatek.com>
---
 lib/dma-debug.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Robin Murphy Sept. 27, 2017, 10:23 a.m. UTC | #1
[+DMA maintainers]

On 27/09/17 04:48, miles.chen@mediatek.com wrote:
> From: Miles Chen <miles.chen@mediatek.com>
> 
> dma-debug reports the following warning:
> 
> [name:panic&]WARNING: CPU: 3 PID: 298 at kernel-4.4/lib/dma-debug.c:604
> debug _dma_assert_idle+0x1a8/0x230()
> DMA-API: cpu touching an active dma mapped cacheline [cln=0x00000882300]
> CPU: 3 PID: 298 Comm: vold Tainted: G        W  O    4.4.22+ #1
> Hardware name: MT6739 (DT)
> Call trace:
> [<ffffff800808acd0>] dump_backtrace+0x0/0x1d4
> [<ffffff800808affc>] show_stack+0x14/0x1c
> [<ffffff800838019c>] dump_stack+0xa8/0xe0
> [<ffffff80080a0594>] warn_slowpath_common+0xf4/0x11c
> [<ffffff80080a061c>] warn_slowpath_fmt+0x60/0x80
> [<ffffff80083afe24>] debug_dma_assert_idle+0x1a8/0x230
> [<ffffff80081dca9c>] wp_page_copy.isra.96+0x118/0x520
> [<ffffff80081de114>] do_wp_page+0x4fc/0x534
> [<ffffff80081e0a14>] handle_mm_fault+0xd4c/0x1310
> [<ffffff8008098798>] do_page_fault+0x1c8/0x394
> [<ffffff800808231c>] do_mem_abort+0x50/0xec
> 
> I found that debug_dma_alloc_coherent() and debug_dma_free_coherent()
> assume that dma_alloc_coherent() always returns a linear address.
> However it's possible that dma_alloc_coherent() returns a non-linear
> address. In this case, page_to_pfn(virt_to_page(virt)) will return an
> incorrect pfn. If the pfn is valid and mapped as a COW page,
> we will hit the warning when doing wp_page_copy().

Yeah, we definitely want that explanation recorded in the commit, since
the warning is pretty non-obvious otherwise.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Fix this by calculating pfn for linear and non-linear addresses.
> 
> Signed-off-by: Miles Chen <miles.chen@mediatek.com>
> ---
>  lib/dma-debug.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
> index ea4cc3d..e5b4237 100644
> --- a/lib/dma-debug.c
> +++ b/lib/dma-debug.c
> @@ -1497,7 +1497,8 @@ void debug_dma_alloc_coherent(struct device *dev, size_t size,
>  
>  	entry->type      = dma_debug_coherent;
>  	entry->dev       = dev;
> -	entry->pfn	 = page_to_pfn(virt_to_page(virt));
> +	entry->pfn	 = is_vmalloc_addr(virt) ? vmalloc_to_pfn(virt) :
> +						page_to_pfn(virt_to_page(virt));
>  	entry->offset	 = offset_in_page(virt);
>  	entry->size      = size;
>  	entry->dev_addr  = dma_addr;
> @@ -1513,7 +1514,8 @@ void debug_dma_free_coherent(struct device *dev, size_t size,
>  	struct dma_debug_entry ref = {
>  		.type           = dma_debug_coherent,
>  		.dev            = dev,
> -		.pfn		= page_to_pfn(virt_to_page(virt)),
> +		.pfn		= is_vmalloc_addr(virt) ? vmalloc_to_pfn(virt) :
> +						page_to_pfn(virt_to_page(virt)),
>  		.offset		= offset_in_page(virt),
>  		.dev_addr       = addr,
>  		.size           = size,
>
Christoph Hellwig Oct. 1, 2017, 8:04 a.m. UTC | #2
On Wed, Sep 27, 2017 at 11:23:52AM +0100, Robin Murphy wrote:
> > I found that debug_dma_alloc_coherent() and debug_dma_free_coherent()
> > assume that dma_alloc_coherent() always returns a linear address.
> > However it's possible that dma_alloc_coherent() returns a non-linear
> > address. In this case, page_to_pfn(virt_to_page(virt)) will return an
> > incorrect pfn. If the pfn is valid and mapped as a COW page,
> > we will hit the warning when doing wp_page_copy().

Hmm, can the debug code assume anything?  Right now you're just patching
it from supporting linear and vmalloc.  But what about other
potential mapping types?

> > +	entry->pfn	 = is_vmalloc_addr(virt) ? vmalloc_to_pfn(virt) :
> > +						page_to_pfn(virt_to_page(virt));

Please use normal if/else conditionsals:

	if (is_vmalloc_addr(virt))
		entry->pfn = vmalloc_to_pfn(virt);
	else
		entry->pfn = page_to_pfn(virt_to_page(virt));

> > +		.pfn		= is_vmalloc_addr(virt) ? vmalloc_to_pfn(virt) :
> > +						page_to_pfn(virt_to_page(virt)),

Same here.
Miles Chen Oct. 2, 2017, 10:30 a.m. UTC | #3
On Sun, 2017-10-01 at 10:04 +0200, Christoph Hellwig wrote:
> On Wed, Sep 27, 2017 at 11:23:52AM +0100, Robin Murphy wrote:
> > > I found that debug_dma_alloc_coherent() and debug_dma_free_coherent()
> > > assume that dma_alloc_coherent() always returns a linear address.
> > > However it's possible that dma_alloc_coherent() returns a non-linear
> > > address. In this case, page_to_pfn(virt_to_page(virt)) will return an
> > > incorrect pfn. If the pfn is valid and mapped as a COW page,
> > > we will hit the warning when doing wp_page_copy().
> 
> Hmm, can the debug code assume anything?  Right now you're just patching
> it from supporting linear and vmalloc.  But what about other
> potential mapping types?

thanks for the review.

ARCHs like metag and xtensa define their mappings (non-vmalloc and
non-linear) for dma allocation.
These mapping types are architecture-dependent and should not be used
outside arch folders. So it is hard to check the mappings and convert
a virtual address to a correct pfn in lib/dam-debug.c

How about recording only vmalloc (by is_vmalloc_addr()) and linear
address (by virt_addr_valid()) in lib/dma-debug? Since current 
implementation is not correct for those ARCHs.

if (!is_vmalloc_addr(addr) && !virt_addr_valid(addr))
    return;

or

every ARCH should define its own dmava-to-pfn API to convert
a dma-allocted virtual address to a correct pfn and lib/dma-debug.c
can use that API directly. (long-term)

> 
> > > +	entry->pfn	 = is_vmalloc_addr(virt) ? vmalloc_to_pfn(virt) :
> > > +						page_to_pfn(virt_to_page(virt));
> 
> Please use normal if/else conditionsals:

Is this for better readability? I'll send another patch for this.


thanks 
> 
> 	if (is_vmalloc_addr(virt))
> 		entry->pfn = vmalloc_to_pfn(virt);
> 	else
> 		entry->pfn = page_to_pfn(virt_to_page(virt));
> 
> > > +		.pfn		= is_vmalloc_addr(virt) ? vmalloc_to_pfn(virt) :
> > > +						page_to_pfn(virt_to_page(virt)),
> 
> Same here.
Christoph Hellwig Oct. 3, 2017, 7:07 a.m. UTC | #4
On Mon, Oct 02, 2017 at 06:30:41PM +0800, Miles Chen wrote:
> ARCHs like metag and xtensa define their mappings (non-vmalloc and
> non-linear) for dma allocation.

metag basically is a reimplementation of the vmalloc map mechanism
that should be easy to consolidate into the common one :(  xtensa
has a weird remapping into a different segment, something that I
vaguely remember mips used to support as well.

> These mapping types are architecture-dependent and should not be used
> outside arch folders. So it is hard to check the mappings and convert
> a virtual address to a correct pfn in lib/dam-debug.c
> 
> How about recording only vmalloc (by is_vmalloc_addr()) and linear
> address (by virt_addr_valid()) in lib/dma-debug? Since current 
> implementation is not correct for those ARCHs.
> 
> if (!is_vmalloc_addr(addr) && !virt_addr_valid(addr))
>     return;
> 
> or

This looks like a good start, although I'm not sure I'd trust
virt_addr_valid on every little arch.  In the worse case we'll
have to exclude offenders from supporting dma debug, so let's go
with that version.

> > > > +	entry->pfn	 = is_vmalloc_addr(virt) ? vmalloc_to_pfn(virt) :
> > > > +						page_to_pfn(virt_to_page(virt));
> > 
> > Please use normal if/else conditionsals:
> 
> Is this for better readability? I'll send another patch for this.

Yes.
diff mbox

Patch

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index ea4cc3d..e5b4237 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -1497,7 +1497,8 @@  void debug_dma_alloc_coherent(struct device *dev, size_t size,
 
 	entry->type      = dma_debug_coherent;
 	entry->dev       = dev;
-	entry->pfn	 = page_to_pfn(virt_to_page(virt));
+	entry->pfn	 = is_vmalloc_addr(virt) ? vmalloc_to_pfn(virt) :
+						page_to_pfn(virt_to_page(virt));
 	entry->offset	 = offset_in_page(virt);
 	entry->size      = size;
 	entry->dev_addr  = dma_addr;
@@ -1513,7 +1514,8 @@  void debug_dma_free_coherent(struct device *dev, size_t size,
 	struct dma_debug_entry ref = {
 		.type           = dma_debug_coherent,
 		.dev            = dev,
-		.pfn		= page_to_pfn(virt_to_page(virt)),
+		.pfn		= is_vmalloc_addr(virt) ? vmalloc_to_pfn(virt) :
+						page_to_pfn(virt_to_page(virt)),
 		.offset		= offset_in_page(virt),
 		.dev_addr       = addr,
 		.size           = size,