From patchwork Mon Dec  8 08:11:55 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexandre Courbot <acourbot@nvidia.com>
X-Patchwork-Id: 5454731
Return-Path: <dri-devel-bounces@lists.freedesktop.org>
X-Original-To: patchwork-dri-devel@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.19.201])
	by patchwork2.web.kernel.org (Postfix) with ESMTP id CD357BEEA8
	for <patchwork-dri-devel@patchwork.kernel.org>;
	Mon,  8 Dec 2014 08:12:37 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id B86152012B
	for <patchwork-dri-devel@patchwork.kernel.org>;
	Mon,  8 Dec 2014 08:12:36 +0000 (UTC)
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	by mail.kernel.org (Postfix) with ESMTP id 8F66220125
	for <patchwork-dri-devel@patchwork.kernel.org>;
	Mon,  8 Dec 2014 08:12:35 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id A1BED6E36D;
	Mon,  8 Dec 2014 00:12:33 -0800 (PST)
X-Original-To: dri-devel@lists.freedesktop.org
Delivered-To: dri-devel@lists.freedesktop.org
Received: from hqemgate15.nvidia.com (hqemgate15.nvidia.com [216.228.121.64])
	by gabe.freedesktop.org (Postfix) with ESMTP id 40EB56E372
	for <dri-devel@lists.freedesktop.org>;
	Mon,  8 Dec 2014 00:12:32 -0800 (PST)
Received: from hqnvupgp08.nvidia.com (Not Verified[216.228.121.13]) by
	hqemgate15.nvidia.com
	id <B54855d3e0000>; Mon, 08 Dec 2014 00:11:42 -0800
Received: from HQMAIL104.nvidia.com ([172.20.12.94])
	by hqnvupgp08.nvidia.com (PGP Universal service);
	Mon, 08 Dec 2014 00:10:46 -0800
X-PGP-Universal: processed;
	by hqnvupgp08.nvidia.com on Mon, 08 Dec 2014 00:10:46 -0800
Received: from HKMAIL102.nvidia.com (10.18.16.11) by HQMAIL104.nvidia.com
	(172.18.146.11) with Microsoft SMTP Server (TLS) id 15.0.847.32;
	Mon, 8 Dec 2014 08:12:31 +0000
Received: from [10.19.57.128] (10.19.57.128) by HKMAIL102.nvidia.com
	(10.18.16.11) with Microsoft SMTP Server (TLS) id 15.0.847.32;
	Mon, 8 Dec 2014 08:12:10 +0000
Message-ID: <54855D4B.50307@nvidia.com>
Date: Mon, 8 Dec 2014 17:11:55 +0900
From: Alexandre Courbot <acourbot@nvidia.com>
Organization: NVIDIA
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
	rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: Thierry Reding <thierry.reding@gmail.com>,
	<dri-devel@lists.freedesktop.org>
Subject: Re: [RFC] drm/ttm: dma: Fixes for 32-bit and 64-bit ARM
References: <1415795945-17575-1-git-send-email-thierry.reding@gmail.com>
	<54854FE7.8000604@nvidia.com>
In-Reply-To: <54854FE7.8000604@nvidia.com>
X-NVConfidentiality: public
X-Originating-IP: [10.19.57.128]
X-ClientProxiedBy: DRBGMAIL103.nvidia.com (10.18.16.22) To
	HKMAIL102.nvidia.com (10.18.16.11)
Cc: Alexandre Courbot <gnurou@gmail.com>,
	Thomas Hellstrom <thellstrom@vmware.com>,
	Allen Martin <amartin@nvidia.com>, Arnd Bergmann <arnd@arndb.de>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
X-BeenThere: dri-devel@lists.freedesktop.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Direct Rendering Infrastructure - Development
	<dri-devel.lists.freedesktop.org>
List-Unsubscribe: <http://lists.freedesktop.org/mailman/options/dri-devel>,
	<mailto:dri-devel-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <http://lists.freedesktop.org/archives/dri-devel>
List-Post: <mailto:dri-devel@lists.freedesktop.org>
List-Help: <mailto:dri-devel-request@lists.freedesktop.org?subject=help>
List-Subscribe: <http://lists.freedesktop.org/mailman/listinfo/dri-devel>,
	<mailto:dri-devel-request@lists.freedesktop.org?subject=subscribe>
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
	T_RP_MATCHES_RCVD,
	UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

On 12/08/2014 04:14 PM, Alexandre Courbot wrote:
> On 11/12/2014 09:39 PM, Thierry Reding wrote:
>> From: Thierry Reding <treding@nvidia.com>
>>
>> dma_alloc_coherent() returns a kernel virtual address that is part of
>> the linear range. Passing such an address to virt_to_page() is illegal
>> on non-coherent architectures. This causes the kernel to oops on 64-bit
>> ARM because the struct page * obtained from virt_to_page() points to
>> unmapped memory.
>>
>> This commit fixes this by using phys_to_page() since we get a physical
>> address from dma_alloc_coherent(). Note that this is not a proper fix
>> because if an IOMMU is set up to translate addresses for the GPU this
>> address will be an I/O virtual address rather than a physical one. The
>> proper fix probably involves not getting a pointer to the struct page
>> in the first place, but that would be a much more intrusive change, if
>> at all possible.
>>
>> Until that time, this temporary fix will allow TTM to work on 32-bit
>> and 64-bit ARM as well, provided that no IOMMU translations are enabled
>> for the GPU.
>>
>> Signed-off-by: Thierry Reding <treding@nvidia.com>
>> ---
>> Arnd, I realize that this isn't a proper fix according to what we
>> discussed on
>> IRC yesterday, but I can't see a way to remove access to the pages
>> array that
>> would be as simple as this. I've marked this as RFC in the hope that
>> it will
>> trigger some discussion that will lead to a proper solution.
>>
>>   drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>> b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>> index c96db433f8af..d7993985752c 100644
>> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>> @@ -343,7 +343,11 @@ static struct dma_page
>> *__ttm_dma_alloc_page(struct dma_pool *pool)
>>                          &d_page->dma,
>>                          pool->gfp_flags);
>>       if (d_page->vaddr)
>> +#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
>> +        d_page->p = phys_to_page(d_page->dma);
>> +#else
>>           d_page->p = virt_to_page(d_page->vaddr);
>> +#endif
>
> Since I am messing with the IOMMU I just happened to have hit the issue
> you are mentioning. Wouldn't the following work:
>
> -       if (d_page->vaddr)
> -#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> -               d_page->p = phys_to_page(d_page->dma);
> -#else
> -               d_page->p = virt_to_page(d_page->vaddr);
> -#endif
> -       else {
> +       if (d_page->vaddr) {
> +               if (is_vmalloc_addr(d_page->vaddr)) {
> +                       d_page->p = vmalloc_to_page(d_page->vaddr);
> +               } else {
> +                       d_page->p = virt_to_page(d_page->vaddr);
> +               }
> +       } else {
>
> A remapped page will end up in the vmalloc range of the address space,
> and in this case we can use vmalloc_to_page() to get the right page.
> Pages outside of this range are part of the linear mapping and can be
> resolved using virt_to_page().
>
> Jetson seems to be mostly happy with this, although I sometimes get the
> following trace:
>
> [   13.174763] kernel BUG at ../mm/slab.c:2593!
> [   13.174767] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
> [   13.174790] Modules linked in: nouveau_platform(O+) nouveau(O)
> cfbfillrect cfbimgblt cfbcopyarea ttm
> ...
> [   13.175234] [<c00de238>] (cache_alloc_refill) from [<c00de528>]
> (__kmalloc+0x100/0x13c)
> [   13.175247] [<c00de528>] (__kmalloc) from [<c001d564>]
> (arm_iommu_alloc_attrs+0x94/0x3a8)
> [   13.175269] [<c001d564>] (arm_iommu_alloc_attrs) from [<bf008f4c>]
> (ttm_dma_populate+0x498/0x76c [ttm])
> [   13.175294] [<bf008f4c>] (ttm_dma_populate [ttm]) from [<bf000bb8>]
> (ttm_tt_bind+0x38/0x68 [ttm])
> [   13.175315] [<bf000bb8>] (ttm_tt_bind [ttm]) from [<bf00298c>]
> (ttm_bo_handle_move_mem+0x408/0x47c [ttm])
> [   13.175337] [<bf00298c>] (ttm_bo_handle_move_mem [ttm]) from
> [<bf003758>] (ttm_bo_validate+0x220/0x22c [ttm])
> [   13.175359] [<bf003758>] (ttm_bo_validate [ttm]) from [<bf003984>]
> (ttm_bo_init+0x220/0x338 [ttm])
> [   13.175480] [<bf003984>] (ttm_bo_init [ttm]) from [<bf0c70a0>]
> (nouveau_bo_new+0x1c0/0x294 [nouveau])
> [   13.175688] [<bf0c70a0>] (nouveau_bo_new [nouveau]) from [<bf0ce88c>]
> (nv84_fence_create+0x1cc/0x240 [nouveau])
> [   13.175891] [<bf0ce88c>] (nv84_fence_create [nouveau]) from
> [<bf0cec90>] (nvc0_fence_create+0xc/0x24 [nouveau])
> [   13.176094] [<bf0cec90>] (nvc0_fence_create [nouveau]) from
> [<bf0c1480>] (nouveau_accel_init+0xec/0x450 [nouveau])
>
> I suspect this is related to this change, but it might also be the
> side-effect of another bug in my code.

Ok, I finally understand where this trace comes from.

slab.c:2593:
BUG_ON(flags & GFP_SLAB_BUG_MASK);

gfp.h:
#define GFP_SLAB_BUG_MASK (__GFP_DMA32|__GFP_HIGHMEM|~__GFP_BITS_MASK)

and ttm_dma_populate() calls ttm_dma_pool_init() with gfp_flags == 
GFP_HIGHUSER, which includes __GFP_HIGHMEM. Somehow these flags are 
propagated through arm_iommu_alloc_attrs() up to the slab allocator, 
which meets this BUG_ON() when it needs to grow its cache.

Note that ttm_dma_pool_init() can also be called with GFP_DMA32, which 
will trigger the same error.

So although I am still not sure how this should be fixed, it seems like 
this has nothing to do with the change I am proposing.

Temporary workaround:


... which obviously doesn't look right.
--- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
@@ -882,9 +882,9 @@ int ttm_dma_populate(struct ttm_dma_tt *ttm_dma, 
struct device *dev)

         type = ttm_to_type(ttm->page_flags, ttm->caching_state);
         if (ttm->page_flags & TTM_PAGE_FLAG_DMA32)
-               gfp_flags = GFP_USER | GFP_DMA32;
+               gfp_flags = GFP_USER;
         else
-               gfp_flags = GFP_HIGHUSER;
+               gfp_flags = GFP_USER;
         if (ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC)
                 gfp_flags |= __GFP_ZERO;