[3/5] drm/amdkfd: use vma_is_stack() and vma_is_heap()

Message ID	20230712143831.120701-4-wangkefeng.wang@huawei.com (mailing list archive)
State	Handled Elsewhere
Headers
Return-Path: <selinux-owner@vger.kernel.org>
From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Andrew Morton <akpm@linux-foundation.org>
CC: <amd-gfx@lists.freedesktop.org>, <dri-devel@lists.freedesktop.org>,
        <linux-kernel@vger.kernel.org>, <linux-fsdevel@vger.kernel.org>,
        <linux-mm@kvack.org>, <linux-perf-users@vger.kernel.org>,
        <selinux@vger.kernel.org>, Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: [PATCH 3/5] drm/amdkfd: use vma_is_stack() and vma_is_heap()
Date: Wed, 12 Jul 2023 22:38:29 +0800
Message-ID: <20230712143831.120701-4-wangkefeng.wang@huawei.com>
In-Reply-To: <20230712143831.120701-1-wangkefeng.wang@huawei.com>
References: <20230712143831.120701-1-wangkefeng.wang@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 7BIT
Content-Type: text/plain; charset=US-ASCII
Precedence: bulk
Series	mm: convert to vma_is_heap/stack()
[0/5] mm: convert to vma_is_heap/stack()
[1/5] mm: introduce vma_is_stack() and vma_is_heap()
[2/5] mm: use vma_is_stack() and vma_is_heap()
[3/5] drm/amdkfd: use vma_is_stack() and vma_is_heap()
[4/5] selinux: use vma_is_stack() and vma_is_heap()
[5/5] perf/core: use vma_is_stack() and vma_is_heap()

Comments

Christoph Hellwig July 12, 2023, 2:42 p.m. UTC | #1

On Wed, Jul 12, 2023 at 10:38:29PM +0800, Kefeng Wang wrote:
> Use the helpers to simplify code.

Nothing against your addition of a helper, but a GPU driver really
should have no business even looking at this information..

Felix Kuehling July 12, 2023, 4:24 p.m. UTC | #2

Allocations in the heap and stack tend to be small, with several 
allocations sharing the same page. Sharing the same page for different 
allocations with different access patterns leads to thrashing when we 
migrate data back and forth on GPU and CPU access. To avoid this we 
disable HMM migrations for heap and stack VMAs.

Regards,
   Felix

Am 2023-07-12 um 10:42 schrieb Christoph Hellwig:
> On Wed, Jul 12, 2023 at 10:38:29PM +0800, Kefeng Wang wrote:
>> Use the helpers to simplify code.
> Nothing against your addition of a helper, but a GPU driver really
> should have no business even looking at this information..
>
>
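
To make the page-sharing argument above concrete, here is a minimal sketch, assuming 4 KiB pages and a hypothetical bump allocator that packs objects back to back (real allocators add headers and size classes). The names page_index(), same_page() and pages_touched() are illustrative only and do not come from the kernel or the amdkfd driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u	/* assumption: 4 KiB pages */

/* Index of the page that contains a given byte address. */
static uint64_t page_index(uint64_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* True if two addresses fall on the same page. */
static bool same_page(uint64_t a, uint64_t b)
{
	return page_index(a) == page_index(b);
}

/*
 * Number of distinct pages covered by n back-to-back allocations of
 * `size` bytes each, starting at `base` (hypothetical bump allocator
 * with no headers or padding).
 */
static size_t pages_touched(uint64_t base, size_t size, size_t n)
{
	if (n == 0 || size == 0)
		return 0;

	uint64_t first = page_index(base);
	uint64_t last = page_index(base + (uint64_t)size * n - 1);

	return (size_t)(last - first + 1);
}
```

Under these assumptions, 64 allocations of 64 bytes starting at a page boundary occupy a single page. If the GPU touches one of them and the CPU another, a page-granular migration would move that one page back and forth, which is the thrashing described above.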

Vlastimil Babka July 14, 2023, 2:26 p.m. UTC | #3

On 7/12/23 18:24, Felix Kuehling wrote:
> Allocations in the heap and stack tend to be small, with several 
> allocations sharing the same page. Sharing the same page for different 
> allocations with different access patterns leads to thrashing when we 
> migrate data back and forth on GPU and CPU access. To avoid this we 
> disable HMM migrations for heap and stack VMAs.

I wonder how well this really works in practice. AFAIK "heaps" (malloc())
today use various arenas obtained by mmap() rather than a single brk()-managed
space? And programs might be multithreaded, and thus have multiple
stacks, while vma_is_stack() will recognize only the initial one...

Vlastimil

> Regards,
>    Felix
> 
> 
> Am 2023-07-12 um 10:42 schrieb Christoph Hellwig:
>> On Wed, Jul 12, 2023 at 10:38:29PM +0800, Kefeng Wang wrote:
>>> Use the helpers to simplify code.
>> Nothing against your addition of a helper, but a GPU driver really
>> should have no business even looking at this information..
>>
>>
>

Felix Kuehling July 14, 2023, 3:09 p.m. UTC | #4

Am 2023-07-14 um 10:26 schrieb Vlastimil Babka:
> On 7/12/23 18:24, Felix Kuehling wrote:
>> Allocations in the heap and stack tend to be small, with several
>> allocations sharing the same page. Sharing the same page for different
>> allocations with different access patterns leads to thrashing when we
>> migrate data back and forth on GPU and CPU access. To avoid this we
>> disable HMM migrations for heap and stack VMAs.
> I wonder how well this really works in practice. AFAIK "heaps" (malloc())
> today use various arenas obtained by mmap() rather than a single brk()-managed
> space? And programs might be multithreaded, and thus have multiple
> stacks, while vma_is_stack() will recognize only the initial one...

Thanks for these pointers. I have not heard of such problems with mmap 
arenas and multiple thread stacks in practice. But I'll keep it in mind 
in case we observe unexpected thrashing in the future. FWIW, we once had 
the opposite problem of a custom malloc implementation that used sbrk 
for very large allocations. This disabled migrations of large buffers 
unexpectedly.

I agree that eventually we'll want a more dynamic way of detecting and 
suppressing thrashing that's based on observed memory access patterns. 
Getting this right is probably trickier than it sounds, so I'd prefer to 
have some more experience with real workloads to use as benchmarks. 
Compared to other things we're working on, this is fairly low on our 
priority list at the moment. Using the VMA flags is a simple and 
effective method for now, at least until we see it failing in real 
workloads.

Regards,
   Felix

>
> Vlastimil
>
>> Regards,
>>     Felix
>>
>>
>> Am 2023-07-12 um 10:42 schrieb Christoph Hellwig:
>>> On Wed, Jul 12, 2023 at 10:38:29PM +0800, Kefeng Wang wrote:
>>>> Use the helpers to simplify code.
>>> Nothing against your addition of a helper, but a GPU driver really
>>> should have no business even looking at this information..
>>>
>>>

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 479c4f66afa7..19ce68a7e1a8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2623,10 +2623,7 @@  svm_range_get_range_boundaries(struct kfd_process *p, int64_t addr,
 		return -EFAULT;
 	}
 
-	*is_heap_stack = (vma->vm_start <= vma->vm_mm->brk &&
-			  vma->vm_end >= vma->vm_mm->start_brk) ||
-			 (vma->vm_start <= vma->vm_mm->start_stack &&
-			  vma->vm_end >= vma->vm_mm->start_stack);
+	*is_heap_stack = vma_is_heap(vma) || vma_is_stack(vma);
 
 	start_limit = max(vma->vm_start >> PAGE_SHIFT,
 		      (unsigned long)ALIGN_DOWN(addr, 2UL << 8));
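
The definitions of vma_is_heap() and vma_is_stack() come from patch 1/5 and are not shown in this message. As a minimal userspace sketch, assuming the helpers simply wrap the two range tests removed above, the check can be modelled as follows. The mock_mm and mock_vma structs and all mock_* functions are hypothetical stand-ins for mm_struct, vm_area_struct and the real helpers; patch 1/5 remains the authoritative definition.

```c
#include <stdbool.h>

/* Hypothetical stand-in for mm_struct: only the fields the check reads. */
struct mock_mm {
	unsigned long start_brk;	/* start of the brk()-managed heap */
	unsigned long brk;		/* current program break */
	unsigned long start_stack;	/* address inside the initial stack */
};

/* Hypothetical stand-in for vm_area_struct. */
struct mock_vma {
	unsigned long vm_start;
	unsigned long vm_end;
	const struct mock_mm *vm_mm;
};

/* Assumed to mirror the removed open-coded heap test. */
static bool mock_vma_is_heap(const struct mock_vma *vma)
{
	return vma->vm_start <= vma->vm_mm->brk &&
	       vma->vm_end >= vma->vm_mm->start_brk;
}

/* Assumed to mirror the removed open-coded stack test. */
static bool mock_vma_is_stack(const struct mock_vma *vma)
{
	return vma->vm_start <= vma->vm_mm->start_stack &&
	       vma->vm_end >= vma->vm_mm->start_stack;
}

/* Same shape as the new one-line assignment to *is_heap_stack. */
static bool mock_is_heap_stack(const struct mock_mm *mm,
			       unsigned long start, unsigned long end)
{
	struct mock_vma vma = { start, end, mm };

	return mock_vma_is_heap(&vma) || mock_vma_is_stack(&vma);
}
```

With a made-up layout of start_brk = 0x01000000, brk = 0x01021000 and start_stack = 0x7ffff000, a VMA covering the brk range or the initial stack is flagged. An mmap()-backed arena at 0x70000000 or a thread stack at 0x70200000 is not, which is the limitation raised in comment #3.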
