[Bug,92775,radeon,TTM] Contention when evicting large buffers between VRAM and GTT

Message ID bug-92775-502@http.bugs.freedesktop.org/
State New

Commit Message

bugzilla-daemon@freedesktop.org Nov. 2, 2015, 9:02 a.m. UTC
https://bugs.freedesktop.org/show_bug.cgi?id=92775

            Bug ID: 92775
           Summary: [radeon][TTM] Contention when evicting large buffers
                    between VRAM and GTT
           Product: DRI
           Version: XOrg git
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/Radeon
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: shawn.starr@rogers.com

In this example output, we fail to evict pages.

[  284.937397] CPU: 2 PID: 2113 Comm: RenderThread 2 Tainted: G            E   4.3.0-0.rc7.git2.2.fc23.x86_64+debug+ #1
[  284.937916] Hardware name: Dell Inc. Precision M6800/05NG6V, BIOS A15 09/29/2015
[  284.938277]  0000000000000000 00000000a3aa6e6e ffff88078b2af350 ffffffff813a496f
[  284.938699]  ffff880523ce68a0 ffff88078b2af370 ffffffffa01d6132 ffff88003f9c4738
[  284.939093]  0000000000000002 ffff88078b2af3b0 ffffffffa013e5f4 0000000000000000
[  284.939563] Call Trace:
[  284.939740]  [<ffffffff813a496f>] dump_stack+0x44/0x55
[  284.939999]  [<ffffffffa01d6132>] radeon_ttm_io_mem_reserve+0xd2/0x100 [radeon]
[  284.940417]  [<ffffffffa013e5f4>] ttm_mem_io_reserve+0x64/0x110 [ttm]
[  284.940797]  [<ffffffffa013eb53>] ttm_mem_reg_ioremap+0x53/0x140 [ttm]
[  284.941111]  [<ffffffffa013f0b0>] ttm_bo_move_memcpy+0xe0/0x680 [ttm]
[  284.941481]  [<ffffffffa01d69f0>] radeon_bo_move+0x190/0x200 [radeon]
[  284.941893]  [<ffffffffa013cd62>] ttm_bo_handle_move_mem+0x2c2/0x530 [ttm]
[  284.942262]  [<ffffffffa013d537>] ? ttm_bo_mem_space+0x137/0x3b0 [ttm]
[  284.942617]  [<ffffffffa013d121>] ttm_bo_evict+0x151/0x220 [ttm]
[  284.942952]  [<ffffffffa013d388>] ttm_mem_evict_first+0x198/0x210 [ttm]
[  284.943308]  [<ffffffffa013d6fa>] ttm_bo_mem_space+0x2fa/0x3b0 [ttm]
[  284.943715]  [<ffffffffa013dbd9>] ttm_bo_validate+0x199/0x210 [ttm]
[  284.944016]  [<ffffffffa0141998>] ? ttm_eu_reserve_buffers+0x168/0x300 [ttm]
[  284.944428]  [<ffffffffa01d88ec>] radeon_bo_list_validate+0xcc/0x210 [radeon]
[  284.944809]  [<ffffffffa01ee6c3>] radeon_cs_parser_relocs+0x393/0x460 [radeon]
[  284.945181]  [<ffffffffa01ef049>] radeon_cs_ioctl+0x269/0x780 [radeon]
[  284.945532]  [<ffffffffa00d3408>] drm_ioctl+0x138/0x500 [drm]
[  284.945944]  [<ffffffffa01eede0>] ? radeon_cs_parser_init+0x490/0x490 [radeon]
[  284.946305]  [<ffffffff811d7aae>] ? handle_mm_fault+0xb6e/0x1840
[  284.946674]  [<ffffffff8177e96e>] ? _raw_spin_unlock_irqrestore+0xe/0x10
[  284.946999]  [<ffffffffa01b904c>] radeon_drm_ioctl+0x4c/0x80 [radeon]
[  284.947380]  [<ffffffff81235435>] do_vfs_ioctl+0x295/0x470
[  284.947741]  [<ffffffff81065104>] ? __do_page_fault+0x1b4/0x400
[  284.948024]  [<ffffffff81235689>] SyS_ioctl+0x79/0x90
[  284.948294]  [<ffffffff8177eeee>] entry_SYSCALL_64_fastpath+0x12/0x71
[  284.948635] radeon_ttm_io_mem_reserve: Check if it's bus.offset + bus.size greater than BAR SIZE: is 6a977000 > 10000000?
[  284.949261] [TTM] in ttm_bo_handle_move_mem(): Failing! - OTHER, from ttm_bo_move_memcpy() return
[  284.949796] [TTM] Buffer eviction failed
[  284.950004] [TTM] No space for ffff880523ce6868 (1367 pages, 5468K, 5M)
[  284.950422] [TTM]   placement[0]=0x00060002 (1)
[  284.950796] [TTM]     has_type: 1
[  284.950958] [TTM]     use_type: 1
[  284.951130] [TTM]     flags: 0x0000000A
[  284.951333] [TTM]     gpu_offset: 0x80000000
[  284.951709] [TTM]     size: 2097152
[  284.951892] [TTM]     available_caching: 0x00070000
[  284.952141] [TTM]     default_caching: 0x00010000
[  284.953945] [TTM]   placement[1]=0x00060001 (0)
[  284.954193] [TTM]     has_type: 1
[  284.954359] [TTM]     use_type: 1
[  284.954593] [TTM]     flags: 0x00000002
[  284.954780] [TTM]     gpu_offset: 0x00000000
[  284.954986] [TTM]     size: 0
[  284.955160] [TTM]     available_caching: 0x00070000
[  284.955421] [TTM]     default_caching: 0x00010000
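
The two figures in the log can be checked by hand. The following is a minimal
sketch, not the driver's code: range_exceeds_bar() and pages_to_kib() are
hypothetical helpers, the 4 KiB page size is an assumption consistent with
"1367 pages, 5468K", and the comparison only mirrors the debug message above
(bus.offset + bus.size against the BAR size).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size; 1367 pages at 4 KiB each matches the 5468K in the log. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical helper: true when [bus_offset, bus_offset + bus_size) ends
 * beyond the CPU-visible BAR, as in the debug message in the log. */
static bool range_exceeds_bar(uint64_t bus_offset, uint64_t bus_size,
                              uint64_t bar_size)
{
    /* Guard against wrap-around before adding. */
    assert(bus_offset <= UINT64_MAX - bus_size);
    return bus_offset + bus_size > bar_size;
}

/* Hypothetical helper: convert a page count to KiB. */
static uint64_t pages_to_kib(uint64_t pages)
{
    return pages * SKETCH_PAGE_SIZE / 1024;
}
```

With the values from the log, the range ends at 0x6a977000 (roughly 1705 MiB),
far beyond the 0x10000000 (256 MiB) BAR, so the comparison is true and the
CPU memcpy path cannot map the buffer.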

In discussions on IRC, a current workaround in the radeonsi DRI driver is the
patch attached to this report (see the Patch section).

One solution discussed is to split up the transfer into smaller chunks in
radeon_ttm.

Comments

bugzilla-daemon@freedesktop.org Dec. 25, 2015, 7:41 a.m. UTC | #1
https://bugs.freedesktop.org/show_bug.cgi?id=92775

--- Comment #1 from Michel Dänzer <michel@daenzer.net> ---
(In reply to Shawn Starr from comment #0)
> One solution discussed is to split up the transfer into smaller chunks in
> radeon_ttm.

Specifically, here's how I think a fallback could be implemented in the kernel
driver which can never fail because of fragmentation or resource starvation:

During initialization, reserve some pinned GTT memory for bounce buffers. When
a BO can't be bound to GTT for eviction as in the case reported here, instead
do the eviction directly from VRAM to CPU domain in one or several passes of:
1. Copy part of the BO from VRAM to one of the reserved bounce buffers in GTT
using the GPU.
2. Copy that part of the BO from the bounce buffer to the BO's system RAM pages
using the CPU.
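
The two-step pass described above can be sketched in user space as follows.
This is only an illustration under stated assumptions, not kernel code: both
copy helpers are hypothetical stand-ins implemented with memcpy(), the "VRAM",
bounce buffer and system RAM are plain byte arrays, and evict_via_bounce() is
an invented name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a GPU copy from VRAM into a pinned GTT bounce
 * buffer (step 1). */
static void gpu_copy_vram_to_bounce(uint8_t *bounce, const uint8_t *vram,
                                    size_t len)
{
    memcpy(bounce, vram, len);
}

/* Hypothetical stand-in for the CPU copy from the bounce buffer into the
 * BO's system RAM pages (step 2). */
static void cpu_copy_bounce_to_sysram(uint8_t *sysram, const uint8_t *bounce,
                                      size_t len)
{
    memcpy(sysram, bounce, len);
}

/* Evict bo_size bytes in passes of at most bounce_size bytes each.
 * Returns the number of passes that were needed. */
static size_t evict_via_bounce(uint8_t *sysram, const uint8_t *vram,
                               size_t bo_size, uint8_t *bounce,
                               size_t bounce_size)
{
    size_t passes = 0;

    assert(bounce_size > 0);
    for (size_t off = 0; off < bo_size; off += bounce_size) {
        size_t left = bo_size - off;
        size_t len = left < bounce_size ? left : bounce_size;

        gpu_copy_vram_to_bounce(bounce, vram + off, len);
        cpu_copy_bounce_to_sysram(sysram + off, bounce, len);
        passes++;
    }
    return passes;
}
```

Because the bounce buffer is reserved once and reused for every pass, the
amount of GTT space needed is fixed by bounce_size and does not grow with the
size of the BO being evicted.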

bugzilla-daemon@freedesktop.org Dec. 25, 2015, 8:16 a.m. UTC | #2
https://bugs.freedesktop.org/show_bug.cgi?id=92775

--- Comment #2 from david1.zhou@amd.com <david1.zhou@amd.com> ---
(In reply to Michel Dänzer from comment #1)
> (In reply to Shawn Starr from comment #0)
> > One solution discussed is to split up the transfer into smaller chunks in
> > radeon_ttm.
> 
> Specifically, here's how I think a fallback could be implemented in the
> kernel driver which can never fail because of fragmentation or resource
> starvation:
> 
> During initialization, reserve some pinned GTT memory for bounce buffers.
> When a BO can't be bound to GTT for eviction as in the case reported here,
> instead do the eviction directly from VRAM to CPU domain in one or several
> passes of:
> 1. Copy part of the BO from VRAM to one of the reserved bounce buffers in
> GTT using the GPU.
> 2. Copy that part of the BO from the bounce buffer to the BO's system RAM
> pages using the CPU.


If we could split a large BO into two parts, one in VRAM and one in GTT, that
would also seem to help in this case.

bugzilla-daemon@freedesktop.org Dec. 25, 2015, 3:51 p.m. UTC | #3
https://bugs.freedesktop.org/show_bug.cgi?id=92775

--- Comment #3 from Christian König <deathsimple@vodafone.de> ---
Actually it doesn't need to be so complicated. Just take a look at
amdgpu|radeon_move_vram_ram().

Instead of trying to reallocate and bind everything at once, we just need to
bind the already allocated new_mem pages page by page and copy page by page.
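
As a rough illustration of the page-by-page idea, the sketch below binds one
destination page at a time into a single-page window and copies into it. It is
a user-space model under stated assumptions, not the code in
amdgpu|radeon_move_vram_ram(): struct gtt_window, window_bind(),
window_unbind() and move_page_by_page() are hypothetical names, "binding" only
records a pointer, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/* Hypothetical one-page GTT window; binding just records the visible page. */
struct gtt_window {
    uint8_t *bound_page;
};

static void window_bind(struct gtt_window *w, uint8_t *page)
{
    w->bound_page = page;
}

static void window_unbind(struct gtt_window *w)
{
    w->bound_page = NULL;
}

/* Copy npages pages from vram into the already allocated (and possibly
 * scattered) destination pages, binding only one page at a time.
 * Returns the number of pages copied. */
static size_t move_page_by_page(uint8_t *const *pages, size_t npages,
                                const uint8_t *vram, struct gtt_window *w)
{
    size_t copied = 0;

    for (size_t i = 0; i < npages; i++) {
        window_bind(w, pages[i]);
        assert(w->bound_page != NULL);
        memcpy(w->bound_page, vram + i * SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);
        window_unbind(w);
        copied++;
    }
    return copied;
}
```

Only one page of GTT address space is in use at any moment, so in this model
the move cannot fail for lack of a contiguous GTT range large enough for the
whole BO.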

bugzilla-daemon@freedesktop.org July 17, 2017, 8:32 p.m. UTC | #4
https://bugs.freedesktop.org/show_bug.cgi?id=92775

Shawn Starr <shawn.starr@rogers.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WORKSFORME
             Status|NEW                         |RESOLVED

--- Comment #4 from Shawn Starr <shawn.starr@rogers.com> ---
I believe this can be closed, given the massive changes in amdgpu. I haven't
seen any issues since.

Patch

--- r600_buffer_common.c        2015-11-02 01:56:10.796446185 -0500
+++ r600_buffer_common.c.workaround     2015-11-01 21:16:55.398517539 -0500
@@ -133,7 +133,7 @@  bool r600_init_resource(struct r600_comm
        case PIPE_USAGE_IMMUTABLE:
        default:
                /* Not listing GTT here improves performance in some apps. */
-               res->domains = RADEON_DOMAIN_VRAM;
+               res->domains = RADEON_DOMAIN_VRAM | RADEON_DOMAIN_GTT;
                flags |= RADEON_FLAG_GTT_WC;
                break;
        }
@@ -158,7 +158,7 @@  bool r600_init_resource(struct r600_comm
        /* Tiled textures are unmappable. Always put them in VRAM. */
        if (res->b.b.target != PIPE_BUFFER &&
            rtex->surface.level[0].mode >= RADEON_SURF_MODE_1D) {
-               res->domains = RADEON_DOMAIN_VRAM;
+               res->domains = RADEON_DOMAIN_VRAM | RADEON_DOMAIN_GTT;
                flags &= ~RADEON_FLAG_CPU_ACCESS;
                flags |= RADEON_FLAG_NO_CPU_ACCESS;
        }