mbox series

[v3,00/37] Introduce memory region concept (including device local memory)

Message ID 20190809222643.23142-1-matthew.auld@intel.com (mailing list archive)
Headers show
Series Introduce memory region concept (including device local memory) | expand

Message

Matthew Auld Aug. 9, 2019, 10:26 p.m. UTC
In preparation for upcoming devices with device local memory, introduce the
concept of different memory regions, and a simple buddy allocator to manage
them in i915.

One of the concerns raised from v1 was around not using enough of TTM, which is
a fair criticism, so trying to get better alignment here is something we are
investigating, though currently that is still WIP so in the meantime v3 still
continues to push more of the low-level details forward, but not yet the TTM
interactions.

Sidenote:
Daniel raised a fair point with the whole mmap_offset uAPI and whether we can
just get away with using gtt_mmap, it looks like it should work and would
simplify a few things and possibly allow us to drop a couple patches. Thoughts?

Abdiel Janulgue (11):
  drm/i915: Add memory region information to device_info
  drm/i915: setup io-mapping for LMEM
  drm/i915/lmem: support kernel mapping
  drm/i915: enumerate and init each supported region
  drm/i915: Allow i915 to manage the vma offset nodes instead of drm
    core
  drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET
  drm/i915/lmem: add helper to get CPU accessible offset
  drm/i915: Add cpu and lmem fault handlers
  drm/i915: cpu-map based dumb buffers
  drm/i915: Introduce GEM_OBJECT_SETPARAM with I915_PARAM_MEMORY_REGION
  drm/i915/query: Expose memory regions through the query uAPI

CQ Tang (1):
  drm/i915: check for missing aperture in insert_mappable_node

Daniele Ceraolo Spurio (4):
  drm/i915: define HAS_MAPPABLE_APERTURE
  drm/i915: do not map aperture if it is not available.
  drm/i915: set num_fence_regs to 0 if there is no aperture
  drm/i915: error capture with no ggtt slot

Matthew Auld (20):
  drm/i915: buddy allocator
  drm/i915: introduce intel_memory_region
  drm/i915/region: support basic eviction
  drm/i915/region: support continuous allocations
  drm/i915/region: support volatile objects
  drm/i915: support creating LMEM objects
  drm/i915/blt: don't assume pinned intel_context
  drm/i915/blt: bump size restriction
  drm/i915/blt: support copying objects
  drm/i915/selftests: move gpu-write-dw into utils
  drm/i915/selftests: add write-dword test for LMEM
  drm/i915/selftest: extend coverage to include LMEM huge-pages
  drm/i915/lmem: support CPU relocations
  drm/i915/lmem: support pread
  drm/i915/lmem: support pwrite
  drm/i915: treat shmem as a region
  drm/i915: treat stolen as a region
  drm/i915/selftests: check for missing aperture
  drm/i915: support basic object migration
  HAX drm/i915: add the fake lmem region

Michal Wajdeczko (1):
  drm/i915: Don't try to place HWS in non-existing mappable region

 arch/x86/kernel/early-quirks.c                |  26 +
 drivers/gpu/drm/i915/Makefile                 |   5 +
 .../gpu/drm/i915/gem/i915_gem_client_blt.c    |  34 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  17 +
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |   2 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  55 +-
 drivers/gpu/drm/i915/gem/i915_gem_internal.c  |  21 +-
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h    |   4 +
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c      | 315 +++++++
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h      |  37 +
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      | 376 +++++++-
 drivers/gpu/drm/i915/gem/i915_gem_object.c    | 271 ++++++
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  29 +-
 .../gpu/drm/i915/gem/i915_gem_object_blt.c    | 349 +++++++-
 .../gpu/drm/i915/gem/i915_gem_object_blt.h    |  18 +-
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  48 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |  28 +-
 drivers/gpu/drm/i915/gem/i915_gem_phys.c      |   6 +-
 drivers/gpu/drm/i915/gem/i915_gem_region.c    | 165 ++++
 drivers/gpu/drm/i915/gem/i915_gem_region.h    |  29 +
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  71 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c    |  71 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.h    |   3 +-
 .../drm/i915/gem/selftests/huge_gem_object.c  |   4 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   | 331 ++++---
 .../i915/gem/selftests/i915_gem_client_blt.c  |  16 +-
 .../i915/gem/selftests/i915_gem_coherency.c   |   5 +-
 .../drm/i915/gem/selftests/i915_gem_context.c | 134 +--
 .../drm/i915/gem/selftests/i915_gem_mman.c    |  15 +-
 .../i915/gem/selftests/i915_gem_object_blt.c  | 128 ++-
 .../drm/i915/gem/selftests/igt_gem_utils.c    | 135 +++
 .../drm/i915/gem/selftests/igt_gem_utils.h    |  16 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   2 +-
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |   5 +-
 drivers/gpu/drm/i915/gt/intel_reset.c         |  13 +-
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c    |   2 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  14 +-
 drivers/gpu/drm/i915/i915_buddy.c             | 433 ++++++++++
 drivers/gpu/drm/i915/i915_buddy.h             | 128 +++
 drivers/gpu/drm/i915/i915_drv.c               |  28 +-
 drivers/gpu/drm/i915/i915_drv.h               |  20 +-
 drivers/gpu/drm/i915/i915_gem.c               |  41 +-
 drivers/gpu/drm/i915/i915_gem_fence_reg.c     |   6 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c           | 121 ++-
 drivers/gpu/drm/i915/i915_getparam.c          |   1 +
 drivers/gpu/drm/i915/i915_globals.c           |   1 +
 drivers/gpu/drm/i915/i915_globals.h           |   1 +
 drivers/gpu/drm/i915/i915_gpu_error.c         |  64 +-
 drivers/gpu/drm/i915/i915_pci.c               |  29 +-
 drivers/gpu/drm/i915/i915_query.c             |  57 ++
 drivers/gpu/drm/i915/i915_vma.c               |  21 +-
 drivers/gpu/drm/i915/intel_device_info.h      |   2 +
 drivers/gpu/drm/i915/intel_memory_region.c    | 240 +++++
 drivers/gpu/drm/i915/intel_memory_region.h    | 116 +++
 drivers/gpu/drm/i915/intel_region_lmem.c      | 141 +++
 drivers/gpu/drm/i915/intel_region_lmem.h      |  16 +
 drivers/gpu/drm/i915/selftests/i915_buddy.c   | 719 +++++++++++++++
 drivers/gpu/drm/i915/selftests/i915_gem.c     |   3 +
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |   8 +-
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 .../drm/i915/selftests/i915_mock_selftests.h  |   2 +
 .../drm/i915/selftests/intel_memory_region.c  | 817 ++++++++++++++++++
 .../gpu/drm/i915/selftests/mock_gem_device.c  |   9 +-
 drivers/gpu/drm/i915/selftests/mock_region.c  |  64 ++
 drivers/gpu/drm/i915/selftests/mock_region.h  |  16 +
 include/drm/i915_drm.h                        |   3 +
 include/uapi/drm/i915_drm.h                   |  93 ++
 67 files changed, 5524 insertions(+), 477 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_lmem.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_lmem.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_region.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_region.h
 create mode 100644 drivers/gpu/drm/i915/i915_buddy.c
 create mode 100644 drivers/gpu/drm/i915/i915_buddy.h
 create mode 100644 drivers/gpu/drm/i915/intel_memory_region.c
 create mode 100644 drivers/gpu/drm/i915/intel_memory_region.h
 create mode 100644 drivers/gpu/drm/i915/intel_region_lmem.c
 create mode 100644 drivers/gpu/drm/i915/intel_region_lmem.h
 create mode 100644 drivers/gpu/drm/i915/selftests/i915_buddy.c
 create mode 100644 drivers/gpu/drm/i915/selftests/intel_memory_region.c
 create mode 100644 drivers/gpu/drm/i915/selftests/mock_region.c
 create mode 100644 drivers/gpu/drm/i915/selftests/mock_region.h

Comments

Dave Airlie Aug. 13, 2019, 7:20 p.m. UTC | #1
On Sat, 10 Aug 2019 at 08:26, Matthew Auld <matthew.auld@intel.com> wrote:
>
> In preparation for upcoming devices with device local memory, introduce the
> concept of different memory regions, and a simple buddy allocator to manage
> them in i915.
>
> One of the concerns raised from v1 was around not using enough of TTM, which is
> a fair criticism, so trying to get better alignment here is something we are
> investigating, though currently that is still WIP so in the meantime v3 still
> continues to push more of the low-level details forward, but not yet the TTM
> interactions.

Can we bump the TTM work up the ladder here, as is I'm not willing to
accept any of this code upstream without some serious analysis, this
isn't a case of me making a nice suggestion and you having the option
to ignore it. Don't make me shout.

Dave.
Joonas Lahtinen Sept. 12, 2019, 1:33 p.m. UTC | #2
Quoting Dave Airlie (2019-08-13 22:20:52)
> On Sat, 10 Aug 2019 at 08:26, Matthew Auld <matthew.auld@intel.com> wrote:
> >
> > In preparation for upcoming devices with device local memory, introduce the
> > concept of different memory regions, and a simple buddy allocator to manage
> > them in i915.
> >
> > One of the concerns raised from v1 was around not using enough of TTM, which is
> > a fair criticism, so trying to get better alignment here is something we are
> > investigating, though currently that is still WIP so in the meantime v3 still
> > continues to push more of the low-level details forward, but not yet the TTM
> > interactions.
> 
> Can we bump the TTM work up the ladder here, as is I'm not willing to
> accept any of this code upstream without some serious analysis, this
> isn't a case of me making a nice suggestion and you having the option
> to ignore it. Don't make me shout.

Thanks for a reminder. TTM analysis was ongoing on the background
and we now reserved enough time to conclude on how to best align
with TTM in short-term and long-term.

We decided to bite the bullet and apply dma_resv as the outer-most
locking in i915 codepaths to align with the TTM locking. As a
conclusion to those discussions we documented guidelines how to
align with TTM locking:

https://patchwork.freedesktop.org/patch/328266/

As refactoring of locking fundamentals of the driver is a massive
undergoing with many opens along the path, we'd like to propose a
staged approach to avoid stalling the upstream work while it's
being done.

Our first suggested step would be merging the i915 local memory
related internal code reworks to unblock the display work. This
step should not cause any conflicts with TTM.

Following step would be to merge proposed memory allocation/
management uAPIs with TTM related functionality behind them for
early debug. They would be protected by DRM_I915_DEBUG_EARLY_API
kernel config flag (depending on EXPERT & STAGING & BROKEN).

This would allow us to keep debugging these new IOCTLs with Mesa
etc. while we rework the locking. The protection still leaving us
a possibility to correcting the uAPIs if/when there is need after
reworking the locking around dma_resv progresses. Draft of such
proposal here:

https://patchwork.freedesktop.org/patch/327908/

The final step (a rather long one) would be then to complete the
locking rework in the driver and lift the DEBUG_EARLY_API
protection once the locking has been sorted.

If you could confirm the above plan sounds reasonable to you, we
may then proceed with it.

Regards, Joonas
Dave Airlie Sept. 13, 2019, 9:55 a.m. UTC | #3
On Thu, 12 Sep 2019 at 23:33, Joonas Lahtinen
<joonas.lahtinen@linux.intel.com> wrote:
>
> Quoting Dave Airlie (2019-08-13 22:20:52)
> > On Sat, 10 Aug 2019 at 08:26, Matthew Auld <matthew.auld@intel.com> wrote:
> > >
> > > In preparation for upcoming devices with device local memory, introduce the
> > > concept of different memory regions, and a simple buddy allocator to manage
> > > them in i915.
> > >
> > > One of the concerns raised from v1 was around not using enough of TTM, which is
> > > a fair criticism, so trying to get better alignment here is something we are
> > > investigating, though currently that is still WIP so in the meantime v3 still
> > > continues to push more of the low-level details forward, but not yet the TTM
> > > interactions.
> >
> > Can we bump the TTM work up the ladder here, as is I'm not willing to
> > accept any of this code upstream without some serious analysis, this
> > isn't a case of me making a nice suggestion and you having the option
> > to ignore it. Don't make me shout.
>
> Thanks for a reminder. TTM analysis was ongoing on the background
> and we now reserved enough time to conclude on how to best align
> with TTM in short-term and long-term.
>
> We decided to bite the bullet and apply dma_resv as the outer-most
> locking in i915 codepaths to align with the TTM locking. As a
> conclusion to those discussions we documented guidelines how to
> align with TTM locking:
>
> https://patchwork.freedesktop.org/patch/328266/
>
> As refactoring of locking fundamentals of the driver is a massive
> undergoing with many opens along the path, we'd like to propose a
> staged approach to avoid stalling the upstream work while it's
> being done.
>
> Our first suggested step would be merging the i915 local memory
> related internal code reworks to unblock the display work. This
> step should not cause any conflicts with TTM.
>
> Following step would be to merge proposed memory allocation/
> management uAPIs with TTM related functionality behind them for
> early debug. They would be protected by DRM_I915_DEBUG_EARLY_API
> kernel config flag (depending on EXPERT & STAGING & BROKEN).
>
> This would allow us to keep debugging these new IOCTLs with Mesa
> etc. while we rework the locking. The protection still leaving us
> a possibility to correcting the uAPIs if/when there is need after
> reworking the locking around dma_resv progresses. Draft of such
> proposal here:
>
> https://patchwork.freedesktop.org/patch/327908/
>
> The final step (a rather long one) would be then to complete the
> locking rework in the driver and lift the DEBUG_EARLY_API
> protection once the locking has been sorted.
>
> If you could confirm the above plan sounds reasonable to you, we
> may then proceed with it.

Just travelling, but this sounds like a good way foward to me.

Dave.