
[PULL] drm-xe-next

Message ID ZxqJS8bCWc9ZgIav@fedora (mailing list archive)
State New, archived

Pull-request

https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-2024-10-24

Message

Thomas Hellstrom Oct. 24, 2024, 5:52 p.m. UTC
Hi, Dave & Simona,

This week's drm-xe-next PR

Thanks,
Thomas


drm-xe-next-2024-10-24:
UAPI Changes:
- Define and parse OA sync properties (Ashutosh)

Driver Changes:
- Add caller info to xe_gt_reset_async (Nirmoy)
- A large forcewake rework / cleanup (Himal)
- A g2h response timeout fix (Badal)
- A PTL workaround (Vinay)
- Handle unreliable MMIO reads during forcewake (Shuicheng)
- Ufence user-space access fixes (Nirmoy)
- Annotate flexible arrays (Matthew Brost)
- Enable GuC lite restore (Fei)
- Prevent GuC register capture on VF (Zhanjun)
- Show VFs VRAM / LMEM provisioning summary over debugfs (Michal)
- Parallel queues fix on GT reset (Nirmoy)
- Move reference grabbing to a job's dma-fence (Matt Brost)
- Mark a number of local workqueues WQ_MEM_RECLAIM (Matt Brost)
- OA synchronization support (Ashutosh)
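
As background for the "Annotate flexible arrays" item: the shortlog entry is "drm/xe: Use __counted_by for flexible arrays". The sketch below is a user-space illustration of that idea, not code from the driver. The struct `id_list`, its fields, the helpers and the `COUNTED_BY` macro are all hypothetical names; the only real facility assumed is the compiler's `counted_by` attribute, which is used when the compiler reports support for it and is compiled out otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Use the counted_by attribute only when the compiler supports it. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member) /* no-op on compilers without the attribute */
#endif

/* Hypothetical structure: a header followed by a flexible array. */
struct id_list {
	size_t count;                 /* number of valid entries in ids[] */
	int ids[] COUNTED_BY(count);  /* bounds are tied to 'count' */
};

/* Allocate a zeroed list with room for 'count' entries. */
static struct id_list *id_list_alloc(size_t count)
{
	struct id_list *l = calloc(1, sizeof(*l) + count * sizeof(l->ids[0]));

	if (!l)
		return NULL;
	l->count = count; /* set the counter before touching ids[] */
	return l;
}

/* Sum all entries, never reading past 'count'. */
static long id_list_sum(const struct id_list *l)
{
	long sum = 0;

	for (size_t i = 0; i < l->count; i++)
		sum += l->ids[i];
	return sum;
}
```

The annotation lets bounds-checking instrumentation relate accesses to `ids[]` to the `count` field, which is why the counter is assigned before the array is used.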

The following changes since commit 2eb460ab9f4bc5b575f52568d17936da0af681d8:

  drm/xe: Enlarge the invalidation timeout from 150 to 500 (2024-10-16 16:11:10 +0100)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-2024-10-24

for you to fetch changes up to 85d3f9e84e0628c412b69aa99b63654dfa08ad68:

  drm/xe/oa: Allow only certain property changes from config (2024-10-23 12:42:20 -0700)

----------------------------------------------------------------
UAPI Changes:
- Define and parse OA sync properties (Ashutosh)

Driver Changes:
- Add caller info to xe_gt_reset_async (Nirmoy)
- A large forcewake rework / cleanup (Himal)
- A g2h response timeout fix (Badal)
- A PTL workaround (Vinay)
- Handle unreliable MMIO reads during forcewake (Shuicheng)
- Ufence user-space access fixes (Nirmoy)
- Annotate flexible arrays (Matthew Brost)
- Enable GuC lite restore (Fei)
- Prevent GuC register capture on VF (Zhanjun)
- Show VFs VRAM / LMEM provisioning summary over debugfs (Michal)
- Parallel queues fix on GT reset (Nirmoy)
- Move reference grabbing to a job's dma-fence (Matt Brost)
- Mark a number of local workqueues WQ_MEM_RECLAIM (Matt Brost)
- OA synchronization support (Ashutosh)

----------------------------------------------------------------
Ashutosh Dixit (7):
      drm/xe/oa: Separate batch submission from waiting for completion
      drm/xe/oa/uapi: Define and parse OA sync properties
      drm/xe/oa: Add input fence dependencies
      drm/xe/oa: Signal output fences
      drm/xe/oa: Move functions up so they can be reused for config ioctl
      drm/xe/oa: Add syncs support to OA config ioctl
      drm/xe/oa: Allow only certain property changes from config

Badal Nilawar (1):
      drm/xe/guc/ct: Flush g2h worker in case of g2h response timeout

Fei Yang (1):
      drm/xe: enable lite restore

Himal Prasad Ghimiray (26):
      drm/xe: Add member initialized_domains to xe_force_wake()
      drm/xe/forcewake: Change awake_domain datatype
      drm/xe/forcewake: Add a helper xe_force_wake_ref_has_domain()
      drm/xe: Error handling in xe_force_wake_get()
      drm/xe: Modify xe_force_wake_put to handle _get returned mask
      drm/xe/device: Update handling of xe_force_wake_get return
      drm/xe/hdcp: Update handling of xe_force_wake_get return
      drm/xe/gsc: Update handling of xe_force_wake_get return
      drm/xe/gt: Update handling of xe_force_wake_get return
      drm/xe/xe_gt_idle: Update handling of xe_force_wake_get return
      drm/xe/devcoredump: Update handling of xe_force_wake_get return
      drm/xe/tests/mocs: Update xe_force_wake_get() return handling
      drm/xe/mocs: Update handling of xe_force_wake_get return
      drm/xe/xe_drm_client: Update handling of xe_force_wake_get return
      drm/xe/xe_gt_debugfs: Update handling of xe_force_wake_get return
      drm/xe/guc: Update handling of xe_force_wake_get return
      drm/xe/huc: Update handling of xe_force_wake_get return
      drm/xe/oa: Handle force_wake_get failure in xe_oa_stream_init()
      drm/xe/pat: Update handling of xe_force_wake_get return
      drm/xe/gt_tlb_invalidation_ggtt: Update handling of xe_force_wake_get return
      drm/xe/xe_reg_sr: Update handling of xe_force_wake_get return
      drm/xe/query: Update handling of xe_force_wake_get return
      drm/xe/vram: Update handling of xe_force_wake_get return
      drm/xe: forcewake debugfs open fails on xe_forcewake_get failure
      drm/xe: Ensure __must_check for xe_force_wake_get() return
      drm/xe: Change return type to void for xe_force_wake_put

Matthew Brost (5):
      drm/xe: Use __counted_by for flexible arrays
      drm/xe: Take ref to job's fence in arm
      drm/xe: Mark GGTT work queue with WQ_MEM_RECLAIM
      drm/xe: Mark G2H work queue with WQ_MEM_RECLAIM
      drm/xe: Mark GT work queue with WQ_MEM_RECLAIM

Michal Wajdeczko (1):
      drm/xe/pf: Show VFs LMEM provisioning summary over debugfs

Nirmoy Das (4):
      drm/xe: Add caller info to xe_gt_reset_async
      drm/xe/ufence: Prefetch ufence addr to catch bogus address
      drm/xe/ufence: Warn if mmget_not_zero() fails
      drm/xe: Don't restart parallel queues multiple times on GT reset

Shuicheng Lin (1):
      drm/xe: Handle unreliable MMIO reads during forcewake

Vinay Belgaumkar (1):
      drm/xe/ptl: Apply Wa_14022866841

Zhanjun Dong (1):
      drm/xe/guc: Prevent GuC register capture running on VF

 drivers/gpu/drm/xe/abi/guc_klvs_abi.h       |   1 +
 drivers/gpu/drm/xe/display/xe_hdcp_gsc.c    |   6 +-
 drivers/gpu/drm/xe/tests/xe_mocs.c          |  18 +-
 drivers/gpu/drm/xe/xe_debugfs.c             |  27 +-
 drivers/gpu/drm/xe/xe_devcoredump.c         |  14 +-
 drivers/gpu/drm/xe/xe_device.c              |  25 +-
 drivers/gpu/drm/xe/xe_drm_client.c          |   8 +-
 drivers/gpu/drm/xe/xe_exec_queue_types.h    |   2 +-
 drivers/gpu/drm/xe/xe_execlist.c            |   2 +-
 drivers/gpu/drm/xe/xe_force_wake.c          | 134 ++++--
 drivers/gpu/drm/xe/xe_force_wake.h          |  23 +-
 drivers/gpu/drm/xe/xe_force_wake_types.h    |   6 +-
 drivers/gpu/drm/xe/xe_ggtt.c                |   2 +-
 drivers/gpu/drm/xe/xe_gsc.c                 |  23 +-
 drivers/gpu/drm/xe/xe_gsc_proxy.c           |   9 +-
 drivers/gpu/drm/xe/xe_gt.c                  | 110 +++--
 drivers/gpu/drm/xe/xe_gt_debugfs.c          |  13 +-
 drivers/gpu/drm/xe/xe_gt_idle.c             |  26 +-
 drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c  |  35 ++
 drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h  |   1 +
 drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c |   5 +
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |   5 +-
 drivers/gpu/drm/xe/xe_guc.c                 |  15 +-
 drivers/gpu/drm/xe/xe_guc_ads.c             |   5 +
 drivers/gpu/drm/xe/xe_guc_capture.c         |   8 +-
 drivers/gpu/drm/xe/xe_guc_ct.c              |  20 +-
 drivers/gpu/drm/xe/xe_guc_fwif.h            |   1 +
 drivers/gpu/drm/xe/xe_guc_log.c             |   9 +-
 drivers/gpu/drm/xe/xe_guc_pc.c              |  50 +-
 drivers/gpu/drm/xe/xe_guc_submit.c          |  29 +-
 drivers/gpu/drm/xe/xe_huc.c                 |   8 +-
 drivers/gpu/drm/xe/xe_mocs.c                |  14 +-
 drivers/gpu/drm/xe/xe_oa.c                  | 678 +++++++++++++++++++---------
 drivers/gpu/drm/xe/xe_oa_types.h            |  12 +
 drivers/gpu/drm/xe/xe_pat.c                 |  65 ++-
 drivers/gpu/drm/xe/xe_query.c               |  10 +-
 drivers/gpu/drm/xe/xe_reg_sr.c              |  24 +-
 drivers/gpu/drm/xe/xe_sched_job.c           |   2 +-
 drivers/gpu/drm/xe/xe_sched_job_types.h     |   3 +-
 drivers/gpu/drm/xe/xe_sync.c                |   5 +-
 drivers/gpu/drm/xe/xe_vram.c                |  12 +-
 drivers/gpu/drm/xe/xe_wa_oob.rules          |   2 +
 include/uapi/drm/xe_drm.h                   |  17 +
 43 files changed, 997 insertions(+), 487 deletions(-)
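
For readers unfamiliar with the forcewake rework, its patch titles suggest this calling convention: the get function returns a mask of the domains it actually woke, the return value must be checked, and the put function takes that mask and returns void. The following is a self-contained sketch of that pattern under those assumptions. It is not the driver's code. Every name here (`fw_state`, `fw_get`, `fw_put`, `fw_ref_has_domain`, the domain bits and the `broken` field) is hypothetical and only mirrors the shape implied by the commit subjects.

```c
#include <assert.h>
#include <stdbool.h>

#define FW_DOMAIN_COUNT 3

/* Hypothetical domain bits for the illustration. */
enum fw_domain {
	FW_GT     = 1u << 0,
	FW_RENDER = 1u << 1,
	FW_MEDIA  = 1u << 2,
	FW_ALL    = FW_GT | FW_RENDER | FW_MEDIA,
};

struct fw_state {
	unsigned int initialized;      /* domains present on this device */
	unsigned int broken;           /* domains that fail to wake (demo only) */
	int refcount[FW_DOMAIN_COUNT]; /* per-domain wake references */
};

/* Returns the mask of domains actually woken; callers must not ignore it. */
__attribute__((warn_unused_result))
static unsigned int fw_get(struct fw_state *fw, unsigned int domains)
{
	unsigned int ref = 0;

	for (int i = 0; i < FW_DOMAIN_COUNT; i++) {
		unsigned int bit = 1u << i;

		if (!(domains & fw->initialized & bit) || (fw->broken & bit))
			continue; /* not requested, not present, or failed to wake */
		fw->refcount[i]++;
		ref |= bit;
	}
	return ref;
}

/* True if every bit of 'domain' is held in 'ref'. */
static bool fw_ref_has_domain(unsigned int ref, unsigned int domain)
{
	return (ref & domain) == domain;
}

/* Releases exactly the domains recorded in 'ref'; nothing to report back. */
static void fw_put(struct fw_state *fw, unsigned int ref)
{
	for (int i = 0; i < FW_DOMAIN_COUNT; i++) {
		if (!(ref & (1u << i)))
			continue;
		assert(fw->refcount[i] > 0);
		fw->refcount[i]--;
	}
}

/* Typical caller: bail out if the needed domain is missing, always put. */
static int read_render_register(struct fw_state *fw)
{
	unsigned int ref = fw_get(fw, FW_RENDER);

	if (!fw_ref_has_domain(ref, FW_RENDER)) {
		fw_put(fw, ref); /* drop whatever was partially taken */
		return -1;
	}
	/* ... the register access would go here ... */
	fw_put(fw, ref);
	return 0;
}
```

Passing the returned mask back to the put side means a partially successful get never releases a domain it did not take.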

Comments

Matthew Brost Oct. 24, 2024, 7:22 p.m. UTC | #1
On Thu, Oct 24, 2024 at 07:52:11PM +0200, Thomas Hellstrom wrote:
> Hi, Dave & Simona,
> 
> This week's drm-xe-next PR
> 
> Thanks,
> Thomas
> 
> 
> drm-xe-next-2024-10-24:
> UAPI Changes:
> - Define and parse OA sync properties (Ashutosh)
> 
> Driver Changes:
> - Add caller info to xe_gt_reset_async (Nirmoy)
> - A large forcewake rework / cleanup (Himal)
> - A g2h response timeout fix (Badal)
> - A PTL workaround (Vinay)
> - Handle unreliable MMIO reads during forcewake (Shuicheng)
> - Ufence user-space access fixes (Nirmoy)
> - Annotate flexible arrays (Matthew Brost)
> - Enable GuC lite restore (Fei)
> - Prevent GuC register capture on VF (Zhanjun)
> - Show VFs VRAM / LMEM provisioning summary over debugfs (Michal)
> - Parallel queues fix on GT reset (Nirmoy)
> - Move reference grabbing to a job's dma-fence (Matt Brost)
> - Mark a number of local workqueues WQ_MEM_RECLAIM (Matt Brost)

This breaks CI [1] - my mistake. Maybe omit these in this week's PR.

We need [2] merged to fix this. Waiting on an RB but I'd like to get all of this in 6.12.

Matt

[1] https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-140135v2/bat-lnl-1/igt@xe_exec_fault_mode@twice-invalid-fault.html
[2] https://patchwork.freedesktop.org/series/140406/

Thomas Hellstrom Oct. 25, 2024, 7:30 a.m. UTC | #2
On Thu, 2024-10-24 at 19:22 +0000, Matthew Brost wrote:
> On Thu, Oct 24, 2024 at 07:52:11PM +0200, Thomas Hellstrom wrote:
> > Hi, Dave & Simona,
> > 
> > This week's drm-xe-next PR
> > 
> > Thanks,
> > Thomas
> > 
> > 
> > drm-xe-next-2024-10-24:
> > UAPI Changes:
> > - Define and parse OA sync properties (Ashutosh)
> > 
> > Driver Changes:
> > - Add caller info to xe_gt_reset_async (Nirmoy)
> > - A large forcewake rework / cleanup (Himal)
> > - A g2h response timeout fix (Badal)
> > - A PTL workaround (Vinay)
> > - Handle unreliable MMIO reads during forcewake (Shuicheng)
> > - Ufence user-space access fixes (Nirmoy)
> > - Annotate flexible arrays (Matthew Brost)
> > - Enable GuC lite restore (Fei)
> > - Prevent GuC register capture on VF (Zhanjun)
> > - Show VFs VRAM / LMEM provisioning summary over debugfs (Michal)
> > - Parallel queues fix on GT reset (Nirmoy)
> > - Move reference grabbing to a job's dma-fence (Matt Brost)
> > - Mark a number of local workqueues WQ_MEM_RECLAIM (Matt Brost)
> 
> This breaks CI [1] - my mistake. Maybe omit these in this week's PR.
> 
> We need [2] merged to fix this. Waiting on an RB but I'd like to get
> all of this in 6.12.
> 
> Matt
> 
> [1]
> https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-140135v2/bat-lnl-1/igt@xe_exec_fault_mode@twice-invalid-fault.html
> [2] https://patchwork.freedesktop.org/series/140406/

So this CI failure is a warning only and IMHO for drm-xe-next (6.13)
it's not catastrophic. There might be a window in the bisect history
where this warning appears. It's perhaps more important for -fixes,
though.

If we have to wait for the scheduler patch to go into drm-misc-next /
drm-next and get backmerged, I fear we'd hold off this branch for too long.

@Dave, @Sima 
If you feel differently please skip this PR for this week and we'll
work to get the scheduler patch merged asap.

Thanks,
Thomas


Jani Nikula Oct. 25, 2024, 9:34 a.m. UTC | #3
On Fri, 25 Oct 2024, Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> On Thu, 2024-10-24 at 19:22 +0000, Matthew Brost wrote:
>> On Thu, Oct 24, 2024 at 07:52:11PM +0200, Thomas Hellstrom wrote:
>> > Hi, Dave & Simona,
>> > 
>> > This week's drm-xe-next PR
>> > 
>> > Thanks,
>> > Thomas
>> > 
>> > 
>> > drm-xe-next-2024-10-24:
>> > UAPI Changes:
>> > - Define and parse OA sync properties (Ashutosh)
>> > 
>> > Driver Changes:
>> > - Add caller info to xe_gt_reset_async (Nirmoy)
>> > - A large forcewake rework / cleanup (Himal)
>> > - A g2h response timeout fix (Badal)
>> > - A PTL workaround (Vinay)
>> > - Handle unreliable MMIO reads during forcewake (Shuicheng)
>> > - Ufence user-space access fixes (Nirmoy)
>> > - Annotate flexible arrays (Matthew Brost)
>> > - Enable GuC lite restore (Fei)
>> > - Prevent GuC register capture on VF (Zhanjun)
>> > - Show VFs VRAM / LMEM provisioning summary over debugfs (Michal)
>> > - Parallel queues fix on GT reset (Nirmoy)
>> > - Move reference grabbing to a job's dma-fence (Matt Brost)
>> > - Mark a number of local workqueues WQ_MEM_RECLAIM (Matt Brost)
>> 
>> This breaks CI [1] - my mistake. Maybe omit these in this week's PR.

How did this pass CI and get merged in the first place?!?

It's now botching unrelated pre-merge testing all over the place,
e.g. [3] and [4].

BR,
Jani.


[3] https://lore.kernel.org/r/172981565466.1330037.6238046952250769671@2413ebb6fbb6
[4] https://lore.kernel.org/r/172981849964.1330038.16133455483045565936@2413ebb6fbb6


>> 
>> We need [2] merged to fix this. Waiting on an RB but I'd like to get
>> all of this in 6.12.
>> 
>> Matt
>> 
>> [1]
>> https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-140135v2/bat-lnl-1/igt@xe_exec_fault_mode@twice-invalid-fault.html
>> [2] https://patchwork.freedesktop.org/series/140406/
>
> So this CI failure is a warning only and IMHO for drm-xe-next (6.13)
> it's not catastrophic. There might be a window in the bisect history
> where this warning appears. It's perhaps more important for -fixes,
> though.
>
> If we have to wait for the scheduler patch to go into drm-misc-next /
> drm-next and get backmerged, I fear we'd hold off this branch for too long.
>
> @Dave, @Sima 
> If you feel differently please skip this PR for this week and we'll
> work to get the scheduler patch merged asap.
>
> Thanks,
> Thomas
>
>
>> 
>> > - OA synchronization support (Ashutosh)
>> > 
>> > The following changes since commit
>> > 2eb460ab9f4bc5b575f52568d17936da0af681d8:
>> > 
>> >   drm/xe: Enlarge the invalidation timeout from 150 to 500 (2024-
>> > 10-16 16:11:10 +0100)
>> > 
>> > are available in the Git repository at:
>> > 
>> >   https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-
>> > next-2024-10-24
>> > 
>> > for you to fetch changes up to
>> > 85d3f9e84e0628c412b69aa99b63654dfa08ad68:
>> > 
>> >   drm/xe/oa: Allow only certain property changes from config (2024-
>> > 10-23 12:42:20 -0700)
>> > 
>> > ----------------------------------------------------------------
>> > UAPI Changes:
>> > - Define and parse OA sync properties (Ashutosh)
>> > 
>> > Driver Changes:
>> > - Add caller info to xe_gt_reset_async (Nirmoy)
>> > - A large forcewake rework / cleanup (Himal)
>> > - A g2h response timeout fix (Badal)
>> > - A PTL workaround (Vinay)
>> > - Handle unreliable MMIO reads during forcewake (Shuicheng)
>> > - Ufence user-space access fixes (Nirmoy)
>> > - Annotate flexible arrays (Matthew Brost)
>> > - Enable GuC lite restore (Fei)
>> > - Prevent GuC register capture on VF (Zhanjun)
>> > - Show VFs VRAM / LMEM provisioning summary over debugfs (Michal)
>> > - Parallel queues fix on GT reset (Nirmoy)
>> > - Move reference grabbing to a job's dma-fence (Matt Brost)
>> > - Mark a number of local workqueues WQ_MEM_RECLAIM (Matt Brost)
>> > - OA synchronization support (Ashutosh)
>> > 
>> > ----------------------------------------------------------------
>> > Ashutosh Dixit (7):
>> >       drm/xe/oa: Separate batch submission from waiting for
>> > completion
>> >       drm/xe/oa/uapi: Define and parse OA sync properties
>> >       drm/xe/oa: Add input fence dependencies
>> >       drm/xe/oa: Signal output fences
>> >       drm/xe/oa: Move functions up so they can be reused for config
>> > ioctl
>> >       drm/xe/oa: Add syncs support to OA config ioctl
>> >       drm/xe/oa: Allow only certain property changes from config
>> > 
>> > Badal Nilawar (1):
>> >       drm/xe/guc/ct: Flush g2h worker in case of g2h response
>> > timeout
>> > 
>> > Fei Yang (1):
>> >       drm/xe: enable lite restore
>> > 
>> > Himal Prasad Ghimiray (26):
>> >       drm/xe: Add member initialized_domains to xe_force_wake()
>> >       drm/xe/forcewake: Change awake_domain datatype
>> >       drm/xe/forcewake: Add a helper xe_force_wake_ref_has_domain()
>> >       drm/xe: Error handling in xe_force_wake_get()
>> >       drm/xe: Modify xe_force_wake_put to handle _get returned mask
>> >       drm/xe/device: Update handling of xe_force_wake_get return
>> >       drm/xe/hdcp: Update handling of xe_force_wake_get return
>> >       drm/xe/gsc: Update handling of xe_force_wake_get return
>> >       drm/xe/gt: Update handling of xe_force_wake_get return
>> >       drm/xe/xe_gt_idle: Update handling of xe_force_wake_get
>> > return
>> >       drm/xe/devcoredump: Update handling of xe_force_wake_get
>> > return
>> >       drm/xe/tests/mocs: Update xe_force_wake_get() return handling
>> >       drm/xe/mocs: Update handling of xe_force_wake_get return
>> >       drm/xe/xe_drm_client: Update handling of xe_force_wake_get
>> > return
>> >       drm/xe/xe_gt_debugfs: Update handling of xe_force_wake_get
>> > return
>> >       drm/xe/guc: Update handling of xe_force_wake_get return
>> >       drm/xe/huc: Update handling of xe_force_wake_get return
>> >       drm/xe/oa: Handle force_wake_get failure in
>> > xe_oa_stream_init()
>> >       drm/xe/pat: Update handling of xe_force_wake_get return
>> >       drm/xe/gt_tlb_invalidation_ggtt: Update handling of
>> > xe_force_wake_get return
>> >       drm/xe/xe_reg_sr: Update handling of xe_force_wake_get return
>> >       drm/xe/query: Update handling of xe_force_wake_get return
>> >       drm/xe/vram: Update handling of xe_force_wake_get return
>> >       drm/xe: forcewake debugfs open fails on xe_forcewake_get
>> > failure
>> >       drm/xe: Ensure __must_check for xe_force_wake_get() return
>> >       drm/xe: Change return type to void for xe_force_wake_put
>> > 
>> > Matthew Brost (5):
>> >       drm/xe: Use __counted_by for flexible arrays
>> >       drm/xe: Take ref to job's fence in arm
>> >       drm/xe: Mark GGTT work queue with WQ_MEM_RECLAIM
>> >       drm/xe: Mark G2H work queue with WQ_MEM_RECLAIM
>> >       drm/xe: Mark GT work queue with WQ_MEM_RECLAIM
>> > 
>> > Michal Wajdeczko (1):
>> >       drm/xe/pf: Show VFs LMEM provisioning summary over debugfs
>> > 
>> > Nirmoy Das (4):
>> >       drm/xe: Add caller info to xe_gt_reset_async
>> >       drm/xe/ufence: Prefetch ufence addr to catch bogus address
>> >       drm/xe/ufence: Warn if mmget_not_zero() fails
>> >       drm/xe: Don't restart parallel queues multiple times on GT
>> > reset
>> > 
>> > Shuicheng Lin (1):
>> >       drm/xe: Handle unreliable MMIO reads during forcewake
>> > 
>> > Vinay Belgaumkar (1):
>> >       drm/xe/ptl: Apply Wa_14022866841
>> > 
>> > Zhanjun Dong (1):
>> >       drm/xe/guc: Prevent GuC register capture running on VF
>> > 
>> >  drivers/gpu/drm/xe/abi/guc_klvs_abi.h       |   1 +
>> >  drivers/gpu/drm/xe/display/xe_hdcp_gsc.c    |   6 +-
>> >  drivers/gpu/drm/xe/tests/xe_mocs.c          |  18 +-
>> >  drivers/gpu/drm/xe/xe_debugfs.c             |  27 +-
>> >  drivers/gpu/drm/xe/xe_devcoredump.c         |  14 +-
>> >  drivers/gpu/drm/xe/xe_device.c              |  25 +-
>> >  drivers/gpu/drm/xe/xe_drm_client.c          |   8 +-
>> >  drivers/gpu/drm/xe/xe_exec_queue_types.h    |   2 +-
>> >  drivers/gpu/drm/xe/xe_execlist.c            |   2 +-
>> >  drivers/gpu/drm/xe/xe_force_wake.c          | 134 ++++--
>> >  drivers/gpu/drm/xe/xe_force_wake.h          |  23 +-
>> >  drivers/gpu/drm/xe/xe_force_wake_types.h    |   6 +-
>> >  drivers/gpu/drm/xe/xe_ggtt.c                |   2 +-
>> >  drivers/gpu/drm/xe/xe_gsc.c                 |  23 +-
>> >  drivers/gpu/drm/xe/xe_gsc_proxy.c           |   9 +-
>> >  drivers/gpu/drm/xe/xe_gt.c                  | 110 +++--
>> >  drivers/gpu/drm/xe/xe_gt_debugfs.c          |  13 +-
>> >  drivers/gpu/drm/xe/xe_gt_idle.c             |  26 +-
>> >  drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c  |  35 ++
>> >  drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h  |   1 +
>> >  drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c |   5 +
>> >  drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c |   5 +-
>> >  drivers/gpu/drm/xe/xe_guc.c                 |  15 +-
>> >  drivers/gpu/drm/xe/xe_guc_ads.c             |   5 +
>> >  drivers/gpu/drm/xe/xe_guc_capture.c         |   8 +-
>> >  drivers/gpu/drm/xe/xe_guc_ct.c              |  20 +-
>> >  drivers/gpu/drm/xe/xe_guc_fwif.h            |   1 +
>> >  drivers/gpu/drm/xe/xe_guc_log.c             |   9 +-
>> >  drivers/gpu/drm/xe/xe_guc_pc.c              |  50 +-
>> >  drivers/gpu/drm/xe/xe_guc_submit.c          |  29 +-
>> >  drivers/gpu/drm/xe/xe_huc.c                 |   8 +-
>> >  drivers/gpu/drm/xe/xe_mocs.c                |  14 +-
>> >  drivers/gpu/drm/xe/xe_oa.c                  | 678
>> > +++++++++++++++++++---------
>> >  drivers/gpu/drm/xe/xe_oa_types.h            |  12 +
>> >  drivers/gpu/drm/xe/xe_pat.c                 |  65 ++-
>> >  drivers/gpu/drm/xe/xe_query.c               |  10 +-
>> >  drivers/gpu/drm/xe/xe_reg_sr.c              |  24 +-
>> >  drivers/gpu/drm/xe/xe_sched_job.c           |   2 +-
>> >  drivers/gpu/drm/xe/xe_sched_job_types.h     |   3 +-
>> >  drivers/gpu/drm/xe/xe_sync.c                |   5 +-
>> >  drivers/gpu/drm/xe/xe_vram.c                |  12 +-
>> >  drivers/gpu/drm/xe/xe_wa_oob.rules          |   2 +
>> >  include/uapi/drm/xe_drm.h                   |  17 +
>> >  43 files changed, 997 insertions(+), 487 deletions(-)
>
Thomas Hellstrom Oct. 25, 2024, 10:45 a.m. UTC | #4
On Fri, 2024-10-25 at 12:34 +0300, Jani Nikula wrote:
> On Fri, 25 Oct 2024, Thomas Hellström
> <thomas.hellstrom@linux.intel.com> wrote:
> > On Thu, 2024-10-24 at 19:22 +0000, Matthew Brost wrote:
> > > On Thu, Oct 24, 2024 at 07:52:11PM +0200, Thomas Hellstrom wrote:
> > > > Hi, Dave & Simona,
> > > > 
> > > > This week's drm-xe-next PR
> > > > 
> > > > Thanks,
> > > > Thomas
> > > > 
> > > > 
> > > > drm-xe-next-2024-10-24:
> > > > UAPI Changes:
> > > > - Define and parse OA sync properties (Ashutosh)
> > > > 
> > > > Driver Changes:
> > > > - Add caller info to xe_gt_reset_async (Nirmoy)
> > > > - A large forcewake rework / cleanup (Himal)
> > > > - A g2h response timeout fix (Badal)
> > > > - A PTL workaround (Vinay)
> > > > - Handle unreliable MMIO reads during forcewake (Shuicheng)
> > > > - Ufence user-space access fixes (Nirmoy)
> > > > - Annotate flexible arrays (Matthew Brost)
> > > > - Enable GuC lite restore (Fei)
> > > > - Prevent GuC register capture on VF (Zhanjun)
> > > > - Show VFs VRAM / LMEM provisioning summary over debugfs
> > > > (Michal)
> > > > - Parallel queues fix on GT reset (Nirmoy)
> > > > - Move reference grabbing to a job's dma-fence (Matt Brost)
> > > > - Mark a number of local workqueues WQ_MEM_RECLAIM (Matt Brost)
> > > 
> > > > This breaks CI [1] - my mistake. Maybe omit these in this week's
> > > > PR.
> 
> How did this pass CI and get merged in the first place?!?
> 
> It's now botching unrelated pre-merge testing all over the place,
> e.g. [3] and [4].
> 
> BR,
> Jani.

This appears to have been a partial merge of a passing series....
/Thomas


> 
> 
> [3]
> https://lore.kernel.org/r/172981565466.1330037.6238046952250769671@2413ebb6fbb6
> [4]
> https://lore.kernel.org/r/172981849964.1330038.16133455483045565936@2413ebb6fbb6
> 
> 
> > > 
> > > > We need [2] merged to fix this. Waiting on an RB but I'd like to
> > > > get all of this in 6.12.
> > > 
> > > Matt
> > > 
> > > [1]
> > > https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-140135v2/bat-lnl-1/igt@xe_exec_fault_mode@twice-invalid-fault.html
> > > [2] https://patchwork.freedesktop.org/series/140406/
> > 
> > > So this CI failure is a warning only and IMHO for drm-xe-next (6.13)
> > > it's not catastrophic. There might be a window in the bisect history
> > > where this warning appears. It's perhaps more important for -fixes,
> > > though.
> > > 
> > > If we need to wait for the scheduler patch going into drm-misc-next /
> > > drm-next / backmerge we'd hold off this branch for too long I fear.
> > 
> > @Dave, @Sima 
> > If you feel differently please skip this PR for this week and we'll
> > work to get the scheduler patch merged asap.
> > 
> > Thanks,
> > Thomas
> > 
> > 
> > > 
> > > > - OA synchronization support (Ashutosh)
> > > > 
> > 
>
Matthew Brost Oct. 25, 2024, 10:26 p.m. UTC | #5
On Fri, Oct 25, 2024 at 12:45:26PM +0200, Thomas Hellström wrote:
> On Fri, 2024-10-25 at 12:34 +0300, Jani Nikula wrote:
> > On Fri, 25 Oct 2024, Thomas Hellström
> > <thomas.hellstrom@linux.intel.com> wrote:
> > > On Thu, 2024-10-24 at 19:22 +0000, Matthew Brost wrote:
> > > > On Thu, Oct 24, 2024 at 07:52:11PM +0200, Thomas Hellstrom wrote:
> > > > > Hi, Dave & Simona,
> > > > > 
> > > > > This week's drm-xe-next PR
> > > > > 
> > > > > Thanks,
> > > > > Thomas
> > > > > 
> > > > > 
> > > > > drm-xe-next-2024-10-24:
> > > > > UAPI Changes:
> > > > > - Define and parse OA sync properties (Ashutosh)
> > > > > 
> > > > > Driver Changes:
> > > > > - Add caller info to xe_gt_reset_async (Nirmoy)
> > > > > - A large forcewake rework / cleanup (Himal)
> > > > > - A g2h response timeout fix (Badal)
> > > > > - A PTL workaround (Vinay)
> > > > > - Handle unreliable MMIO reads during forcewake (Shuicheng)
> > > > > - Ufence user-space access fixes (Nirmoy)
> > > > > - Annotate flexible arrays (Matthew Brost)
> > > > > - Enable GuC lite restore (Fei)
> > > > > - Prevent GuC register capture on VF (Zhanjun)
> > > > > - Show VFs VRAM / LMEM provisioning summary over debugfs
> > > > > (Michal)
> > > > > - Parallel queues fix on GT reset (Nirmoy)
> > > > > - Move reference grabbing to a job's dma-fence (Matt Brost)
> > > > > - Mark a number of local workqueues WQ_MEM_RECLAIM (Matt Brost)
> > > > 
> > > > > This breaks CI [1] - my mistake. Maybe omit these in this week's
> > > > > PR.
> > 
> > How did this pass CI and get merged in the first place?!?
> > 
> > It's now botching unrelated pre-merge testing all over the place,
> > e.g. [3] and [4].
> > 
> > BR,
> > Jani.
> 
> This appears to have been a partial merge of a passing series....
> /Thomas
> 

Yeah, again my mistake on the partial merge - will be more careful going
forward. I have RBs on the scheduler patch, which will fix our CI, but I'm
getting conflicts on drm-misc-next, so I need some maintainer help there.
It's Friday, so I won't get this fixed up until Monday.

Matt

> 
> > 
> > 
> > [3]
> > https://lore.kernel.org/r/172981565466.1330037.6238046952250769671@2413ebb6fbb6
> > [4]
> > https://lore.kernel.org/r/172981849964.1330038.16133455483045565936@2413ebb6fbb6
> > 
> > 
> > > > 
> > > > > We need [2] merged to fix this. Waiting on an RB but I'd like to
> > > > > get all of this in 6.12.
> > > > 
> > > > Matt
> > > > 
> > > > [1]
> > > > https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-140135v2/bat-lnl-1/igt@xe_exec_fault_mode@twice-invalid-fault.html
> > > > [2] https://patchwork.freedesktop.org/series/140406/
> > > 
> > > > So this CI failure is a warning only and IMHO for drm-xe-next (6.13)
> > > > it's not catastrophic. There might be a window in the bisect history
> > > > where this warning appears. It's perhaps more important for -fixes,
> > > > though.
> > > > 
> > > > If we need to wait for the scheduler patch going into drm-misc-next /
> > > > drm-next / backmerge we'd hold off this branch for too long I fear.
> > > 
> > > @Dave, @Sima 
> > > If you feel differently please skip this PR for this week and we'll
> > > work to get the scheduler patch merged asap.
> > > 
> > > Thanks,
> > > Thomas
> > > 
> > > 
> > > > 
> > > > > - OA synchronization support (Ashutosh)
> > > > > 
> > > 
> > 
>