[04/12] drm/i915: Flush periodic samples, in case of no pending CS sample requests

Message ID 1501487985-2017-5-git-send-email-sagar.a.kamble@intel.com (mailing list archive)
State New, archived

Commit Message

sagar.a.kamble@intel.com July 31, 2017, 7:59 a.m. UTC
From: Sourab Gupta <sourab.gupta@intel.com>

When there are no pending CS OA samples, flush the periodic OA samples
collected so far.

Periodic OA samples can be forwarded safely when no CS samples are
pending, but not when CS samples are pending, because the eventual
ordering between the pending CS samples and the periodic samples is
not yet known. With no CS sample pending, no future CS sample can
carry a timestamp earlier than the current periodic timestamp.
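
The rule above can be sketched in isolation. This is a minimal
illustration, not the driver code: struct stream_state and both helpers
are hypothetical names, loosely modeled on the n_pending_periodic_samples
and pending_periodic_ts fields this patch adds to drm_i915_private.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the stream state; not the driver's structs. */
struct stream_state {
	size_t n_pending_cs_samples;   /* CS samples captured, not yet forwarded */
	uint32_t n_pending_periodic;   /* periodic OA reports ready to forward */
	uint32_t pending_periodic_ts;  /* timestamp of the newest of those reports */
};

/*
 * Periodic reports may be forwarded only when no CS sample is pending:
 * a pending CS sample could still need to be ordered ahead of them.
 */
static bool can_flush_periodic(const struct stream_state *s)
{
	assert(s != NULL);
	return s->n_pending_cs_samples == 0 && s->n_pending_periodic > 0;
}

/* Returns how many periodic reports were released to the reader. */
static uint32_t flush_periodic(struct stream_state *s)
{
	uint32_t flushed;

	if (!can_flush_periodic(s))
		return 0;

	flushed = s->n_pending_periodic;
	s->n_pending_periodic = 0;
	return flushed;
}
```

With one CS sample pending, flush_periodic() releases nothing; once the
CS list is empty, it releases every periodic report counted so far.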

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h  |   5 +-
 drivers/gpu/drm/i915/i915_perf.c | 142 ++++++++++++++++++++++++++++++---------
 2 files changed, 113 insertions(+), 34 deletions(-)

Comments

kernel test robot July 31, 2017, 4:52 p.m. UTC | #1
Hi Sourab,

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on next-20170731]
[cannot apply to v4.13-rc3]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Sagar-Arun-Kamble/i915-perf-support-for-command-stream-based-OA-GPU-and-workload-metrics-capture/20170731-184412
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   WARNING: convert(1) not found, for SVG to PDF conversion install ImageMagick (https://www.imagemagick.org)
   include/linux/init.h:1: warning: no structured comments found
   include/linux/mod_devicetable.h:687: warning: Excess struct/union/enum/typedef member 'ver_major' description in 'fsl_mc_device_id'
   include/linux/mod_devicetable.h:687: warning: Excess struct/union/enum/typedef member 'ver_minor' description in 'fsl_mc_device_id'
   kernel/sched/core.c:2080: warning: No description found for parameter 'rf'
   kernel/sched/core.c:2080: warning: Excess function parameter 'cookie' description in 'try_to_wake_up_local'
   include/linux/wait.h:555: warning: No description found for parameter 'wq'
   include/linux/wait.h:555: warning: Excess function parameter 'wq_head' description in 'wait_event_interruptible_hrtimeout'
   include/linux/wait.h:759: warning: No description found for parameter 'wq_head'
   include/linux/wait.h:759: warning: Excess function parameter 'wq' description in 'wait_event_killable'
   include/linux/kthread.h:26: warning: Excess function parameter '...' description in 'kthread_create'
   kernel/sys.c:1: warning: no structured comments found
   include/linux/device.h:968: warning: No description found for parameter 'dma_ops'
   drivers/dma-buf/seqno-fence.c:1: warning: no structured comments found
   include/linux/iio/iio.h:603: warning: No description found for parameter 'trig_readonly'
   include/linux/iio/trigger.h:151: warning: No description found for parameter 'indio_dev'
   include/linux/iio/trigger.h:151: warning: No description found for parameter 'trig'
   include/linux/device.h:969: warning: No description found for parameter 'dma_ops'
   drivers/ata/libata-eh.c:1449: warning: No description found for parameter 'link'
   drivers/ata/libata-eh.c:1449: warning: Excess function parameter 'ap' description in 'ata_eh_done'
   drivers/ata/libata-eh.c:1590: warning: No description found for parameter 'qc'
   drivers/ata/libata-eh.c:1590: warning: Excess function parameter 'dev' description in 'ata_eh_request_sense'
   drivers/mtd/nand/nand_base.c:2751: warning: Excess function parameter 'cached' description in 'nand_write_page'
   drivers/mtd/nand/nand_base.c:2751: warning: Excess function parameter 'cached' description in 'nand_write_page'
   arch/s390/include/asm/cmb.h:1: warning: no structured comments found
   drivers/scsi/scsi_lib.c:1116: warning: No description found for parameter 'rq'
   drivers/scsi/constants.c:1: warning: no structured comments found
   include/linux/usb/gadget.h:230: warning: No description found for parameter 'claimed'
   include/linux/usb/gadget.h:230: warning: No description found for parameter 'enabled'
   include/linux/usb/gadget.h:412: warning: No description found for parameter 'quirk_altset_not_supp'
   include/linux/usb/gadget.h:412: warning: No description found for parameter 'quirk_stall_not_supp'
   include/linux/usb/gadget.h:412: warning: No description found for parameter 'quirk_zlp_not_supp'
   fs/inode.c:1666: warning: No description found for parameter 'rcu'
   include/linux/jbd2.h:443: warning: No description found for parameter 'i_transaction'
   include/linux/jbd2.h:443: warning: No description found for parameter 'i_next_transaction'
   include/linux/jbd2.h:443: warning: No description found for parameter 'i_list'
   include/linux/jbd2.h:443: warning: No description found for parameter 'i_vfs_inode'
   include/linux/jbd2.h:443: warning: No description found for parameter 'i_flags'
   include/linux/jbd2.h:497: warning: No description found for parameter 'h_rsv_handle'
   include/linux/jbd2.h:497: warning: No description found for parameter 'h_reserved'
   include/linux/jbd2.h:497: warning: No description found for parameter 'h_type'
   include/linux/jbd2.h:497: warning: No description found for parameter 'h_line_no'
   include/linux/jbd2.h:497: warning: No description found for parameter 'h_start_jiffies'
   include/linux/jbd2.h:497: warning: No description found for parameter 'h_requested_credits'
   include/linux/jbd2.h:497: warning: No description found for parameter 'saved_alloc_context'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_chkpt_bhs'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_devname'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_average_commit_time'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_min_batch_time'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_max_batch_time'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_commit_callback'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_failed_commit'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_chksum_driver'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_csum_seed'
   fs/jbd2/transaction.c:511: warning: No description found for parameter 'type'
   fs/jbd2/transaction.c:511: warning: No description found for parameter 'line_no'
   fs/jbd2/transaction.c:641: warning: No description found for parameter 'gfp_mask'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'debugfs_init'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_open_object'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_close_object'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'prime_handle_to_fd'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'prime_fd_to_handle'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_prime_export'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_prime_import'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_prime_pin'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_prime_unpin'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_prime_res_obj'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_prime_get_sg_table'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_prime_import_sg_table'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_prime_vmap'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_prime_vunmap'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_prime_mmap'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'gem_vm_ops'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'major'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'minor'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'patchlevel'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'name'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'desc'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'date'
   include/drm/drm_drv.h:553: warning: No description found for parameter 'driver_features'
   drivers/gpu/drm/drm_modes.c:1623: warning: No description found for parameter 'display'
   drivers/gpu/drm/drm_modes.c:1623: warning: Excess function parameter 'connector' description in 'drm_mode_is_420_only'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
   drivers/gpu/drm/i915/i915_drv.h:2000: warning: No description found for parameter 'emit_sample_capture'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'cs_buffer'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'cs_samples'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'cs_samples_lock'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'poll_wq'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'pollin'
   drivers/gpu/drm/i915/i915_drv.h:2000: warning: No description found for parameter 'emit_sample_capture'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'cs_buffer'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'cs_samples'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'cs_samples_lock'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'poll_wq'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'pollin'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
   drivers/gpu/drm/i915/i915_drv.h:2000: warning: No description found for parameter 'emit_sample_capture'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'cs_buffer'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'cs_samples'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'cs_samples_lock'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'poll_wq'
   drivers/gpu/drm/i915/i915_drv.h:2078: warning: No description found for parameter 'pollin'
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
   drivers/gpu/drm/i915/i915_perf.c:1: warning: no structured comments found
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
   drivers/gpu/drm/i915/i915_perf.c:1: warning: no structured comments found
>> drivers/gpu/drm/i915/i915_perf.c:684: warning: No description found for parameter 'last_ts'
   drivers/gpu/drm/i915/i915_perf.c:1: warning: no structured comments found

vim +/last_ts +684 drivers/gpu/drm/i915/i915_perf.c

b0aca6b4 Sourab Gupta 2017-07-31  657  
b0aca6b4 Sourab Gupta 2017-07-31  658  /**
24459f50 Sourab Gupta 2017-07-31  659   * oa_buffer_num_reports_unlocked - check for data and update tail ptr state
0dd860cf Robert Bragg 2017-05-11  660   * @dev_priv: i915 device instance
d7965152 Robert Bragg 2016-11-07  661   *
0dd860cf Robert Bragg 2017-05-11  662   * This is either called via fops (for blocking reads in user ctx) or the poll
0dd860cf Robert Bragg 2017-05-11  663   * check hrtimer (atomic ctx) to check the OA buffer tail pointer and check
0dd860cf Robert Bragg 2017-05-11  664   * if there is data available for userspace to read.
d7965152 Robert Bragg 2016-11-07  665   *
0dd860cf Robert Bragg 2017-05-11  666   * This function is central to providing a workaround for the OA unit tail
0dd860cf Robert Bragg 2017-05-11  667   * pointer having a race with respect to what data is visible to the CPU.
0dd860cf Robert Bragg 2017-05-11  668   * It is responsible for reading tail pointers from the hardware and giving
0dd860cf Robert Bragg 2017-05-11  669   * the pointers time to 'age' before they are made available for reading.
0dd860cf Robert Bragg 2017-05-11  670   * (See description of OA_TAIL_MARGIN_NSEC above for further details.)
0dd860cf Robert Bragg 2017-05-11  671   *
24459f50 Sourab Gupta 2017-07-31  672   * Besides returning num of reports when there is data available to read() it
0dd860cf Robert Bragg 2017-05-11  673   * also has the side effect of updating the oa_buffer.tails[], .aging_timestamp
0dd860cf Robert Bragg 2017-05-11  674   * and .aged_tail_idx state used for reading.
0dd860cf Robert Bragg 2017-05-11  675   *
0dd860cf Robert Bragg 2017-05-11  676   * Note: It's safe to read OA config state here unlocked, assuming that this is
0dd860cf Robert Bragg 2017-05-11  677   * only called while the stream is enabled, while the global OA configuration
0dd860cf Robert Bragg 2017-05-11  678   * can't be modified.
0dd860cf Robert Bragg 2017-05-11  679   *
24459f50 Sourab Gupta 2017-07-31  680   * Returns: number of samples available to read
d7965152 Robert Bragg 2016-11-07  681   */
24459f50 Sourab Gupta 2017-07-31  682  static u32 oa_buffer_num_reports_unlocked(
24459f50 Sourab Gupta 2017-07-31  683  			struct drm_i915_private *dev_priv, u32 *last_ts)
d7965152 Robert Bragg 2016-11-07 @684  {
d7965152 Robert Bragg 2016-11-07  685  	int report_size = dev_priv->perf.oa.oa_buffer.format_size;
0dd860cf Robert Bragg 2017-05-11  686  	unsigned long flags;
0dd860cf Robert Bragg 2017-05-11  687  	unsigned int aged_idx;
24459f50 Sourab Gupta 2017-07-31  688  	u32 head, hw_tail, aged_tail, aging_tail, num_reports = 0;
0dd860cf Robert Bragg 2017-05-11  689  	u64 now;
0dd860cf Robert Bragg 2017-05-11  690  
0dd860cf Robert Bragg 2017-05-11  691  	/* We have to consider the (unlikely) possibility that read() errors
0dd860cf Robert Bragg 2017-05-11  692  	 * could result in an OA buffer reset which might reset the head,
0dd860cf Robert Bragg 2017-05-11  693  	 * tails[] and aged_tail state.
0dd860cf Robert Bragg 2017-05-11  694  	 */
0dd860cf Robert Bragg 2017-05-11  695  	spin_lock_irqsave(&dev_priv->perf.oa.oa_buffer.ptr_lock, flags);
0dd860cf Robert Bragg 2017-05-11  696  
0dd860cf Robert Bragg 2017-05-11  697  	/* NB: The head we observe here might effectively be a little out of
0dd860cf Robert Bragg 2017-05-11  698  	 * date (between head and tails[aged_idx].offset if there is currently
0dd860cf Robert Bragg 2017-05-11  699  	 * a read() in progress.
0dd860cf Robert Bragg 2017-05-11  700  	 */
0dd860cf Robert Bragg 2017-05-11  701  	head = dev_priv->perf.oa.oa_buffer.head;
0dd860cf Robert Bragg 2017-05-11  702  
0dd860cf Robert Bragg 2017-05-11  703  	aged_idx = dev_priv->perf.oa.oa_buffer.aged_tail_idx;
0dd860cf Robert Bragg 2017-05-11  704  	aged_tail = dev_priv->perf.oa.oa_buffer.tails[aged_idx].offset;
0dd860cf Robert Bragg 2017-05-11  705  	aging_tail = dev_priv->perf.oa.oa_buffer.tails[!aged_idx].offset;
0dd860cf Robert Bragg 2017-05-11  706  
19f81df2 Robert Bragg 2017-06-13  707  	hw_tail = dev_priv->perf.oa.ops.oa_hw_tail_read(dev_priv);
0dd860cf Robert Bragg 2017-05-11  708  
0dd860cf Robert Bragg 2017-05-11  709  	/* The tail pointer increases in 64 byte increments,
0dd860cf Robert Bragg 2017-05-11  710  	 * not in report_size steps...
0dd860cf Robert Bragg 2017-05-11  711  	 */
0dd860cf Robert Bragg 2017-05-11  712  	hw_tail &= ~(report_size - 1);
0dd860cf Robert Bragg 2017-05-11  713  
0dd860cf Robert Bragg 2017-05-11  714  	now = ktime_get_mono_fast_ns();
0dd860cf Robert Bragg 2017-05-11  715  
4117ebc7 Robert Bragg 2017-05-11  716  	/* Update the aged tail
4117ebc7 Robert Bragg 2017-05-11  717  	 *
4117ebc7 Robert Bragg 2017-05-11  718  	 * Flip the tail pointer available for read()s once the aging tail is
4117ebc7 Robert Bragg 2017-05-11  719  	 * old enough to trust that the corresponding data will be visible to
4117ebc7 Robert Bragg 2017-05-11  720  	 * the CPU...
4117ebc7 Robert Bragg 2017-05-11  721  	 *
4117ebc7 Robert Bragg 2017-05-11  722  	 * Do this before updating the aging pointer in case we may be able to
4117ebc7 Robert Bragg 2017-05-11  723  	 * immediately start aging a new pointer too (if new data has become
4117ebc7 Robert Bragg 2017-05-11  724  	 * available) without needing to wait for a later hrtimer callback.
4117ebc7 Robert Bragg 2017-05-11  725  	 */
4117ebc7 Robert Bragg 2017-05-11  726  	if (aging_tail != INVALID_TAIL_PTR &&
4117ebc7 Robert Bragg 2017-05-11  727  	    ((now - dev_priv->perf.oa.oa_buffer.aging_timestamp) >
4117ebc7 Robert Bragg 2017-05-11  728  	     OA_TAIL_MARGIN_NSEC)) {
24459f50 Sourab Gupta 2017-07-31  729  		u32 mask = (OA_BUFFER_SIZE - 1);
24459f50 Sourab Gupta 2017-07-31  730  		u32 gtt_offset = i915_ggtt_offset(
24459f50 Sourab Gupta 2017-07-31  731  				dev_priv->perf.oa.oa_buffer.vma);
24459f50 Sourab Gupta 2017-07-31  732  		u32 head = (dev_priv->perf.oa.oa_buffer.head - gtt_offset)
24459f50 Sourab Gupta 2017-07-31  733  				& mask;
24459f50 Sourab Gupta 2017-07-31  734  		u8 *oa_buf_base = dev_priv->perf.oa.oa_buffer.vaddr;
24459f50 Sourab Gupta 2017-07-31  735  		u32 *report32;
19f81df2 Robert Bragg 2017-06-13  736  
4117ebc7 Robert Bragg 2017-05-11  737  		aged_idx ^= 1;
4117ebc7 Robert Bragg 2017-05-11  738  		dev_priv->perf.oa.oa_buffer.aged_tail_idx = aged_idx;
4117ebc7 Robert Bragg 2017-05-11  739  
4117ebc7 Robert Bragg 2017-05-11  740  		aged_tail = aging_tail;
4117ebc7 Robert Bragg 2017-05-11  741  
4117ebc7 Robert Bragg 2017-05-11  742  		/* Mark that we need a new pointer to start aging... */
4117ebc7 Robert Bragg 2017-05-11  743  		dev_priv->perf.oa.oa_buffer.tails[!aged_idx].offset = INVALID_TAIL_PTR;
4117ebc7 Robert Bragg 2017-05-11  744  		aging_tail = INVALID_TAIL_PTR;
24459f50 Sourab Gupta 2017-07-31  745  
24459f50 Sourab Gupta 2017-07-31  746  		num_reports = OA_TAKEN(((aged_tail - gtt_offset) & mask), head)/
24459f50 Sourab Gupta 2017-07-31  747  				report_size;
24459f50 Sourab Gupta 2017-07-31  748  
24459f50 Sourab Gupta 2017-07-31  749  		/* read the timestamp of last OA report */
24459f50 Sourab Gupta 2017-07-31  750  		head = (head + report_size*(num_reports - 1)) & mask;
24459f50 Sourab Gupta 2017-07-31  751  		report32 = (u32 *)(oa_buf_base + head);
24459f50 Sourab Gupta 2017-07-31  752  		*last_ts = report32[1];
4117ebc7 Robert Bragg 2017-05-11  753  	}
4117ebc7 Robert Bragg 2017-05-11  754  
0dd860cf Robert Bragg 2017-05-11  755  	/* Update the aging tail
0dd860cf Robert Bragg 2017-05-11  756  	 *
0dd860cf Robert Bragg 2017-05-11  757  	 * We throttle aging tail updates until we have a new tail that
0dd860cf Robert Bragg 2017-05-11  758  	 * represents >= one report more data than is already available for
0dd860cf Robert Bragg 2017-05-11  759  	 * reading. This ensures there will be enough data for a successful
0dd860cf Robert Bragg 2017-05-11  760  	 * read once this new pointer has aged and ensures we will give the new
0dd860cf Robert Bragg 2017-05-11  761  	 * pointer time to age.
0dd860cf Robert Bragg 2017-05-11  762  	 */
0dd860cf Robert Bragg 2017-05-11  763  	if (aging_tail == INVALID_TAIL_PTR &&
0dd860cf Robert Bragg 2017-05-11  764  	    (aged_tail == INVALID_TAIL_PTR ||
0dd860cf Robert Bragg 2017-05-11  765  	     OA_TAKEN(hw_tail, aged_tail) >= report_size)) {
0dd860cf Robert Bragg 2017-05-11  766  		struct i915_vma *vma = dev_priv->perf.oa.oa_buffer.vma;
0dd860cf Robert Bragg 2017-05-11  767  		u32 gtt_offset = i915_ggtt_offset(vma);
0dd860cf Robert Bragg 2017-05-11  768  
0dd860cf Robert Bragg 2017-05-11  769  		/* Be paranoid and do a bounds check on the pointer read back
0dd860cf Robert Bragg 2017-05-11  770  		 * from hardware, just in case some spurious hardware condition
0dd860cf Robert Bragg 2017-05-11  771  		 * could put the tail out of bounds...
0dd860cf Robert Bragg 2017-05-11  772  		 */
0dd860cf Robert Bragg 2017-05-11  773  		if (hw_tail >= gtt_offset &&
0dd860cf Robert Bragg 2017-05-11  774  		    hw_tail < (gtt_offset + OA_BUFFER_SIZE)) {
0dd860cf Robert Bragg 2017-05-11  775  			dev_priv->perf.oa.oa_buffer.tails[!aged_idx].offset =
0dd860cf Robert Bragg 2017-05-11  776  				aging_tail = hw_tail;
0dd860cf Robert Bragg 2017-05-11  777  			dev_priv->perf.oa.oa_buffer.aging_timestamp = now;
0dd860cf Robert Bragg 2017-05-11  778  		} else {
0dd860cf Robert Bragg 2017-05-11  779  			DRM_ERROR("Ignoring spurious out of range OA buffer tail pointer = %u\n",
0dd860cf Robert Bragg 2017-05-11  780  				  hw_tail);
0dd860cf Robert Bragg 2017-05-11  781  		}
0dd860cf Robert Bragg 2017-05-11  782  	}
0dd860cf Robert Bragg 2017-05-11  783  
0dd860cf Robert Bragg 2017-05-11  784  	spin_unlock_irqrestore(&dev_priv->perf.oa.oa_buffer.ptr_lock, flags);
0dd860cf Robert Bragg 2017-05-11  785  
24459f50 Sourab Gupta 2017-07-31  786  	return aged_tail == INVALID_TAIL_PTR ? 0 : num_reports;
d7965152 Robert Bragg 2016-11-07  787  }
d7965152 Robert Bragg 2016-11-07  788  

:::::: The code at line 684 was first introduced by commit
:::::: d79651522e89c4ffa8992b48dfe449f0c583f809 drm/i915: Enable i915 perf stream for Haswell OA unit

:::::: TO: Robert Bragg <robert@sixbynine.org>
:::::: CC: Daniel Vetter <daniel.vetter@ffwll.ch>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
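
The 'last_ts' warnings above come from a kernel-doc comment that does not
describe every parameter. A standalone sketch of the report-count and
last-timestamp arithmetic, with each parameter documented, might look
like the following. RING_SIZE and ring_num_reports() are hypothetical
names for illustration; only the (tail - head) & mask arithmetic and the
timestamp position in dword 1 are taken from the quoted listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 1024u /* hypothetical size; must be a power of two */

/**
 * ring_num_reports - count whole reports between head and tail
 * @buf: base address of the ring buffer
 * @head: read offset into the ring, in bytes
 * @tail: write offset into the ring, in bytes
 * @report_size: size of one report, in bytes
 * @last_ts: (out) timestamp of the newest report; written only when the
 *           return value is non-zero
 *
 * Returns: number of whole reports available to read
 */
static uint32_t ring_num_reports(const uint8_t *buf, uint32_t head,
				 uint32_t tail, uint32_t report_size,
				 uint32_t *last_ts)
{
	uint32_t mask = RING_SIZE - 1;
	uint32_t taken = (tail - head) & mask;
	uint32_t num_reports;
	uint32_t last;

	assert(buf != NULL && last_ts != NULL && report_size != 0);

	num_reports = taken / report_size;
	if (num_reports == 0)
		return 0; /* avoids the unsigned wrap in num_reports - 1 */

	/* Offset of the newest report, wrapped to the ring size. */
	last = (head + report_size * (num_reports - 1)) & mask;

	/* The timestamp is the second 32-bit word of the report. */
	memcpy(last_ts, buf + last + sizeof(uint32_t), sizeof(*last_ts));
	return num_reports;
}
```

The early return for zero reports is an addition in this sketch; the
quoted listing evaluates num_reports - 1 without such a check.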

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8b1cecf..886fc5e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2138,7 +2138,8 @@  struct i915_oa_ops {
 		    char __user *buf,
 		    size_t count,
 		    size_t *offset,
-		    u32 ts);
+		    u32 ts,
+		    u32 max_reports);
 
 	/**
 	 * @oa_hw_tail_read: read the OA tail pointer register
@@ -2604,6 +2605,8 @@  struct drm_i915_private {
 			u32 gen7_latched_oastatus1;
 			u32 ctx_oactxctrl_offset;
 			u32 ctx_flexeu0_offset;
+			u32 n_pending_periodic_samples;
+			u32 pending_periodic_ts;
 
 			/**
 			 * The RPT_ID/reason field for Gen8+ includes a bit
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 57e1936..462d180 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -656,7 +656,7 @@  static void i915_perf_stream_release_samples(struct i915_perf_stream *stream)
 }
 
 /**
- * oa_buffer_check_unlocked - check for data and update tail ptr state
+ * oa_buffer_num_reports_unlocked - check for data and update tail ptr state
  * @dev_priv: i915 device instance
  *
  * This is either called via fops (for blocking reads in user ctx) or the poll
@@ -669,7 +669,7 @@  static void i915_perf_stream_release_samples(struct i915_perf_stream *stream)
  * the pointers time to 'age' before they are made available for reading.
  * (See description of OA_TAIL_MARGIN_NSEC above for further details.)
  *
- * Besides returning true when there is data available to read() this function
+ * Besides returning num of reports when there is data available to read() it
  * also has the side effect of updating the oa_buffer.tails[], .aging_timestamp
  * and .aged_tail_idx state used for reading.
  *
@@ -677,14 +677,15 @@  static void i915_perf_stream_release_samples(struct i915_perf_stream *stream)
  * only called while the stream is enabled, while the global OA configuration
  * can't be modified.
  *
- * Returns: %true if the OA buffer contains data, else %false
+ * Returns: number of samples available to read
  */
-static bool oa_buffer_check_unlocked(struct drm_i915_private *dev_priv)
+static u32 oa_buffer_num_reports_unlocked(
+			struct drm_i915_private *dev_priv, u32 *last_ts)
 {
 	int report_size = dev_priv->perf.oa.oa_buffer.format_size;
 	unsigned long flags;
 	unsigned int aged_idx;
-	u32 head, hw_tail, aged_tail, aging_tail;
+	u32 head, hw_tail, aged_tail, aging_tail, num_reports = 0;
 	u64 now;
 
 	/* We have to consider the (unlikely) possibility that read() errors
@@ -725,6 +726,13 @@  static bool oa_buffer_check_unlocked(struct drm_i915_private *dev_priv)
 	if (aging_tail != INVALID_TAIL_PTR &&
 	    ((now - dev_priv->perf.oa.oa_buffer.aging_timestamp) >
 	     OA_TAIL_MARGIN_NSEC)) {
+		u32 mask = (OA_BUFFER_SIZE - 1);
+		u32 gtt_offset = i915_ggtt_offset(
+				dev_priv->perf.oa.oa_buffer.vma);
+		u32 head = (dev_priv->perf.oa.oa_buffer.head - gtt_offset)
+				& mask;
+		u8 *oa_buf_base = dev_priv->perf.oa.oa_buffer.vaddr;
+		u32 *report32;
 
 		aged_idx ^= 1;
 		dev_priv->perf.oa.oa_buffer.aged_tail_idx = aged_idx;
@@ -734,6 +742,14 @@  static bool oa_buffer_check_unlocked(struct drm_i915_private *dev_priv)
 		/* Mark that we need a new pointer to start aging... */
 		dev_priv->perf.oa.oa_buffer.tails[!aged_idx].offset = INVALID_TAIL_PTR;
 		aging_tail = INVALID_TAIL_PTR;
+
+		num_reports = OA_TAKEN(((aged_tail - gtt_offset) & mask), head)/
+				report_size;
+
+		/* read the timestamp of last OA report */
+		head = (head + report_size*(num_reports - 1)) & mask;
+		report32 = (u32 *)(oa_buf_base + head);
+		*last_ts = report32[1];
 	}
 
 	/* Update the aging tail
@@ -767,8 +783,7 @@  static bool oa_buffer_check_unlocked(struct drm_i915_private *dev_priv)
 
 	spin_unlock_irqrestore(&dev_priv->perf.oa.oa_buffer.ptr_lock, flags);
 
-	return aged_tail == INVALID_TAIL_PTR ?
-		false : OA_TAKEN(aged_tail, head) >= report_size;
+	return aged_tail == INVALID_TAIL_PTR ? 0 : num_reports;
 }
 
 /**
@@ -926,6 +941,7 @@  static int append_oa_buffer_sample(struct i915_perf_stream *stream,
  * @count: the number of bytes userspace wants to read
  * @offset: (inout): the current position for writing into @buf
  * @ts: copy OA reports till this timestamp
+ * @max_reports: max number of OA reports to copy
  *
  * Notably any error condition resulting in a short read (-%ENOSPC or
  * -%EFAULT) will be returned even though one or more records may
@@ -944,7 +960,8 @@  static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 				  char __user *buf,
 				  size_t count,
 				  size_t *offset,
-				  u32 ts)
+				  u32 ts,
+				  u32 max_reports)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
 	int report_size = dev_priv->perf.oa.oa_buffer.format_size;
@@ -957,6 +974,7 @@  static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 	u32 head, tail;
 	u32 taken;
 	int ret = 0;
+	u32 report_count = 0;
 
 	if (WARN_ON(stream->state != I915_PERF_STREAM_ENABLED))
 		return -EIO;
@@ -998,7 +1016,7 @@  static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 
 
 	for (/* none */;
-	     (taken = OA_TAKEN(tail, head));
+	     (taken = OA_TAKEN(tail, head)) && (report_count <= max_reports);
 	     head = (head + report_size) & mask) {
 		u8 *report = oa_buf_base + head;
 		u32 *report32 = (void *)report;
@@ -1110,6 +1128,7 @@  static int gen8_append_oa_reports(struct i915_perf_stream *stream,
 			if (ret)
 				break;
 
+			report_count++;
 			dev_priv->perf.oa.oa_buffer.last_ctx_id = ctx_id;
 		}
 
@@ -1148,6 +1167,7 @@  static int gen8_append_oa_reports(struct i915_perf_stream *stream,
  * @count: the number of bytes userspace wants to read
  * @offset: (inout): the current position for writing into @buf
  * @ts: copy OA reports till this timestamp
+ * @max_reports: max number of OA reports to copy
  *
  * Checks OA unit status registers and if necessary appends corresponding
  * status records for userspace (such as for a buffer full condition) and then
@@ -1166,7 +1186,8 @@  static int gen8_oa_read(struct i915_perf_stream *stream,
 			char __user *buf,
 			size_t count,
 			size_t *offset,
-			u32 ts)
+			u32 ts,
+			u32 max_reports)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
 	u32 oastatus;
@@ -1219,7 +1240,8 @@  static int gen8_oa_read(struct i915_perf_stream *stream,
 			   oastatus & ~GEN8_OASTATUS_REPORT_LOST);
 	}
 
-	return gen8_append_oa_reports(stream, buf, count, offset, ts);
+	return gen8_append_oa_reports(stream, buf, count, offset, ts,
+					max_reports);
 }
 
 /**
@@ -1229,6 +1251,7 @@  static int gen8_oa_read(struct i915_perf_stream *stream,
  * @count: the number of bytes userspace wants to read
  * @offset: (inout): the current position for writing into @buf
  * @ts: copy OA reports till this timestamp
+ * @max_reports: max number of OA reports to copy
  *
  * Notably any error condition resulting in a short read (-%ENOSPC or
  * -%EFAULT) will be returned even though one or more records may
@@ -1247,7 +1270,8 @@  static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 				  char __user *buf,
 				  size_t count,
 				  size_t *offset,
-				  u32 ts)
+				  u32 ts,
+				  u32 max_reports)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
 	int report_size = dev_priv->perf.oa.oa_buffer.format_size;
@@ -1260,6 +1284,7 @@  static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 	u32 head, tail;
 	u32 taken;
 	int ret = 0;
+	u32 report_count = 0;
 
 	if (WARN_ON(stream->state != I915_PERF_STREAM_ENABLED))
 		return -EIO;
@@ -1298,7 +1323,7 @@  static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 
 
 	for (/* none */;
-	     (taken = OA_TAKEN(tail, head));
+	     (taken = OA_TAKEN(tail, head)) && (report_count <= max_reports);
 	     head = (head + report_size) & mask) {
 		u8 *report = oa_buf_base + head;
 		u32 *report32 = (void *)report;
@@ -1337,6 +1362,7 @@  static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 		if (ret)
 			break;
 
+		report_count++;
 		/* The above report-id field sanity check is based on
 		 * the assumption that the OA buffer is initially
 		 * zeroed and we reset the field after copying so the
@@ -1372,6 +1398,7 @@  static int gen7_append_oa_reports(struct i915_perf_stream *stream,
  * @count: the number of bytes userspace wants to read
  * @offset: (inout): the current position for writing into @buf
  * @ts: copy OA reports till this timestamp
+ * @max_reports: max number of OA reports to copy
  *
  * Checks Gen 7 specific OA unit status registers and if necessary appends
  * corresponding status records for userspace (such as for a buffer full
@@ -1386,7 +1413,8 @@  static int gen7_oa_read(struct i915_perf_stream *stream,
 			char __user *buf,
 			size_t count,
 			size_t *offset,
-			u32 ts)
+			u32 ts,
+			u32 max_reports)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
 	u32 oastatus1;
@@ -1448,7 +1476,8 @@  static int gen7_oa_read(struct i915_perf_stream *stream,
 			GEN7_OASTATUS1_REPORT_LOST;
 	}
 
-	return gen7_append_oa_reports(stream, buf, count, offset, ts);
+	return gen7_append_oa_reports(stream, buf, count, offset, ts,
+					max_reports);
 }
 
 /**
@@ -1483,7 +1512,7 @@  static int append_cs_buffer_sample(struct i915_perf_stream *stream,
 		 * timestamp values
 		 */
 		ret = dev_priv->perf.oa.ops.read(stream, buf, count, offset,
-						 sample_ts);
+						 sample_ts, U32_MAX);
 		if (ret)
 			return ret;
 	}
@@ -1518,6 +1547,7 @@  static int append_cs_buffer_samples(struct i915_perf_stream *stream,
 				size_t count,
 				size_t *offset)
 {
+	struct drm_i915_private *dev_priv = stream->dev_priv;
 	struct i915_perf_cs_sample *entry, *next;
 	LIST_HEAD(free_list);
 	int ret = 0;
@@ -1526,7 +1556,7 @@  static int append_cs_buffer_samples(struct i915_perf_stream *stream,
 	spin_lock_irqsave(&stream->cs_samples_lock, flags);
 	if (list_empty(&stream->cs_samples)) {
 		spin_unlock_irqrestore(&stream->cs_samples_lock, flags);
-		return 0;
+		goto pending_periodic;
 	}
 	list_for_each_entry_safe(entry, next,
 				 &stream->cs_samples, link) {
@@ -1537,7 +1567,7 @@  static int append_cs_buffer_samples(struct i915_perf_stream *stream,
 	spin_unlock_irqrestore(&stream->cs_samples_lock, flags);
 
 	if (list_empty(&free_list))
-		return 0;
+		goto pending_periodic;
 
 	list_for_each_entry_safe(entry, next, &free_list, link) {
 		ret = append_cs_buffer_sample(stream, buf, count, offset,
@@ -1556,18 +1586,37 @@  static int append_cs_buffer_samples(struct i915_perf_stream *stream,
 	spin_unlock_irqrestore(&stream->cs_samples_lock, flags);
 
 	return ret;
+
+pending_periodic:
+	if (!((stream->sample_flags & SAMPLE_OA_REPORT) &&
+			dev_priv->perf.oa.n_pending_periodic_samples))
+		return 0;
+
+	ret = dev_priv->perf.oa.ops.read(stream, buf, count, offset,
+				dev_priv->perf.oa.pending_periodic_ts,
+				dev_priv->perf.oa.n_pending_periodic_samples);
+	dev_priv->perf.oa.n_pending_periodic_samples = 0;
+	dev_priv->perf.oa.pending_periodic_ts = 0;
+	return ret;
 }
 
+enum cs_buf_state {
+	CS_BUF_EMPTY,
+	CS_BUF_REQ_PENDING,
+	CS_BUF_HAVE_DATA,
+};
+
 /*
- * cs_buffer_is_empty - Checks whether the command stream buffer
+ * cs_buffer_state - Checks whether the command stream buffer
  * associated with the stream has data available.
  * @stream: An i915-perf stream opened for OA metrics
  *
- * Returns: true if atleast one request associated with command stream is
- * completed, else returns false.
+ * Returns:
+ * CS_BUF_HAVE_DATA	- if there is at least one completed request
+ * CS_BUF_REQ_PENDING	- there are requests pending, but no completed requests
+ * CS_BUF_EMPTY		- no requests scheduled
  */
-static bool cs_buffer_is_empty(struct i915_perf_stream *stream)
-
+static enum cs_buf_state cs_buffer_state(struct i915_perf_stream *stream)
 {
 	struct i915_perf_cs_sample *entry = NULL;
 	struct drm_i915_gem_request *request = NULL;
@@ -1581,30 +1630,57 @@  static bool cs_buffer_is_empty(struct i915_perf_stream *stream)
 	spin_unlock_irqrestore(&stream->cs_samples_lock, flags);
 
 	if (!entry)
-		return true;
+		return CS_BUF_EMPTY;
 	else if (!i915_gem_request_completed(request))
-		return true;
+		return CS_BUF_REQ_PENDING;
 	else
-		return false;
+		return CS_BUF_HAVE_DATA;
 }
 
 /**
  * stream_have_data_unlocked - Checks whether the stream has data available
  * @stream: An i915-perf stream opened for OA metrics
  *
- * For command stream based streams, check if the command stream buffer has
- * atleast one sample available, if not return false, irrespective of periodic
- * oa buffer having the data or not.
+ * Note: Periodic OA samples can be forwarded safely only when no CS samples
+ * are pending. While CS samples are pending, the eventual ordering between
+ * them and the periodic samples is unknown, so the periodic samples must be
+ * held back. Once no CS sample is pending, any CS sample requested later
+ * cannot have a timestamp earlier than the current periodic timestamp, so
+ * flushing the periodic samples preserves timestamp order.
  */
 
 static bool stream_have_data_unlocked(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
+	enum cs_buf_state state = CS_BUF_EMPTY;
+	u32 num_samples = 0, last_ts = 0;
+
+	dev_priv->perf.oa.n_pending_periodic_samples = 0;
+	dev_priv->perf.oa.pending_periodic_ts = 0;
+	num_samples = oa_buffer_num_reports_unlocked(dev_priv,
+						     &last_ts);
 
 	if (stream->cs_mode)
-		return !cs_buffer_is_empty(stream);
-	else
-		return oa_buffer_check_unlocked(dev_priv);
+		state = cs_buffer_state(stream);
+
+	switch (state) {
+	case CS_BUF_EMPTY:
+		if (stream->sample_flags & SAMPLE_OA_REPORT) {
+			dev_priv->perf.oa.n_pending_periodic_samples =
+							num_samples;
+			dev_priv->perf.oa.pending_periodic_ts = last_ts;
+			return (num_samples != 0);
+		} else
+			return false;
+
+	case CS_BUF_HAVE_DATA:
+		return true;
+
+	case CS_BUF_REQ_PENDING:
+	default:
+		return false;
+	}
+	return false;
 }
 
 /**
@@ -1691,7 +1767,7 @@  static int i915_perf_stream_read(struct i915_perf_stream *stream,
 		return append_cs_buffer_samples(stream, buf, count, offset);
 	else if (stream->sample_flags & SAMPLE_OA_REPORT)
 		return dev_priv->perf.oa.ops.read(stream, buf, count, offset,
-						U32_MAX);
+						U32_MAX, U32_MAX);
 	else
 		return -EINVAL;
 }