From patchwork Thu Jan 30 04:39:01 2020
X-Patchwork-Submitter: Keith Packard
X-Patchwork-Id: 11357401
From: Keith Packard
To: mesa-dev@lists.freedesktop.org
Subject: [PATCH] vulkan: Add VK_GOOGLE_display_timing extension (x11+display, anv+radv) [v8]
Date: Wed, 29 Jan 2020 20:39:01 -0800
Message-Id: <20200130043901.571143-1-keithp@keithp.com>
X-Mailer: git-send-email 2.25.0
List-Id: Direct Rendering Infrastructure - Development
Cc: Keith Packard ,
 dri-devel@lists.freedesktop.org

This adds support for the VK_GOOGLE_display_timing extension, which
provides two things:

 1) Detailed information about when frames are displayed, including
    slack time between GPU execution and display frame.

 2) Absolute time control over swapchain queue processing. This allows
    the application to request frames be displayed at specific
    absolute times, using the same timebase as that provided in vblank
    events.

Support for this extension has been implemented for the x11 and display
backends; adding support to other backends should be reasonably
straightforward for one familiar with those systems and should not
require any additional device-specific code.

v2: Adjust GOOGLE_display_timing earliest value. The
    earliestPresentTime for an image cannot be before the time the
    previous image was displayed or, in FIFO mode, before one frame
    after that.

    Make GOOGLE_display_timing use render completed time. Switch from
    VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT to
    VK_PIPELINE_STAGE_ALL_COMMANDS_BIT so that the time reported to
    applications as the end of rendering reflects the latest possible
    value, ensuring that applications don't underestimate the amount of
    work done in the frame.

v3: Adopt Jason Ekstrand's coding conventions. Declare variables at
    first use, eliminate extra whitespace between types and names.
    Wrap lines to 80 columns.

    Suggested-by: Jason Ekstrand

v4: Adapt to changes in MESA_query_timestamp extension

v5: Squash core bits and anv/radv wrappers into a single patch

    Suggested-by: Jason Ekstrand

v6: Switch from MESA_query_timestamp to EXT_calibrated_timestamps

v7: Ensure we target a frame no earlier than desired. This means
    rounding the target frame up, rather than selecting the nearest
    one.

    Suggested-by: Michel Dänzer

v8: Re-order display_timing in anv_extensions.py. That code now
    requires extensions in alphabetical order.
Rename wsi_mark_time to wsi_present_complete to make the functionality clearer. Signed-off-by: Keith Packard --- src/amd/vulkan/radv_extensions.py | 1 + src/amd/vulkan/radv_wsi.c | 33 +++ src/intel/vulkan/anv_extensions.py | 1 + src/intel/vulkan/anv_wsi.c | 31 +++ src/vulkan/wsi/wsi_common.c | 301 +++++++++++++++++++++++++++- src/vulkan/wsi/wsi_common.h | 32 +++ src/vulkan/wsi/wsi_common_display.c | 163 ++++++++++++++- src/vulkan/wsi/wsi_common_private.h | 35 ++++ src/vulkan/wsi/wsi_common_x11.c | 71 ++++++- 9 files changed, 656 insertions(+), 12 deletions(-) diff --git a/src/amd/vulkan/radv_extensions.py b/src/amd/vulkan/radv_extensions.py index 57aa67be616..c255b49437a 100644 --- a/src/amd/vulkan/radv_extensions.py +++ b/src/amd/vulkan/radv_extensions.py @@ -166,6 +166,7 @@ EXTENSIONS = [ Extension('VK_AMD_shader_trinary_minmax', 1, True), Extension('VK_GOOGLE_decorate_string', 1, True), Extension('VK_GOOGLE_hlsl_functionality1', 1, True), + Extension('VK_GOOGLE_display_timing', 1, True), Extension('VK_NV_compute_shader_derivatives', 1, 'device->rad_info.chip_class >= GFX8'), ] diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c index a2b0afa48c3..b722e23ff53 100644 --- a/src/amd/vulkan/radv_wsi.c +++ b/src/amd/vulkan/radv_wsi.c @@ -316,3 +316,36 @@ VkResult radv_GetPhysicalDevicePresentRectanglesKHR( surface, pRectCount, pRects); } + +/* VK_GOOGLE_display_timing */ +VkResult +radv_GetRefreshCycleDurationGOOGLE( + VkDevice _device, + VkSwapchainKHR swapchain, + VkRefreshCycleDurationGOOGLE *pDisplayTimingProperties) +{ + RADV_FROM_HANDLE(radv_device, device, _device); + struct radv_physical_device *pdevice = device->physical_device; + + return wsi_common_get_refresh_cycle_duration(&pdevice->wsi_device, + _device, + swapchain, + pDisplayTimingProperties); +} + +VkResult +radv_GetPastPresentationTimingGOOGLE(VkDevice _device, + VkSwapchainKHR swapchain, + uint32_t *pPresentationTimingCount, + VkPastPresentationTimingGOOGLE + *pPresentationTimings) +{ 
+ RADV_FROM_HANDLE(radv_device, device, _device); + struct radv_physical_device *pdevice = device->physical_device; + + return wsi_common_get_past_presentation_timing(&pdevice->wsi_device, + _device, + swapchain, + pPresentationTimingCount, + pPresentationTimings); +} diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py index 0392b0d2474..256814a8584 100644 --- a/src/intel/vulkan/anv_extensions.py +++ b/src/intel/vulkan/anv_extensions.py @@ -170,6 +170,7 @@ EXTENSIONS = [ Extension('VK_ANDROID_external_memory_android_hardware_buffer', 3, 'ANDROID'), Extension('VK_ANDROID_native_buffer', 7, 'ANDROID'), Extension('VK_GOOGLE_decorate_string', 1, True), + Extension('VK_GOOGLE_display_timing', 1, True), Extension('VK_GOOGLE_hlsl_functionality1', 1, True), Extension('VK_INTEL_performance_query', 1, 'device->perf'), Extension('VK_INTEL_shader_integer_functions2', 1, 'device->info.gen >= 8'), diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c index dbb8d512cbd..c934457f619 100644 --- a/src/intel/vulkan/anv_wsi.c +++ b/src/intel/vulkan/anv_wsi.c @@ -329,3 +329,34 @@ VkResult anv_GetPhysicalDevicePresentRectanglesKHR( surface, pRectCount, pRects); } + +/* VK_GOOGLE_display_timing */ +VkResult +anv_GetRefreshCycleDurationGOOGLE(VkDevice _device, + VkSwapchainKHR swapchain, + VkRefreshCycleDurationGOOGLE + *pDisplayTimingProperties) +{ + ANV_FROM_HANDLE(anv_device, device, _device); + + return wsi_common_get_refresh_cycle_duration(&device->physical->wsi_device, + _device, + swapchain, + pDisplayTimingProperties); +} + +VkResult +anv_GetPastPresentationTimingGOOGLE(VkDevice _device, + VkSwapchainKHR swapchain, + uint32_t *pPresentationTimingCount, + VkPastPresentationTimingGOOGLE + *pPresentationTimings) +{ + ANV_FROM_HANDLE(anv_device, device, _device); + + return wsi_common_get_past_presentation_timing(&device->physical->wsi_device, + _device, + swapchain, + pPresentationTimingCount, + pPresentationTimings); +} diff --git 
a/src/vulkan/wsi/wsi_common.c b/src/vulkan/wsi/wsi_common.c index 0adf54eab8f..f46bfe11ec2 100644 --- a/src/vulkan/wsi/wsi_common.c +++ b/src/vulkan/wsi/wsi_common.c @@ -32,6 +32,7 @@ #include #include #include +#include VkResult wsi_device_init(struct wsi_device *wsi, @@ -54,6 +55,7 @@ wsi_device_init(struct wsi_device *wsi, WSI_GET_CB(GetPhysicalDeviceProperties2); WSI_GET_CB(GetPhysicalDeviceMemoryProperties); WSI_GET_CB(GetPhysicalDeviceQueueFamilyProperties); + WSI_GET_CB(GetPhysicalDeviceProperties); #undef WSI_GET_CB wsi->pci_bus_info.sType = @@ -70,6 +72,10 @@ wsi_device_init(struct wsi_device *wsi, GetPhysicalDeviceMemoryProperties(pdevice, &wsi->memory_props); GetPhysicalDeviceQueueFamilyProperties(pdevice, &wsi->queue_family_count, NULL); + VkPhysicalDeviceProperties properties; + GetPhysicalDeviceProperties(pdevice, &properties); + wsi->timestamp_period = properties.limits.timestampPeriod; + #define WSI_GET_CB(func) \ wsi->func = (PFN_vk##func)proc_addr(pdevice, "vk" #func) WSI_GET_CB(AllocateMemory); @@ -78,14 +84,18 @@ wsi_device_init(struct wsi_device *wsi, WSI_GET_CB(BindImageMemory); WSI_GET_CB(BeginCommandBuffer); WSI_GET_CB(CmdCopyImageToBuffer); + WSI_GET_CB(CmdResetQueryPool); + WSI_GET_CB(CmdWriteTimestamp); WSI_GET_CB(CreateBuffer); WSI_GET_CB(CreateCommandPool); WSI_GET_CB(CreateFence); WSI_GET_CB(CreateImage); + WSI_GET_CB(CreateQueryPool); WSI_GET_CB(DestroyBuffer); WSI_GET_CB(DestroyCommandPool); WSI_GET_CB(DestroyFence); WSI_GET_CB(DestroyImage); + WSI_GET_CB(DestroyQueryPool); WSI_GET_CB(EndCommandBuffer); WSI_GET_CB(FreeMemory); WSI_GET_CB(FreeCommandBuffers); @@ -94,11 +104,15 @@ wsi_device_init(struct wsi_device *wsi, WSI_GET_CB(GetImageMemoryRequirements); WSI_GET_CB(GetImageSubresourceLayout); WSI_GET_CB(GetMemoryFdKHR); + WSI_GET_CB(GetPhysicalDeviceProperties); WSI_GET_CB(GetPhysicalDeviceFormatProperties); WSI_GET_CB(GetPhysicalDeviceFormatProperties2KHR); WSI_GET_CB(GetPhysicalDeviceImageFormatProperties2); + 
WSI_GET_CB(GetPhysicalDeviceQueueFamilyProperties); + WSI_GET_CB(GetQueryPoolResults); WSI_GET_CB(ResetFences); WSI_GET_CB(QueueSubmit); + WSI_GET_CB(GetCalibratedTimestampsEXT); WSI_GET_CB(WaitForFences); #undef WSI_GET_CB @@ -210,6 +224,8 @@ wsi_swapchain_init(const struct wsi_device *wsi, chain->device = device; chain->alloc = *pAllocator; chain->use_prime_blit = false; + chain->timing_insert = 0; + chain->timing_count = 0; chain->cmd_pools = vk_zalloc(pAllocator, sizeof(VkCommandPool) * wsi->queue_family_count, 8, @@ -340,6 +356,63 @@ align_u32(uint32_t v, uint32_t a) return (v + a - 1) & ~(a - 1); } +static VkResult +wsi_image_init_timestamp(const struct wsi_swapchain *chain, + struct wsi_image *image) +{ + const struct wsi_device *wsi = chain->wsi; + VkResult result; + /* Set up command buffer to get timestamp info */ + + result = wsi->CreateQueryPool( + chain->device, + &(const VkQueryPoolCreateInfo) { + .sType = VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO, + .queryType = VK_QUERY_TYPE_TIMESTAMP, + .queryCount = 1, + }, + NULL, + &image->query_pool); + + if (result != VK_SUCCESS) + goto fail; + + result = wsi->AllocateCommandBuffers( + chain->device, + &(const VkCommandBufferAllocateInfo) { + .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO, + .pNext = NULL, + .commandPool = chain->cmd_pools[0], + .level = VK_COMMAND_BUFFER_LEVEL_PRIMARY, + .commandBufferCount = 1, + }, + &image->timestamp_buffer); + if (result != VK_SUCCESS) + goto fail; + + wsi->BeginCommandBuffer( + image->timestamp_buffer, + &(VkCommandBufferBeginInfo) { + .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO, + .flags = 0 + }); + + wsi->CmdResetQueryPool(image->timestamp_buffer, + image->query_pool, + 0, 1); + + wsi->CmdWriteTimestamp(image->timestamp_buffer, + VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, + image->query_pool, + 0); + + wsi->EndCommandBuffer(image->timestamp_buffer); + + return VK_SUCCESS; +fail: + return result; +} + VkResult wsi_create_native_image(const struct wsi_swapchain 
*chain, const VkSwapchainCreateInfoKHR *pCreateInfo, @@ -581,6 +654,10 @@ wsi_create_native_image(const struct wsi_swapchain *chain, if (result != VK_SUCCESS) goto fail; + result = wsi_image_init_timestamp(chain, image); + if (result != VK_SUCCESS) + goto fail; + if (num_modifier_lists > 0) { VkImageDrmFormatModifierPropertiesEXT image_mod_props = { .sType = VK_STRUCTURE_TYPE_IMAGE_DRM_FORMAT_MODIFIER_PROPERTIES_EXT, @@ -834,6 +911,10 @@ wsi_create_prime_image(const struct wsi_swapchain *chain, goto fail; } + result = wsi_image_init_timestamp(chain, image); + if (result != VK_SUCCESS) + goto fail; + const VkMemoryGetFdInfoKHR linear_memory_get_fd_info = { .sType = VK_STRUCTURE_TYPE_MEMORY_GET_FD_INFO_KHR, .pNext = NULL, @@ -1114,6 +1195,128 @@ wsi_common_acquire_next_image2(const struct wsi_device *wsi, return VK_SUCCESS; } +static struct wsi_timing * +wsi_get_timing(struct wsi_swapchain *chain, uint32_t i) +{ + uint32_t j = WSI_TIMING_HISTORY + chain->timing_insert - + chain->timing_count + i; + + if (j >= WSI_TIMING_HISTORY) + j -= WSI_TIMING_HISTORY; + return &chain->timing[j]; +} + +static struct wsi_timing * +wsi_next_timing(struct wsi_swapchain *chain, int image_index) +{ + uint32_t j = chain->timing_insert; + ++chain->timing_insert; + if (chain->timing_insert >= WSI_TIMING_HISTORY) + chain->timing_insert = 0; + if (chain->timing_count < WSI_TIMING_HISTORY) + ++chain->timing_count; + struct wsi_timing *timing = &chain->timing[j]; + memset(timing, '\0', sizeof (*timing)); + return timing; +} + +void +wsi_present_complete(struct wsi_swapchain *swapchain, + struct wsi_image *image, + uint64_t ust, + uint64_t msc) +{ + const struct wsi_device *wsi = swapchain->wsi; + struct wsi_timing *timing = image->timing; + + if (!timing) + return; + + uint64_t render_timestamp; + + VkResult result = wsi->GetQueryPoolResults( + swapchain->device, image->query_pool, + 0, 1, sizeof(render_timestamp), &render_timestamp, + sizeof (uint64_t), + 
VK_QUERY_RESULT_64_BIT|VK_QUERY_RESULT_WAIT_BIT); + if (result != VK_SUCCESS) + return; + + static const VkCalibratedTimestampInfoEXT timestampInfo[2] = { + { + .sType = VK_STRUCTURE_TYPE_CALIBRATED_TIMESTAMP_INFO_EXT, + .pNext = NULL, + .timeDomain = VK_TIME_DOMAIN_DEVICE_EXT, + }, + { + .sType = VK_STRUCTURE_TYPE_CALIBRATED_TIMESTAMP_INFO_EXT, + .pNext = NULL, + .timeDomain = VK_TIME_DOMAIN_CLOCK_MONOTONIC_EXT, + }, + }; + + uint64_t timestamps[2]; + uint64_t maxDeviation; + + result = wsi->GetCalibratedTimestampsEXT(swapchain->device, + 2, + timestampInfo, + timestamps, + &maxDeviation); + if (result != VK_SUCCESS) + return; + + uint64_t current_gpu_timestamp = timestamps[0]; + uint64_t current_time = timestamps[1]; + + VkRefreshCycleDurationGOOGLE display_timings; + swapchain->get_refresh_cycle_duration(swapchain, &display_timings); + + uint64_t refresh_duration = display_timings.refreshDuration; + + /* When did drawing complete (in nsec) */ + + int64_t since_render = (int64_t) floor ((double) (current_gpu_timestamp - render_timestamp) * + (double) wsi->timestamp_period + 0.5); + uint64_t render_time = current_time - since_render; + + if (render_time > ust) + render_time = ust; + + uint64_t render_frames = (ust - render_time) / refresh_duration; + + uint64_t earliest_time = ust - render_frames * refresh_duration; + + /* Use the presentation mode to figure out when the image could have been + * displayed. It couldn't have been displayed before the previous image, so + * use that as a lower bound. 
If we're in FIFO mode, then it couldn't have + * been displayed before one frame *after* the previous image + */ + uint64_t possible_frame = swapchain->frame_ust; + + switch (swapchain->present_mode) { + case VK_PRESENT_MODE_FIFO_KHR: + case VK_PRESENT_MODE_FIFO_RELAXED_KHR: + possible_frame += refresh_duration; + break; + default: + break; + } + if (earliest_time < possible_frame) + earliest_time = possible_frame; + + if (earliest_time > ust) + earliest_time = ust; + + timing->timing.actualPresentTime = ust; + timing->timing.earliestPresentTime = earliest_time; + timing->timing.presentMargin = earliest_time - render_time; + timing->complete = true; + + swapchain->frame_msc = msc; + swapchain->frame_ust = ust; +} + VkResult wsi_common_queue_present(const struct wsi_device *wsi, VkDevice device, @@ -1125,11 +1328,14 @@ wsi_common_queue_present(const struct wsi_device *wsi, const VkPresentRegionsKHR *regions = vk_find_struct_const(pPresentInfo->pNext, PRESENT_REGIONS_KHR); + const VkPresentTimesInfoGOOGLE *present_times_info = + vk_find_struct_const(pPresentInfo->pNext, PRESENT_TIMES_INFO_GOOGLE); for (uint32_t i = 0; i < pPresentInfo->swapchainCount; i++) { WSI_FROM_HANDLE(wsi_swapchain, swapchain, pPresentInfo->pSwapchains[i]); uint32_t image_index = pPresentInfo->pImageIndices[i]; VkResult result; + struct wsi_timing *timing = NULL; if (swapchain->fences[image_index] == VK_NULL_HANDLE) { const VkFenceCreateInfo fence_info = { @@ -1164,9 +1370,12 @@ wsi_common_queue_present(const struct wsi_device *wsi, .memory = image->memory, }; + VkCommandBuffer submit_buffers[2]; VkSubmitInfo submit_info = { .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO, .pNext = &mem_signal, + .pCommandBuffers = submit_buffers, + .commandBufferCount = 0 }; VkPipelineStageFlags *stage_flags = NULL; @@ -1197,10 +1406,47 @@ wsi_common_queue_present(const struct wsi_device *wsi, /* If we are using prime blits, we need to perform the blit now. The * command buffer is attached to the image. 
*/ - submit_info.commandBufferCount = 1; - submit_info.pCommandBuffers = - &image->prime.blit_cmd_buffers[queue_family_index]; mem_signal.memory = image->prime.memory; + submit_buffers[submit_info.commandBufferCount++] = + image->prime.blit_cmd_buffers[queue_family_index]; + } + + /* Set up GOOGLE_display_timing bits */ + if (present_times_info && + present_times_info->pTimes != NULL && + i < present_times_info->swapchainCount) + { + const VkPresentTimeGOOGLE *present_time = + &present_times_info->pTimes[i]; + + timing = wsi_next_timing(swapchain, pPresentInfo->pImageIndices[i]); + timing->timing.presentID = present_time->presentID; + timing->timing.desiredPresentTime = present_time->desiredPresentTime; + timing->target_msc = 0; + image->timing = timing; + + if (present_time->desiredPresentTime != 0) + { + int64_t delta_nsec = (int64_t) (present_time->desiredPresentTime - + swapchain->frame_ust); + + /* Set the target msc only if it's no more than two seconds from + * now, and not stale + */ + if (0 <= delta_nsec && delta_nsec <= 2000000000ul) { + VkRefreshCycleDurationGOOGLE refresh_timing; + + swapchain->get_refresh_cycle_duration(swapchain, + &refresh_timing); + + int64_t refresh = (int64_t) refresh_timing.refreshDuration; + int64_t frames = (delta_nsec + refresh - 1) / refresh; + timing->target_msc = swapchain->frame_msc + frames; + } + } + + submit_buffers[submit_info.commandBufferCount++] = + image->timestamp_buffer; } result = wsi->QueueSubmit(queue, 1, &submit_info, swapchain->fences[image_index]); @@ -1235,3 +1481,52 @@ wsi_common_get_current_time(void) clock_gettime(CLOCK_MONOTONIC, &current); return current.tv_nsec + current.tv_sec * 1000000000ull; } + +VkResult +wsi_common_get_refresh_cycle_duration( + const struct wsi_device *wsi, + VkDevice device_h, + VkSwapchainKHR _swapchain, + VkRefreshCycleDurationGOOGLE *pDisplayTimingProperties) +{ + WSI_FROM_HANDLE(wsi_swapchain, swapchain, _swapchain); + + if (!swapchain->get_refresh_cycle_duration) + return 
VK_ERROR_EXTENSION_NOT_PRESENT; + return swapchain->get_refresh_cycle_duration(swapchain, + pDisplayTimingProperties); +} + + +VkResult +wsi_common_get_past_presentation_timing( + const struct wsi_device *wsi, + VkDevice device_h, + VkSwapchainKHR _swapchain, + uint32_t *count, + VkPastPresentationTimingGOOGLE *timings) +{ + WSI_FROM_HANDLE(wsi_swapchain, swapchain, _swapchain); + uint32_t timing_count_requested = *count; + uint32_t timing_count_available = 0; + + /* Count the number of completed entries, copy */ + for (uint32_t t = 0; t < swapchain->timing_count; t++) { + struct wsi_timing *timing = wsi_get_timing(swapchain, t); + + if (timing->complete && !timing->consumed) { + if (timings && timing_count_available < timing_count_requested) { + timings[timing_count_available] = timing->timing; + timing->consumed = true; + } + timing_count_available++; + } + } + + *count = timing_count_available; + + if (timing_count_available > timing_count_requested && timings != NULL) + return VK_INCOMPLETE; + + return VK_SUCCESS; +} diff --git a/src/vulkan/wsi/wsi_common.h b/src/vulkan/wsi/wsi_common.h index 92121be8bda..038a56f3fd8 100644 --- a/src/vulkan/wsi/wsi_common.h +++ b/src/vulkan/wsi/wsi_common.h @@ -88,6 +88,7 @@ struct wsi_device { VkPhysicalDevice pdevice; VkPhysicalDeviceMemoryProperties memory_props; uint32_t queue_family_count; + float timestamp_period; VkPhysicalDevicePCIBusInfoPropertiesEXT pci_bus_info; @@ -134,14 +135,18 @@ struct wsi_device { WSI_CB(BindImageMemory); WSI_CB(BeginCommandBuffer); WSI_CB(CmdCopyImageToBuffer); + WSI_CB(CmdResetQueryPool); + WSI_CB(CmdWriteTimestamp); WSI_CB(CreateBuffer); WSI_CB(CreateCommandPool); WSI_CB(CreateFence); WSI_CB(CreateImage); + WSI_CB(CreateQueryPool); WSI_CB(DestroyBuffer); WSI_CB(DestroyCommandPool); WSI_CB(DestroyFence); WSI_CB(DestroyImage); + WSI_CB(DestroyQueryPool); WSI_CB(EndCommandBuffer); WSI_CB(FreeMemory); WSI_CB(FreeCommandBuffers); @@ -150,11 +155,15 @@ struct wsi_device { 
WSI_CB(GetImageMemoryRequirements); WSI_CB(GetImageSubresourceLayout); WSI_CB(GetMemoryFdKHR); + WSI_CB(GetPhysicalDeviceProperties); WSI_CB(GetPhysicalDeviceFormatProperties); WSI_CB(GetPhysicalDeviceFormatProperties2KHR); WSI_CB(GetPhysicalDeviceImageFormatProperties2); + WSI_CB(GetPhysicalDeviceQueueFamilyProperties); + WSI_CB(GetQueryPoolResults); WSI_CB(ResetFences); WSI_CB(QueueSubmit); + WSI_CB(GetCalibratedTimestampsEXT); WSI_CB(WaitForFences); #undef WSI_CB @@ -272,4 +281,27 @@ wsi_common_queue_present(const struct wsi_device *wsi, uint64_t wsi_common_get_current_time(void); +VkResult +wsi_common_convert_timestamp(const struct wsi_device *wsi, + VkDevice device_h, + VkSurfaceKHR surface_h, + uint64_t monotonic_timestamp, + uint64_t *surface_timestamp); + +/* VK_GOOGLE_display_timing */ +VkResult +wsi_common_get_refresh_cycle_duration(const struct wsi_device *wsi, + VkDevice device_h, + VkSwapchainKHR swapchain, + VkRefreshCycleDurationGOOGLE + *pDisplayTimingProperties); + +VkResult +wsi_common_get_past_presentation_timing(const struct wsi_device *wsi, + VkDevice device_h, + VkSwapchainKHR swapchain, + uint32_t *pPresentationTimingCount, + VkPastPresentationTimingGOOGLE + *pPresentationTimings); + #endif diff --git a/src/vulkan/wsi/wsi_common_display.c b/src/vulkan/wsi/wsi_common_display.c index 0f9a1ffe8d3..3b2efe57ce1 100644 --- a/src/vulkan/wsi/wsi_common_display.c +++ b/src/vulkan/wsi/wsi_common_display.c @@ -76,6 +76,8 @@ typedef struct wsi_display_connector { char *name; bool connected; bool active; + uint64_t last_frame; + uint64_t last_nsec; struct list_head display_modes; wsi_display_mode *current_mode; drmModeModeInfo current_drm_mode; @@ -110,6 +112,7 @@ struct wsi_display { enum wsi_image_state { WSI_IMAGE_IDLE, WSI_IMAGE_DRAWING, + WSI_IMAGE_WAITING, WSI_IMAGE_QUEUED, WSI_IMAGE_FLIPPING, WSI_IMAGE_DISPLAYING @@ -119,6 +122,7 @@ struct wsi_display_image { struct wsi_image base; struct wsi_display_swapchain *chain; enum wsi_image_state state; + 
struct wsi_display_fence *fence; uint32_t fb_id; uint32_t buffer[4]; uint64_t flip_sequence; @@ -138,6 +142,7 @@ struct wsi_display_fence { bool event_received; bool destroyed; uint64_t sequence; + struct wsi_display_image *image; }; static uint64_t fence_sequence; @@ -1044,6 +1049,7 @@ wsi_display_image_init(VkDevice device_h, image->chain = chain; image->state = WSI_IMAGE_IDLE; + image->fence = NULL; image->fb_id = 0; int ret = drmModeAddFB2(wsi->fd, @@ -1135,6 +1141,11 @@ wsi_display_idle_old_displaying(struct wsi_display_image *active_image) static VkResult _wsi_display_queue_next(struct wsi_swapchain *drv_chain); +static uint64_t widen_32_to_64(uint32_t narrow, uint64_t near) +{ + return near + (int32_t) (narrow - near); +} + static void wsi_display_page_flip_handler2(int fd, unsigned int frame, @@ -1145,17 +1156,38 @@ wsi_display_page_flip_handler2(int fd, { struct wsi_display_image *image = data; struct wsi_display_swapchain *chain = image->chain; + VkIcdSurfaceDisplay *surface = chain->surface; + wsi_display_mode *display_mode = + wsi_display_mode_from_handle(surface->displayMode); + wsi_display_connector *connector = display_mode->connector; + uint64_t nsec = (uint64_t) sec * 1000000000ull + (uint64_t) usec * 1000; wsi_display_debug("image %ld displayed at %d\n", image - &(image->chain->images[0]), frame); + + /* Don't let time go backwards because this function has lower resolution + * than ktime */ + + if (nsec < connector->last_nsec) + nsec = connector->last_nsec; + image->state = WSI_IMAGE_DISPLAYING; + + uint64_t frame64 = widen_32_to_64(frame, connector->last_frame); + + connector->last_frame = frame64; + connector->last_nsec = nsec; + wsi_present_complete(&image->chain->base, &image->base, + nsec, frame64); wsi_display_idle_old_displaying(image); VkResult result = _wsi_display_queue_next(&(chain->base)); if (result != VK_SUCCESS) chain->status = result; } -static void wsi_display_fence_event_handler(struct wsi_display_fence *fence); +static void 
wsi_display_fence_event_handler(struct wsi_display_fence *fence, + uint64_t nsec, + uint64_t frame); static void wsi_display_page_flip_handler(int fd, unsigned int frame, @@ -1171,8 +1203,17 @@ static void wsi_display_vblank_handler(int fd, unsigned int frame, void *data) { struct wsi_display_fence *fence = data; + struct wsi_display_connector *connector = + wsi_display_connector_from_handle(fence->base.display); + uint64_t frame64 = widen_32_to_64(frame, connector->last_frame); + uint64_t nsec = (uint64_t) sec * 1000000000 + (uint64_t) usec * 1000; - wsi_display_fence_event_handler(fence); + /* Don't let time go backwards because this function has lower resolution + * than ktime */ + if (nsec < connector->last_nsec) + nsec = connector->last_nsec; + + wsi_display_fence_event_handler(fence, nsec, frame64); } static void wsi_display_sequence_handler(int fd, uint64_t frame, @@ -1181,7 +1222,7 @@ static void wsi_display_sequence_handler(int fd, uint64_t frame, struct wsi_display_fence *fence = (struct wsi_display_fence *) (uintptr_t) user_data; - wsi_display_fence_event_handler(fence); + wsi_display_fence_event_handler(fence, nsec, frame); } static drmEventContext event_context = { @@ -1513,12 +1554,31 @@ wsi_display_fence_check_free(struct wsi_display_fence *fence) vk_free(fence->base.alloc, fence); } -static void wsi_display_fence_event_handler(struct wsi_display_fence *fence) +static void wsi_display_fence_event_handler(struct wsi_display_fence *fence, + uint64_t nsec, + uint64_t frame) { + struct wsi_display_connector *connector = + wsi_display_connector_from_handle(fence->base.display); + struct wsi_display_image *image = fence->image; + + wsi_display_debug("%9lu fence %lu received %lu nsec %lu\n", + pthread_self(), fence->sequence, frame, nsec); + + connector->last_nsec = nsec; + connector->last_frame = frame; fence->event_received = true; wsi_display_fence_check_free(fence); + if (image) { + image->flip_sequence = ++image->chain->flip_sequence; + image->state = 
WSI_IMAGE_QUEUED; + VkResult result = _wsi_display_queue_next(&image->chain->base); + if (result != VK_SUCCESS) + image->chain->status = result; + } } + static void wsi_display_fence_destroy(struct wsi_fence *fence_wsi) { @@ -1553,6 +1613,7 @@ wsi_display_fence_alloc(VkDevice device, fence->event_received = false; fence->destroyed = false; fence->sequence = ++fence_sequence; + fence->image = NULL; return fence; } @@ -1660,7 +1721,14 @@ _wsi_display_queue_next(struct wsi_swapchain *drv_chain) if (!image) return VK_SUCCESS; + if (image->fence) { + image->fence->image = NULL; + wsi_display_fence_destroy(&image->fence->base); + image->fence = NULL; + } + int ret; + if (connector->active) { ret = drmModePageFlip(wsi->fd, connector->crtc_id, image->fb_id, DRM_MODE_PAGE_FLIP_EVENT, image); @@ -1740,6 +1808,66 @@ wsi_display_queue_present(struct wsi_swapchain *drv_chain, pthread_mutex_lock(&wsi->wait_mutex); + if (image->base.timing && image->base.timing->target_msc != 0) { + VkIcdSurfaceDisplay *surface = chain->surface; + wsi_display_mode *display_mode = + wsi_display_mode_from_handle(surface->displayMode); + wsi_display_connector *connector = display_mode->connector; + + wsi_display_debug("delta frame %ld\n", + image->base.timing->target_msc - connector->last_frame); + if (image->base.timing->target_msc > connector->last_frame) { + uint64_t frame_queued; + VkDisplayKHR display = wsi_display_connector_to_handle(connector); + + wsi_display_debug_code(uint64_t current_frame, current_nsec; + drmCrtcGetSequence(wsi->fd, connector->crtc_id, + &current_frame, + &current_nsec); + wsi_display_debug("from current: %ld\n", + image->base.timing->target_msc + - current_frame)); + + image->fence = wsi_display_fence_alloc(chain->base.device, + chain->base.wsi, + display, &chain->base.alloc); + + if (!image->fence) { + result = VK_ERROR_OUT_OF_HOST_MEMORY; + goto bail_unlock; + } + + result = wsi_register_vblank_event(image->fence, + chain->base.wsi, + display, + 0, + 
image->base.timing->target_msc - 1, + &frame_queued); + + if (result != VK_SUCCESS) + goto bail_unlock; + + /* Check and make sure we are queued for the right frame, otherwise + * just go queue an image + */ + if (frame_queued <= image->base.timing->target_msc - 1) { + image->state = WSI_IMAGE_WAITING; + + /* + * Don't set the image member until we're going to wait for the + * event to arrive before flipping to the image. That way, if the + * register_vblank_event call happens to process the event, it + * won't actually do anything + */ + image->fence->image = image; + wsi_display_start_wait_thread(wsi); + result = VK_SUCCESS; + goto bail_unlock; + } + + } + } + image->flip_sequence = ++chain->flip_sequence; image->state = WSI_IMAGE_QUEUED; @@ -1747,6 +1875,7 @@ wsi_display_queue_present(struct wsi_swapchain *drv_chain, if (result != VK_SUCCESS) chain->status = result; +bail_unlock: pthread_mutex_unlock(&wsi->wait_mutex); if (result != VK_SUCCESS) @@ -1755,6 +1884,21 @@ wsi_display_queue_present(struct wsi_swapchain *drv_chain, return chain->status; } +static VkResult +wsi_display_get_refresh_cycle_duration(struct wsi_swapchain *drv_chain, + VkRefreshCycleDurationGOOGLE *duration) +{ + struct wsi_display_swapchain *chain = + (struct wsi_display_swapchain *) drv_chain; + VkIcdSurfaceDisplay *surface = chain->surface; + wsi_display_mode *display_mode = + wsi_display_mode_from_handle(surface->displayMode); + double refresh = wsi_display_mode_refresh(display_mode); + + duration->refreshDuration = (uint64_t) (floor (1.0/refresh * 1e9 + 0.5)); + return VK_SUCCESS; +} + static VkResult wsi_display_surface_create_swapchain( VkIcdSurfaceBase *icd_surface, @@ -1790,6 +1934,8 @@ wsi_display_surface_create_swapchain( chain->base.acquire_next_image = wsi_display_acquire_next_image; chain->base.queue_present = wsi_display_queue_present; chain->base.present_mode = wsi_swapchain_get_present_mode(wsi_device, create_info); + chain->base.get_refresh_cycle_duration = + 
wsi_display_get_refresh_cycle_duration; chain->base.image_count = num_images; chain->wsi = wsi; @@ -2555,9 +2701,14 @@ wsi_get_swapchain_counter(VkDevice device, return VK_SUCCESS; } - int ret = drmCrtcGetSequence(wsi->fd, connector->crtc_id, value, NULL); - if (ret) + uint64_t nsec; + int ret = drmCrtcGetSequence(wsi->fd, connector->crtc_id, value, &nsec); + if (ret) { *value = 0; + } else { + connector->last_frame = *value; + connector->last_nsec = nsec; + } return VK_SUCCESS; } diff --git a/src/vulkan/wsi/wsi_common_private.h b/src/vulkan/wsi/wsi_common_private.h index 88c360a2409..80b27fc76a6 100644 --- a/src/vulkan/wsi/wsi_common_private.h +++ b/src/vulkan/wsi/wsi_common_private.h @@ -25,6 +25,13 @@ #include "wsi_common.h" +struct wsi_timing { + bool complete; + bool consumed; + uint64_t target_msc; + VkPastPresentationTimingGOOGLE timing; +}; + struct wsi_image { VkImage image; VkDeviceMemory memory; @@ -41,8 +48,16 @@ struct wsi_image { uint32_t offsets[4]; uint32_t row_pitches[4]; int fds[4]; + + VkQueryPool query_pool; + + VkCommandBuffer timestamp_buffer; + + struct wsi_timing *timing; }; +#define WSI_TIMING_HISTORY 16 + struct wsi_swapchain { const struct wsi_device *wsi; @@ -54,6 +69,16 @@ struct wsi_swapchain { bool use_prime_blit; + uint32_t timing_insert; + uint32_t timing_count; + + struct wsi_timing timing[WSI_TIMING_HISTORY]; + + uint64_t frame_msc; + uint64_t frame_ust; + + float timestamp_period; + /* Command pools, one per queue family */ VkCommandPool *cmd_pools; @@ -67,6 +92,10 @@ struct wsi_swapchain { VkResult (*queue_present)(struct wsi_swapchain *swap_chain, uint32_t image_index, const VkPresentRegionKHR *damage); + VkResult (*get_refresh_cycle_duration)(struct wsi_swapchain *swap_chain, + VkRefreshCycleDurationGOOGLE + *pDisplayTimingProperties); + }; bool @@ -104,6 +133,12 @@ wsi_destroy_image(const struct wsi_swapchain *chain, struct wsi_image *image); +void +wsi_present_complete(struct wsi_swapchain *swapchain, + struct wsi_image 
*image, + uint64_t ust, + uint64_t msc); + struct wsi_interface { VkResult (*get_support)(VkIcdSurfaceBase *surface, struct wsi_device *wsi_device, diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c index 95106af5b6e..0a1738ce7cc 100644 --- a/src/vulkan/wsi/wsi_common_x11.c +++ b/src/vulkan/wsi/wsi_common_x11.c @@ -748,6 +748,7 @@ struct x11_image { bool busy; struct xshmfence * shm_fence; uint32_t sync_fence; + uint32_t serial; }; struct x11_swapchain { @@ -766,6 +767,8 @@ struct x11_swapchain { uint64_t send_sbc; uint64_t last_present_msc; uint32_t stamp; + uint64_t last_present_nsec; + uint64_t refresh_period; bool has_present_queue; bool has_acquire_queue; @@ -859,8 +862,39 @@ x11_handle_dri3_present_event(struct x11_swapchain *chain, case XCB_PRESENT_EVENT_COMPLETE_NOTIFY: { xcb_present_complete_notify_event_t *complete = (void *) event; - if (complete->kind == XCB_PRESENT_COMPLETE_KIND_PIXMAP) + if (complete->kind == XCB_PRESENT_COMPLETE_KIND_PIXMAP) { + uint64_t frames = complete->msc - chain->last_present_msc; + uint64_t present_nsec = complete->ust * 1000; + + /* + * Well, this is about as good as we can do -- measure the refresh + * instead of asking for the current mode and using that. 
Turns out, + * for eDP panels, this works better anyways as they use the builtin + * fixed mode for everything + */ + if (0 < frames && frames < 10 && + present_nsec > chain->last_present_nsec) + { + uint64_t refresh_period = + (present_nsec - chain->last_present_nsec + frames / 2) / frames; + + if (chain->refresh_period) + refresh_period = + (3 * chain->refresh_period + refresh_period) >> 2; + + chain->refresh_period = refresh_period; + } + chain->last_present_msc = complete->msc; + chain->last_present_nsec = present_nsec; + for (unsigned i = 0; i < chain->base.image_count; i++) { + if (chain->images[i].serial == complete->serial) { + wsi_present_complete(&chain->base, &chain->images[i].base, + present_nsec, complete->msc); + break; + } + } + } VkResult result = VK_SUCCESS; @@ -994,7 +1028,7 @@ x11_acquire_next_image_from_queue(struct x11_swapchain *chain, static VkResult x11_present_to_x11(struct x11_swapchain *chain, uint32_t image_index, - uint32_t target_msc) + uint64_t target_msc) { struct x11_image *image = &chain->images[image_index]; @@ -1029,11 +1063,12 @@ x11_present_to_x11(struct x11_swapchain *chain, uint32_t image_index, xshmfence_reset(image->shm_fence); ++chain->send_sbc; + image->serial = (uint32_t) chain->send_sbc; xcb_void_cookie_t cookie = xcb_present_pixmap(chain->conn, chain->window, image->pixmap, - (uint32_t) chain->send_sbc, + image->serial, 0, /* valid */ 0, /* update */ 0, /* x_off */ @@ -1091,6 +1126,26 @@ x11_queue_present(struct wsi_swapchain *anv_chain, } } +static uint64_t +x11_refresh_duration(struct x11_swapchain *chain) +{ + /* Pick 60Hz if we don't know what it actually is yet */ + if (!chain->refresh_period) + return (uint64_t) (1e9 / 59.98 + 0.5); + + return chain->refresh_period; +} + +static VkResult +x11_get_refresh(struct wsi_swapchain *wsi_chain, + VkRefreshCycleDurationGOOGLE *timings) +{ + struct x11_swapchain *chain = (struct x11_swapchain *)wsi_chain; + + timings->refreshDuration = x11_refresh_duration(chain); + return 
VK_SUCCESS; +} + static void * x11_manage_fifo_queues(void *state) { @@ -1106,6 +1161,7 @@ x11_manage_fifo_queues(void *state) * other than the currently presented one. */ uint32_t image_index = 0; + struct x11_image *image; result = wsi_queue_pull(&chain->present_queue, &image_index, INT64_MAX); assert(result != VK_TIMEOUT); if (result < 0) { @@ -1131,6 +1187,12 @@ x11_manage_fifo_queues(void *state) if (chain->has_acquire_queue) target_msc = chain->last_present_msc + 1; + image = &chain->images[image_index]; + + struct wsi_timing *timing = image->base.timing; + if (timing && timing->target_msc != 0 && timing->target_msc > target_msc) + target_msc = timing->target_msc; + result = x11_present_to_x11(chain, image_index, target_msc); if (result < 0) goto fail; @@ -1471,6 +1533,7 @@ x11_surface_create_swapchain(VkIcdSurfaceBase *icd_surface, chain->base.acquire_next_image = x11_acquire_next_image; chain->base.queue_present = x11_queue_present; chain->base.present_mode = present_mode; + chain->base.get_refresh_cycle_duration = x11_get_refresh; chain->base.image_count = num_images; chain->conn = conn; chain->window = window; @@ -1480,6 +1543,8 @@ x11_surface_create_swapchain(VkIcdSurfaceBase *icd_surface, chain->last_present_msc = 0; chain->has_acquire_queue = false; chain->has_present_queue = false; + chain->last_present_nsec = 0; + chain->threaded = false; chain->status = VK_SUCCESS; chain->has_dri3_modifiers = wsi_conn->has_dri3_modifiers;