From patchwork Mon Feb 28 21:10:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Melissa Wen X-Patchwork-Id: 12763805 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 281EFC433FE for ; Mon, 28 Feb 2022 21:11:43 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DCD8B10E5C5; Mon, 28 Feb 2022 21:11:41 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) by gabe.freedesktop.org (Postfix) with ESMTPS id E20D610E442; Mon, 28 Feb 2022 21:11:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To:Content-Type:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=vFXJVU/zkUlvWZ15xcOen8+g6QF5if2TYI4v/ng6ZkQ=; b=iyaXGiXUD04oqZrIYyem791UAj gEnbaEzEppTEOadZpvCUpeXm9kdRSOuY5BhOzEx3oHUlZCML2Ph0o3SrCAyQcfdhfvwqs2Hqt4qsk T+CfKLPMEkigRVcimAV+fNRohGqKD1SjWp1Ke7I8KtLd+KL6tCnBELreEdyR/zsYrwafUrPw5yFf9 wRjFwYYySZB2+U8twZv01RI8LPZY+UvvZamknqot+fbVgh3FmbJ8OCXY7KYBY+qOfDcvzwJh+igGl BH7gWCPfXrn1//T9GXOSiERaB5DhpmJuKRR/BURxSejf6/98HbWCzAeZhG8QR6/fEx8uz5+VzFLQC augarS5g==; Received: from [165.90.126.25] (helo=killbill.home) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1nOnIv-000CFy-Ao; Mon, 28 Feb 2022 22:11:37 +0100 From: Melissa Wen To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, harry.wentland@amd.com, sunpeng.li@amd.com, Rodrigo.Siqueira@amd.com, alexander.deucher@amd.com, christian.koenig@amd.com, Xinhui.Pan@amd.com, airlied@linux.ie, daniel@ffwll.ch Subject: [PATCH 1/2] drm/amd/display: move FPU operations from dcn21 to dml/dcn20 folder Date: Mon, 28 Feb 2022 20:10:46 -0100 Message-Id: <20220228211047.3957945-2-mwen@igalia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220228211047.3957945-1-mwen@igalia.com> References: <20220228211047.3957945-1-mwen@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Melissa Wen , Qingqing Zhuo , Dmytro Laktyushkin , linux-kernel@vger.kernel.org, Jasdeep Dhillon Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" dml/dcn20_fpu file centralizes all DCN2x functions that require FPU access. Therefore, this patch moves FPU-related code from dcn21 to dcn20_fpu. These include: - dcn21_populate_dml_pipes_from_context() - dcn21_validate_bandwidth_fp() and related: - dcn21_calculate_wm(), - patch_bounding_box(), - calculate_wm_set_for_vlevel() - renaming update_bw_bounding_box() to dcn21_update_bw_bounding_box(), move to dcn20_fpu with related static function construct_low_pstate_lvl() Also, make dcn21_fast_validate_bw() public in dcn21_resource as it is called by dcn21_validate_bandwidth_fp() now in dcn20_fpu. Reuse dcn20_fpu_adjust_dppclk() in dcn21_fast_validate_bw() as it isolates the same FPU operation. Include dchubbub.h as it is required in dcn21_populate_dml_pipes_from_context() Signed-off-by: Melissa Wen --- drivers/gpu/drm/amd/display/dc/dcn21/Makefile | 25 - .../drm/amd/display/dc/dcn21/dcn21_resource.c | 566 +----------------- .../drm/amd/display/dc/dcn21/dcn21_resource.h | 11 + .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c | 538 ++++++++++++++++- .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.h | 9 + 5 files changed, 571 insertions(+), 578 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/Makefile b/drivers/gpu/drm/amd/display/dc/dcn21/Makefile index bb8c95141082..0dc06e428999 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn21/Makefile +++ b/drivers/gpu/drm/amd/display/dc/dcn21/Makefile @@ -5,31 +5,6 @@ DCN21 = dcn21_init.o dcn21_hubp.o dcn21_hubbub.o dcn21_resource.o \ dcn21_hwseq.o dcn21_link_encoder.o dcn21_dccg.o -ifdef CONFIG_X86 -CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o := -mhard-float -msse -endif - -ifdef CONFIG_PPC64 -CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o := -mhard-float -maltivec -endif - -ifdef CONFIG_CC_IS_GCC -ifeq ($(call cc-ifversion, -lt, 0701, y), y) -IS_OLD_GCC = 1 -endif -endif - -ifdef CONFIG_X86 -ifdef IS_OLD_GCC -# Stack alignment mismatch, proceed with caution. -# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3 -# (8B stack alignment). -CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o += -mpreferred-stack-boundary=4 -else -CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o += -msse2 -endif -endif - AMD_DAL_DCN21 = $(addprefix $(AMDDALPATH)/dc/dcn21/,$(DCN21)) AMD_DISPLAY_FILES += $(AMD_DAL_DCN21) diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c index c1cd1a8ff1d7..612732656772 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c @@ -34,6 +34,7 @@ #include "resource.h" #include "include/irq_service_interface.h" #include "dcn20/dcn20_resource.h" +#include "dcn21/dcn21_resource.h" #include "dml/dcn20/dcn20_fpu.h" @@ -89,230 +90,6 @@ #include "dce/dmub_psr.h" #include "dce/dmub_abm.h" -#define DC_LOGGER_INIT(logger) - - -struct _vcs_dpi_ip_params_st dcn2_1_ip = { - .odm_capable = 1, - .gpuvm_enable = 1, - .hostvm_enable = 1, - .gpuvm_max_page_table_levels = 1, - .hostvm_max_page_table_levels = 4, - .hostvm_cached_page_table_levels = 2, - .num_dsc = 3, - .rob_buffer_size_kbytes = 168, - .det_buffer_size_kbytes = 164, - .dpte_buffer_size_in_pte_reqs_luma = 44, - .dpte_buffer_size_in_pte_reqs_chroma = 42,//todo - .dpp_output_buffer_pixels = 2560, - .opp_output_buffer_lines = 1, - .pixel_chunk_size_kbytes = 8, - .pte_enable = 1, - .max_page_table_levels = 4, - .pte_chunk_size_kbytes = 2, - .meta_chunk_size_kbytes = 2, - .min_meta_chunk_size_bytes = 256, - .writeback_chunk_size_kbytes = 2, - .line_buffer_size_bits = 789504, - .is_line_buffer_bpp_fixed = 0, - .line_buffer_fixed_bpp = 0, - .dcc_supported = true, - .max_line_buffer_lines = 12, - .writeback_luma_buffer_size_kbytes = 12, - .writeback_chroma_buffer_size_kbytes = 8, - .writeback_chroma_line_buffer_width_pixels = 4, - .writeback_max_hscl_ratio = 1, - .writeback_max_vscl_ratio = 1, - .writeback_min_hscl_ratio = 1, - .writeback_min_vscl_ratio = 1, - .writeback_max_hscl_taps = 12, - .writeback_max_vscl_taps = 12, - .writeback_line_buffer_luma_buffer_size = 0, - .writeback_line_buffer_chroma_buffer_size = 14643, - .cursor_buffer_size = 8, - .cursor_chunk_size = 2, - .max_num_otg = 4, - .max_num_dpp = 4, - .max_num_wb = 1, - .max_dchub_pscl_bw_pix_per_clk = 4, - .max_pscl_lb_bw_pix_per_clk = 2, - .max_lb_vscl_bw_pix_per_clk = 4, - .max_vscl_hscl_bw_pix_per_clk = 4, - .max_hscl_ratio = 4, - .max_vscl_ratio = 4, - .hscl_mults = 4, - .vscl_mults = 4, - .max_hscl_taps = 8, - .max_vscl_taps = 8, - .dispclk_ramp_margin_percent = 1, - .underscan_factor = 1.10, - .min_vblank_lines = 32, // - .dppclk_delay_subtotal = 77, // - .dppclk_delay_scl_lb_only = 16, - .dppclk_delay_scl = 50, - .dppclk_delay_cnvc_formatter = 8, - .dppclk_delay_cnvc_cursor = 6, - .dispclk_delay_subtotal = 87, // - .dcfclk_cstate_latency = 10, // SRExitTime - .max_inter_dcn_tile_repeaters = 8, - - .xfc_supported = false, - .xfc_fill_bw_overhead_percent = 10.0, - .xfc_fill_constant_bytes = 0, - .ptoi_supported = 0, - .number_of_cursors = 1, -}; - -struct _vcs_dpi_soc_bounding_box_st dcn2_1_soc = { - .clock_limits = { - { - .state = 0, - .dcfclk_mhz = 400.0, - .fabricclk_mhz = 400.0, - .dispclk_mhz = 600.0, - .dppclk_mhz = 400.00, - .phyclk_mhz = 600.0, - .socclk_mhz = 278.0, - .dscclk_mhz = 205.67, - .dram_speed_mts = 1600.0, - }, - { - .state = 1, - .dcfclk_mhz = 464.52, - .fabricclk_mhz = 800.0, - .dispclk_mhz = 654.55, - .dppclk_mhz = 626.09, - .phyclk_mhz = 600.0, - .socclk_mhz = 278.0, - .dscclk_mhz = 205.67, - .dram_speed_mts = 1600.0, - }, - { - .state = 2, - .dcfclk_mhz = 514.29, - .fabricclk_mhz = 933.0, - .dispclk_mhz = 757.89, - .dppclk_mhz = 685.71, - .phyclk_mhz = 600.0, - .socclk_mhz = 278.0, - .dscclk_mhz = 287.67, - .dram_speed_mts = 1866.0, - }, - { - .state = 3, - .dcfclk_mhz = 576.00, - .fabricclk_mhz = 1067.0, - .dispclk_mhz = 847.06, - .dppclk_mhz = 757.89, - .phyclk_mhz = 600.0, - .socclk_mhz = 715.0, - .dscclk_mhz = 318.334, - .dram_speed_mts = 2134.0, - }, - { - .state = 4, - .dcfclk_mhz = 626.09, - .fabricclk_mhz = 1200.0, - .dispclk_mhz = 900.00, - .dppclk_mhz = 847.06, - .phyclk_mhz = 810.0, - .socclk_mhz = 953.0, - .dscclk_mhz = 489.0, - .dram_speed_mts = 2400.0, - }, - { - .state = 5, - .dcfclk_mhz = 685.71, - .fabricclk_mhz = 1333.0, - .dispclk_mhz = 1028.57, - .dppclk_mhz = 960.00, - .phyclk_mhz = 810.0, - .socclk_mhz = 278.0, - .dscclk_mhz = 287.67, - .dram_speed_mts = 2666.0, - }, - { - .state = 6, - .dcfclk_mhz = 757.89, - .fabricclk_mhz = 1467.0, - .dispclk_mhz = 1107.69, - .dppclk_mhz = 1028.57, - .phyclk_mhz = 810.0, - .socclk_mhz = 715.0, - .dscclk_mhz = 318.334, - .dram_speed_mts = 3200.0, - }, - { - .state = 7, - .dcfclk_mhz = 847.06, - .fabricclk_mhz = 1600.0, - .dispclk_mhz = 1395.0, - .dppclk_mhz = 1285.00, - .phyclk_mhz = 1325.0, - .socclk_mhz = 953.0, - .dscclk_mhz = 489.0, - .dram_speed_mts = 4266.0, - }, - /*Extra state, no dispclk ramping*/ - { - .state = 8, - .dcfclk_mhz = 847.06, - .fabricclk_mhz = 1600.0, - .dispclk_mhz = 1395.0, - .dppclk_mhz = 1285.0, - .phyclk_mhz = 1325.0, - .socclk_mhz = 953.0, - .dscclk_mhz = 489.0, - .dram_speed_mts = 4266.0, - }, - - }, - - .sr_exit_time_us = 12.5, - .sr_enter_plus_exit_time_us = 17.0, - .urgent_latency_us = 4.0, - .urgent_latency_pixel_data_only_us = 4.0, - .urgent_latency_pixel_mixed_with_vm_data_us = 4.0, - .urgent_latency_vm_data_only_us = 4.0, - .urgent_out_of_order_return_per_channel_pixel_only_bytes = 4096, - .urgent_out_of_order_return_per_channel_pixel_and_vm_bytes = 4096, - .urgent_out_of_order_return_per_channel_vm_only_bytes = 4096, - .pct_ideal_dram_sdp_bw_after_urgent_pixel_only = 80.0, - .pct_ideal_dram_sdp_bw_after_urgent_pixel_and_vm = 75.0, - .pct_ideal_dram_sdp_bw_after_urgent_vm_only = 40.0, - .max_avg_sdp_bw_use_normal_percent = 60.0, - .max_avg_dram_bw_use_normal_percent = 100.0, - .writeback_latency_us = 12.0, - .max_request_size_bytes = 256, - .dram_channel_width_bytes = 4, - .fabric_datapath_to_dcn_data_return_bytes = 32, - .dcn_downspread_percent = 0.5, - .downspread_percent = 0.38, - .dram_page_open_time_ns = 50.0, - .dram_rw_turnaround_time_ns = 17.5, - .dram_return_buffer_per_channel_bytes = 8192, - .round_trip_ping_latency_dcfclk_cycles = 128, - .urgent_out_of_order_return_per_channel_bytes = 4096, - .channel_interleave_bytes = 256, - .num_banks = 8, - .num_chans = 4, - .vmm_page_size_bytes = 4096, - .dram_clock_change_latency_us = 23.84, - .return_bus_width_bytes = 64, - .dispclk_dppclk_vco_speed_mhz = 3600, - .xfc_bus_transport_time_us = 4, - .xfc_xbuf_latency_tolerance_us = 4, - .use_urgent_burst_bw = 1, - .num_states = 8 -}; - -#ifndef MAX -#define MAX(X, Y) ((X) > (Y) ? (X) : (Y)) -#endif -#ifndef MIN -#define MIN(X, Y) ((X) < (Y) ? (X) : (Y)) -#endif - /* begin ********************* * macros to expend register list macro defined in HW object header file */ @@ -705,12 +482,6 @@ static const struct dcn10_stream_encoder_mask se_mask = { static void dcn21_pp_smu_destroy(struct pp_smu_funcs **pp_smu); -static int dcn21_populate_dml_pipes_from_context( - struct dc *dc, - struct dc_state *context, - display_e2e_pipe_params_st *pipes, - bool fast_validate); - static struct input_pixel_processor *dcn21_ipp_create( struct dc_context *ctx, uint32_t inst) { @@ -1029,163 +800,13 @@ static void dcn21_resource_destruct(struct dcn21_resource_pool *pool) dcn21_pp_smu_destroy(&pool->base.pp_smu); } - -static void calculate_wm_set_for_vlevel( - int vlevel, - struct wm_range_table_entry *table_entry, - struct dcn_watermarks *wm_set, - struct display_mode_lib *dml, - display_e2e_pipe_params_st *pipes, - int pipe_cnt) -{ - double dram_clock_change_latency_cached = dml->soc.dram_clock_change_latency_us; - - ASSERT(vlevel < dml->soc.num_states); - /* only pipe 0 is read for voltage and dcf/soc clocks */ - pipes[0].clks_cfg.voltage = vlevel; - pipes[0].clks_cfg.dcfclk_mhz = dml->soc.clock_limits[vlevel].dcfclk_mhz; - pipes[0].clks_cfg.socclk_mhz = dml->soc.clock_limits[vlevel].socclk_mhz; - - dml->soc.dram_clock_change_latency_us = table_entry->pstate_latency_us; - dml->soc.sr_exit_time_us = table_entry->sr_exit_time_us; - dml->soc.sr_enter_plus_exit_time_us = table_entry->sr_enter_plus_exit_time_us; - - wm_set->urgent_ns = get_wm_urgent(dml, pipes, pipe_cnt) * 1000; - wm_set->cstate_pstate.cstate_enter_plus_exit_ns = get_wm_stutter_enter_exit(dml, pipes, pipe_cnt) * 1000; - wm_set->cstate_pstate.cstate_exit_ns = get_wm_stutter_exit(dml, pipes, pipe_cnt) * 1000; - wm_set->cstate_pstate.pstate_change_ns = get_wm_dram_clock_change(dml, pipes, pipe_cnt) * 1000; - wm_set->pte_meta_urgent_ns = get_wm_memory_trip(dml, pipes, pipe_cnt) * 1000; - wm_set->frac_urg_bw_nom = get_fraction_of_urgent_bandwidth(dml, pipes, pipe_cnt) * 1000; - wm_set->frac_urg_bw_flip = get_fraction_of_urgent_bandwidth_imm_flip(dml, pipes, pipe_cnt) * 1000; - wm_set->urgent_latency_ns = get_urgent_latency(dml, pipes, pipe_cnt) * 1000; - dml->soc.dram_clock_change_latency_us = dram_clock_change_latency_cached; - -} - -static void patch_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_st *bb) -{ - int i; - - if (dc->bb_overrides.sr_exit_time_ns) { - for (i = 0; i < WM_SET_COUNT; i++) { - dc->clk_mgr->bw_params->wm_table.entries[i].sr_exit_time_us = - dc->bb_overrides.sr_exit_time_ns / 1000.0; - } - } - - if (dc->bb_overrides.sr_enter_plus_exit_time_ns) { - for (i = 0; i < WM_SET_COUNT; i++) { - dc->clk_mgr->bw_params->wm_table.entries[i].sr_enter_plus_exit_time_us = - dc->bb_overrides.sr_enter_plus_exit_time_ns / 1000.0; - } - } - - if (dc->bb_overrides.urgent_latency_ns) { - bb->urgent_latency_us = dc->bb_overrides.urgent_latency_ns / 1000.0; - } - - if (dc->bb_overrides.dram_clock_change_latency_ns) { - for (i = 0; i < WM_SET_COUNT; i++) { - dc->clk_mgr->bw_params->wm_table.entries[i].pstate_latency_us = - dc->bb_overrides.dram_clock_change_latency_ns / 1000.0; - } - } -} - -static void dcn21_calculate_wm( - struct dc *dc, struct dc_state *context, - display_e2e_pipe_params_st *pipes, - int *out_pipe_cnt, - int *pipe_split_from, - int vlevel_req, - bool fast_validate) -{ - int pipe_cnt, i, pipe_idx; - int vlevel, vlevel_max; - struct wm_range_table_entry *table_entry; - struct clk_bw_params *bw_params = dc->clk_mgr->bw_params; - - ASSERT(bw_params); - - patch_bounding_box(dc, &context->bw_ctx.dml.soc); - - for (i = 0, pipe_idx = 0, pipe_cnt = 0; i < dc->res_pool->pipe_count; i++) { - if (!context->res_ctx.pipe_ctx[i].stream) - continue; - - pipes[pipe_cnt].clks_cfg.refclk_mhz = dc->res_pool->ref_clocks.dchub_ref_clock_inKhz / 1000.0; - pipes[pipe_cnt].clks_cfg.dispclk_mhz = context->bw_ctx.dml.vba.RequiredDISPCLK[vlevel_req][context->bw_ctx.dml.vba.maxMpcComb]; - - if (pipe_split_from[i] < 0) { - pipes[pipe_cnt].clks_cfg.dppclk_mhz = - context->bw_ctx.dml.vba.RequiredDPPCLK[vlevel_req][context->bw_ctx.dml.vba.maxMpcComb][pipe_idx]; - if (context->bw_ctx.dml.vba.BlendingAndTiming[pipe_idx] == pipe_idx) - pipes[pipe_cnt].pipe.dest.odm_combine = - context->bw_ctx.dml.vba.ODMCombineEnablePerState[vlevel_req][pipe_idx]; - else - pipes[pipe_cnt].pipe.dest.odm_combine = 0; - pipe_idx++; - } else { - pipes[pipe_cnt].clks_cfg.dppclk_mhz = - context->bw_ctx.dml.vba.RequiredDPPCLK[vlevel_req][context->bw_ctx.dml.vba.maxMpcComb][pipe_split_from[i]]; - if (context->bw_ctx.dml.vba.BlendingAndTiming[pipe_split_from[i]] == pipe_split_from[i]) - pipes[pipe_cnt].pipe.dest.odm_combine = - context->bw_ctx.dml.vba.ODMCombineEnablePerState[vlevel_req][pipe_split_from[i]]; - else - pipes[pipe_cnt].pipe.dest.odm_combine = 0; - } - pipe_cnt++; - } - - if (pipe_cnt != pipe_idx) { - if (dc->res_pool->funcs->populate_dml_pipes) - pipe_cnt = dc->res_pool->funcs->populate_dml_pipes(dc, - context, pipes, fast_validate); - else - pipe_cnt = dcn21_populate_dml_pipes_from_context(dc, - context, pipes, fast_validate); - } - - *out_pipe_cnt = pipe_cnt; - - vlevel_max = bw_params->clk_table.num_entries - 1; - - - /* WM Set D */ - table_entry = &bw_params->wm_table.entries[WM_D]; - if (table_entry->wm_type == WM_TYPE_RETRAINING) - vlevel = 0; - else - vlevel = vlevel_max; - calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.watermarks.d, - &context->bw_ctx.dml, pipes, pipe_cnt); - /* WM Set C */ - table_entry = &bw_params->wm_table.entries[WM_C]; - vlevel = MIN(MAX(vlevel_req, 3), vlevel_max); - calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.watermarks.c, - &context->bw_ctx.dml, pipes, pipe_cnt); - /* WM Set B */ - table_entry = &bw_params->wm_table.entries[WM_B]; - vlevel = MIN(MAX(vlevel_req, 2), vlevel_max); - calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.watermarks.b, - &context->bw_ctx.dml, pipes, pipe_cnt); - - /* WM Set A */ - table_entry = &bw_params->wm_table.entries[WM_A]; - vlevel = MIN(vlevel_req, vlevel_max); - calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.watermarks.a, - &context->bw_ctx.dml, pipes, pipe_cnt); -} - - -static bool dcn21_fast_validate_bw( - struct dc *dc, - struct dc_state *context, - display_e2e_pipe_params_st *pipes, - int *pipe_cnt_out, - int *pipe_split_from, - int *vlevel_out, - bool fast_validate) +bool dcn21_fast_validate_bw(struct dc *dc, + struct dc_state *context, + display_e2e_pipe_params_st *pipes, + int *pipe_cnt_out, + int *pipe_split_from, + int *vlevel_out, + bool fast_validate) { bool out = false; int split[MAX_PIPES] = { 0 }; @@ -1197,7 +818,9 @@ static bool dcn21_fast_validate_bw( dcn20_merge_pipes_for_validate(dc, context); + DC_FP_START(); pipe_cnt = dc->res_pool->funcs->populate_dml_pipes(dc, context, pipes, fast_validate); + DC_FP_END(); *pipe_cnt_out = pipe_cnt; @@ -1287,7 +910,9 @@ static bool dcn21_fast_validate_bw( hsplit_pipe = dcn20_find_secondary_pipe(dc, &context->res_ctx, dc->res_pool, pipe); ASSERT(hsplit_pipe); if (!hsplit_pipe) { - context->bw_ctx.dml.vba.RequiredDPPCLK[vlevel][context->bw_ctx.dml.vba.maxMpcComb][pipe_idx] *= 2; + DC_FP_START(); + dcn20_fpu_adjust_dppclk(&context->bw_ctx.dml.vba, vlevel, context->bw_ctx.dml.vba.maxMpcComb, pipe_idx, true); + DC_FP_END(); continue; } if (context->bw_ctx.dml.vba.ODMCombineEnabled[pipe_idx]) { @@ -1329,63 +954,6 @@ static bool dcn21_fast_validate_bw( return out; } -static noinline bool dcn21_validate_bandwidth_fp(struct dc *dc, - struct dc_state *context, bool fast_validate) -{ - bool out = false; - - BW_VAL_TRACE_SETUP(); - - int vlevel = 0; - int pipe_split_from[MAX_PIPES]; - int pipe_cnt = 0; - display_e2e_pipe_params_st *pipes = kzalloc(dc->res_pool->pipe_count * sizeof(display_e2e_pipe_params_st), GFP_ATOMIC); - DC_LOGGER_INIT(dc->ctx->logger); - - BW_VAL_TRACE_COUNT(); - - /*Unsafe due to current pipe merge and split logic*/ - ASSERT(context != dc->current_state); - - out = dcn21_fast_validate_bw(dc, context, pipes, &pipe_cnt, pipe_split_from, &vlevel, fast_validate); - - if (pipe_cnt == 0) - goto validate_out; - - if (!out) - goto validate_fail; - - BW_VAL_TRACE_END_VOLTAGE_LEVEL(); - - if (fast_validate) { - BW_VAL_TRACE_SKIP(fast); - goto validate_out; - } - - dcn21_calculate_wm(dc, context, pipes, &pipe_cnt, pipe_split_from, vlevel, fast_validate); - DC_FP_START(); - dcn20_calculate_dlg_params(dc, context, pipes, pipe_cnt, vlevel); - DC_FP_END(); - - BW_VAL_TRACE_END_WATERMARKS(); - - goto validate_out; - -validate_fail: - DC_LOG_WARNING("Mode Validation Warning: %s failed validation.\n", - dml_get_status_message(context->bw_ctx.dml.vba.ValidationStatus[context->bw_ctx.dml.vba.soc.num_states])); - - BW_VAL_TRACE_SKIP(fail); - out = false; - -validate_out: - kfree(pipes); - - BW_VAL_TRACE_FINISH(); - - return out; -} - /* * Some of the functions further below use the FPU, so we need to wrap this * with DC_FP_START()/DC_FP_END(). Use the same approach as for @@ -1560,94 +1128,6 @@ static struct display_stream_compressor *dcn21_dsc_create(struct dc_context *ctx return &dsc->base; } -static struct _vcs_dpi_voltage_scaling_st construct_low_pstate_lvl(struct clk_limit_table *clk_table, unsigned int high_voltage_lvl) -{ - struct _vcs_dpi_voltage_scaling_st low_pstate_lvl; - int i; - - low_pstate_lvl.state = 1; - low_pstate_lvl.dcfclk_mhz = clk_table->entries[0].dcfclk_mhz; - low_pstate_lvl.fabricclk_mhz = clk_table->entries[0].fclk_mhz; - low_pstate_lvl.socclk_mhz = clk_table->entries[0].socclk_mhz; - low_pstate_lvl.dram_speed_mts = clk_table->entries[0].memclk_mhz * 2; - - low_pstate_lvl.dispclk_mhz = dcn2_1_soc.clock_limits[high_voltage_lvl].dispclk_mhz; - low_pstate_lvl.dppclk_mhz = dcn2_1_soc.clock_limits[high_voltage_lvl].dppclk_mhz; - low_pstate_lvl.dram_bw_per_chan_gbps = dcn2_1_soc.clock_limits[high_voltage_lvl].dram_bw_per_chan_gbps; - low_pstate_lvl.dscclk_mhz = dcn2_1_soc.clock_limits[high_voltage_lvl].dscclk_mhz; - low_pstate_lvl.dtbclk_mhz = dcn2_1_soc.clock_limits[high_voltage_lvl].dtbclk_mhz; - low_pstate_lvl.phyclk_d18_mhz = dcn2_1_soc.clock_limits[high_voltage_lvl].phyclk_d18_mhz; - low_pstate_lvl.phyclk_mhz = dcn2_1_soc.clock_limits[high_voltage_lvl].phyclk_mhz; - - for (i = clk_table->num_entries; i > 1; i--) - clk_table->entries[i] = clk_table->entries[i-1]; - clk_table->entries[1] = clk_table->entries[0]; - clk_table->num_entries++; - - return low_pstate_lvl; -} - -static void update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params) -{ - struct dcn21_resource_pool *pool = TO_DCN21_RES_POOL(dc->res_pool); - struct clk_limit_table *clk_table = &bw_params->clk_table; - struct _vcs_dpi_voltage_scaling_st clock_limits[DC__VOLTAGE_STATES]; - unsigned int i, closest_clk_lvl = 0, k = 0; - int j; - - dcn2_1_ip.max_num_otg = pool->base.res_cap->num_timing_generator; - dcn2_1_ip.max_num_dpp = pool->base.pipe_count; - dcn2_1_soc.num_chans = bw_params->num_channels; - - ASSERT(clk_table->num_entries); - /* Copy dcn2_1_soc.clock_limits to clock_limits to avoid copying over null states later */ - for (i = 0; i < dcn2_1_soc.num_states + 1; i++) { - clock_limits[i] = dcn2_1_soc.clock_limits[i]; - } - - for (i = 0; i < clk_table->num_entries; i++) { - /* loop backwards*/ - for (closest_clk_lvl = 0, j = dcn2_1_soc.num_states - 1; j >= 0; j--) { - if ((unsigned int) dcn2_1_soc.clock_limits[j].dcfclk_mhz <= clk_table->entries[i].dcfclk_mhz) { - closest_clk_lvl = j; - break; - } - } - - /* clk_table[1] is reserved for min DF PState. skip here to fill in later. */ - if (i == 1) - k++; - - clock_limits[k].state = k; - clock_limits[k].dcfclk_mhz = clk_table->entries[i].dcfclk_mhz; - clock_limits[k].fabricclk_mhz = clk_table->entries[i].fclk_mhz; - clock_limits[k].socclk_mhz = clk_table->entries[i].socclk_mhz; - clock_limits[k].dram_speed_mts = clk_table->entries[i].memclk_mhz * 2; - - clock_limits[k].dispclk_mhz = dcn2_1_soc.clock_limits[closest_clk_lvl].dispclk_mhz; - clock_limits[k].dppclk_mhz = dcn2_1_soc.clock_limits[closest_clk_lvl].dppclk_mhz; - clock_limits[k].dram_bw_per_chan_gbps = dcn2_1_soc.clock_limits[closest_clk_lvl].dram_bw_per_chan_gbps; - clock_limits[k].dscclk_mhz = dcn2_1_soc.clock_limits[closest_clk_lvl].dscclk_mhz; - clock_limits[k].dtbclk_mhz = dcn2_1_soc.clock_limits[closest_clk_lvl].dtbclk_mhz; - clock_limits[k].phyclk_d18_mhz = dcn2_1_soc.clock_limits[closest_clk_lvl].phyclk_d18_mhz; - clock_limits[k].phyclk_mhz = dcn2_1_soc.clock_limits[closest_clk_lvl].phyclk_mhz; - - k++; - } - for (i = 0; i < clk_table->num_entries + 1; i++) - dcn2_1_soc.clock_limits[i] = clock_limits[i]; - if (clk_table->num_entries) { - dcn2_1_soc.num_states = clk_table->num_entries + 1; - /* fill in min DF PState */ - dcn2_1_soc.clock_limits[1] = construct_low_pstate_lvl(clk_table, closest_clk_lvl); - /* duplicate last level */ - dcn2_1_soc.clock_limits[dcn2_1_soc.num_states] = dcn2_1_soc.clock_limits[dcn2_1_soc.num_states - 1]; - dcn2_1_soc.clock_limits[dcn2_1_soc.num_states].state = dcn2_1_soc.num_states; - } - - dml_init_instance(&dc->dml, &dcn2_1_soc, &dcn2_1_ip, DML_PROJECT_DCN21); -} - static struct pp_smu_funcs *dcn21_pp_smu_create(struct dc_context *ctx) { struct pp_smu_funcs *pp_smu = kzalloc(sizeof(*pp_smu), GFP_KERNEL); @@ -1898,24 +1378,6 @@ static uint32_t read_pipe_fuses(struct dc_context *ctx) return value; } -static int dcn21_populate_dml_pipes_from_context( - struct dc *dc, - struct dc_state *context, - display_e2e_pipe_params_st *pipes, - bool fast_validate) -{ - uint32_t pipe_cnt = dcn20_populate_dml_pipes_from_context(dc, context, pipes, fast_validate); - int i; - - for (i = 0; i < pipe_cnt; i++) { - - pipes[i].pipe.src.hostvm = dc->res_pool->hubbub->riommu_active; - pipes[i].pipe.src.gpuvm = 1; - } - - return pipe_cnt; -} - static enum dc_status dcn21_patch_unknown_plane_state(struct dc_plane_state *plane_state) { enum dc_status result = DC_OK; @@ -1943,7 +1405,7 @@ static const struct resource_funcs dcn21_res_pool_funcs = { .patch_unknown_plane_state = dcn21_patch_unknown_plane_state, .set_mcif_arb_params = dcn20_set_mcif_arb_params, .find_first_free_match_stream_enc_for_link = dcn10_find_first_free_match_stream_enc_for_link, - .update_bw_bounding_box = update_bw_bounding_box + .update_bw_bounding_box = dcn21_update_bw_bounding_box, }; static bool dcn21_resource_construct( diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.h b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.h index a27355171bca..f7ecc002c2f7 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.h +++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.h @@ -35,11 +35,22 @@ struct dc; struct resource_pool; struct _vcs_dpi_display_pipe_params_st; +extern struct _vcs_dpi_ip_params_st dcn2_1_ip; +extern struct _vcs_dpi_soc_bounding_box_st dcn2_1_soc; + struct dcn21_resource_pool { struct resource_pool base; }; struct resource_pool *dcn21_create_resource_pool( const struct dc_init_data *init_data, struct dc *dc); +bool dcn21_fast_validate_bw( + struct dc *dc, + struct dc_state *context, + display_e2e_pipe_params_st *pipes, + int *pipe_cnt_out, + int *pipe_split_from, + int *vlevel_out, + bool fast_validate); #endif /* _DCN21_RESOURCE_H_ */ diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c index b7adc9b6a543..bfdb4b78f571 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c @@ -27,10 +27,21 @@ #include "resource.h" #include "clk_mgr.h" #include "dc_link_dp.h" +#include "dchubbub.h" #include "dcn20/dcn20_resource.h" +#include "dcn21/dcn21_resource.h" #include "dcn20_fpu.h" +#define DC_LOGGER_INIT(logger) + +#ifndef MAX +#define MAX(X, Y) ((X) > (Y) ? (X) : (Y)) +#endif +#ifndef MIN +#define MIN(X, Y) ((X) < (Y) ? (X) : (Y)) +#endif + /** * DOC: DCN2x FPU manipulation Overview * @@ -426,7 +437,219 @@ struct _vcs_dpi_soc_bounding_box_st dcn2_0_nv14_soc = { struct _vcs_dpi_soc_bounding_box_st dcn2_0_nv12_soc = { 0 }; -#define DC_LOGGER_INIT(logger) +struct _vcs_dpi_ip_params_st dcn2_1_ip = { + .odm_capable = 1, + .gpuvm_enable = 1, + .hostvm_enable = 1, + .gpuvm_max_page_table_levels = 1, + .hostvm_max_page_table_levels = 4, + .hostvm_cached_page_table_levels = 2, + .num_dsc = 3, + .rob_buffer_size_kbytes = 168, + .det_buffer_size_kbytes = 164, + .dpte_buffer_size_in_pte_reqs_luma = 44, + .dpte_buffer_size_in_pte_reqs_chroma = 42,//todo + .dpp_output_buffer_pixels = 2560, + .opp_output_buffer_lines = 1, + .pixel_chunk_size_kbytes = 8, + .pte_enable = 1, + .max_page_table_levels = 4, + .pte_chunk_size_kbytes = 2, + .meta_chunk_size_kbytes = 2, + .min_meta_chunk_size_bytes = 256, + .writeback_chunk_size_kbytes = 2, + .line_buffer_size_bits = 789504, + .is_line_buffer_bpp_fixed = 0, + .line_buffer_fixed_bpp = 0, + .dcc_supported = true, + .max_line_buffer_lines = 12, + .writeback_luma_buffer_size_kbytes = 12, + .writeback_chroma_buffer_size_kbytes = 8, + .writeback_chroma_line_buffer_width_pixels = 4, + .writeback_max_hscl_ratio = 1, + .writeback_max_vscl_ratio = 1, + .writeback_min_hscl_ratio = 1, + .writeback_min_vscl_ratio = 1, + .writeback_max_hscl_taps = 12, + .writeback_max_vscl_taps = 12, + .writeback_line_buffer_luma_buffer_size = 0, + .writeback_line_buffer_chroma_buffer_size = 14643, + .cursor_buffer_size = 8, + .cursor_chunk_size = 2, + .max_num_otg = 4, + .max_num_dpp = 4, + .max_num_wb = 1, + .max_dchub_pscl_bw_pix_per_clk = 4, + .max_pscl_lb_bw_pix_per_clk = 2, + .max_lb_vscl_bw_pix_per_clk = 4, + .max_vscl_hscl_bw_pix_per_clk = 4, + .max_hscl_ratio = 4, + .max_vscl_ratio = 4, + .hscl_mults = 4, + .vscl_mults = 4, + .max_hscl_taps = 8, + .max_vscl_taps = 8, + .dispclk_ramp_margin_percent = 1, + .underscan_factor = 1.10, + .min_vblank_lines = 32, // + .dppclk_delay_subtotal = 77, // + .dppclk_delay_scl_lb_only = 16, + .dppclk_delay_scl = 50, + .dppclk_delay_cnvc_formatter = 8, + .dppclk_delay_cnvc_cursor = 6, + .dispclk_delay_subtotal = 87, // + .dcfclk_cstate_latency = 10, // SRExitTime + .max_inter_dcn_tile_repeaters = 8, + + .xfc_supported = false, + .xfc_fill_bw_overhead_percent = 10.0, + .xfc_fill_constant_bytes = 0, + .ptoi_supported = 0, + .number_of_cursors = 1, +}; + +struct _vcs_dpi_soc_bounding_box_st dcn2_1_soc = { + .clock_limits = { + { + .state = 0, + .dcfclk_mhz = 400.0, + .fabricclk_mhz = 400.0, + .dispclk_mhz = 600.0, + .dppclk_mhz = 400.00, + .phyclk_mhz = 600.0, + .socclk_mhz = 278.0, + .dscclk_mhz = 205.67, + .dram_speed_mts = 1600.0, + }, + { + .state = 1, + .dcfclk_mhz = 464.52, + .fabricclk_mhz = 800.0, + .dispclk_mhz = 654.55, + .dppclk_mhz = 626.09, + .phyclk_mhz = 600.0, + .socclk_mhz = 278.0, + .dscclk_mhz = 205.67, + .dram_speed_mts = 1600.0, + }, + { + .state = 2, + .dcfclk_mhz = 514.29, + .fabricclk_mhz = 933.0, + .dispclk_mhz = 757.89, + .dppclk_mhz = 685.71, + .phyclk_mhz = 600.0, + .socclk_mhz = 278.0, + .dscclk_mhz = 287.67, + .dram_speed_mts = 1866.0, + }, + { + .state = 3, + .dcfclk_mhz = 576.00, + .fabricclk_mhz = 1067.0, + .dispclk_mhz = 847.06, + .dppclk_mhz = 757.89, + .phyclk_mhz = 600.0, + .socclk_mhz = 715.0, + .dscclk_mhz = 318.334, + .dram_speed_mts = 2134.0, + }, + { + .state = 4, + .dcfclk_mhz = 626.09, + .fabricclk_mhz = 1200.0, + .dispclk_mhz = 900.00, + .dppclk_mhz = 847.06, + .phyclk_mhz = 810.0, + .socclk_mhz = 953.0, + .dscclk_mhz = 489.0, + .dram_speed_mts = 2400.0, + }, + { + .state = 5, + .dcfclk_mhz = 685.71, + .fabricclk_mhz = 1333.0, + .dispclk_mhz = 1028.57, + .dppclk_mhz = 960.00, + .phyclk_mhz = 810.0, + .socclk_mhz = 278.0, + .dscclk_mhz = 287.67, + .dram_speed_mts = 2666.0, + }, + { + .state = 6, + .dcfclk_mhz = 757.89, + .fabricclk_mhz = 1467.0, + .dispclk_mhz = 1107.69, + .dppclk_mhz = 1028.57, + .phyclk_mhz = 810.0, + .socclk_mhz = 715.0, + .dscclk_mhz = 318.334, + .dram_speed_mts = 3200.0, + }, + { + .state = 7, + .dcfclk_mhz = 847.06, + .fabricclk_mhz = 1600.0, + .dispclk_mhz = 1395.0, + .dppclk_mhz = 1285.00, + .phyclk_mhz = 1325.0, + .socclk_mhz = 953.0, + .dscclk_mhz = 489.0, + .dram_speed_mts = 4266.0, + }, + /*Extra state, no dispclk ramping*/ + { + .state = 8, + .dcfclk_mhz = 847.06, + .fabricclk_mhz = 1600.0, + .dispclk_mhz = 1395.0, + .dppclk_mhz = 1285.0, + .phyclk_mhz = 1325.0, + .socclk_mhz = 953.0, + .dscclk_mhz = 489.0, + .dram_speed_mts = 4266.0, + }, + + }, + + .sr_exit_time_us = 12.5, + .sr_enter_plus_exit_time_us = 17.0, + .urgent_latency_us = 4.0, + .urgent_latency_pixel_data_only_us = 4.0, + .urgent_latency_pixel_mixed_with_vm_data_us = 4.0, + .urgent_latency_vm_data_only_us = 4.0, + .urgent_out_of_order_return_per_channel_pixel_only_bytes = 4096, + .urgent_out_of_order_return_per_channel_pixel_and_vm_bytes = 4096, + .urgent_out_of_order_return_per_channel_vm_only_bytes = 4096, + .pct_ideal_dram_sdp_bw_after_urgent_pixel_only = 80.0, + .pct_ideal_dram_sdp_bw_after_urgent_pixel_and_vm = 75.0, + .pct_ideal_dram_sdp_bw_after_urgent_vm_only = 40.0, + .max_avg_sdp_bw_use_normal_percent = 60.0, + .max_avg_dram_bw_use_normal_percent = 100.0, + .writeback_latency_us = 12.0, + .max_request_size_bytes = 256, + .dram_channel_width_bytes = 4, + .fabric_datapath_to_dcn_data_return_bytes = 32, + .dcn_downspread_percent = 0.5, + .downspread_percent = 0.38, + .dram_page_open_time_ns = 50.0, + .dram_rw_turnaround_time_ns = 17.5, + .dram_return_buffer_per_channel_bytes = 8192, + .round_trip_ping_latency_dcfclk_cycles = 128, + .urgent_out_of_order_return_per_channel_bytes = 4096, + .channel_interleave_bytes = 256, + .num_banks = 8, + .num_chans = 4, + .vmm_page_size_bytes = 4096, + .dram_clock_change_latency_us = 23.84, + .return_bus_width_bytes = 64, + .dispclk_dppclk_vco_speed_mhz = 3600, + .xfc_bus_transport_time_us = 4, + .xfc_xbuf_latency_tolerance_us = 4, + .use_urgent_burst_bw = 1, + .num_states = 8 +}; void dcn20_populate_dml_writeback_from_context(struct dc *dc, struct resource_context *res_ctx, @@ -1485,3 +1708,316 @@ void dcn20_fpu_adjust_dppclk(struct vba_vars_st *v, else v->RequiredDPPCLK[vlevel][max_mpc_comb][pipe_idx] /= 2; } + +int dcn21_populate_dml_pipes_from_context(struct dc *dc, + struct dc_state *context, + display_e2e_pipe_params_st *pipes, + bool fast_validate) +{ + uint32_t pipe_cnt; + int i; + + dc_assert_fp_enabled(); + + pipe_cnt = dcn20_populate_dml_pipes_from_context(dc, context, pipes, fast_validate); + + for (i = 0; i < pipe_cnt; i++) { + + pipes[i].pipe.src.hostvm = dc->res_pool->hubbub->riommu_active; + pipes[i].pipe.src.gpuvm = 1; + } + + return pipe_cnt; +} + +static void patch_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_st *bb) +{ + int i; + + if (dc->bb_overrides.sr_exit_time_ns) { + for (i = 0; i < WM_SET_COUNT; i++) { + dc->clk_mgr->bw_params->wm_table.entries[i].sr_exit_time_us = + dc->bb_overrides.sr_exit_time_ns / 1000.0; + } + } + + if (dc->bb_overrides.sr_enter_plus_exit_time_ns) { + for (i = 0; i < WM_SET_COUNT; i++) { + dc->clk_mgr->bw_params->wm_table.entries[i].sr_enter_plus_exit_time_us = + dc->bb_overrides.sr_enter_plus_exit_time_ns / 1000.0; + } + } + + if (dc->bb_overrides.urgent_latency_ns) { + bb->urgent_latency_us = dc->bb_overrides.urgent_latency_ns / 1000.0; + } + + if (dc->bb_overrides.dram_clock_change_latency_ns) { + for (i = 0; i < WM_SET_COUNT; i++) { + dc->clk_mgr->bw_params->wm_table.entries[i].pstate_latency_us = + dc->bb_overrides.dram_clock_change_latency_ns / 1000.0; + } + } +} + +static void calculate_wm_set_for_vlevel(int vlevel, + struct wm_range_table_entry *table_entry, + struct dcn_watermarks *wm_set, + struct display_mode_lib *dml, + display_e2e_pipe_params_st *pipes, + int pipe_cnt) +{ + double dram_clock_change_latency_cached = dml->soc.dram_clock_change_latency_us; + + ASSERT(vlevel < dml->soc.num_states); + /* only pipe 0 is read for voltage and dcf/soc clocks */ + pipes[0].clks_cfg.voltage = vlevel; + pipes[0].clks_cfg.dcfclk_mhz = dml->soc.clock_limits[vlevel].dcfclk_mhz; + pipes[0].clks_cfg.socclk_mhz = dml->soc.clock_limits[vlevel].socclk_mhz; + + dml->soc.dram_clock_change_latency_us = table_entry->pstate_latency_us; + dml->soc.sr_exit_time_us = table_entry->sr_exit_time_us; + dml->soc.sr_enter_plus_exit_time_us = table_entry->sr_enter_plus_exit_time_us; + + wm_set->urgent_ns = get_wm_urgent(dml, pipes, pipe_cnt) * 1000; + wm_set->cstate_pstate.cstate_enter_plus_exit_ns = get_wm_stutter_enter_exit(dml, pipes, pipe_cnt) * 1000; + wm_set->cstate_pstate.cstate_exit_ns = get_wm_stutter_exit(dml, pipes, pipe_cnt) * 1000; + wm_set->cstate_pstate.pstate_change_ns = get_wm_dram_clock_change(dml, pipes, pipe_cnt) * 1000; + wm_set->pte_meta_urgent_ns = get_wm_memory_trip(dml, pipes, pipe_cnt) * 1000; + wm_set->frac_urg_bw_nom = get_fraction_of_urgent_bandwidth(dml, pipes, pipe_cnt) * 1000; + wm_set->frac_urg_bw_flip = get_fraction_of_urgent_bandwidth_imm_flip(dml, pipes, pipe_cnt) * 1000; + wm_set->urgent_latency_ns = get_urgent_latency(dml, pipes, pipe_cnt) * 1000; + dml->soc.dram_clock_change_latency_us = dram_clock_change_latency_cached; +} + +static void dcn21_calculate_wm(struct dc *dc, struct dc_state *context, + display_e2e_pipe_params_st *pipes, + int *out_pipe_cnt, + int *pipe_split_from, + int vlevel_req, + bool fast_validate) +{ + int pipe_cnt, i, pipe_idx; + int vlevel, vlevel_max; + struct wm_range_table_entry *table_entry; + struct clk_bw_params *bw_params = dc->clk_mgr->bw_params; + + ASSERT(bw_params); + + patch_bounding_box(dc, &context->bw_ctx.dml.soc); + + for (i = 0, pipe_idx = 0, pipe_cnt = 0; i < dc->res_pool->pipe_count; i++) { + if (!context->res_ctx.pipe_ctx[i].stream) + continue; + + pipes[pipe_cnt].clks_cfg.refclk_mhz = dc->res_pool->ref_clocks.dchub_ref_clock_inKhz / 1000.0; + pipes[pipe_cnt].clks_cfg.dispclk_mhz = context->bw_ctx.dml.vba.RequiredDISPCLK[vlevel_req][context->bw_ctx.dml.vba.maxMpcComb]; + + if (pipe_split_from[i] < 0) { + pipes[pipe_cnt].clks_cfg.dppclk_mhz = + context->bw_ctx.dml.vba.RequiredDPPCLK[vlevel_req][context->bw_ctx.dml.vba.maxMpcComb][pipe_idx]; + if (context->bw_ctx.dml.vba.BlendingAndTiming[pipe_idx] == pipe_idx) + pipes[pipe_cnt].pipe.dest.odm_combine = + context->bw_ctx.dml.vba.ODMCombineEnablePerState[vlevel_req][pipe_idx]; + else + pipes[pipe_cnt].pipe.dest.odm_combine = 0; + pipe_idx++; + } else { + pipes[pipe_cnt].clks_cfg.dppclk_mhz = + context->bw_ctx.dml.vba.RequiredDPPCLK[vlevel_req][context->bw_ctx.dml.vba.maxMpcComb][pipe_split_from[i]]; + if (context->bw_ctx.dml.vba.BlendingAndTiming[pipe_split_from[i]] == pipe_split_from[i]) + pipes[pipe_cnt].pipe.dest.odm_combine = + context->bw_ctx.dml.vba.ODMCombineEnablePerState[vlevel_req][pipe_split_from[i]]; + else + pipes[pipe_cnt].pipe.dest.odm_combine = 0; + } + pipe_cnt++; + } + + if (pipe_cnt != pipe_idx) { + if (dc->res_pool->funcs->populate_dml_pipes) + pipe_cnt = dc->res_pool->funcs->populate_dml_pipes(dc, + context, pipes, fast_validate); + else + pipe_cnt = dcn21_populate_dml_pipes_from_context(dc, + context, pipes, fast_validate); + } + + *out_pipe_cnt = pipe_cnt; + + vlevel_max = bw_params->clk_table.num_entries - 1; + + + /* WM Set D */ + table_entry = &bw_params->wm_table.entries[WM_D]; + if (table_entry->wm_type == WM_TYPE_RETRAINING) + vlevel = 0; + else + vlevel = vlevel_max; + calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.watermarks.d, + &context->bw_ctx.dml, pipes, pipe_cnt); + /* WM Set C */ + table_entry = &bw_params->wm_table.entries[WM_C]; + vlevel = MIN(MAX(vlevel_req, 3), vlevel_max); + calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.watermarks.c, + &context->bw_ctx.dml, pipes, pipe_cnt); + /* WM Set B */ + table_entry = &bw_params->wm_table.entries[WM_B]; + vlevel = MIN(MAX(vlevel_req, 2), vlevel_max); + calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.watermarks.b, + &context->bw_ctx.dml, pipes, pipe_cnt); + + /* WM Set A */ + table_entry = &bw_params->wm_table.entries[WM_A]; + vlevel = MIN(vlevel_req, vlevel_max); + calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.watermarks.a, + &context->bw_ctx.dml, pipes, pipe_cnt); +} + +bool dcn21_validate_bandwidth_fp(struct dc *dc, + struct dc_state *context, + bool fast_validate) +{ + bool out = false; + + BW_VAL_TRACE_SETUP(); + + int vlevel = 0; + int pipe_split_from[MAX_PIPES]; + int pipe_cnt = 0; + display_e2e_pipe_params_st *pipes = kzalloc(dc->res_pool->pipe_count * sizeof(display_e2e_pipe_params_st), GFP_ATOMIC); + DC_LOGGER_INIT(dc->ctx->logger); + + BW_VAL_TRACE_COUNT(); + + dc_assert_fp_enabled(); + + /*Unsafe due to current pipe merge and split logic*/ + ASSERT(context != dc->current_state); + + out = dcn21_fast_validate_bw(dc, context, pipes, &pipe_cnt, pipe_split_from, &vlevel, fast_validate); + + if (pipe_cnt == 0) + goto validate_out; + + if (!out) + goto validate_fail; + + BW_VAL_TRACE_END_VOLTAGE_LEVEL(); + + if (fast_validate) { + BW_VAL_TRACE_SKIP(fast); + goto validate_out; + } + + dcn21_calculate_wm(dc, context, pipes, &pipe_cnt, pipe_split_from, vlevel, fast_validate); + dcn20_calculate_dlg_params(dc, context, pipes, pipe_cnt, vlevel); + + BW_VAL_TRACE_END_WATERMARKS(); + + goto validate_out; + +validate_fail: + DC_LOG_WARNING("Mode Validation Warning: %s failed validation.\n", + dml_get_status_message(context->bw_ctx.dml.vba.ValidationStatus[context->bw_ctx.dml.vba.soc.num_states])); + + BW_VAL_TRACE_SKIP(fail); + out = false; + +validate_out: + kfree(pipes); + + BW_VAL_TRACE_FINISH(); + + return out; +} + +static struct _vcs_dpi_voltage_scaling_st construct_low_pstate_lvl(struct clk_limit_table *clk_table, unsigned int high_voltage_lvl) +{ + struct _vcs_dpi_voltage_scaling_st low_pstate_lvl; + int i; + + low_pstate_lvl.state = 1; + low_pstate_lvl.dcfclk_mhz = clk_table->entries[0].dcfclk_mhz; + low_pstate_lvl.fabricclk_mhz = clk_table->entries[0].fclk_mhz; + low_pstate_lvl.socclk_mhz = clk_table->entries[0].socclk_mhz; + low_pstate_lvl.dram_speed_mts = clk_table->entries[0].memclk_mhz * 2; + + low_pstate_lvl.dispclk_mhz = dcn2_1_soc.clock_limits[high_voltage_lvl].dispclk_mhz; + low_pstate_lvl.dppclk_mhz = dcn2_1_soc.clock_limits[high_voltage_lvl].dppclk_mhz; + low_pstate_lvl.dram_bw_per_chan_gbps = dcn2_1_soc.clock_limits[high_voltage_lvl].dram_bw_per_chan_gbps; + low_pstate_lvl.dscclk_mhz = dcn2_1_soc.clock_limits[high_voltage_lvl].dscclk_mhz; + low_pstate_lvl.dtbclk_mhz = dcn2_1_soc.clock_limits[high_voltage_lvl].dtbclk_mhz; + low_pstate_lvl.phyclk_d18_mhz = dcn2_1_soc.clock_limits[high_voltage_lvl].phyclk_d18_mhz; + low_pstate_lvl.phyclk_mhz = dcn2_1_soc.clock_limits[high_voltage_lvl].phyclk_mhz; + + for (i = clk_table->num_entries; i > 1; i--) + clk_table->entries[i] = clk_table->entries[i-1]; + clk_table->entries[1] = clk_table->entries[0]; + clk_table->num_entries++; + + return low_pstate_lvl; +} + +void dcn21_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params) +{ + struct dcn21_resource_pool *pool = TO_DCN21_RES_POOL(dc->res_pool); + struct clk_limit_table *clk_table = &bw_params->clk_table; + struct _vcs_dpi_voltage_scaling_st clock_limits[DC__VOLTAGE_STATES]; + unsigned int i, closest_clk_lvl = 0, k = 0; + int j; + + dc_assert_fp_enabled(); + + dcn2_1_ip.max_num_otg = pool->base.res_cap->num_timing_generator; + dcn2_1_ip.max_num_dpp = pool->base.pipe_count; + dcn2_1_soc.num_chans = bw_params->num_channels; + + ASSERT(clk_table->num_entries); + /* Copy dcn2_1_soc.clock_limits to clock_limits to avoid copying over null states later */ + for (i = 0; i < dcn2_1_soc.num_states + 1; i++) { + clock_limits[i] = dcn2_1_soc.clock_limits[i]; + } + + for (i = 0; i < clk_table->num_entries; i++) { + /* loop backwards*/ + for (closest_clk_lvl = 0, j = dcn2_1_soc.num_states - 1; j >= 0; j--) { + if ((unsigned int) dcn2_1_soc.clock_limits[j].dcfclk_mhz <= clk_table->entries[i].dcfclk_mhz) { + closest_clk_lvl = j; + break; + } + } + + /* clk_table[1] is reserved for min DF PState. skip here to fill in later. */ + if (i == 1) + k++; + + clock_limits[k].state = k; + clock_limits[k].dcfclk_mhz = clk_table->entries[i].dcfclk_mhz; + clock_limits[k].fabricclk_mhz = clk_table->entries[i].fclk_mhz; + clock_limits[k].socclk_mhz = clk_table->entries[i].socclk_mhz; + clock_limits[k].dram_speed_mts = clk_table->entries[i].memclk_mhz * 2; + + clock_limits[k].dispclk_mhz = dcn2_1_soc.clock_limits[closest_clk_lvl].dispclk_mhz; + clock_limits[k].dppclk_mhz = dcn2_1_soc.clock_limits[closest_clk_lvl].dppclk_mhz; + clock_limits[k].dram_bw_per_chan_gbps = dcn2_1_soc.clock_limits[closest_clk_lvl].dram_bw_per_chan_gbps; + clock_limits[k].dscclk_mhz = dcn2_1_soc.clock_limits[closest_clk_lvl].dscclk_mhz; + clock_limits[k].dtbclk_mhz = dcn2_1_soc.clock_limits[closest_clk_lvl].dtbclk_mhz; + clock_limits[k].phyclk_d18_mhz = dcn2_1_soc.clock_limits[closest_clk_lvl].phyclk_d18_mhz; + clock_limits[k].phyclk_mhz = dcn2_1_soc.clock_limits[closest_clk_lvl].phyclk_mhz; + + k++; + } + for (i = 0; i < clk_table->num_entries + 1; i++) + dcn2_1_soc.clock_limits[i] = clock_limits[i]; + if (clk_table->num_entries) { + dcn2_1_soc.num_states = clk_table->num_entries + 1; + /* fill in min DF PState */ + dcn2_1_soc.clock_limits[1] = construct_low_pstate_lvl(clk_table, closest_clk_lvl); + /* duplicate last level */ + dcn2_1_soc.clock_limits[dcn2_1_soc.num_states] = dcn2_1_soc.clock_limits[dcn2_1_soc.num_states - 1]; + dcn2_1_soc.clock_limits[dcn2_1_soc.num_states].state = dcn2_1_soc.num_states; + } + + dml_init_instance(&dc->dml, &dcn2_1_soc, &dcn2_1_ip, DML_PROJECT_DCN21); +} diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h index 6b1f4126bc88..da38fa10f077 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h @@ -71,4 +71,13 @@ void dcn20_fpu_adjust_dppclk(struct vba_vars_st *v, int max_mpc_comb, int pipe_idx, bool is_validating_bw); + +int dcn21_populate_dml_pipes_from_context(struct dc *dc, + struct dc_state *context, + display_e2e_pipe_params_st *pipes, + bool fast_validate); +bool dcn21_validate_bandwidth_fp(struct dc *dc, + struct dc_state *context, + bool fast_validate); +void dcn21_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params); #endif /* __DCN20_FPU_H__ */ From patchwork Mon Feb 28 21:10:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Melissa Wen X-Patchwork-Id: 12763806 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E255AC433EF for ; Mon, 28 Feb 2022 21:11:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id F22F710E8DB; Mon, 28 Feb 2022 21:11:53 +0000 (UTC) Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) by gabe.freedesktop.org (Postfix) with ESMTPS id A0D9810E8D9; Mon, 28 Feb 2022 21:11:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To:Content-Type:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=hlxuHAMRU3IMg6hj7WJxf5M11wcx0ijnT9+HHKNCJmY=; b=obQcFpsBHStgolqorqjWeQ4+Bo dXzFLLVuEKqbAFgwEib3HJ8yHcdiLk6GQafUZkwZCbg311p9psRwTkoiDNGZDzqVwgH4mB43qAffI wgkft3zSyMGjVhI7uN5KL5KXbp8P9Lyk2GVEDopiXcD/wu4gjBIjXGS6wuySB3aXBh7FAjpO69o1I OX/VrbntjRoQ5w+G5ML18oJnElP9XyxqL0+ywMPy+pl9h5t0K4kN3MU/7RDyvR371hCIUJBza2Deu B6bMOXgiOfvhwSrvMtaR+t7PNs39lSB+fs7FS4ViQq72Ph1zV4PNQ/z+3EeEbGzQkvXg/jlYCoxSb lePgc7Mw==; Received: from [165.90.126.25] (helo=killbill.home) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1nOnJ7-000CFy-Gv; Mon, 28 Feb 2022 22:11:49 +0100 From: Melissa Wen To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, harry.wentland@amd.com, sunpeng.li@amd.com, Rodrigo.Siqueira@amd.com, alexander.deucher@amd.com, christian.koenig@amd.com, Xinhui.Pan@amd.com, airlied@linux.ie, daniel@ffwll.ch Subject: [PATCH 2/2] drm/amd/display: move FPU code from dcn10 to dml/dcn10 folder Date: Mon, 28 Feb 2022 20:10:47 -0100 Message-Id: <20220228211047.3957945-3-mwen@igalia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220228211047.3957945-1-mwen@igalia.com> References: <20220228211047.3957945-1-mwen@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Melissa Wen , Qingqing Zhuo , Dmytro Laktyushkin , linux-kernel@vger.kernel.org, Jasdeep Dhillon Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" FPU operations in dcn10 was already moved to dml folder via calcs code. However, dcn1_0_ip and dcn_1_0_soc with FPU componentd remains on dcn10. Following previous changes to isolate FPU, this patch creates dcn10_fpu files to isolate FPU-specific code and moves those structs to it. Signed-off-by: Melissa Wen --- .../drm/amd/display/dc/dcn10/dcn10_resource.c | 62 --------- .../drm/amd/display/dc/dcn10/dcn10_resource.h | 4 + drivers/gpu/drm/amd/display/dc/dml/Makefile | 2 + .../drm/amd/display/dc/dml/dcn10/dcn10_fpu.c | 124 ++++++++++++++++++ .../drm/amd/display/dc/dml/dcn10/dcn10_fpu.h | 30 +++++ 5 files changed, 160 insertions(+), 62 deletions(-) create mode 100644 drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.c create mode 100644 drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.h diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c index 858b72149897..ac96242cc474 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c @@ -70,68 +70,6 @@ #include "dce/dce_aux.h" #include "dce/dce_i2c.h" -const struct _vcs_dpi_ip_params_st dcn1_0_ip = { - .rob_buffer_size_kbytes = 64, - .det_buffer_size_kbytes = 164, - .dpte_buffer_size_in_pte_reqs_luma = 42, - .dpp_output_buffer_pixels = 2560, - .opp_output_buffer_lines = 1, - .pixel_chunk_size_kbytes = 8, - .pte_enable = 1, - .pte_chunk_size_kbytes = 2, - .meta_chunk_size_kbytes = 2, - .writeback_chunk_size_kbytes = 2, - .line_buffer_size_bits = 589824, - .max_line_buffer_lines = 12, - .IsLineBufferBppFixed = 0, - .LineBufferFixedBpp = -1, - .writeback_luma_buffer_size_kbytes = 12, - .writeback_chroma_buffer_size_kbytes = 8, - .max_num_dpp = 4, - .max_num_wb = 2, - .max_dchub_pscl_bw_pix_per_clk = 4, - .max_pscl_lb_bw_pix_per_clk = 2, - .max_lb_vscl_bw_pix_per_clk = 4, - .max_vscl_hscl_bw_pix_per_clk = 4, - .max_hscl_ratio = 4, - .max_vscl_ratio = 4, - .hscl_mults = 4, - .vscl_mults = 4, - .max_hscl_taps = 8, - .max_vscl_taps = 8, - .dispclk_ramp_margin_percent = 1, - .underscan_factor = 1.10, - .min_vblank_lines = 14, - .dppclk_delay_subtotal = 90, - .dispclk_delay_subtotal = 42, - .dcfclk_cstate_latency = 10, - .max_inter_dcn_tile_repeaters = 8, - .can_vstartup_lines_exceed_vsync_plus_back_porch_lines_minus_one = 0, - .bug_forcing_LC_req_same_size_fixed = 0, -}; - -const struct _vcs_dpi_soc_bounding_box_st dcn1_0_soc = { - .sr_exit_time_us = 9.0, - .sr_enter_plus_exit_time_us = 11.0, - .urgent_latency_us = 4.0, - .writeback_latency_us = 12.0, - .ideal_dram_bw_after_urgent_percent = 80.0, - .max_request_size_bytes = 256, - .downspread_percent = 0.5, - .dram_page_open_time_ns = 50.0, - .dram_rw_turnaround_time_ns = 17.5, - .dram_return_buffer_per_channel_bytes = 8192, - .round_trip_ping_latency_dcfclk_cycles = 128, - .urgent_out_of_order_return_per_channel_bytes = 256, - .channel_interleave_bytes = 256, - .num_banks = 8, - .num_chans = 2, - .vmm_page_size_bytes = 4096, - .dram_clock_change_latency_us = 17.0, - .writeback_dram_clock_change_latency_us = 23.0, - .return_bus_width_bytes = 64, -}; - #ifndef mmDP0_DP_DPHY_INTERNAL_CTRL #define mmDP0_DP_DPHY_INTERNAL_CTRL 0x210f #define mmDP0_DP_DPHY_INTERNAL_CTRL_BASE_IDX 2 diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.h b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.h index 633025ccb870..bf8e33cd8147 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.h +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.h @@ -27,6 +27,7 @@ #define __DC_RESOURCE_DCN10_H__ #include "core_types.h" +#include "dml/dcn10/dcn10_fpu.h" #define TO_DCN10_RES_POOL(pool)\ container_of(pool, struct dcn10_resource_pool, base) @@ -35,6 +36,9 @@ struct dc; struct resource_pool; struct _vcs_dpi_display_pipe_params_st; +extern struct _vcs_dpi_ip_params_st dcn1_0_ip; +extern struct _vcs_dpi_soc_bounding_box_st dcn1_0_soc; + struct dcn10_resource_pool { struct resource_pool base; }; diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile b/drivers/gpu/drm/amd/display/dc/dml/Makefile index b16c492593e2..6b7f8b62a56f 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/Makefile +++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile @@ -58,6 +58,7 @@ CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_lib.o := $(dml_ccflags) ifdef CONFIG_DRM_AMD_DC_DCN CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_vba.o := $(dml_ccflags) +CFLAGS_$(AMDDALPATH)/dc/dml/dcn10/dcn10_fpu.o := $(dml_ccflags) CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/dcn20_fpu.o := $(dml_ccflags) CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20.o := $(dml_ccflags) CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20.o := $(dml_ccflags) @@ -105,6 +106,7 @@ DML = calcs/dce_calcs.o calcs/custom_float.o calcs/bw_fixed.o ifdef CONFIG_DRM_AMD_DC_DCN DML += display_mode_lib.o display_rq_dlg_helpers.o dml1_display_rq_dlg_calc.o +DML += dcn10/dcn10_fpu.o DML += dcn20/dcn20_fpu.o DML += display_mode_vba.o dcn20/display_rq_dlg_calc_20.o dcn20/display_mode_vba_20.o DML += dcn20/display_rq_dlg_calc_20v2.o dcn20/display_mode_vba_20v2.o diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.c new file mode 100644 index 000000000000..7f08f49eb7b1 --- /dev/null +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.c @@ -0,0 +1,124 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright 2021 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * Authors: AMD + * + */ + +#include "dcn10/dcn10_resource.h" + +#include "dcn10_fpu.h" + +/** + * DOC: DCN10 FPU manipulation Overview + * + * The DCN architecture relies on FPU operations, which require special + * compilation flags and the use of kernel_fpu_begin/end functions; ideally, we + * want to avoid spreading FPU access across multiple files. With this idea in + * mind, this file aims to centralize DCN10 functions that require FPU access + * in a single place. Code in this file follows the following code pattern: + * + * 1. Functions that use FPU operations should be isolated in static functions. + * 2. The FPU functions should have the noinline attribute to ensure anything + * that deals with FP register is contained within this call. + * 3. All function that needs to be accessed outside this file requires a + * public interface that not uses any FPU reference. + * 4. Developers **must not** use DC_FP_START/END in this file, but they need + * to ensure that the caller invokes it before access any function available + * in this file. For this reason, public functions in this file must invoke + * dc_assert_fp_enabled(); + * + * Let's expand a little bit more the idea in the code pattern. To fully + * isolate FPU operations in a single place, we must avoid situations where + * compilers spill FP values to registers due to FP enable in a specific C + * file. Note that even if we isolate all FPU functions in a single file and + * call its interface from other files, the compiler might enable the use of + * FPU before we call DC_FP_START. Nevertheless, it is the programmer's + * responsibility to invoke DC_FP_START/END in the correct place. To highlight + * situations where developers forgot to use the FP protection before calling + * the DC FPU interface functions, we introduce a helper that checks if the + * function is invoked under FP protection. If not, it will trigger a kernel + * warning. + */ + +struct _vcs_dpi_ip_params_st dcn1_0_ip = { + .rob_buffer_size_kbytes = 64, + .det_buffer_size_kbytes = 164, + .dpte_buffer_size_in_pte_reqs_luma = 42, + .dpp_output_buffer_pixels = 2560, + .opp_output_buffer_lines = 1, + .pixel_chunk_size_kbytes = 8, + .pte_enable = 1, + .pte_chunk_size_kbytes = 2, + .meta_chunk_size_kbytes = 2, + .writeback_chunk_size_kbytes = 2, + .line_buffer_size_bits = 589824, + .max_line_buffer_lines = 12, + .IsLineBufferBppFixed = 0, + .LineBufferFixedBpp = -1, + .writeback_luma_buffer_size_kbytes = 12, + .writeback_chroma_buffer_size_kbytes = 8, + .max_num_dpp = 4, + .max_num_wb = 2, + .max_dchub_pscl_bw_pix_per_clk = 4, + .max_pscl_lb_bw_pix_per_clk = 2, + .max_lb_vscl_bw_pix_per_clk = 4, + .max_vscl_hscl_bw_pix_per_clk = 4, + .max_hscl_ratio = 4, + .max_vscl_ratio = 4, + .hscl_mults = 4, + .vscl_mults = 4, + .max_hscl_taps = 8, + .max_vscl_taps = 8, + .dispclk_ramp_margin_percent = 1, + .underscan_factor = 1.10, + .min_vblank_lines = 14, + .dppclk_delay_subtotal = 90, + .dispclk_delay_subtotal = 42, + .dcfclk_cstate_latency = 10, + .max_inter_dcn_tile_repeaters = 8, + .can_vstartup_lines_exceed_vsync_plus_back_porch_lines_minus_one = 0, + .bug_forcing_LC_req_same_size_fixed = 0, +}; + +struct _vcs_dpi_soc_bounding_box_st dcn1_0_soc = { + .sr_exit_time_us = 9.0, + .sr_enter_plus_exit_time_us = 11.0, + .urgent_latency_us = 4.0, + .writeback_latency_us = 12.0, + .ideal_dram_bw_after_urgent_percent = 80.0, + .max_request_size_bytes = 256, + .downspread_percent = 0.5, + .dram_page_open_time_ns = 50.0, + .dram_rw_turnaround_time_ns = 17.5, + .dram_return_buffer_per_channel_bytes = 8192, + .round_trip_ping_latency_dcfclk_cycles = 128, + .urgent_out_of_order_return_per_channel_bytes = 256, + .channel_interleave_bytes = 256, + .num_banks = 8, + .num_chans = 2, + .vmm_page_size_bytes = 4096, + .dram_clock_change_latency_us = 17.0, + .writeback_dram_clock_change_latency_us = 23.0, + .return_bus_width_bytes = 64, +}; + diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.h new file mode 100644 index 000000000000..e74ed4b4ce5b --- /dev/null +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright 2021 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * Authors: AMD + * + */ + +#ifndef __DCN10_FPU_H__ +#define __DCN10_FPU_H__ + +#endif /* __DCN20_FPU_H__ */