From patchwork Wed Oct 25 07:33:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "M K, Muralidhara"
X-Patchwork-Id: 13435654
From: Muralidhara M K
CC: Muralidhara M K, Yazen Ghannam
Subject: [PATCH 1/7] RAS: Add Address Translation support for MI200
Date: Wed, 25 Oct 2023 07:33:33 +0000
Message-ID: <20231025073339.630093-2-muralimk@amd.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20231025073339.630093-1-muralimk@amd.com>
References: <20231025073339.630093-1-muralimk@amd.com>
Precedence: bulk
X-Mailing-List: linux-edac@vger.kernel.org

From: Muralidhara M K

Add support for address translation on Data Fabric version 3.5

Add new interleaving modes for heterogeneous model support and
adjust how the DRAM address maps are found early in the
translation for certain cases.

Signed-off-by: Muralidhara M K
Co-developed-by: Yazen Ghannam
Signed-off-by: Yazen Ghannam
---
 drivers/ras/amd/atl/dehash.c      |  60 +++++++++++++++++
 drivers/ras/amd/atl/denormalize.c |  11 +++-
 drivers/ras/amd/atl/internal.h    |  15 ++++-
 drivers/ras/amd/atl/map.c         | 105 +++++++++++++++++++++++++++++-
 drivers/ras/amd/atl/reg_fields.h  |  29 +++++++++
 drivers/ras/amd/atl/system.c      |   1 +
 6 files changed, 217 insertions(+), 4 deletions(-)

diff --git a/drivers/ras/amd/atl/dehash.c b/drivers/ras/amd/atl/dehash.c
index e501f2e918d7..5760e6bca194 100644
--- a/drivers/ras/amd/atl/dehash.c
+++ b/drivers/ras/amd/atl/dehash.c
@@ -395,6 +395,61 @@ static int df4p5_dehash_addr(struct addr_ctx *ctx)
 	return 0;
 }
 
+/*
+ * MI200 hash bits
+ *				64K 2M 1G
+ * CSSelect[0] = XOR of addr{8, 16, 21, 30};
+ * CSSelect[1] = XOR of addr{9, 17, 22, 31};
+ * CSSelect[2] = XOR of addr{10, 18, 23, 32};
+ * CSSelect[3] = XOR of addr{11, 19, 24, 33}; - 16 and 32 channel only
+ * CSSelect[4] = XOR of addr{12, 20, 25, 34}; - 32 channel only
+ */
+static int mi200_dehash_addr(struct addr_ctx *ctx)
+{
+	u8 num_intlv_bits = ctx->map.total_intlv_bits;
+	bool hash_ctl_64k, hash_ctl_2M, hash_ctl_1G;
+	u8 hashed_bit, intlv_bit, i;
+
+	/* Assert that interleave bit is 8. */
+	if (ctx->map.intlv_bit_pos != 8) {
+		pr_warn("%s: Invalid interleave bit: %u",
+			__func__, ctx->map.intlv_bit_pos);
+		return -EINVAL;
+	}
+
+	/* Assert that die interleaving is disabled. */
+	if (ctx->map.num_intlv_dies > 1) {
+		pr_warn("%s: Invalid number of interleave dies: %u",
+			__func__, ctx->map.num_intlv_dies);
+		return -EINVAL;
+	}
+
+	/* Assert that socket interleaving is disabled. */
+	if (ctx->map.num_intlv_sockets > 1) {
+		pr_warn("%s: Invalid number of interleave sockets: %u",
+			__func__, ctx->map.num_intlv_sockets);
+		return -EINVAL;
+	}
+
+	hash_ctl_64k = FIELD_GET(DF3_HASH_CTL_64K, ctx->map.ctl);
+	hash_ctl_2M = FIELD_GET(DF3_HASH_CTL_2M, ctx->map.ctl);
+	hash_ctl_1G = FIELD_GET(DF3_HASH_CTL_1G, ctx->map.ctl);
+
+	for (i = 0; i < num_intlv_bits; i++) {
+		intlv_bit = atl_get_bit(8 + i, ctx->ret_addr);
+
+		hashed_bit = intlv_bit;
+		hashed_bit ^= atl_get_bit(8 + i, ctx->ret_addr);
+		hashed_bit ^= atl_get_bit(16 + i, ctx->ret_addr) & hash_ctl_64k;
+		hashed_bit ^= atl_get_bit(21 + i, ctx->ret_addr) & hash_ctl_2M;
+		hashed_bit ^= atl_get_bit(30 + i, ctx->ret_addr) & hash_ctl_1G;
+
+		if (hashed_bit != intlv_bit)
+			ctx->ret_addr ^= BIT_ULL(8 + i);
+	}
+	return 0;
+}
+
 int dehash_address(struct addr_ctx *ctx)
 {
 	switch (ctx->map.intlv_mode) {
@@ -452,6 +507,11 @@ int dehash_address(struct addr_ctx *ctx)
 	case DF4p5_NPS1_16CHAN_2K_HASH:
 		return df4p5_dehash_addr(ctx);
 
+	case MI2_HASH_8CHAN:
+	case MI2_HASH_16CHAN:
+	case MI2_HASH_32CHAN:
+		return mi200_dehash_addr(ctx);
+
 	default:
 		ATL_BAD_INTLV_MODE(ctx->map.intlv_mode);
 		return -EINVAL;
diff --git a/drivers/ras/amd/atl/denormalize.c b/drivers/ras/amd/atl/denormalize.c
index fe1480c8e0d8..03eb1eea68f9 100644
--- a/drivers/ras/amd/atl/denormalize.c
+++ b/drivers/ras/amd/atl/denormalize.c
@@ -16,7 +16,7 @@
  * Returns the Destination Fabric ID. This is the first (lowest)
  * CS Fabric ID used within a DRAM Address map.
  */
-static u16 get_dst_fabric_id(struct addr_ctx *ctx)
+u16 get_dst_fabric_id(struct addr_ctx *ctx)
 {
 	switch (df_cfg.rev) {
 	case DF2:
@@ -97,6 +97,9 @@ static u64 make_space_for_cs_id(struct addr_ctx *ctx)
 	case NOHASH_8CHAN:
 	case NOHASH_16CHAN:
 	case NOHASH_32CHAN:
+	case MI2_HASH_8CHAN:
+	case MI2_HASH_16CHAN:
+	case MI2_HASH_32CHAN:
 	case DF2_2CHAN_HASH:
 		return make_space_for_cs_id_at_intlv_bit(ctx);
 
@@ -233,6 +236,9 @@ static u16 calculate_cs_id(struct addr_ctx *ctx)
 	case DF3_COD4_2CHAN_HASH:
 	case DF3_COD2_4CHAN_HASH:
 	case DF3_COD1_8CHAN_HASH:
+	case MI2_HASH_8CHAN:
+	case MI2_HASH_16CHAN:
+	case MI2_HASH_32CHAN:
 	case DF2_2CHAN_HASH:
 		return get_cs_id_df2(ctx);
 
@@ -296,6 +302,9 @@ static u64 insert_cs_id(struct addr_ctx *ctx, u64 denorm_addr, u16 cs_id)
 	case NOHASH_8CHAN:
 	case NOHASH_16CHAN:
 	case NOHASH_32CHAN:
+	case MI2_HASH_8CHAN:
+	case MI2_HASH_16CHAN:
+	case MI2_HASH_32CHAN:
 	case DF2_2CHAN_HASH:
 		return insert_cs_id_at_intlv_bit(ctx, denorm_addr, cs_id);
 
diff --git a/drivers/ras/amd/atl/internal.h b/drivers/ras/amd/atl/internal.h
index f3888c8fd02d..33905933e31e 100644
--- a/drivers/ras/amd/atl/internal.h
+++ b/drivers/ras/amd/atl/internal.h
@@ -30,6 +30,12 @@
 /* Shift needed for adjusting register values to true values. */
 #define DF_DRAM_BASE_LIMIT_LSB		28
 
+/* Cache Coherent Moderator Instnce Type value on register */
+#define DF_INST_TYPE_CCM		0
+
+/* Maximum possible number of DRAM maps within a Data Fabric. */
+#define DF_NUM_DRAM_MAPS_AVAILABLE	16
+
 /*
  * Glossary of acronyms used in address translation for Zen-based systems
  *
@@ -68,6 +74,9 @@ enum intlv_modes {
 	DF4_NPS1_12CHAN_HASH		= 0x15,
 	DF4_NPS2_5CHAN_HASH		= 0x16,
 	DF4_NPS1_10CHAN_HASH		= 0x17,
+	MI2_HASH_8CHAN			= 0x1C,
+	MI2_HASH_16CHAN			= 0x1D,
+	MI2_HASH_32CHAN			= 0x1E,
 	DF2_2CHAN_HASH			= 0x21,
 	/* DF4.5 modes are all IntLvNumChan + 0x20 */
 	DF4p5_NPS1_16CHAN_1K_HASH	= 0x2C,
@@ -94,8 +103,9 @@ enum intlv_modes {
 
 struct df_flags {
 	__u8	legacy_ficaa		: 1,
+		heterogeneous		: 1,
 		genoa_quirk		: 1,
-		__reserved_0		: 6;
+		__reserved_0		: 5;
 };
 
 struct df_config {
@@ -220,6 +230,9 @@ int determine_node_id(struct addr_ctx *ctx, u8 socket_num, u8 die_num);
 int get_address_map(struct addr_ctx *ctx);
 
 int denormalize_address(struct addr_ctx *ctx);
+
+u16 get_dst_fabric_id(struct addr_ctx *ctx);
+
 int dehash_address(struct addr_ctx *ctx);
 
 int norm_to_sys_addr(u8 socket_id, u8 die_id, u8 cs_inst_id, u64 *addr);
diff --git a/drivers/ras/amd/atl/map.c b/drivers/ras/amd/atl/map.c
index 05141da27028..9326f6a6b6c3 100644
--- a/drivers/ras/amd/atl/map.c
+++ b/drivers/ras/amd/atl/map.c
@@ -355,6 +355,101 @@ static int get_dram_addr_map(struct addr_ctx *ctx)
 	}
 }
 
+static int find_moderator_instance_id(struct addr_ctx *ctx)
+{
+	u16 num_df_instances;
+	u32 reg;
+
+	/* Read D18F0x40 (FabricBlockInstanceCount). */
+	if (df_indirect_read_broadcast(0, 0, 0x40, &reg))
+		return -EINVAL;
+
+	num_df_instances = FIELD_GET(DF_BLOCK_INSTANCE_COUNT, reg);
+
+	for (ctx->inst_id = 0; ctx->inst_id < num_df_instances; ctx->inst_id++) {
+		/* Read D18F0x44 (FabricBlockInstanceInformation0). */
+		if (df_indirect_read_instance(0, 0, 0x44, ctx->inst_id, &reg))
+			return -EINVAL;
+
+		if (!reg)
+			continue;
+
+		/* Match on the first CCM instance. */
+		if (FIELD_GET(DF_INSTANCE_TYPE, reg) == DF_INST_TYPE_CCM)
+			return 0;
+	}
+
+	return -EINVAL;
+}
+
+static int find_map_by_dst_fabric_id(struct addr_ctx *ctx)
+{
+	u64 mask = df_cfg.node_id_mask;
+
+	for (ctx->map.num = 0; ctx->map.num < DF_NUM_DRAM_MAPS_AVAILABLE ; ctx->map.num++) {
+		if (get_dram_addr_map(ctx))
+			return -EINVAL;
+
+		/*
+		 * Match if the Destination Fabric ID in this map is the same as
+		 * the Fabric ID for the target memory device.
+		 */
+		if ((get_dst_fabric_id(ctx) & mask) == (ctx->cs_fabric_id & mask))
+			return 0;
+	}
+
+	return -EINVAL;
+}
+
+/* UMC to CS mapping for MI200 die[0]s */
+u8 umc_to_cs_mapping_mi200_die0[] = { 28, 20, 24, 16, 12, 4, 8, 0,
+					6, 30, 2, 26, 22, 14, 18, 10,
+					19, 11, 15, 7, 3, 27, 31, 23,
+					9, 1, 5, 29, 25, 17, 21, 13};
+
+/* UMC to CS mapping for MI200 die[1]s */
+u8 umc_to_cs_mapping_mi200_die1[] = { 19, 11, 15, 7, 3, 27, 31, 23,
+					9, 1, 5, 29, 25, 17, 21, 13,
+					28, 20, 24, 16, 12, 4, 8, 0,
+					6, 30, 2, 26, 22, 14, 18, 10};
+
+int get_umc_to_cs_mapping(struct addr_ctx *ctx)
+{
+	if (ctx->inst_id >= sizeof(umc_to_cs_mapping_mi200_die0))
+		return -EINVAL;
+
+	/*
+	 * MI200 has 2 dies and are enumerated alternatively
+	 * die0's are enumerated as node 2, 4, 6 and 8
+	 * die1's are enumerated as node 1, 3, 5 and 7
+	 */
+	if (ctx->node_id % 2)
+		ctx->inst_id = umc_to_cs_mapping_mi200_die1[ctx->inst_id];
+	else
+		ctx->inst_id = umc_to_cs_mapping_mi200_die0[ctx->inst_id];
+
+	return 0;
+}
+
+static int get_address_map_heterogeneous(struct addr_ctx *ctx)
+{
+	if (ctx->node_id >= amd_nb_num()) {
+		if (get_umc_to_cs_mapping(ctx))
+			return -EINVAL;
+	}
+
+	ctx->cs_fabric_id = ctx->inst_id;
+	ctx->cs_fabric_id |= ctx->node_id << df_cfg.node_id_shift;
+
+	if (find_moderator_instance_id(ctx))
+		return -EINVAL;
+
+	if (find_map_by_dst_fabric_id(ctx))
+		return -EINVAL;
+
+	return 0;
+}
+
 static int lookup_cs_fabric_id(struct addr_ctx *ctx)
 {
 	u32 reg;
@@ -482,6 +577,7 @@ static u8 get_num_intlv_chan(enum intlv_modes intlv_mode)
 	case NOHASH_8CHAN:
 	case DF3_COD1_8CHAN_HASH:
 	case DF4_NPS1_8CHAN_HASH:
+	case MI2_HASH_8CHAN:
 	case DF4p5_NPS1_8CHAN_1K_HASH:
 	case DF4p5_NPS1_8CHAN_2K_HASH:
 		return 8;
@@ -494,6 +590,7 @@ static u8 get_num_intlv_chan(enum intlv_modes intlv_mode)
 	case DF4p5_NPS1_12CHAN_2K_HASH:
 		return 12;
 	case NOHASH_16CHAN:
+	case MI2_HASH_16CHAN:
 	case DF4p5_NPS1_16CHAN_1K_HASH:
 	case DF4p5_NPS1_16CHAN_2K_HASH:
 		return 16;
@@ -501,6 +598,7 @@ static u8 get_num_intlv_chan(enum intlv_modes intlv_mode)
 	case DF4p5_NPS0_24CHAN_2K_HASH:
 		return 24;
 	case NOHASH_32CHAN:
+	case MI2_HASH_32CHAN:
 		return 32;
 	default:
 		ATL_BAD_INTLV_MODE(intlv_mode);
@@ -645,8 +743,11 @@ int get_address_map(struct addr_ctx *ctx)
 {
 	int ret = 0;
 
-	/* TODO: Add special path for DF3.5 heterogeneous systems. */
-	ret = get_address_map_common(ctx);
+	/* Add special path for DF3.5 heterogeneous systems. */
+	if (df_cfg.flags.heterogeneous && df_cfg.rev == DF3p5)
+		ret = get_address_map_heterogeneous(ctx);
+	else
+		ret = get_address_map_common(ctx);
 	if (ret)
 		return ret;
 
diff --git a/drivers/ras/amd/atl/reg_fields.h b/drivers/ras/amd/atl/reg_fields.h
index d48470e12498..b85ab157773e 100644
--- a/drivers/ras/amd/atl/reg_fields.h
+++ b/drivers/ras/amd/atl/reg_fields.h
@@ -601,3 +601,32 @@
 #define DF2_SOCKET_ID_SHIFT	GENMASK(31, 28)
 #define DF3_SOCKET_ID_SHIFT	GENMASK(9, 8)
 #define DF4_SOCKET_ID_SHIFT	GENMASK(11, 8)
+
+/*
+ * Total number of instances of all the blocks in DF
+ *
+ * Access type: Broadcast
+ *
+ * Register
+ *	Rev	Fieldname	Bits
+ *
+ *	D18F0x040 [Fabric Block Instance Count]
+ *	DF3	BlkInstCount	[7:0]
+ *	DF3p5	BlkInstCount	[7:0]
+ *	DF4	BlkInstCount	[7:0]
+ *	DF4p5	BlkInstCount	[9:0]
+ */
+#define DF_BLOCK_INSTANCE_COUNT	GENMASK(9, 0)
+
+/*
+ * Information on the block capabilities
+ *
+ * Access type: Broadcast
+ *
+ * Register
+ *	Rev	Fieldname	Bits
+ *
+ *	D18F0x044 [Fabric Block Instance Information 0]
+ *	DF3p5	BlkInstCount	[3:0]
+ */
+#define DF_INSTANCE_TYPE	GENMASK(3, 0)
diff --git a/drivers/ras/amd/atl/system.c b/drivers/ras/amd/atl/system.c
index 86488138e120..656aac3f6c59 100644
--- a/drivers/ras/amd/atl/system.c
+++ b/drivers/ras/amd/atl/system.c
@@ -144,6 +144,7 @@ static int determine_df_rev_legacy(void)
 
 	if (FIELD_GET(DF4_COMPONENT_ID_MASK, fabric_id_mask0)) {
 		df_cfg.rev = DF3p5;
+		df_cfg.flags.heterogeneous = 1;
 
 		/* Read D18F1x154 (SystemFabricIdMask1) */
 		if (df_indirect_read_broadcast(0, 1, 0x154, &fabric_id_mask1))

From patchwork Wed Oct 25 07:33:34 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "M K, Muralidhara"
X-Patchwork-Id: 13435671
From: Muralidhara M K
CC: Muralidhara M K, Yazen Ghannam
Subject: [PATCH 2/7] RAS: Add Address Translation support for MI300
Date: Wed, 25 Oct 2023 07:33:34 +0000
Message-ID: <20231025073339.630093-3-muralimk@amd.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20231025073339.630093-1-muralimk@amd.com>
References: <20231025073339.630093-1-muralimk@amd.com>
Precedence: bulk
X-Mailing-List: linux-edac@vger.kernel.org

From: Muralidhara M K

Add support for address translation on Data Fabric version 4.5
for MI300 systems.

Add new interleaving modes for APU model support and adjust how
the DRAM address maps are found early in the translation for
certain cases.

Signed-off-by: Muralidhara M K
Co-developed-by: Yazen Ghannam
Signed-off-by: Yazen Ghannam
---
 drivers/ras/amd/atl/core.c        |  5 +-
 drivers/ras/amd/atl/dehash.c      | 89 +++++++++++++++++++++++++++++++
 drivers/ras/amd/atl/denormalize.c | 79 +++++++++++++++++++++++++++
 drivers/ras/amd/atl/internal.h    | 12 ++++-
 drivers/ras/amd/atl/map.c         | 53 +++++++++++-------
 drivers/ras/amd/atl/reg_fields.h  |  5 ++
 drivers/ras/amd/atl/system.c      |  3 ++
 drivers/ras/amd/atl/umc.c         | 28 +++++++++-
 8 files changed, 250 insertions(+), 24 deletions(-)

diff --git a/drivers/ras/amd/atl/core.c b/drivers/ras/amd/atl/core.c
index 8c997c7ae8a6..cbbaf82f1ee1 100644
--- a/drivers/ras/amd/atl/core.c
+++ b/drivers/ras/amd/atl/core.c
@@ -56,7 +56,7 @@ static int add_legacy_hole(struct addr_ctx *ctx)
 	if (df_cfg.rev >= DF4)
 		func = 7;
 
-	if (df_indirect_read_broadcast(ctx->node_id, func, 0x104, &dram_hole_base))
+	if (df_indirect_read_broadcast(ctx->df_acc_id, func, 0x104, &dram_hole_base))
 		return -EINVAL;
 
 	dram_hole_base &= DF_DRAM_HOLE_BASE_MASK;
@@ -103,7 +103,7 @@ static bool late_hole_remove(struct addr_ctx *ctx)
 	return false;
 }
 
-int norm_to_sys_addr(u8 socket_id, u8 die_id, u8 cs_inst_id, u64 *addr)
+int norm_to_sys_addr(u16 df_acc_id, u8 socket_id, u8 die_id, u8 cs_inst_id, u64 *addr)
 {
 	struct addr_ctx ctx;
 
@@ -115,6 +115,7 @@ int norm_to_sys_addr(u8 socket_id, u8 die_id, u8 cs_inst_id, u64 *addr)
 	/* We start from the normalized address */
 	ctx.ret_addr = *addr;
 	ctx.inst_id = cs_inst_id;
+	ctx.df_acc_id = df_acc_id;
 
 	if (determine_node_id(&ctx, socket_id, die_id)) {
 		pr_warn("Failed to determine Node ID");
diff --git a/drivers/ras/amd/atl/dehash.c b/drivers/ras/amd/atl/dehash.c
index 5760e6bca194..ddfada2eb7b4 100644
--- a/drivers/ras/amd/atl/dehash.c
+++ b/drivers/ras/amd/atl/dehash.c
@@ -450,6 +450,90 @@ static int mi200_dehash_addr(struct addr_ctx *ctx)
 	return 0;
 }
 
+/*
+ * MI300 hash bits
+ *				4K 64K 2M 1G 1T 1T
+ * CSSelect[0] = XOR of addr{8, 12, 15, 22, 29, 36, 43}
+ * CSSelect[1] = XOR of addr{9, 13, 16, 23, 30, 37, 44}
+ * CSSelect[2] = XOR of addr{10, 14, 17, 24, 31, 38, 45}
+ * CSSelect[3] = XOR of addr{11, 18, 25, 32, 39, 46}
+ * CSSelect[4] = XOR of addr{14, 19, 26, 33, 40, 47} aka Stack
+ * DieID[0] = XOR of addr{12, 20, 27, 34, 41 }
+ * DieID[1] = XOR of addr{13, 21, 28, 35, 42 }
+ */
+static int mi300_dehash_addr(struct addr_ctx *ctx)
+{
+	bool hash_ctl_4k, hash_ctl_64k, hash_ctl_2M, hash_ctl_1G, hash_ctl_1T;
+	u8 hashed_bit, intlv_bit, num_intlv_bits, base_bit, i;
+
+	if (ctx->map.intlv_bit_pos != 8) {
+		pr_warn("%s: Invalid interleave bit: %u",
+			__func__, ctx->map.intlv_bit_pos);
+		return -EINVAL;
+	}
+
+	if (ctx->map.num_intlv_sockets > 1) {
+		pr_warn("%s: Invalid number of interleave sockets: %u",
+			__func__, ctx->map.num_intlv_sockets);
+		return -EINVAL;
+	}
+
+	hash_ctl_4k = FIELD_GET(DF4p5_HASH_CTL_4K, ctx->map.ctl);
+	hash_ctl_64k = FIELD_GET(DF4p5_HASH_CTL_64K, ctx->map.ctl);
+	hash_ctl_2M = FIELD_GET(DF4p5_HASH_CTL_2M, ctx->map.ctl);
+	hash_ctl_1G = FIELD_GET(DF4p5_HASH_CTL_1G, ctx->map.ctl);
+	hash_ctl_1T = FIELD_GET(DF4p5_HASH_CTL_1T, ctx->map.ctl);
+
+	/* Channel bits */
+	num_intlv_bits = ilog2(ctx->map.num_intlv_chan);
+
+	for (i = 0; i < num_intlv_bits; i++) {
+		base_bit = 8 + i;
+
+		/* CSSelect[4] jumps to a base bit of 14. */
+		if (i == 4)
+			base_bit = 14;
+
+		intlv_bit = atl_get_bit(base_bit, ctx->ret_addr);
+
+		hashed_bit = intlv_bit;
+
+		/* 4k hash bit only applies to the first 3 bits. */
+		if (i <= 2)
+			hashed_bit ^= atl_get_bit(12 + i, ctx->ret_addr) & hash_ctl_4k;
+
+		hashed_bit ^= atl_get_bit(15 + i, ctx->ret_addr) & hash_ctl_64k;
+		hashed_bit ^= atl_get_bit(22 + i, ctx->ret_addr) & hash_ctl_2M;
+		hashed_bit ^= atl_get_bit(29 + i, ctx->ret_addr) & hash_ctl_1G;
+		hashed_bit ^= atl_get_bit(36 + i, ctx->ret_addr) & hash_ctl_1T;
+		hashed_bit ^= atl_get_bit(43 + i, ctx->ret_addr) & hash_ctl_1T;
+
+		if (hashed_bit != intlv_bit)
+			ctx->ret_addr ^= BIT_ULL(base_bit);
+	}
+
+	/* Die bits */
+	num_intlv_bits = ilog2(ctx->map.num_intlv_dies);
+
+	for (i = 0; i < num_intlv_bits; i++) {
+		base_bit = 12 + i;
+
+		intlv_bit = atl_get_bit(base_bit, ctx->ret_addr);
+
+		hashed_bit = intlv_bit;
+
+		hashed_bit ^= atl_get_bit(20 + i, ctx->ret_addr) & hash_ctl_64k;
+		hashed_bit ^= atl_get_bit(27 + i, ctx->ret_addr) & hash_ctl_2M;
+		hashed_bit ^= atl_get_bit(34 + i, ctx->ret_addr) & hash_ctl_1G;
+		hashed_bit ^= atl_get_bit(41 + i, ctx->ret_addr) & hash_ctl_1T;
+
+		if (hashed_bit != intlv_bit)
+			ctx->ret_addr ^= BIT_ULL(base_bit);
+	}
+
+	return 0;
+}
+
 int dehash_address(struct addr_ctx *ctx)
 {
 	switch (ctx->map.intlv_mode) {
@@ -512,6 +596,11 @@ int dehash_address(struct addr_ctx *ctx)
 	case MI2_HASH_32CHAN:
 		return mi200_dehash_addr(ctx);
 
+	case MI3_HASH_8CHAN:
+	case MI3_HASH_16CHAN:
+	case MI3_HASH_32CHAN:
+		return mi300_dehash_addr(ctx);
+
 	default:
 		ATL_BAD_INTLV_MODE(ctx->map.intlv_mode);
 		return -EINVAL;
diff --git a/drivers/ras/amd/atl/denormalize.c b/drivers/ras/amd/atl/denormalize.c
index 03eb1eea68f9..b233a26f68fc 100644
--- a/drivers/ras/amd/atl/denormalize.c
+++ b/drivers/ras/amd/atl/denormalize.c
@@ -85,6 +85,46 @@ static u64 make_space_for_cs_id_split_2_1(struct addr_ctx *ctx)
 	return expand_bits(12, ctx->map.total_intlv_bits - 1, denorm_addr);
 }
 
+/*
+ * Make space for CS ID at bits [14:8] as follows:
+ *
+ * 8 channels	-> bits [10:8]
+ * 16 channels	-> bits [11:8]
+ * 32 channels	-> bits [14,11:8]
+ *
+ * 1 die	-> N/A
+ * 2 dies	-> bit [12]
+ * 4 dies	-> bits [13:12]
+ */
+static u64 make_space_for_cs_id_mi300(struct addr_ctx *ctx)
+{
+	u8 num_intlv_bits = order_base_2(ctx->map.num_intlv_chan);
+	u64 denorm_addr;
+
+	if (ctx->map.intlv_bit_pos != 8) {
+		pr_warn("%s: Invalid interleave bit: %u", __func__, ctx->map.intlv_bit_pos);
+		return -1;
+	}
+
+	/* Channel bits. Covers up to 4 bits at [11:8]. */
+	if (num_intlv_bits > 4)
+		denorm_addr = expand_bits(8, 4, ctx->ret_addr);
+	else
+		denorm_addr = expand_bits(ctx->map.intlv_bit_pos, num_intlv_bits, ctx->ret_addr);
+
+	/* Die bits. Always starts at [12]. */
+	if (ctx->map.num_intlv_dies > 1)
+		denorm_addr = expand_bits(12,
+					  ctx->map.total_intlv_bits - num_intlv_bits,
+					  denorm_addr);
+
+	/* Additional channel bit at [14]. */
+	if (num_intlv_bits > 4)
+		denorm_addr = expand_bits(14, 1, denorm_addr);
+
+	return denorm_addr;
+}
+
 /*
  * Take the current calculated address and shift enough bits in the middle
  * to make a gap where the interleave bits will be inserted.
@@ -116,6 +156,11 @@ static u64 make_space_for_cs_id(struct addr_ctx *ctx)
 	case DF4p5_NPS1_16CHAN_2K_HASH:
 		return make_space_for_cs_id_split_2_1(ctx);
 
+	case MI3_HASH_8CHAN:
+	case MI3_HASH_16CHAN:
+	case MI3_HASH_32CHAN:
+		return make_space_for_cs_id_mi300(ctx);
+
 	case DF4p5_NPS2_4CHAN_1K_HASH:
 	//TODO
 	case DF4p5_NPS1_8CHAN_1K_HASH:
@@ -219,6 +264,32 @@ static u16 get_cs_id_df4(struct addr_ctx *ctx)
 	return cs_id;
 }
 
+/*
+ * MI300 hash has:
+ * (C)hannel[3:0]	= cs_id[3:0]
+ * (S)tack[0]		= cs_id[4]
+ * (D)ie[1:0]		= cs_id[6:5]
+ *
+ * Hashed cs_id is swizzled so that Stack bit is at the end.
+ * cs_id = SDDCCCC
+ */
+static u16 get_cs_id_mi300(struct addr_ctx *ctx)
+{
+	u8 channel_bits, die_bits, stack_bit;
+	u16 die_id;
+
+	/* Subtract the "base" Destination Fabric ID. */
+	ctx->cs_fabric_id -= get_dst_fabric_id(ctx);
+
+	die_id = (ctx->cs_fabric_id & df_cfg.die_id_mask) >> df_cfg.die_id_shift;
+
+	channel_bits = FIELD_GET(GENMASK(3, 0), ctx->cs_fabric_id);
+	stack_bit = FIELD_GET(BIT(4), ctx->cs_fabric_id) << 6;
+	die_bits = die_id << 4;
+
+	return stack_bit | die_bits | channel_bits;
+}
+
 /*
  * Derive the correct CS ID that represents the interleave bits
  * used within the system physical address. This accounts for the
@@ -252,6 +323,11 @@ static u16 calculate_cs_id(struct addr_ctx *ctx)
 	case DF4p5_NPS1_16CHAN_2K_HASH:
 		return get_cs_id_df4(ctx);
 
+	case MI3_HASH_8CHAN:
+	case MI3_HASH_16CHAN:
+	case MI3_HASH_32CHAN:
+		return get_cs_id_mi300(ctx);
+
 	/* CS ID is simply the CS Fabric ID adjusted by the Destination Fabric ID. */
 	case DF4p5_NPS2_4CHAN_1K_HASH:
 	case DF4p5_NPS1_8CHAN_1K_HASH:
@@ -305,6 +381,9 @@ static u64 insert_cs_id(struct addr_ctx *ctx, u64 denorm_addr, u16 cs_id)
 	case MI2_HASH_8CHAN:
 	case MI2_HASH_16CHAN:
 	case MI2_HASH_32CHAN:
+	case MI3_HASH_8CHAN:
+	case MI3_HASH_16CHAN:
+	case MI3_HASH_32CHAN:
 	case DF2_2CHAN_HASH:
 		return insert_cs_id_at_intlv_bit(ctx, denorm_addr, cs_id);
 
diff --git a/drivers/ras/amd/atl/internal.h b/drivers/ras/amd/atl/internal.h
index 33905933e31e..a5b13e611a72 100644
--- a/drivers/ras/amd/atl/internal.h
+++ b/drivers/ras/amd/atl/internal.h
@@ -27,8 +27,12 @@
 /* PCI IDs for Genoa DF Function 0. */
 #define DF_FUNC0_ID_GENOA		0x14AD1022
 
+/* PCI IDs for MI300 DF Function 0. */
+#define DF_FUNC0_ID_MI300		0x15281022
+
 /* Shift needed for adjusting register values to true values. */
 #define DF_DRAM_BASE_LIMIT_LSB		28
+#define MI300_DRAM_LIMIT_LSB		20
 
 /* Cache Coherent Moderator Instnce Type value on register */
 #define DF_INST_TYPE_CCM		0
@@ -74,6 +78,9 @@ enum intlv_modes {
 	DF4_NPS1_12CHAN_HASH		= 0x15,
 	DF4_NPS2_5CHAN_HASH		= 0x16,
 	DF4_NPS1_10CHAN_HASH		= 0x17,
+	MI3_HASH_8CHAN			= 0x18,
+	MI3_HASH_16CHAN			= 0x19,
+	MI3_HASH_32CHAN			= 0x1A,
 	MI2_HASH_8CHAN			= 0x1C,
 	MI2_HASH_16CHAN			= 0x1D,
 	MI2_HASH_32CHAN			= 0x1E,
@@ -219,6 +226,9 @@ struct addr_ctx {
 	 * System-wide ID that includes 'node' bits.
 	 */
 	u16 cs_fabric_id;
+
+	/* ID calculated from topology */
+	u16 df_acc_id;
 };
 
 int df_indirect_read_instance(u16 node, u8 func, u16 reg, u8 instance_id, u32 *lo);
@@ -235,7 +245,7 @@ u16 get_dst_fabric_id(struct addr_ctx *ctx);
 
 int dehash_address(struct addr_ctx *ctx);
 
-int norm_to_sys_addr(u8 socket_id, u8 die_id, u8 cs_inst_id, u64 *addr);
+int norm_to_sys_addr(u16 df_acc_id, u8 socket_id, u8 die_id, u8 cs_inst_id, u64 *addr);
 
 /*
  * Helper to use test_bit() without needing to do
diff --git a/drivers/ras/amd/atl/map.c b/drivers/ras/amd/atl/map.c
index 9326f6a6b6c3..9e9d97e87c69 100644
--- a/drivers/ras/amd/atl/map.c
+++ b/drivers/ras/amd/atl/map.c
@@ -63,6 +63,10 @@ static int df4p5_get_intlv_mode(struct addr_ctx *ctx)
 	if (ctx->map.intlv_mode <= NOHASH_32CHAN)
 		return 0;
 
+	if (ctx->map.intlv_mode >= MI3_HASH_8CHAN &&
+	    ctx->map.intlv_mode <= MI3_HASH_32CHAN)
+		return 0;
+
 	/*
 	 * Modes matching the ranges above are returned as-is.
* @@ -117,6 +121,9 @@ static u64 get_hi_addr_offset(u32 reg_dram_offset) ATL_BAD_DF_REV; } + if (df_cfg.rev == DF4p5 && df_cfg.flags.heterogeneous) + shift = MI300_DRAM_LIMIT_LSB; + return hi_addr_offset << shift; } @@ -138,13 +145,13 @@ static int get_dram_offset(struct addr_ctx *ctx, bool *enabled, u64 *norm_offset if (df_cfg.rev >= DF4) { /* Read D18F7x140 (DramOffset) */ - if (df_indirect_read_instance(ctx->node_id, 7, 0x140 + (4 * map_num), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0x140 + (4 * map_num), ctx->inst_id, &reg_dram_offset)) return -EINVAL; } else { /* Read D18F0x1B4 (DramOffset) */ - if (df_indirect_read_instance(ctx->node_id, 0, 0x1B4 + (4 * map_num), + if (df_indirect_read_instance(ctx->df_acc_id, 0, 0x1B4 + (4 * map_num), ctx->inst_id, &reg_dram_offset)) return -EINVAL; } @@ -170,7 +177,7 @@ static int df3_6ch_get_dram_addr_map(struct addr_ctx *ctx) offset = 0x68; /* Read D18F0x06{0,8} (DF::Skt0CsTargetRemap0)/(DF::Skt0CsTargetRemap1) */ - if (df_indirect_read_broadcast(ctx->node_id, 0, offset, &reg)) + if (df_indirect_read_broadcast(ctx->df_acc_id, 0, offset, &reg)) return -EINVAL; /* Save 8 remap entries. */ @@ -191,12 +198,12 @@ static int df3_6ch_get_dram_addr_map(struct addr_ctx *ctx) static int df2_get_dram_addr_map(struct addr_ctx *ctx) { /* Read D18F0x110 (DramBaseAddress). */ - if (df_indirect_read_instance(ctx->node_id, 0, 0x110 + (8 * ctx->map.num), + if (df_indirect_read_instance(ctx->df_acc_id, 0, 0x110 + (8 * ctx->map.num), ctx->inst_id, &ctx->map.base)) return -EINVAL; /* Read D18F0x114 (DramLimitAddress). */ - if (df_indirect_read_instance(ctx->node_id, 0, 0x114 + (8 * ctx->map.num), + if (df_indirect_read_instance(ctx->df_acc_id, 0, 0x114 + (8 * ctx->map.num), ctx->inst_id, &ctx->map.limit)) return -EINVAL; @@ -209,7 +216,7 @@ static int df3_get_dram_addr_map(struct addr_ctx *ctx) return -EINVAL; /* Read D18F0x3F8 (DfGlobalCtl).
*/ - if (df_indirect_read_instance(ctx->node_id, 0, 0x3F8, + if (df_indirect_read_instance(ctx->df_acc_id, 0, 0x3F8, ctx->inst_id, &ctx->map.ctl)) return -EINVAL; @@ -222,22 +229,22 @@ static int df4_get_dram_addr_map(struct addr_ctx *ctx) u32 remap_reg; /* Read D18F7xE00 (DramBaseAddress). */ - if (df_indirect_read_instance(ctx->node_id, 7, 0xE00 + (16 * ctx->map.num), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0xE00 + (16 * ctx->map.num), ctx->inst_id, &ctx->map.base)) return -EINVAL; /* Read D18F7xE04 (DramLimitAddress). */ - if (df_indirect_read_instance(ctx->node_id, 7, 0xE04 + (16 * ctx->map.num), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0xE04 + (16 * ctx->map.num), ctx->inst_id, &ctx->map.limit)) return -EINVAL; /* Read D18F7xE08 (DramAddressCtl). */ - if (df_indirect_read_instance(ctx->node_id, 7, 0xE08 + (16 * ctx->map.num), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0xE08 + (16 * ctx->map.num), ctx->inst_id, &ctx->map.ctl)) return -EINVAL; /* Read D18F7xE0C (DramAddressIntlv). */ - if (df_indirect_read_instance(ctx->node_id, 7, 0xE0C + (16 * ctx->map.num), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0xE0C + (16 * ctx->map.num), ctx->inst_id, &ctx->map.intlv)) return -EINVAL; @@ -252,7 +259,7 @@ static int df4_get_dram_addr_map(struct addr_ctx *ctx) remap_sel = FIELD_GET(DF4_REMAP_SEL, ctx->map.ctl); /* Read D18F7x180 (CsTargetRemap0A). */ - if (df_indirect_read_instance(ctx->node_id, 7, 0x180 + (8 * remap_sel), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0x180 + (8 * remap_sel), ctx->inst_id, &remap_reg)) return -EINVAL; @@ -261,7 +268,7 @@ static int df4_get_dram_addr_map(struct addr_ctx *ctx) ctx->map.remap_array[i] = (remap_reg >> (j * shift)) & mask; /* Read D18F7x184 (CsTargetRemap0B). 
*/ - if (df_indirect_read_instance(ctx->node_id, 7, 0x184 + (8 * remap_sel), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0x184 + (8 * remap_sel), ctx->inst_id, &remap_reg)) return -EINVAL; @@ -278,22 +285,22 @@ static int df4p5_get_dram_addr_map(struct addr_ctx *ctx) u32 remap_reg; /* Read D18F7x200 (DramBaseAddress). */ - if (df_indirect_read_instance(ctx->node_id, 7, 0x200 + (16 * ctx->map.num), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0x200 + (16 * ctx->map.num), ctx->inst_id, &ctx->map.base)) return -EINVAL; /* Read D18F7x204 (DramLimitAddress). */ - if (df_indirect_read_instance(ctx->node_id, 7, 0x204 + (16 * ctx->map.num), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0x204 + (16 * ctx->map.num), ctx->inst_id, &ctx->map.limit)) return -EINVAL; /* Read D18F7x208 (DramAddressCtl). */ - if (df_indirect_read_instance(ctx->node_id, 7, 0x208 + (16 * ctx->map.num), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0x208 + (16 * ctx->map.num), ctx->inst_id, &ctx->map.ctl)) return -EINVAL; /* Read D18F7x20C (DramAddressIntlv). */ - if (df_indirect_read_instance(ctx->node_id, 7, 0x20C + (16 * ctx->map.num), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0x20C + (16 * ctx->map.num), ctx->inst_id, &ctx->map.intlv)) return -EINVAL; @@ -308,7 +315,7 @@ static int df4p5_get_dram_addr_map(struct addr_ctx *ctx) remap_sel = FIELD_GET(DF4_REMAP_SEL, ctx->map.ctl); /* Read D18F7x180 (CsTargetRemap0A). */ - if (df_indirect_read_instance(ctx->node_id, 7, 0x180 + (24 * remap_sel), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0x180 + (24 * remap_sel), ctx->inst_id, &remap_reg)) return -EINVAL; @@ -317,7 +324,7 @@ static int df4p5_get_dram_addr_map(struct addr_ctx *ctx) ctx->map.remap_array[i] = (remap_reg >> (j * shift)) & mask; /* Read D18F7x184 (CsTargetRemap0B). 
*/ - if (df_indirect_read_instance(ctx->node_id, 7, 0x184 + (24 * remap_sel), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0x184 + (24 * remap_sel), ctx->inst_id, &remap_reg)) return -EINVAL; @@ -326,7 +333,7 @@ static int df4p5_get_dram_addr_map(struct addr_ctx *ctx) ctx->map.remap_array[i] = (remap_reg >> (j * shift)) & mask; /* Read D18F7x188 (CsTargetRemap0C). */ - if (df_indirect_read_instance(ctx->node_id, 7, 0x188 + (24 * remap_sel), + if (df_indirect_read_instance(ctx->df_acc_id, 7, 0x188 + (24 * remap_sel), ctx->inst_id, &remap_reg)) return -EINVAL; @@ -455,7 +462,7 @@ static int lookup_cs_fabric_id(struct addr_ctx *ctx) u32 reg; /* Read D18F0x50 (FabricBlockInstanceInformation3). */ - if (df_indirect_read_instance(ctx->node_id, 0, 0x50, ctx->inst_id, &reg)) + if (df_indirect_read_instance(ctx->df_acc_id, 0, 0x50, ctx->inst_id, &reg)) return -EINVAL; if (df_cfg.rev < DF4p5) @@ -463,6 +470,9 @@ static int lookup_cs_fabric_id(struct addr_ctx *ctx) else ctx->cs_fabric_id = FIELD_GET(DF4p5_CS_FABRIC_ID, reg); + if (df_cfg.rev == DF4p5 && df_cfg.flags.heterogeneous) + ctx->cs_fabric_id |= ctx->node_id << df_cfg.node_id_shift; + return 0; } @@ -578,6 +588,7 @@ static u8 get_num_intlv_chan(enum intlv_modes intlv_mode) case DF3_COD1_8CHAN_HASH: case DF4_NPS1_8CHAN_HASH: case MI2_HASH_8CHAN: + case MI3_HASH_8CHAN: case DF4p5_NPS1_8CHAN_1K_HASH: case DF4p5_NPS1_8CHAN_2K_HASH: return 8; @@ -591,6 +602,7 @@ static u8 get_num_intlv_chan(enum intlv_modes intlv_mode) return 12; case NOHASH_16CHAN: case MI2_HASH_16CHAN: + case MI3_HASH_16CHAN: case DF4p5_NPS1_16CHAN_1K_HASH: case DF4p5_NPS1_16CHAN_2K_HASH: return 16; @@ -599,6 +611,7 @@ static u8 get_num_intlv_chan(enum intlv_modes intlv_mode) return 24; case NOHASH_32CHAN: case MI2_HASH_32CHAN: + case MI3_HASH_32CHAN: return 32; default: ATL_BAD_INTLV_MODE(intlv_mode); diff --git a/drivers/ras/amd/atl/reg_fields.h b/drivers/ras/amd/atl/reg_fields.h index b85ab157773e..c3853a25217b 100644 ---
a/drivers/ras/amd/atl/reg_fields.h +++ b/drivers/ras/amd/atl/reg_fields.h @@ -251,6 +251,11 @@ #define DF4_HASH_CTL_2M BIT(9) #define DF4_HASH_CTL_1G BIT(10) #define DF4_HASH_CTL_1T BIT(15) +#define DF4p5_HASH_CTL_4K BIT(7) +#define DF4p5_HASH_CTL_64K BIT(8) +#define DF4p5_HASH_CTL_2M BIT(9) +#define DF4p5_HASH_CTL_1G BIT(10) +#define DF4p5_HASH_CTL_1T BIT(15) /* * High Address Offset diff --git a/drivers/ras/amd/atl/system.c b/drivers/ras/amd/atl/system.c index 656aac3f6c59..d80f24798a1e 100644 --- a/drivers/ras/amd/atl/system.c +++ b/drivers/ras/amd/atl/system.c @@ -124,6 +124,9 @@ static int df4_determine_df_rev(u32 reg) if (reg == DF_FUNC0_ID_GENOA) df_cfg.flags.genoa_quirk = 1; + if (reg == DF_FUNC0_ID_MI300) + df_cfg.flags.heterogeneous = 1; + return df4_get_fabric_id_mask_registers(); } diff --git a/drivers/ras/amd/atl/umc.c b/drivers/ras/amd/atl/umc.c index 80030db6b8a5..f334be0dc034 100644 --- a/drivers/ras/amd/atl/umc.c +++ b/drivers/ras/amd/atl/umc.c @@ -17,8 +17,16 @@ static u8 get_socket_id(struct mce *m) return m->socketid; } +#define MCA_IPID_INST_ID_HI GENMASK_ULL(47, 44) static u8 get_die_id(struct mce *m) { + /* The "AMD Node ID" is provided in MCA_IPID[InstanceIdHi] */ + if (df_cfg.rev == DF4p5 && df_cfg.flags.heterogeneous) { + u8 node_id = FIELD_GET(MCA_IPID_INST_ID_HI, m->ipid); + + return node_id / 4; + } + /* * For CPUs, this is the AMD Node ID modulo the number * of AMD Nodes per socket. @@ -37,14 +45,32 @@ static u8 get_cs_inst_id(struct mce *m) return FIELD_GET(UMC_CHANNEL_NUM, m->ipid); } +/* + * Use CPU's AMD Node ID for all cases. + * + * This is needed to read DF registers which can only be + * done on CPU-attached DFs even in heterogeneous cases. + * + * Future systems may report MCA errors across AMD Nodes. + * For example, errors from CPU socket 1 are reported to a + * CPU on socket 0. When this happens, the assumption below + * will break. But the AMD Node ID will be reported in + * MCA_IPID[InstanceIdHi] at that time. 
+ */ +static u16 get_df_acc_id(struct mce *m) +{ + return topology_die_id(m->extcpu); +} + int umc_mca_addr_to_sys_addr(struct mce *m, u64 *sys_addr) { u8 cs_inst_id = get_cs_inst_id(m); u8 socket_id = get_socket_id(m); u64 addr = get_norm_addr(m); u8 die_id = get_die_id(m); + u16 df_acc_id = get_df_acc_id(m); - if (norm_to_sys_addr(socket_id, die_id, cs_inst_id, &addr)) + if (norm_to_sys_addr(df_acc_id, socket_id, die_id, cs_inst_id, &addr)) return -EINVAL; *sys_addr = addr; From patchwork Wed Oct 25 07:33:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "M K, Muralidhara" X-Patchwork-Id: 13435655 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 25255C25B47 for ; Wed, 25 Oct 2023 07:37:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234208AbjJYHht (ORCPT ); Wed, 25 Oct 2023 03:37:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34112 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232865AbjJYHh1 (ORCPT ); Wed, 25 Oct 2023 03:37:27 -0400 Received: from NAM10-BN7-obe.outbound.protection.outlook.com (mail-bn7nam10on2052.outbound.protection.outlook.com [40.107.92.52]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D8C153A9B; Wed, 25 Oct 2023 00:35:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=RhdF/bJ5/tS2Bxk8WbBsind90rSGp9j/C8pDbXPfx1triwVBjNjXIEbSL8BSCNbUw0NMc+2w8vKFDMN7FXJO0qES4i97eSBuh80E8AAVKoky+jnNm3sJYcWEA8MqasHKS+algxCDgZDxNAHAV53odR3fwpk4JGB5wGmBGx4o0B9Fk/7/4j1mdad6KMbOp8wiVmXnkXit6htyQv0LgbACus/pOvNXNLwtHAh4fiDTK+92uo+wy3euO4JWH5NiX79zJRdhQaecCGLnMcA2Yct2MwFjYRZqj8Wk/1orNe5d4Mzlk6RTG0o6oOASMFiTGvnOulaHNXoOfXlcfHXoys4J/g== ARC-Message-Signature: i=1; 
dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1PEPF000044F0.mail.protection.outlook.com (10.167.241.70) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6933.15 via Frontend Transport; Wed, 25 Oct 2023 07:35:18 +0000 Received: from amd.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Wed, 25 Oct 2023 02:35:15 -0500 From: Muralidhara M K To: , CC: , , , Muralidhara M K Subject: [PATCH 3/7] RAS: Add MCA Error address conversion for UMC Date: Wed, 25 Oct 2023 07:33:35 +0000 Message-ID: <20231025073339.630093-4-muralimk@amd.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231025073339.630093-1-muralimk@amd.com> References: <20231025073339.630093-1-muralimk@amd.com> MIME-Version: 1.0 X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB03.amd.com (10.181.40.144) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1PEPF000044F0:EE_|CH0PR12MB5236:EE_ X-MS-Office365-Filtering-Correlation-Id: 7074fb77-3bf6-46e6-5b7c-08dbd52cf035 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
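As a rough illustration of the CS ID swizzle done by get_cs_id_mi300() in the preceding patch, the rearrangement into the SDDCCCC layout can be sketched in user-space C. This is only a sketch under stated assumptions: mi300_swizzle_cs_id() is a hypothetical name, the die ID is passed in directly instead of being extracted with df_cfg.die_id_mask and df_cfg.die_id_shift, and masking it to two bits is an assumption based on the Die[1:0] note in the patch comment.

```c
#include <stdint.h>

/*
 * Sketch only (hypothetical helper, not the driver code): rearrange a
 * fabric ID that is already relative to the base Destination Fabric ID
 * into the "SDDCCCC" CS ID layout described in the patch comment.
 */
static uint16_t mi300_swizzle_cs_id(uint16_t rel_fabric_id, uint8_t die_id)
{
	uint16_t channel = rel_fabric_id & 0xF;        /* C: fabric ID bits [3:0] */
	uint16_t stack   = (rel_fabric_id >> 4) & 0x1; /* S: fabric ID bit 4 */
	uint16_t die     = die_id & 0x3;               /* D: assumed two die bits */

	/* Stack bit goes to the top, die bits sit above the channel bits. */
	return (uint16_t)((stack << 6) | (die << 4) | channel);
}
```

For example, a relative fabric ID of 0x1A (stack 1, channel 0xA) on die 2 yields 0x6A with this sketch.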
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: CH0PR12MB5236 Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org From: Muralidhara M K On AMD systems with HBM3 memory, the reported MCA address is a DRAM address, which must be converted to a normalized address before the Data Fabric address translation. MI300A models have on-chip HBM3 memory with On-Die ECC support. Unlike MI200 UMC ECC, an On-Die ECC error is reported to MCA as an encoded DRAM address (PC/SID/Bank/ROW/COL) rather than a normalized address. This follows from the implementation difference between HBM3 On-Die ECC and HBM2 host ECC: On-Die ECC address reporting is done in the back end of the UMC, which no longer has the normalized address at that point. So software needs to convert the reported MCA Error Address back to a normalized address. Signed-off-by: Muralidhara M K --- Link: https://lore.kernel.org/linux-edac/20230720125425.3735538-1-muralimk@amd.com/T/#m225efdf5812820efd084158bd8cdf40cad1a5af6 drivers/ras/amd/atl/umc.c | 145 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 145 insertions(+) diff --git a/drivers/ras/amd/atl/umc.c b/drivers/ras/amd/atl/umc.c index f334be0dc034..fa8c3072a54f 100644 --- a/drivers/ras/amd/atl/umc.c +++ b/drivers/ras/amd/atl/umc.c @@ -12,6 +12,147 @@ #include "internal.h" +static bool internal_bit_wise_xor(u32 inp) +{ + bool tmp = 0; + int i; + + for (i = 0; i < 32; i++) + tmp = tmp ^ ((inp >> i) & 0x1); + + return tmp; +} + +/* + * Mapping of MCA decoded error address bit location to + * normalized address on MI300A systems.
+ */ +static const u8 umc_mca2na_mapping[] = { + 0, 5, 6, 8, 9, 14, 12, 13, + 10, 11, 15, 16, 17, 18, 19, 20, + 21, 22, 23, 24, 25, 26, 27, 28, + 7, 29, 30, +}; + +/* AddrHashBank and AddrHashPC/PC2 umc register bit fields */ +static struct { + u32 xor_enable :1; + u32 col_xor :13; + u32 row_xor :18; +} addr_hash_pc, addr_hash_bank[4]; + +static struct { + u32 bank_xor :6; +} addr_hash_pc2; + +#define COLUMN_LOCATION GENMASK(5, 1) +#define ROW_LOCATION GENMASK(23, 10) +/* + * The location of bank, column and row are fixed. + * location of column bit must be NA[5]. + * Row bits are always placed in a contiguous stretch of NA above the + * column and bank bits. + * Bits below the row bits can be either column or bank in any order, + * with the exception that NA[5] must be a column bit. + * Stack ID(SID) bits are placed in the MSB position of the NA. + */ +static int umc_ondie_addr_to_normaddr(u64 mca_addr, u16 nid) +{ + u32 bank[4], bank_hash[4], pc_hash; + u32 col, row, rawbank = 0, pc; + int i, temp = 0, err; + u64 mca2na; + + /* Default umc base address on MI300A systems */ + u32 gpu_umc_base = 0x90000; + + /* + * Error address logged on MI300A systems is ondie MCA address + * in the format MCA_Addr[27:0] = + * {SID[1:0],PC[0],row[14:0],bank[3:0],col[4:0],1'b0} + * The bit locations are calculated as per umc_mca2na_mapping[] + * to find normalized address. + * Refer F19 M90h BKDG Section 20.3.1.3 for clarifications + * + * XORs need to be applied based on the hash settings below. 
+ */ + + /* Calculate column and row */ + col = FIELD_GET(COLUMN_LOCATION, mca_addr); + row = FIELD_GET(ROW_LOCATION, mca_addr); + + /* Apply hashing on below banks for bank calculation */ + for (i = 0; i < 4; i++) + bank_hash[i] = (mca_addr >> (6 + i)) & 0x1; + + /* bank hash algorithm */ + for (i = 0; i < 4; i++) { + /* Read AMD PPR UMC::AddrHashBank register */ + err = amd_smn_read(nid, gpu_umc_base + 0xC8 + (i * 4), &temp); + if (err) + return err; + + addr_hash_bank[i].xor_enable = temp & 1; + addr_hash_bank[i].col_xor = FIELD_GET(GENMASK(13, 1), temp); + addr_hash_bank[i].row_xor = FIELD_GET(GENMASK(31, 14), temp); + /* bank hash selection */ + bank[i] = bank_hash[i] ^ (addr_hash_bank[i].xor_enable & + (internal_bit_wise_xor(col & addr_hash_bank[i].col_xor) ^ + internal_bit_wise_xor(row & addr_hash_bank[i].row_xor))); + } + + /* To apply hash on pc bit */ + pc_hash = (mca_addr >> 25) & 0x1; + + /* Read AMD PPR UMC::CH::AddrHashPC register */ + err = amd_smn_read(nid, gpu_umc_base + 0xE0, &temp); + if (err) + return err; + + addr_hash_pc.xor_enable = temp & 1; + addr_hash_pc.col_xor = FIELD_GET(GENMASK(13, 1), temp); + addr_hash_pc.row_xor = FIELD_GET(GENMASK(31, 14), temp); + + /* Read AMD PPR UMC::CH::AddrHashPC2 register*/ + err = amd_smn_read(nid, gpu_umc_base + 0xE4, &temp); + if (err) + return err; + + addr_hash_pc2.bank_xor = FIELD_GET(GENMASK(5, 0), temp); + + /* Calculate bank value from bank[0..3], bank[4] and bank[5] */ + for (i = 0; i < 4; i++) + rawbank |= (bank[i] & 1) << i; + + rawbank |= (mca_addr >> 22) & 0x30; + + /* pseudochannel(pc) hash selection */ + pc = pc_hash ^ (addr_hash_pc.xor_enable & + (internal_bit_wise_xor(col & addr_hash_pc.col_xor) ^ + internal_bit_wise_xor(row & addr_hash_pc.row_xor) ^ + internal_bit_wise_xor(rawbank & addr_hash_pc2.bank_xor))); + + /* Mask b'25(pc_bit) and b'[9:6](bank) */ + mca_addr &= ~0x20003c0ULL; + + for (i = 0; i < 4; i++) + mca_addr |= (bank[i] << (6 + i)); + + mca_addr |= (pc << 25); + + /* NA[4..0] 
is fixed */ + mca2na = 0x0; + /* convert mca error address to normalized address */ + for (i = 1; i < ARRAY_SIZE(umc_mca2na_mapping); i++) + mca2na |= ((mca_addr >> i) & 0x1) << umc_mca2na_mapping[i]; + + mca_addr = mca2na; + pr_info("Error Addr: 0x%016llx\n", mca_addr); + pr_info("Error hit on: Bank %d Row %d Column %d\n", rawbank, row, col); + + return mca_addr; +} + static u8 get_socket_id(struct mce *m) { return m->socketid; @@ -36,6 +177,10 @@ static u8 get_die_id(struct mce *m) static u64 get_norm_addr(struct mce *m) { + /* MI300: DRAM->Normalized translation */ + if (df_cfg.rev == DF4p5 && df_cfg.flags.heterogeneous) + return umc_ondie_addr_to_normaddr(m->addr, get_socket_id(m)); + return m->addr; } From patchwork Wed Oct 25 07:33:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "M K, Muralidhara" X-Patchwork-Id: 13435656 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 71425C0032E for ; Wed, 25 Oct 2023 07:37:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233459AbjJYHhv (ORCPT ); Wed, 25 Oct 2023 03:37:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34092 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234231AbjJYHh0 (ORCPT ); Wed, 25 Oct 2023 03:37:26 -0400 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2056.outbound.protection.outlook.com [40.107.243.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E52673A91; Wed, 25 Oct 2023 00:35:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; 
CO1PEPF000044F0.namprd05.prod.outlook.com (2603:10b6:a03:114:cafe::b6) by BYAPR21CA0022.outlook.office365.com (2603:10b6:a03:114::32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6933.14 via Frontend Transport; Wed, 25 Oct 2023 07:35:21 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1PEPF000044F0.mail.protection.outlook.com (10.167.241.70) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6933.15 via Frontend Transport; Wed, 25 Oct 2023 07:35:20 +0000 Received: from amd.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Wed, 25 Oct 2023 02:35:17 -0500 From: Muralidhara M K To: , CC: , , , Muralidhara M K Subject: [PATCH 4/7] RAS: Add static lookup table to get CS physical ID Date: Wed, 25 Oct 2023 07:33:36 +0000 Message-ID: <20231025073339.630093-5-muralimk@amd.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231025073339.630093-1-muralimk@amd.com> References: <20231025073339.630093-1-muralimk@amd.com> MIME-Version: 1.0 X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB03.amd.com (10.181.40.144) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1PEPF000044F0:EE_|SN7PR12MB6790:EE_ X-MS-Office365-Filtering-Correlation-Id: 15db8450-d0c7-47f5-11de-08dbd52cf170 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; 
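The two building blocks of umc_ondie_addr_to_normaddr() in the preceding patch, a parity-based XOR hash of one address bit and a table-driven bit permutation, can be sketched in user-space C as below. This is a sketch under stated assumptions, not the driver code: parity32(), hash_bit(), remap_bits() and demo_map are hypothetical names, demo_map is a made-up 4-entry table rather than umc_mca2na_mapping[], and no UMC registers are read.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR of all 32 bits of v, i.e. its parity (same idea as internal_bit_wise_xor()). */
static unsigned int parity32(uint32_t v)
{
	unsigned int p = 0;

	for (int i = 0; i < 32; i++)
		p ^= (v >> i) & 1u;

	return p;
}

/* One hashed bit: raw bit XOR (enable AND parity of the selected column and row bits). */
static unsigned int hash_bit(unsigned int raw, unsigned int enable,
			     uint32_t col, uint32_t col_xor,
			     uint32_t row, uint32_t row_xor)
{
	unsigned int sel = parity32(col & col_xor) ^ parity32(row & row_xor);

	return (raw ^ (enable & sel)) & 1u;
}

/* Hypothetical 4-entry map for illustration only; not the MI300A table. */
static const uint8_t demo_map[4] = { 0, 2, 1, 3 };

/* Move bit i of 'in' to bit map[i] of the result, for i in [first, n). */
static uint64_t remap_bits(uint64_t in, const uint8_t *map, size_t first, size_t n)
{
	uint64_t out = 0;

	for (size_t i = first; i < n; i++)
		out |= ((in >> i) & 1ULL) << map[i];

	return out;
}
```

The patch starts its remap loop at index 1, which corresponds to calling remap_bits() with first set to 1 so that bit 0 of the input is never copied.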
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN7PR12MB6790
Precedence: bulk
List-ID:
X-Mailing-List: linux-edac@vger.kernel.org

From: Muralidhara M K

AMD MI300A models have a single Data Fabric (DF) instance per socket, so
not all 4 AIDs are software-visible (using PCI Device 18h, etc.).

The MCA_IPID_UMC[InstanceId] field holds the SMN base address of the UMC
instance, and the same SMN address mapping is repeated for each of the
4 AIDs in a socket.

Add a static lookup table: read the UMC SMN address from the
MCA_IPID_UMC[InstanceId] field and use that value to look up the CS
physical ID in the table.

Signed-off-by: Muralidhara M K
---
 drivers/ras/amd/atl/umc.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/drivers/ras/amd/atl/umc.c b/drivers/ras/amd/atl/umc.c
index fa8c3072a54f..52247a7949fb 100644
--- a/drivers/ras/amd/atl/umc.c
+++ b/drivers/ras/amd/atl/umc.c
@@ -153,6 +153,35 @@ static int umc_ondie_addr_to_normaddr(u64 mca_addr, u16 nid)
 	return mca_addr;
 }

+/*
+ * MCA_IPID_UMC[InstanceId] holds the SMN Base Address for a UMC instance.
+ * MI-300 has a fixed, model-specific mapping between a UMC instance and its
+ * related Data Fabric CS instance.
+ * Use the UMC SMN Base Address value to find the appropriate CS instance ID.
+ */
+static const u32 csmap[32] = {
+	0x393f00, 0x293f00, 0x193f00, 0x093f00, 0x392f00, 0x292f00,
+	0x192f00, 0x092f00, 0x391f00, 0x291f00, 0x191f00, 0x091f00,
+	0x390f00, 0x290f00, 0x190f00, 0x090f00, 0x793f00, 0x693f00,
+	0x593f00, 0x493f00, 0x792f00, 0x692f00, 0x592f00, 0x492f00,
+	0x791f00, 0x691f00, 0x591f00, 0x491f00, 0x790f00, 0x690f00,
+	0x590f00, 0x490f00 };
+
+/* MCA_IPID[InstanceId] gives the UMC instance's SMN address */
+#define UMC_PHY_INSTANCE_NUM	GENMASK(31, 0)
+
+static u8 fixup_cs_inst_id(struct mce *m)
+{
+	u32 smn_addr = FIELD_GET(UMC_PHY_INSTANCE_NUM, m->ipid);
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(csmap); i++) {
+		if (smn_addr == csmap[i])
+			break;
+	}
+	return i;
+}
+
 static u8 get_socket_id(struct mce *m)
 {
 	return m->socketid;
@@ -187,6 +216,10 @@ static u64 get_norm_addr(struct mce *m)
 #define UMC_CHANNEL_NUM	GENMASK(31, 20)
 static u8 get_cs_inst_id(struct mce *m)
 {
+	/* MI300: static mapping table for MCA_IPID[InstanceId] to CS physical ID. */
+	if (df_cfg.rev == DF4p5 && df_cfg.flags.heterogeneous)
+		return fixup_cs_inst_id(m);
+
 	return FIELD_GET(UMC_CHANNEL_NUM, m->ipid);
 }

From patchwork Wed Oct 25 07:33:37 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "M K, Muralidhara"
X-Patchwork-Id: 13435670
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29A98C25B70 for ; Wed, 25 Oct 2023 07:44:30 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233303AbjJYHoa (ORCPT ); Wed, 25 Oct 2023 03:44:30 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43320 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234693AbjJYHoN (ORCPT ); Wed, 25 Oct 2023 03:44:13 -0400
Received: from
NAM10-MW2-obe.outbound.protection.outlook.com (mail-mw2nam10on2063.outbound.protection.outlook.com [40.107.94.63]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 589013AB9; Wed, 25 Oct 2023 00:35:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Do0jvJQPvCir+AupxNgiSRngdGrkfJb29xGdPrzwypjji4JtaAL8gAX6M9bKlMiK1wDot2QwcZhA6Fta4d75GuuxRRP7YmuHwEjU5AiaPRWpSaD5ngImNmZ4MZ+2ShaBTYMx7E8YyAGMGd2kadXTD/jNhHDZYiiHojmHUzBIbcdpfPoKf+0vE1ro5qYppZ7RM7vkA1imACQkeB0VdP11s3HHRyGyl7gDYXFmH++caUiZZnFcubj5/8XzWvDNMykpUv0A7pcqiQm5bIZiJ5nUZBu1MG1i/joslR969M+3UMDTzMcFfqURYGCZjo1FXsg4C1NwC1jVN9Fav14wOvfMEw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=dpoFvFcZ7Ba18t8x6fW6/vyJJIZnqfA8HkwtyKMz4aI=; b=EmHL0q8dkDB4eKJjY31Rur8aGUZU30fWMDZAA3IOK+zlu+kl56J+eIfjAl4uhPqK84BONmq5CFBZZyRN0CC7I3IYuQJDXUWjhBcnalU/RuQRvAinilHc/pEeQf7m6HYQJaVJf1JuQwSTCM6I1lsD0NjUKX6hoavnhKOJY9/1V1UvAfedotppvctPiwyCbotGSYduEt5Y7pFPddKu9bPoqy+aaN7GhBu9CwDL8BrQP7fnHIpSpVG3pE49/MSRu/EMl/NGTBR1o8kK3VbiqQnn8Hib84tt9jp+pUGdkNCQCDQd7MAyx7mNABG+aLT55dZLZgcQT6q0p0XD6nrTwuIz4Q== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dpoFvFcZ7Ba18t8x6fW6/vyJJIZnqfA8HkwtyKMz4aI=; b=kfkuV2TtO7ZRmvnFVQhl/HzX6CW1X5fHD5hOtX2rVZoKpEwnwaTx0qKpN+Y2ZvsscYdy1DZ352SKmlUHXdsY22A17dqUGBktw9sFqGAnV2iMRbO/gMcdIjAc6lEsJk8MZWonzW3NKbAAbv9vKa6V8+eXMmZjRh9sSY0oagUt0Pw= Received: from 
SJ0PR03CA0103.namprd03.prod.outlook.com (2603:10b6:a03:333::18) by MW4PR12MB7192.namprd12.prod.outlook.com (2603:10b6:303:22a::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6907.31; Wed, 25 Oct 2023 07:35:23 +0000 Received: from CO1PEPF000044F3.namprd05.prod.outlook.com (2603:10b6:a03:333:cafe::db) by SJ0PR03CA0103.outlook.office365.com (2603:10b6:a03:333::18) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6933.19 via Frontend Transport; Wed, 25 Oct 2023 07:35:23 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1PEPF000044F3.mail.protection.outlook.com (10.167.241.73) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6933.15 via Frontend Transport; Wed, 25 Oct 2023 07:35:22 +0000 Received: from amd.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Wed, 25 Oct 2023 02:35:20 -0500 From: Muralidhara M K To: , CC: , , , Muralidhara M K Subject: [PATCH 5/7] RAS: Add fixed Physical to logical CS ID mapping table Date: Wed, 25 Oct 2023 07:33:37 +0000 Message-ID: <20231025073339.630093-6-muralimk@amd.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231025073339.630093-1-muralimk@amd.com> References: <20231025073339.630093-1-muralimk@amd.com> MIME-Version: 1.0 X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB03.amd.com (10.181.40.144) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 
0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1PEPF000044F3:EE_|MW4PR12MB7192:EE_ X-MS-Office365-Filtering-Correlation-Id: de40815e-2f11-484d-ef59-08dbd52cf2c3 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 4WQWerrufzWdGSmvgsXHqFeAfesn2l+sCJDgiMCv2hC3F5pV95dRoxlidNIlwbFXm9aRmTfmawxvkVJ+2O1oNbjI3plmeeP/JcfVv/DFPzac7v10iFipt5yaUhNlIW3TO/jXUPfD811+z02wZ1MqGpSdMnX+FCuiC83SbB8da4c0L7L34tXyJ4uOWP7YZ1Dd+DCE0Dy173A+QD14XrpfSfIYljX6f779Ta3ngjSPJPu3VwdysyLrg0NqZFLtDecnonaEyRaA0mqeiZ0nE9DA0z8QgBi6UKT+jdSrhkIvCDFfKqzG8DC4lqHuephlKbWGE+iA+sZ0lwXMrPNDvEZnEOVjOcunyn1Dv4gQ0TNsqukIka5lQ+3Wp37NgpJiMcVp+3Xjvdvzpx+zvuo8ISERAtziFe/cQvnCIRQSLUdhI5YlbUqJJd8d8elUakePA5nv1tqdQwQxzU7d9fWCdmblsIRj81P6ptzExrJqXTIXA71mvoWSh6pvPqLPMYVPMDcLMSjqlpPFgYybRWlOwJO0ax2jjg8fD7ElGAiecKPHT2PL9C01ELlJsXBFSgKPP5oZH8VKORKfQJNSP9W/QCYrIPHKBe1eGjK+gx/4SLojQcDuQCx5z9GLHW2gXJ8NHOnlKkg1A2iwJjB8zJSFrXTL2smmvJKj0vLPGXHrsasrXgb5BUQocWVi9cxw9YF1Fl030BuKjHURGTRg4sJPlBprA1YJkrJIXCCIz9/pfhX55izn/xS5STdEPcX8Dv0Eq7Gu+h2ebzhzhyT1W7f3/lwPaA== X-Forefront-Antispam-Report: CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230031)(4636009)(39860400002)(346002)(376002)(396003)(136003)(230922051799003)(451199024)(82310400011)(64100799003)(186009)(1800799009)(36840700001)(40470700004)(46966006)(83380400001)(40460700003)(8936002)(40480700001)(5660300002)(70586007)(110136005)(70206006)(41300700001)(478600001)(7696005)(54906003)(316002)(6666004)(36756003)(8676002)(4326008)(81166007)(2906002)(426003)(356005)(16526019)(2616005)(1076003)(336012)(26005)(82740400003)(36860700001)(47076005)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Oct 2023 07:35:22.9855 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: de40815e-2f11-484d-ef59-08dbd52cf2c3 X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d 
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com]
X-MS-Exchange-CrossTenant-AuthSource: CO1PEPF000044F3.namprd05.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: MW4PR12MB7192
Precedence: bulk
List-ID:
X-Mailing-List: linux-edac@vger.kernel.org

From: Muralidhara M K

AMD MI300A models have a single Data Fabric (DF) instance per socket, so
only one of the 4 AIDs is software-visible using PCI Device 18h.

Add a static lookup table that converts a physical CS ID to a logical
CS ID, covering all 4 AIDs in a socket.

Signed-off-by: Muralidhara M K
---
 drivers/ras/amd/atl/denormalize.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/ras/amd/atl/denormalize.c b/drivers/ras/amd/atl/denormalize.c
index b233a26f68fc..4c127347a56e 100644
--- a/drivers/ras/amd/atl/denormalize.c
+++ b/drivers/ras/amd/atl/denormalize.c
@@ -442,6 +442,20 @@ static u16 get_logical_cs_fabric_id(struct addr_ctx *ctx)
 	return (phys_cs_fabric_id & df_cfg.node_id_mask) | log_cs_fabric_id;
 }

+/* Physical CS to Logical CS mapping for MI300 AIDs */
+u16 phy_to_logicalcs_mapping_mi300_aid[] = { 12, 13, 14, 15, 8, 9, 10, 11,
+					     4, 5, 6, 7, 0, 1, 2, 3,
+					     28, 29, 30, 31, 24, 25, 26, 27,
+					     20, 21, 22, 23, 16, 17, 18, 19};
+
+static u16 get_logical_cs_fabric_id_mi300_die(struct addr_ctx *ctx)
+{
+	if (ctx->inst_id >= ARRAY_SIZE(phy_to_logicalcs_mapping_mi300_aid))
+		return -EINVAL;
+
+	return phy_to_logicalcs_mapping_mi300_aid[ctx->inst_id];
+}
+
 static int denorm_addr_common(struct addr_ctx *ctx)
 {
 	u64 denorm_addr;
@@ -451,7 +465,11 @@ static int denorm_addr_common(struct addr_ctx *ctx)
 	 * Convert the original physical CS Fabric ID to a logical value.
 	 * This is required for non-power-of-two and other interleaving modes.
 	 */
-	ctx->cs_fabric_id = get_logical_cs_fabric_id(ctx);
+	if (df_cfg.rev == DF4p5 && df_cfg.flags.heterogeneous)
+		ctx->cs_fabric_id = (ctx->cs_fabric_id & df_cfg.node_id_mask) |
+				     get_logical_cs_fabric_id_mi300_die(ctx);
+	else
+		ctx->cs_fabric_id = get_logical_cs_fabric_id(ctx);

 	denorm_addr = make_space_for_cs_id(ctx);
 	cs_id = calculate_cs_id(ctx);

From patchwork Wed Oct 25 07:33:38 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "M K, Muralidhara"
X-Patchwork-Id: 13435672
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6B8B4C0032E for ; Wed, 25 Oct 2023 07:46:44 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234182AbjJYHqn (ORCPT ); Wed, 25 Oct 2023 03:46:43 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234184AbjJYHqC (ORCPT ); Wed, 25 Oct 2023 03:46:02 -0400
Received: from NAM02-BN1-obe.outbound.protection.outlook.com (mail-bn1nam02on2087.outbound.protection.outlook.com [40.107.212.87]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E215D3C01; Wed, 25 Oct 2023 00:35:29 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=aD1HXrSL4JpT7l6NTHSM/6oVuwCvTXoZldxf9+7plya0LQKL87HzAq4E5moz1ybUazpGZbXU8faqQ/GbvqiX6P6giewY01A4sr4Es+i/dzurCExDyBgZegPkyBuIjsAs0Y/Wx0Y6fWE7UoZ5Pxb+/+p9iAHp2cxE8K6h2A2mrxiHbX9lGanGD60B39PFAj69jRL9mDmFXKO6OJL6+j0TE0DoQFsSUGa8a3H7T2Ror1y76bKxKG6sE7gyxMkTdnx7YKxlBYxeFr45p8Mj9KRsL88xH8AvwBA5JaupA9KJWG91VZot8Y29EfM/fk4PB8ebUAqfzpWvzRLdQ6Z3vRQC3A==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901;
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=AAgaZShz+j9RBH+IhtiZKlkiSI79z9oHr/WIVdQl0dA=; b=YMUyZGh7UE5WJdM5l4CNvXuNT8CaRMNDusUVW+7tRdEAgwBe9AJ79KsuSGVhi7y08LgXURIzlte2ktFk6n/mvx7BEHFnGQedt0ZBQTFYJjaKKz7Z0uDu5UFxDZuSfWeYsL2hSu71dP6bm2SKrRA6/ITVQpYlCirKRcJcdf2j28D5YkBMwK7JJjigQuDFsiRSsv5D4B0KTKn3/fUCLQsh7uNSTvyYs4OcJnjtPdt4SBB57wLQ1sg7cpfSYyOEIf+fx8SdwRA7k56LGPldI4MGV+XPNKwfLvJWPYkHNC23Ck9sdPKaJPAftrtdVJgzPEHhEPJvEUtlm5FdDimN+uL0dA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=AAgaZShz+j9RBH+IhtiZKlkiSI79z9oHr/WIVdQl0dA=; b=3LTBh1zprNiwxiYzMyLsD63iw7r9+K8fGj62C0l9GCTO62o1+kEzzzz8ol6EA9aHRGKQEsv3+wR2+d77INIlGyK8IDFNVfMLIvtaAAOYgznTdipPMGiomHfbMbq98e2qPY0NBCbyA/vQhTdwloFsS+ok8qpSlvIQNLHoG4qlit4= Received: from BYAPR21CA0002.namprd21.prod.outlook.com (2603:10b6:a03:114::12) by LV2PR12MB5846.namprd12.prod.outlook.com (2603:10b6:408:175::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6907.26; Wed, 25 Oct 2023 07:35:26 +0000 Received: from CO1PEPF000044F0.namprd05.prod.outlook.com (2603:10b6:a03:114:cafe::b5) by BYAPR21CA0002.outlook.office365.com (2603:10b6:a03:114::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6933.4 via Frontend Transport; Wed, 25 Oct 2023 07:35:25 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none 
header.from=amd.com;
Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C
Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1PEPF000044F0.mail.protection.outlook.com (10.167.241.70) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6933.15 via Frontend Transport; Wed, 25 Oct 2023 07:35:25 +0000
Received: from amd.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Wed, 25 Oct 2023 02:35:22 -0500
From: Muralidhara M K
To: ,
CC: , , , Muralidhara M K
Subject: [PATCH 6/7] RAS: Get CS fabric ID register bit fields
Date: Wed, 25 Oct 2023 07:33:38 +0000
Message-ID: <20231025073339.630093-7-muralimk@amd.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20231025073339.630093-1-muralimk@amd.com>
References: <20231025073339.630093-1-muralimk@amd.com>
MIME-Version: 1.0
X-Originating-IP: [10.180.168.240]
X-ClientProxiedBy: SATLEXMB03.amd.com (10.181.40.144) To SATLEXMB04.amd.com (10.181.40.145)
X-EOPAttributedMessage: 0
X-MS-PublicTrafficType: Email
X-MS-TrafficTypeDiagnostic: CO1PEPF000044F0:EE_|LV2PR12MB5846:EE_
X-MS-Office365-Filtering-Correlation-Id: 3bbc378d-1015-4740-b5a7-08dbd52cf43d
X-MS-Exchange-SenderADCheck: 1
X-MS-Exchange-AntiSpam-Relay: 0
X-Microsoft-Antispam: BCL:0;
X-Microsoft-Antispam-Message-Info:
fHS+Fz3hLFFKCUrD+XaPiWIb5aMWAjTcoxnOQOHvcoZkzs+CcZPejnmhVvT547yqsL3MCk0Slu1Krq1O0sIURLRkiA1uVp4YCuMAY1/pWE498XT/ISgz9566jOWb2Ed6rX12Q+hTnGEfobPJoqPZwetyG+YTVpFvTpZ8T/3hbm3YloTSaLk4AYDXZUL6AyD+Us+uHO8hZTteBAdUyR0GNddeGsG+AtWPs+tmYAUsjmR0upWqDNe5QnZ1VkFEmtUVUy3eT+QPgNVgf+GRXZqwbuO8xaE+uCwOhMIku5kUfVXBre+UFHazRIXPOvBL0WI8VymBY8kEkaq0pTU+1jY0Xpdq3ScRxkA5qiu0zIeeM9z6mT4b+ypzeDTNCZBwQJafs/0S0h4p+FYez8c7xwfl8SS2RDPtzf7Y+HpedhELl0sisamkKUFsOFlycMkDlnd81hHY9vqkAFc4XG52dzX37bxIOCNQoy4TReOYPpH7I5QlY9fsflQC7UaQ3JfgivdUWVKX/yI5eGqFdys8NpgUsPCB+dg9QT6kBqxeglh+dIxp3EFczVOsWNtJzuCTIBwtil42TPpferS5M9coGmet5aeP9aWkxTvvAS+CzL/19O9lR4+9mPFHyxPlOkU1Oqi+QT1x8iXwWO6jalSFCBwTuW87DMOHPrL6Wnj58xq7FVyF+G+TbpNHSNdmsaWOt7UtM1bFQILP6wE/wZNiZnIzXcuS2DNmQ9QWzbiU4S5lB2v1dJSPvJomaUoXp4f1jJXkGhdCGBvqIrDp7FMXXSV2YA== X-Forefront-Antispam-Report: CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230031)(4636009)(396003)(376002)(346002)(39860400002)(136003)(230922051799003)(82310400011)(1800799009)(64100799003)(186009)(451199024)(40470700004)(36840700001)(46966006)(36756003)(478600001)(6666004)(4744005)(81166007)(426003)(5660300002)(336012)(7696005)(2906002)(83380400001)(2616005)(40460700003)(47076005)(1076003)(41300700001)(316002)(70586007)(70206006)(54906003)(110136005)(36860700001)(4326008)(8936002)(8676002)(356005)(40480700001)(16526019)(26005)(82740400003)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Oct 2023 07:35:25.4613 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 3bbc378d-1015-4740-b5a7-08dbd52cf43d X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: CO1PEPF000044F0.namprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous 
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: LV2PR12MB5846
Precedence: bulk
List-ID:
X-Mailing-List: linux-edac@vger.kernel.org

From: Muralidhara M K

Read the correct register bit fields for cs_fabric_id so that address
translation works.

Signed-off-by: Muralidhara M K
---
 drivers/ras/amd/atl/reg_fields.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/ras/amd/atl/reg_fields.h b/drivers/ras/amd/atl/reg_fields.h
index c3853a25217b..6b60091f235b 100644
--- a/drivers/ras/amd/atl/reg_fields.h
+++ b/drivers/ras/amd/atl/reg_fields.h
@@ -28,14 +28,14 @@
  * Rev	Fieldname	Bits
  *
  * D18F0x50 [Fabric Block Instance Information 3]
- * DF2		BlockFabricId	[19:8]
+ * DF2		BlockFabricId	[13:8]
  * DF3		BlockFabricId	[19:8]
  * DF3p5	BlockFabricId	[19:8]
  * DF4		BlockFabricId	[19:8]
- * DF4p5	BlockFabricId	[15:8]
+ * DF4p5	BlockFabricId	[19:8]
  */
-#define DF2_CS_FABRIC_ID	GENMASK(19, 8)
-#define DF4p5_CS_FABRIC_ID	GENMASK(15, 8)
+#define DF2_CS_FABRIC_ID	GENMASK(13, 8)
+#define DF4p5_CS_FABRIC_ID	GENMASK(19, 8)

 /*
  * Component ID Mask

From patchwork Wed Oct 25 07:33:39 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "M K, Muralidhara"
X-Patchwork-Id: 13435657
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C8ADEC25B47 for ; Wed, 25 Oct 2023 07:38:15 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233982AbjJYHiQ (ORCPT ); Wed, 25 Oct 2023 03:38:16 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37566 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232691AbjJYHhd (ORCPT ); Wed, 25 Oct 2023 03:37:33 -0400
Received: from
NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2075.outbound.protection.outlook.com [40.107.243.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A3B9E19B5; Wed, 25 Oct 2023 00:35:32 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=K9s0NAXkCh117UAXTm44w4ehhmSqIkKLtTdFkY8X2Tom+XG5WFeNOSyycyMxNZ9BWiqfYKSuxWQA+oQL4CS3rtPQlEq8DRIDPYLizwhhuCLXUM90VWncmCqMdGHcvIpDHkFso2Ye11AbucmLC1aSryGskHURwmJ6R8IOVPRX1EmOF5TzgAwdhFZu4eKicGvNvgiSMBYmaMbH8Ypaxfxudj4QN7x6xim7LUEcqZ671P30OCoC70+yebwT1aNRlgE0ayotYc3be9oEN/rvmSfNZdUPvUrIyFH01LRN+pe3p3QqhgiKSS8ZYeQKg10Is29g0AuTCxLur/IME/u7p9gAhA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=gyYjYPuqEyptSuSvmwm9hwLqcrOhvOuUUJ9kDFyKlKQ=; b=NlJnDXIFSx1W1/qNDmEltTdu+8Lv8P2xU+ND8tq0kGSmyZavvo/ylPOK538x8Kd1JPQSe0vPx9q9NBkdjIW0pUrr74QiHM/Zv72e4rmOJWm5V4yMuoF98hY1jPLwZpIU6okv6iUDvwVnl5giVqJDrap9kBY3jEhp4PGih1MMrStlB+51BqK+86QU1AZKOc+slYmycvMumqbNu/4tIQ8tqwnCIYe6c7el6kHsO5dOvxUscbhatYqTCqTxum6ePBIsmVsMFk9xnONPzhZ7RWtLNDj53SDXAJHYaGIp8HqowV3nFi1zvFSms4rVNOrgefzwJ97GUmQpwt24kziJ0ObDxA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=gyYjYPuqEyptSuSvmwm9hwLqcrOhvOuUUJ9kDFyKlKQ=; b=jlQM8ZBjZO2ulrUq+kEmY3r1qE+WjVT3MxHZ89s+x6CUYL5mNodg+xNuqGVubdX+Pj6it3N32zvKj+9VDIibidEL38ynoT+VpPmJlE1m+FR4nr61ZsyMh7Xppimb7diFaEt5y3XEKMlKGY+d57e8nzevJHzUI7+BqfR94DicayY= Received: from 
SJ0PR03CA0221.namprd03.prod.outlook.com (2603:10b6:a03:39f::16) by DM6PR12MB4514.namprd12.prod.outlook.com (2603:10b6:5:2a7::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6933.19; Wed, 25 Oct 2023 07:35:28 +0000 Received: from CO1PEPF000044F3.namprd05.prod.outlook.com (2603:10b6:a03:39f:cafe::54) by SJ0PR03CA0221.outlook.office365.com (2603:10b6:a03:39f::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6907.35 via Frontend Transport; Wed, 25 Oct 2023 07:35:27 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1PEPF000044F3.mail.protection.outlook.com (10.167.241.73) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6933.15 via Frontend Transport; Wed, 25 Oct 2023 07:35:27 +0000 Received: from amd.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Wed, 25 Oct 2023 02:35:25 -0500 From: Muralidhara M K To: , CC: , , , Muralidhara M K Subject: [PATCH 7/7] EDAC/amd64: RAS: platform/x86/amd: Identify all physical pages in row Date: Wed, 25 Oct 2023 07:33:39 +0000 Message-ID: <20231025073339.630093-8-muralimk@amd.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231025073339.630093-1-muralimk@amd.com> References: <20231025073339.630093-1-muralimk@amd.com> MIME-Version: 1.0 X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB03.amd.com (10.181.40.144) To SATLEXMB04.amd.com (10.181.40.145) 
X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1PEPF000044F3:EE_|DM6PR12MB4514:EE_ X-MS-Office365-Filtering-Correlation-Id: 588e2164-369d-491c-93a2-08dbd52cf591 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: TEyud1PVUIzKFQMr8MeURiWCaUY5u1n6A8bTL2oA950RSL4G2aoDkHkHoCickSQfNL+CP0YXyaEDD7otuytzeHwQllj63rj7AOeXaNreRQ2sI/I5I/Awow5/lYhYIx/GLXwQZcKWLPn1kLQwbBK7S5rGBc2DXCPsMvXypL83+O6TXaDfH81VDAOFHujOdsN6mQvJQhTUPCBKLfuf3kNypYRl4cp3I2cn1q2ecQ+UVtwDHTmBkEBG9IIVpXn0wed8KgogsLJYtb/8t+lRCj9uaZw5gVkjPNzoY+vQA3TslU1YLI2OwM5Rclr4GC6cxLO0x+8jsMNgvDlPtIV6wWJcQW2rDiGzGMnRc0VCrYad5hCoIvt1OLtQttaK9nleVKNB7OYPqoaQIK7nKk7NknDLgxOnxcOtq3XFjHbcszgydUh6jJyXSHMCvnMbIeWLl4jST3NVlTB5PSdOVXLls5wLBn9AAzpTpXt2mnQHUprC448sirKWkFmJymdjAWP+6WVKyom5Gk8N6TiFeko7fkY6Ohwyjho6TWVCVdhuEgadc+XRz1lLCKBwvD0HiCPKvM877vBxQxZoc2SNf0UsdelgXS1cPZ1nYG8n85WjsXvyecMk9ccrRn8uHpLLVRjGdhVwSeOvVlt0yGXSHmgnwpj7uBMGgcK/5alyIsutBbNmVifdgHGmZJuldv2d9A7k95l76sHcPvN6q8wesMpj/RoDBGiQeqTNxQWU0yOefT57Rrgesu5ho170yqnOBuu3u3vAqfzDd97Jx5wa83FMAhUqTA== X-Forefront-Antispam-Report: CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230031)(4636009)(39860400002)(136003)(396003)(376002)(346002)(230922051799003)(82310400011)(1800799009)(64100799003)(186009)(451199024)(46966006)(40470700004)(36840700001)(4326008)(8676002)(83380400001)(8936002)(2906002)(81166007)(356005)(82740400003)(5660300002)(41300700001)(7696005)(6666004)(40480700001)(316002)(110136005)(336012)(426003)(1076003)(26005)(478600001)(16526019)(2616005)(47076005)(36756003)(54906003)(70206006)(70586007)(40460700003)(36860700001)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Oct 2023 07:35:27.6886 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 588e2164-369d-491c-93a2-08dbd52cf591 X-MS-Exchange-CrossTenant-Id: 
3dd8961f-e488-4e60-8e11-a82d994e183d
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com]
X-MS-Exchange-CrossTenant-AuthSource: CO1PEPF000044F3.namprd05.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM6PR12MB4514
Precedence: bulk
List-ID:
X-Mailing-List: linux-edac@vger.kernel.org

From: Muralidhara M K

AMD systems have HBM memory embedded with the chips, and the entire
memory is managed by the host OS. Error containment needs to be
reliable because HBM memory cannot be replaced.

Persist all UMC DRAM ECC errors: the OS can make the bad or poisoned
page state persistent so that it does not use the memory after the next
boot.

The MCA error address reported for HBM is in the format
PC/SID/Bank/ROW/COL.

For example, on MI300A, C1/C0 (column bits 1-0) are at SPA bits 6-5.
Since a PFN only depends on SPA bits 12 and higher, column bits 1-0 can
be skipped. Column bits C2, C3 and C4 then give 8 possible address
combinations in a row.

So, identify all physical pages in an HBM row and retire all of them to
get rid of intermittent or recurrent memory errors.

Signed-off-by: Muralidhara M K
---
 drivers/edac/amd64_edac.c |   5 ++
 drivers/ras/amd/atl/umc.c | 103 ++++++++++++++++++++++++++++++++++++++
 include/linux/amd-atl.h   |   2 +
 3 files changed, 110 insertions(+)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 79c6c552ee14..d0db11e19a46 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2838,6 +2838,11 @@ static void decode_umc_error(int node_id, struct mce *m)

 	error_address_to_page_and_offset(sys_addr, &err);

+	if (pvt->fam == 0x19 && (pvt->model >= 0x90 && pvt->model <= 0x9f)) {
+		if (identify_poison_pages_retire_row(m))
+			return;
+	}
+
 log_error:
 	__log_ecc_error(mci, &err, ecc_type);
 }
diff --git a/drivers/ras/amd/atl/umc.c b/drivers/ras/amd/atl/umc.c
index 52247a7949fb..d31ad7680ff1 100644
--- a/drivers/ras/amd/atl/umc.c
+++ b/drivers/ras/amd/atl/umc.c
@@ -255,3 +255,106 @@ int umc_mca_addr_to_sys_addr(struct mce *m, u64 *sys_addr)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(umc_mca_addr_to_sys_addr);
+
+/*
+ * High Bandwidth Memory (HBM v3) has a fixed number of columns in a
+ * row (8 columns in one HBM row).
+ * Extract the column bits to find all combinations of masks and retire
+ * all the poison pages in a row.
+ */
+#define MAX_COLUMNS_IN_HBM_ROW	8
+
+/* The C2 bit in CH NA address */
+#define UMC_NA_C2_BIT	BIT(8)
+/* The C3 bit in CH NA address */
+#define UMC_NA_C3_BIT	BIT(9)
+/* The C4 bit in CH NA address */
+#define UMC_NA_C4_BIT	BIT(14)
+
+/* masks to get all possible combinations of column addresses */
+#define C_1_1_1_MASK	(UMC_NA_C4_BIT | UMC_NA_C3_BIT | UMC_NA_C2_BIT)
+#define C_1_1_0_MASK	(UMC_NA_C4_BIT | UMC_NA_C3_BIT)
+#define C_1_0_1_MASK	(UMC_NA_C4_BIT | UMC_NA_C2_BIT)
+#define C_1_0_0_MASK	(UMC_NA_C4_BIT)
+#define C_0_1_1_MASK	(UMC_NA_C3_BIT | UMC_NA_C2_BIT)
+#define C_0_1_0_MASK	(UMC_NA_C3_BIT)
+#define C_0_0_1_MASK	(UMC_NA_C2_BIT)
+#define C_0_0_0_MASK	(~C_1_1_1_MASK)
+
+/* Identify the physical pages for all column address combinations in a row */
+static int amd_umc_identify_pages_in_row(struct mce *m, u64 *spa_addr)
+{
+	u8 cs_inst_id = get_cs_inst_id(m);
+	u8 socket_id = get_socket_id(m);
+	u64 norm_addr = get_norm_addr(m);
+	u8 die_id = get_die_id(m);
+	u16 df_acc_id = get_df_acc_id(m);
+
+	u64 retire_addr, column;
+	u64 column_masks[] = { 0, C_0_0_1_MASK, C_0_1_0_MASK, C_0_1_1_MASK,
+			       C_1_0_0_MASK, C_1_0_1_MASK, C_1_1_0_MASK, C_1_1_1_MASK };
+
+	/* clear and loop for all possibilities of [c4 c3 c2] */
+	norm_addr &= C_0_0_0_MASK;
+
+	for (column = 0; column < ARRAY_SIZE(column_masks); column++) {
+		retire_addr = norm_addr | column_masks[column];
+
+		if (norm_to_sys_addr(df_acc_id, socket_id, die_id, cs_inst_id, &retire_addr))
+			return -EINVAL;
+		*(spa_addr + column) = retire_addr;
+	}
+
+	return 0;
+}
+
+/* Remove duplicate addresses from the set of column address combinations */
+static void amd_umc_find_duplicate_spa(u64 arr[], int *size)
+{
+	int i, j, k;
+
+	/* use nested for loops to find the duplicate elements in the array */
+	for (i = 0; i < *size; i++) {
+		for (j = i + 1; j < *size; j++) {
+			/* check duplicate element */
+			if (arr[i] == arr[j]) {
+				/* delete the duplicate by shifting the following elements down */
+				for (k = j; k < (*size - 1); k++)
+					arr[k] = arr[k + 1];
+
+				/* decrease the size of array after removing duplicate element */
+				(*size)--;
+
+				/* if the position of the elements changes, don't increase index j */
+				j--;
+			}
+		}
+	}
+}
+
+int identify_poison_pages_retire_row(struct mce *m)
+{
+	int i, ret, addr_range;
+	unsigned long pfn;
+	u64 col[MAX_COLUMNS_IN_HBM_ROW];
+	u64 *spa_addr = col;
+
+	/* Identify all pages in a row */
+	pr_info("Identify all physical Pages in a row for MCE addr:0x%llx\n", m->addr);
+	ret = amd_umc_identify_pages_in_row(m, spa_addr);
+	if (!ret) {
+		for (i = 0; i < MAX_COLUMNS_IN_HBM_ROW; i++)
+			pr_info("col[%d]_addr:0x%llx ", i, spa_addr[i]);
+	}
+	/* Find duplicate entries from all 8 physical addresses in a row */
+	addr_range = ARRAY_SIZE(col);
+	amd_umc_find_duplicate_spa(spa_addr, &addr_range);
+	/* do page retirement on all system physical addresses */
+	for (i = 0; i < addr_range; i++) {
+		pfn = PHYS_PFN(spa_addr[i]);
+		memory_failure(pfn, 0);
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(identify_poison_pages_retire_row);
diff --git a/include/linux/amd-atl.h b/include/linux/amd-atl.h
index c625ea3ab5d0..df24ae592c4e 100644
--- a/include/linux/amd-atl.h
+++ b/include/linux/amd-atl.h
@@ -25,4 +25,6 @@ static inline int amd_umc_mca_addr_to_sys_addr(struct mce *m, u64 *sys_addr)
 	return umc_mca_addr_to_sys_addr(m, sys_addr);
 }

+int identify_poison_pages_retire_row(struct mce *m);
+
 #endif /* _AMD_ATL_H */
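
The row expansion in patch 7/7 can be tried outside the kernel with a small
userspace sketch. This is only an illustration under stated assumptions, not
the code from the series: expand_row() and dedup_u64() are hypothetical names,
the normalized-to-system address translation (norm_to_sys_addr()) is left out
entirely, and a 4 KiB page size is assumed when deriving page frame numbers.
The only values taken from the patch are the C2/C3/C4 bit positions (8, 9, 14).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Column bits C2, C3 and C4 of the normalized address, as defined in the patch. */
#define NA_C2_BIT	(UINT64_C(1) << 8)
#define NA_C3_BIT	(UINT64_C(1) << 9)
#define NA_C4_BIT	(UINT64_C(1) << 14)
#define NA_COL_MASK	(NA_C2_BIT | NA_C3_BIT | NA_C4_BIT)
#define COLS_PER_ROW	8

/*
 * Hypothetical helper: expand one normalized address into the 8 addresses
 * that differ only in [C4 C3 C2]. Bit 0 of the loop index selects C2,
 * bit 1 selects C3 and bit 2 selects C4.
 */
static void expand_row(uint64_t norm_addr, uint64_t out[COLS_PER_ROW])
{
	uint64_t base = norm_addr & ~NA_COL_MASK;
	unsigned int col;

	for (col = 0; col < COLS_PER_ROW; col++) {
		uint64_t bits = 0;

		if (col & 1)
			bits |= NA_C2_BIT;
		if (col & 2)
			bits |= NA_C3_BIT;
		if (col & 4)
			bits |= NA_C4_BIT;
		out[col] = base | bits;
	}
}

/*
 * Hypothetical helper: remove duplicates in place, keeping the first
 * occurrence of each value in its original order. Returns the new count.
 */
static size_t dedup_u64(uint64_t *arr, size_t n)
{
	size_t kept = 0;
	size_t i, j;

	assert(arr != NULL || n == 0);

	for (i = 0; i < n; i++) {
		for (j = 0; j < kept; j++) {
			if (arr[j] == arr[i])
				break;
		}
		if (j == kept)
			arr[kept++] = arr[i];
	}
	return kept;
}
```

For example, expanding 0x12345 gives eight addresses from 0x12045 to 0x16345.
Shifting each right by 12 and passing the result through dedup_u64() leaves
two page frames, 0x12 and 0x16. The sketch deduplicates page frame numbers
rather than full addresses, since page retirement acts on whole pages; the
patch itself compares the translated system physical addresses.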