From patchwork Mon May 27 12:12:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ofir Bitton X-Patchwork-Id: 13675138 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3CB8CC25B74 for ; Mon, 27 May 2024 12:13:15 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9892E10FA1D; Mon, 27 May 2024 12:13:11 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=habana.ai header.i=@habana.ai header.b="OjZ20Tx9"; dkim-atps=neutral Received: from mail02.habana.ai (habanamailrelay02.habana.ai [62.90.112.121]) by gabe.freedesktop.org (Postfix) with ESMTPS id 766D410E94D for ; Mon, 27 May 2024 12:13:06 +0000 (UTC) Received: internal info suppressed DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=habana.ai; s=default; t=1716811982; bh=Gweql/mYSZp0D9OIhCBaIAviu0sxLWqrmSCFgLSuNRw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OjZ20Tx90j0hd7nl5hmcrPTat70TO6SQ6WRyUe6FE8V4QbtQ+zPbYRUKkPQw65Csi gcg4qB6ZAwSrP/Alh64/jfqXDyJ4yfvQCjOANI1r+71rAjaIObEutwwHBg19qTFK3n lyTknEjr8h+j0afti7ZrFFKNszw2z4NqwuMfzDQReNNIV19Imtzqia0coAB3NLcwtN y7yCqrz1taeQw+GTKOlalQUWmJmtILAHnSL+2h4DdgnGO8YkewnACP6PaUHRJC+ZQw KbNAuuX17u6nH66jXOSPT10sKHs7yZoKLBUcYJ0vIbxZXdQ6B+c/MAxGG7QxyYhfD5 YIGxQVkExws8A== Received: from obitton-vm-u22.habana-labs.com (localhost [127.0.0.1]) by obitton-vm-u22.habana-labs.com (8.15.2/8.15.2/Debian-22ubuntu3) with ESMTP id 44RCCuav1921351; Mon, 27 May 2024 15:12:57 +0300 From: Ofir Bitton To: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org Cc: Tomer Tayar Subject: [PATCH 5/8] accel/habanalabs/gaudi2: assume hard-reset by FW upon MC SEI severe error Date: Mon, 27 May 2024 15:12:51 +0300 Message-Id: <20240527121254.1921306-5-obitton@habana.ai> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240527121254.1921306-1-obitton@habana.ai> References: <20240527121254.1921306-1-obitton@habana.ai> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Tomer Tayar FW initiates a hard reset upon an MC SEI severe error. Align the driver to expect this reset and avoid accessing the device until the reset is done. Signed-off-by: Tomer Tayar Reviewed-by: Ofir Bitton --- drivers/accel/habanalabs/gaudi2/gaudi2.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c b/drivers/accel/habanalabs/gaudi2/gaudi2.c index 18cc7b773650..4791582d157c 100644 --- a/drivers/accel/habanalabs/gaudi2/gaudi2.c +++ b/drivers/accel/habanalabs/gaudi2/gaudi2.c @@ -10004,6 +10004,7 @@ static void gaudi2_handle_eqe(struct hl_device *hdev, struct hl_eq_entry *eq_ent if (gaudi2_handle_hbm_mc_sei_err(hdev, event_type, &eq_entry->sei_data)) { reset_flags |= HL_DRV_RESET_FW_FATAL_ERR; reset_required = true; + is_critical = eq_entry->sei_data.hdr.is_critical; } error_count++; break; @@ -10235,8 +10236,7 @@ static void gaudi2_handle_eqe(struct hl_device *hdev, struct hl_eq_entry *eq_ent gaudi2_print_event(hdev, event_type, true, "No error cause for H/W event %u", event_type); - if ((gaudi2_irq_map_table[event_type].reset != EVENT_RESET_TYPE_NONE) || - reset_required) { + if ((gaudi2_irq_map_table[event_type].reset != EVENT_RESET_TYPE_NONE) || reset_required) { if (reset_required || (gaudi2_irq_map_table[event_type].reset == EVENT_RESET_TYPE_HARD)) reset_flags |= HL_DRV_RESET_HARD;