From patchwork Mon May 27 12:12:48 2024
X-Patchwork-Submitter: Ofir Bitton
X-Patchwork-Id: 13675142
From: Ofir Bitton
To: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Cc: Farah Kassabri
Subject: [PATCH 2/8] accel/habanalabs: check for errors after preboot is ready
Date: Mon, 27 May 2024 15:12:48 +0300
Message-Id: <20240527121254.1921306-2-obitton@habana.ai>
In-Reply-To: <20240527121254.1921306-1-obitton@habana.ai>
References: <20240527121254.1921306-1-obitton@habana.ai>

From: Farah Kassabri

The driver should check for and report any fatal errors detected by
preboot before it attempts to load the boot fit. Some errors may cause
the driver to stop the boot process and mark the device as unusable.

This check lets the driver fail early, print the error reported by
preboot, and skip the attempt to load the boot fit, which would only
waste time because it is bound to fail due to that error.

Signed-off-by: Farah Kassabri
Reviewed-by: Ofir Bitton
---
 drivers/accel/habanalabs/common/firmware_if.c | 24 +++++++++----------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/accel/habanalabs/common/firmware_if.c b/drivers/accel/habanalabs/common/firmware_if.c
index 886b3c07503d..6f0c40b12072 100644
--- a/drivers/accel/habanalabs/common/firmware_if.c
+++ b/drivers/accel/habanalabs/common/firmware_if.c
@@ -1482,7 +1482,7 @@ int hl_fw_wait_preboot_ready(struct hl_device *hdev)
 {
 	struct pre_fw_load_props *pre_fw_load = &hdev->fw_loader.pre_fw_load;
 	u32 status = 0, timeout;
-	int rc, tries = 1;
+	int rc, tries = 1, fw_err = 0;
 	bool preboot_still_runs;
 
 	/* Need to check two possible scenarios:
@@ -1522,18 +1522,18 @@ int hl_fw_wait_preboot_ready(struct hl_device *hdev)
 		}
 	}
 
-	if (rc) {
+	/* If we read all FF, then something is totally wrong, no point
+	 * of reading specific errors
+	 */
+	if (status != -1)
+		fw_err = fw_read_errors(hdev, pre_fw_load->boot_err0_reg,
+				pre_fw_load->boot_err1_reg,
+				pre_fw_load->sts_boot_dev_sts0_reg,
+				pre_fw_load->sts_boot_dev_sts1_reg);
+	if (rc || fw_err) {
 		detect_cpu_boot_status(hdev, status);
-		dev_err(hdev->dev, "CPU boot ready timeout (status = %d)\n", status);
-
-		/* If we read all FF, then something is totally wrong, no point
-		 * of reading specific errors
-		 */
-		if (status != -1)
-			fw_read_errors(hdev, pre_fw_load->boot_err0_reg,
-					pre_fw_load->boot_err1_reg,
-					pre_fw_load->sts_boot_dev_sts0_reg,
-					pre_fw_load->sts_boot_dev_sts1_reg);
+		dev_err(hdev->dev, "CPU boot %s (status = %d)\n",
+			fw_err ? "failed due to an error" : "ready timeout", status);
 		return -EIO;
 	}
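
For readers who want to follow the resulting control flow outside the
kernel tree, the decision logic of the second hunk can be sketched as a
small user-space fragment. This is only an illustration under stated
assumptions: struct poll_result, read_fw_errors() and check_preboot()
are hypothetical stand-ins, not driver APIs, and the real
fw_read_errors() reads and decodes hardware registers rather than taking
values as arguments.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the outcome of polling the preboot status. */
struct poll_result {
	int rc;          /* 0 if preboot became ready, non-zero on timeout */
	uint32_t status; /* last value read from the status register */
};

/*
 * Hypothetical stand-in for fw_read_errors(): returns non-zero when
 * either (assumed) error register value reports a fatal error.
 */
static int read_fw_errors(uint32_t err0, uint32_t err1)
{
	return (err0 || err1) ? -EIO : 0;
}

/* Mirrors the patched flow: read errors first, then fail on rc OR fw_err. */
static int check_preboot(const struct poll_result *p,
			 uint32_t err0, uint32_t err1)
{
	int fw_err = 0;

	assert(p != NULL);

	/* An all-ones read means register access is broken; skip details. */
	if (p->status != UINT32_MAX)
		fw_err = read_fw_errors(err0, err1);

	if (p->rc || fw_err) {
		fprintf(stderr, "CPU boot %s (status = %u)\n",
			fw_err ? "failed due to an error" : "ready timeout",
			(unsigned int)p->status);
		return -EIO;
	}

	return 0;
}
```

The case the patch adds is the one where polling succeeds (rc is 0) but
an error register is non-zero: before the change that path continued on
to the boot fit load, whereas in this sketch check_preboot() returns
-EIO right away.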