From patchwork Tue Feb 25 07:55:47 2025
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 13989434
From: Tomeu Vizoso
Date: Tue, 25 Feb 2025 08:55:47 +0100
Subject: [PATCH v2 1/7] dt-bindings: npu: rockchip,rknn: Add bindings
Message-Id: <20250225-6-10-rocket-v2-1-d4dbcfafc141@tomeuvizoso.net>
References: <20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizoso.net>
In-Reply-To: <20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizoso.net>
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
 Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
 Christian König, Sebastian Reichel, Jeffrey Hugo
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
 linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, Tomeu Vizoso

Add the bindings for the Neural Processing Unit IP from Rockchip.

v2:
- Adapt to new node structure (one node per core, each with its own
  IOMMU)
- Several misc.
  fixes from Sebastian Reichel

Signed-off-by: Tomeu Vizoso
Signed-off-by: Sebastian Reichel
---
 .../bindings/npu/rockchip,rknn-core.yaml           | 152 +++++++++++++++++++++
 1 file changed, 152 insertions(+)

diff --git a/Documentation/devicetree/bindings/npu/rockchip,rknn-core.yaml b/Documentation/devicetree/bindings/npu/rockchip,rknn-core.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..e8d0afe4a7d1c4f166cf13a9f4aa7c1901362a3f
--- /dev/null
+++ b/Documentation/devicetree/bindings/npu/rockchip,rknn-core.yaml
@@ -0,0 +1,152 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/npu/rockchip,rknn-core.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Neural Processing Unit IP from Rockchip
+
+maintainers:
+  - Tomeu Vizoso
+
+description:
+  Rockchip IP for accelerating inference of neural networks, based on NVIDIA's
+  open source NVDLA IP.
+
+properties:
+  $nodename:
+    pattern: '^npu-core@[a-f0-9]+$'
+
+  compatible:
+    oneOf:
+      - items:
+          - enum:
+              - rockchip,rk3588-rknn-core-top
+          - const: rockchip,rknn-core-top
+      - items:
+          - enum:
+              - rockchip,rk3588-rknn-core
+          - const: rockchip,rknn-core
+
+  reg:
+    maxItems: 1
+
+  clocks:
+    minItems: 2
+    maxItems: 4
+
+  clock-names:
+    items:
+      - const: aclk
+      - const: hclk
+      - const: npu
+      - const: pclk
+    minItems: 2
+
+  interrupts:
+    maxItems: 1
+
+  iommus:
+    maxItems: 1
+
+  npu-supply: true
+
+  power-domains:
+    maxItems: 1
+
+  resets:
+    maxItems: 2
+
+  reset-names:
+    items:
+      - const: srst_a
+      - const: srst_h
+
+  sram-supply: true
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - clock-names
+  - interrupts
+  - iommus
+  - npu-supply
+  - power-domains
+  - resets
+  - reset-names
+  - sram-supply
+
+allOf:
+  - if:
+      properties:
+        compatible:
+          contains:
+            enum:
+              - rockchip,rknn-core-top
+    then:
+      properties:
+        clocks:
+          minItems: 4
+
+        clock-names:
+          minItems: 4
+  - if:
+      properties:
+        compatible:
+          contains:
+            enum:
+              - rockchip,rknn-core
+    then:
+      properties:
+        clocks:
+          maxItems: 2
+        clock-names:
+          maxItems: 2
+
+additionalProperties: false
+
+examples:
+  - |
+    #include
+    #include
+    #include
+    #include
+    #include
+
+    bus {
+      #address-cells = <2>;
+      #size-cells = <2>;
+
+      rknn_core_top: npu-core@fdab0000 {
+        compatible = "rockchip,rk3588-rknn-core-top", "rockchip,rknn-core-top";
+        reg = <0x0 0xfdab0000 0x0 0x9000>;
+        assigned-clocks = <&scmi_clk SCMI_CLK_NPU>;
+        assigned-clock-rates = <200000000>;
+        clocks = <&cru ACLK_NPU0>, <&cru HCLK_NPU0>,
+                 <&scmi_clk SCMI_CLK_NPU>, <&cru PCLK_NPU_ROOT>;
+        clock-names = "aclk", "hclk", "npu", "pclk";
+        interrupts = ;
+        iommus = <&rknn_mmu_top>;
+        npu-supply = <&vdd_npu_s0>;
+        power-domains = <&power RK3588_PD_NPUTOP>;
+        resets = <&cru SRST_A_RKNN0>, <&cru SRST_H_RKNN0>;
+        reset-names = "srst_a", "srst_h";
+        sram-supply = <&vdd_npu_mem_s0>;
+      };
+
+      rknn_core_1: npu-core@fdac0000 {
+        compatible = "rockchip,rk3588-rknn-core", "rockchip,rknn-core";
+        reg = <0x0 0xfdac0000 0x0 0x9000>;
+        clocks = <&cru ACLK_NPU1>, <&cru HCLK_NPU1>;
+        clock-names = "aclk", "hclk";
+        interrupts = ;
+        iommus = <&rknn_mmu_1>;
+        npu-supply = <&vdd_npu_s0>;
+        power-domains = <&power RK3588_PD_NPU1>;
+        resets = <&cru SRST_A_RKNN1>, <&cru SRST_H_RKNN1>;
+        reset-names = "srst_a", "srst_h";
+        sram-supply = <&vdd_npu_mem_s0>;
+      };
+    };
+...
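
The clock rules in the binding are spread over the base properties and two
allOf branches. As a rough illustration only, the sketch below restates them
in Python. The helper name check_clock_names is hypothetical: it is not part
of dt-schema or of this series, and it assumes that `compatible` and
`clock-names` are passed in as plain lists of strings.

```python
# Hypothetical helper restating the binding's clock-names rules.
# Not part of dt-schema or of this patch series.
CLOCK_NAMES = ["aclk", "hclk", "npu", "pclk"]


def check_clock_names(compatible, clock_names):
    """Return a list of problems; an empty list means the rules are met."""
    problems = []
    count = len(clock_names)

    # Base schema: 2 to 4 entries, taken in order from CLOCK_NAMES.
    if not 2 <= count <= len(CLOCK_NAMES):
        problems.append("clock-names must have between 2 and 4 entries")
    elif list(clock_names) != CLOCK_NAMES[:count]:
        problems.append("clock-names must be a prefix of " + ", ".join(CLOCK_NAMES))

    # First allOf branch: the top core needs all four clocks.
    if "rockchip,rknn-core-top" in compatible and count < 4:
        problems.append("rknn-core-top needs all four clocks")

    # Second allOf branch: the other cores take only aclk and hclk.
    if "rockchip,rknn-core" in compatible and count > 2:
        problems.append("rknn-core takes at most two clocks")

    return problems


if __name__ == "__main__":
    top = ["rockchip,rk3588-rknn-core-top", "rockchip,rknn-core-top"]
    core = ["rockchip,rk3588-rknn-core", "rockchip,rknn-core"]
    print(check_clock_names(top, ["aclk", "hclk", "npu", "pclk"]))
    print(check_clock_names(core, ["aclk", "hclk", "npu", "pclk"]))
```

Like the schema, this only looks at names. It cannot tell whether the
phandles in `clocks` are listed in the same order as `clock-names`, so that
still has to be checked by eye in the device tree sources.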
From patchwork Tue Feb 25 07:55:48 2025
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 13989435
From: Tomeu Vizoso
Date: Tue, 25 Feb 2025 08:55:48 +0100
Subject: [PATCH v2 2/7] arm64: dts: rockchip: Add nodes for NPU and its MMU to rk3588s
Message-Id: <20250225-6-10-rocket-v2-2-d4dbcfafc141@tomeuvizoso.net>
References: <20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizoso.net>
In-Reply-To: <20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizoso.net>
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
 Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
 Christian König, Sebastian Reichel, Jeffrey Hugo
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
 linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, Tomeu Vizoso

See Chapter 36 "RKNN" from the RK3588 TRM (Part 1).

This is a derivative of NVIDIA's NVDLA, but with its own front-end
processor.

The IP is divided into three cores, programmed independently. The first
core is special, though: it must be powered on before any of the others
can be used.

The IOMMU of the first core is also special in that it has two subunits
(read/write?)
that need to be programmed in sync.

v2:
- Have one device for each NPU core (Sebastian Reichel)
- Have one device for each IOMMU (Sebastian Reichel)
- Correctly sort nodes (Diederik de Haas)
- Add rockchip,iommu compatible to IOMMU nodes (Sebastian Reichel)

Signed-off-by: Tomeu Vizoso
---
 arch/arm64/boot/dts/rockchip/rk3588-base.dtsi | 76 +++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
index 8cfa30837ce72581d0b513a8274ab0177eb5ae15..2680ed854e0c2ba5de167740ef18fcee167016fe 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
@@ -1125,6 +1125,82 @@ power-domain@RK3588_PD_SDMMC {
 		};
 	};
 
+	rknn_core_top: npu-core@fdab0000 {
+		compatible = "rockchip,rk3588-rknn-core-top", "rockchip,rknn-core-top";
+		reg = <0x0 0xfdab0000 0x0 0x9000>;
+		interrupts = ;
+		clocks = <&cru ACLK_NPU0>, <&cru HCLK_NPU0>,
+			 <&scmi_clk SCMI_CLK_NPU>, <&cru PCLK_NPU_ROOT>;
+		clock-names = "aclk", "hclk", "npu", "pclk";
+		assigned-clocks = <&scmi_clk SCMI_CLK_NPU>;
+		assigned-clock-rates = <200000000>;
+		resets = <&cru SRST_A_RKNN0>, <&cru SRST_H_RKNN0>;
+		reset-names = "srst_a", "srst_h";
+		power-domains = <&power RK3588_PD_NPUTOP>;
+		iommus = <&rknn_mmu_top>;
+		status = "disabled";
+	};
+
+	rknn_mmu_top: iommu@fdab9000 {
+		compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
+		reg = <0x0 0xfdab9000 0x0 0x100>,
+		      <0x0 0xfdaba000 0x0 0x100>;
+		interrupts = ;
+		clocks = <&cru ACLK_NPU0>, <&cru HCLK_NPU0>;
+		clock-names = "aclk", "iface";
+		#iommu-cells = <0>;
+		power-domains = <&power RK3588_PD_NPUTOP>;
+		status = "disabled";
+	};
+
+	rknn_core_1: npu-core@fdac0000 {
+		compatible = "rockchip,rk3588-rknn-core", "rockchip,rknn-core";
+		reg = <0x0 0xfdac0000 0x0 0x9000>;
+		interrupts = ;
+		clocks = <&cru ACLK_NPU1>, <&cru HCLK_NPU1>;
+		clock-names = "aclk", "hclk";
+		resets = <&cru SRST_A_RKNN1>, <&cru SRST_H_RKNN1>;
+		reset-names = "srst_a", "srst_h";
+		power-domains = <&power RK3588_PD_NPU1>;
+		iommus = <&rknn_mmu_1>;
+		status = "disabled";
+	};
+
+	rknn_mmu_1: iommu@fdac9000 {
+		compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
+		reg = <0x0 0xfdaca000 0x0 0x100>;
+		interrupts = ;
+		clocks = <&cru ACLK_NPU1>, <&cru HCLK_NPU1>;
+		clock-names = "aclk", "iface";
+		#iommu-cells = <0>;
+		power-domains = <&power RK3588_PD_NPU1>;
+		status = "disabled";
+	};
+
+	rknn_core_2: npu-core@fdad0000 {
+		compatible = "rockchip,rk3588-rknn-core", "rockchip,rknn-core";
+		reg = <0x0 0xfdad0000 0x0 0x9000>;
+		interrupts = ;
+		clocks = <&cru ACLK_NPU2>, <&cru HCLK_NPU2>;
+		clock-names = "aclk", "hclk";
+		resets = <&cru SRST_A_RKNN2>, <&cru SRST_H_RKNN2>;
+		reset-names = "srst_a", "srst_h";
+		power-domains = <&power RK3588_PD_NPU2>;
+		iommus = <&rknn_mmu_2>;
+		status = "disabled";
+	};
+
+	rknn_mmu_2: iommu@fdad9000 {
+		compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
+		reg = <0x0 0xfdada000 0x0 0x100>;
+		interrupts = ;
+		clocks = <&cru ACLK_NPU2>, <&cru HCLK_NPU2>;
+		clock-names = "aclk", "iface";
+		#iommu-cells = <0>;
+		power-domains = <&power RK3588_PD_NPU2>;
+		status = "disabled";
+	};
+
 	vpu121: video-codec@fdb50000 {
 		compatible = "rockchip,rk3588-vpu121", "rockchip,rk3568-vpu";
 		reg = <0x0 0xfdb50000 0x0 0x800>;

From patchwork Tue Feb 25 07:55:49 2025
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 13989445
From: Tomeu Vizoso
Date: Tue, 25 Feb 2025 08:55:49 +0100
Subject: [PATCH v2 3/7] arm64: dts: rockchip: Enable the NPU on quartzpro64
Message-Id: <20250225-6-10-rocket-v2-3-d4dbcfafc141@tomeuvizoso.net>
References: <20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizoso.net>
In-Reply-To: <20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizoso.net>
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
 Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
 Christian König, Sebastian Reichel, Jeffrey Hugo
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
 linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, Tomeu Vizoso

Enable the nodes added in a previous commit to the rk3588s device tree.
v2:
- Split nodes (Sebastian Reichel)
- Sort nodes (Sebastian Reichel)
- Add board regulators (Sebastian Reichel)

Signed-off-by: Tomeu Vizoso
---
 .../arm64/boot/dts/rockchip/rk3588-quartzpro64.dts | 30 ++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts b/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts
index 088cfade6f6f14b6383ab844fa174c69fa711fc0..5f6b87dc46361eea93ea2a1fa373cb9ecdb7bbce 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts
@@ -411,6 +411,36 @@ &pcie3x4 {
 	status = "okay";
 };
 
+&rknn_core_top {
+	npu-supply = <&vdd_npu_s0>;
+	sram-supply = <&vdd_npu_mem_s0>;
+	status = "okay";
+};
+
+&rknn_core_1 {
+	npu-supply = <&vdd_npu_s0>;
+	sram-supply = <&vdd_npu_mem_s0>;
+	status = "okay";
+};
+
+&rknn_core_2 {
+	npu-supply = <&vdd_npu_s0>;
+	sram-supply = <&vdd_npu_mem_s0>;
+	status = "okay";
+};
+
+&rknn_mmu_top {
+	status = "okay";
+};
+
+&rknn_mmu_1 {
+	status = "okay";
+};
+
+&rknn_mmu_2 {
+	status = "okay";
+};
+
 &saradc {
 	vref-supply = <&vcc_1v8_s0>;
 	status = "okay";

From patchwork Tue Feb 25 07:55:51 2025
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 13989446
From: Tomeu Vizoso
Date: Tue, 25 Feb 2025 08:55:51 +0100
Subject: [PATCH v2 5/7] accel/rocket: Add IOCTL for BO creation
Message-Id: <20250225-6-10-rocket-v2-5-d4dbcfafc141@tomeuvizoso.net>
References: <20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizoso.net>
In-Reply-To: <20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizoso.net>
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
 Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
 Christian König, Sebastian Reichel, Jeffrey Hugo
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
 linux-media@vger.kernel.org,
 linaro-mm-sig@lists.linaro.org, Tomeu Vizoso

This uses the SHMEM DRM helpers and we map right away to the CPU and NPU
sides, as all buffers are expected to be accessed from both.

v2:
- Sync the IOMMUs for the other cores when mapping and unmapping.

Signed-off-by: Tomeu Vizoso
---
 drivers/accel/rocket/Makefile        |   3 +-
 drivers/accel/rocket/rocket_device.c |   4 +
 drivers/accel/rocket/rocket_device.h |   2 +
 drivers/accel/rocket/rocket_drv.c    |   7 +-
 drivers/accel/rocket/rocket_gem.c    | 141 +++++++++++++++++++++++++++++++++++
 drivers/accel/rocket/rocket_gem.h    |  27 +++++++
 include/uapi/drm/rocket_accel.h      |  43 +++++++++++
 7 files changed, 225 insertions(+), 2 deletions(-)

diff --git a/drivers/accel/rocket/Makefile b/drivers/accel/rocket/Makefile
index 73a7280d260c068d37ad3048824f710482333540..875cac2243d902694e0d5d05e60b4ae551a633c4 100644
--- a/drivers/accel/rocket/Makefile
+++ b/drivers/accel/rocket/Makefile
@@ -5,4 +5,5 @@ obj-$(CONFIG_DRM_ACCEL_ROCKET) := rocket.o
 rocket-y := \
 	rocket_core.o \
 	rocket_device.o \
-	rocket_drv.o
+	rocket_drv.o \
+	rocket_gem.o
diff --git a/drivers/accel/rocket/rocket_device.c b/drivers/accel/rocket/rocket_device.c
index ce3b533f15c1011d8a7a23dd8132e907cc334c58..9af36357caba7148dcac764c8222699f3b572d60 100644
--- a/drivers/accel/rocket/rocket_device.c
+++ b/drivers/accel/rocket/rocket_device.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright 2024 Tomeu Vizoso */
 
+#include "linux/mutex.h"
 #include
 
 #include "rocket_device.h"
@@ -10,6 +11,8 @@ int rocket_device_init(struct rocket_device *rdev)
 	struct device *dev = rdev->cores[0].dev;
 	int err;
 
+	mutex_init(&rdev->iommu_lock);
+
 	rdev->clk_npu = devm_clk_get(dev, "npu");
 	rdev->pclk = devm_clk_get(dev, "pclk");
 
@@ -26,4 +29,5 @@ int rocket_device_init(struct rocket_device *rdev)
 void rocket_device_fini(struct rocket_device *rdev)
 {
 	rocket_core_fini(&rdev->cores[0]);
+	mutex_destroy(&rdev->iommu_lock);
 }
diff --git a/drivers/accel/rocket/rocket_device.h b/drivers/accel/rocket/rocket_device.h
index 466edba9102c5dc5dfac5d3fcc1c904f206eaebb..c6152569fdd9e5587c8e8d7b0d7c2e2a77af6000 100644
--- a/drivers/accel/rocket/rocket_device.h
+++ b/drivers/accel/rocket/rocket_device.h
@@ -14,6 +14,8 @@ struct rocket_device {
 	struct clk *clk_npu;
 	struct clk *pclk;
 
+	struct mutex iommu_lock;
+
 	struct rocket_core *cores;
 	unsigned int num_cores;
 };
diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c
index c22d965f20f1239a36b1d823d5fe5f372713555d..e5612b52952fa7a0cd0af02aef314984bc483b05 100644
--- a/drivers/accel/rocket/rocket_drv.c
+++ b/drivers/accel/rocket/rocket_drv.c
@@ -6,6 +6,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -14,6 +15,7 @@
 #include
 
 #include "rocket_drv.h"
+#include "rocket_gem.h"
 
 static int
 rocket_open(struct drm_device *dev, struct drm_file *file)
@@ -42,6 +44,8 @@ rocket_postclose(struct drm_device *dev, struct drm_file *file)
 static const struct drm_ioctl_desc rocket_drm_driver_ioctls[] = {
 #define ROCKET_IOCTL(n, func) \
 	DRM_IOCTL_DEF_DRV(ROCKET_##n, rocket_ioctl_##func, 0)
+
+	ROCKET_IOCTL(CREATE_BO, create_bo),
 };
 
 DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops);
@@ -51,9 +55,10 @@ DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops);
  * - 1.0 - initial interface
  */
 static const struct drm_driver rocket_drm_driver = {
-	.driver_features = DRIVER_COMPUTE_ACCEL,
+	.driver_features = DRIVER_COMPUTE_ACCEL | DRIVER_GEM,
 	.open = rocket_open,
 	.postclose = rocket_postclose,
+	.gem_create_object = rocket_gem_create_object,
 	.ioctls = rocket_drm_driver_ioctls,
 	.num_ioctls = ARRAY_SIZE(rocket_drm_driver_ioctls),
 	.fops = &rocket_accel_driver_fops,
diff --git a/drivers/accel/rocket/rocket_gem.c b/drivers/accel/rocket/rocket_gem.c
new file mode 100644
index 0000000000000000000000000000000000000000..d5337cf1e275c249a1491d0dd28e6b8ccd2ff2cb
--- /dev/null
+++ b/drivers/accel/rocket/rocket_gem.c
@@ -0,0 +1,141 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright 2024 Tomeu Vizoso */
+
+#include
+#include
+#include
+#include
+#include
+
+#include "rocket_device.h"
+#include "rocket_gem.h"
+
+static void rocket_gem_bo_free(struct drm_gem_object *obj)
+{
+	struct rocket_device *rdev = to_rocket_device(obj->dev);
+	struct rocket_gem_object *bo = to_rocket_bo(obj);
+	struct sg_table *sgt;
+
+	drm_WARN_ON(obj->dev, bo->base.pages_use_count > 1);
+
+	mutex_lock(&rdev->iommu_lock);
+
+	sgt = drm_gem_shmem_get_pages_sgt(&bo->base);
+
+	/* Unmap this object from the IOMMUs for cores > 0 */
+	for (unsigned int core = 1; core < rdev->num_cores; core++) {
+		struct iommu_domain *domain = iommu_get_domain_for_dev(rdev->cores[core].dev);
+		size_t unmapped = iommu_unmap(domain, sgt->sgl->dma_address, bo->size);
+
+		drm_WARN_ON(obj->dev, unmapped != bo->size);
+	}
+
+	/* This will unmap the pages from the IOMMU linked to core 0 */
+	drm_gem_shmem_free(&bo->base);
+
+	mutex_unlock(&rdev->iommu_lock);
+}
+
+static const struct drm_gem_object_funcs rocket_gem_funcs = {
+	.free = rocket_gem_bo_free,
+	.print_info = drm_gem_shmem_object_print_info,
+	.pin = drm_gem_shmem_object_pin,
+	.unpin = drm_gem_shmem_object_unpin,
+	.get_sg_table = drm_gem_shmem_object_get_sg_table,
+	.vmap = drm_gem_shmem_object_vmap,
+	.vunmap = drm_gem_shmem_object_vunmap,
+	.mmap = drm_gem_shmem_object_mmap,
+	.vm_ops = &drm_gem_shmem_vm_ops,
+};
+
+/**
+ * rocket_gem_create_object - Implementation of driver->gem_create_object.
+ * @dev: DRM device + * @size: Size in bytes of the memory the object will reference + * + * This lets the GEM helpers allocate object structs for us, and keep + * our BO stats correct. + */ +struct drm_gem_object *rocket_gem_create_object(struct drm_device *dev, size_t size) +{ + struct rocket_gem_object *obj; + + obj = kzalloc(sizeof(*obj), GFP_KERNEL); + if (!obj) + return ERR_PTR(-ENOMEM); + + obj->base.base.funcs = &rocket_gem_funcs; + + return &obj->base.base; +} + +int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *file) +{ + struct drm_rocket_create_bo *args = data; + struct rocket_device *rdev = to_rocket_device(dev); + struct drm_gem_shmem_object *shmem_obj; + struct rocket_gem_object *rkt_obj; + struct drm_gem_object *gem_obj; + struct sg_table *sgt; + int ret; + + shmem_obj = drm_gem_shmem_create(dev, args->size); + if (IS_ERR(shmem_obj)) + return PTR_ERR(shmem_obj); + + gem_obj = &shmem_obj->base; + rkt_obj = to_rocket_bo(gem_obj); + + rkt_obj->size = args->size; + rkt_obj->offset = 0; + mutex_init(&rkt_obj->mutex); + + ret = drm_gem_handle_create(file, gem_obj, &args->handle); + drm_gem_object_put(gem_obj); + if (ret) + goto err; + + mutex_lock(&rdev->iommu_lock); + + /* This will map the pages to the IOMMU linked to core 0 */ + sgt = drm_gem_shmem_get_pages_sgt(shmem_obj); + if (IS_ERR(sgt)) { + ret = PTR_ERR(sgt); + goto err_unlock; + } + + /* Map the pages to the IOMMUs linked to the other cores, so all cores can access this BO */ + for (unsigned int core = 1; core < rdev->num_cores; core++) { + + ret = iommu_map_sgtable(iommu_get_domain_for_dev(rdev->cores[core].dev), + sgt->sgl->dma_address, + sgt, + IOMMU_READ | IOMMU_WRITE); + if (ret < 0 || ret < args->size) { + DRM_ERROR("failed to map buffer: size=%d request_size=%u\n", + ret, args->size); + ret = -ENOMEM; + goto err_unlock; + } + + /* iommu_map_sgtable might have aligned the size */ + rkt_obj->size = ret; + + dma_sync_sgtable_for_device(rdev->cores[core].dev, 
shmem_obj->sgt, + DMA_BIDIRECTIONAL); + } + + mutex_unlock(&rdev->iommu_lock); + + args->offset = drm_vma_node_offset_addr(&gem_obj->vma_node); + args->dma_address = sg_dma_address(shmem_obj->sgt->sgl); + + return 0; + +err_unlock: + mutex_unlock(&rdev->iommu_lock); +err: + drm_gem_shmem_object_free(gem_obj); + + return ret; +} diff --git a/drivers/accel/rocket/rocket_gem.h b/drivers/accel/rocket/rocket_gem.h new file mode 100644 index 0000000000000000000000000000000000000000..19b0cf91ddd99bd126c1af30beb169d6101f6dee --- /dev/null +++ b/drivers/accel/rocket/rocket_gem.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright 2024 Tomeu Vizoso */ + +#ifndef __ROCKET_GEM_H__ +#define __ROCKET_GEM_H__ + +#include + +struct rocket_gem_object { + struct drm_gem_shmem_object base; + + struct mutex mutex; + size_t size; + u32 offset; +}; + +struct drm_gem_object *rocket_gem_create_object(struct drm_device *dev, size_t size); + +int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *file); + +static inline +struct rocket_gem_object *to_rocket_bo(struct drm_gem_object *obj) +{ + return container_of(to_drm_gem_shmem_obj(obj), struct rocket_gem_object, base); +} + +#endif diff --git a/include/uapi/drm/rocket_accel.h b/include/uapi/drm/rocket_accel.h new file mode 100644 index 0000000000000000000000000000000000000000..8338726a83c31b954608ca505cf78bcd70d3494b --- /dev/null +++ b/include/uapi/drm/rocket_accel.h @@ -0,0 +1,43 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2024 Tomeu Vizoso + */ +#ifndef _ROCKET_DRM_H_ +#define _ROCKET_DRM_H_ + +#include "drm.h" + +#if defined(__cplusplus) +extern "C" { +#endif + +#define DRM_ROCKET_CREATE_BO 0x00 + +#define DRM_IOCTL_ROCKET_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_ROCKET_CREATE_BO, struct drm_rocket_create_bo) + +/** + * struct drm_rocket_create_bo - ioctl argument for creating Rocket BOs. 
+ *
+ */
+struct drm_rocket_create_bo {
+	__u32 size;
+
+	/** Returned GEM handle for the BO. */
+	__u32 handle;
+
+	/**
+	 * Returned DMA address for the BO in the NPU address space. This address
+	 * is private to the DRM fd and is valid for the lifetime of the GEM
+	 * handle.
+	 */
+	__u64 dma_address;
+
+	/** Offset into the drm node to use for subsequent mmap call. */
+	__u64 offset;
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* _ROCKET_DRM_H_ */

From patchwork Tue Feb 25 07:55:52 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 13989447
From: Tomeu Vizoso
Date: Tue, 25 Feb 2025 08:55:52 +0100
Subject: [PATCH v2 6/7] accel/rocket: Add job submission IOCTL
Message-Id: <20250225-6-10-rocket-v2-6-d4dbcfafc141@tomeuvizoso.net>
References: <20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizoso.net>
In-Reply-To: <20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizoso.net>
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
 Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
 Christian König, Sebastian Reichel, Jeffrey Hugo
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
 linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, Tomeu Vizoso
X-Mailer: b4 0.14.2

Use the DRM GPU scheduler infrastructure, with one scheduler per core.

Userspace can request that a series of tasks be executed sequentially on
the same core, so that SRAM locality can be exploited.
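
As a rough illustration of the task chaining described above, the standalone C sketch below models how the tasks of one job run back to back on a single core: the first task is started at submission time and each completion interrupt starts the next one, loosely mirroring rocket_job_hw_submit() and rocket_job_handle_done() in this patch. The sketch_* names and struct layouts are hypothetical stand-ins chosen for the example, not the driver's uAPI, and no hardware is touched.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a task: one register command buffer. */
struct sketch_task {
	uint64_t regcmd;	/* address of the register command buffer */
	uint32_t regcmd_count;	/* number of register commands, must be > 0 */
};

/* Hypothetical stand-in for a job: tasks that must share one core. */
struct sketch_job {
	const struct sketch_task *tasks;
	uint32_t task_count;
	uint32_t next_task_idx;	/* next task to hand to the modelled core */
};

/*
 * Same rejections as rocket_ioctl_submit_job() and rocket_copy_tasks():
 * a job without tasks, or a task without register commands, is invalid.
 */
int sketch_job_validate(const struct sketch_job *job)
{
	uint32_t i;

	if (job->task_count == 0)
		return -1;

	for (i = 0; i < job->task_count; i++) {
		if (job->tasks[i].regcmd_count == 0)
			return -1;
	}

	return 0;
}

/* Value the patch writes to REG_PC_REGISTER_AMOUNTS for one task. */
uint32_t sketch_register_amounts(uint32_t regcmd_count)
{
	return (regcmd_count + 1) / 2 - 1;
}

/* Model of rocket_job_hw_submit(): consume the next task of the job. */
static void sketch_hw_submit(struct sketch_job *job)
{
	job->next_task_idx++;
}

/*
 * Model of rocket_job_handle_done(): returns 1 if another task was started
 * on the same core, 0 if the job has no tasks left.
 */
int sketch_job_handle_done(struct sketch_job *job)
{
	if (job->next_task_idx < job->task_count) {
		sketch_hw_submit(job);
		return 1;
	}

	return 0;
}

/* Runs a whole job; returns the number of completion interrupts modelled. */
unsigned int sketch_job_run(struct sketch_job *job)
{
	unsigned int irqs = 0;

	if (job->task_count == 0)
		return 0;

	sketch_hw_submit(job);		/* first task, started at submission */
	do {
		irqs++;			/* one interrupt per finished task */
	} while (sketch_job_handle_done(job));

	return irqs;
}
```

In this model a job with three tasks needs three completion interrupts and the core is never handed to another job in between, which is the property that would let consecutive tasks reuse data left in SRAM.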
The job submission code was initially based on Panfrost. v2: - Remove hardcoded number of cores - Misc. style fixes (Jeffrey Hugo) - Repack IOCTL struct (Jeffrey Hugo) Signed-off-by: Tomeu Vizoso --- drivers/accel/rocket/Makefile | 3 +- drivers/accel/rocket/rocket_core.c | 6 + drivers/accel/rocket/rocket_core.h | 14 + drivers/accel/rocket/rocket_device.c | 2 + drivers/accel/rocket/rocket_device.h | 2 + drivers/accel/rocket/rocket_drv.c | 15 + drivers/accel/rocket/rocket_drv.h | 4 + drivers/accel/rocket/rocket_job.c | 710 +++++++++++++++++++++++++++++++++++ drivers/accel/rocket/rocket_job.h | 50 +++ include/uapi/drm/rocket_accel.h | 55 +++ 10 files changed, 860 insertions(+), 1 deletion(-) diff --git a/drivers/accel/rocket/Makefile b/drivers/accel/rocket/Makefile index 875cac2243d902694e0d5d05e60b4ae551a633c4..4d59036af8d9c213d3cac0559eb66e3ebb0320e7 100644 --- a/drivers/accel/rocket/Makefile +++ b/drivers/accel/rocket/Makefile @@ -6,4 +6,5 @@ rocket-y := \ rocket_core.o \ rocket_device.o \ rocket_drv.o \ - rocket_gem.o + rocket_gem.o \ + rocket_job.o diff --git a/drivers/accel/rocket/rocket_core.c b/drivers/accel/rocket/rocket_core.c index 09d966c826b5b1090a18cb24b3aa4aba286a12d4..2b522592693874eed90463e8f85653d5282ae5b8 100644 --- a/drivers/accel/rocket/rocket_core.c +++ b/drivers/accel/rocket/rocket_core.c @@ -6,6 +6,7 @@ #include #include "rocket_core.h" +#include "rocket_job.h" #include "rocket_registers.h" static int rocket_clk_init(struct rocket_core *core) @@ -48,6 +49,10 @@ int rocket_core_init(struct rocket_core *core) if (IS_ERR(core->iomem)) return PTR_ERR(core->iomem); + err = rocket_job_init(core); + if (err) + return err; + pm_runtime_use_autosuspend(dev); pm_runtime_set_autosuspend_delay(dev, 50); /* ~3 frames */ pm_runtime_enable(dev); @@ -68,4 +73,5 @@ int rocket_core_init(struct rocket_core *core) void rocket_core_fini(struct rocket_core *core) { pm_runtime_disable(core->dev); + rocket_job_fini(core); } diff --git 
a/drivers/accel/rocket/rocket_core.h b/drivers/accel/rocket/rocket_core.h index 2171eba7139ccc63fe24802dc81b4adb7f3abf31..045a46a2010a2ffd6122ed86c379e5fabc70365a 100644 --- a/drivers/accel/rocket/rocket_core.h +++ b/drivers/accel/rocket/rocket_core.h @@ -21,6 +21,20 @@ struct rocket_core { void __iomem *iomem; struct clk *a_clk; struct clk *h_clk; + + struct rocket_job *in_flight_job; + + spinlock_t job_lock; + + struct { + struct workqueue_struct *wq; + struct work_struct work; + atomic_t pending; + } reset; + + struct drm_gpu_scheduler sched; + u64 fence_context; + u64 emit_seqno; }; int rocket_core_init(struct rocket_core *core); diff --git a/drivers/accel/rocket/rocket_device.c b/drivers/accel/rocket/rocket_device.c index 9af36357caba7148dcac764c8222699f3b572d60..62c640e1e0200fe25b6834e45d71f6de139ff3ab 100644 --- a/drivers/accel/rocket/rocket_device.c +++ b/drivers/accel/rocket/rocket_device.c @@ -12,6 +12,7 @@ int rocket_device_init(struct rocket_device *rdev) int err; mutex_init(&rdev->iommu_lock); + mutex_init(&rdev->sched_lock); rdev->clk_npu = devm_clk_get(dev, "npu"); rdev->pclk = devm_clk_get(dev, "pclk"); @@ -29,5 +30,6 @@ int rocket_device_init(struct rocket_device *rdev) void rocket_device_fini(struct rocket_device *rdev) { rocket_core_fini(&rdev->cores[0]); + mutex_destroy(&rdev->sched_lock); mutex_destroy(&rdev->iommu_lock); } diff --git a/drivers/accel/rocket/rocket_device.h b/drivers/accel/rocket/rocket_device.h index c6152569fdd9e5587c8e8d7b0d7c2e2a77af6000..4168ae8da2d38c2ea114b37c6e053b02611a0232 100644 --- a/drivers/accel/rocket/rocket_device.h +++ b/drivers/accel/rocket/rocket_device.h @@ -11,6 +11,8 @@ struct rocket_device { struct drm_device ddev; + struct mutex sched_lock; + struct clk *clk_npu; struct clk *pclk; diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c index e5612b52952fa7a0cd0af02aef314984bc483b05..a6b486e2d4f648d7b1d8831590b633bf661c7bc4 100644 --- a/drivers/accel/rocket/rocket_drv.c +++ 
b/drivers/accel/rocket/rocket_drv.c @@ -16,12 +16,14 @@ #include "rocket_drv.h" #include "rocket_gem.h" +#include "rocket_job.h" static int rocket_open(struct drm_device *dev, struct drm_file *file) { struct rocket_device *rdev = to_rocket_device(dev); struct rocket_file_priv *rocket_priv; + int ret; rocket_priv = kzalloc(sizeof(*rocket_priv), GFP_KERNEL); if (!rocket_priv) @@ -30,7 +32,15 @@ rocket_open(struct drm_device *dev, struct drm_file *file) rocket_priv->rdev = rdev; file->driver_priv = rocket_priv; + ret = rocket_job_open(rocket_priv); + if (ret) + goto err_free; + return 0; + +err_free: + kfree(rocket_priv); + return ret; } static void @@ -38,6 +48,7 @@ rocket_postclose(struct drm_device *dev, struct drm_file *file) { struct rocket_file_priv *rocket_priv = file->driver_priv; + rocket_job_close(rocket_priv); kfree(rocket_priv); } @@ -46,6 +57,7 @@ static const struct drm_ioctl_desc rocket_drm_driver_ioctls[] = { DRM_IOCTL_DEF_DRV(ROCKET_##n, rocket_ioctl_##func, 0) ROCKET_IOCTL(CREATE_BO, create_bo), + ROCKET_IOCTL(SUBMIT, submit), }; DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops); @@ -245,6 +257,9 @@ static int rocket_device_runtime_suspend(struct device *dev) if (dev != rdev->cores[core].dev) continue; + if (!rocket_job_is_idle(&rdev->cores[core])) + return -EBUSY; + clk_disable_unprepare(rdev->cores[core].a_clk); clk_disable_unprepare(rdev->cores[core].h_clk); diff --git a/drivers/accel/rocket/rocket_drv.h b/drivers/accel/rocket/rocket_drv.h index ccdd50c69d4c033eea18cb800407fdcfb3bf2e9b..54e21a61006057aee293496016e54b495a2f6d55 100644 --- a/drivers/accel/rocket/rocket_drv.h +++ b/drivers/accel/rocket/rocket_drv.h @@ -4,10 +4,14 @@ #ifndef __ROCKET_DRV_H__ #define __ROCKET_DRV_H__ +#include + #include "rocket_device.h" struct rocket_file_priv { struct rocket_device *rdev; + + struct drm_sched_entity sched_entity; }; #endif diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c new file mode 100644 index 
0000000000000000000000000000000000000000..25b31f28e932aaee86173b9a0962932c9c640c03 --- /dev/null +++ b/drivers/accel/rocket/rocket_job.c @@ -0,0 +1,710 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright 2019 Linaro, Ltd, Rob Herring */ +/* Copyright 2019 Collabora ltd. */ +/* Copyright 2024 Tomeu Vizoso */ + +#include +#include +#include +#include +#include +#include + +#include "rocket_core.h" +#include "rocket_device.h" +#include "rocket_drv.h" +#include "rocket_job.h" +#include "rocket_registers.h" + +#define JOB_TIMEOUT_MS 500 + +#define job_write(dev, reg, data) writel(data, dev->iomem + (reg)) +#define job_read(dev, reg) readl(dev->iomem + (reg)) + +static struct rocket_job * +to_rocket_job(struct drm_sched_job *sched_job) +{ + return container_of(sched_job, struct rocket_job, base); +} + +struct rocket_fence { + struct dma_fence base; + struct drm_device *dev; + /* rocket seqno for signaled() test */ + u64 seqno; + int queue; +}; + +static inline struct rocket_fence * +to_rocket_fence(struct dma_fence *fence) +{ + return (struct rocket_fence *)fence; +} + +static const char *rocket_fence_get_driver_name(struct dma_fence *fence) +{ + return "rocket"; +} + +static const char *rocket_fence_get_timeline_name(struct dma_fence *fence) +{ + return "rockchip-npu"; +} + +static const struct dma_fence_ops rocket_fence_ops = { + .get_driver_name = rocket_fence_get_driver_name, + .get_timeline_name = rocket_fence_get_timeline_name, +}; + +static struct dma_fence *rocket_fence_create(struct rocket_core *core) +{ + struct rocket_device *rdev = core->rdev; + struct rocket_fence *fence; + + fence = kzalloc(sizeof(*fence), GFP_KERNEL); + if (!fence) + return ERR_PTR(-ENOMEM); + + fence->dev = &rdev->ddev; + fence->seqno = ++core->emit_seqno; + dma_fence_init(&fence->base, &rocket_fence_ops, &core->job_lock, + core->fence_context, fence->seqno); + + return &fence->base; +} + +static int +rocket_copy_tasks(struct drm_device *dev, + struct drm_file *file_priv, + struct 
drm_rocket_job *job, + struct rocket_job *rjob) +{ + struct drm_rocket_task *tasks; + int ret = 0; + int i; + + rjob->task_count = job->task_count; + + if (!rjob->task_count) + return 0; + + tasks = kvmalloc_array(rjob->task_count, sizeof(*tasks), GFP_KERNEL); + if (!tasks) { + ret = -ENOMEM; + DRM_DEBUG("Failed to allocate incoming tasks\n"); + goto fail; + } + + if (copy_from_user(tasks, + (void __user *)(uintptr_t)job->tasks, + rjob->task_count * sizeof(*tasks))) { + ret = -EFAULT; + DRM_DEBUG("Failed to copy incoming tasks\n"); + goto fail; + } + + rjob->tasks = kvmalloc_array(job->task_count, sizeof(*rjob->tasks), GFP_KERNEL); + if (!rjob->tasks) { + DRM_DEBUG("Failed to allocate task array\n"); + ret = -ENOMEM; + goto fail; + } + + for (i = 0; i < rjob->task_count; i++) { + if (tasks[i].regcmd_count == 0) { + ret = -EINVAL; + goto fail; + } + rjob->tasks[i].regcmd = tasks[i].regcmd; + rjob->tasks[i].regcmd_count = tasks[i].regcmd_count; + } + +fail: + kvfree(tasks); + return ret; +} + +static void rocket_job_hw_submit(struct rocket_core *core, struct rocket_job *job) +{ + struct rocket_task *task; + bool task_pp_en = 1; + bool task_count = 1; + + /* GO ! */ + + /* Don't queue the job if a reset is in progress */ + if (!atomic_read(&core->reset.pending)) { + + task = &job->tasks[job->next_task_idx]; + job->next_task_idx++; /* TODO: Do this only after a successful run? 
*/ + + rocket_write(core, REG_PC_BASE_ADDRESS, 0x1); + + rocket_write(core, REG_CNA_S_POINTER, 0xe + 0x10000000 * core->index); + rocket_write(core, REG_CORE_S_POINTER, 0xe + 0x10000000 * core->index); + + rocket_write(core, REG_PC_BASE_ADDRESS, task->regcmd); + rocket_write(core, REG_PC_REGISTER_AMOUNTS, (task->regcmd_count + 1) / 2 - 1); + + rocket_write(core, REG_PC_INTERRUPT_MASK, + PC_INTERRUPT_MASK_DPU_0 | PC_INTERRUPT_MASK_DPU_1); + rocket_write(core, REG_PC_INTERRUPT_CLEAR, + PC_INTERRUPT_CLEAR_DPU_0 | PC_INTERRUPT_CLEAR_DPU_1); + + rocket_write(core, REG_PC_TASK_CON, ((0x6 | task_pp_en) << 12) | task_count); + + rocket_write(core, REG_PC_TASK_DMA_BASE_ADDR, 0x0); + + rocket_write(core, REG_PC_OPERATION_ENABLE, 0x1); + + dev_dbg(core->dev, + "Submitted regcmd at 0x%llx to core %d", + task->regcmd, core->index); + } +} + +static int rocket_acquire_object_fences(struct drm_gem_object **bos, + int bo_count, + struct drm_sched_job *job, + bool is_write) +{ + int i, ret; + + for (i = 0; i < bo_count; i++) { + ret = dma_resv_reserve_fences(bos[i]->resv, 1); + if (ret) + return ret; + + ret = drm_sched_job_add_implicit_dependencies(job, bos[i], + is_write); + if (ret) + return ret; + } + + return 0; +} + +static void rocket_attach_object_fences(struct drm_gem_object **bos, + int bo_count, + struct dma_fence *fence) +{ + int i; + + for (i = 0; i < bo_count; i++) + dma_resv_add_fence(bos[i]->resv, fence, DMA_RESV_USAGE_WRITE); +} + +static int rocket_job_push(struct rocket_job *job) +{ + struct rocket_device *rdev = job->rdev; + struct drm_gem_object **bos; + struct ww_acquire_ctx acquire_ctx; + int ret = 0; + + bos = kvmalloc_array(job->in_bo_count + job->out_bo_count, sizeof(void *), + GFP_KERNEL); + memcpy(bos, job->in_bos, job->in_bo_count * sizeof(void *)); + memcpy(&bos[job->in_bo_count], job->out_bos, job->out_bo_count * sizeof(void *)); + + ret = drm_gem_lock_reservations(bos, job->in_bo_count + job->out_bo_count, &acquire_ctx); + if (ret) + goto err; + + 
	mutex_lock(&rdev->sched_lock);
+	drm_sched_job_arm(&job->base);
+
+	job->inference_done_fence = dma_fence_get(&job->base.s_fence->finished);
+
+	ret = rocket_acquire_object_fences(job->in_bos, job->in_bo_count, &job->base, false);
+	if (ret) {
+		mutex_unlock(&rdev->sched_lock);
+		goto err_unlock;
+	}
+
+	ret = rocket_acquire_object_fences(job->out_bos, job->out_bo_count, &job->base, true);
+	if (ret) {
+		mutex_unlock(&rdev->sched_lock);
+		goto err_unlock;
+	}
+
+	kref_get(&job->refcount); /* put by scheduler job completion */
+
+	drm_sched_entity_push_job(&job->base);
+
+	mutex_unlock(&rdev->sched_lock);
+
+	rocket_attach_object_fences(job->out_bos, job->out_bo_count, job->inference_done_fence);
+
+err_unlock:
+	drm_gem_unlock_reservations(bos, job->in_bo_count + job->out_bo_count, &acquire_ctx);
+err:
+	kvfree(bos);
+
+	return ret;
+}
+
+static void rocket_job_cleanup(struct kref *ref)
+{
+	struct rocket_job *job = container_of(ref, struct rocket_job,
+					      refcount);
+	unsigned int i;
+
+	dma_fence_put(job->done_fence);
+	dma_fence_put(job->inference_done_fence);
+
+	if (job->in_bos) {
+		for (i = 0; i < job->in_bo_count; i++)
+			drm_gem_object_put(job->in_bos[i]);
+
+		kvfree(job->in_bos);
+	}
+
+	if (job->out_bos) {
+		for (i = 0; i < job->out_bo_count; i++)
+			drm_gem_object_put(job->out_bos[i]);
+
+		kvfree(job->out_bos);
+	}
+
+	kvfree(job->tasks);
+
+	kfree(job);
+}
+
+static void rocket_job_put(struct rocket_job *job)
+{
+	kref_put(&job->refcount, rocket_job_cleanup);
+}
+
+static void rocket_job_free(struct drm_sched_job *sched_job)
+{
+	struct rocket_job *job = to_rocket_job(sched_job);
+
+	drm_sched_job_cleanup(sched_job);
+
+	rocket_job_put(job);
+}
+
+static struct rocket_core *sched_to_core(struct rocket_device *rdev,
+					 struct drm_gpu_scheduler *sched)
+{
+	unsigned int core;
+
+	for (core = 0; core < rdev->num_cores; core++) {
+		if (&rdev->cores[core].sched == sched)
+			return &rdev->cores[core];
+	}
+
+	return NULL;
+}
+
+static struct dma_fence
*rocket_job_run(struct drm_sched_job *sched_job) +{ + struct rocket_job *job = to_rocket_job(sched_job); + struct rocket_device *rdev = job->rdev; + struct rocket_core *core = sched_to_core(rdev, sched_job->sched); + struct dma_fence *fence = NULL; + int ret; + + if (unlikely(job->base.s_fence->finished.error)) + return NULL; + + /* + * Nothing to execute: can happen if the job has finished while + * we were resetting the GPU. + */ + if (job->next_task_idx == job->task_count) + return NULL; + + fence = rocket_fence_create(core); + if (IS_ERR(fence)) + return fence; + + if (job->done_fence) + dma_fence_put(job->done_fence); + job->done_fence = dma_fence_get(fence); + + ret = pm_runtime_get_sync(core->dev); + if (ret < 0) + return fence; + + spin_lock(&core->job_lock); + + core->in_flight_job = job; + rocket_job_hw_submit(core, job); + + spin_unlock(&core->job_lock); + + return fence; +} + +static void rocket_job_handle_done(struct rocket_core *core, + struct rocket_job *job) +{ + if (job->next_task_idx < job->task_count) { + rocket_job_hw_submit(core, job); + return; + } + + core->in_flight_job = NULL; + dma_fence_signal_locked(job->done_fence); + pm_runtime_put_autosuspend(core->dev); +} + +static void rocket_job_handle_irq(struct rocket_core *core) +{ + uint32_t status, raw_status; + + pm_runtime_mark_last_busy(core->dev); + + status = rocket_read(core, REG_PC_INTERRUPT_STATUS); + raw_status = rocket_read(core, REG_PC_INTERRUPT_RAW_STATUS); + + rocket_write(core, REG_PC_OPERATION_ENABLE, 0x0); + rocket_write(core, REG_PC_INTERRUPT_CLEAR, 0x1ffff); + + spin_lock(&core->job_lock); + + if (core->in_flight_job) + rocket_job_handle_done(core, core->in_flight_job); + + spin_unlock(&core->job_lock); +} + +static void +rocket_reset(struct rocket_core *core, struct drm_sched_job *bad) +{ + bool cookie; + + if (!atomic_read(&core->reset.pending)) + return; + + /* + * Stop the scheduler. 
+	 *
+	 * FIXME: We temporarily get out of the dma_fence_signalling section
+	 * because the cleanup path generates lockdep splats when taking locks
+	 * to release job resources. We should rework the code to follow this
+	 * pattern:
+	 *
+	 *	try_lock
+	 *	if (locked)
+	 *		release
+	 *	else
+	 *		schedule_work_to_release_later
+	 */
+	drm_sched_stop(&core->sched, bad);
+
+	cookie = dma_fence_begin_signalling();
+
+	if (bad)
+		drm_sched_increase_karma(bad);
+
+	/*
+	 * Mask job interrupts and synchronize to make sure we won't be
+	 * interrupted during our reset.
+	 */
+	rocket_write(core, REG_PC_INTERRUPT_MASK, 0x0);
+	synchronize_irq(core->irq);
+
+	/* Handle the remaining interrupts before we reset. */
+	rocket_job_handle_irq(core);
+
+	/*
+	 * Remaining interrupts have been handled, but we might still have
+	 * stuck jobs. Let's make sure the PM counters stay balanced by
+	 * manually calling pm_runtime_put_noidle() and
+	 * rocket_devfreq_record_idle() for each stuck job.
+	 * Let's also make sure the cycle counting register's refcnt is
+	 * kept balanced to prevent it from running forever.
+	 */
+	spin_lock(&core->job_lock);
+	if (core->in_flight_job)
+		pm_runtime_put_noidle(core->dev);
+
+	core->in_flight_job = NULL;
+	spin_unlock(&core->job_lock);
+
+	/* Proceed with reset now. */
+	pm_runtime_force_suspend(core->dev);
+	pm_runtime_force_resume(core->dev);
+
+	/* NPU has been reset, we can clear the reset pending bit. */
+	atomic_set(&core->reset.pending, 0);
+
+	/*
+	 * Now resubmit jobs that were previously queued but didn't have a
+	 * chance to finish.
+	 * FIXME: We temporarily get out of the DMA fence signalling section
+	 * while resubmitting jobs because the job submission logic will
+	 * allocate memory with the GFP_KERNEL flag which can trigger memory
+	 * reclaim and exposes a lock ordering issue.
+	 */
+	dma_fence_end_signalling(cookie);
+	drm_sched_resubmit_jobs(&core->sched);
+	cookie = dma_fence_begin_signalling();
+
+	/* Restart the scheduler */
+	drm_sched_start(&core->sched, 0);
+
+	dma_fence_end_signalling(cookie);
+}
+
+static enum drm_gpu_sched_stat rocket_job_timedout(struct drm_sched_job *sched_job)
+{
+	struct rocket_job *job = to_rocket_job(sched_job);
+	struct rocket_device *rdev = job->rdev;
+	struct rocket_core *core = sched_to_core(rdev, sched_job->sched);
+
+	/*
+	 * If the NPU managed to complete this job's fence, the timeout is
+	 * spurious. Bail out.
+	 */
+	if (dma_fence_is_signaled(job->done_fence))
+		return DRM_GPU_SCHED_STAT_NOMINAL;
+
+	/*
+	 * Rocket IRQ handler may take a long time to process an interrupt
+	 * if there is another IRQ handler hogging the processing.
+	 * For example, the HDMI encoder driver might be stuck in the IRQ
+	 * handler for a significant time in a case of bad cable connection.
+	 * In order to catch such cases and not report spurious rocket
+	 * job timeouts, synchronize the IRQ handler and re-check the fence
+	 * status.
+	 */
+	synchronize_irq(core->irq);
+
+	if (dma_fence_is_signaled(job->done_fence)) {
+		dev_warn(core->dev, "unexpectedly high interrupt latency\n");
+		return DRM_GPU_SCHED_STAT_NOMINAL;
+	}
+
+	dev_err(core->dev, "gpu sched timeout\n");
+
+	atomic_set(&core->reset.pending, 1);
+	rocket_reset(core, sched_job);
+
+	return DRM_GPU_SCHED_STAT_NOMINAL;
+}
+
+static void rocket_reset_work(struct work_struct *work)
+{
+	struct rocket_core *core;
+
+	core = container_of(work, struct rocket_core, reset.work);
+	rocket_reset(core, NULL);
+}
+
+static const struct drm_sched_backend_ops rocket_sched_ops = {
+	.run_job = rocket_job_run,
+	.timedout_job = rocket_job_timedout,
+	.free_job = rocket_job_free
+};
+
+static irqreturn_t rocket_job_irq_handler_thread(int irq, void *data)
+{
+	struct rocket_core *core = data;
+
+	rocket_job_handle_irq(core);
+
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t rocket_job_irq_handler(int irq, void *data)
+{
+	struct rocket_core *core = data;
+	uint32_t raw_status = rocket_read(core, REG_PC_INTERRUPT_RAW_STATUS);
+
+	WARN_ON(raw_status & PC_INTERRUPT_RAW_STATUS_DMA_READ_ERROR);
+	WARN_ON(raw_status & PC_INTERRUPT_RAW_STATUS_DMA_WRITE_ERROR);
+
+	if (!(raw_status & PC_INTERRUPT_RAW_STATUS_DPU_0 ||
+	      raw_status & PC_INTERRUPT_RAW_STATUS_DPU_1))
+		return IRQ_NONE;
+
+	rocket_write(core, REG_PC_INTERRUPT_MASK, 0x0);
+
+	return IRQ_WAKE_THREAD;
+}
+
+int rocket_job_init(struct rocket_core *core)
+{
+	int ret;
+
+	INIT_WORK(&core->reset.work, rocket_reset_work);
+	spin_lock_init(&core->job_lock);
+
+	core->irq = platform_get_irq(to_platform_device(core->dev), 0);
+	if (core->irq < 0)
+		return core->irq;
+
+	ret = devm_request_threaded_irq(core->dev, core->irq,
+					rocket_job_irq_handler,
+					rocket_job_irq_handler_thread,
+					IRQF_SHARED, KBUILD_MODNAME "-job",
+					core);
+	if (ret) {
+		dev_err(core->dev, "failed to request job irq\n");
+		return ret;
+	}
+
+	core->reset.wq = alloc_ordered_workqueue("rocket-reset-%d", 0, core->index);
+	if (!core->reset.wq)
+		return -ENOMEM;
+
+	core->fence_context = dma_fence_context_alloc(1);
+
+	ret = drm_sched_init(&core->sched,
+			     &rocket_sched_ops, NULL,
+			     DRM_SCHED_PRIORITY_COUNT,
+			     1, 0,
+			     msecs_to_jiffies(JOB_TIMEOUT_MS),
+			     core->reset.wq,
+			     NULL, "rocket", core->dev);
+	if (ret) {
+		dev_err(core->dev, "Failed to create scheduler: %d.\n", ret);
+		goto err_sched;
+	}
+
+	return 0;
+
+err_sched:
+	drm_sched_fini(&core->sched);
+
+	destroy_workqueue(core->reset.wq);
+	return ret;
+}
+
+void rocket_job_fini(struct rocket_core *core)
+{
+	drm_sched_fini(&core->sched);
+
+	cancel_work_sync(&core->reset.work);
+	destroy_workqueue(core->reset.wq);
+}
+
+int rocket_job_open(struct rocket_file_priv *rocket_priv)
+{
+	struct rocket_device *rdev = rocket_priv->rdev;
+	struct drm_gpu_scheduler **scheds = kmalloc_array(rdev->num_cores, sizeof(*scheds),
+							  GFP_KERNEL);
+	unsigned int core;
+	int ret;
+
+	for (core = 0; core < rdev->num_cores; core++)
+		scheds[core] = &rdev->cores[core].sched;
+
+	ret = drm_sched_entity_init(&rocket_priv->sched_entity,
+				    DRM_SCHED_PRIORITY_NORMAL,
+				    scheds,
+				    rdev->num_cores, NULL);
+	if (WARN_ON(ret))
+		return ret;
+
+	return 0;
+}
+
+void rocket_job_close(struct rocket_file_priv *rocket_priv)
+{
+	struct drm_sched_entity *entity = &rocket_priv->sched_entity;
+
+	kfree(entity->sched_list);
+	drm_sched_entity_destroy(entity);
+}
+
+int rocket_job_is_idle(struct rocket_core *core)
+{
+	/* If there are any jobs in this HW queue, we're not idle */
+	if (atomic_read(&core->sched.credit_count))
+		return false;
+
+	return true;
+}
+
+static int rocket_ioctl_submit_job(struct drm_device *dev, struct drm_file *file,
+				   struct drm_rocket_job *job)
+{
+	struct rocket_device *rdev = to_rocket_device(dev);
+	struct rocket_file_priv *file_priv = file->driver_priv;
+	struct rocket_job *rjob = NULL;
+	int ret = 0;
+
+	if (job->task_count == 0)
+		return -EINVAL;
+
+	rjob = kzalloc(sizeof(*rjob), GFP_KERNEL);
+	if (!rjob)
+		return -ENOMEM;
+
+	kref_init(&rjob->refcount);
+
+	rjob->rdev = rdev;
+
+	ret = drm_sched_job_init(&rjob->base,
+				 &file_priv->sched_entity,
+				 1, NULL);
+	if (ret)
+		goto out_put_job;
+
+	ret = rocket_copy_tasks(dev, file, job, rjob);
+	if (ret)
+		goto out_cleanup_job;
+
+	ret = drm_gem_objects_lookup(file,
+				     (void __user *)(uintptr_t)job->in_bo_handles,
+				     job->in_bo_handle_count, &rjob->in_bos);
+	if (ret)
+		goto out_cleanup_job;
+
+	rjob->in_bo_count = job->in_bo_handle_count;
+
+	ret = drm_gem_objects_lookup(file,
+				     (void __user *)(uintptr_t)job->out_bo_handles,
+				     job->out_bo_handle_count, &rjob->out_bos);
+	if (ret)
+		goto out_cleanup_job;
+
+	rjob->out_bo_count = job->out_bo_handle_count;
+
+	ret = rocket_job_push(rjob);
+	if (ret)
+		goto out_cleanup_job;
+
+out_cleanup_job:
+	if (ret)
+		drm_sched_job_cleanup(&rjob->base);
+out_put_job:
+	rocket_job_put(rjob);
+
+	return ret;
+}
+
+int rocket_ioctl_submit(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct drm_rocket_submit *args = data;
+	struct drm_rocket_job *jobs;
+	int ret = 0;
+	unsigned int i = 0;
+
+	jobs = kvmalloc_array(args->job_count, sizeof(*jobs), GFP_KERNEL);
+	if (!jobs) {
+		DRM_DEBUG("Failed to allocate incoming job array\n");
+		return -ENOMEM;
+	}
+
+	if (copy_from_user(jobs,
+			   (void __user *)(uintptr_t)args->jobs,
+			   args->job_count * sizeof(*jobs))) {
+		ret = -EFAULT;
+		DRM_DEBUG("Failed to copy incoming job array\n");
+		goto exit;
+	}
+
+	for (i = 0; i < args->job_count; i++)
+		rocket_ioctl_submit_job(dev, file, &jobs[i]);
+
+exit:
+	kvfree(jobs);
+
+	return ret;
+}
diff --git a/drivers/accel/rocket/rocket_job.h b/drivers/accel/rocket/rocket_job.h
new file mode 100644
index 0000000000000000000000000000000000000000..93fa1f988c72adb7a405acbf08c1c9b87d22f9c5
--- /dev/null
+++ b/drivers/accel/rocket/rocket_job.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright 2024 Tomeu Vizoso */
+
+#ifndef __ROCKET_JOB_H__
+#define __ROCKET_JOB_H__
+
+#include
+#include
+
+#include "rocket_core.h"
+#include "rocket_drv.h"
+
+struct rocket_task {
+	u64 regcmd;
+	u32 regcmd_count;
+};
+
+struct rocket_job {
+	struct drm_sched_job base;
+
+	struct rocket_device *rdev;
+
+	struct drm_gem_object **in_bos;
+	struct drm_gem_object **out_bos;
+
+	u32 in_bo_count;
+	u32 out_bo_count;
+
+	struct rocket_task *tasks;
+	u32 task_count;
+	u32 next_task_idx;
+
+	/* Fence to be signaled by drm-sched once it's done with the job */
+	struct dma_fence *inference_done_fence;
+
+	/* Fence to be signaled by IRQ handler when the job is complete. */
+	struct dma_fence *done_fence;
+
+	struct kref refcount;
+};
+
+int rocket_ioctl_submit(struct drm_device *dev, void *data, struct drm_file *file);
+
+int rocket_job_init(struct rocket_core *core);
+void rocket_job_fini(struct rocket_core *core);
+int rocket_job_open(struct rocket_file_priv *rocket_priv);
+void rocket_job_close(struct rocket_file_priv *rocket_priv);
+int rocket_job_is_idle(struct rocket_core *core);
+
+#endif
diff --git a/include/uapi/drm/rocket_accel.h b/include/uapi/drm/rocket_accel.h
index 8338726a83c31b954608ca505cf78bcd70d3494b..eb886351134ebef62969b1e1182ccc174f88fe9d 100644
--- a/include/uapi/drm/rocket_accel.h
+++ b/include/uapi/drm/rocket_accel.h
@@ -12,8 +12,10 @@ extern "C" {
 #endif
 
 #define DRM_ROCKET_CREATE_BO 0x00
+#define DRM_ROCKET_SUBMIT 0x01
 
 #define DRM_IOCTL_ROCKET_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_ROCKET_CREATE_BO, struct drm_rocket_create_bo)
+#define DRM_IOCTL_ROCKET_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_SUBMIT, struct drm_rocket_submit)
 
 /**
  * struct drm_rocket_create_bo - ioctl argument for creating Rocket BOs.
@@ -36,6 +38,59 @@ struct drm_rocket_create_bo {
 	__u64 offset;
 };
 
+/**
+ * struct drm_rocket_task - A task to be run on the NPU
+ *
+ * A task is the smallest unit of work that can be run on the NPU.
+ */
+struct drm_rocket_task {
+	/** DMA address to NPU mapping of register command buffer */
+	__u64 regcmd;
+
+	/** Number of commands in the register command buffer */
+	__u32 regcmd_count;
+};
+
+/**
+ * struct drm_rocket_job - A job to be run on the NPU
+ *
+ * The kernel will schedule the execution of this job taking into account its
+ * dependencies with other jobs. All tasks in the same job will be executed
+ * sequentially on the same core, to benefit from memory residency in SRAM.
+ */
+struct drm_rocket_job {
+	/** Pointer to an array of struct drm_rocket_task. */
+	__u64 tasks;
+
+	/** Pointer to a u32 array of the BOs that are read by the job. */
+	__u64 in_bo_handles;
+
+	/** Pointer to a u32 array of the BOs that are written to by the job. */
+	__u64 out_bo_handles;
+
+	/** Number of tasks passed in. */
+	__u32 task_count;
+
+	/** Number of input BO handles passed in (size is that times 4). */
+	__u32 in_bo_handle_count;
+
+	/** Number of output BO handles passed in (size is that times 4). */
+	__u32 out_bo_handle_count;
+};
+
+/**
+ * struct drm_rocket_submit - ioctl argument for submitting commands to the NPU.
+ *
+ * The kernel will schedule the execution of these jobs in dependency order.
+ */
+struct drm_rocket_submit {
+	/** Pointer to an array of struct drm_rocket_job. */
+	__u64 jobs;
+
+	/** Number of jobs passed in. */
+	__u32 job_count;
+};
+
 #if defined(__cplusplus)
 }
 #endif

From patchwork Tue Feb 25 07:55:53 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 13989449
From: Tomeu Vizoso
Date: Tue, 25 Feb 2025 08:55:53 +0100
Subject: [PATCH v2 7/7] accel/rocket: Add IOCTLs for synchronizing memory accesses
Message-Id: <20250225-6-10-rocket-v2-7-d4dbcfafc141@tomeuvizoso.net>
References: <20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizoso.net>
In-Reply-To: <20250225-6-10-rocket-v2-0-d4dbcfafc141@tomeuvizoso.net>
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
 Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
 Christian König, Sebastian Reichel, Jeffrey Hugo
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
 linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
 Tomeu Vizoso
X-Mailer: b4 0.14.2

The NPU cores have their own access to the memory bus, and this isn't
cache coherent with the CPUs.

Add IOCTLs so userspace can mark when the caches need to be flushed, and
also when a writer job needs to be waited for before the buffer can be
accessed from the CPU.

Initially based on the same IOCTLs from the Etnaviv driver.
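
For illustration only (not part of this patch), here is a rough sketch of how
userspace might bracket a CPU read of a BO with the two IOCTLs. The structs
and flag values mirror the uAPI added below; real code would include the
installed rocket_accel.h header instead. The helper names
rocket_deadline_ns(), rocket_fill_prep() and rocket_cpu_read_bo() are
hypothetical, and the ioctl numbers are spelled out with _IOW on the
assumption that DRM_IOCTL_BASE is 'd' and DRM_COMMAND_BASE is 0x40. The
timeout is passed as an absolute CLOCK_MONOTONIC deadline because the handler
feeds timeout_ns to drm_timeout_abs_to_jiffies().

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <time.h>

/* Local mirrors of the uAPI proposed in this patch (assumption: unchanged). */
#define ROCKET_PREP_READ	0x01
#define ROCKET_PREP_WRITE	0x02

struct drm_rocket_prep_bo {
	uint32_t handle;	/* in */
	uint32_t op;		/* in, mask of ROCKET_PREP_x */
	int64_t timeout_ns;	/* in, absolute CLOCK_MONOTONIC time */
};

struct drm_rocket_fini_bo {
	uint32_t handle;	/* in */
	uint32_t flags;		/* in, no defined values yet */
};

/* DRM_IOW(DRM_COMMAND_BASE + nr, type) written out by hand. */
#define ROCKET_IOC_PREP_BO	_IOW('d', 0x40 + 0x02, struct drm_rocket_prep_bo)
#define ROCKET_IOC_FINI_BO	_IOW('d', 0x40 + 0x03, struct drm_rocket_fini_bo)

/* Hypothetical helper: absolute deadline rel_ns nanoseconds from now. */
static int64_t rocket_deadline_ns(int64_t rel_ns)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (int64_t)now.tv_sec * 1000000000LL + now.tv_nsec + rel_ns;
}

/* Hypothetical helper: fill a PREP_BO request with a relative timeout. */
static void rocket_fill_prep(struct drm_rocket_prep_bo *req, uint32_t handle,
			     uint32_t op, int64_t rel_ns)
{
	assert(req != NULL);
	memset(req, 0, sizeof(*req));
	req->handle = handle;
	req->op = op;
	req->timeout_ns = rocket_deadline_ns(rel_ns);
}

/*
 * Hypothetical helper: wait for writers and sync caches for the CPU, run
 * read_cb on the mapped buffer, then hand the buffer back to the device.
 * Returns 0 on success or a negative errno.
 */
static int rocket_cpu_read_bo(int fd, uint32_t handle,
			      void (*read_cb)(void *), void *arg)
{
	struct drm_rocket_prep_bo prep;
	struct drm_rocket_fini_bo fini = { .handle = handle, .flags = 0 };

	rocket_fill_prep(&prep, handle, ROCKET_PREP_READ, 1000000000LL);
	if (ioctl(fd, ROCKET_IOC_PREP_BO, &prep) < 0)
		return -errno;

	read_cb(arg);

	if (ioctl(fd, ROCKET_IOC_FINI_BO, &fini) < 0)
		return -errno;

	return 0;
}
```

A caller would pass the accel device file descriptor and a GEM handle obtained
from CREATE_BO; every PREP_BO is expected to be paired with a FINI_BO on the
same handle.
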

v2:
- Don't break UABI by reordering the IOCTL IDs (Jeffrey Hugo)

Signed-off-by: Tomeu Vizoso
---
 drivers/accel/rocket/rocket_drv.c |  2 ++
 drivers/accel/rocket/rocket_gem.c | 75 +++++++++++++++++++++++++++++++++++++++
 drivers/accel/rocket/rocket_gem.h |  5 +++
 include/uapi/drm/rocket_accel.h   | 18 ++++++++++
 4 files changed, 100 insertions(+)

diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c
index a6b486e2d4f648d7b1d8831590b633bf661c7bc4..cc3531f66839b777e7abc1d41cb50cffd9685ea0 100644
--- a/drivers/accel/rocket/rocket_drv.c
+++ b/drivers/accel/rocket/rocket_drv.c
@@ -58,6 +58,8 @@ static const struct drm_ioctl_desc rocket_drm_driver_ioctls[] = {
 
 	ROCKET_IOCTL(CREATE_BO, create_bo),
 	ROCKET_IOCTL(SUBMIT, submit),
+	ROCKET_IOCTL(PREP_BO, prep_bo),
+	ROCKET_IOCTL(FINI_BO, fini_bo),
 };
 
 DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops);
diff --git a/drivers/accel/rocket/rocket_gem.c b/drivers/accel/rocket/rocket_gem.c
index d5337cf1e275c249a1491d0dd28e6b8ccd2ff2cb..6a0a7f6958c34bce4611cfdf033590029c3ac026 100644
--- a/drivers/accel/rocket/rocket_gem.c
+++ b/drivers/accel/rocket/rocket_gem.c
@@ -139,3 +139,78 @@ int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *
 
 	return ret;
 }
+
+static inline enum dma_data_direction rocket_op_to_dma_dir(u32 op)
+{
+	if (op & ROCKET_PREP_READ)
+		return DMA_FROM_DEVICE;
+	else if (op & ROCKET_PREP_WRITE)
+		return DMA_TO_DEVICE;
+	else
+		return DMA_BIDIRECTIONAL;
+}
+
+int rocket_ioctl_prep_bo(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct drm_rocket_prep_bo *args = data;
+	unsigned long timeout = drm_timeout_abs_to_jiffies(args->timeout_ns);
+	struct rocket_device *rdev = to_rocket_device(dev);
+	struct drm_gem_object *gem_obj;
+	struct drm_gem_shmem_object *shmem_obj;
+	bool write = !!(args->op & ROCKET_PREP_WRITE);
+	long ret = 0;
+
+	if (args->op & ~(ROCKET_PREP_READ | ROCKET_PREP_WRITE))
+		return -EINVAL;
+
+	gem_obj = drm_gem_object_lookup(file, args->handle);
+	if (!gem_obj)
+		return -ENOENT;
+
+	ret = dma_resv_wait_timeout(gem_obj->resv, dma_resv_usage_rw(write),
+				    true, timeout);
+	if (!ret)
+		ret = timeout ? -ETIMEDOUT : -EBUSY;
+
+	shmem_obj = &to_rocket_bo(gem_obj)->base;
+
+	for (unsigned int core = 1; core < rdev->num_cores; core++) {
+		dma_sync_sgtable_for_cpu(rdev->cores[core].dev, shmem_obj->sgt,
+					 rocket_op_to_dma_dir(args->op));
+	}
+
+	to_rocket_bo(gem_obj)->last_cpu_prep_op = args->op;
+
+	drm_gem_object_put(gem_obj);
+
+	return ret;
+}
+
+int rocket_ioctl_fini_bo(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct drm_rocket_fini_bo *args = data;
+	struct drm_gem_object *gem_obj;
+	struct rocket_gem_object *rkt_obj;
+	struct drm_gem_shmem_object *shmem_obj;
+	struct rocket_device *rdev = to_rocket_device(dev);
+
+	gem_obj = drm_gem_object_lookup(file, args->handle);
+	if (!gem_obj)
+		return -ENOENT;
+
+	rkt_obj = to_rocket_bo(gem_obj);
+	shmem_obj = &rkt_obj->base;
+
+	WARN_ON(rkt_obj->last_cpu_prep_op == 0);
+
+	for (unsigned int core = 1; core < rdev->num_cores; core++) {
+		dma_sync_sgtable_for_device(rdev->cores[core].dev, shmem_obj->sgt,
+					    rocket_op_to_dma_dir(rkt_obj->last_cpu_prep_op));
+	}
+
+	rkt_obj->last_cpu_prep_op = 0;
+
+	drm_gem_object_put(gem_obj);
+
+	return 0;
+}
diff --git a/drivers/accel/rocket/rocket_gem.h b/drivers/accel/rocket/rocket_gem.h
index 19b0cf91ddd99bd126c1af30beb169d6101f6dee..1fd11441f5856c4b10ed77b63f34f157cd13e242 100644
--- a/drivers/accel/rocket/rocket_gem.h
+++ b/drivers/accel/rocket/rocket_gem.h
@@ -12,12 +12,17 @@ struct rocket_gem_object {
 	struct mutex mutex;
 	size_t size;
 	u32 offset;
+	u32 last_cpu_prep_op;
 };
 
 struct drm_gem_object *rocket_gem_create_object(struct drm_device *dev, size_t size);
 
 int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *file);
 
+int rocket_ioctl_prep_bo(struct drm_device *dev, void *data, struct drm_file *file);
+
+int rocket_ioctl_fini_bo(struct drm_device *dev, void *data, struct drm_file *file);
+
 static inline
 struct rocket_gem_object *to_rocket_bo(struct drm_gem_object *obj)
 {
diff --git a/include/uapi/drm/rocket_accel.h b/include/uapi/drm/rocket_accel.h
index eb886351134ebef62969b1e1182ccc174f88fe9d..ad6589884880126a248fa646aab7c4034600c11c 100644
--- a/include/uapi/drm/rocket_accel.h
+++ b/include/uapi/drm/rocket_accel.h
@@ -13,9 +13,13 @@ extern "C" {
 
 #define DRM_ROCKET_CREATE_BO 0x00
 #define DRM_ROCKET_SUBMIT 0x01
+#define DRM_ROCKET_PREP_BO 0x02
+#define DRM_ROCKET_FINI_BO 0x03
 
 #define DRM_IOCTL_ROCKET_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_ROCKET_CREATE_BO, struct drm_rocket_create_bo)
 #define DRM_IOCTL_ROCKET_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_SUBMIT, struct drm_rocket_submit)
+#define DRM_IOCTL_ROCKET_PREP_BO DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_PREP_BO, struct drm_rocket_prep_bo)
+#define DRM_IOCTL_ROCKET_FINI_BO DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_FINI_BO, struct drm_rocket_fini_bo)
 
 /**
  * struct drm_rocket_create_bo - ioctl argument for creating Rocket BOs.
@@ -38,6 +42,20 @@ struct drm_rocket_create_bo {
 	__u64 offset;
 };
 
+#define ROCKET_PREP_READ 0x01
+#define ROCKET_PREP_WRITE 0x02
+
+struct drm_rocket_prep_bo {
+	__u32 handle; /* in */
+	__u32 op; /* in, mask of ROCKET_PREP_x */
+	__s64 timeout_ns; /* in */
+};
+
+struct drm_rocket_fini_bo {
+	__u32 handle; /* in */
+	__u32 flags; /* in, placeholder for now, no defined values */
+};
+
 /**
  * struct drm_rocket_task - A task to be run on the NPU
  *