From patchwork Wed May 27 03:20:08 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Alex Deucher
X-Patchwork-Id: 6486781
From: Alex Deucher
To: dri-devel@lists.freedesktop.org
Cc: Marek Olšák
Subject: [PATCH 69/88] drm/amdgpu: add and implement the GPU reset status query
Date: Tue, 26 May 2015 23:20:08 -0400
Message-Id: <1432696827-3752-39-git-send-email-alexander.deucher@amd.com>
In-Reply-To: <1432696827-3752-1-git-send-email-alexander.deucher@amd.com>
References: <1432696827-3752-1-git-send-email-alexander.deucher@amd.com>

From: Marek Olšák

Signed-off-by: Marek Olšák
Reviewed-by: Christian König
Reviewed-by: Jammy Zhou
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  6 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c    | 36 +++++++++++++++++++-----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  1 +
 include/uapi/drm/amdgpu_drm.h              | 11 ++++++++-
 4 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 66b5bd0..ebff89e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1040,7 +1040,7 @@ struct amdgpu_vm_manager {
 
 struct amdgpu_ctx_state {
 	uint64_t flags;
-	uint64_t hangs;
+	uint32_t hangs;
 };
 
 struct amdgpu_ctx {
@@ -1049,6 +1049,7 @@ struct amdgpu_ctx {
 	struct amdgpu_fpriv *fpriv;
 	struct amdgpu_ctx_state state;
 	uint32_t id;
+	unsigned reset_counter;
 };
 
 struct amdgpu_ctx_mgr {
@@ -1897,8 +1898,6 @@ int amdgpu_ctx_alloc(struct amdgpu_device *adev,struct amdgpu_fpriv *fpriv,
 		     uint32_t *id,uint32_t flags);
 int amdgpu_ctx_free(struct amdgpu_device *adev, struct amdgpu_fpriv *fpriv,
 		    uint32_t id);
-int amdgpu_ctx_query(struct amdgpu_device *adev, struct amdgpu_fpriv *fpriv,
-		     uint32_t id,struct amdgpu_ctx_state *state);
 void amdgpu_ctx_fini(struct amdgpu_fpriv *fpriv);
 
 struct amdgpu_ctx *amdgpu_ctx_get(struct amdgpu_fpriv *fpriv, uint32_t id);
@@ -2006,6 +2005,7 @@ struct amdgpu_device {
 	atomic64_t vram_vis_usage;
 	atomic64_t gtt_usage;
 	atomic64_t num_bytes_moved;
+	atomic_t gpu_reset_counter;
 
 	/* display */
 	struct amdgpu_mode_info mode_info;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index bcd332e..6c66ac8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -81,21 +81,36 @@ int amdgpu_ctx_free(struct amdgpu_device *adev, struct amdgpu_fpriv *fpriv, uint
 	return -EINVAL;
 }
 
-int amdgpu_ctx_query(struct amdgpu_device *adev, struct amdgpu_fpriv *fpriv, uint32_t id, struct amdgpu_ctx_state *state)
+static int amdgpu_ctx_query(struct amdgpu_device *adev,
+			    struct amdgpu_fpriv *fpriv, uint32_t id,
+			    union drm_amdgpu_ctx_out *out)
 {
 	struct amdgpu_ctx *ctx;
 	struct amdgpu_ctx_mgr *mgr = &fpriv->ctx_mgr;
+	unsigned reset_counter;
 
 	mutex_lock(&mgr->lock);
 	ctx = idr_find(&mgr->ctx_handles, id);
-	if (ctx) {
-		/* state should alter with CS activity */
-		*state = ctx->state;
+	if (!ctx) {
 		mutex_unlock(&mgr->lock);
-		return 0;
+		return -EINVAL;
 	}
+
+	/* TODO: these two are always zero */
+	out->state.flags = ctx->state.flags;
+	out->state.hangs = ctx->state.hangs;
+
+	/* determine if a GPU reset has occurred since the last call */
+	reset_counter = atomic_read(&adev->gpu_reset_counter);
+	/* TODO: this should ideally return NO, GUILTY, or INNOCENT. */
+	if (ctx->reset_counter == reset_counter)
+		out->state.reset_status = AMDGPU_CTX_NO_RESET;
+	else
+		out->state.reset_status = AMDGPU_CTX_UNKNOWN_RESET;
+	ctx->reset_counter = reset_counter;
+
 	mutex_unlock(&mgr->lock);
-	return -EINVAL;
+	return 0;
 }
 
 void amdgpu_ctx_fini(struct amdgpu_fpriv *fpriv)
@@ -115,12 +130,11 @@ void amdgpu_ctx_fini(struct amdgpu_fpriv *fpriv)
 }
 
 int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
-			    struct drm_file *filp)
+		     struct drm_file *filp)
 {
 	int r;
 	uint32_t id;
 	uint32_t flags;
-	struct amdgpu_ctx_state state;
 
 	union drm_amdgpu_ctx *args = data;
 	struct amdgpu_device *adev = dev->dev_private;
@@ -139,11 +153,7 @@ int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
 		r = amdgpu_ctx_free(adev, fpriv, id);
 		break;
 	case AMDGPU_CTX_OP_QUERY_STATE:
-		r = amdgpu_ctx_query(adev, fpriv, id, &state);
-		if (r == 0) {
-			args->out.state.flags = state.flags;
-			args->out.state.hangs = state.hangs;
-		}
+		r = amdgpu_ctx_query(adev, fpriv, id, &args->out);
 		break;
 	default:
 		return -EINVAL;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 61cf5ad..3448d9f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1781,6 +1781,7 @@ int amdgpu_gpu_reset(struct amdgpu_device *adev)
 	}
 
 	adev->needs_reset = false;
+	atomic_inc(&adev->gpu_reset_counter);
 
 	/* block TTM */
 	resched = ttm_bo_lock_delayed_workqueue(&adev->mman.bdev);
diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index 65da7cd..46580e9 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -149,6 +149,12 @@ union drm_amdgpu_bo_list {
 
 #define AMDGPU_CTX_OP_STATE_RUNNING	1
 
+/* GPU reset status */
+#define AMDGPU_CTX_NO_RESET		0
+#define AMDGPU_CTX_GUILTY_RESET		1 /* this context caused it */
+#define AMDGPU_CTX_INNOCENT_RESET	2 /* some other context caused it */
+#define AMDGPU_CTX_UNKNOWN_RESET	3 /* unknown cause */
+
 struct drm_amdgpu_ctx_in {
 	uint32_t	op;
 	uint32_t	flags;
@@ -164,7 +170,10 @@ union drm_amdgpu_ctx_out {
 
 		struct {
 			uint64_t	flags;
-			uint64_t	hangs;
+			/** Number of resets caused by this context so far. */
+			uint32_t	hangs;
+			/** Reset status since the last call of the ioctl. */
+			uint32_t	reset_status;
 		} state;
 };
 
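
The counter comparison done in amdgpu_ctx_query() can be tried outside the kernel. What follows is a minimal userspace sketch, not driver code: struct model_device, struct model_ctx and the model_* helpers are hypothetical stand-ins invented for illustration, C11 atomics replace the kernel's atomic_t, and the serialization that the driver gets from mgr->lock is left out. Only the two status values are taken from the uapi header in this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Values copied from include/uapi/drm/amdgpu_drm.h as added by the patch. */
#define AMDGPU_CTX_NO_RESET      0
#define AMDGPU_CTX_UNKNOWN_RESET 3

/* Hypothetical stand-in for the gpu_reset_counter field of amdgpu_device. */
struct model_device {
	atomic_uint gpu_reset_counter;
};

/* Hypothetical stand-in for the reset_counter field of amdgpu_ctx. */
struct model_ctx {
	unsigned reset_counter;
};

static inline void model_device_init(struct model_device *dev)
{
	assert(dev);
	atomic_init(&dev->gpu_reset_counter, 0u);
}

/* Assumption: a new context starts with a zeroed snapshot. */
static inline void model_ctx_init(struct model_ctx *ctx)
{
	assert(ctx);
	ctx->reset_counter = 0u;
}

/* Mirrors the atomic_inc() the patch adds to amdgpu_gpu_reset(). */
static inline void model_gpu_reset(struct model_device *dev)
{
	assert(dev);
	atomic_fetch_add(&dev->gpu_reset_counter, 1u);
}

/*
 * Mirrors the comparison in amdgpu_ctx_query(): report a reset if the
 * device counter moved since this context last asked, then refresh the
 * per-context snapshot so the next query starts from the current value.
 */
static inline uint32_t model_query_reset_status(struct model_device *dev,
						struct model_ctx *ctx)
{
	unsigned now;
	uint32_t status;

	assert(dev && ctx);
	now = atomic_load(&dev->gpu_reset_counter);
	status = (ctx->reset_counter == now) ? AMDGPU_CTX_NO_RESET
					     : AMDGPU_CTX_UNKNOWN_RESET;
	ctx->reset_counter = now;
	return status;
}
```

In this model, a freshly initialized device and context give AMDGPU_CTX_NO_RESET on the first query. After one or more model_gpu_reset() calls, the next query returns AMDGPU_CTX_UNKNOWN_RESET exactly once, and the query after that returns AMDGPU_CTX_NO_RESET again because the snapshot was refreshed. Several resets between two queries are therefore reported as a single event.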