From patchwork Fri Aug 28 07:59:24 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Janusz Krzysztofik
X-Patchwork-Id: 11742337
From: Janusz Krzysztofik
To: igt-dev@lists.freedesktop.org
Cc: Michał Winiarski, intel-gfx@lists.freedesktop.org
Date: Fri, 28 Aug 2020 09:59:24 +0200
Message-Id: <20200828075927.17061-19-janusz.krzysztofik@linux.intel.com>
In-Reply-To: <20200828075927.17061-1-janusz.krzysztofik@linux.intel.com>
References: <20200828075927.17061-1-janusz.krzysztofik@linux.intel.com>
Subject: [Intel-gfx] [PATCH i-g-t v5 18/21] tests/core_hotunplug: Add 'lateclose before restore' variants

If a GPU gets wedged during driver rebind or device re-plug for some
reason, the current hotunbind/hotunplug test variants may time out before
reaching the lateclose phase, resulting in incomplete CI reports.  Add
new test variants which close the device before restoring it.

Also rename the old variants to the more fitting
hotrebind/hotreplug-lateclose and perform health checks both before and
after late close.

v2: Rebase on upstream.
v3: Refresh,
  - further rename hotunbind/hotunplug-lateclose to hotunbind-rebind and
    hotunplug-rescan respectively, then add two more variants under the
    old names which only exercise late close, leaving rebind / rescan
    to be taken care of in the post-subtest recovery phase,
  - also update descriptions of unmodified subtests for consistency.
v4: Refresh,
  - drop subtests with no health checks, adjust timeouts in successors,
  - perform health checks of hot restored devices also before late
    close,
  - in order to be able to safely run a health check while still
    keeping an unbound / unplugged device instance open, also preserve
    the open device fd, not only a close error,
  - adjust subtest descriptions.

Signed-off-by: Janusz Krzysztofik
Reviewed-by: Michał Winiarski # v2
---
 tests/core_hotunplug.c | 98 ++++++++++++++++++++++++++++++++++--------
 1 file changed, 80 insertions(+), 18 deletions(-)

diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
index 1f211a820..305c57a3f 100644
--- a/tests/core_hotunplug.c
+++ b/tests/core_hotunplug.c
@@ -276,17 +276,19 @@ static int local_i915_recover(int i915)
 
 static void healthcheck(struct hotunplug *priv, bool recover)
 {
-	/* preserve error code potentially stored before in priv->fd.drm */
+	/* preserve device fd / close status stored in priv->fd.drm */
+	int fd_drm, saved_fd_drm = priv->fd.drm;
 	bool closed = priv->fd.drm == -1;
-	int fd_drm;
 
 	/* device name may have changed, rebuild IGT device list */
 	igt_devices_scan(true);
 
 	priv->failure = "Device reopen failure!";
 	fd_drm = local_drm_open_driver("re", " for healthcheck");
-	if (closed)	/* store for cleanup if no error code to preserve */
+	if (closed)	/* store for cleanup if not dirty */
 		priv->fd.drm = fd_drm;
+	else		/* force close error should we fail prematurely */
+		priv->fd.drm = -EBADF;
 
 	if (is_i915_device(fd_drm)) {
 		const char *failure = NULL;
@@ -308,8 +310,10 @@ static void healthcheck(struct hotunplug *priv, bool recover)
 	}
 
 	fd_drm = close_device(fd_drm);
-	if (closed)	/* store result if no error code to preserve */
+	if (closed)	/* store result if no dirty status to preserve */
 		priv->fd.drm = fd_drm;
+	else if (fd_drm == -1)	/* cancel fake error, restore saved status */
+		priv->fd.drm = saved_fd_drm;
 
 	/* not only request igt_abort on failure, also fail the health check */
 	igt_fail_on_f(priv->failure, "%s\n", priv->failure);
@@ -381,31 +385,65 @@ static void unplug_rescan(struct hotunplug *priv)
 	healthcheck(priv, false);
 }
 
-static void hotunbind_lateclose(struct hotunplug *priv)
+static void hotunbind_rebind(struct hotunplug *priv)
 {
 	igt_assert_eq(priv->fd.drm, -1);
-	priv->fd.drm = local_drm_open_driver("", " for hotunbind");
+	priv->fd.drm = local_drm_open_driver("", " for hotrebind");
 
 	driver_unbind(priv, "hot ", 0);
 
-	driver_bind(priv, 0);
-
 	igt_debug("late closing the unbound device instance\n");
 	priv->fd.drm = close_device(priv->fd.drm);
 	igt_assert_eq(priv->fd.drm, -1);
 
+	driver_bind(priv, 0);
+
 	healthcheck(priv, false);
 }
 
-static void hotunplug_lateclose(struct hotunplug *priv)
+static void hotunplug_rescan(struct hotunplug *priv)
 {
 	igt_assert_eq(priv->fd.drm, -1);
-	priv->fd.drm = local_drm_open_driver("", " for hotunplug");
+	priv->fd.drm = local_drm_open_driver("", " for hotreplug");
 
 	device_unplug(priv, "hot ", 0);
 
+	igt_debug("late closing the removed device instance\n");
+	priv->fd.drm = close_device(priv->fd.drm);
+	igt_assert_eq(priv->fd.drm, -1);
+
 	bus_rescan(priv, 0);
 
+	healthcheck(priv, false);
+}
+
+static void hotrebind_lateclose(struct hotunplug *priv)
+{
+	priv->fd.drm = local_drm_open_driver("", " for hotrebind");
+
+	driver_unbind(priv, "hot ", 60);
+
+	driver_bind(priv, 0);
+
+	healthcheck(priv, false);
+
+	igt_debug("late closing the unbound device instance\n");
+	priv->fd.drm = close_device(priv->fd.drm);
+	igt_assert_eq(priv->fd.drm, -1);
+
+	healthcheck(priv, false);
+}
+
+static void hotreplug_lateclose(struct hotunplug *priv)
+{
+	priv->fd.drm = local_drm_open_driver("", " for hotreplug");
+
+	device_unplug(priv, "hot ", 60);
+
+	bus_rescan(priv, 0);
+
+	healthcheck(priv, false);
+
 	igt_debug("late closing the removed device instance\n");
 	priv->fd.drm = close_device(priv->fd.drm);
 	igt_assert_eq(priv->fd.drm, -1);
@@ -443,7 +481,7 @@ igt_main
 	}
 
 	igt_subtest_group {
-		igt_describe("Check if the driver can be cleanly unbound from a device believed to be closed");
+		igt_describe("Check if the driver can be cleanly unbound from a device believed to be closed, then rebound");
 		igt_subtest("unbind-rebind")
 			unbind_rebind(&priv);
 
@@ -455,7 +493,7 @@ igt_main
 		post_healthcheck(&priv);
 
 	igt_subtest_group {
-		igt_describe("Check if a device believed to be closed can be cleanly unplugged");
+		igt_describe("Check if a device believed to be closed can be cleanly unplugged, then restored");
 		igt_subtest("unplug-rescan")
 			unplug_rescan(&priv);
 
@@ -467,9 +505,33 @@ igt_main
 		post_healthcheck(&priv);
 
 	igt_subtest_group {
-		igt_describe("Check if the driver can be cleanly unbound from a still open device, then released");
-		igt_subtest("hotunbind-lateclose")
-			hotunbind_lateclose(&priv);
+		igt_describe("Check if the driver can be cleanly unbound from an open device, then released and rebound");
+		igt_subtest("hotunbind-rebind")
+			hotunbind_rebind(&priv);
+
+		igt_fixture
+			recover(&priv);
+	}
+
+	igt_fixture
+		post_healthcheck(&priv);
+
+	igt_subtest_group {
+		igt_describe("Check if an open device can be cleanly unplugged, then released and restored");
+		igt_subtest("hotunplug-rescan")
+			hotunplug_rescan(&priv);
+
+		igt_fixture
+			recover(&priv);
+	}
+
+	igt_fixture
+		post_healthcheck(&priv);
+
+	igt_subtest_group {
+		igt_describe("Check if the driver hot unbound from a still open device can be cleanly rebound, then the old instance released");
+		igt_subtest("hotrebind-lateclose")
+			hotrebind_lateclose(&priv);
 
 		igt_fixture
 			recover(&priv);
@@ -479,9 +541,9 @@ igt_main
 		post_healthcheck(&priv);
 
 	igt_subtest_group {
-		igt_describe("Check if a still open device can be cleanly unplugged, then released");
-		igt_subtest("hotunplug-lateclose")
-			hotunplug_lateclose(&priv);
+		igt_describe("Check if a still open while hot unplugged device can be cleanly restored, then the old instance released");
+		igt_subtest("hotreplug-lateclose")
+			hotreplug_lateclose(&priv);
 
 		igt_fixture
 			recover(&priv);
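
Note for readers of the healthcheck() hunks: the way priv->fd.drm is saved, overwritten with a fake error, and restored can be modeled in isolation. The following is a minimal standalone sketch and not part of the patch. The helper names slot_before_check() and slot_after_check() are hypothetical, and the sketch assumes that close_device() returns -1 on success and some other value on failure, as the igt_assert_eq(priv->fd.drm, -1) checks in the patch suggest.

```c
#include <errno.h>

/*
 * Hypothetical model of how healthcheck() treats priv->fd.drm in this patch.
 * -1 means "no device instance held open"; any other value is either an
 * open fd or a stored close error that must survive the health check.
 */

/* Value to keep in the slot while the health check runs. */
int slot_before_check(int slot, int reopened_fd)
{
	if (slot == -1)		/* clean: track the reopened fd for cleanup */
		return reopened_fd;
	return -EBADF;		/* dirty: fake an error in case we fail early */
}

/* Value to leave in the slot once the health check fd has been closed. */
int slot_after_check(int saved_slot, int close_result)
{
	if (saved_slot == -1)	/* clean: store the close result as is */
		return close_result;
	if (close_result == -1)	/* dirty, close OK: restore the saved status */
		return saved_slot;
	return -EBADF;		/* dirty, close failed: keep the fake error */
}
```

With a held-open fd of 7, slot_after_check(7, -1) gives back 7, which is why the late close in hotrebind_lateclose() and hotreplug_lateclose() can still operate on the original device instance after the first health check.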