From patchwork Wed Apr 15 13:15:15 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Janusz Krzysztofik
X-Patchwork-Id: 11491271
From: Janusz Krzysztofik
To: igt-dev@lists.freedesktop.org
Cc: Janusz Krzysztofik, intel-gfx@lists.freedesktop.org, Chris Wilson
Date: Wed, 15 Apr 2020 15:15:15 +0200
Message-Id: <20200415131515.7507-1-janusz.krzysztofik@linux.intel.com>
Subject: [Intel-gfx] [PATCH i-g-t v15] tests: Add a test for device hot unplug

From: Janusz Krzysztofik

There is a test which verifies unloading of the i915 driver module, but
no test exists that checks how a driver behaves when it gets unbound
from a device or when the device gets unplugged.  Implement such a test
using the sysfs interface.

Two minimalistic subtests - "unbind-rebind" and "unplug-rescan" -
perform the named operations on a DRM device which is believed to be
not in use.  Another pair of subtests, named "hotunbind-lateclose" and
"hotunplug-lateclose", do the same on a DRM device while keeping its
file descriptor open, and close it only afterwards.

v2: Run a subprocess with dummy_load instead of external command
    (Antonio).
v3: Run dummy_load from the test process directly (Antonio).
v4: Run dummy_load from inside subtests (Antonio).
v5: Try to restore the device to a working state after each subtest
    (Petri, Daniel).
v6: Run workload inside an igt helper subprocess so resources consumed
    by the workload are cleaned up automatically on workload subprocess
    crash, without affecting test results,
  - move the igt helper with workload back from subtests to initial
    fixture so workload crash also does not affect test results,
  - other cleanups suggested by Katarzyna and Chris.
v7: No changes.
v8: Move workload functions back from fixture to subtests,
  - register different actions and different workloads in respective
    tables and iterate over those tables while enumerating subtests,
  - introduce new subtest flavors by simply omitting module unload step,
  - instead of simply requesting bus rescan or not, introduce action
    specific device recovery helpers, required specifically with those
    new subtests not touching the module,
  - split workload functions in two parts, one spawning the workload,
    the other waiting for its completion,
  - for the new subtests not requiring module unload, run workload
    functions directly from the test process and use new workload
    completion wait functions in place of subprocess completion wait,
  - take more control over logging, longjumps and exit codes in
    workload subprocesses,
  - add some debug messages for easy progress watching,
  - move function API descriptions on top of respective typedefs.
v9: All changes after Daniel's comments - thanks!
  - flatten the code, don't try to create a midlayer (Daniel),
  - provide minimal subtests that even don't keep device open (Daniel),
  - don't use driver unbind in more advanced subtests (Daniel),
  - provide subtests with different level of resources allocated
    during device unplug (Daniel),
  - provide subtests which check driver behavior after device hot
    unplug (Daniel).
v10: Rename variables and function arguments to something that
    indicates they're file descriptors (Daniel),
  - introduce a data structure that contains various file descriptors
    and a helper function to set them all (Daniel),
  - fix strange indentation (Daniel),
  - limit scope to first three subtests as the initial set of tests to
    merge (Daniel).
v11: Fix typos in some comments,
  - use SPDX license identifier,
  - include a per-patch changelog in the commit message (Daniel).
v12: We don't use SPDX license identifiers nor GPL-2.0 in IGT (Petri),
  - avoid chipset, make sure we reopen the same device (Chris),
  - rename subtest "drm_open-hotunplug" to "hotunplug-lateclose",
  - add subtest "hotunbind-lateclose" (less affected by IOMMU issues),
  - move some redundant code to helpers,
  - reorder some helpers,
  - reword some messages and comments,
  - clean up headers.
v13: Add test / subtest descriptions (patchwork).
v14: Extract redundant device rescan and reopen code to a 'healthcheck'
    helper,
  - call igt_abort_on_f() on device reopen failure (Petri),
  - if any timeout set with igt_set_timeout() inside a subtest expires,
    call igt_abort_on_f() from a follow-up igt_fixture (Petri),
  - when running on a i915 device, require GEM and call
    igt_abort_on_f() if no usable GEM is detected on device reopen.
v15: Add the test to CI blacklist (Martin).

Signed-off-by: Janusz Krzysztofik
Cc: Antonio Argenziano
Cc: Petri Latvala
Cc: Daniel Vetter
Cc: Katarzyna Dec
Cc: Martin Peres
Acked-by: Chris Wilson
Acked-by: Petri Latvala
---
 tests/Makefile.sources       |   1 +
 tests/core_hotunplug.c       | 300 +++++++++++++++++++++++++++++++++++
 tests/intel-ci/blacklist.txt |   1 +
 tests/meson.build            |   1 +
 4 files changed, 303 insertions(+)
 create mode 100644 tests/core_hotunplug.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 4e44c98c2..32cbbf4f9 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -18,6 +18,7 @@ TESTS_progs = \
 	core_getclient \
 	core_getstats \
 	core_getversion \
+	core_hotunplug \
 	core_setmaster \
 	core_setmaster_vs_auth \
 	debugfs_test \
diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
new file mode 100644
index 000000000..f9cfc8c3c
--- /dev/null
+++ b/tests/core_hotunplug.c
@@ -0,0 +1,300 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "igt.h"
+#include "igt_device_scan.h"
+#include "igt_kmod.h"
+#include "igt_sysfs.h"
+
+IGT_TEST_DESCRIPTION("Examine behavior of a driver on device hot unplug");
+
+struct hotunplug {
+	struct {
+		int drm;
+		int sysfs_dev;
+		int sysfs_bus;
+		int sysfs_drv;
+	} fd;
+	char *dev_bus_addr;
+};
+
+/* Helpers */
+
+static void prepare_for_unbind(struct hotunplug *priv, char *buf, int buflen)
+{
+	int len;
+
+	igt_assert(buflen);
+
+	priv->fd.sysfs_drv = openat(priv->fd.sysfs_dev, "device/driver",
+				    O_DIRECTORY);
+	igt_assert(priv->fd.sysfs_drv >= 0);
+
+	len = readlinkat(priv->fd.sysfs_dev, "device", buf, buflen - 1);
+	buf[len] = '\0';
+	priv->dev_bus_addr = strrchr(buf, '/');
+	igt_assert(priv->dev_bus_addr++);
+
+	/* sysfs_dev no longer needed */
+	close(priv->fd.sysfs_dev);
+}
+
+static void prepare(struct hotunplug *priv, char *buf, int buflen)
+{
+	igt_debug("opening device\n");
+	priv->fd.drm = __drm_open_driver(DRIVER_ANY);
+	igt_assert(priv->fd.drm >= 0);
+
+	priv->fd.sysfs_dev = igt_sysfs_open(priv->fd.drm);
+	igt_assert(priv->fd.sysfs_dev >= 0);
+
+	if (buf) {
+		prepare_for_unbind(priv, buf, buflen);
+	} else {
+		/* prepare for bus rescan */
+		priv->fd.sysfs_bus = openat(priv->fd.sysfs_dev,
+					    "device/subsystem", O_DIRECTORY);
+		igt_assert(priv->fd.sysfs_bus >= 0);
+	}
+}
+
+static const char *failure;
+
+/* Unbind the driver from the device */
+static void driver_unbind(int fd_sysfs_drv, const char *dev_bus_addr)
+{
+	failure = "Driver unbind timeout!";
+	igt_set_timeout(60, failure);
+	igt_sysfs_set(fd_sysfs_drv, "unbind", dev_bus_addr);
+	igt_reset_timeout();
+	failure = NULL;
+
+	/* don't close fd_sysfs_drv, it will be used for driver rebinding */
+}
+
+/* Re-bind the driver to the device */
+static void driver_bind(int fd_sysfs_drv, const char *dev_bus_addr)
+{
+	failure = "Driver re-bind timeout!";
+	igt_set_timeout(60, failure);
+	igt_sysfs_set(fd_sysfs_drv, "bind", dev_bus_addr);
+	igt_reset_timeout();
+	failure = NULL;
+
+	close(fd_sysfs_drv);
+}
+
+/* Remove (virtually unplug) the device from its bus */
+static void device_unplug(int fd_sysfs_dev)
+{
+	failure = "Device unplug timeout!";
+	igt_set_timeout(60, failure);
+	igt_sysfs_set(fd_sysfs_dev, "device/remove", "1");
+	igt_reset_timeout();
+	failure = NULL;
+
+	close(fd_sysfs_dev);
+}
+
+/* Re-discover the device by rescanning its bus */
+static void bus_rescan(int fd_sysfs_bus)
+{
+	failure = "Bus rescan timeout!";
+	igt_set_timeout(60, failure);
+	igt_sysfs_set(fd_sysfs_bus, "rescan", "1");
+	igt_reset_timeout();
+	failure = NULL;
+
+	close(fd_sysfs_bus);
+}
+
+static void healthcheck(void)
+{
+	int fd_drm;
+
+	/* device name may have changed, rebuild IGT device list */
+	igt_devices_scan(true);
+
+	igt_debug("reopening the device\n");
+	fd_drm = __drm_open_driver(DRIVER_ANY);
+	igt_abort_on_f(fd_drm < 0, "Device reopen failure");
+
+	if (is_i915_device(fd_drm)) {
+		failure = "GEM failure";
+		igt_require_gem(fd_drm);
+		failure = NULL;
+	}
+
+	close(fd_drm);
+}
+
+static void set_filter_from_device(int fd)
+{
+	const char *filter_type = "sys:";
+	char filter[strlen(filter_type) + PATH_MAX + 1];
+	char *dst = stpcpy(filter, filter_type);
+	char path[PATH_MAX + 1];
+
+	igt_assert(igt_sysfs_path(fd, path, PATH_MAX));
+	strncat(path, "/device", PATH_MAX - strlen(path));
+	igt_assert(realpath(path, dst));
+
+	igt_device_filter_set(filter);
+}
+
+/* Subtests */
+
+static void unbind_rebind(void)
+{
+	struct hotunplug priv;
+	char buf[PATH_MAX];
+
+	prepare(&priv, buf, sizeof(buf));
+
+	igt_debug("closing the device\n");
+	close(priv.fd.drm);
+
+	igt_debug("unbinding the driver from the device\n");
+	driver_unbind(priv.fd.sysfs_drv, priv.dev_bus_addr);
+
+	igt_debug("rebinding the driver to the device\n");
+	driver_bind(priv.fd.sysfs_drv, priv.dev_bus_addr);
+
+	healthcheck();
+}
+
+static void unplug_rescan(void)
+{
+	struct hotunplug priv;
+
+	prepare(&priv, NULL, 0);
+
+	igt_debug("closing the device\n");
+	close(priv.fd.drm);
+
+	igt_debug("unplugging the device\n");
+	device_unplug(priv.fd.sysfs_dev);
+
+	igt_debug("recovering the device\n");
+	bus_rescan(priv.fd.sysfs_bus);
+
+	healthcheck();
+}
+
+static void hotunbind_lateclose(void)
+{
+	struct hotunplug priv;
+	char buf[PATH_MAX];
+
+	prepare(&priv, buf, sizeof(buf));
+
+	igt_debug("hot unbinding the driver from the device\n");
+	driver_unbind(priv.fd.sysfs_drv, priv.dev_bus_addr);
+
+	igt_debug("rebinding the driver to the device\n");
+	driver_bind(priv.fd.sysfs_drv, priv.dev_bus_addr);
+
+	igt_debug("late closing the unbound device instance\n");
+	close(priv.fd.drm);
+
+	healthcheck();
+}
+
+static void hotunplug_lateclose(void)
+{
+	struct hotunplug priv;
+
+	prepare(&priv, NULL, 0);
+
+	igt_debug("hot unplugging the device\n");
+	device_unplug(priv.fd.sysfs_dev);
+
+	igt_debug("recovering the device\n");
+	bus_rescan(priv.fd.sysfs_bus);
+
+	igt_debug("late closing the removed device instance\n");
+	close(priv.fd.drm);
+
+	healthcheck();
+}
+
+/* Main */
+
+igt_main
+{
+	igt_fixture {
+		int fd_drm;
+
+		/**
+		 * As subtests must be able to close examined devices
+		 * completely, don't use drm_open_driver() as it keeps
+		 * a device file descriptor open for exit handler use.
+		 */
+		fd_drm = __drm_open_driver(DRIVER_ANY);
+		igt_assert(fd_drm >= 0);
+
+		if (is_i915_device(fd_drm))
+			igt_require_gem(fd_drm);
+
+		/* Make sure subtests always reopen the same device */
+		set_filter_from_device(fd_drm);
+
+		close(fd_drm);
+	}
+
+	igt_describe("Check if the driver can be cleanly unbound from a device believed to be closed");
+	igt_subtest("unbind-rebind")
+		unbind_rebind();
+
+	igt_fixture
+		igt_abort_on_f(failure, "%s\n", failure);
+
+	igt_describe("Check if a device believed to be closed can be cleanly unplugged");
+	igt_subtest("unplug-rescan")
+		unplug_rescan();
+
+	igt_fixture
+		igt_abort_on_f(failure, "%s\n", failure);
+
+	igt_describe("Check if the driver can be cleanly unbound from a still open device, then released");
+	igt_subtest("hotunbind-lateclose")
+		hotunbind_lateclose();
+
+	igt_fixture
+		igt_abort_on_f(failure, "%s\n", failure);
+
+	igt_describe("Check if a still open device can be cleanly unplugged, then released");
+	igt_subtest("hotunplug-lateclose")
+		hotunplug_lateclose();
+
+	igt_fixture
+		igt_abort_on_f(failure, "%s\n", failure);
+}
diff --git a/tests/intel-ci/blacklist.txt b/tests/intel-ci/blacklist.txt
index ee7045f03..201f4b1b4 100644
--- a/tests/intel-ci/blacklist.txt
+++ b/tests/intel-ci/blacklist.txt
@@ -117,3 +117,4 @@ igt@.*@.*pipe-f($|-.*)
 # Since 5.7-rc1, this test has produced tens of megabytes of kernel
 # logs.
 igt@perf_pmu@cpu-hotplug
+igt@core_hotunplug@.*
diff --git a/tests/meson.build b/tests/meson.build
index e882f4dcd..0bdcfbe4c 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -3,6 +3,7 @@ test_progs = [
 	'core_getclient',
 	'core_getstats',
 	'core_getversion',
+	'core_hotunplug',
 	'core_setmaster',
 	'core_setmaster_vs_auth',
 	'debugfs_test',
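
For readers less familiar with the sysfs side of this test: prepare_for_unbind() derives the device's bus address from the last component of the "device" symlink, and driver_unbind() / driver_bind() then write that address to the driver's "unbind" / "bind" attributes through igt_sysfs_set(). Below is a minimal standalone sketch of those two steps using plain POSIX calls. The helper names bus_addr_from_link() and write_attr() are hypothetical and are not part of IGT or of this patch; the sketch also assumes that a failed or possibly truncated readlinkat() should be reported to the caller rather than asserted on.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of IGT: read the symlink `name` relative to
 * `dirfd` into `buf` and return a pointer to its last path component.  For a
 * sysfs "device" link this is the bus address, e.g. "0000:00:02.0".
 * Returns NULL if the link cannot be read, may have been truncated, or
 * contains no '/'.
 */
static const char *bus_addr_from_link(int dirfd, const char *name,
				      char *buf, size_t buflen)
{
	ssize_t len;
	char *slash;

	if (buflen < 2)
		return NULL;

	len = readlinkat(dirfd, name, buf, buflen - 1);
	/* a result filling the whole buffer may have been truncated */
	if (len < 0 || (size_t)len >= buflen - 1)
		return NULL;
	buf[len] = '\0';

	slash = strrchr(buf, '/');
	return slash ? slash + 1 : NULL;
}

/*
 * Hypothetical helper, not part of IGT: write `value` to the existing
 * attribute file `attr` relative to `dirfd`, like `echo -n value > attr`.
 * Returns 0 on success, -1 on open failure, write error or short write.
 */
static int write_attr(int dirfd, const char *attr, const char *value)
{
	size_t len = strlen(value);
	ssize_t ret;
	int fd;

	fd = openat(dirfd, attr, O_WRONLY);
	if (fd < 0)
		return -1;

	ret = write(fd, value, len);
	close(fd);

	return (ret >= 0 && (size_t)ret == len) ? 0 : -1;
}
```

With a descriptor opened on the DRM device's sysfs directory and another on its "device/driver" directory, write_attr(fd_drv, "unbind", addr) followed by write_attr(fd_drv, "bind", addr) would correspond to the unbind-rebind sequence. One difference from the patch is that the sketch checks the readlinkat() return value before using it as an index into buf.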