Message ID | 20240730223932.3432862-2-sdf@fomichev.me (mailing list archive) |
---|---|
State | New |
Series | [net-next,v2,1/2] selftests: net-drv: exercise queue stats when the device is down |
Stanislav Fomichev <sdf@fomichev.me> writes:

> Add new @ksft_disruptive decorator to mark the tests that might
> be disruptive to the system. Depending on how well the previous
> test works in the CI we might want to disable disruptive tests
> by default and only let the developers run them manually.
>
> KSFT framework runs disruptive tests by default. DISRUPTIVE=False
> environment (or config file) can be used to disable these tests.
> ksft_setup should be called by the test cases that want to use the
> new decorator (ksft_setup is only called via NetDrvEnv/NetDrvEpEnv for now).

Is that something that tests would want to genuinely do, manage this
stuff by hand? I don't really mind having the helper globally
accessible, but by default I'd keep it inside env.py and expect others
to inherit appropriately.

> @@ -127,6 +129,36 @@ KSFT_RESULT_ALL = True
>              KSFT_RESULT = False
>
>
> +def ksft_disruptive(func):
> +    """
> +    Decorator that marks the test as disruptive (e.g. the test
> +    that can down the interface). Disruptive tests can be skipped
> +    by passing DISRUPTIVE=False environment variable.
> +    """
> +
> +    @functools.wraps(func)
> +    def wrapper(*args, **kwargs):
> +        if not KSFT_DISRUPTIVE:
> +            raise KsftSkipEx(f"marked as disruptive")

Since this is a skip, it will fail the overall run. But that happened
because the user themselves set DISRUPTIVE=0 to avoid, um, disruption
to the system. I think it should either be xfail, or something else
dedicated that conveys the idea that we didn't run the test, but
that's fine.

Using xfail for this somehow doesn't seem correct, nothing failed.
Maybe we need KsftOmitEx, which would basically be an xfail with a
more appropriate name?

> +def ksft_setup(env):
> +    """
> +    Setup test framework global state from the environment.
> +    """
> +
> +    def get_bool(env, name):
> +        return env.get(name, "").lower() in ["true", "1"]

"yes" should also be considered, for compatibility with the bash
selftests.

It's also odd that 0 is false, 1 is true, but 2 is false again. How
about something like this?

    def get_bool(env, name):
        value = env.get(name, "").lower()
        if value in ["yes", "true"]:
            return True
        if value in ["no", "false"]:
            return False

        try:
            return bool(int(value))
        except:
            raise something something invalid value

So that people at least know if they set it to nonsense that it's
nonsense?

Dunno. The bash selftests just take "yes" and don't care about being
very user friendly in that regard at all. _load_env_file() likewise
looks like it just takes strings and doesn't care about the semantics.
So I don't feel too strongly about this at all. Besides the "yes" bit,
that should be recognized.
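For reference, a runnable version of the stricter get_bool() sketched above
might look like the following; the ValueError in the final branch is an
assumption, since the suggestion leaves the exact exception open ("raise
something something invalid value"):

    def get_bool(env, name):
        # Hypothetical sketch of the parsing suggested above; the v2
        # patch in this thread only accepts "true"/"1".  Accept the
        # spellings the bash selftests use, fall back to int() for bare
        # numbers, and complain about anything else.
        value = env.get(name, "").lower()
        if value in ["yes", "true"]:
            return True
        if value in ["no", "false"]:
            return False
        try:
            return bool(int(value))
        except ValueError:
            raise ValueError(f"invalid boolean value for {name}: {value!r}")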
On 07/31, Petr Machata wrote:
>
> Stanislav Fomichev <sdf@fomichev.me> writes:
>
> > Add new @ksft_disruptive decorator to mark the tests that might
> > be disruptive to the system. Depending on how well the previous
> > test works in the CI we might want to disable disruptive tests
> > by default and only let the developers run them manually.
> >
> > KSFT framework runs disruptive tests by default. DISRUPTIVE=False
> > environment (or config file) can be used to disable these tests.
> > ksft_setup should be called by the test cases that want to use the
> > new decorator (ksft_setup is only called via NetDrvEnv/NetDrvEpEnv for now).
>
> Is that something that tests would want to genuinely do, manage this
> stuff by hand? I don't really mind having the helper globally
> accessible, but by default I'd keep it inside env.py and expect others
> to inherit appropriately.

Hard to say how well it's gonna work tbh. But at least from
what I've seen, large code bases (outside of kernel) usually
have some way to attach metadata to the testcase to indicate
various things. For example, this is how the timeout
can be controlled:

https://bazel.build/reference/test-encyclopedia#role-test-runner

So I'd imagine we can eventually have @ksft_short/@ksft_long to
control that using similar techniques.

Regarding keeping it inside env.py: can you expand more on what
you mean by having the default in env.py?

> > @@ -127,6 +129,36 @@ KSFT_RESULT_ALL = True
> >              KSFT_RESULT = False
> >
> >
> > +def ksft_disruptive(func):
> > +    """
> > +    Decorator that marks the test as disruptive (e.g. the test
> > +    that can down the interface). Disruptive tests can be skipped
> > +    by passing DISRUPTIVE=False environment variable.
> > +    """
> > +
> > +    @functools.wraps(func)
> > +    def wrapper(*args, **kwargs):
> > +        if not KSFT_DISRUPTIVE:
> > +            raise KsftSkipEx(f"marked as disruptive")
>
> Since this is a skip, it will fail the overall run. But that happened
> because the user themselves set DISRUPTIVE=0 to avoid, um, disruption
> to the system. I think it should either be xfail, or something else
> dedicated that conveys the idea that we didn't run the test, but
> that's fine.
>
> Using xfail for this somehow doesn't seem correct, nothing failed.
> Maybe we need KsftOmitEx, which would basically be an xfail with a
> more appropriate name?

Are you sure skip will fail the overall run? At least looking at
tools/testing/selftests/net/lib/py/ksft.py, both skip and xfail are
considered KSFT_RESULT=True. Or am I looking at the wrong place?

> > +def ksft_setup(env):
> > +    """
> > +    Setup test framework global state from the environment.
> > +    """
> > +
> > +    def get_bool(env, name):
> > +        return env.get(name, "").lower() in ["true", "1"]
>
> "yes" should also be considered, for compatibility with the bash
> selftests.
>
> It's also odd that 0 is false, 1 is true, but 2 is false again. How
> about something like this?
>
>     def get_bool(env, name):
>         value = env.get(name, "").lower()
>         if value in ["yes", "true"]:
>             return True
>         if value in ["no", "false"]:
>             return False
>
>         try:
>             return bool(int(value))
>         except:
>             raise something something invalid value
>
> So that people at least know if they set it to nonsense that it's
> nonsense?
>
> Dunno. The bash selftests just take "yes" and don't care about being
> very user friendly in that regard at all. _load_env_file() likewise
> looks like it just takes strings and doesn't care about the semantics.
> So I don't feel too strongly about this at all. Besides the "yes" bit,
> that should be recognized.

Sure, will do! (will also apply your suggestions for 1/2 so won't reply
separately)
Stanislav Fomichev <sdf@fomichev.me> writes:

> On 07/31, Petr Machata wrote:
>>
>> Stanislav Fomichev <sdf@fomichev.me> writes:
>>
>> > Add new @ksft_disruptive decorator to mark the tests that might
>> > be disruptive to the system. Depending on how well the previous
>> > test works in the CI we might want to disable disruptive tests
>> > by default and only let the developers run them manually.
>> >
>> > KSFT framework runs disruptive tests by default. DISRUPTIVE=False
>> > environment (or config file) can be used to disable these tests.
>> > ksft_setup should be called by the test cases that want to use the
>> > new decorator (ksft_setup is only called via NetDrvEnv/NetDrvEpEnv for now).
>>
>> Is that something that tests would want to genuinely do, manage this
>> stuff by hand? I don't really mind having the helper globally
>> accessible, but by default I'd keep it inside env.py and expect others
>> to inherit appropriately.
>
> Hard to say how well it's gonna work tbh. But at least from
> what I've seen, large code bases (outside of kernel) usually
> have some way to attach metadata to the testcase to indicate
> various things. For example, this is how the timeout
> can be controlled:
>
> https://bazel.build/reference/test-encyclopedia#role-test-runner
>
> So I'd imagine we can eventually have @ksft_short/@ksft_long to
> control that using similar techniques.
>
> Regarding keeping it inside env.py: can you expand more on what
> you mean by having the default in env.py?

I'm looking into it now and I missed how this is layered. ksft.py is
the comparatively general piece of code, and env.py is something
specifically for driver testing. It makes sense for ksft_setup() to be
where it is, because non-driver tests might want to be marked
disruptive as well. It also makes sense that env.py invokes the general
helper. All is good.

>> > @@ -127,6 +129,36 @@ KSFT_RESULT_ALL = True
>> >              KSFT_RESULT = False
>> >
>> >
>> > +def ksft_disruptive(func):
>> > +    """
>> > +    Decorator that marks the test as disruptive (e.g. the test
>> > +    that can down the interface). Disruptive tests can be skipped
>> > +    by passing DISRUPTIVE=False environment variable.
>> > +    """
>> > +
>> > +    @functools.wraps(func)
>> > +    def wrapper(*args, **kwargs):
>> > +        if not KSFT_DISRUPTIVE:
>> > +            raise KsftSkipEx(f"marked as disruptive")
>>
>> Since this is a skip, it will fail the overall run. But that happened
>> because the user themselves set DISRUPTIVE=0 to avoid, um, disruption
>> to the system. I think it should either be xfail, or something else
>> dedicated that conveys the idea that we didn't run the test, but
>> that's fine.
>>
>> Using xfail for this somehow doesn't seem correct, nothing failed.
>> Maybe we need KsftOmitEx, which would basically be an xfail with a
>> more appropriate name?
>
> Are you sure skip will fail the overall run? At least looking at
> tools/testing/selftests/net/lib/py/ksft.py, both skip and xfail are
> considered KSFT_RESULT=True. Or am I looking at the wrong place?

You seem to be right about the exit code. This was discussed some time
ago, that SKIP is considered a sort of a failure. As the person running
the test you would want to go in and fix whatever configuration issue
is preventing the test from running. I'm not sure how it works in
practice, whether people look for skips in the test log explicitly or
rely on exit codes.

Maybe Jakub can chime in, since he's the one that cajoled me into
handling this whole SKIP / XFAIL business properly in bash selftests.
On Thu, 1 Aug 2024 10:36:18 +0200 Petr Machata wrote:
> You seem to be right about the exit code. This was discussed some time
> ago, that SKIP is considered a sort of a failure. As the person running
> the test you would want to go in and fix whatever configuration issue
> is preventing the test from running. I'm not sure how it works in
> practice, whether people look for skips in the test log explicitly or
> rely on exit codes.
>
> Maybe Jakub can chime in, since he's the one that cajoled me into
> handling this whole SKIP / XFAIL business properly in bash selftests.

For HW testing there are a lot more variables than just "is there some
tool missing in the VM image". Not sure how well we can do in detecting
HW capabilities and XFAILing without making the tests super long.
And this case itself is not very clear cut. On one hand, you expect
the test not to run if it's disruptive and executor can't deal with
disruptive - IOW it's an eXpected FAIL. But it is an executor
limitation, the device/driver could have been tested if it wasn't
for the executor, so not entirely dissimilar to a tool missing.

Either way - no strong opinion as of yet, we need someone to actually
continuously run these to get experience :(
Jakub Kicinski <kuba@kernel.org> writes:

> On Thu, 1 Aug 2024 10:36:18 +0200 Petr Machata wrote:
>> You seem to be right about the exit code. This was discussed some time
>> ago, that SKIP is considered a sort of a failure. As the person running
>> the test you would want to go in and fix whatever configuration issue
>> is preventing the test from running. I'm not sure how it works in
>> practice, whether people look for skips in the test log explicitly or
>> rely on exit codes.
>>
>> Maybe Jakub can chime in, since he's the one that cajoled me into
>> handling this whole SKIP / XFAIL business properly in bash selftests.
>
> For HW testing there are a lot more variables than just "is there some
> tool missing in the VM image". Not sure how well we can do in detecting
> HW capabilities and XFAILing without making the tests super long.
> And this case itself is not very clear cut. On one hand, you expect
> the test not to run if it's disruptive and executor can't deal with
> disruptive - IOW it's an eXpected FAIL. But it is an executor
> limitation, the device/driver could have been tested if it wasn't
> for the executor, so not entirely dissimilar to a tool missing.
>
> Either way - no strong opinion as of yet, we need someone to actually
> continuously run these to get experience :(

After sending my response I realized we talked about this once already.
Apparently I forgot. I think it's odd that SKIP is a fail in one
framework but a pass in another. But XFAIL is not a good name for
something that was not even run. And if we add something like "omit",
nobody will know what it means. Ho hum. Let's keep SKIP as passing in
Python tests then...
diff --git a/tools/testing/selftests/drivers/net/lib/py/env.py b/tools/testing/selftests/drivers/net/lib/py/env.py
index a5e800b8f103..1ea9bb695e94 100644
--- a/tools/testing/selftests/drivers/net/lib/py/env.py
+++ b/tools/testing/selftests/drivers/net/lib/py/env.py
@@ -4,6 +4,7 @@ import os
 import time
 from pathlib import Path
 from lib.py import KsftSkipEx, KsftXfailEx
+from lib.py import ksft_setup
 from lib.py import cmd, ethtool, ip
 from lib.py import NetNS, NetdevSimDev
 from .remote import Remote
@@ -14,7 +15,7 @@ from .remote import Remote
     src_dir = Path(src_path).parent.resolve()
 
     if not (src_dir / "net.config").exists():
-        return env
+        return ksft_setup(env)
 
     with open((src_dir / "net.config").as_posix(), 'r') as fp:
         for line in fp.readlines():
@@ -30,7 +31,7 @@ from .remote import Remote
             if len(pair) != 2:
                 raise Exception("Can't parse configuration line:", full_file)
             env[pair[0]] = pair[1]
-    return env
+    return ksft_setup(env)
 
 
 class NetDrvEnv:
diff --git a/tools/testing/selftests/drivers/net/stats.py b/tools/testing/selftests/drivers/net/stats.py
index 93f9204f51c4..4c58080cf893 100755
--- a/tools/testing/selftests/drivers/net/stats.py
+++ b/tools/testing/selftests/drivers/net/stats.py
@@ -3,6 +3,7 @@
 
 from lib.py import ksft_run, ksft_exit, ksft_pr
 from lib.py import ksft_ge, ksft_eq, ksft_in, ksft_true, ksft_raises, KsftSkipEx, KsftXfailEx
+from lib.py import ksft_disruptive
 from lib.py import EthtoolFamily, NetdevFamily, RtnlFamily, NlError
 from lib.py import NetDrvEnv
 from lib.py import ip, defer
@@ -134,6 +135,7 @@ rtnl = RtnlFamily()
         ksft_eq(cm.exception.nl_msg.extack['bad-attr'], '.ifindex')
 
 
+@ksft_disruptive
 def check_down(cfg) -> None:
     try:
         qstat = netfam.qstats_get({"ifindex": cfg.ifindex}, dump=True)
diff --git a/tools/testing/selftests/net/lib/py/ksft.py b/tools/testing/selftests/net/lib/py/ksft.py
index f26c20df9db4..a9a24ea77226 100644
--- a/tools/testing/selftests/net/lib/py/ksft.py
+++ b/tools/testing/selftests/net/lib/py/ksft.py
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 
 import builtins
+import functools
 import inspect
 import sys
 import time
@@ -10,6 +11,7 @@ from .utils import global_defer_queue
 
 KSFT_RESULT = None
 KSFT_RESULT_ALL = True
+KSFT_DISRUPTIVE = True
 
 
 class KsftFailEx(Exception):
@@ -127,6 +129,36 @@ KSFT_RESULT_ALL = True
             KSFT_RESULT = False
 
 
+def ksft_disruptive(func):
+    """
+    Decorator that marks the test as disruptive (e.g. the test
+    that can down the interface). Disruptive tests can be skipped
+    by passing DISRUPTIVE=False environment variable.
+    """
+
+    @functools.wraps(func)
+    def wrapper(*args, **kwargs):
+        if not KSFT_DISRUPTIVE:
+            raise KsftSkipEx(f"marked as disruptive")
+        return func(*args, **kwargs)
+    return wrapper
+
+
+def ksft_setup(env):
+    """
+    Setup test framework global state from the environment.
+    """
+
+    def get_bool(env, name):
+        return env.get(name, "").lower() in ["true", "1"]
+
+    if "DISRUPTIVE" in env:
+        global KSFT_DISRUPTIVE
+        KSFT_DISRUPTIVE = get_bool(env, "DISRUPTIVE")
+
+    return env
+
+
 def ksft_run(cases=None, globs=None, case_pfx=None, args=()):
     cases = cases or []
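To see how the pieces fit together, here is a minimal, hypothetical test
module; the case names and bodies are invented, but the imports and the
ksft_run()/NetDrvEnv usage mirror what the stats.py and ksft.py hunks above
show:

    #!/usr/bin/env python3
    # SPDX-License-Identifier: GPL-2.0

    from lib.py import ksft_run, ksft_exit
    from lib.py import ksft_disruptive
    from lib.py import NetDrvEnv


    def safe_case(cfg) -> None:
        # Always runs, regardless of DISRUPTIVE.
        pass


    @ksft_disruptive
    def disruptive_case(cfg) -> None:
        # Hypothetical disruptive case; when DISRUPTIVE evaluates to
        # false the decorator raises KsftSkipEx before this body runs,
        # which ksft_run() reports as "SKIP marked as disruptive".
        pass


    def main() -> None:
        # NetDrvEnv calls ksft_setup(), so DISRUPTIVE from the
        # environment or from net.config takes effect before the cases run.
        with NetDrvEnv(__file__) as cfg:
            ksft_run([safe_case, disruptive_case], args=(cfg, ))
        ksft_exit()


    if __name__ == "__main__":
        main()

Run with DISRUPTIVE=False, the decorated case should show up as skipped
while safe_case still executes.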
Add new @ksft_disruptive decorator to mark the tests that might
be disruptive to the system. Depending on how well the previous
test works in the CI we might want to disable disruptive tests
by default and only let the developers run them manually.

KSFT framework runs disruptive tests by default. DISRUPTIVE=False
environment (or config file) can be used to disable these tests.
ksft_setup should be called by the test cases that want to use the
new decorator (ksft_setup is only called via NetDrvEnv/NetDrvEpEnv for now).

In the future we can add similar decorators to, for example, avoid
running slow tests all the time. And/or have some option to run
only 'fast' tests for some sort of smoke test scenario.

$ DISRUPTIVE=False ./stats.py
KTAP version 1
1..5
ok 1 stats.check_pause
ok 2 stats.check_fec
ok 3 stats.pkt_byte_sum
ok 4 stats.qstat_by_ifindex
ok 5 stats.check_down # SKIP marked as disruptive
# Totals: pass:4 fail:0 xfail:0 xpass:0 skip:1 error:0

v2:
- convert from cli argument to env variable (Jakub)

Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
--
Cc: Shuah Khan <shuah@kernel.org>
Cc: Joe Damato <jdamato@fastly.com>
Cc: Petr Machata <petrm@nvidia.com>
Cc: linux-kselftest@vger.kernel.org
---
 .../selftests/drivers/net/lib/py/env.py      |  5 +--
 tools/testing/selftests/drivers/net/stats.py |  2 ++
 tools/testing/selftests/net/lib/py/ksft.py   | 32 +++++++++++++++++++
 3 files changed, 37 insertions(+), 2 deletions(-)
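As the message notes, the same knob can also come from the config file:
_load_env_file() in env.py reads a net.config next to the test script and
hands the result to ksft_setup(). A hypothetical example (the exact path is
an assumption based on where stats.py lives):

    $ cat tools/testing/selftests/drivers/net/net.config
    DISRUPTIVE=False
    $ ./stats.py

which should produce the same "SKIP marked as disruptive" result as the
environment-variable form above.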