Message ID | wq27r7e3n5jz4z6pn2twwrcp2zklumcfibutcpxrw6sgaxcsl5@m5z7rwxyuh72 (mailing list archive)
State | New, archived
Series | [GIT,PULL] bcachefs updates for 6.8
On Wed, Jan 10, 2024 at 02:36:30PM -0500, Kent Overstreet wrote: > [...] > bcachefs: %pg is banished Hi! Not a PR blocker, but this patch re-introduces users of strlcpy() which has been otherwise removed this cycle. I'll send a patch to replace these new uses, but process-wise, I'd like to check on how bcachefs patches are reviewed. Normally I'd go find the original email that posted the patch and reply there, but I couldn't find a development list where this patch was posted. Where is this happening? (Being posted somewhere is supposed to be a prerequisite for living in -next. E.g. quoting from the -next inclusion boiler-plate: "* posted to the relevant mailing list,") It looks like it was authored 5 days ago, which is cutting it awfully close to the merge window opening: AuthorDate: Fri Jan 5 11:58:50 2024 -0500 Actually, it looks like you rebased onto v6.7-rc7? This is normally strongly discouraged. The common merge base is -rc2. It also seems it didn't get a run through scripts/checkpatch.pl, which shows 4 warnings, 2 of which point out the strlcpy deprecation: WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89 #123: FILE: fs/bcachefs/super.c:1389: + strlcpy(c->name, name.buf, sizeof(c->name)); WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89 #124: FILE: fs/bcachefs/super.c:1390: + strlcpy(ca->name, name.buf, sizeof(ca->name)); Please make sure you're running checkpatch.pl -- it'll make integration, technical debt reduction, and coding style adjustments much easier. :) Thanks! -Kees
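For reference, the conversion checkpatch is asking for here is mechanical; a minimal sketch of what the flagged super.c lines would look like after the switch, assuming the surrounding context is as quoted above (this is not the actual follow-up patch):

	/*
	 * strscpy() takes the same (dest, src, size) arguments, NUL-terminates
	 * the destination, and returns the number of characters copied or
	 * -E2BIG on truncation rather than the length of the source string.
	 */
	strscpy(c->name, name.buf, sizeof(c->name));
	strscpy(ca->name, name.buf, sizeof(ca->name));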
On Wed, Jan 10, 2024 at 03:48:43PM -0800, Kees Cook wrote: > On Wed, Jan 10, 2024 at 02:36:30PM -0500, Kent Overstreet wrote: > > [...] > > bcachefs: %pg is banished > > Hi! > > Not a PR blocker, but this patch re-introduces users of strlcpy() which > has been otherwise removed this cycle. I'll send a patch to replace > these new uses, but process-wise, I'd like to check on how bcachefs patches > are reviewed. I'm happy to fix it. Perhaps the declaration could get a deprecated warning, though? > Normally I'd go find the original email that posted the patch and reply > there, but I couldn't find a development list where this patch was > posted. Where is this happening? (Being posted somewhere is supposed > to be a prerequisite for living in -next. E.g. quoting from the -next > inclusion boiler-plate: "* posted to the relevant mailing list,") It > looks like it was authored 5 days ago, which is cutting it awfully close > to the merge window opening: > > AuthorDate: Fri Jan 5 11:58:50 2024 -0500 I'm confident in my testing; if it was a patch that needed more soak time it would have waited. > Actually, it looks like you rebased onto v6.7-rc7? This is normally > strongly discouraged. The common merge base is -rc2. Is there something special about rc2? I reorder patches fairly often just in the normal course of backporting fixes, and if I have to rebase everything for a backport I'll often rebase onto a newer kernel so that the people who are running my tree are testing something more stable - it does come up. > It also seems it didn't get a run through scripts/checkpatch.pl, which > shows 4 warnings, 2 of which point out the strlcpy deprecation: > > WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89 > #123: FILE: fs/bcachefs/super.c:1389: > + strlcpy(c->name, name.buf, sizeof(c->name)); > > WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89 > #124: FILE: fs/bcachefs/super.c:1390: > + strlcpy(ca->name, name.buf, sizeof(ca->name)); > > Please make sure you're running checkpatch.pl -- it'll make integration, > technical debt reduction, and coding style adjustments much easier. :) Well, we do have rather a lot of linters these days. That's actually something I've been meaning to raise - perhaps we could start thinking about some pluggable way of running linters so that they're all run as part of a normal kernel build (and something that would be easy to drop new linters in to; I'd like to write some bcachefs specific ones). The current model of "I have to remember to run these 5 things, and then I'm going to get email nags for 3 more that I can't run" is not terribly scalable :)
On Wed, Jan 10, 2024 at 07:04:47PM -0500, Kent Overstreet wrote: > On Wed, Jan 10, 2024 at 03:48:43PM -0800, Kees Cook wrote: > > On Wed, Jan 10, 2024 at 02:36:30PM -0500, Kent Overstreet wrote: > > > [...] > > > bcachefs: %pg is banished > > > > Hi! > > > > Not a PR blocker, but this patch re-introduces users of strlcpy() which > > has been otherwise removed this cycle. I'll send a patch to replace > > these new uses, but process-wise, I'd like check on how bcachefs patches > > are reviewed. > > I'm happy to fix it. Perhaps the declaration could get a depracated > warning, though? That's one of checkpatch.pl's purposes, seeing as how deprecation warnings are ... deprecated. :P https://docs.kernel.org/process/deprecated.html#id1 This has made treewide changes like this more difficult, but these are the Rules From Linus. ;) > > Normally I'd go find the original email that posted the patch and reply > > there, but I couldn't find a development list where this patch was > > posted. Where is this happening? (Being posted somewhere is supposed > > to be a prerequisite for living in -next. E.g. quoting from the -next > > inclusion boiler-plate: "* posted to the relevant mailing list,") It > > looks like it was authored 5 days ago, which is cutting it awfully close > > to the merge window opening: > > > > AuthorDate: Fri Jan 5 11:58:50 2024 -0500 > > I'm confident in my testing; if it was a patch that needed more soak > time it would have waited. > > > Actually, it looks like you rebased onto v6.7-rc7? This is normally > > strongly discouraged. The common merge base is -rc2. > > Is there something special about rc2? It's what sfr suggested as it's when many subsystem maintainers merge to when opening their trees for development. Usually it's a good tree state: after stabilization fixes from any rc1 rough edges. > I reorder patches fairly often just in the normal course of backporting > fixes, and if I have to rebase everything for a backport I'll often > rebase onto a newer kernel so that the people who are running my tree > are testing something more stable - it does come up. Okay, gotcha. I personally don't care how maintainers handle rebasing; I was just confused about the timing and why I couldn't find the original patch on any lists. :) And to potentially warn about Linus possibly not liking the rebase too. > > > It also seems it didn't get a run through scripts/checkpatch.pl, which > > shows 4 warnings, 2 or which point out the strlcpy deprecation: > > > > WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89 > > #123: FILE: fs/bcachefs/super.c:1389: > > + strlcpy(c->name, name.buf, sizeof(c->name)); > > > > WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89 > > #124: FILE: fs/bcachefs/super.c:1390: > > + strlcpy(ca->name, name.buf, sizeof(ca->name)); > > > > Please make sure you're running checkpatch.pl -- it'll make integration, > > technical debt reduction, and coding style adjustments much easier. :) > > Well, we do have rather a lot of linters these days. > > That's actually something I've been meaning to raise - perhaps we could > start thinking about some pluggable way of running linters so that > they're all run as part of a normal kernel build (and something that > would be easy to drop new linters in to; I'd like to write some bcachefs > specific ones). With no central CI, the best we've got is everyone running the same "minimum set" of checks. 
I'm most familiar with netdev's CI which has such things (and checkpatch.pl is included). For example see: https://patchwork.kernel.org/project/netdevbpf/patch/20240110110451.5473-3-ptikhomirov@virtuozzo.com/ > The current model of "I have to remember to run these 5 things, and then > I'm going to get email nags for 3 more that I can't run" is not terribly > scalable :) Oh, I hear you. It's positively agonizing for those of us doing treewide changes. I've got at least 4 CIs I check (in addition to my own) just to check everyone's various coverage tools. At the very least, checkpatch.pl is the common denominator: https://docs.kernel.org/process/submitting-patches.html#style-check-your-changes -Kees
On Wed, Jan 10, 2024 at 04:39:22PM -0800, Kees Cook wrote: > On Wed, Jan 10, 2024 at 07:04:47PM -0500, Kent Overstreet wrote: > > On Wed, Jan 10, 2024 at 03:48:43PM -0800, Kees Cook wrote: > > > On Wed, Jan 10, 2024 at 02:36:30PM -0500, Kent Overstreet wrote: > > > > [...] > > > > bcachefs: %pg is banished > > > > > > Hi! > > > > > > Not a PR blocker, but this patch re-introduces users of strlcpy() which > > > has been otherwise removed this cycle. I'll send a patch to replace > > > these new uses, but process-wise, I'd like to check on how bcachefs patches > > > are reviewed. > > > > I'm happy to fix it. Perhaps the declaration could get a deprecated > > warning, though? > > That's one of checkpatch.pl's purposes, seeing as how deprecation warnings > are ... deprecated. :P > https://docs.kernel.org/process/deprecated.html#id1 > This has made treewide changes like this more difficult, but these are > the Rules From Linus. ;) ...And how does that make any sense? "The warnings weren't getting cleaned up, so get rid of them - except not really, just move them off to the side so they'll be more annoying when they do come up"... Perhaps we could've just switched to deprecation warnings being on in a W=1 build? > Okay, gotcha. I personally don't care how maintainers handle rebasing; I > was just confused about the timing and why I couldn't find the original > patch on any lists. :) And to potentially warn about Linus possibly not > liking the rebase too. *nod* If there's some other reason why it's convenient to be on rc2 I could possibly switch my workflow, but pushing code out quickly is the norm for me. > > > It also seems it didn't get a run through scripts/checkpatch.pl, which > > > shows 4 warnings, 2 of which point out the strlcpy deprecation: > > > > > > WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89 > > > #123: FILE: fs/bcachefs/super.c:1389: > > > + strlcpy(c->name, name.buf, sizeof(c->name)); > > > > > > WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89 > > > #124: FILE: fs/bcachefs/super.c:1390: > > > + strlcpy(ca->name, name.buf, sizeof(ca->name)); > > > > > > Please make sure you're running checkpatch.pl -- it'll make integration, > > > technical debt reduction, and coding style adjustments much easier. :) > > > > Well, we do have rather a lot of linters these days. > > > > That's actually something I've been meaning to raise - perhaps we could > > start thinking about some pluggable way of running linters so that > > they're all run as part of a normal kernel build (and something that > > would be easy to drop new linters in to; I'd like to write some bcachefs > > specific ones). > > With no central CI, the best we've got is everyone running the same > "minimum set" of checks. I'm most familiar with netdev's CI which has > such things (and checkpatch.pl is included). For example see: > https://patchwork.kernel.org/project/netdevbpf/patch/20240110110451.5473-3-ptikhomirov@virtuozzo.com/ Yeah, we badly need a central/common CI. I've been making noises that my own thing could be a good basis for that - e.g. it shouldn't be much work to use it for running our tests in tools/testing/selftests. Sadly no time for that myself, but happy to talk about it if someone does start leading/coordinating that effort.
example tests, example output: https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing > > The current model of "I have to remember to run these 5 things, and then > > I'm going to get email nags for 3 more that I can't run" is not terribly > > scalable :) > > Oh, I hear you. It's positively agonizing for those of us doing treewide > changes. I've got at least 4 CIs I check (in addition to my own) just to > check everyone's various coverage tools. > > At the very least, checkpatch.pl is the common denominator: > https://docs.kernel.org/process/submitting-patches.html#style-check-your-changes At one point in my career I was religious about checkpatch; since then the warnings it produces have seemed to me more on the naggy and less on the useful end of the spectrum - I like smatch better in that respect. But - I'll start running it again for the deprecation warnings :)
On Wed, 10 Jan 2024 at 16:58, Kent Overstreet <kent.overstreet@linux.dev> wrote: > > ...And how does that make any sense? "The warnings weren't getting > cleaned up, so get rid of them - except not really, just move them off > to the side so they'll be more annoying when they do come up"... Honestly, the checkpatch warnings are often garbage too. The whole deprecation warnings never worked. They don't work in checkpatch either. > Perhaps we could've just switched to deprecation warnings being on in a > W=1 build? No, because the whole idea of "let me mark something deprecated and then not just remove it" is GARBAGE. If somebody wants to deprecate something, it is up to *them* to finish the job. Not annoy thousands of other developers with idiotic warnings. Linus
The pull request you sent on Wed, 10 Jan 2024 14:36:30 -0500:
> https://evilpiepirate.org/git/bcachefs.git tags/bcachefs-2024-01-10
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/999a36b52b1b11b2ca0590756e4f8cf21f2d9182
Thank you!
On Wed, Jan 10, 2024 at 07:58:20PM -0500, Kent Overstreet wrote: > On Wed, Jan 10, 2024 at 04:39:22PM -0800, Kees Cook wrote: > > With no central CI, the best we've got is everyone running the same > > "minimum set" of checks. I'm most familiar with netdev's CI which has > > such things (and checkpatch.pl is included). For example see: > > https://patchwork.kernel.org/project/netdevbpf/patch/20240110110451.5473-3-ptikhomirov@virtuozzo.com/ > Yeah, we badly need a central/common CI. I've been making noises that my > own thing could be a good basis for that - e.g. it shouldn't be much > work to use it for running our tests in tools/testing/selftests. Sadly no > time for that myself, but happy to talk about it if someone does start > leading/coordinating that effort. IME the actually running the tests bit isn't usually *so* much the issue; someone making a new test runner and/or output format does mean a bit of work integrating it into infrastructure but that's more usually annoying than a blocker. Issues tend to be more around arranging to drive the relevant test systems, figuring out which tests to run where (including things like figuring out capacity on test devices, or how long you're prepared to wait in interactive usage) and getting the environment on the target devices into a state where the tests can run. Plus any stability issues with the tests themselves of course, and there's a bunch of costs somewhere along the line. I suspect we're more likely to get traction with aggregating test results and trying to do UI/reporting on top of that than with the running things bit; that really would be very good to have. I've copied in Nikolai, whose work on kcidb is the main thing I'm aware of there, though at the minute operational issues mean it's a bit write only. > example tests, example output: > https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest > https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing For example looking at the sample test there it looks like it needs among other things mkfs.btrfs, bcachefs, stress-ng, xfs_io, fio, mdadm, rsync and a reasonably performant disk with 40G of space available. None of that is especially unreasonable for a filesystems test but it's all things that we need to get onto the system where we want to run the test and there's a lot of systems where the storage requirements would be unsustainable for one reason or another. It also appears to take about 33000s to run on whatever system you use which is distinctly non-trivial. I certainly couldn't run it readily in my lab. > > At the very least, checkpatch.pl is the common denominator: > > https://docs.kernel.org/process/submitting-patches.html#style-check-your-changes > At one point in my career I was religious about checkpatch; since then > the warnings it produces have seemed to me more on the naggy and less > on the useful end of the spectrum - I like smatch better in that > respect. But - I'll start running it again for the deprecation > warnings :) Yeah, I don't run it on incoming stuff because the rate at which it reports things I don't find useful is far too high.
On Thu, Jan 11, 2024 at 03:35:40PM +0000, Mark Brown wrote: > On Wed, Jan 10, 2024 at 07:58:20PM -0500, Kent Overstreet wrote: > > On Wed, Jan 10, 2024 at 04:39:22PM -0800, Kees Cook wrote: > > > > With no central CI, the best we've got is everyone running the same > > > "minimum set" of checks. I'm most familiar with netdev's CI which has > > > such things (and checkpatch.pl is included). For example see: > > > https://patchwork.kernel.org/project/netdevbpf/patch/20240110110451.5473-3-ptikhomirov@virtuozzo.com/ > > > Yeah, we badly need a central/common CI. I've been making noises that my > > own thing could be a good basis for that - e.g. it shouldn't be much > > work to use it for running our tests in tools/testing/selftests. Sadly no > > time for that myself, but happy to talk about it if someone does start > > leading/coordinating that effort. > > IME the actually running the tests bit isn't usually *so* much the > issue, someone making a new test runner and/or output format does mean a > bit of work integrating it into infrastructure but that's more usually > annoying than a blocker. No, the proliferation of test runners, test output formats, CI systems, etc. really is an issue; it means we can't have one common driver that anyone can run from the command line, and instead there's a bunch of disparate systems with patchwork integration and all the feedback is nag emails - after you've finished what you were working on instead of moving on to the next thing - with no way to get immediate feedback. And it's because building something shiny and new is the fun part, no one wants to do the grungy integration work. > Issues tend to be more around arranging to > drive the relevant test systems, figuring out which tests to run where > (including things like figuring out capacity on test devices, or how > long you're prepared to wait in interactive usage) and getting the > environment on the target devices into a state where the tests can run. > Plus any stability issues with the tests themselves of course, and > there's a bunch of costs somewhere along the line. > > I suspect we're more likely to get traction with aggregating test > results and trying to do UI/reporting on top of that than with the > running things bit, that really would be very good to have. I've copied > in Nikolai, whose work on kcidb is the main thing I'm aware of there, > though at the minute operational issues mean it's a bit write only. > > > example tests, example output: > > https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest > > https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing > > For example looking at the sample test there it looks like it needs > among other things mkfs.btrfs, bcachefs, stress-ng, xfs_io, fio, mdadm, > rsync Getting all that set up by the end user is one command: ktest/root_image create and running a test is one more command: build-test-kernel run ~/ktest/tests/bcachefs/single_device.ktest > and a reasonably performant disk with 40G of space available. > None of that is especially unreasonable for a filesystems test but it's > all things that we need to get onto the system where we want to run the > test and there's a lot of systems where the storage requirements would > be unsustainable for one reason or another. It also appears to take > about 33000s to run on whatever system you use which is distinctly > non-trivial. 
Getting sufficient coverage in filesystem land does take some amount of resources, but it's not so bad - I'm leasing 80 core ARM64 machines from Hetzner for $250/month and running 10 test VMs per machine, so it's really not that expensive. Other subsystems would probably be fine with less resources.
On Thu, Jan 11, 2024 at 12:38:57PM -0500, Kent Overstreet wrote: > On Thu, Jan 11, 2024 at 03:35:40PM +0000, Mark Brown wrote: > > IME the actually running the tests bit isn't usually *so* much the > > issue, someone making a new test runner and/or output format does mean a > > bit of work integrating it into infrastructure but that's more usually > > annoying than a blocker. > No, the proliferation of test runners, test output formats, CI systems, > etc. really is an issue; it means we can't have one common driver that > anyone can run from the command line, and instead there's a bunch of > disparate systems with patchwork integration and all the feedback is nag > emails - after you've finished what you were working on instead of > moving on to the next thing - with no way to get immediate feedback. It's certainly an issue and it's much better if people do manage to fit their tests into some existing thing but I'm not convinced that's the big reason why you have a bunch of different systems running separately and doing different things. For example the enterprise vendors will naturally tend to have a bunch of server systems in their labs and focus on their testing needs while I know the Intel audio CI setup has a bunch of laptops, laptop like dev boards and things in there with loopback audio cables and I think test equipment plugged in and focuses rather more on audio. My own lab is built around systems I can be in the same room as without getting too annoyed and does things I find useful, plus using spare bandwidth for KernelCI because they can take donated lab time. I think there's a few different issues you're pointing at here: - Working out how to run relevant tests for whatever area of the kernel you're working on on whatever hardware you have to hand. - Working out exactly what other testers will do. - Promptness and consistency of feedback from other testers. - UI for getting results from other testers. and while it really sounds like your main annoyances are the bits with other test systems it really seems like the test runner bit is mainly for the first issue, possibly also helping with working out what other testers are going to do. These are all very real issues. > And it's because building something shiny and new is the fun part, no > one wants to do the grungy integration work. I think you may be overestimating people's enthusiasm for writing test stuff there! There is NIH stuff going on for sure but a lot of the time when you look at something where people have gone off and done their own thing it's either much older than you initially thought and predates anything they might've integrated with or there's some reason why none of the existing systems fit well. Anecdotally it seems much more common to see people looking for things to reuse in order to save time than it is to see people going off and reinventing the world. 
> > > example tests, example output: > > > https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest > > > https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing > > For example looking at the sample test there it looks like it needs > > among other things mkfs.btrfs, bcachefs, stress-ng, xfs_io, fio, mdadm, > > rsync > Getting all that set up by the end user is one command: > ktest/root_image create > and running a test is one more command: > build-test-kernel run ~/ktest/tests/bcachefs/single_device.ktest That does assume that you're building and running everything directly on the system under test and are happy to have the test in a VM which isn't an assumption that holds universally, and also that whoever's doing the testing doesn't want to do something like use their own distro or something - like I say none of it looks too unreasonable for filesystems. > > and a reasonably performant disk with 40G of space available. > > None of that is especially unreasonable for a filesystems test but it's > > all things that we need to get onto the system where we want to run the > > test and there's a lot of systems where the storage requirements would > > be unsustainable for one reason or another. It also appears to take > > about 33000s to run on whatever system you use which is distinctly > > non-trivial. > Getting sufficient coverage in filesystem land does take some amount of > resources, but it's not so bad - I'm leasing 80 core ARM64 machines from > Hetzner for $250/month and running 10 test VMs per machine, so it's > really not that expensive. Other subsystems would probably be fine with > less resources. Some will be, some will have more demanding requirements especially when you want to test on actual hardware rather than in a VM. For example with my own test setup which is more focused on hardware the operating costs aren't such a big deal but I've got boards that are for various reasons irreplaceable, often single instances of boards (which makes scheduling a thing) and for some of the tests I'd like to get around to setting up I need special physical setup. Some of the hardware I'd like to cover is only available in machines which are in various respects annoying to automate, I've got a couple of unused systems waiting for me to have sufficient bandwidth to work out how to automate them. Either way I don't think the costs are trivial enough to be completely handwaved away. I'd also note that the 9 hour turnaround time for that test set you're pointing at isn't exactly what I'd associate with immediate feedback.
On Wed, Jan 10, 2024 at 05:47:20PM -0800, Linus Torvalds wrote: > No, because the whole idea of "let me mark something deprecated and > then not just remove it" is GARBAGE. > > If somebody wants to deprecate something, it is up to *them* to finish > the job. Not annoy thousands of other developers with idiotic > warnings. What would be nice is something that warned about _new_ uses being added. ie checkpatch. Let's at least not make the problem worse.
On Thu, Jan 11, 2024 at 10:57:18PM +0000, Matthew Wilcox wrote: > On Wed, Jan 10, 2024 at 05:47:20PM -0800, Linus Torvalds wrote: > > No, because the whole idea of "let me mark something deprecated and > > then not just remove it" is GARBAGE. > > > > If somebody wants to deprecate something, it is up to *them* to finish > > the job. Not annoy thousands of other developers with idiotic > > warnings. > > What would be nice is something that warned about _new_ uses being > added. ie checkpatch. Let's at least not make the problem worse. For now, we've just kind of "dealt with it". For things that show up with new -W options we've enlisted sfr to do the -next builds with it explicitly added (but not to the tree) so he could generate nag emails when new warnings appeared. That could happen if we added it to W=1 builds, or some other flag like REPORT_DEPRECATED=1. Another ugly idea would be to do a treewide replacement of "func" to "func_deprecated", and make "func" just a wrapper for it that is marked with __deprecated. Then only new instances would show up (assuming people weren't trying to actively bypass the deprecation work by adding calls to "func_deprecated"). :P Then the refactoring to replace "func_deprecated" could happen a bit more easily. Most past deprecations have pretty narrow usage. This is not true with the string functions, which is why it's more noticeable here. :P -Kees
On Thu, 11 Jan 2024 at 15:42, Kees Cook <keescook@chromium.org> wrote: > > Another ugly idea would be to do a treewide replacement of "func" to > "func_deprecated", and make "func" just a wrapper for it that is marked > with __deprecated. That's probably not a horrible idea, at least when we're talking a reasonable number of users (ie when we're talking "tens of users" like strlcpy is now). We should probably generally rename functions much more aggressively any time the "signature" changes. We've had situations where the semantics changed but not enough to necessarily trigger type warnings, and then renaming things is just a good thing just to avoid mistakes. Even if it's temporary and you plan on renaming things back. And with a coccinelle script (that should be documented in the patch) it's not necessarily all that painful to do. Linus
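To make the wrapper idea concrete, a rough sketch using strlcpy() purely as an illustration; the _deprecated name and the split are hypothetical (in the end strlcpy() was simply removed rather than wrapped):

	/* The existing implementation keeps doing the work under a new name. */
	size_t strlcpy_deprecated(char *dest, const char *src, size_t size);

	/*
	 * The old name becomes a trivial wrapper carrying __deprecated, so only
	 * newly added callers trip -Wdeprecated-declarations while the treewide
	 * conversion of existing callers (e.g. via a coccinelle script, as
	 * suggested above) catches up.
	 */
	static inline __deprecated
	size_t strlcpy(char *dest, const char *src, size_t size)
	{
		return strlcpy_deprecated(dest, src, size);
	}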
On Thu, Jan 11, 2024 at 03:42:19PM -0800, Kees Cook wrote: > On Thu, Jan 11, 2024 at 10:57:18PM +0000, Matthew Wilcox wrote: > > On Wed, Jan 10, 2024 at 05:47:20PM -0800, Linus Torvalds wrote: > > > No, because the whole idea of "let me mark something deprecated and > > > then not just remove it" is GARBAGE. > > > > > > If somebody wants to deprecate something, it is up to *them* to finish > > > the job. Not annoy thousands of other developers with idiotic > > > warnings. > > > > What would be nice is something that warned about _new_ uses being > > added. ie checkpatch. Let's at least not make the problem worse. > > For now, we've just kind of "dealt with it". For things that show up > with new -W options we've enlisted sfr to do the -next builds with it > explicitly added (but not to the tree) so he could generate nag emails > when new warnings appeared. That could happen if we added it to W=1 > builds, or some other flag like REPORT_DEPRECATED=1. > > Another ugly idea would be to do a treewide replacement of "func" to > "func_deprecated", and make "func" just a wrapper for it that is marked > with __deprecated. Then only new instances would show up (assuming people > weren't trying to actively bypass the deprecation work by adding calls to > "func_deprecated"). :P Then the refactoring to replace "func_deprecated" > could happen a bit more easily. > > Most past deprecations have pretty narrow usage. This is not true with > the string functions, which is why it's more noticeable here. :P Before doing the renaming - why not just leave a kdoc comment that marks it as deprecated? Seems odd that checkpatch was patched, but I can't find anything marking it as deprecated when I cscope to it.
On Thu, Jan 11, 2024 at 07:05:06PM -0500, Kent Overstreet wrote: > On Thu, Jan 11, 2024 at 03:42:19PM -0800, Kees Cook wrote: > > On Thu, Jan 11, 2024 at 10:57:18PM +0000, Matthew Wilcox wrote: > > > On Wed, Jan 10, 2024 at 05:47:20PM -0800, Linus Torvalds wrote: > > > > No, because the whole idea of "let me mark something deprecated and > > > > then not just remove it" is GARBAGE. > > > > > > > > If somebody wants to deprecate something, it is up to *them* to finish > > > > the job. Not annoy thousands of other developers with idiotic > > > > warnings. > > > > > > What would be nice is something that warned about _new_ uses being > > > added. ie checkpatch. Let's at least not make the problem worse. > > > > For now, we've just kind of "dealt with it". For things that show up > > with new -W options we've enlisted sfr to do the -next builds with it > > explicitly added (but not to the tree) so he could generate nag emails > > when new warnings appeared. That could happen if we added it to W=1 > > builds, or some other flag like REPORT_DEPRECATED=1. > > > > Another ugly idea would be to do a treewide replacement of "func" to > > "func_deprecated", and make "func" just a wrapper for it that is marked > > with __deprecated. Then only new instances would show up (assuming people > > weren't trying to actively bypass the deprecation work by adding calls to > > "func_deprecated"). :P Then the refactoring to replace "func_deprecated" > > could happen a bit more easily. > > > > Most past deprecations have pretty narrow usage. This is not true with > > the string functions, which is why it's more noticeable here. :P > > Before doing the renaming - why not just leave a kdoc comment that marks > it as deprecated? Seems odd that checkpatch was patched, but I can't > find anything marking it as deprecated when I cscope to it. It doesn't explicitly say "deprecated", but this language has been in the kdoc for a while now (not that people go read this often): * Do not use this function. While FORTIFY_SOURCE tries to avoid * over-reads when calculating strlen(@q), it is still possible. * Prefer strscpy(), though note its different return values for * detecting truncation. But it's all fine -- we're about to wipe out strlcpy for v6.8. Once the drivers-core and drm-misc-next trees land, (and the bcachefs patch[1]) we'll be at 0 users. :) -Kees [1] https://lore.kernel.org/lkml/20240110235438.work.385-kees@kernel.org/
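To illustrate the "different return values" caveat in that kdoc excerpt: strlcpy() returned strlen(src), so truncation was detected by comparing the return value against the buffer size, while strscpy() returns the number of characters copied or -E2BIG on truncation. A minimal sketch, reusing c->name from the checkpatch output quoted earlier (the warning text is made up):

	ssize_t ret = strscpy(c->name, name.buf, sizeof(c->name));
	if (ret < 0)	/* -E2BIG: name.buf didn't fit and was truncated */
		pr_warn("bcachefs: filesystem name truncated\n");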
On Thu, Jan 11, 2024 at 09:47:26PM +0000, Mark Brown wrote: > On Thu, Jan 11, 2024 at 12:38:57PM -0500, Kent Overstreet wrote: > > On Thu, Jan 11, 2024 at 03:35:40PM +0000, Mark Brown wrote: > > > > IME the actually running the tests bit isn't usually *so* much the > > > issue, someone making a new test runner and/or output format does mean a > > > bit of work integrating it into infrastructure but that's more usually > > > annoying than a blocker. > > > No, the proliferation of test runners, test output formats, CI systems, > > etc. really is an issue; it means we can't have one common driver that > > anyone can run from the command line, and instead there's a bunch of > > disparate systems with patchwork integration and all the feedback is nag > > emails - after you've finished whan you were working on instead of > > moving on to the next thing - with no way to get immediate feedback. > > It's certainly an issue and it's much better if people do manage to fit > their tests into some existing thing but I'm not convinced that's the > big reason why you have a bunch of different systems running separately > and doing different things. For example the enterprise vendors will > naturally tend to have a bunch of server systems in their labs and focus > on their testing needs while I know the Intel audio CI setup has a bunch > of laptops, laptop like dev boards and things in there with loopback > audio cables and I think test equipment plugged in and focuses rather > more on audio. My own lab is built around on systems I can be in the > same room as without getting too annoyed and does things I find useful, > plus using spare bandwidth for KernelCI because they can take donated > lab time. No, you're overthinking. The vast majority of kernel testing requires no special hardware, just a virtual machine. There is _no fucking reason_ we shouldn't be able to run tests on our own local machines - _local_ machines, not waiting for the Intel CI setup and asking for a git branch to be tested, not waiting for who knows how long for the CI farm to get to it - just run the damn tests immediately and get immediate feedback. You guys are overthinking and overengineering and ignoring the basics, the way enterprise people always do. > > And it's because building something shiny and new is the fun part, no > > one wants to do the grungy integration work. > > I think you may be overestimating people's enthusiasm for writing test > stuff there! There is NIH stuff going on for sure but lot of the time > when you look at something where people have gone off and done their own > thing it's either much older than you initially thought and predates > anything they might've integrated with or there's some reason why none > of the existing systems fit well. Anecdotally it seems much more common > to see people looking for things to reuse in order to save time than it > is to see people going off and reinventing the world. It's a basic lack of leadership. Yes, the younger engineers are always going to be doing the new and shiny, and always going to want to build something new instead of finishing off the tests or integrating with something existing. Which is why we're supposed to have managers saying "ok, what do I need to prioritize for my team be able to develop effectively". 
> > > > > example tests, example output: > > > > https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest > > > > https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing > > > > For example looking at the sample test there it looks like it needs > > > among other things mkfs.btrfs, bcachefs, stress-ng, xfs_io, fio, mdadm, > > > rsync > > > Getting all that set up by the end user is one command: > > ktest/root_image create > > and running a test is one morecommand: > > build-test-kernel run ~/ktest/tests/bcachefs/single_device.ktest > > That does assume that you're building and running everything directly on > the system under test and are happy to have the test in a VM which isn't > an assumption that holds universally, and also that whoever's doing the > testing doesn't want to do something like use their own distro or > something - like I say none of it looks too unreasonable for > filesystems. No, I'm doing it that way because technically that's the simplest way to do it. All you guys building crazy contraptions for running tests on Google Cloud or Amazon or whatever - you're building technical workarounds for broken procurement. Just requisition the damn machines. > Some will be, some will have more demanding requirements especially when > you want to test on actual hardware rather than in a VM. For example > with my own test setup which is more focused on hardware the operating > costs aren't such a big deal but I've got boards that are for various > reasons irreplaceable, often single instances of boards (which makes > scheduling a thing) and for some of the tests I'd like to get around to > setting up I need special physical setup. Some of the hardware I'd like > to cover is only available in machines which are in various respects > annoying to automate, I've got a couple of unused systems waiting for me > to have sufficient bandwidth to work out how to automate them. Either > way I don't think the costs are trival enough to be completely handwaved > away. That does complicate things. I'd also really like to get automated performance testing going too, which would have similar requirements in that jobs would need to be scheduled on specific dedicated machines. I think what you're doing could still build off of some common infrastructure. > I'd also note that the 9 hour turnaround time for that test set you're > pointing at isn't exactly what I'd associate with immediate feedback. My CI shards at the subtest level, and like I mentioned I run 10 VMs per physical machine, so with just 2 of the 80 core Ampere boxes I get full test runs done in ~20 minutes.
On Thu, Jan 11, 2024 at 8:11 PM Kent Overstreet <kent.overstreet@linux.dev> wrote: > > On Thu, Jan 11, 2024 at 09:47:26PM +0000, Mark Brown wrote: > > On Thu, Jan 11, 2024 at 12:38:57PM -0500, Kent Overstreet wrote: > > > On Thu, Jan 11, 2024 at 03:35:40PM +0000, Mark Brown wrote: > > > > > > IME the actually running the tests bit isn't usually *so* much the > > > > issue, someone making a new test runner and/or output format does mean a > > > > bit of work integrating it into infrastructure but that's more usually > > > > annoying than a blocker. > > > > > No, the proliferation of test runners, test output formats, CI systems, > > > etc. really is an issue; it means we can't have one common driver that > > > anyone can run from the command line, and instead there's a bunch of > > > disparate systems with patchwork integration and all the feedback is nag > > > emails - after you've finished whan you were working on instead of > > > moving on to the next thing - with no way to get immediate feedback. > > > > It's certainly an issue and it's much better if people do manage to fit > > their tests into some existing thing but I'm not convinced that's the > > big reason why you have a bunch of different systems running separately > > and doing different things. For example the enterprise vendors will > > naturally tend to have a bunch of server systems in their labs and focus > > on their testing needs while I know the Intel audio CI setup has a bunch > > of laptops, laptop like dev boards and things in there with loopback > > audio cables and I think test equipment plugged in and focuses rather > > more on audio. My own lab is built around on systems I can be in the > > same room as without getting too annoyed and does things I find useful, > > plus using spare bandwidth for KernelCI because they can take donated > > lab time. > > No, you're overthinking. > > The vast majority of kernel testing requires no special hardware, just a > virtual machine. > > There is _no fucking reason_ we shouldn't be able to run tests on our > own local machines - _local_ machines, not waiting for the Intel CI > setup and asking for a git branch to be tested, not waiting for who > knows how long for the CI farm to get to it - just run the damn tests > immediately and get immediate feedback. > > You guys are overthinking and overengineering and ignoring the basics, > the way enterprise people always do. > As one of those former enterprise people that actually did do this stuff, I can say that even when I was "in the enterprise", I tried to avoid overthinking and overengineering stuff like this. :) Nobody can maintain anything that's so complicated nobody can run the tests on their machine. That is the root of all sadness. > > > And it's because building something shiny and new is the fun part, no > > > one wants to do the grungy integration work. > > > > I think you may be overestimating people's enthusiasm for writing test > > stuff there! There is NIH stuff going on for sure but lot of the time > > when you look at something where people have gone off and done their own > > thing it's either much older than you initially thought and predates > > anything they might've integrated with or there's some reason why none > > of the existing systems fit well. Anecdotally it seems much more common > > to see people looking for things to reuse in order to save time than it > > is to see people going off and reinventing the world. > > It's a basic lack of leadership. 
Yes, the younger engineers are always > going to be doing the new and shiny, and always going to want to build > something new instead of finishing off the tests or integrating with > something existing. Which is why we're supposed to have managers saying > "ok, what do I need to prioritize for my team be able to develop > effectively". > > > > > > > > example tests, example output: > > > > > https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest > > > > > https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing > > > > > > For example looking at the sample test there it looks like it needs > > > > among other things mkfs.btrfs, bcachefs, stress-ng, xfs_io, fio, mdadm, > > > > rsync > > > > > Getting all that set up by the end user is one command: > > > ktest/root_image create > > > and running a test is one morecommand: > > > build-test-kernel run ~/ktest/tests/bcachefs/single_device.ktest > > > > That does assume that you're building and running everything directly on > > the system under test and are happy to have the test in a VM which isn't > > an assumption that holds universally, and also that whoever's doing the > > testing doesn't want to do something like use their own distro or > > something - like I say none of it looks too unreasonable for > > filesystems. > > No, I'm doing it that way because technically that's the simplest way to > do it. > > All you guys building crazy contraptions for running tests on Google > Cloud or Amazon or whatever - you're building technical workarounds for > broken procurement. > > Just requisition the damn machines. > Running in the cloud does not mean it has to be complicated. It can be a simple Buildbot or whatever that knows how to spawn spot instances for tests and destroy them when they're done *if the test passed*. If a test failed on an instance, it could hold onto them for a day or two for someone to debug if needed. (I mention Buildbot because in a previous life, I used that to run tests for the dattobd out-of-tree kernel module before. That was the strategy I used for it.) > > Some will be, some will have more demanding requirements especially when > > you want to test on actual hardware rather than in a VM. For example > > with my own test setup which is more focused on hardware the operating > > costs aren't such a big deal but I've got boards that are for various > > reasons irreplaceable, often single instances of boards (which makes > > scheduling a thing) and for some of the tests I'd like to get around to > > setting up I need special physical setup. Some of the hardware I'd like > > to cover is only available in machines which are in various respects > > annoying to automate, I've got a couple of unused systems waiting for me > > to have sufficient bandwidth to work out how to automate them. Either > > way I don't think the costs are trival enough to be completely handwaved > > away. > > That does complicate things. > > I'd also really like to get automated performance testing going too, > which would have similar requirements in that jobs would need to be > scheduled on specific dedicated machines. I think what you're doing > could still build off of some common infrastructure. > > > I'd also note that the 9 hour turnaround time for that test set you're > > pointing at isn't exactly what I'd associate with immediate feedback. 
> > My CI shards at the subtest level, and like I mentioned I run 10 VMs per > physical machine, so with just 2 of the 80 core Ampere boxes I get full > test runs done in ~20 minutes. > This design, ironically, is way more cloud-friendly than a lot of testing system designs I've seen in the past. :)
On Fri, Jan 12, 2024 at 06:11:04AM -0500, Neal Gompa wrote: > On Thu, Jan 11, 2024 at 8:11 PM Kent Overstreet > > On Thu, Jan 11, 2024 at 09:47:26PM +0000, Mark Brown wrote: > > > It's certainly an issue and it's much better if people do manage to fit > > > their tests into some existing thing but I'm not convinced that's the > > > big reason why you have a bunch of different systems running separately > > > and doing different things. For example the enterprise vendors will > > > naturally tend to have a bunch of server systems in their labs and focus > > > on their testing needs while I know the Intel audio CI setup has a bunch > > No, you're overthinking. > > The vast majority of kernel testing requires no special hardware, just a > > virtual machine. This depends a lot on the area of the kernel you're looking at - some things are very amenable to testing in a VM but there's plenty of code where you really do want to ensure that at some point you're running with some actual hardware, ideally as wide a range of it with diverse implementation decisions as you can manage. OTOH some things can only be tested virtually because the hardware doesn't exist yet! > > There is _no fucking reason_ we shouldn't be able to run tests on our > > own local machines - _local_ machines, not waiting for the Intel CI > > setup and asking for a git branch to be tested, not waiting for who > > knows how long for the CI farm to get to it - just run the damn tests > > immediately and get immediate feedback. > > You guys are overthinking and overengineering and ignoring the basics, > > the way enterprise people always do. > As one of those former enterprise people that actually did do this > stuff, I can say that even when I was "in the enterprise", I tried to > avoid overthinking and overengineering stuff like this. :) > Nobody can maintain anything that's so complicated nobody can run the > tests on their machine. That is the root of all sadness. Yeah, similar with a lot of the more hardware focused or embedded stuff - running something on the machine that's in front of you is seldom the bit that causes substantial issues. Most of the exceptions I've personally dealt with involved testing hardware (from simple stuff like wiring the audio inputs and outputs together to verify that they're working to attaching fancy test equipment to simulate things or validate that desired physical parameters are being achieved). > > > of the existing systems fit well. Anecdotally it seems much more common > > > to see people looking for things to reuse in order to save time than it > > > is to see people going off and reinventing the world. > > It's a basic lack of leadership. Yes, the younger engineers are always > > going to be doing the new and shiny, and always going to want to build > > something new instead of finishing off the tests or integrating with > > something existing. Which is why we're supposed to have managers saying > > "ok, what do I need to prioritize for my team be able to develop > > effectively". That sounds more like a "(reproducible) tests don't exist" complaint which is a different thing again to people going off and NIHing fancy frameworks. 
> > > That does assume that you're building and running everything directly on > > > the system under test and are happy to have the test in a VM which isn't > > > an assumption that holds universally, and also that whoever's doing the > > > testing doesn't want to do something like use their own distro or > > > something - like I say none of it looks too unreasonable for > > > filesystems. > > No, I'm doing it that way because technically that's the simplest way to > > do it. > > All you guys building crazy contraptions for running tests on Google > > Cloud or Amazon or whatever - you're building technical workarounds for > > broken procurement. I think you're addressing some specific stuff that I'm not super familiar with here? My own stuff (and most of the stuff I end up looking at) involves driving actual hardware. > > Just requisition the damn machines. There's some assumptions there which are true for a lot of people working on the kernel but not all of them... > Running in the cloud does not mean it has to be complicated. It can be > a simple Buildbot or whatever that knows how to spawn spot instances > for tests and destroy them when they're done *if the test passed*. If > a test failed on an instance, it could hold onto them for a day or two > for someone to debug if needed. > (I mention Buildbot because in a previous life, I used that to run > tests for the dattobd out-of-tree kernel module before. That was the > strategy I used for it.) Yeah, or if your thing runs in a Docker container rather than a VM then throwing it at a Kubernetes cluster using a batch job isn't a big jump. > > I'd also really like to get automated performance testing going too, > > which would have similar requirements in that jobs would need to be > > scheduled on specific dedicated machines. I think what you're doing > > could still build off of some common infrastructure. It does actually - like quite a few test labs, mine is based around LAVA, labgrid is the other popular option (people were actually thinking about integrating the two recently since labgrid is a bit lower level than LAVA and they could conceptually play nicely with each other). Since the control API is internet accessible this means that it's really simple for me to donate spare time on the boards to KernelCI as it understands how to drive LAVA, testing that I in turn use myself. Both my stuff and KernelCI use a repository of glue which knows how to drive various testsuites inside a LAVA job; that's also used by other systems using LAVA like LKFT. The custom stuff I have is all fairly thin (and quite janky), mostly just either things specific to my physical lab or managing which tests I want to run and what results I expect. What I've got is *much* more limited than I'd like, and frankly if I wasn't able to pick up huge amounts of preexisting work most of this stuff would not be happening. > > > I'd also note that the 9 hour turnaround time for that test set you're > > > pointing at isn't exactly what I'd associate with immediate feedback. > > My CI shards at the subtest level, and like I mentioned I run 10 VMs per > > physical machine, so with just 2 of the 80 core Ampere boxes I get full > > test runs done in ~20 minutes. > This design, ironically, is way more cloud-friendly than a lot of > testing system designs I've seen in the past. :) Sounds like a small private cloud to me! :P
On Fri, Jan 12, 2024 at 06:22:55PM +0000, Mark Brown wrote: > This depends a lot on the area of the kernel you're looking at - some > things are very amenable to testing in a VM but there's plenty of code > where you really do want to ensure that at some point you're running > with some actual hardware, ideally as wide a range of it with diverse > implementation decisions as you can manage. OTOH some things can only > be tested virtually because the hardware doesn't exist yet! Surface wise, there are a lot of drivers that need real hardware; but if you look at where the complexity is, the hard complex algorithmic stuff that really needs to be tested thoroughly - that's all essentially library code that doesn't need specific drivers to test. More broadly, whenever testing comes up the "special cases and special hardware" keeps distracting us from making progress on the basics; which is making sure as much of the kernel as possible can be tested in a virtual machine, with no special setup. And if we were better at that, it would be a good nudge towards driver developers to make their stuff easier to test, perhaps by getting a virtualized implementation into qemu, or to make the individual drivers thinner and move heavy logic into easier to test library code. > Yeah, similar with a lot of the more hardware focused or embedded stuff > - running something on the machine that's in front of you is seldom the > bit that causes substantial issues. Most of the exceptions I've > personally dealt with involved testing hardware (from simple stuff like > wiring the audio inputs and outputs together to verify that they're > working to attaching fancy test equipment to simulate things or validate > that desired physical parameters are being achieved). Is that sort of thing a frequent source of regressions? That sounds like the sort of thing that should be a simple table, and not something I would expect to need heavy regression testing - but, my experience with driver development was nearly 15 years ago; not a lot of day to day. How badly are typical kernel refactorings needing regression testing in individual drivers? Filesystem development, OTOH, needs _heavy_ regression testing for everything we do. Similarly with mm, scheduler; many subtle interactions going on. > > > > of the existing systems fit well. Anecdotally it seems much more common > > > > to see people looking for things to reuse in order to save time than it > > > > is to see people going off and reinventing the world. > > > > It's a basic lack of leadership. Yes, the younger engineers are always > > > going to be doing the new and shiny, and always going to want to build > > > something new instead of finishing off the tests or integrating with > > > something existing. Which is why we're supposed to have managers saying > > > "ok, what do I need to prioritize for my team be able to develop > > > effectively". > > That sounds more like a "(reproducible) tests don't exist" complaint > which is a different thing again to people going off and NIHing fancy > frameworks. No, it's a leadership/mentorship thing. And this is something that's always been lacking in kernel culture. Witness the kind of general grousing that goes on at maintainer summits; maintainers complain about being overworked and people not stepping up to help with the grungy responsibilities, while simultaneously we still very much have a "fuck off if you haven't proven yourself" attitude towards newcomers. 
Understandable given the historical realities (this shit is hard and the penalties of fucking up are high, so there does need to be a barrier to entry), but it's left us with some real gaps. We don't have enough people in the senior engineer role who lay out designs and organise people to take on projects that are bigger than one single person can do, or that are necessary but not "fun". Tests and test infrastructure fall into the necessary but not fun category, so they languish. They are also things that you don't really learn the value of until you've been doing this stuff for a decade or so and you've learned by experience that yes, good tests really make life easier, as well as how to write effective tests, and that's knowledge that needs to be instilled. > > > > > That does assume that you're building and running everything directly on > > > > the system under test and are happy to have the test in a VM which isn't > > > > an assumption that holds universally, and also that whoever's doing the > > > > testing doesn't want to do something like use their own distro or > > > > something - like I say none of it looks too unreasonable for > > > > filesystems. > > > > No, I'm doing it that way because technically that's the simplest way to > > > do it. > > > > All you guys building crazy contraptions for running tests on Google > > > Cloud or Amazon or whatever - you're building technical workarounds for > > > broken procurement. > > I think you're addressing some specific stuff that I'm not super > familiar with here? My own stuff (and most of the stuff I end up > looking at) involves driving actual hardware. Yeah that's fair; that was addressed more towards what's been going on in the filesystem testing world, where I still (outside of my own stuff) haven't seen a CI with a proper dashboard of test results; instead a lot of code has been burned on multi-distro, highly configurable stuff that targets multiple clouds, but - I want simple and functional, not whiz-bang features. > > > Just requisition the damn machines. > > There's some assumptions there which are true for a lot of people > working on the kernel but not all of them... $500 a month for my setup (and this is coming out of my patreon funding right now!). It's a matter of priorities, and being willing to present this as _necessary_ to the people who control the purse strings. > > Running in the cloud does not mean it has to be complicated. It can be > > a simple Buildbot or whatever that knows how to spawn spot instances > > for tests and destroy them when they're done *if the test passed*. If > > a test failed on an instance, it could hold onto them for a day or two > > for someone to debug if needed. > > > (I mention Buildbot because in a previous life, I used that to run > > tests for the dattobd out-of-tree kernel module before. That was the > > strategy I used for it.) > > Yeah, or if your thing runs in a Docker container rather than a VM then > throwing it at a Kubernetes cluster using a batch job isn't a big jump. Kubernetes might be next level; I'm not a kubernetes guy so I can't say if it would simplify things over what I've got. But if it meant running on existing kubernetes clouds, that would make requisitioning hardware easier. > > > I'd also really like to get automated performance testing going too, > > > which would have similar requirements in that jobs would need to be > > > scheduled on specific dedicated machines. I think what you're doing > > > could still build off of some common infrastructure. 
> > It does actually - like quite a few test labs mine is based around LAVA, > labgrid is the other popular option (people were actually thinking about > integrating the two recently since labgrid is a bit lower level than > LAVA and they could conceptually play nicely with each other). Since > the control API is internet accessible this means that it's really > simple for me to to donate spare time on the boards to KernelCI as it > understands how to drive LAVA, testing that I in turn use myself. Both > my stuff and KernelCI use a repository of glue which knows how to drive > various testsuites inside a LAVA job, that's also used by other systems > using LAVA like LKFT. > > The custom stuff I have is all fairly thin (and quite janky), mostly > just either things specific to my physical lab or managing which tests I > want to run and what results I expect. What I've got is *much* more > limited than I'd like, and frankly if I wasn't able to pick up huge > amounts of preexisting work most of this stuff would not be happening. That's interesting. Do you have or would you be willing to write an overview of what you've got? The way you describe it I wonder if we've got some commonality. The short overview of my system: tests are programs that expose subcommands for listing dependencies (i.e. virtual machine options, kernel config options) and for listing and running subtests. Tests themselves are shell scripts, with various library code for e.g. standard kernel/vm config options, hooking up tracing, core dump catching, etc. The idea is for tests to be entirely self contained and need no outside configuration. The test framework knows how to:
- build an appropriately configured kernel
- launch a VM, which needs no prior configuration besides creation of a RO root filesystem image (single command, as mentioned)
- expose subcommands for qemu's gdb interface, kgdb, ssh access, etc. for when running interactively
- implement watchdogs/test timeouts
and the CI, on top of all that, watches various git repositories and - as you saw - tests every commit, newest to oldest, and provides the results in a git log format. The last one, "results in git log format", is _huge_. I don't know why I haven't seen anyone else do that - it was a must-have feature for any system over 10 years ago, and it never appeared so I finally built it myself. We (inherently!) have lots of issues with tests that only sometimes fail making it hard to know when a regression was introduced, but running all the tests on every commit with a good way to see the results makes this nearly a non-issue - that is, with a weak and noisy signal (test results) we just have to gather enough data and present the results properly to make the signal stand out (which commit(s) were buggy). I write a lot of code (over 200 commits for bcachefs this merge window alone), and this is a huge part of why I'm able to - I never have to do manual bisection anymore, and thanks to a codebase that's littered with assertions and debugging tools I don't spend that much time bug hunting either. > > > > I'd also note that the 9 hour turnaround time for that test set you're > > > > pointing at isn't exactly what I'd associate with immediate feedback. > > > > My CI shards at the subtest level, and like I mentioned I run 10 VMs per > > > physical machine, so with just 2 of the 80 core Ampere boxes I get full > > > test runs done in ~20 minutes. > > > This design, ironically, is way more cloud-friendly than a lot of > > testing system designs I've seen in the past.
:) > > Sounds like a small private cloud to me! :P Yep :)
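A minimal sketch of the kind of self-contained test described above, assuming a hypothetical runner contract where each test script answers "deps", "list-tests" and "run <name>" subcommands; the subcommand names, output format and device paths are illustrative, not ktest's actual interface:

    #!/usr/bin/env bash
    # example-test.sh - hypothetical self-contained test (illustrative only)
    set -euo pipefail

    case "${1:-}" in
    deps)
        # Everything the runner needs to build the kernel and size the VM:
        echo "kconfig CONFIG_BCACHEFS_FS=y"
        echo "vm-mem 4G"
        echo "scratch-dev 8G"
        ;;
    list-tests)
        echo "mount-basic"
        echo "snapshot-stress"
        ;;
    run)
        case "$2" in
        mount-basic)
            mkfs.bcachefs /dev/vdb
            mount -t bcachefs /dev/vdb /mnt
            umount /mnt
            ;;
        snapshot-stress)
            # longer-running subtest would go here
            ;;
        esac
        ;;
    esac

The point being that config options, VM resources and the subtest list are all discoverable from the test itself, with no out-of-band configuration.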
On Mon, Jan 15, 2024 at 01:42:53PM -0500, Kent Overstreet wrote: > > That sounds more like a "(reproducible) tests don't exist" complaint > > which is a different thing again to people going off and NIHing fancy > > frameworks. > > No, it's a leadership/mentorship thing. > > And this is something that's always been lacking in kernel culture. > Witness the kind of general grousing that goes on at maintainer summits; > maintainers complain about being overworked and people not stepping up > to help with the grungy responsibilities, while simultaneously we still > very much have a "fuck off if you haven't proven yourself" attitude > towards newcomers. Understandable given the historical realities (this > shit is hard and the penalties of fucking up are high, so there does > need to be a barrier to entry), but it's left us with some real gaps. > > We don't have enough a people in the senier engineer role who lay out > designs and organise people to take on projects that are bigger than one > single person can do, or that are necessary but not "fun". > > Tests and test infrastructure fall into the necessary but not fun > category, so they languish. No, they fall into the "no company wants to pay someone to do the work" category, so it doesn't get done. It's not a "leadership" issue, what is the "leadership" supposed to do here, refuse to take any new changes unless someone ponies up and does the infrastructure and testing work first? That's not going to fly, for valid reasons. And as proof of this, we have had many real features, that benefit everyone, called out as "please, companies, pay for this to be done, you all want it, and so do we!" and yet, no one does it. One real example is the RT work, it has a real roadmap, people to do the work, a tiny price tag, yet almost no one sponsoring it. Yes, for that specific issue it's slowly getting there and better, but it is one example of how your view of this might not be all that correct. I have loads of things I would love to see done. And I get interns at times to chip away at them, but my track record with interns is that almost all of them go off and get real jobs at companies doing kernel work (and getting paid well), and my tasks don't get finished, so it's back up to me to do them. And that's fine, and wonderful, I want those interns to get good jobs, that's why we do this. > They are also things that you don't really learn the value of until > you've been doing this stuff for a decade or so and you've learned by > experience that yes, good tests really make life easier, as well as how > to write effective tests, and that's knowledge that needs to be > instilled. And you will see that we now have the infrastructure in place for this. The great kunit testing framework, the kselftest framework, and the stuff tying it all together is there. All it takes is people actually using it to write their tests, which is slowly happening. So maybe, the "leadership" here is working, but in a nice organic way of "wouldn't it be nice if you cleaned that out-of-tree unit test framework up and get it merged" type of leadership, not mandates-from-on-high that just don't work. So organic you might have missed it :) Anyway, just my 2c, what do I know... greg k-h
On Mon, Jan 15, 2024 at 09:13:01PM +0100, Greg KH wrote: > On Mon, Jan 15, 2024 at 01:42:53PM -0500, Kent Overstreet wrote: > > > That sounds more like a "(reproducible) tests don't exist" complaint > > > which is a different thing again to people going off and NIHing fancy > > > frameworks. > > > > No, it's a leadership/mentorship thing. > > > > And this is something that's always been lacking in kernel culture. > > Witness the kind of general grousing that goes on at maintainer summits; > > maintainers complain about being overworked and people not stepping up > > to help with the grungy responsibilities, while simultaneously we still > > very much have a "fuck off if you haven't proven yourself" attitude > > towards newcomers. Understandable given the historical realities (this > > shit is hard and the penalties of fucking up are high, so there does > > need to be a barrier to entry), but it's left us with some real gaps. > > > > We don't have enough a people in the senier engineer role who lay out > > designs and organise people to take on projects that are bigger than one > > single person can do, or that are necessary but not "fun". > > > > Tests and test infrastructure fall into the necessary but not fun > > category, so they languish. > > No, they fall into the "no company wants to pay someone to do the work" > category, so it doesn't get done. > > It's not a "leadership" issue, what is the "leadership" supposed to do > here, refuse to take any new changes unless someone ponys up and does > the infrastructure and testing work first? That's not going to fly, for > valid reasons. > > And as proof of this, we have had many real features, that benefit > everyone, called out as "please, companies, pay for this to be done, you > all want it, and so do we!" and yet, no one does it. One real example > is the RT work, it has a real roadmap, people to do the work, a tiny > price tag, yet almost no one sponsoring it. Yes, for that specific > issue it's slowly getting there and better, but it is one example of how > you view of this might not be all that correct. Well, what's so special about any of those features? What's special about the RT work? The list of features and enhancements we want is never ending. But good tools are important beacuse they affect the rate of everyday development; they're a multiplier on the money everone is spending on salaries. In everyday development, the rate at which we can run tests and verify the corectness of the code we're working on is more often than not _the_ limiting factor on rate of development. It's a particularly big deal for getting new people up to speed, and for work that crosses subsystems. > And you will see that we now have the infrastructure in places for this. > The great kunit testing framework, the kselftest framework, and the > stuff tying it all together is there. All it takes is people actually > using it to write their tests, which is slowly happening. > > So maybe, the "leadership" here is working, but in a nice organic way of > "wouldn't it be nice if you cleaned that out-of-tree unit test framework > up and get it merged" type of leadership, not mandates-from-on-high that > just don't work. So organic you might have missed it :) Things are moving in the right direction; the testing track at Plumber's was exciting to see. Kselftests is not there yet, though. 
Those tests could all be runnable with a single command - and _most_ of what's needed is there, the kernel config dependencies are listed out, but we're still lacking a testrunner. I've been trying to get someone interested in hooking them up to ktest (my ktest, not that other thing), so that we'd have one common testrunner for running anything that can be a VM test. Similarly with blktests, mmtests, et cetera. Having one common way of running all our functional VM tests, and a common collection of those tests, would be a huge win for productivity because _way_ too many developers are still using slow ad hoc testing methods; a good test runner (ktest) gets the edit/compile/test cycle down to < 1 minute, with the same test framework for local development and automated testing in the big test cloud...
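To make "single command" concrete, here is a stripped-down sketch of that kind of wrapper, assuming a prebuilt read-only root image and glossing over how the test command gets passed into the guest; the paths, the ROOTFS variable and the in-guest test-init hook are hypothetical, and a real runner like ktest also handles result collection, timeouts and console logging:

    #!/usr/bin/env bash
    # vmtest.sh - hypothetical one-command build-and-boot wrapper (sketch)
    set -euo pipefail

    KERNEL_SRC=${KERNEL_SRC:-.}
    ROOTFS=${ROOTFS:-$HOME/vmtest/rootfs.img}   # read-only base image, built once

    # Incremental kernel build - only slow on the first run
    make -C "$KERNEL_SRC" -j"$(nproc)" bzImage

    # Boot it and run the test; -snapshot keeps the root image pristine
    qemu-system-x86_64 -enable-kvm -m 4G -smp 4 -nographic -snapshot \
        -kernel "$KERNEL_SRC/arch/x86/boot/bzImage" \
        -drive file="$ROOTFS",format=raw,if=virtio \
        -append "root=/dev/vda console=ttyS0 init=/usr/local/bin/test-init"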
On Tue, Jan 16, 2024 at 11:41:25PM -0500, Kent Overstreet wrote: > On Mon, Jan 15, 2024 at 09:13:01PM +0100, Greg KH wrote: > > On Mon, Jan 15, 2024 at 01:42:53PM -0500, Kent Overstreet wrote: > > > > That sounds more like a "(reproducible) tests don't exist" complaint > > > > which is a different thing again to people going off and NIHing fancy > > > > frameworks. > > > > > > No, it's a leadership/mentorship thing. > > > > > > And this is something that's always been lacking in kernel culture. > > > Witness the kind of general grousing that goes on at maintainer summits; > > > maintainers complain about being overworked and people not stepping up > > > to help with the grungy responsibilities, while simultaneously we still > > > very much have a "fuck off if you haven't proven yourself" attitude > > > towards newcomers. Understandable given the historical realities (this > > > shit is hard and the penalties of fucking up are high, so there does > > > need to be a barrier to entry), but it's left us with some real gaps. > > > > > > We don't have enough a people in the senier engineer role who lay out > > > designs and organise people to take on projects that are bigger than one > > > single person can do, or that are necessary but not "fun". > > > > > > Tests and test infrastructure fall into the necessary but not fun > > > category, so they languish. > > > > No, they fall into the "no company wants to pay someone to do the work" > > category, so it doesn't get done. > > > > It's not a "leadership" issue, what is the "leadership" supposed to do > > here, refuse to take any new changes unless someone ponys up and does > > the infrastructure and testing work first? That's not going to fly, for > > valid reasons. > > > > And as proof of this, we have had many real features, that benefit > > everyone, called out as "please, companies, pay for this to be done, you > > all want it, and so do we!" and yet, no one does it. One real example > > is the RT work, it has a real roadmap, people to do the work, a tiny > > price tag, yet almost no one sponsoring it. Yes, for that specific > > issue it's slowly getting there and better, but it is one example of how > > you view of this might not be all that correct. > > Well, what's so special about any of those features? What's special > about the RT work? The list of features and enhancements we want is > never ending. Nothing is special about RT except it is a good example of the kernel "leadership" asking for help, and companies just ignoring us by not funding the work to be done that they themselves want to see happen because their own devices rely on it. > But good tools are important beacuse they affect the rate of everyday > development; they're a multiplier on the money everone is spending on > salaries. > > In everyday development, the rate at which we can run tests and verify > the corectness of the code we're working on is more often than not _the_ > limiting factor on rate of development. It's a particularly big deal for > getting new people up to speed, and for work that crosses subsystems. Agreed, I'm not objecting here at all. > > And you will see that we now have the infrastructure in places for this. > > The great kunit testing framework, the kselftest framework, and the > > stuff tying it all together is there. All it takes is people actually > > using it to write their tests, which is slowly happening. 
> > > > So maybe, the "leadership" here is working, but in a nice organic way of > > "wouldn't it be nice if you cleaned that out-of-tree unit test framework > > up and get it merged" type of leadership, not mandates-from-on-high that > > just don't work. So organic you might have missed it :) > > Things are moving in the right direction; the testing track at Plumber's > was exciting to see. > > Kselftests is not there yet, though. Those tests could all be runnable > with a single command - and _most_ of what's needed is there, the kernel > config dependencies are listed out, but we're still lacking a > testrunner. 'make kselftest' is a good start, it outputs in proper format that test runners can consume. We even have 'make rusttest' now too because "rust is special" for some odd reason :) And that should be all that the kernel needs to provide as test runners all work differently for various reasons, but if you want to help standardize on something, that's what kernelci is doing, I know they can always appreciate the help as well. > I've been trying to get someone interested in hooking them up to ktest > (my ktest, not that other thing), so that we'd have one common > testrunner for running anything that can be a VM test. Similarly with > blktests, mmtests, et cetera. Hey, that "other" ktest.pl is what I have been using for stable kernel test builds for years, it does work well for what it is designed for, and I know other developers also use it. > Having one common way of running all our functional VM tests, and a > common collection of those tests would be a huge win for productivity > because _way_ too many developers are still using slow ad hoc testing > methods, and a good test runner (ktest) gets the edit/compile/test cycle > down to < 1 minute, with the same tests framework for local development > and automated testing in the big test cloud... Agreed, and that's what kernelci is working to help provide. thanks, greg k-h
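For reference, the kselftest entry points being referred to look roughly like this (see Documentation/dev-tools/kselftest.rst for the authoritative syntax); the output is TAP-style, which is what makes it consumable by external test runners:

    # Run the whole collection from the top of the tree (can take a while):
    make kselftest

    # Or just the suites you care about, which is usually what you want locally:
    make TARGETS="timers net" kselftest

    # Equivalent invocation from the selftests directory:
    make -C tools/testing/selftests TARGETS=timers run_tests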
On Tue, Jan 16, 2024 at 11:41:25PM -0500, Kent Overstreet wrote: > > > No, it's a leadership/mentorship thing. > > > > > > And this is something that's always been lacking in kernel culture. > > > Witness the kind of general grousing that goes on at maintainer summits; > > > maintainers complain about being overworked and people not stepping up > > > to help with the grungy responsibilities, while simultaneously we still <blah blah blah> > > > Tests and test infrastructure fall into the necessary but not fun > > > category, so they languish. > > > > No, they fall into the "no company wants to pay someone to do the work" > > category, so it doesn't get done. > > > > It's not a "leadership" issue, what is the "leadership" supposed to do > > here, refuse to take any new changes unless someone ponys up and does > > the infrastructure and testing work first? That's not going to fly, for > > valid reasons. Greg is absolutely right about this. > But good tools are important beacuse they affect the rate of everyday > development; they're a multiplier on the money everone is spending on > salaries. Alas, companies don't see it that way. They take the value they get from Linux for granted, and they only care about the multiplier effect on their employees' salaries (and sometimes not even that). They most certainly don't care about the salutary effects on the entire ecosystem. At least, I haven't seen any company make funding decisions on that basis. It's easy enough for you to blame "leadership", but the problem is the leaders at the VP and SVP level who control the budgets, not the leadership of the maintainers, who are overworked, and who often invest in testing themselves, on their own personal time, because they don't get adequate support from others. It's also for that reason that we make people prove that they won't just stick around long enough for their pet feature (or, in the case of ntfs, their pet file system) to get into the kernel --- and then disappear. Far too often, this is what happens, either because they have their itch scratched, or their company reassigns them to some other project that is important for their company's bottom-line. If that person is willing to spend their own personal time, long after work hours, to steward their contribution in the absence of corporate support, great. But we need to have that proven to us, or at the very least, make sure the feature's long-term maintenance burden is as low as possible, to mitigate the risk that we won't see the new engineer again after their feature lands upstream. > Having one common way of running all our functional VM tests, and a > common collection of those tests would be a huge win for productivity > because _way_ too many developers are still using slow ad hoc testing > methods, and a good test runner (ktest) gets the edit/compile/test cycle > down to < 1 minute, with the same tests framework for local development > and automated testing in the big test cloud... I'm going to call bullshit on this assertion. The fact that we have multiple ways of running our tests is not the reason why testing takes a long time. If you are going to run stress tests, which is critical for testing real file systems, that's going to take at least an hour; more if you want to test multiple file system features. The full regression set for ext4, using the common fstests test suite, takes about 25 hours of VM time; and about 2.5 hours of wall clock time since I shard it across a dozen VMs.
Yes, we could try to add some unit tests which take much less time than running tests where fstests creates a file system, mounts it, exercises the code through userspace functions, and then unmounts and checks the file system. Even if that were an adequate replacement for some of the existing fstests, (a) it's not a replacement for stress testing, and (b) this would require a vast amount of file system specific software engineering investment, and where is that coming from? The bottom line is that not having one common way of running our functional VM tests is not even *close* to the root cause of the problem. - Ted
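The sharding arithmetic above is plain parallelism; a toy sketch of the idea, purely illustrative - gce-xfstests and similar tools do this properly, with result aggregation and retry handling, and "run-in-vm" and "tests.list" here are hypothetical stand-ins:

    #!/usr/bin/env bash
    # Split a precomputed test list (e.g. the output of './check -n -g auto',
    # one test per line) round-robin across N scratch VMs.
    NVMS=12
    mapfile -t tests < tests.list
    for i in $(seq 0 $((NVMS - 1))); do
        shard=$(printf '%s\n' "${tests[@]}" | awk -v n="$NVMS" -v i="$i" 'NR % n == i')
        run-in-vm "vm$i" ./check $shard &    # hypothetical per-VM helper
    done
    wait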
On Wed, 2024-01-17 at 00:54 -0500, Theodore Ts'o wrote: > On Tue, Jan 16, 2024 at 11:41:25PM -0500, Kent Overstreet wrote: > > > > No, it's a leadership/mentorship thing. > > > > > > > > And this is something that's always been lacking in kernel > > > > culture. Witness the kind of general grousing that goes on at > > > > maintainer summits;maintainers complain about being overworked > > > > and people not stepping up to help with the grungy > > > > responsibilities, while simultaneously we still > > <blah blah blah> > > > > > Tests and test infrastructure fall into the necessary but not > > > > fun category, so they languish. > > > > > > No, they fall into the "no company wants to pay someone to do the > > > work" category, so it doesn't get done. > > > > > > It's not a "leadership" issue, what is the "leadership" supposed > > > to do here, refuse to take any new changes unless someone ponys > > > up and does the infrastructure and testing work first? That's > > > not going to fly, for valid reasons. > > Greg is absolutely right about this. > > > But good tools are important beacuse they affect the rate of > > everyday development; they're a multiplier on the money everone is > > spending on salaries. > > Alas, companies don't see it that way. They take the value that get > from Linux for granted, and they only care about the multipler effect > of their employees salaries (and sometimes not even that). They most > certainly care about the salutary effects on the entire ecosyustem. > At least, I haven't seen any company make funding decisions on that > basis. Actually, this is partly our fault. Companies behave exactly like a selfish contributor does: https://archive.fosdem.org/2020/schedule/event/selfish_contributor/ The question they ask is "if I'm putting money into it, what am I getting out of it". If the answer to that is that it benefits everybody, it's basically charity to the entity being asked (and not even properly tax deductible at that), which goes way back behind even real charitable donations (which at least have a publicity benefit) and you don't even get to speak to anyone about it when you go calling with the collecting tin. If you can say it benefits these 5 tasks your current employees are doing, you might have a possible case for the engineering budget (you might get in the door but you'll still be queuing behind every in-plan budget item). The best case is if you can demonstrate some useful for-profit contribution it makes to the actual line of business (or better yet could be used to spawn a new line of business), so when you're asking for a tool, it has to be usable outside the narrow confines of the kernel and you need to be able to articulate why it's generally useful (git is a great example, it was designed to solve a kernel specific problem, but now it's in use pretty much everywhere source control is a thing). Somewhere between 2000 and now we seem to have lost our ability to frame the argument in the above terms, because the business quid pro quo argument was what got us money for stuff we needed and got the Linux Foundation and the TAB formed, but we're not managing nearly as well now. The environment has hardened against us (we're no longer the new shiny) but that's not the whole explanation. I also have to say that, for all the complaints, there's just not any open source pull for test tools (there's no-one who's on a mission to make them better). Demanding that someone else do it is proof of this (if you cared enough you'd do it yourself).
That's why all our testing infrastructure is just some random set of scripts that mostly does what I want, because it's the last thing I need to prove the thing I actually care about works. Finally testing infrastructure is how OSDL (the precursor to the Linux foundation) got started and got its initial funding, so corporations have been putting money into it for decades with not much return (and pretty much nothing to show for a unified testing infrastructure ... ten points to the team who can actually name the test infrastructure OSDL produced) and have finally concluded it's not worth it, making it a 10x harder sell now. James
On Mon, Jan 15, 2024 at 01:42:53PM -0500, Kent Overstreet wrote: > On Fri, Jan 12, 2024 at 06:22:55PM +0000, Mark Brown wrote: > > This depends a lot on the area of the kernel you're looking at - some > > things are very amenable to testing in a VM but there's plenty of code > > where you really do want to ensure that at some point you're running > > with some actual hardware, ideally as wide a range of it with diverse > > implementation decisions as you can manage. OTOH some things can only > > be tested virtually because the hardware doesn't exist yet! > Surface wise, there are a lot of drivers that need real hardware; but if > you look at where the complexity is, the hard complex algorithmic stuff > that really needs to be tested thoroughly - that's all essentially > library code that doesn't need specific drivers to test. ... > And if we were better at that, it would be a good nudge towards driver > developers to make their stuff easier to test, perhaps by getting a > virtualized implementation into qemu, or to make the individual drivers > thinner and move heavy logic into easier to test library code. As Greg indicated with the testing I doubt everyone has infinite budget for developing emulation, and I will note that model accuracy and performance tend to be competing goals. When it comes to factoring things out into library code that can be a double edged sword - changes in the shared code can affect rather more systems than a single driver change so really ought to be tested on a wide range of systems. The level of risk from changes does vary widly of course, and you can try to have pure software tests for the things you know are relied upon, but it can be surprising. > > Yeah, similar with a lot of the more hardware focused or embedded stuff > > - running something on the machine that's in front of you is seldom the > > bit that causes substantial issues. Most of the exceptions I've > > personally dealt with involved testing hardware (from simple stuff like > > wiring the audio inputs and outputs together to verify that they're > > working to attaching fancy test equipment to simulate things or validate > > that desired physical parameters are being achieved). > Is that sort of thing a frequent source of regressions? > That sounds like the sort of thing that should be a simple table, and > not something I would expect to need heavy regression testing - but, my > experience with driver development was nearly 15 years ago; not a lot of > day to day. How badly are typical kernel refactorings needing regression > testing in individual drivers? General refactorings tend not to be that risky, but once you start doing active work on the shared code dealing with the specific thing the risk starts to go up and some changes are more risky than others. > Filesystem development, OTOH, needs _heavy_ regression testing for > everything we do. Similarly with mm, scheduler; many subtle interactions > going on. Right, and a lot of factored out code ends up in the same boat - that's kind of the issue. > > > > It's a basic lack of leadership. Yes, the younger engineers are always > > > > going to be doing the new and shiny, and always going to want to build > > > > something new instead of finishing off the tests or integrating with > > > > something existing. Which is why we're supposed to have managers saying > > > > "ok, what do I need to prioritize for my team be able to develop > > > > effectively". 
> > That sounds more like a "(reproducible) tests don't exist" complaint > > which is a different thing again to people going off and NIHing fancy > > frameworks. > No, it's a leadership/mentorship thing. > And this is something that's always been lacking in kernel culture. > Witness the kind of general grousing that goes on at maintainer summits; > maintainers complain about being overworked and people not stepping up > to help with the grungy responsibilities, while simultaneously we still > very much have a "fuck off if you haven't proven yourself" attitude > towards newcomers. Understandable given the historical realities (this > shit is hard and the penalties of fucking up are high, so there does > need to be a barrier to entry), but it's left us with some real gaps. > We don't have enough a people in the senier engineer role who lay out > designs and organise people to take on projects that are bigger than one > single person can do, or that are necessary but not "fun". > Tests and test infrastructure fall into the necessary but not fun > category, so they languish. Like Greg said I don't think that's a realistic view of how we can get things done here - often the thing with stop energy is that it just makes people stop. In a lot of areas everyone is just really busy and struggling to keep up, we make progress on the generic stuff in part by accepting that people have limited time and will do what they can with everyone building on top of everyone's work. > > > > Just requisition the damn machines. > > There's some assumptions there which are true for a lot of people > > working on the kernel but not all of them... > $500 a month for my setup (and this is coming out of my patreon funding > right now!). It's a matter of priorities, and being willing to present > this as _necessary_ to the people who control the purse strings. One of the assumptions there is that everyone is doing this in a well funded corporate environment focused on upstream. Even ignoring hobbyists and students for example in the embedded world it's fairly common to have stuff being upstreamed since people did the work anyway for a customer project or internal product but where the customer doesn't actually care either way if the code lands anywhere other than their product (we might suggest that they should care but that doesn't mean that they actually do care). I'll also note that there's people like me who do things with areas of the kernel not urgently related to their current employer's business and hence very difficult to justify as a work expense. With my lab some companies have been generous enough to send me test hardware (which I'm very greatful for, that's most of the irreplaceable stuff I have) but the infrastructure around them and the day to day operating costs are all being paid for by me personally. > > > > I'd also really like to get automated performance testing going too, > > > > which would have similar requirements in that jobs would need to be > > > > scheduled on specific dedicated machines. I think what you're doing > > > > could still build off of some common infrastructure. > > It does actually - like quite a few test labs mine is based around LAVA, > > labgrid is the other popular option (people were actually thinking about > > integrating the two recently since labgrid is a bit lower level than ... > > want to run and what results I expect. 
What I've got is *much* more > > limited than I'd like, and frankly if I wasn't able to pick up huge > > amounts of preexisting work most of this stuff would not be happening. > That's interesting. Do you have or would you be willing to write an > overview of what you've got? The way you describe it I wonder if we've > got some commonality. I was actually thinking about putting together a talk about it, though realistically the majority of it is just a very standard LAVA lab which is something there's a bunch of presentations/documentation about already. > The short overview of my system: tests are programs that expose > subcommends for listing depencies (i.e. virtual machine options, kernel > config options) and for listing and running subtests. Tests themselves > are shell scripts, with various library code for e.g. standard > kernel/vm config options, hooking up tracing, core dump catching, etc. > The idea is for tests to be entirely self contained and need no outside > configuration. The tests themselves bit sounds like what everyone else is doing - it all comes down to running some shell commands in a target environment somewhere. kselftest provides information on which config options it needs which would be nice to integrate too. > and the CI, on top of all that, watches various git repositories and - > as you saw - tests every commit, newest to oldest, and provides the > results in a git log format. > The last one, "results in git log format", is _huge_. I don't know why I > haven't seen anyone else do that - it was a must-have feature for any > system over 10 years ago, and it never appeared so I finally built it > myself. A lot of the automated testing that gets done is too expensive to be done per commit, though some does. I do actually do it myself, but even there it's mainly just some very quick smoke tests that get run per commit with more tests done on the branch as a whole (with a bit more where I can parallise things well). My stuff is more organised for scripting so expected passes are all just elided, I just use LAVA's UI if I want to pull the actual jobs for some reason. I've also see aiaiai used for this, though I think the model there was similarly to only get told about problems. > We (inherently!) have lots of issues with tests that only sometimes fail > making it hard to know when a regression was introduced, but running all > the tests on every commit with a good way to see the results makes this > nearly a non issue - that is, with a weak and noisy signal (tests > results) we just have to gather enough data and present the results > properly to make the signal stand out (which commit(s) were buggy). Yeah, running for longer and/or more often helps find the hard to reproduce things. There's a bunch of strategies for picking exactly what to do there, per commit is certainly a valid one.
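On the config-options point: the per-suite fragments already live in the tree (tools/testing/selftests/<suite>/config), so a runner can fold them into whatever kernel config it is building using the stock merge script - roughly, with the fragment paths varying by suite:

    # Start from a base config, then merge in the selftest config fragments
    make defconfig
    ./scripts/kconfig/merge_config.sh -m .config \
        tools/testing/selftests/net/config \
        tools/testing/selftests/bpf/config
    make olddefconfig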
On Wed, Jan 17, 2024 at 08:03:35AM -0500, James Bottomley wrote: > I also have to say, that for all the complaints there's just not any > open source pull for test tools (there's no-one who's on a mission to > make them better). Demanding that someone else do it is proof of this > (if you cared enough you'd do it yourself). That's why all our testing > infrastructure is just some random set of scripts that mostly does what > I want, because it's the last thing I need to prove the thing I > actually care about works. > Finally testing infrastructure is how OSDL (the precursor to the Linux > foundation) got started and got its initial funding, so corporations > have been putting money into it for decades with not much return (and > pretty much nothing to show for a unified testing infrastructure ... > ten points to the team who can actually name the test infrastructure > OSDL produced) and have finally concluded it's not worth it, making it > a 10x harder sell now. I think that's a *bit* pessimistic, at least for some areas of the kernel - there is commercial stuff going on with kernel testing with varying degrees of community engagement (eg, off the top of my head Baylibre, Collabora and Linaro all have offerings of various kinds that I'm aware of), and some of that does turn into investments in reusable things rather than proprietary stuff. I know that I look at the kernelci.org results for my trees, and that I've fixed issues I saw purely in there. kselftest is noticably getting much better over time, and LTP is quite active too. The stuff I'm aware of is more focused around the embedded space than the enterprise/server space but it does exist. That's not to say that this is all well resourced and there's no problem (far from it), but it really doesn't feel like a complete dead loss either. Some of the issues come from the different questions that people are trying to answer with testing, or the very different needs of the tests that people want to run - for example one of the reasons filesystems aren't particularly well covered for the embedded cases is that if your local storage is SD or worse eMMC then heavy I/O suddenly looks a lot more demanding and media durability a real consideration.
On Wed, Jan 17, 2024 at 08:03:35AM -0500, James Bottomley wrote: > Actually, this is partly our fault. Companies behave exactly like a > selfish contributor does: > > https://archive.fosdem.org/2020/schedule/event/selfish_contributor/ > > The question they ask is "if I'm putting money into it, what am I > getting out of it". If the answer to that is that it benefits > everybody, it's basically charity to the entity being asked (and not > even properly tax deductible at that), which goes way back behind even > real charitable donations (which at least have a publicity benefit) and > you don't even get to speak to anyone about it when you go calling with > the collecting tin. If you can say it benefits these 5 tasks your > current employees are doing, you might have a possible case for the > engineering budget (you might get in the door but you'll still be > queuing behind every in-plan budget item). The best case is if you can > demonstrate some useful for profit contribution it makes to the actual > line of business (or better yet could be used to spawn a new line of > business), so when you're asking for a tool, it has to be usable > outside the narrow confines of the kernel and you need to be able to > articulate why it's generally useful (git is a great example, it was > designed to solve a kernel specific problem, but not it's in use pretty > much everywhere source control is a thing). I have on occasion tried to make the "it benefits the whole ecosystem" argument, and that will work on the margins. But it's a lot harder when it's more than a full SWE-year's worth of investment, at least more recently. I *have* tried to get more test investment, with an eye towards benefitting not just one company, but in a much more general fashion ---- but multi-engineer projects are a very hard sell, especially recently. If Kent wants to impugn my leadership skills, that's fine; I invite him to try and see if he can get SVPs to cough up the dough. :-) I've certainly had a lot more success with the "Business quid pro quo" argument; fscrypt and fsverity were developed for Android and Chrome; casefolding support benefited Android and Steam; ext4 fast commits were targeted at cloud-based NFS and Samba serving, etc. My conception of a successful open source maintainer includes a strong aspect of a product manager whose job is to find product/market fit. That is, I try to be a matchmaker between some feature that I've wanted for my subsystem, and would benefit users, and a business case that is sufficiently compelling that a company is willing to fund the engineering effort to make that feature happen. That company might be one that signs my paycheck, or might be some other company. For special bonus points, if I can convince some other company to fund a good chunk of the engineering effort, and it *also* benefits the company that pays my salary, that's a win-win that I can crow about at performance review time. :-) > Somewhere between 2000 and now we seem to have lost our ability to > frame the argument in the above terms, because the business quid pro > quo argument was what got us money for stuff we needed and the Linux > Foundation and the TAB formed, but we're not managing nearly as well > now. The environment has hardened against us (we're no longer the new > shiny) but that's not the whole explanation. There are a couple of dynamics going on here, I think.
When a company is just starting to invest in open source, and it is the "new shiny", it's a lot easier to make the pitch for big projects that are good for everyone. In the early days of the IBM Linux Technology Center, the Linux SMP scalability effort, ltp, etc., were significantly funded by the IBM LTC. And in some cases, efforts which didn't make it upstream, but which inspired the features to enter Linux (even if it wasn't IBM code), such as in the case of IBM's Linux threading or volume management work, it was still considered a win by IBM management. Unfortunately, this effect fades over time. It's a lot easier to fund multi-engineer projects which run for more than a year, when a company is just starting out, and when it's still trying to attract upstream developers, and it has a sizeable "investment" budget. ("IBM will invest a billion dollars in Linux"). But then in later years, the VPs have to justify their budget, and so companies tend to become more and more "selfish". After all, that's how capitalism works --- "think of the children^H^H^H^H^H^H^H shareholders!" I suspect we can all think of companies beyond just IBM where this dynamic is at play; I certainly can! The economic cycle can also make a huge difference. Things got harder after the dot com implosion; then things loosened up. However, post-COVID, we've seen multiple companies really become much more focused on "how is this good for our company". It has different names at different companies, such as "year of efficiency" or "sharpening our focus", but it often is accompanied by layoffs, and a general tightening of budgets. I don't think it's an accident that maintainer grumpiness has been higher than normal in the last year or so. - Ted
On 1/17/24 05:03, James Bottomley wrote: > Finally testing infrastructure is how OSDL (the precursor to the Linux > foundation) got started and got its initial funding, so corporations > have been putting money into it for decades with not much return (and > pretty much nothing to show for a unified testing infrastructure ... > ten points to the team who can actually name the test infrastructure > OSDL produced) and have finally concluded it's not worth it, making it > a 10x harder sell now. What will ten points get me? A weak cup of coffee? Do I need a team to answer the question? Anyway, Crucible.
On Wed, Jan 17, 2024 at 08:03:35AM -0500, James Bottomley wrote: > On Wed, 2024-01-17 at 00:54 -0500, Theodore Ts'o wrote: > > On Tue, Jan 16, 2024 at 11:41:25PM -0500, Kent Overstreet wrote: > > > > > No, it's a leadership/mentorship thing. > > > > > > > > > > And this is something that's always been lacking in kernel > > > > > culture. Witness the kind of general grousing that goes on at > > > > > maintainer summits;maintainers complain about being overworked > > > > > and people not stepping up to help with the grungy > > > > > responsibilities, while simultaneously we still > > > > <blah blah blah> > > > > > > > Tests and test infrastructure fall into the necessary but not > > > > > fun category, so they languish. > > > > > > > > No, they fall into the "no company wants to pay someone to do the > > > > work" category, so it doesn't get done. > > > > > > > > It's not a "leadership" issue, what is the "leadership" supposed > > > > to do here, refuse to take any new changes unless someone ponys > > > > up and does the infrastructure and testing work first? That's > > > > not going to fly, for valid reasons. > > > > Greg is absolutely right about this. > > > > > But good tools are important beacuse they affect the rate of > > > everyday development; they're a multiplier on the money everone is > > > spending on salaries. > > > > Alas, companies don't see it that way. They take the value that get > > from Linux for granted, and they only care about the multipler effect > > of their employees salaries (and sometimes not even that). They most > > certainly care about the salutary effects on the entire ecosyustem. > > At least, I haven't seen any company make funding decisions on that > > basis. > > Actually, this is partly our fault. Companies behave exactly like a > selfish contributor does: > > https://archive.fosdem.org/2020/schedule/event/selfish_contributor/ > > The question they ask is "if I'm putting money into it, what am I > getting out of it". If the answer to that is that it benefits > everybody, it's basically charity to the entity being asked (and not > even properly tax deductible at that), which goes way back behind even > real charitable donations (which at least have a publicity benefit) and > you don't even get to speak to anyone about it when you go calling with > the collecting tin. If you can say it benefits these 5 tasks your > current employees are doing, you might have a possible case for the > engineering budget (you might get in the door but you'll still be > queuing behind every in-plan budget item). The best case is if you can > demonstrate some useful for profit contribution it makes to the actual > line of business (or better yet could be used to spawn a new line of > business), so when you're asking for a tool, it has to be usable > outside the narrow confines of the kernel and you need to be able to > articulate why it's generally useful (git is a great example, it was > designed to solve a kernel specific problem, but not it's in use pretty > much everywhere source control is a thing). > > Somewhere between 2000 and now we seem to have lost our ability to > frame the argument in the above terms, because the business quid pro > quo argument was what got us money for stuff we needed and the Linux > Foundation and the TAB formed, but we're not managing nearly as well > now. The environment has hardened against us (we're no longer the new > shiny) but that's not the whole explanation. I think this take is closer to the mark, yeah. 
The elephant in the room that I keep seeing is that MBA driven business culture in the U.S. has gotten _insane_, and we've all been stewing in the same pot together, collectively boiling, and not noticing or talking about just how bad it's gotten. Engineering culture really does matter; it's what makes the difference between working effectively or not. And by engineering culture I mean things like being able to set effective goals and deliver on them, and have a good balance between product based, end user focused development; exploratory, prototype-minded research product type stuff; and the "clean up your messes and eat your vegetables" type stuff that keeps tech debt from getting out of hand. Culturally, we in the kernel community are quite good on the last front, not so good on the first two, and I think a large part of the reason is people being immersed in corporate culture where everything is quarterly OKRs, "efficiency", et cetera - and everywhere I look, it's hard to find senior engineering involved in setting a roadmap. Instead we have a lot of "initiatives" and feifdoms, and if you ask me it's a direct result of MBA culture run amuck. Culturally, things seem to be a lot better in Europe - I've been seeing a _lot_ more willingness to fund grungy difficult long term projects there; the silicon valley mentality of "it must have the potential for a massive impact (and we have to get it done as quick as possible) or it's not worth looking at" is, thankfully, absent there. > I also have to say, that for all the complaints there's just not any > open source pull for test tools (there's no-one who's on a mission to > make them better). Demanding that someone else do it is proof of this > (if you cared enough you'd do it yourself). That's why all our testing > infrastructure is just some random set of scripts that mostly does what > I want, because it's the last thing I need to prove the thing I > actually care about works. It's awkward because the people with the greatest need, and therefore (in theory?) the greatest understanding for what kind of tools would be effective, are the people with massive other responsibilities. There are things we just can't do without delegating, and delegating is something we seem to be consistently not great at in the kernel community. And I don't think it needs to be that way, because younger engineers would really benefit from working closely with someone more senior, and in my experience the way to do a lot of these tooling things right is _not_ to build it all at once in a year of full time SWE salary time - it's much better to take your time, spend a lot of time learning the workflows, letting ideas percolate, and gradually build things up. Yet the way these projects all seem to go is we have one or a few people working full time mostly writing code, building things with a lot of _features_... and if you ask me, ending up with something where most of the features were things we didn't need or ask for and just make the end result harder to use. Tools are hard to get right; perhaps we should be spending more of our bikeshedding time on the lists bikeshedding our tools, and a little bit less on coding style minutia. Personally, I've tried to get the ball rolling multiple times with various people asking them what they want and need out of their testing tools and how they use them, and it often feels like pulling teeth. 
> Finally testing infrastructure is how OSDL (the precursor to the Linux > foundation) got started and got its initial funding, so corporations > have been putting money into it for decades with not much return (and > pretty much nothing to show for a unified testing infrastructure ... > ten points to the team who can actually name the test infrastructure > OSDL produced) and have finally concluded it's not worth it, making it > a 10x harder sell now. The circle of fail continues :)
On Wed, Jan 17, 2024 at 06:19:43PM +0000, Mark Brown wrote: > On Wed, Jan 17, 2024 at 08:03:35AM -0500, James Bottomley wrote: > > > I also have to say, that for all the complaints there's just not any > > open source pull for test tools (there's no-one who's on a mission to > > make them better). Demanding that someone else do it is proof of this > > (if you cared enough you'd do it yourself). That's why all our testing > > infrastructure is just some random set of scripts that mostly does what > > I want, because it's the last thing I need to prove the thing I > > actually care about works. > > > Finally testing infrastructure is how OSDL (the precursor to the Linux > > foundation) got started and got its initial funding, so corporations > > have been putting money into it for decades with not much return (and > > pretty much nothing to show for a unified testing infrastructure ... > > ten points to the team who can actually name the test infrastructure > > OSDL produced) and have finally concluded it's not worth it, making it > > a 10x harder sell now. > > I think that's a *bit* pessimistic, at least for some areas of the > kernel - there is commercial stuff going on with kernel testing with > varying degrees of community engagement (eg, off the top of my head > Baylibre, Collabora and Linaro all have offerings of various kinds that > I'm aware of), and some of that does turn into investments in reusable > things rather than proprietary stuff. I know that I look at the > kernelci.org results for my trees, and that I've fixed issues I saw > purely in there. kselftest is noticably getting much better over time, > and LTP is quite active too. The stuff I'm aware of is more focused > around the embedded space than the enterprise/server space but it does > exist. That's not to say that this is all well resourced and there's no > problem (far from it), but it really doesn't feel like a complete dead > loss either. kselftest is pretty exciting to me; "collect all our integration tests into one place and start to standarize on running them" is good stuff. You seem to be pretty familiar with all the various testing efforts, I wonder if you could talk about what you see that's interesting and useful in the various projects? I think a lot of this stems from a lack of organization and a lack of communication; I see a lot of projects reinventing things in slightly different ways and failing to build off of each other. > Some of the issues come from the different questions that people are > trying to answer with testing, or the very different needs of the > tests that people want to run - for example one of the reasons > filesystems aren't particularly well covered for the embedded cases is > that if your local storage is SD or worse eMMC then heavy I/O suddenly > looks a lot more demanding and media durability a real consideration. Well, for filesystem testing we (mostly) don't want to be hammering on an actual block device if we can help it - there are occasionally bugs that will only manifest when you're testing on a device with realistic performance characteristics, and we definitely want to be doing some amount of performance testing on actual devices, but most of our testing is best done in a VM where the scratch devices live entirely in dram on the host. But that's a minor detail, IMO - that doesn't prevent us from having a common test runner for anything that doesn't need special hardware.
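"Scratch devices that live entirely in DRAM on the host" can be as simple as tmpfs-backed images handed to the guest as virtio disks - a sketch, with the image names, sizes and kernel path purely illustrative:

    # Create RAM-backed scratch images on the host
    truncate -s 10G /dev/shm/scratch0.img
    truncate -s 10G /dev/shm/scratch1.img

    qemu-system-x86_64 -enable-kvm -m 8G -smp 8 -nographic \
        -kernel arch/x86/boot/bzImage \
        -append "root=/dev/vda console=ttyS0" \
        -drive file=rootfs.img,format=raw,if=virtio,readonly=on \
        -drive file=/dev/shm/scratch0.img,format=raw,if=virtio,cache=unsafe \
        -drive file=/dev/shm/scratch1.img,format=raw,if=virtio,cache=unsafe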
On Wed, Jan 17, 2024 at 09:49:22PM -0500, Theodore Ts'o wrote: > On Wed, Jan 17, 2024 at 08:03:35AM -0500, James Bottomley wrote: > > Actually, this is partly our fault. Companies behave exactly like a > > selfish contributor does: > > > > https://archive.fosdem.org/2020/schedule/event/selfish_contributor/ > > > > The question they ask is "if I'm putting money into it, what am I > > getting out of it". If the answer to that is that it benefits > > everybody, it's basically charity to the entity being asked (and not > > even properly tax deductible at that), which goes way back behind even > > real charitable donations (which at least have a publicity benefit) and > > you don't even get to speak to anyone about it when you go calling with > > the collecting tin. If you can say it benefits these 5 tasks your > > current employees are doing, you might have a possible case for the > > engineering budget (you might get in the door but you'll still be > > queuing behind every in-plan budget item). The best case is if you can > > demonstrate some useful for profit contribution it makes to the actual > > line of business (or better yet could be used to spawn a new line of > > business), so when you're asking for a tool, it has to be usable > > outside the narrow confines of the kernel and you need to be able to > > articulate why it's generally useful (git is a great example, it was > > designed to solve a kernel specific problem, but not it's in use pretty > > much everywhere source control is a thing). > > I have on occasion tried to make the "it benefits the whole ecosystem" > argument, and that will work on the margins. But it's a lot harder > when it's more than a full SWE-year's worth of investment, at least > more recently. I *have* tried to get more test investment. with an > eye towards benefitting not just one company, but in a much more > general fasion ---- but multi-engineer projects are a very hard sell, > especially recently. If Kent wants to impugn my leadership skills, > that's fine; I invite him to try and see if he can get SVP's cough up > the dough. :-) Well, I've tried talking to you about improving our testing tooling - in particular, what we could do if we had better, more self contained tools, not just targeted at xfstests, in particular a VM testrunner that could run kselftests too - and as I recall, your reaction was pretty much "why would I be interested in that? What does that do for me?" So yeah, I would call that a fail in leadership. Us filesystem people have the highest testing requirements and ought to know how to do this best, and if the poeple with the most experience aren't trying share that knowledge and experience in the form of collaborating on tooling, what the fuck are we even doing here? If I sound frustrated, it's because I am. > I've certainly had a lot more success with the "Business quid pro quo" > argument; fscrypt and fsverity was developed for Android and Chrome; > casefolding support benefited Android and Steam; ext4 fast commits was > targetted at cloud-based NFS and Samba serving, etc. Yeah, I keep hearing you talking about the product management angle and I have to call bullshit. There's a lot more to maintaining the health of projects in the long term than just selling features to customers. > Unfortunately, this effect fades over time. 
It's a lot easier to fund > multi-engineer projects which run for more than a year, when a company > is just starting out, and when it's still trying to attract upstream > developers, and it has a sizeable "investment" budget. ("IBM will > invest a billion dollars in Linux"). But then in later years, the > VP's have to justify their budget, and so companies tend to become > more and more "selfish". After all, that's how capitalism works --- > "think of the children^H^H^H^H^H^H^H shareholders!" This stuff doesn't have to be a huge multi-engineer-year project to get anything useful done. ktest has been a tiny side project for me. If I can turn that into a full-blown CI that runs arbitrary self-contained VM tests with quick turnaround and a nice git log UI, in my spare time, why can't we pitch in together, collaborate, and communicate a bit better, instead of each running in a different direction and bitching so much?
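The "test every commit, results in git log format" idea doesn't need much machinery to prototype; something in this spirit, where run-tests-on is a hypothetical stand-in for the build/boot/test step that does all the real work:

    #!/usr/bin/env bash
    # Test the newest 50 commits of a branch, newest to oldest, then print
    # pass/fail alongside the history.
    set -euo pipefail
    BRANCH=${1:-origin/master}
    mkdir -p results

    for commit in $(git rev-list -n 50 "$BRANCH"); do
        if run-tests-on "$commit"; then st=PASS; else st=FAIL; fi   # hypothetical helper
        echo "$st" > "results/$commit"
    done

    git log --oneline -n 50 "$BRANCH" | while read -r sha rest; do
        full=$(git rev-parse "$sha")
        printf '%s  %-4s %s\n' "$sha" "$(cat "results/$full" 2>/dev/null || echo '?')" "$rest"
    done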
On Sun, Jan 21, 2024 at 07:20:32AM -0500, Kent Overstreet wrote: > > Well, I've tried talking to you about improving our testing tooling - in > particular, what we could do if we had better, more self-contained > tools, not just targeted at xfstests, in particular a VM testrunner that > could run kselftests too - and as I recall, your reaction was pretty > much "why would I be interested in that? What does that do for me?" My reaction was to your proposal that I throw away my framework, which works super well for me, in favor of your favorite framework. My framework already supports blktests and the Phoronix Test Suite, and it would be a lot less work for me to add support for kselftests to {gce,kvm,android}-xfstests. The reality is that we all have test suites that are optimized for our workflows. Trying to get everyone to standardize on a single test framework is going to be hard, since they have been optimized for different use cases. Mine can be used both for local testing and for sharding across multiple Google Cloud VMs, has auto-bisection features, already supports blktests and PTS, and handles both x86 and arm64 with native and cross-compiling support. I'm certainly willing to work with others to improve my xfstests-bld. > So yeah, I would call that a failure in leadership. We filesystem people > have the highest testing requirements and ought to know how to do this > best, and if the people with the most experience aren't trying to share > that knowledge and experience in the form of collaborating on tooling, > what the fuck are we even doing here? I'm certainly willing to work with others, and I've accepted patches from other users of {kvm,gce,android}-xfstests. If you have something which is a strict superset of all of the features of xfstests-bld, I'm certainly willing to talk. I'm sure you have a system which works well for *you*. However, I'm much less interested in throwing away my invested effort for something that works well for me --- as well as for other users of xfstests-bld. (This includes other ext4 developers, Google's internal prodkernel for our data centers, and testing ext4 and xfs for Google's Cloud-Optimized OS distribution.) This is not a leadership failure; this is more like telling a Debian user to throw away their working system because you think Fedora is better, and "wouldn't it be better if we all used the same distribution"? > ktest has been a tiny side project for me. If I can turn that into a > full-blown CI that runs arbitrary self-contained VM tests with quick > turnaround and a nice git log UI, in my spare time, why can't we pitch > in together, collaborate, and communicate a bit better instead of each > running in different directions and bitching so much? xfstests-bld started as a side project for me as well, and has accumulated other users and contributors. Why can't you use my system instead? By your definition of "failure of leadership", you have clearly failed as well in not seeing the light and using *my* system. :-) - Ted
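For concreteness, the kvm-xfstests wrapper from xfstests-bld that Ted mentions is normally driven straight from the command line. The sketch below only wraps two commonly documented invocations - a quick smoke run, and an ext4 4k-block config against the "auto" group - in a small, hypothetical Python driver; no other flags or behaviour should be read into it.

  #!/usr/bin/env python3
  # Sketch only: drive two documented kvm-xfstests invocations from a tiny
  # wrapper.  kvm-xfstests itself comes from xfstests-bld; the wrapper and
  # its exit-status handling here are just illustrative.
  import subprocess
  import sys

  def run(args: list[str]) -> int:
      """Run one kvm-xfstests invocation and return its exit status."""
      print("+", " ".join(args))
      return subprocess.run(args).returncode

  def main() -> int:
      # Quick sanity run first, then a fuller ext4 4k-block config against
      # the "auto" test group.
      status = run(["kvm-xfstests", "smoke"])
      if status != 0:
          return status
      return run(["kvm-xfstests", "-c", "ext4/4k", "-g", "auto"])

  if __name__ == "__main__":
      sys.exit(main())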
On Sat, Jan 20, 2024 at 10:24:09PM -0500, Kent Overstreet wrote: > On Wed, Jan 17, 2024 at 06:19:43PM +0000, Mark Brown wrote: > > On Wed, Jan 17, 2024 at 08:03:35AM -0500, James Bottomley wrote: > > I think that's a *bit* pessimistic, at least for some areas of the > > kernel - there is commercial stuff going on with kernel testing with > > varying degrees of community engagement (eg, off the top of my head > > Baylibre, Collabora and Linaro all have offerings of various kinds that > > I'm aware of), and some of that does turn into investments in reusable > > things rather than proprietary stuff. I know that I look at the > > kernelci.org results for my trees, and that I've fixed issues I saw > > purely in there. kselftest is noticeably getting much better over time, > > and LTP is quite active too. The stuff I'm aware of is more focused > > around the embedded space than the enterprise/server space but it does > > exist. That's not to say that this is all well resourced and there's no > > problem (far from it), but it really doesn't feel like a complete dead > > loss either. > kselftest is pretty exciting to me; "collect all our integration tests > into one place and start to standardize on running them" is good stuff. > You seem to be pretty familiar with all the various testing efforts; I > wonder if you could talk about what you see that's interesting and > useful in the various projects? Well, I'm familiar with the bits I look at and some of the adjacent areas, but definitely not with the testing world as a whole. For tests themselves there are some generic suites like LTP and kselftest, plus a lot of domain-specific things which are widely used in their areas. Often the stuff that's separate either lives with something like a userspace library rather than being a purely kernel thing, or has some other special infrastructure needs. For lab orchestration there's at least: https://beaker-project.org/ https://github.com/labgrid-project/labgrid https://www.lavasoftware.org/ Beaker and LAVA are broadly similar in a parallel-evolution sort of way: scalable job scheduler/orchestration things intended for non-interactive use, with a lot of overlap in design choices. LAVA plays nicer with embedded boards, since Beaker comes from Red Hat and is focused more on server/PC type use cases, though I don't think there's anything fundamental there. Labgrid has a strong embedded focus, with facilities like integrating ancillary test equipment, and caters a lot more to interactive use than either of the other two, but AIUI doesn't help so much with batch usage, though that can be built on top. All of them can handle virtual targets as well as physical ones. All of these need something driving them to actually generate test jobs and present the results; as well as the larger projects, there are also people like Guenter Roeck and myself who run things that amuse us and report the results by hand. Of the bigger general-purpose orchestration projects, off the top of my head there are: https://github.com/intel/lkp-tests/blob/master/doc/faq.md https://cki-project.org/ https://kernelci.org/ https://lkft.linaro.org/ CKI and KernelCI are not a million miles apart: they both monitor a bunch of trees and run well-known test suites that they've integrated, and both have code available if you want to deploy your own thing (eg, for non-public stuff). They're looking at pooling their results into kcidb as part of the KernelCI LF project. 
Like 0day is proprietary to Intel, LKFT is proprietary to Linaro; LKFT has a focus on running a lot of tests on stable -rcs with manual reporting, though they do have some best-effort coverage of mainline and -next as well. There's also a bunch of people doing things specific to a given hardware type or other interest, often internal to a vendor, but for example Intel have some public CI for their graphics and audio: https://intel-gfx-ci.01.org/ https://github.com/thesofproject/linux/ (you can see the audio stuff doing its thing on the pull requests in the SOF repo.) The infra behind these is a bit task-specific AIUI; for example, the audio testing includes a lot of boards that don't have serial consoles or anything (eg, laptops), so it uses a fixed filesystem on the device, copies a kernel in, and uses grub-reboot to try it one time. They're particularly interesting because they're more actively tied to the development flow. The clang people have something too, using a GitHub flow: https://github.com/ClangBuiltLinux/continuous-integration2 (which does have some boots on virtual platforms as well as just build coverage.) > I think a lot of this stems from a lack of organization and a lack of > communication; I see a lot of projects reinventing things in slightly > different ways and failing to build off of each other. There's definitely some NIHing going on in places, but a lot of it comes from people with different needs or environments (like the Intel audio stuff I mentioned), or just from things already existing and nobody wanting to disrupt what they've got for a wholesale replacement. People are rarely working from nothing, and there's a bunch of communication and sharing of ideas going on. > > Some of the issues come from the different questions that people are > > trying to answer with testing, or the very different needs of the > > tests that people want to run - for example one of the reasons > > filesystems aren't particularly well covered for the embedded cases is > > that if your local storage is SD or worse eMMC then heavy I/O suddenly > > looks a lot more demanding and media durability becomes a real consideration. > Well, for filesystem testing we (mostly) don't want to be hammering on > an actual block device if we can help it - there are occasionally bugs > that will only manifest when you're testing on a device with realistic > performance characteristics, and we definitely want to be doing some > amount of performance testing on actual devices, but most of our testing > is best done in a VM where the scratch devices live entirely in DRAM on > the host. Sure, though there can be limitations with the amount of memory on a lot of these systems too! You can definitely do things, it's just not always ideal - for example, filesystem people will tend to default to test filesystems sized like the total memory of a lot of even modern embedded boards, so if nothing else you need to tune things down if you're going to do a memory-only test.
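As a sketch of what "tuning things down" might look like in practice, the snippet below sizes a scratch image from the machine's actual memory rather than assuming a desktop-sized default. The quarter-of-RAM policy, the 8 GiB cap, and the /tmp path are arbitrary assumptions for illustration, not anything a particular test suite requires.

  #!/usr/bin/env python3
  # Illustrative sketch: pick a scratch-filesystem size from the host's
  # actual memory so memory-backed test filesystems still fit on small
  # embedded boards.  Policy and paths are assumptions, not tool defaults.
  GIB = 1024 ** 3
  MIB = 1024 ** 2

  def mem_total_bytes() -> int:
      """Read MemTotal (reported in kB) from /proc/meminfo and return bytes."""
      with open("/proc/meminfo") as f:
          for line in f:
              if line.startswith("MemTotal:"):
                  return int(line.split()[1]) * 1024
      raise RuntimeError("MemTotal not found in /proc/meminfo")

  def make_sparse_image(path: str, size: int) -> None:
      """Create a sparse file that a VM or loop device could use as scratch space."""
      with open(path, "wb") as f:
          f.truncate(size)

  def main() -> None:
      # Cap at 8 GiB, but never take more than a quarter of RAM on small boards.
      size = min(8 * GIB, mem_total_bytes() // 4)
      make_sparse_image("/tmp/scratch.img", size)  # assumed location
      print(f"scratch image: /tmp/scratch.img ({size // MIB} MiB)")

  if __name__ == "__main__":
      main()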