Message ID | 20220421234837.3629927-7-kent.overstreet@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2,1/8] lib/printbuf: New data structure for heap-allocated strings | expand |
I still see absolutel no reason to bloat the kernel with a duplicate of the existing seq_buf functionality. Please use that and improve it where needed for your use case.
On Fri, Apr 22, 2022 at 06:20:17AM +0200, Christoph Hellwig wrote: > I still see absolutel no reason to bloat the kernel with a duplicate > of the existing seq_buf functionality. Please use that and improve it > where needed for your use case. Christoph, you have no problem making more work for me but I can't even get you to look at the bugs you introuduce in your refactorings that I report to you. Still waiting on you to look at oops you introduced in bio_copy_data_iter...
On Fri, Apr 22, 2022 at 01:14:48AM -0400, Kent Overstreet wrote: > Christoph, you have no problem making more work for me but I can't even get you I think you are misunderstanding this. You are trying to create more work for people maintainaing the kernel by creating duplicate infrastructure. The burden is always on the submitter. > to look at the bugs you introuduce in your refactorings that I report to you. > > Still waiting on you to look at oops you introduced in bio_copy_data_iter... I'm not sure why I shoud care about your out of tree code making assumptions about block layer helpers.
On Fri, Apr 22, 2022 at 07:22:08AM +0200, Christoph Hellwig wrote: > On Fri, Apr 22, 2022 at 01:14:48AM -0400, Kent Overstreet wrote: > > Christoph, you have no problem making more work for me but I can't even get you > > I think you are misunderstanding this. You are trying to create more > work for people maintainaing the kernel by creating duplicate > infrastructure. The burden is always on the submitter. > > > to look at the bugs you introuduce in your refactorings that I report to you. > > > > Still waiting on you to look at oops you introduced in bio_copy_data_iter... > > I'm not sure why I shoud care about your out of tree code making > assumptions about block layer helpers. Wasn't just bcachefs, it affected bcache too, as Coly also reported. And I wrote that code originally (and the whole fucking modern bvec iter infrastracture, mind you) so please don't lecture me on making assumptions on block layer helpers. Here's the thing, I think you and I have somewhat different approaches to engineering. Personaly, I find good engineering to be about tradeoffs, not absolutism, and not letting perfect be the enemy of good. So I'm honestly not super eager to start modifying tricky arch code that I can't test, and digging into what looked like non trivial interactions between the way the traceing code using seq_buf (naturally, given that's where it originates). I like to push out code that I have high confidence in, and the patch series I pushed out I do have confidence in, given that it's been in use for awhile and it's well tested in my tree. Now yes, I _could_ do a wholesale conversion of seq_buf to printbuf and delete that code, but doing that job right, to be confident that I'm not introducing bugs, is going to take more time than I really want to invest right now. I really don't like to play fast and loose with that stuff. And the reason getting this from you really irks me is that _practically every single time_ I trip over a something nasty when I rebase and I git bisect or blame it's something you did. I don't even bother reporting most of them to you. I don't want to be calling you out for the work you do because on the whole it's good and appreciated - I saw the patch series go by getting request_queue out of filesystem land, I'm happy that's getting done. But I've also seen the stuff you submit get _really_ churny at times for no good reason, and some really nasty, data corrupting bugs go by, so... Please chill out a bit if I'm not super in a rush to do it your way.
On Fri, Apr 22, 2022 at 01:40:15AM -0400, Kent Overstreet wrote: > Wasn't just bcachefs, it affected bcache too, as Coly also reported. Well, I've not seen a good bug report for that, but I'd gladly look at it. > And I wrote > that code originally (and the whole fucking modern bvec iter infrastracture, > mind you) so please don't lecture me on making assumptions on block layer > helpers. Well, most of what we have is really from Ming. Because your original idea was awesome, but the code didn't really fit. Then again I'm not sure why this even matters. I'm also relly not sure why you are getting so personal. > Now yes, I _could_ do a wholesale conversion of seq_buf to printbuf and delete > that code, but doing that job right, to be confident that I'm not introducing > bugs, is going to take more time than I really want to invest right now. I > really don't like to play fast and loose with that stuff. Even of that I'd rather see a very good reason first. seq_bufs have been in the kernel for a while and seem to work fine. If you think there are shortcomings please try to improve it, not replace or duplicate it. Sometimes there might be a good reason to replace exiting code, but it rather have to be a very good reason.
On Fri, Apr 22, 2022 at 07:52:14AM +0200, Christoph Hellwig wrote: > On Fri, Apr 22, 2022 at 01:40:15AM -0400, Kent Overstreet wrote: > > Wasn't just bcachefs, it affected bcache too, as Coly also reported. > > Well, I've not seen a good bug report for that, but I'd gladly look at it. Thanks. It's been awhile but I'll see if I can dig up the original bug report tomorrow. > > And I wrote > > that code originally (and the whole fucking modern bvec iter infrastracture, > > mind you) so please don't lecture me on making assumptions on block layer > > helpers. > > Well, most of what we have is really from Ming. Because your original > idea was awesome, but the code didn't really fit. Then again I'm not > sure why this even matters. Didn't fit how? And Ming extended it to multipage bvecs based on my proposal. > I'm also relly not sure why you are getting so personal. Put yourself my shoes, I've honestly found you to be hardheaded and exceedingly difficult to work with for a very long time. But I'll bite my tongue for now, because if you'll start listening to bug reports that will go a long way towards easing things, and we've got LSF coming up so maybe we can hash things out over beers. > > Now yes, I _could_ do a wholesale conversion of seq_buf to printbuf and delete > > that code, but doing that job right, to be confident that I'm not introducing > > bugs, is going to take more time than I really want to invest right now. I > > really don't like to play fast and loose with that stuff. > > Even of that I'd rather see a very good reason first. seq_bufs have been > in the kernel for a while and seem to work fine. If you think there are > shortcomings please try to improve it, not replace or duplicate it. > Sometimes there might be a good reason to replace exiting code, but it > rather have to be a very good reason. When the new version is semantically different from the old version it makes it a lot easier to deal with the merge conflicts later when forward/backporting stuff by giving the new version a new name. Anyways, I'll have a chat with Steven Rostedt about it since I believe he wrote the original code.
On Fri, Apr 22, 2022 at 02:06:44AM -0400, Kent Overstreet wrote: > > Well, most of what we have is really from Ming. Because your original > > idea was awesome, but the code didn't really fit. Then again I'm not > > sure why this even matters. > > Didn't fit how? And Ming extended it to multipage bvecs based on my proposal. He did all the actual hard work to get it ready to merge and to work everywhere. As in he stuck around and actually finished the project based on your design. > > > I'm also relly not sure why you are getting so personal. > > Put yourself my shoes, I've honestly found you to be hardheaded and exceedingly > difficult to work with for a very long time. Thanks, but I've been walking these shoes for a while..
On Fri, Apr 22, 2022 at 08:11:52AM +0200, Christoph Hellwig wrote: > On Fri, Apr 22, 2022 at 02:06:44AM -0400, Kent Overstreet wrote: > > > Well, most of what we have is really from Ming. Because your original > > > idea was awesome, but the code didn't really fit. Then again I'm not > > > sure why this even matters. > > > > Didn't fit how? And Ming extended it to multipage bvecs based on my proposal. > > He did all the actual hard work to get it ready to merge and to work > everywhere. As in he stuck around and actually finished the project > based on your design. Not sure why you need to keep throwing shade, but.. > > > I'm also relly not sure why you are getting so personal. > > > > Put yourself my shoes, I've honestly found you to be hardheaded and exceedingly > > difficult to work with for a very long time. > > Thanks, but I've been walking these shoes for a while.. *snort* Yeah, I know I am too :)
On Fri, 22 Apr 2022 01:40:15 -0400 Kent Overstreet <kent.overstreet@gmail.com> wrote: > So I'm honestly not super eager to start modifying tricky arch code that I can't > test, and digging into what looked like non trivial interactions between the way > the traceing code using seq_buf (naturally, given that's where it originates). Yes, seq_buf came from the tracing system but was to be used in a more broader way. I had originally pushed trace_seq into the lib directory, but Andrew Morton said it was too specific to tracing. Thus, I gutted the generic parts out of it and created seq_buf, which looks to be something that you could use. I had patches to convert seq_file to it, but ran out of time. I probably can pull them out of the closet and start that again. > > Now yes, I _could_ do a wholesale conversion of seq_buf to printbuf and delete > that code, but doing that job right, to be confident that I'm not introducing > bugs, is going to take more time than I really want to invest right now. I > really don't like to play fast and loose with that stuff. I would be happy to work with you to convert to seq_buf. If there's something missing from it, I can help you change it so that it doesn't cause any regressions with the tracing subsystem. This is how open source programming is suppose to work ;-) -- Steve
Hi Steve! On Fri, Apr 22, 2022 at 11:37:36AM -0400, Steven Rostedt wrote: > On Fri, 22 Apr 2022 01:40:15 -0400 > Kent Overstreet <kent.overstreet@gmail.com> wrote: > > > So I'm honestly not super eager to start modifying tricky arch code that I can't > > test, and digging into what looked like non trivial interactions between the way > > the traceing code using seq_buf (naturally, given that's where it originates). > > Yes, seq_buf came from the tracing system but was to be used in a more > broader way. I had originally pushed trace_seq into the lib directory, but > Andrew Morton said it was too specific to tracing. Thus, I gutted the > generic parts out of it and created seq_buf, which looks to be something > that you could use. I had patches to convert seq_file to it, but ran out of > time. I probably can pull them out of the closet and start that again. > > > > > Now yes, I _could_ do a wholesale conversion of seq_buf to printbuf and delete > > that code, but doing that job right, to be confident that I'm not introducing > > bugs, is going to take more time than I really want to invest right now. I > > really don't like to play fast and loose with that stuff. > > I would be happy to work with you to convert to seq_buf. If there's > something missing from it, I can help you change it so that it doesn't > cause any regressions with the tracing subsystem. > > This is how open source programming is suppose to work ;-) Is it though? :) One of the things I've been meaning to talk more about, that came out of a recent Rust discussion, is that we in the kernel community could really do a better job with how we interact with the outside world, particularly with regards to the sharing of code. The point was made to me when another long standing kernel dev was complaining about Facebook being a large, insular, difficult to work with organization, that likes to pretend it is the center of the universe and not bend to the outside world, while doing the exact same thing with respect to new concerns brought by the Rust community. The irony was illuminating :) The reason I bring that up is that in this case, printbuf is the more evolved, more widely used implementation, and you're asking me to discard it so the kernel can stick with its more primitive, less widely used implementation. $ git grep -w seq_buf|wc -l 86 $ git grep -w printbuf|wc -l 366 So, going to have to push back on that one :) Printbufs aren't new code; everything in them is there because I've found it valuable, which is why I decided to try promoting them to the kernel proper (and more importantly, the idea of a standard way to pretty-print anything). I'm happy to discuss the merits of the code more, and try to convince you why you'll like them :)
On Fri, 22 Apr 2022 15:30:15 -0400 Kent Overstreet <kent.overstreet@gmail.com> wrote: > > This is how open source programming is suppose to work ;-) > > Is it though? :) > > One of the things I've been meaning to talk more about, that > came out of a recent Rust discussion, is that we in the kernel community could > really do a better job with how we interact with the outside world, particularly > with regards to the sharing of code. > > The point was made to me when another long standing kernel dev was complaining > about Facebook being a large, insular, difficult to work with organization, that > likes to pretend it is the center of the universe and not bend to the outside > world, while doing the exact same thing with respect to new concerns brought by > the Rust community. The irony was illuminating :) I do not consider Facebook an open source company. One reason I turned them down. > > The reason I bring that up is that in this case, printbuf is the more evolved, > more widely used implementation, and you're asking me to discard it so the > kernel can stick with its more primitive, less widely used implementation. > > $ git grep -w seq_buf|wc -l > 86 > > $ git grep -w printbuf|wc -l > 366 $ git grep printbuf drivers/media/i2c/ccs/ccs-reg-access.c: char printbuf[(MAX_WRITE_LEN << 1) + drivers/media/i2c/ccs/ccs-reg-access.c: bin2hex(printbuf, regdata, msg.len); drivers/media/i2c/ccs/ccs-reg-access.c: regs->addr + j, printbuf); I don't see it. And by your notion: $ git grep trace_seq | wc -l 1680 Thus we all should be using trace_seq! > > So, going to have to push back on that one :) > > Printbufs aren't new code; everything in them is there because I've found it > valuable, which is why I decided to try promoting them to the kernel proper (and > more importantly, the idea of a standard way to pretty-print anything). > > I'm happy to discuss the merits of the code more, and try to convince you why > you'll like them :) I'd like to know more to why seq_buf is not good for you. And just telling me that you never seriously tried to make it work because you were afraid of causing tracing regressions without ever asking the tracing maintainer is not going to cut it. -- Steve
On Fri, 2022-04-22 at 15:30 -0400, Kent Overstreet wrote: > Hi Steve! > > On Fri, Apr 22, 2022 at 11:37:36AM -0400, Steven Rostedt wrote: > > On Fri, 22 Apr 2022 01:40:15 -0400 > > Kent Overstreet <kent.overstreet@gmail.com> wrote: [...] > > > Now yes, I _could_ do a wholesale conversion of seq_buf to > > > printbuf and delete that code, but doing that job right, to be > > > confident that I'm not introducing bugs, is going to take more > > > time than I really want to invest right now. I really don't like > > > to play fast and loose with that stuff. > > > > I would be happy to work with you to convert to seq_buf. If there's > > something missing from it, I can help you change it so that it > > doesn't cause any regressions with the tracing subsystem. > > > > This is how open source programming is suppose to work ;-) > > Is it though? :) > > One of the things I've been meaning to talk more about, that > came out of a recent Rust discussion, is that we in the kernel > community could really do a better job with how we interact with the > outside world, particularly with regards to the sharing of code. > > The point was made to me when another long standing kernel dev was > complaining about Facebook being a large, insular, difficult to work > with organization, that likes to pretend it is the center of the > universe and not bend to the outside world, while doing the exact > same thing with respect to new concerns brought by the Rust > community. The irony was illuminating :) Hey, I didn't say that at all. I said vendoring the facebook reference implementation wouldn't work (it being 74k lines and us using 300) but that facebook was doing the right thing for us with zstd because they were maintaining the core code we needed, even if we couldn't vendor it from their code base: https://lore.kernel.org/rust-for-linux/ea85b3bce5f172dc73e2be8eb4dbd21fae826fa1.camel@HansenPartnership.com/ You were the one who said all that about facebook, while incorrectly implying I said it first (which is an interesting variation on the strawman fallacy): https://lore.kernel.org/rust-for-linux/20220415203926.pvahugtzrg4dbhcc@moria.home.lan/ James
On Fri, Apr 22, 2022 at 03:39:16PM -0400, Steven Rostedt wrote: > I do not consider Facebook an open source company. One reason I turned them > down. Surely you can see how NIH syndrome isn't something that just happens at closed-source companies? How a default cultural assumption of "we do things the way we've always done" leads to things getting insular? > > The reason I bring that up is that in this case, printbuf is the more evolved, > > more widely used implementation, and you're asking me to discard it so the > > kernel can stick with its more primitive, less widely used implementation. > > > > $ git grep -w seq_buf|wc -l > > 86 > > > > $ git grep -w printbuf|wc -l > > 366 > > $ git grep printbuf > drivers/media/i2c/ccs/ccs-reg-access.c: char printbuf[(MAX_WRITE_LEN << 1) + > drivers/media/i2c/ccs/ccs-reg-access.c: bin2hex(printbuf, regdata, msg.len); > drivers/media/i2c/ccs/ccs-reg-access.c: regs->addr + j, printbuf); > > I don't see it. Here: https://evilpiepirate.org/git/bcachefs.git/ It may not be merged yet, but it is actively developed open source code with active users that's intended to be merged! > I'd like to know more to why seq_buf is not good for you. And just telling > me that you never seriously tried to make it work because you were afraid > of causing tracing regressions without ever asking the tracing maintainer > is not going to cut it. I didn't know about seq_buf until a day or two ago, that's literally all it was. And Steve, apologies if I've come across as being a dick about this, that wasn't my intent. I've got nothing against you or your code - I'd love it if we could just have a discussion about them on their merits, and if it feels like I'm making an issue about this unnecessarily that's because I think there's something about kernel process and culture worth improving that I want to raise, so I'm sticking my neck out a bit here. So here's the story of how I got from where seq_buf is now to where printbuf is now: - Printbuf started out as almost an exact duplicate of seq_buf (like I said, not intentionally), with an external buffer typically allocated on the stack. - As error/log messages got to be bigger and more structured, stack usage eventually became an issue, so eventually I added the heap allocations. - This made them a lot more convenient to use, and made possible entirely new ways of using them - so I started using them more, and converting everything that outputted to strings to them. - This lead to the realization that when pretty-printers are easy and convenient to write, that leads to writing pretty-printers for _more_ stuff, which makes it easy to stay in the habit of adding anything relevant to sysfs/debugfs - and log/error messages got a _whole_ lot better when I realized instead of writing format strings for every basic C type I can just use the .to_text() methods of the high level objects I'm working with. Basically, my debugging life has gotten _drastically_ easier because of this change in process and approach - deadlocks that I used to have to attach a debugger for are now trivial because all the relevant state is in debugfs and greppable, and filesystem inconsistencies that used to suck to debug I now just take what's in the error message and grep through the journal for. I can't understate how invaluable all this stuff has been, and I'm excited to take the lessons I've learned and apply them to the wider kernel and make other people's lives easier too. The shrinkers-OOM-reporting patch was an obvious starting point because - shrinkers have internal state that's definitely worth reporting on - we shouldn't just be logging this on OOM, we should also make this available in sysfs or debugfs. Feature wise, printbufs have: - heap allocation - extra state for formatting: indent level, tabstops, and a way of specifying units. That's basically it. Heap allocation adds very little code and eliminates a _lot_ of headaches in playing the "how much do I need to/can I put on the stack" game, and you'll want the formatting options as soon as you start formatting multi line output and writing pretty printers that call other pretty printers.
On Fri, 22 Apr 2022 16:30:57 -0400 Kent Overstreet <kent.overstreet@gmail.com> wrote: > So here's the story of how I got from where seq_buf is now to where printbuf is > now: > > - Printbuf started out as almost an exact duplicate of seq_buf (like I said, > not intentionally), with an external buffer typically allocated on the stack. Basically seq_buf is designed to be used as an underlining infrastructure. That's why it does not allocate any buffer and leaves that to the user cases. Hence, trace_seq() allocates a page for use, and I even use seq_buf in user space to dynamically allocate when needed. > > - As error/log messages got to be bigger and more structured, stack usage > eventually became an issue, so eventually I added the heap allocations. Which is something you could do on top of seq_buf. Point being, you do not need to re-implement printbuf, and I have not looked at the code, but instead, implement printbuf on top of seq_buf, and extend seq_buf where needed. Like trace_seq does, and the patches I have for seq_file would do. It would leave the string processing and buffer space management to seq_buf, as there's ways to see "oh, we need more space, let's allocate more" and then increase the heap. > > - This made them a lot more convenient to use, and made possible entirely new > ways of using them - so I started using them more, and converting everything > that outputted to strings to them. > > - This lead to the realization that when pretty-printers are easy and > convenient to write, that leads to writing pretty-printers for _more_ stuff, > which makes it easy to stay in the habit of adding anything relevant to > sysfs/debugfs - and log/error messages got a _whole_ lot better when I > realized instead of writing format strings for every basic C type I can just > use the .to_text() methods of the high level objects I'm working with. > > Basically, my debugging life has gotten _drastically_ easier because of this > change in process and approach - deadlocks that I used to have to attach a > debugger for are now trivial because all the relevant state is in debugfs and > greppable, and filesystem inconsistencies that used to suck to debug I now just > take what's in the error message and grep through the journal for. > > I can't understate how invaluable all this stuff has been, and I'm excited to > take the lessons I've learned and apply them to the wider kernel and make other > people's lives easier too. > > The shrinkers-OOM-reporting patch was an obvious starting point because > - shrinkers have internal state that's definitely worth reporting on > - we shouldn't just be logging this on OOM, we should also make this available > in sysfs or debugfs. > > Feature wise, printbufs have: > - heap allocation > - extra state for formatting: indent level, tabstops, and a way of specifying > units. > > That's basically it. Heap allocation adds very little code and eliminates a > _lot_ of headaches in playing the "how much do I need to/can I put on the stack" > game, and you'll want the formatting options as soon as you start formatting > multi line output and writing pretty printers that call other pretty printers. I would be more willing to accept a printbuf, if it was built on top of seq_buf. That is, you don't need to change all your user cases, you just need to make printbuf an extension of seq_buf by using it underneath, like trace_seq does. Then it would not be re-inventing the wheel, but just building on top of it. -- Steve
On Fri, Apr 22, 2022 at 04:03:12PM -0400, James Bottomley wrote: > Hey, I didn't say that at all. I said vendoring the facebook reference > implementation wouldn't work (it being 74k lines and us using 300) but > that facebook was doing the right thing for us with zstd because they > were maintaining the core code we needed, even if we couldn't vendor it > from their code base: > > https://lore.kernel.org/rust-for-linux/ea85b3bce5f172dc73e2be8eb4dbd21fae826fa1.camel@HansenPartnership.com/ > > You were the one who said all that about facebook, while incorrectly > implying I said it first (which is an interesting variation on the > strawman fallacy): Hey, sorry for picking on you James - I didn't mean to single you out. Let me explain where I'm coming from - actually, this email I received puts it better than I could: From: "John Ericson" <mail@johnericson.me> To: "Kent Overstreet" <kent.overstreet@gmail.com> Subject: Nice email on cargo crates in Linux Hi. I saw your email linked from https://lwn.net/Articles/889924/. Nicely put! I was involved in some of the allocating try_* function work to avoid nasty panics, and thus make sure Rust in Linux didn't have to reinvent the wheel redoing the alloc library. I very much share your goals but suspect this will happen in a sort of boil-the-frog way over time. The basic portability method of "incentivize users to write code that's *more* portable than what they currently need" is a culture shock for user space and kernel space alike. But if we get all our ducks in a row on the Rust side, eventually it should just be "too tempting", and the code sharing will happen for economic reasons (rather than ideological ones about trying to smash down the arbitrary dichotomies :) going it alone). The larger issue is major projects closed off unto themselves such as Linux, PostgreSQL, or even the Glasgow Haskell Compiler (something I am currently writing a plan to untangle) have fallen too deep in Conway's law's embrace, and thus are full of bad habits like https://johno.com/composition-over-configuration is arguing against. Over time, I hope using external libraries will not only result in less duplicated work, but also "pry open" their architecture a bit, tamping down on crazy configurability and replacing it with more principled composition. I fear this is all way too spicy to bring up on the LKML in 2022, at least coming from someone like me without any landed kernel patches under his belt, but I wanted to let you know other people have similar thoughts. Cheers, John ------- Re: Rust in Linux, I get frustrated when I see senior people tell the new Rust people "don't do that" to things that are standard practice in the outside world. I think Linus said recently that Rust in the kernel is something that could fail, and he's right - but if it fails, it won't just be the failure of the Rust people to do the required work, it'll be _our_ failure too, a failure to work with them. And I think the kernel needs Rust, in the long term. Rust is the biggest advance towards programming languages with embedded correctness proofs in the systems space since ever, and that's huge. It's only a first step, it doesn't solve everything - the next step will probably be dependent types (as in Idris); after that it's hard to say. But maybe 50, 100 years out, systems programming is going to look very different. And that excites me! What I love about programming is being able to sink into something new, difficult and unsolved, and let my mind wander and explore and play with ideas until I come across a simple and elegant way of solving it. What I hate about programming is starting to do that, and then an hour in getting interrupted by a bug report in an area of the code that I should have been done with. And then again. And then again. But that's just the nature of the beast in C, with the current state of the art of the tools we have. Proving C programs correct can be done (I understand Paul McKenney is on the bleeding edge of that), but it's too expensive and time consuming to be practical for most things. We need better languages. And we've got people trying to work with us, so I hope we work with them and make it happen. That's going to require us to update our thinking in some areas - like with cargo. C people are used to projects being monoliths, but in the modern world things that in prior days would have required hacking the compiler are now done as libraries - and we're going to want that code! We don't need to be reimplementing rust bitvecs, or string parsing, or - I could make up a long, long list. That code is all going to just work in kernel land, and if we adopt it it'll lead to a lot less duplication of effort. Also, what John said in that email about "large projects closed off unto themselves" - he's spot on about that. When I was at Google, their was this deep unspoken assumption that anything done at Google was better than anything in the outside world, and it _rankled_. It was a major reason I departed. The kernel community has a lot of that going on here. Again, sorry to pick on you James, but I wanted to make the argument that - maybe the kernel _should_ be adopting a more structured way of using code from outside repositories, like cargo, or git submodules (except I've never had a positive experience with git submodules, so ignore that suggestion, unless I've just been using them wrong, in which case someone please teach me). To read you and Greg saying "nah, just copy code from other repos, it's fine" - it felt like being back in the old days when we were still trying to get people to use source control, and having that one older colleague who _insisted_ on not using source control of any kind, and that's a bit disheartening. I just want to make both the kernel and the wider world of open source software a better place, and the thing I'm itching to see get better is that currently it's a massive hassle to share code between kernel space and user space. This is understandable, because before git we definitely didn't have the tooling, and in C this kind of code sharing has historically been questionable. But: - We've got a new language coming that makes this sort of thing a lot saner, both from a source code management perspective (cargo) and with much stronger language guarantees that code won't do something surprising on you - Even in C, we're gaining a better appreciation for why someone would want to do such a thing. Another random case in point: liblwipv6, a userspace TCP/IP library, lets you do really cool things with building/simulating networks entirely in userspace without root access, just over unix domain sockets. But it's a completely new implementation, and thus buggy and incomplete. Imagine if, in some far away future, we could someday just reuse the whole Linux network stack in userspace, for purposes undreamed of by the likes of us... And also, personally I find that code gets better when you have to spend the effort to clean it up, make it more modular, and separate it from the wider project - that's when you find out what it's really about. There's a ton of _really effing cool code_ in the kernel that could be seeing wider use in userspace, too (workqueues!). And, like with pretty printers, once you start down a good path it often begets more of itself - modularity will lead to more modularity. If we start making thinks like workqueues more modular and also usable as a userspace library, that's going to make it easier to port more kernel code, and more... which will make writing C in userspace a nicer experience, and maybe even lead to more code sharing in the reverse direction.. Anyways, just one man's dream :)
On Fri, Apr 22, 2022 at 04:47:44PM -0400, Steven Rostedt wrote: > Which is something you could do on top of seq_buf. Point being, you do not > need to re-implement printbuf, and I have not looked at the code, but > instead, implement printbuf on top of seq_buf, and extend seq_buf where > needed. Like trace_seq does, and the patches I have for seq_file would do. > It would leave the string processing and buffer space management to > seq_buf, as there's ways to see "oh, we need more space, let's allocate > more" and then increase the heap. That sounds like it could work. > I would be more willing to accept a printbuf, if it was built on top of > seq_buf. That is, you don't need to change all your user cases, you just > need to make printbuf an extension of seq_buf by using it underneath, like > trace_seq does. Then it would not be re-inventing the wheel, but just > building on top of it. Hmm... At first glance, redoing printbuf on top of seq_buf looks like it would save a pretty trivial amount of code - and my engineering taste these days leans more toward less layering if it's only slightly more code; I think I might like printbuf and seq_buf to stay separate things (and both of them are pretty small). But it's definitely not an unreasonable idea - I can try it out and see how it turns out. Would you have any objections to making some changes to seq_buf? - You've got size and len as size_t, I've got them as unsigned. Given that we need to be checking for overflow anyways for correctens, I like having them as u32s. - seq_buf->readpos - it looks like this is only used by seq_buf_to_user(), does it need to be in seq_buf? - in printbufs, I make sure the buffer is always nul-terminated - seems simplest, given that we need to make sure there's always room for the terminating nul anyways. A downside of having printbuf on top of seq_buf is that now we've got two apis that functions can output to - vs. if we modified printbuf by adding a flag for "this is an external buffer, don't reallocate it". That approach would be less code overall, for sure. Could I get you to look over printbuf and share your thoughts on the different approaches?
On Fri, 22 Apr 2022 17:51:46 -0400 Kent Overstreet <kent.overstreet@gmail.com> wrote: > > But it's definitely not an unreasonable idea - I can try it out and see how it > turns out. Would you have any objections to making some changes to seq_buf? No I don't mind, and that's why I want the coupled, as enhancements or bug fixes would happen to both. > > - You've got size and len as size_t, I've got them as unsigned. Given that we > need to be checking for overflow anyways for correctens, I like having them > as u32s. I had it as size_t as I had planned (and still plan to) make seq_file use seq_buf, and seq_file uses size_t. Who knows, perhaps in the future, we may have strings that are more than 4GBs. ;-) > - seq_buf->readpos - it looks like this is only used by seq_buf_to_user(), does > it need to be in seq_buf? Perhaps. > - in printbufs, I make sure the buffer is always nul-terminated - seems > simplest, given that we need to make sure there's always room for the > terminating nul anyways. I'm not against that. It was an optimization, but I never actually benchmarked it. But I'm not sure how many fast paths it is used in to warrant that kind of optimization over the complexity it can bring for users. > > A downside of having printbuf on top of seq_buf is that now we've got two apis > that functions can output to - vs. if we modified printbuf by adding a flag for > "this is an external buffer, don't reallocate it". That approach would be less > code overall, for sure. > > Could I get you to look over printbuf and share your thoughts on the different > approaches? Sure, but will have to wait till next week. -- Steve
[change of subject for an easy thread kill, since I'm sure most people on cc don't want to follow this] On Fri, 2022-04-22 at 17:13 -0400, Kent Overstreet wrote: > On Fri, Apr 22, 2022 at 04:03:12PM -0400, James Bottomley wrote: > > Hey, I didn't say that at all. I said vendoring the facebook > > reference implementation wouldn't work (it being 74k lines and us > > using 300) but that facebook was doing the right thing for us with > > zstd because they were maintaining the core code we needed, even if > > we couldn't vendor it from their code base: > > > > https://lore.kernel.org/rust-for-linux/ea85b3bce5f172dc73e2be8eb4dbd21fae826fa1.camel@HansenPartnership.com/ > > > > You were the one who said all that about facebook, while > > incorrectly implying I said it first (which is an interesting > > variation on the strawman fallacy): > > Hey, sorry for picking on you James - I didn't mean to single you > out. > > Let me explain where I'm coming from - actually, this email I > received puts it better than I could: > > From: "John Ericson" <mail@johnericson.me> > To: "Kent Overstreet" <kent.overstreet@gmail.com> > Subject: Nice email on cargo crates in Linux > > Hi. I saw your email linked from https://lwn.net/Articles/889924/. > Nicely put! > > I was involved in some of the allocating try_* function work to avoid > nasty panics, and thus make sure Rust in Linux didn't have to > reinvent the wheel redoing the alloc library. I very much share your > goals but suspect this will happen in a sort of boil-the-frog way > over time. The basic portability method of "incentivize users to > write code that's *more* portable than what they currently > need" is a culture shock for user space and kernel space alike. But > if we get all our ducks in a row on the Rust side, eventually it > should just be "too tempting", and the code sharing will happen for > economic reasons (rather than ideological ones about trying to smash > down the arbitrary dichotomies :) going it alone). > > The larger issue is major projects closed off unto themselves such as > Linux, PostgreSQL, or even the Glasgow Haskell Compiler (something I > am currently writing a plan to untangle) have fallen too deep in > Conway's law's embrace, and thus are full of bad habits like > https://johno.com/composition-over-configuration is arguing against. > Over time, I hope using external libraries will not only result in > less duplicated work, but also "pry open" their architecture a bit, > tamping down on crazy configurability and replacing it with more > principled composition. > > I fear this is all way too spicy to bring up on the LKML in 2022, at > least coming from someone like me without any landed kernel patches > under his belt, but I wanted to let you know other people have > similar thoughts. > > Cheers, > > John > > ------- > > Re: Rust in Linux, I get frustrated when I see senior people tell the > new Rust people "don't do that" to things that are standard practice > in the outside world. You stripped the nuance of that. I said many no_std crates could be used in the kernel. I also said that the async crate couldn't because the rust compiler itself would have to support the kernel threading model. > I think Linus said recently that Rust in the kernel is something that > could fail, and he's right - but if it fails, it won't just be the > failure of the Rust people to do the required work, it'll be _our_ > failure too, a failure to work with them. The big risk is that rust needs to adapt to the kernel environment. This isn't rust specific, llvm had similar issues as an alternative C compiler. I think rust in the kernel would fail if it were only the rust kernel people asking. Fortunately the pressure to support rust in embedded leading to the rise in no_std crates is a force which can also get rust in the kernel over the finish line because of the focus it puts on getting the language and crates to adapt to non standard environments. [...] > The kernel community has a lot of that going on here. Again, sorry to > pick on you James, but I wanted to make the argument that - maybe the > kernel _should_ be adopting a more structured way of using code from > outside repositories, like cargo, or git submodules (except I've > never had a positive experience with git submodules, so ignore that > suggestion, unless I've just been using them wrong, in which case > someone please teach me). To read you and Greg saying "nah, just > copy code from other repos, it's fine" - it felt like being back in > the old days when we were still trying to get people to use source > control, and having that one older colleague who _insisted_ on not > using source control of any kind, and that's a bit disheartening. Even in C terms, the kernel is a nostdlib environment. If a C project has too much libc dependency it's not going to work directly in the kernel, nor should it. Let's look at zstd (which is pretty much a nostdlib project) as a great example: the facebook people didn't actually port the top of their tree (1.5) to the kernel, they backported bug fixes to the 1.4 branch and made a special release (1.4.10) just for us. Why did they do this? It was because the 1.5 version vastly increased stack use to the extent it would run off the end of the limited kernel stack so couldn't be ported directly into the kernel. A lot of C libraries that are nostdlib have problems like this as well (you can use recursion, but not in the kernel). There's no easy way of shimming environmental constraints like this. The lesson: it is possible to make the core of a project mobile, but only if you're aware of all the environmental constraints it will run into as it gets ported. The list of possible environments is huge: kernel, embedded, industrial control ..., so naturally not every (or more accurately hardly any) project wants to do this. James
On Sat, Apr 23, 2022 at 10:16:37AM -0400, James Bottomley wrote: > You stripped the nuance of that. I said many no_std crates could be > used in the kernel. I also said that the async crate couldn't because > the rust compiler itself would have to support the kernel threading > model. I just scanned through that thread and that's not what you said. What you said was: > The above is also the rust crate problem in miniature: the crates grow > API features the kernel will never care about and importing them > wholesale is going to take forever because of the internal kernel > support issue. In the end, to take rust async as an example, it will > be much better to do for rust what we've done for zlib: take the core > that can support the kernel threading model and reimplement that in the > kernel crate. The act of doing that will a) prove people care enough > about the functionality and b) allow us to refine it nicely. > > I also don't think rust would really want to import crates wholesale. > The reason for no_std is that rust is trying to adapt to embedded > environments, which the somewhat harsh constraints of the kernel is > very similar to. But maybe your position has changed somewhat? It sounds like you've been arguing against just directly depending on foreign reposotories and for the staus quo of just ad-hoc copying of code. I'll help by stating my own position: I think we should be coming up with a process for how dependencies on other git repositories are going to work, something better than just cut and paste. Whether or not we vendorize code isn't really that important, but I'd say that if we are vendorizing code and we're not including entire sub-repositories (like cargo vendor does) we ought to still make this a scripted process that takes as an input a list of files we're pulling and a remote repository we're pulling from, and the file list and the remote repo (and commit ID we're pulling from) should all be checked in. I think using cargo would be _great_ because it would handle this part for us (perhaps minus pulling of individual files? haven't checked) instead of home-growing our own. However, I'd like something for C repositories too, and I don't think we want to start depending on cargo for non Rust development... :) But much more important that the technical details of how we import code is just having good answers for people who aren't embedded in Linux kernel development culture, so we aren't just telling them "no" by default because we haven't thought about this stuff and having them walk away frustrated. > > I think Linus said recently that Rust in the kernel is something that > > could fail, and he's right - but if it fails, it won't just be the > > failure of the Rust people to do the required work, it'll be _our_ > > failure too, a failure to work with them. > > The big risk is that rust needs to adapt to the kernel environment. > This isn't rust specific, llvm had similar issues as an alternative C > compiler. I think rust in the kernel would fail if it were only the > rust kernel people asking. Fortunately the pressure to support rust in > embedded leading to the rise in no_std crates is a force which can also > get rust in the kernel over the finish line because of the focus it > puts on getting the language and crates to adapt to non standard > environments. It's both! It's on all of us to make this work. > > The kernel community has a lot of that going on here. Again, sorry to > > pick on you James, but I wanted to make the argument that - maybe the > > kernel _should_ be adopting a more structured way of using code from > > outside repositories, like cargo, or git submodules (except I've > > never had a positive experience with git submodules, so ignore that > > suggestion, unless I've just been using them wrong, in which case > > someone please teach me). To read you and Greg saying "nah, just > > copy code from other repos, it's fine" - it felt like being back in > > the old days when we were still trying to get people to use source > > control, and having that one older colleague who _insisted_ on not > > using source control of any kind, and that's a bit disheartening. > > Even in C terms, the kernel is a nostdlib environment. If a C project > has too much libc dependency it's not going to work directly in the > kernel, nor should it. Let's look at zstd (which is pretty much a > nostdlib project) as a great example: the facebook people didn't > actually port the top of their tree (1.5) to the kernel, they > backported bug fixes to the 1.4 branch and made a special release > (1.4.10) just for us. Why did they do this? It was because the 1.5 > version vastly increased stack use to the extent it would run off the > end of the limited kernel stack so couldn't be ported directly into the > kernel. A lot of C libraries that are nostdlib have problems like this > as well (you can use recursion, but not in the kernel). There's no > easy way of shimming environmental constraints like this. I wonder if we might have come up with a better solution if there'd been more cross-project communication and less siloing. Small stacks aren't particular to the kernel - it's definitely not unheard of to write userspace code where you want to have a lot of small stacks (especially if you're doing some kind of coroutine style threading; I've done stuff like this in the past) - and to me, as someone who's been incrementing on and maintaining a codebase in active use for 10 years, having multiple older versions in active use that need bugfixes gives me cold shivers. I wouldn't be surprised if at some point the zstd people walk back some of their changes or make it configurable at some point :)
On Thu, 2022-04-21 at 19:48 -0400, Kent Overstreet wrote: > This adds printbufs: simple heap-allocated strings meant for building up > structured messages, for logging/procfs/sysfs and elsewhere. They've > been heavily used in bcachefs for writing .to_text() functions/methods - > pretty printers, which has in turn greatly improved the overall quality > of error messages. > > Basic usage is documented in include/linux/printbuf.h. Given the maximum printk output is less than 1024 bytes, why should this be allowed to be larger than that or larger than PAGE_SIZE? > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human > + * readable units. Why not extend vsprintf for this using something like %pH[8|16|32|64] or %pH[c|s|l|ll|uc|us|ul|ull] ?
On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote: > On Thu, 2022-04-21 at 19:48 -0400, Kent Overstreet wrote: > > This adds printbufs: simple heap-allocated strings meant for building up > > structured messages, for logging/procfs/sysfs and elsewhere. They've > > been heavily used in bcachefs for writing .to_text() functions/methods - > > pretty printers, which has in turn greatly improved the overall quality > > of error messages. > > > > Basic usage is documented in include/linux/printbuf.h. > > Given the maximum printk output is less than 1024 bytes, why should > this be allowed to be larger than that or larger than PAGE_SIZE? It's not just used there - in bcachefs I use it for sysfs & debugfs, as well as userspace code for e.g. printing out the superblock (which gets pretty big when including all the variable length sections). > > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human > > + * readable units. > > Why not extend vsprintf for this using something like %pH[8|16|32|64] > or %pH[c|s|l|ll|uc|us|ul|ull] ? It'd be incompatible with userspace printf. I do like the way we extend printf is the kernel, but I'm trying to make sure the code I write now is by default portable between both kernel space and userspace. Glibc has its own mechanism for extending printf, I've been meaning to look at that more and see if it'd be possible to do something more generic and extensible that works for both.
On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote: > > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human > > + * readable units. > > Why not extend vsprintf for this using something like %pH[8|16|32|64] > or %pH[c|s|l|ll|uc|us|ul|ull] ? The %pX extension we have is _cute_, but ultimately a bad idea. It centralises all kinds of unrelated things in vsprintf.c, eg bdev_name() and clock() and ip_addr_string(). Really, it's working around that we don't have something like Java's StringBuffer (which I see both seq_buf and printbuf as attempting to be). So we have this primitive format string hack instead of exposing methods like: void dentry_string(struct strbuf *, struct dentry *); as an example, if (unlikely(ino == dir->i_ino)) { EXT4_ERROR_INODE(dir, "'%pd' linked to parent dir", dentry); return ERR_PTR(-EFSCORRUPTED); } would become something like: if (unlikely(ino == dir->i_ino)) { struct strbuf strbuf; strbuf_char(strbuf, '\''); dentry_string(strbuf, dentry); strbuf_string(strbuf, "' linked to parent dir"); EXT4_ERROR_INODE(dir, strbuf); return ERR_PTR(-EFSCORRUPTED); } which isn't terribly nice, but C has sucky syntax for string construction. Other languages have done this better, including Rust.
On Mon, Apr 25, 2022 at 03:44:34AM +0100, Matthew Wilcox wrote: > On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote: > > > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human > > > + * readable units. > > > > Why not extend vsprintf for this using something like %pH[8|16|32|64] > > or %pH[c|s|l|ll|uc|us|ul|ull] ? > > The %pX extension we have is _cute_, but ultimately a bad idea. It > centralises all kinds of unrelated things in vsprintf.c, eg bdev_name() > and clock() and ip_addr_string(). And it's not remotely discoverable. I didn't realize we had bdev_name() available as a format string until just now or I would've been using it! > Really, it's working around that we don't have something like Java's > StringBuffer (which I see both seq_buf and printbuf as attempting to > be). So we have this primitive format string hack instead of exposing > methods like: > > void dentry_string(struct strbuf *, struct dentry *); Exactly! > as an example, > if (unlikely(ino == dir->i_ino)) { > EXT4_ERROR_INODE(dir, "'%pd' linked to parent dir", > dentry); > return ERR_PTR(-EFSCORRUPTED); > } > > would become something like: > > if (unlikely(ino == dir->i_ino)) { > struct strbuf strbuf; > strbuf_char(strbuf, '\''); > dentry_string(strbuf, dentry); > strbuf_string(strbuf, "' linked to parent dir"); > EXT4_ERROR_INODE(dir, strbuf); > return ERR_PTR(-EFSCORRUPTED); > } > > which isn't terribly nice, but C has sucky syntax for string > construction. Other languages have done this better, including Rust. Over IRC just now you proposed "%p(%p)", dentry_name, dentry - I'm _really_ liking this idea, especially if we can get glibc to take it. Then your ext4 example becomes just if (unlikely(ino == dir->i_ino)) { EXT4_ERROR_INODE(dir, "'%p(%p)' linked to parent dir", dentry_name, dentry); return ERR_PTR(-EFSCORRUPTED); } And you can cscope to the pretty-printer! And dentry_name becomes just void dentry_name(struct printbuf *out, struct dentry *dentry) { ... } Which is quite a bit simpler than the current definition. Sweeeeeet.
On Mon, 2022-04-25 at 00:19 -0400, Kent Overstreet wrote: > On Mon, Apr 25, 2022 at 03:44:34AM +0100, Matthew Wilcox wrote: > > On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote: > > > > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human > > > > + * readable units. > > > > > > Why not extend vsprintf for this using something like %pH[8|16|32|64] > > > or %pH[c|s|l|ll|uc|us|ul|ull] ? > > > > The %pX extension we have is _cute_, but ultimately a bad idea. It > > centralises all kinds of unrelated things in vsprintf.c, eg bdev_name() > > and clock() and ip_addr_string(). > > And it's not remotely discoverable. I didn't realize we had bdev_name() > available as a format string until just now or I would've been using it! Documentation/core-api/printk-formats.rst
On Sun, Apr 24, 2022 at 09:48:58PM -0700, Joe Perches wrote: > On Mon, 2022-04-25 at 00:19 -0400, Kent Overstreet wrote: > > On Mon, Apr 25, 2022 at 03:44:34AM +0100, Matthew Wilcox wrote: > > > On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote: > > > > > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human > > > > > + * readable units. > > > > > > > > Why not extend vsprintf for this using something like %pH[8|16|32|64] > > > > or %pH[c|s|l|ll|uc|us|ul|ull] ? > > > > > > The %pX extension we have is _cute_, but ultimately a bad idea. It > > > centralises all kinds of unrelated things in vsprintf.c, eg bdev_name() > > > and clock() and ip_addr_string(). > > > > And it's not remotely discoverable. I didn't realize we had bdev_name() > > available as a format string until just now or I would've been using it! > > Documentation/core-api/printk-formats.rst Who has time for docs?
On Mon, 2022-04-25 at 00:59 -0400, Kent Overstreet wrote: > On Sun, Apr 24, 2022 at 09:48:58PM -0700, Joe Perches wrote: > > On Mon, 2022-04-25 at 00:19 -0400, Kent Overstreet wrote: > > > On Mon, Apr 25, 2022 at 03:44:34AM +0100, Matthew Wilcox wrote: > > > > On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote: > > > > > > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human > > > > > > + * readable units. > > > > > > > > > > Why not extend vsprintf for this using something like %pH[8|16|32|64] > > > > > or %pH[c|s|l|ll|uc|us|ul|ull] ? > > > > > > > > The %pX extension we have is _cute_, but ultimately a bad idea. It > > > > centralises all kinds of unrelated things in vsprintf.c, eg bdev_name() > > > > and clock() and ip_addr_string(). > > > > > > And it's not remotely discoverable. I didn't realize we had bdev_name() > > > available as a format string until just now or I would've been using it! > > > > Documentation/core-api/printk-formats.rst > > Who has time for docs? The same people that have time to reimplement the already implemented?
On Sun, Apr 24, 2022 at 10:00:32PM -0700, Joe Perches wrote: > On Mon, 2022-04-25 at 00:59 -0400, Kent Overstreet wrote: > > On Sun, Apr 24, 2022 at 09:48:58PM -0700, Joe Perches wrote: > > > On Mon, 2022-04-25 at 00:19 -0400, Kent Overstreet wrote: > > > > On Mon, Apr 25, 2022 at 03:44:34AM +0100, Matthew Wilcox wrote: > > > > > On Sun, Apr 24, 2022 at 04:46:03PM -0700, Joe Perches wrote: > > > > > > > + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human > > > > > > > + * readable units. > > > > > > > > > > > > Why not extend vsprintf for this using something like %pH[8|16|32|64] > > > > > > or %pH[c|s|l|ll|uc|us|ul|ull] ? > > > > > > > > > > The %pX extension we have is _cute_, but ultimately a bad idea. It > > > > > centralises all kinds of unrelated things in vsprintf.c, eg bdev_name() > > > > > and clock() and ip_addr_string(). > > > > > > > > And it's not remotely discoverable. I didn't realize we had bdev_name() > > > > available as a format string until just now or I would've been using it! > > > > > > Documentation/core-api/printk-formats.rst > > > > Who has time for docs? > > The same people that have time to reimplement the already implemented? Touché :)
On Sun, 2022-04-24 at 16:36 -0400, Kent Overstreet wrote: > On Sat, Apr 23, 2022 at 10:16:37AM -0400, James Bottomley wrote: > > You stripped the nuance of that. I said many no_std crates could > > be used in the kernel. I also said that the async crate couldn't > > because the rust compiler itself would have to support the kernel > > threading model. > > I just scanned through that thread and that's not what you said. What > you said was: > > > The above is also the rust crate problem in miniature: the crates > > grow API features the kernel will never care about and importing > > them wholesale is going to take forever because of the internal > > kernel support issue. In the end, to take rust async as an > > example, it will be much better to do for rust what we've done for > > zlib: take the core that can support the kernel threading model and > > reimplement that in the kernel crate. The act of doing that will > > a) prove people care enough about the functionality and b) allow us > > to refine it nicely. > > > > I also don't think rust would really want to import crates > > wholesale. The reason for no_std is that rust is trying to adapt > > to embedded environments, which the somewhat harsh constraints of > > the kernel is very similar to. > > But maybe your position has changed somewhat? I read those two as saying the same thing just with differing levels of detail. > It sounds like you've been arguing against just directly depending > on foreign reposotories and for the staus quo of just ad-hoc copying > of code. I don't think I've said anything generalised about that either way. However, I have noted that most of the external repositories I've looked at can't (or shouldn't) be imported directly. Perhaps if we could find one that could be pulled in directly, we could use that to drive a discussion of how. > I'll help by stating my own position: I think we should be coming up > with a process for how dependencies on other git repositories are > going to work, something better than just cut and paste. Whether or > not we vendorize code isn't really that important, but I'd say that > if we are vendorizing code and we're notincluding entire sub- > repositories (like cargo vendor does) we ought to still make this a > scripted process that takes as an input a list of files we're > pulling and a remote repository we're pulling from, and the file list > and the remote repo (and commit ID we're pulling from) should all be > checked in. Do we have an example of an external piece of code we could do this to? [...] > > > The kernel community has a lot of that going on here. Again, > > > sorry to pick on you James, but I wanted to make the argument > > > that - maybe the kernel _should_ be adopting a more structured > > > way of using code from outside repositories, like cargo, or git > > > submodules (except I've never had a positive experience with git > > > submodules, so ignore that suggestion, unless I've just been > > > using them wrong, in which case someone please teach me). To read > > > you and Greg saying "nah, just copy code from other repos, it's > > > fine" - it felt like being back in the old days when we were > > > still trying to get people to use source control, and having that > > > one older colleague who _insisted_ on not using source control of > > > any kind, and that's a bit disheartening. > > > > Even in C terms, the kernel is a nostdlib environment. If a C > > project has too much libc dependency it's not going to work > > directly in the kernel, nor should it. Let's look at zstd (which > > is pretty much a nostdlib project) as a great example: the facebook > > people didn't actually port the top of their tree (1.5) to the > > kernel, they backported bug fixes to the 1.4 branch and made a > > special release (1.4.10) just for us. Why did they do this? It > > was because the 1.5 version vastly increased stack use to the > > extent it would run off the end of the limited kernel stack so > > couldn't be ported directly into the kernel. A lot of C libraries > > that are nostdlib have problems like this as well (you can use > > recursion, but not in the kernel). There's no easy way of shimming > > environmental constraints like this. > > I wonder if we might have come up with a better solution if there'd > been more cross-project communication and less siloing. Small stacks > aren't particular to the kernel - it's definitely not unheard of to > write userspace code where you want to have a lot of small stacks > (especially if you're doing some kind of coroutine style threading; > I've done stuff like this in the past) But would you say that every piece of userspace code should reject recursion and write for small stacks just in case it needs to be reused in a more extreme environment? I don't; I think it's fine there are loads of implementations that would never work in our environment, because mostly there's no reason to use them in the kernel (or another embedded environment). I also understand why people build for the standard userspace environment first ... it reduces complexity and makes the construction way easier. > - and to me, as someone who's been incrementing on and maintaining a > codebase in active use for 10 years, having multiple older versions > in active use that need bugfixes gives me cold shivers. > > I wouldn't be surprised if at some point the zstd people walk back > some of their changes or make it configurable at some point :) Many things can happen in the future, but right at the moment I still think zstd is working fine for both parties. James
diff --git a/include/linux/printbuf.h b/include/linux/printbuf.h new file mode 100644 index 0000000000..276cdecf08 --- /dev/null +++ b/include/linux/printbuf.h @@ -0,0 +1,164 @@ +/* SPDX-License-Identifier: LGPL-2.1+ */ +/* Copyright (C) 2022 Kent Overstreet */ + +#ifndef _LINUX_PRINTBUF_H +#define _LINUX_PRINTBUF_H + +/* + * Printbufs: Simple heap allocated strings, with some features for structered + * formatting. + * + * This code has provisions for use in userspace, to aid in making other code + * portable between kernelspace and userspace. + * + * Basic example: + * + * struct printbuf buf = PRINTBUF; + * + * pr_buf(&buf, "foo="); + * foo_to_text(&buf, foo); + * printk("%s", buf.buf); + * printbuf_exit(&buf); + * + * We can now write pretty printers instead of writing code that dumps + * everything to the kernel log buffer, and then those pretty-printers can be + * used by other code that outputs to kernel log, sysfs, debugfs, etc. + * + * Memory allocation: Outputing to a printbuf may allocate memory. This + * allocation is done with GFP_KERNEL, by default: use the newer + * memalloc_*_(save|restore) functions as needed. + * + * Since no equivalent yet exists for GFP_ATOMIC/GFP_NOWAIT, memory allocations + * will be done with GFP_ATOMIC if printbuf->atomic is nonzero. + * + * Memory allocation failures: We don't return errors directly, because on + * memory allocation failure we usually don't want to bail out and unwind - we + * want to print what we've got, on a best-effort basis. But code that does want + * to return -ENOMEM may check printbuf.allocation_failure. + * + * Indenting, tabstops: + * + * To aid is writing multi-line pretty printers spread across multiple + * functions, printbufs track the current indent level. + * + * pr_indent_push() and pr_indent_pop() increase and decrease the current indent + * level, respectively. + * + * To use tabstops, set printbuf->tabstops[]; they are in units of spaces, from + * start of line. Once set, pr_tab() will output spaces up to the next tabstop. + * pr_tab_rjust() will also advance the current line of text up to the next + * tabstop, but it does so by shifting text since the previous tabstop up to the + * next tabstop - right justifying it. + * + * Make sure you use pr_newline() instead of \n in the format string for indent + * level and tabstops to work corretly. + * + * Output units: printbuf->units exists to tell pretty-printers how to output + * numbers: a raw value (e.g. directly from a superblock field), as bytes, or as + * human readable bytes. pr_units() and pr_sectors() obey it. + * + * Other helpful functions: + * + * pr_human_readable_u64, pr_human_readable_s64: Print an integer with human + * readable units. + * + * pr_time(): for printing a time_t with strftime in userspace, prints as an + * integer number of seconds in the kernel. + * + * pr_string_option: Given an enumerated value and a string array with names for + * each option, prints out the enum names with the selected one indicated with + * square brackets. + * + * pr_bitflags: Given a bitflag and a string array with names for each bit, + * prints out the names of the selected bits. + */ + +#include <linux/slab.h> +#include <linux/string_helpers.h> + +enum printbuf_units { + PRINTBUF_UNITS_RAW, + PRINTBUF_UNITS_BYTES, + PRINTBUF_UNITS_HUMAN_READABLE, +}; + +struct printbuf { + char *buf; + unsigned size; + unsigned pos; + unsigned last_newline; + unsigned last_field; + unsigned indent; + enum printbuf_units units:8; + /* + * If nonzero, allocations will be done with GFP_ATOMIC: + */ + u8 atomic; + bool allocation_failure:1; + /* SI units (10^3), or 2^10: */ + enum string_size_units human_readable_units:1; + u8 tabstop; + u8 tabstops[4]; +}; + +#define PRINTBUF ((struct printbuf) { .human_readable_units = STRING_UNITS_2 }) + +/** + * printbuf_exit - exit a printbuf, freeing memory it owns and poisoning it + * against accidental use. + */ +static inline void printbuf_exit(struct printbuf *buf) +{ + kfree(buf->buf); + buf->buf = ERR_PTR(-EINTR); /* poison value */ +} + +/** + * printbuf_reset - re-use a printbuf without freeing and re-initializing it: + */ +static inline void printbuf_reset(struct printbuf *buf) +{ + buf->pos = 0; + buf->last_newline = 0; + buf->last_field = 0; + buf->indent = 0; + buf->tabstop = 0; + buf->allocation_failure = 0; +} + +/** + * printbuf_atomic_inc - mark as entering an atomic section + */ +static inline void printbuf_atomic_inc(struct printbuf *buf) +{ + buf->atomic++; +} + +/** + * printbuf_atomic_inc - mark as leaving an atomic section + */ +static inline void printbuf_atomic_dec(struct printbuf *buf) +{ + buf->atomic--; +} + +void pr_buf(struct printbuf *out, const char *fmt, ...) + __attribute__ ((format (printf, 2, 3))); + +void pr_char(struct printbuf *buf, char c); +void pr_newline(struct printbuf *); +void pr_indent_push(struct printbuf *, unsigned); +void pr_indent_pop(struct printbuf *, unsigned); +void pr_tab(struct printbuf *); +void pr_tab_rjust(struct printbuf *); +void pr_human_readable_u64(struct printbuf *, u64); +void pr_human_readable_s64(struct printbuf *, s64); +void pr_units(struct printbuf *, s64, s64); +void pr_sectors(struct printbuf *, u64); +void pr_time(struct printbuf *, u64); +void pr_uuid(struct printbuf *, u8 *); +void pr_string_option(struct printbuf *, const char * const list[], size_t); +void pr_bitflags(struct printbuf *, const char * const list[], u64); +const char *printbuf_str(const struct printbuf *); + +#endif /* _LINUX_PRINTBUF_H */ diff --git a/lib/Makefile b/lib/Makefile index c588a126a3..31a3904eda 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -34,7 +34,7 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \ is_single_threaded.o plist.o decompress.o kobject_uevent.o \ earlycpio.o seq_buf.o siphash.o dec_and_lock.o \ nmi_backtrace.o nodemask.o win_minmax.o memcat_p.o \ - buildid.o + buildid.o printbuf.o lib-$(CONFIG_PRINTK) += dump_stack.o lib-$(CONFIG_SMP) += cpumask.o diff --git a/lib/printbuf.c b/lib/printbuf.c new file mode 100644 index 0000000000..e0dfa82cda --- /dev/null +++ b/lib/printbuf.c @@ -0,0 +1,274 @@ +// SPDX-License-Identifier: LGPL-2.1+ +/* Copyright (C) 2022 Kent Overstreet */ + +#ifdef __KERNEL__ +#include <linux/export.h> +#include <linux/kernel.h> +#else +#define EXPORT_SYMBOL(x) +#endif + +#include <linux/log2.h> +#include <linux/printbuf.h> +#include <linux/string_helpers.h> + +static inline size_t printbuf_remaining(struct printbuf *buf) +{ + return buf->size - buf->pos; +} + +static inline size_t printbuf_linelen(struct printbuf *buf) +{ + return buf->pos - buf->last_newline; +} + +static int printbuf_realloc(struct printbuf *out, unsigned extra) +{ + unsigned new_size; + char *buf; + + if (out->pos + extra + 1 < out->size) + return 0; + + new_size = roundup_pow_of_two(out->size + extra); + buf = krealloc(out->buf, new_size, !out->atomic ? GFP_KERNEL : GFP_ATOMIC); + + if (!buf) { + out->allocation_failure = true; + return -ENOMEM; + } + + out->buf = buf; + out->size = new_size; + return 0; +} + +void pr_buf(struct printbuf *out, const char *fmt, ...) +{ + va_list args; + int len; + + do { + va_start(args, fmt); + len = vsnprintf(out->buf + out->pos, printbuf_remaining(out), fmt, args); + va_end(args); + } while (len + 1 >= printbuf_remaining(out) && + !printbuf_realloc(out, len + 1)); + + len = min_t(size_t, len, + printbuf_remaining(out) ? printbuf_remaining(out) - 1 : 0); + out->pos += len; +} +EXPORT_SYMBOL(pr_buf); + +void pr_char(struct printbuf *buf, char c) +{ + if (!printbuf_realloc(buf, 1)) { + buf->buf[buf->pos++] = c; + buf->buf[buf->pos] = 0; + } +} +EXPORT_SYMBOL(pr_char); + +void pr_newline(struct printbuf *buf) +{ + unsigned i; + + pr_char(buf, '\n'); + + buf->last_newline = buf->pos; + + for (i = 0; i < buf->indent; i++) + pr_char(buf, ' '); + + buf->last_field = buf->pos; + buf->tabstop = 0; +} +EXPORT_SYMBOL(pr_newline); + +void pr_indent_push(struct printbuf *buf, unsigned spaces) +{ + buf->indent += spaces; + while (spaces--) + pr_char(buf, ' '); +} +EXPORT_SYMBOL(pr_indent_push); + +void pr_indent_pop(struct printbuf *buf, unsigned spaces) +{ + if (buf->last_newline + buf->indent == buf->pos) { + buf->pos -= spaces; + buf->buf[buf->pos] = 0; + } + buf->indent -= spaces; +} +EXPORT_SYMBOL(pr_indent_pop); + +void pr_tab(struct printbuf *buf) +{ + BUG_ON(buf->tabstop > ARRAY_SIZE(buf->tabstops)); + + while (printbuf_remaining(buf) > 1 && + printbuf_linelen(buf) < buf->tabstops[buf->tabstop]) + pr_char(buf, ' '); + + buf->last_field = buf->pos; + buf->tabstop++; +} +EXPORT_SYMBOL(pr_tab); + +void pr_tab_rjust(struct printbuf *buf) +{ + BUG_ON(buf->tabstop > ARRAY_SIZE(buf->tabstops)); + + if (printbuf_linelen(buf) < buf->tabstops[buf->tabstop]) { + unsigned move = buf->pos - buf->last_field; + unsigned shift = buf->tabstops[buf->tabstop] - + printbuf_linelen(buf); + + printbuf_realloc(buf, shift); + + if (buf->last_field + shift + 1 < buf->size) { + move = min(move, buf->size - 1 - buf->last_field - shift); + + memmove(buf->buf + buf->last_field + shift, + buf->buf + buf->last_field, + move); + memset(buf->buf + buf->last_field, ' ', shift); + buf->pos += shift; + buf->buf[buf->pos] = 0; + } + } + + buf->last_field = buf->pos; + buf->tabstop++; +} +EXPORT_SYMBOL(pr_tab_rjust); + +void pr_human_readable_u64(struct printbuf *buf, u64 v) +{ + printbuf_realloc(buf, 10); + string_get_size(v, 1, buf->human_readable_units, buf->buf + buf->pos, + printbuf_remaining(buf)); + buf->pos += strlen(buf->buf + buf->pos); +} +EXPORT_SYMBOL(pr_human_readable_u64); + +void pr_human_readable_s64(struct printbuf *buf, s64 v) +{ + if (v < 0) + pr_char(buf, '-'); + pr_human_readable_u64(buf, abs(v)); +} +EXPORT_SYMBOL(pr_human_readable_s64); + +void pr_units(struct printbuf *out, s64 raw, s64 bytes) +{ + switch (out->units) { + case PRINTBUF_UNITS_RAW: + pr_buf(out, "%llu", raw); + break; + case PRINTBUF_UNITS_BYTES: + pr_buf(out, "%llu", bytes); + break; + case PRINTBUF_UNITS_HUMAN_READABLE: + pr_human_readable_s64(out, bytes); + break; + } +} +EXPORT_SYMBOL(pr_units); + +void pr_sectors(struct printbuf *out, u64 v) +{ + pr_units(out, v, v << 9); +} +EXPORT_SYMBOL(pr_sectors); + +#ifdef __KERNEL__ + +void pr_time(struct printbuf *out, u64 time) +{ + pr_buf(out, "%llu", time); +} +EXPORT_SYMBOL(pr_time); + +void pr_uuid(struct printbuf *out, u8 *uuid) +{ + pr_buf(out, "%pUb", uuid); +} +EXPORT_SYMBOL(pr_uuid); + +#else + +#include <time.h> +#include <uuid.h> + +void pr_time(struct printbuf *out, u64 _time) +{ + char time_str[64]; + time_t time = _time; + struct tm *tm = localtime(&time); + size_t err = strftime(time_str, sizeof(time_str), "%c", tm); + + if (!err) + pr_buf(out, "(formatting error)"); + else + pr_buf(out, "%s", time_str); +} + +void pr_uuid(struct printbuf *out, u8 *uuid) +{ + char uuid_str[40]; + + uuid_unparse_lower(uuid, uuid_str); + pr_buf(out, uuid_str); +} + +#endif + +void pr_string_option(struct printbuf *out, + const char * const list[], + size_t selected) +{ + size_t i; + + for (i = 0; list[i]; i++) + pr_buf(out, i == selected ? "[%s] " : "%s ", list[i]); +} +EXPORT_SYMBOL(pr_string_option); + +void pr_bitflags(struct printbuf *out, + const char * const list[], u64 flags) +{ + unsigned bit, nr = 0; + bool first = true; + + while (list[nr]) + nr++; + + while (flags && (bit = __ffs(flags)) < nr) { + if (!first) + pr_buf(out, ","); + first = false; + pr_buf(out, "%s", list[bit]); + flags ^= 1 << bit; + } +} +EXPORT_SYMBOL(pr_bitflags); + +/** + * printbuf_str - returns printbuf's buf as a C string, guaranteed to be null + * terminated + */ +const char *printbuf_str(const struct printbuf *buf) +{ + /* + * If we've written to a printbuf then it's guaranteed to be a null + * terminated string - but if we haven't, then we might not have + * allocated a buffer at all: + */ + return buf->pos + ? buf->buf + : ""; +} +EXPORT_SYMBOL(printbuf_str);
This adds printbufs: simple heap-allocated strings meant for building up structured messages, for logging/procfs/sysfs and elsewhere. They've been heavily used in bcachefs for writing .to_text() functions/methods - pretty printers, which has in turn greatly improved the overall quality of error messages. Basic usage is documented in include/linux/printbuf.h. The next patches in the series are going to be using printbufs to implement a .to_text() method for shrinkers, and improving OOM reporting. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> --- include/linux/printbuf.h | 164 +++++++++++++++++++++++ lib/Makefile | 2 +- lib/printbuf.c | 274 +++++++++++++++++++++++++++++++++++++++ 3 files changed, 439 insertions(+), 1 deletion(-) create mode 100644 include/linux/printbuf.h create mode 100644 lib/printbuf.c