Message ID | 1541165517-3557-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v6,1/3] printk: Add line-buffered printk() API. | expand |
On Fri, Nov 02, 2018 at 10:31:55PM +0900, Tetsuo Handa wrote: > get_printk_buffer() tries to assign a "struct printk_buffer" from > statically preallocated array. get_printk_buffer() returns NULL if > all "struct printk_buffer" are in use, but the caller does not need to > check for NULL. This seems like a great way of wasting 16kB of memory. Since you've already made printk_buffered() work with a NULL initial argument, what's the advantage over just doing kmalloc(1024, GFP_ATOMIC)?
Hi Tetsuo,
I love your patch! Yet something to improve:
[auto build test ERROR on linus/master]
[also build test ERROR on v4.19 next-20181102]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Tetsuo-Handa/printk-Add-line-buffered-printk-API/20181103-010234
config: i386-randconfig-x002-201843 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
In file included from kernel/printk/printk_buffered.c:5:0:
>> kernel/printk/internal.h:28:8: error: unknown type name 'raw_spinlock_t'
extern raw_spinlock_t logbuf_lock;
^~~~~~~~~~~~~~
vim +/raw_spinlock_t +28 kernel/printk/internal.h
f956d1a3 Tetsuo Handa 2018-11-02 27
099f1c84 Sergey Senozhatsky 2016-12-27 @28 extern raw_spinlock_t logbuf_lock;
099f1c84 Sergey Senozhatsky 2016-12-27 29
:::::: The code at line 28 was first introduced by commit
:::::: 099f1c84c0052ec1b27f1c3942eed5830d86bdbb printk: introduce per-cpu safe_print seq buffer
:::::: TO: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
:::::: CC: Petr Mladek <pmladek@suse.com>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
On 2018/11/02 23:40, Matthew Wilcox wrote: > On Fri, Nov 02, 2018 at 10:31:55PM +0900, Tetsuo Handa wrote: >> get_printk_buffer() tries to assign a "struct printk_buffer" from >> statically preallocated array. get_printk_buffer() returns NULL if >> all "struct printk_buffer" are in use, but the caller does not need to >> check for NULL. > > This seems like a great way of wasting 16kB of memory. Since you've > already made printk_buffered() work with a NULL initial argument, what's > the advantage over just doing kmalloc(1024, GFP_ATOMIC)? Like "[PATCH 2/3] mm: Use line-buffered printk() for show_free_areas()." demonstrates, kzalloc(sizeof(struct printk_buffer), GFP_ATOMIC) can fail. And using statically preallocated buffers helps avoiding (1) out of buffers when memory cannot be allocated (2) kernel stack overflow when kernel stack is already tight (e.g. a memory allocation attempt from an interrupt handler which was invoked from deep inside call chain of a process context) . Whether (A) tuning the number of statically preallocated buffers (B) allocating buffers on caller side (e.g. kzalloc() or in .bss section) are useful is a future decision, for too much concurrent printk() will lockup the system even if there are enough buffers. I think that starting with statically preallocated buffers is (at least for now) a good choice for minimizing risk of (1) (2) while offering practically acceptable result.
On (11/02/18 22:31), Tetsuo Handa wrote: > (1) Call get_printk_buffer() and acquire "struct printk_buffer *". > > (2) Rewrite printk() calls in the following way. The "ptr" is > "struct printk_buffer *" obtained in step (1). > > printk(fmt, ...) => printk_buffered(ptr, fmt, ...) > vprintk(fmt, args) => vprintk_buffered(ptr, fmt, args) > pr_emerg(fmt, ...) => bpr_emerg(ptr, fmt, ...) > pr_alert(fmt, ...) => bpr_alert(ptr, fmt, ...) > pr_crit(fmt, ...) => bpr_crit(ptr, fmt, ...) > pr_err(fmt, ...) => bpr_err(ptr, fmt, ...) > pr_warning(fmt, ...) => bpr_warning(ptr, fmt, ...) > pr_warn(fmt, ...) => bpr_warn(ptr, fmt, ...) > pr_notice(fmt, ...) => bpr_notice(ptr, fmt, ...) > pr_info(fmt, ...) => bpr_info(ptr, fmt, ...) > pr_cont(fmt, ...) => bpr_cont(ptr, fmt, ...) > > (3) Release "struct printk_buffer" by calling put_printk_buffer(). [..] > Since we want to remove "struct cont" eventually, we will try to remove > both "implicit printk() users who are expecting KERN_CONT behavior" and > "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to > this API is recommended. - The printk-fallback sounds like a hint that the existing 'cont' handling better stay in the kernel. I don't see how the existing 'cont' is significantly worse than bpr_warn(NULL, ...)->printk() // no 'cont' support I don't see why would we want to do it, sorry. I don't see "it takes 16 printk-buffers to make a thing go right" as a sure thing. A question. How bad would it actually be to: - Allocate seq_buf 512-bytes buffer (GFP_ATOMIC) just-in-time, when we need it. // How often systems cannot allocate a 512-byte buffer? // - OK, assuming that systems around the world are so badly OOM like all the time and even kmalloc(512) is absolutely impossible, then have a fallback to the existing 'cont' handling; it just looks to me better than a plain printk()-fallback with removed 'cont' support. - Do not allocate seq_buf if we are in printk-safe or in printk-nmi mode. To avoid "buffering for the sake of buffering". IOW, when in printk-safe use printk-safe. -ss
On Tue 2018-11-06 23:35:02, Sergey Senozhatsky wrote: > On (11/02/18 22:31), Tetsuo Handa wrote: > > (1) Call get_printk_buffer() and acquire "struct printk_buffer *". > > > > (2) Rewrite printk() calls in the following way. The "ptr" is > > "struct printk_buffer *" obtained in step (1). > > > > printk(fmt, ...) => printk_buffered(ptr, fmt, ...) > > vprintk(fmt, args) => vprintk_buffered(ptr, fmt, args) > > pr_emerg(fmt, ...) => bpr_emerg(ptr, fmt, ...) > > pr_alert(fmt, ...) => bpr_alert(ptr, fmt, ...) > > pr_crit(fmt, ...) => bpr_crit(ptr, fmt, ...) > > pr_err(fmt, ...) => bpr_err(ptr, fmt, ...) > > pr_warning(fmt, ...) => bpr_warning(ptr, fmt, ...) > > pr_warn(fmt, ...) => bpr_warn(ptr, fmt, ...) > > pr_notice(fmt, ...) => bpr_notice(ptr, fmt, ...) > > pr_info(fmt, ...) => bpr_info(ptr, fmt, ...) > > pr_cont(fmt, ...) => bpr_cont(ptr, fmt, ...) > > > > (3) Release "struct printk_buffer" by calling put_printk_buffer(). > > [..] > > > Since we want to remove "struct cont" eventually, we will try to remove > > both "implicit printk() users who are expecting KERN_CONT behavior" and > > "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to > > this API is recommended. > > - The printk-fallback sounds like a hint that the existing 'cont' handling > better stay in the kernel. I don't see how the existing 'cont' is > significantly worse than > bpr_warn(NULL, ...)->printk() // no 'cont' support > I don't see why would we want to do it, sorry. I don't see "it takes 16 > printk-buffers to make a thing go right" as a sure thing. I see it the following way: + mixed cont lines are very rare but they happen + 16 buffers are more than 1 so it could only be better [*] + the printk_buffer() code is self-contained and does not complicate the logic of the classic printk() code [**] [*] A missing put_printk_buffer() might cause that we would get out of buffers. But the same problem is with locks, disabled preemption, disabled interrupts, seq_buffer, alloc/free. Such problems happen but they are rare. Also I do not expect that the same buffer would be shared between many functions. Therefore it should be easy to use it correctly. [**] I admit that cont buffer implementation is much easier after removing the early flush to consoles but still... Anyway, I do not think that both implementations are worth it. We could keep both for some transition period but we should remove the old one later. > A question. > > How bad would it actually be to: > > - Allocate seq_buf 512-bytes buffer (GFP_ATOMIC) just-in-time, when we > need it. > // How often systems cannot allocate a 512-byte buffer? // > > - OK, assuming that systems around the world are so badly OOM like all the > time and even kmalloc(512) is absolutely impossible, then have a fallback > to the existing 'cont' handling; it just looks to me better than a plain > printk()-fallback with removed 'cont' support. This would prevent removing the fallback to struct cont. OOM is one important scenario where continuous lines are used. > - Do not allocate seq_buf if we are in printk-safe or in printk-nmi mode. > To avoid "buffering for the sake of buffering". IOW, when in printk-safe > use printk-safe. Sure, my plan is to add a helper function is_buffered_printk_context() or so that would check printk_context. Then we could do the following in vprintk_buffered() if (is_buffered_printk_context()) vprintk_func(....); It might be added on top of the current patchset. I opened this problem once and it got lost. So I did not want to complicate it at this moment. Best Regards, Petr
On 2018/11/06 23:35, Sergey Senozhatsky wrote: >> Since we want to remove "struct cont" eventually, we will try to remove >> both "implicit printk() users who are expecting KERN_CONT behavior" and >> "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to >> this API is recommended. > > - The printk-fallback sounds like a hint that the existing 'cont' handling > better stay in the kernel. I don't see how the existing 'cont' is > significantly worse than > bpr_warn(NULL, ...)->printk() // no 'cont' support > I don't see why would we want to do it, sorry. I don't see "it takes 16 > printk-buffers to make a thing go right" as a sure thing. Existing 'cont' handling will stay for a while. After majority of pr_cont()/KERN_CONT users are converted, 'cont' support will be removed (e.g. KERN_CONT becomes ""). > > A question. > > How bad would it actually be to: > > - Allocate seq_buf 512-bytes buffer (GFP_ATOMIC) just-in-time, when we > need it. > // How often systems cannot allocate a 512-byte buffer? // It is a very bad thing to do GFP_ATOMIC without __GFP_NOWARN. See "[PATCH 2/3] mm: Use line-buffered printk() for show_free_areas()." which helps exactly when GFP_ATOMIC without __GFP_NOWARN failed. Without __GFP_NOWARN, GFP_ATOMIC for printk() can trigger infinite recursion and kernel stack overflow. Even without recursion, doing kmalloc(GFP_ATOMIC | __GFP_NOWARN) temporarily consumes some kernel stack. I don't know the exact amount needed for kmalloc(GFP_ATOMIC | __GFP_NOWARN), but it might still emit memory allocation fault injection messages. What GFP_ATOMIC can guarantee is nothing but "it does not sleep". Not suitable for printk() which might be called from critically dangerous situations. > > - OK, assuming that systems around the world are so badly OOM like all the > time and even kmalloc(512) is absolutely impossible, then have a fallback > to the existing 'cont' handling; it just looks to me better than a plain > printk()-fallback with removed 'cont' support. Since I want to eventually remove 'cont' support inside printk(), I dropped KERN_CONT in patch [2/3] and [3/3]. > > - Do not allocate seq_buf if we are in printk-safe or in printk-nmi mode. > To avoid "buffering for the sake of buffering". IOW, when in printk-safe > use printk-safe. Why? Since printk_safe_flush_buffer() forcibly flushes the partial line, calling printk_safe_log_store() after line buffering can reduce possibility of flushing partial lines, can't it?
From: Tetsuo Handa > Sent: 07 November 2018 10:53 > > On 2018/11/06 23:35, Sergey Senozhatsky wrote: > >> Since we want to remove "struct cont" eventually, we will try to remove > >> both "implicit printk() users who are expecting KERN_CONT behavior" and > >> "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to > >> this API is recommended. > > > > - The printk-fallback sounds like a hint that the existing 'cont' handling > > better stay in the kernel. I don't see how the existing 'cont' is > > significantly worse than > > bpr_warn(NULL, ...)->printk() // no 'cont' support > > I don't see why would we want to do it, sorry. I don't see "it takes 16 > > printk-buffers to make a thing go right" as a sure thing. > > Existing 'cont' handling will stay for a while. After majority of > pr_cont()/KERN_CONT users are converted, 'cont' support will be removed > (e.g. KERN_CONT becomes ""). A though: Why not make the printf lock slightly 'sticky'? - If the output line is incomplete save the cpuid. - If there is a saved cpuid that doesn't match the current cpu then spin for a bit. Any callers of printk() have to assume they will spin on the buffer for the longest printk formatting (and symbol lookup might take a while) so a short additional delay won't matter. Then two calls to printk() for the same line won't (usually) get split and none of the callers need any changes. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Wed 2018-11-07 19:52:53, Tetsuo Handa wrote: > On 2018/11/06 23:35, Sergey Senozhatsky wrote: > > - Do not allocate seq_buf if we are in printk-safe or in printk-nmi mode. > > To avoid "buffering for the sake of buffering". IOW, when in printk-safe > > use printk-safe. > > Why? Since printk_safe_flush_buffer() forcibly flushes the partial line, > calling printk_safe_log_store() after line buffering can reduce possibility of > flushing partial lines, can't it? Good point. Well, printk_safe buffers are flushed via irqwork scheduled on the same CPU. It might get flushed prematurely from other CPU but I am not sure if this risk is worth the double buffering. Best Regards, Petr
On Wed 2018-11-07 11:01:05, David Laight wrote: > From: Tetsuo Handa > > Sent: 07 November 2018 10:53 > A though: > > Why not make the printf lock slightly 'sticky'? > - If the output line is incomplete save the cpuid. > - If there is a saved cpuid that doesn't match the current cpu then spin for a bit. > > Any callers of printk() have to assume they will spin on the buffer for the > longest printk formatting (and symbol lookup might take a while) so a short > additional delay won't matter. Disabling preemption for a single printk() is questionable. We have spent many years trying to find an acceptable solution to avoid softlockups and we did not fully succeeded. And this patchset is about continuous lines. Therefore we would need to disable preemption for all code intermixed with the related printks. This does not have much chances to get accepted. > Then two calls to printk() for the same line won't (usually) get split and > none of the callers need any changes. It would require changes to disable the preemption around the related calls. We could not do this without a better API and inspecting all users. Note that we could not detect this by "\n" because it is too error-prone. Best Regards, Petr
On 2018/11/07 19:21, Petr Mladek wrote: > On Tue 2018-11-06 23:35:02, Sergey Senozhatsky wrote: >>> Since we want to remove "struct cont" eventually, we will try to remove >>> both "implicit printk() users who are expecting KERN_CONT behavior" and >>> "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to >>> this API is recommended. >> >> - The printk-fallback sounds like a hint that the existing 'cont' handling >> better stay in the kernel. I don't see how the existing 'cont' is >> significantly worse than >> bpr_warn(NULL, ...)->printk() // no 'cont' support >> I don't see why would we want to do it, sorry. I don't see "it takes 16 >> printk-buffers to make a thing go right" as a sure thing. > > I see it the following way: > > + mixed cont lines are very rare but they happen > > + 16 buffers are more than 1 so it could only be better [*] > > + the printk_buffer() code is self-contained and does not > complicate the logic of the classic printk() code [**] > > > [*] A missing put_printk_buffer() might cause that we would get > out of buffers. But the same problem is with locks, > disabled preemption, disabled interrupts, seq_buffer, > alloc/free. Such problems happen but they are rare. > > Also I do not expect that the same buffer would be shared > between many functions. Therefore it should be easy > to use it correctly. Since we can allocate printk() buffer upon dup_task_struct() and free it upon free_task_struct(), we can have enough printk() buffers for task context. Also, since total number of exceptional contexts (e.g. interrupts, oops) is finite, we can have enough printk() buffers for exceptional contexts. Is it possible to identify all locations which should use their own printk() buffers (e.g. interrupt handlers, oops handlers) ? If yes, despite Linus's objection, automatically switching printk() buffers (like memalloc_nofs_save()/memalloc_nofs_restore() instead of https://lkml.kernel.org/r/201709021512.DJG00503.JHOSOFFMFtVLOQ@I-love.SAKURA.ne.jp ) will be easiest and least error prone.
On Fri 2018-11-02 22:31:55, Tetsuo Handa wrote: > Sometimes we want to print a whole line without being disturbed by > concurrent printk() from interrupts and/or other threads, for printk() > which does not end with '\n' can be disturbed. > > Since mixed printk() output makes it hard to interpret, this patch > introduces API for line-buffered printk() output (so that we can make > sure that printk() ends with '\n'). > > Since functions introduced by this patch are merely wrapping > printk()/vprintk() calls in order to minimize possibility of using > "struct cont", it is safe to replace printk()/vprintk() with this API. > Since we want to remove "struct cont" eventually, we will try to remove > both "implicit printk() users who are expecting KERN_CONT behavior" and > "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to > this API is recommended. [...] > diff --git a/include/linux/printk.h b/include/linux/printk.h > index cf3eccf..92af345 100644 > --- a/include/linux/printk.h > +++ b/include/linux/printk.h > @@ -173,6 +174,20 @@ int printk_emit(int facility, int level, > > asmlinkage __printf(1, 2) __cold > int printk(const char *fmt, ...); > +struct printk_buffer *get_printk_buffer(void); > +bool flush_printk_buffer(struct printk_buffer *ptr); > +__printf(2, 3) > +int printk_buffered(struct printk_buffer *ptr, const char *fmt, ...); > +__printf(2, 0) > +int vprintk_buffered(struct printk_buffer *ptr, const char *fmt, va_list args); > +/* > + * In order to avoid accidentally reusing "ptr" after put_printk_buffer("ptr"), > + * put_printk_buffer() is defined as a macro which explicitly resets "ptr" to > + * NULL. > + */ > +void __put_printk_buffer(struct printk_buffer *ptr); > +#define put_printk_buffer(ptr) \ > + do { __put_printk_buffer(ptr); ptr = NULL; } while (0) I have mixed feelings about this. It is clever but it complicates the code. I wonder why we need to be more paranoid than kfree() and friends. Especially that the reuse of printk buffer is much less dangerous than reusing freed pointers. Please, remove it unless it gets more supporters. > --- /dev/null > +++ b/kernel/printk/printk_buffered.c > +int vprintk_buffered(struct printk_buffer *ptr, const char *fmt, va_list args) > +{ > + va_list tmp_args; > + int r; > + int pos; > + > + if (!ptr) > + return vprintk(fmt, args); > + /* > + * Skip KERN_CONT here based on an assumption that KERN_CONT will be > + * given via "fmt" argument when KERN_CONT is given. > + */ > + pos = (printk_get_level(fmt) == 'c') ? 2 : 0; > + while (true) { > + va_copy(tmp_args, args); > + r = vsnprintf(ptr->buf + ptr->len, sizeof(ptr->buf) - ptr->len, > + fmt + pos, tmp_args); > + va_end(tmp_args); > + if (likely(r + ptr->len < sizeof(ptr->buf))) > + break; > + if (!flush_printk_buffer(ptr)) > + return vprintk(fmt, args); > + } > + ptr->len += r; > + /* Flush already completed lines if any. */ > + for (pos = ptr->len - 1; pos >= 0; pos--) { > + if (ptr->buf[pos] != '\n') > + continue; > + ptr->buf[pos++] = '\0'; > + printk("%s\n", ptr->buf); > + ptr->len -= pos; > + memmove(ptr->buf, ptr->buf + pos, ptr->len); > + /* This '\0' will be overwritten by next vsnprintf() above. */ > + ptr->buf[ptr->len] = '\0'; I am not against explicitly adding '\0' this way. Just note that there is always the trailing '\0' in the original string, see below. I mean that we could get the same result with memmove(,, ptr->len+1); > + break; > + } > + return r; > +} > + > +/** > + * flush_printk_buffer - Flush incomplete line in printk_buffer. > + * > + * @ptr: Pointer to "struct printk_buffer". It can be NULL. > + * > + * Returns true if flushed something, false otherwise. > + * > + * Flush if @ptr contains partial data. But usually there is no need to call > + * this function because @ptr is flushed by put_printk_buffer(). > + */ > +bool flush_printk_buffer(struct printk_buffer *ptr) > +{ > + if (!ptr || !ptr->len) > + return false; > + /* > + * vprintk_buffered() keeps 0 <= ptr->len < sizeof(ptr->buf) true. > + * But ptr->buf[ptr->len] != '\0' if this function is called due to > + * vsnprintf() + ptr->len >= sizeof(ptr->buf). > + */ > + ptr->buf[ptr->len] = '\0'; This is not necessary. Note that vsnprintf() always adds the trailing '\0' even when the string is shrunken. > + printk("%s", ptr->buf); > + ptr->len = 0; > + return true; > +} > +EXPORT_SYMBOL(flush_printk_buffer); > + > +/** > + * __put_printk_buffer - Release printk_buffer. > + * > + * @ptr: Pointer to "struct printk_buffer". It can be NULL. > + * > + * Returns nothing. > + * > + * Flush and release @ptr. > + * Please use put_printk_buffer() in order to catch use-after-free bugs. > + */ > +void __put_printk_buffer(struct printk_buffer *ptr) > +{ > + long i = (unsigned long) ptr - (unsigned long) printk_buffers; > + > + if (!ptr) > + return; > + if (WARN_ON_ONCE(i % sizeof(struct printk_buffer))) > + return; > + i /= sizeof(struct printk_buffer); > + if (WARN_ON_ONCE(i < 0 || i >= NUM_LINE_BUFFERS)) > + return; > + if (ptr->len) Nit: This check is superfluous. > + flush_printk_buffer(ptr); > + clear_bit_unlock(i, printk_buffers_in_use); > +} > +EXPORT_SYMBOL(__put_printk_buffer); Otherwise, it looks good to me. Best Regards, Petr
On Fri 2018-11-02 22:31:55, Tetsuo Handa wrote: > How to use this API: > > (1) Call get_printk_buffer() and acquire "struct printk_buffer *". > > (2) Rewrite printk() calls in the following way. The "ptr" is > "struct printk_buffer *" obtained in step (1). > > printk(fmt, ...) => printk_buffered(ptr, fmt, ...) > vprintk(fmt, args) => vprintk_buffered(ptr, fmt, args) > pr_emerg(fmt, ...) => bpr_emerg(ptr, fmt, ...) > pr_alert(fmt, ...) => bpr_alert(ptr, fmt, ...) > pr_crit(fmt, ...) => bpr_crit(ptr, fmt, ...) > pr_err(fmt, ...) => bpr_err(ptr, fmt, ...) > pr_warning(fmt, ...) => bpr_warning(ptr, fmt, ...) > pr_warn(fmt, ...) => bpr_warn(ptr, fmt, ...) > pr_notice(fmt, ...) => bpr_notice(ptr, fmt, ...) > pr_info(fmt, ...) => bpr_info(ptr, fmt, ...) > pr_cont(fmt, ...) => bpr_cont(ptr, fmt, ...) I am looking at the sample conversions. We actually won't need bpr_cont(). We will use buffer_printk() instead. Well, I think about renaming buffer_printk() to bprintk() or define it as a wrapper at least. Best Regards, Petr
On (11/07/18 11:21), Petr Mladek wrote: > > [..] > > - The printk-fallback sounds like a hint that the existing 'cont' handling > > better stay in the kernel. I don't see how the existing 'cont' is > > significantly worse than > > bpr_warn(NULL, ...)->printk() // no 'cont' support > > I don't see why would we want to do it, sorry. I don't see "it takes 16 > > printk-buffers to make a thing go right" as a sure thing. > > I see it the following way: > > + mixed cont lines are very rare but they happen > > + 16 buffers are more than 1 so it could only be better [*] Right, so this is where things go a bit shady. I'll try to explain. What is the problem: - we have tons of CPUs, with tons of tasks running on them, with preemption, and interrupts, and potentially printk-s coming from various contexts/CPUs/tasks etc. so one 'cont' buffer is not enough. What is the proposed solution: - if 1 is not enough then 16 will do. And if 16 is not enough then this is not our problem anymore, it's a kernel misconfiguration and users' fault. So, maybe it is a "common" kernel development pattern, I really don't know; but it looks like we are just throwing the issue over the fence; and atop of that we are killing off the existing 'cont'. > Anyway, I do not think that both implementations are worth it. > We could keep both for some transition period but we should > remove the old one later. > [..] > > This would prevent removing the fallback to struct cont. OOM is > one important scenario where continuous lines are used. Let's have one more look at what we will fix and what we will break. 'cont' has premature flushes. - it's annoying in some cases - it's good otherwise Why is it good. It preserves the correct order of events. pr_cont("calling foo->init()...."); foo->init() printk("Can't allocate buffer\n"); // premature flush pr_cont("...blah\h"); Will end up in the logbuf as: [12345.123] calling foo->init().... [12345.124] Can't allocate buffer [12345.125] ...blah Where buffered printk will endup as: [12345.123] Can't allocate buffer [12345.124] calling foo->init().......blah It's very different. Not to mention that buffered printk does not flush on panic. So, frankly, as of now, I don't see buffered printk as a 'cont' replacement. If our problem is OOM and lockdep print outs, then let's address only those two; let's not "fix" the rest of the kernel, especially the early boot, - we can break more things than we can mend. [..] > I opened this problem once and it got lost. So I did not want to > complicate it at this moment. Right, I saw it. We have similar points; I raised those in private talks: - we don't need buffered printk in printk_safe/printk_nmi - we don't need br_cont() - unlike 'cont' buffer there is no way for us to flush buffered printk buffers on panic - I don't exactly like the completely of the vprintk_buffered. If buffered printk is for single line, then it must be for single line only. And I'm not buying the "we will need this for printk origin info injection" argument. Linus was very clear about this whole idea: Linus Torvalds wrote: : Sergey Senozhatsky wrote: : > : > so... I think we don't have to update 'struct printk_log'. we can store : > that "extended data" at the beginning of every message, right after the : > prefix. : : No, we really can't. That just means that all the tools would have to : be changed to get the normal messages without the extra crud. And : since it will have lost the difference, that's not even easy to do. : : So this is exactly the wrong way around. : : If people want to see the extra data, it really should be extra data : that you can get with a new interface from the kernel logs. Not a : "let's just a add it to all lines and make every line uglier and : harder to read. So the "we need all that complexity to inject printk info in later patches" push doesn't look right. - It seems that buffered printk attempts to solve too many problems. I'd prefer it to address just one. -ss
On (11/07/18 19:52), Tetsuo Handa wrote: > > A question. > > > > How bad would it actually be to: > > > > - Allocate seq_buf 512-bytes buffer (GFP_ATOMIC) just-in-time, when we > > need it. > > // How often systems cannot allocate a 512-byte buffer? // > > It is a very bad thing to do GFP_ATOMIC without __GFP_NOWARN. Absolutely, __GFP_NOWARN. > "it does not sleep". Not suitable for printk() which might be called from > critically dangerous situations. So I'm really not convinced that we can use buffered printk in critically dangerous situations. Premature 'cont' flushes and 'cont' flushes on panic are nice and right in critically dangerous situations. [..] > > - Do not allocate seq_buf if we are in printk-safe or in printk-nmi mode. > > To avoid "buffering for the sake of buffering". IOW, when in printk-safe > > use printk-safe. > > Why? Since printk_safe_flush_buffer() forcibly flushes the partial line We need to leave printk_safe and enable local IRQs for that partial flush to occur. I'm not sure that those "partial flushes" from printk_safe actually happen. -ss
On Thu 2018-11-08 11:21:38, Sergey Senozhatsky wrote: > On (11/07/18 11:21), Petr Mladek wrote: > What is the problem: > - we have tons of CPUs, with tons of tasks running on them, with preemption, > and interrupts, and potentially printk-s coming from various > contexts/CPUs/tasks etc. so one 'cont' buffer is not enough. > > What is the proposed solution: > - if 1 is not enough then 16 will do. And if 16 is not enough then this > is not our problem anymore, it's a kernel misconfiguration and users' > fault. I believe that I mentioned this more times. 16 buffers is the first attempt. We could improve it later in several ways: + add more buffers + combine buffers on stack, dynamically allocated and static ones. BTW: I do not remember seeing mixed lines from anything even close to 16 CPUs. Maybe I was just lucky or my memory is leaky. > Let's have one more look at what we will fix and what we will break. > > 'cont' has premature flushes. > > Why is it good. > It preserves the correct order of events. > > pr_cont("calling foo->init()...."); > foo->init() > printk("Can't allocate buffer\n"); // premature flush > pr_cont("...blah\h"); > > Will end up in the logbuf as: > [12345.123] calling foo->init().... > [12345.124] Can't allocate buffer > [12345.125] ...blah > > Where buffered printk will endup as: > [12345.123] Can't allocate buffer > [12345.124] calling foo->init().......blah We will always have this problem with API using explicit buffers. What do you suggest instead, please? I am afraid that we are running in cycles. The other serious alternative was having per-process and per-context buffers but it was rejected several times. > Not to mention that buffered printk does not flush on panic. > So, frankly, as of now, I don't see buffered printk as a 'cont' > replacement. The static array of buffers can be flushed on panic. > If our problem is OOM and lockdep print outs, then let's address only > those two; let's not "fix" the rest of the kernel, especially the early > boot, - we can break more things than we can mend. Do you have any alternative proposal how to handle OOM and lockdep, please? > [..] > > I opened this problem once and it got lost. So I did not want to > > complicate it at this moment. > > - I don't exactly like the completely of the vprintk_buffered. If > buffered printk is for single line, then it must be for single > line only. My undestanding is that the new API is similar to the current cont buffer from this point of view: + buffer size is LOG_LINE_MAX + it is flushed when full The only difference is that it is flushed also when there is a complete line. Is this a problem, please? > And I'm not buying the "we will need this for printk origin > info injection" argument. I was against this idea several times. The current API does not do anything like this. > - It seems that buffered printk attempts to solve too many problems. > I'd prefer it to address just one. This API tries to handle continuous lines more reliably. Do I miss anything, please? Best Regards, Petr
On 2018/11/08 20:24, Petr Mladek wrote: >> Let's have one more look at what we will fix and what we will break. >> >> 'cont' has premature flushes. >> >> Why is it good. >> It preserves the correct order of events. >> >> pr_cont("calling foo->init()...."); >> foo->init() >> printk("Can't allocate buffer\n"); // premature flush >> pr_cont("...blah\h"); >> >> Will end up in the logbuf as: >> [12345.123] calling foo->init().... >> [12345.124] Can't allocate buffer >> [12345.125] ...blah >> >> Where buffered printk will endup as: >> [12345.123] Can't allocate buffer >> [12345.124] calling foo->init().......blah > > We will always have this problem with API using explicit buffers. > What do you suggest instead, please? > > I am afraid that we are running in cycles. The other serious > alternative was having per-process and per-context buffers > but it was rejected several times. Is it possible to identify all locations which should use their own printk() buffers (e.g. interrupt handlers, oops handlers) ? If yes, automatically switching printk() buffers (like memalloc_nofs_save()/ memalloc_nofs_restore()) will be easiest and least error prone.
On (11/08/18 12:24), Petr Mladek wrote: > I believe that I mentioned this more times. 16 buffers is the first > attempt. We could improve it later in several ways Sure. Like I said - maybe it is a normal development pattern; I really don't know. > > Let's have one more look at what we will fix and what we will break. > > > > 'cont' has premature flushes. > > > > Why is it good. > > It preserves the correct order of events. > > > > pr_cont("calling foo->init()...."); > > foo->init() > > printk("Can't allocate buffer\n"); // premature flush > > pr_cont("...blah\h"); > > > > Will end up in the logbuf as: > > [12345.123] calling foo->init().... > > [12345.124] Can't allocate buffer > > [12345.125] ...blah > > > > Where buffered printk will endup as: > > [12345.123] Can't allocate buffer > > [12345.124] calling foo->init().......blah > > We will always have this problem with API using explicit buffers. Exactly. > What do you suggest instead, please? So this is why "let's not remove 'cont'" thing. Buffered printk is not 100% identical to 'cont'. And 'cont' does a good job sometimes, Because 'cont' looks like a buffered printk, but it behaves like a normal printk when things go bad. Buffered printk is always buffered; and not even aware of the fact that things go bad around. > > If our problem is OOM and lockdep print outs, then let's address only > > those two; let's not "fix" the rest of the kernel, especially the early > > boot, - we can break more things than we can mend. > > Do you have any alternative proposal how to handle OOM and lockdep, please? You misunderstood me. I'm not against the buffered printk in lockdep and OOM. Albeit I must admit that lockdep patch looks quite big. I don't like the idea of "we will remove 'cont' and replace it with buffered printk through out the kernel". [..] > > - It seems that buffered printk attempts to solve too many problems. > > I'd prefer it to address just one. > > This API tries to handle continuous lines more reliably. > Do I miss anything, please? This part: + /* Flush already completed lines if any. */ + for (pos = ptr->len - 1; pos >= 0; pos--) { + if (ptr->buf[pos] != '\n') + continue; + ptr->buf[pos++] = '\0'; + printk("%s\n", ptr->buf); + ptr->len -= pos; + memmove(ptr->buf, ptr->buf + pos, ptr->len); + /* This '\0' will be overwritten by next vsnprintf() above. */ + ptr->buf[ptr->len] = '\0'; + break; + } + return r; If I'm not mistaken, this is for the futute "printk injection" work. Right now printk("foo\nbar\n") will end up to be a single logbuf entry, with \n in the middle and at the end. So it will look like two lines on the serial console: [123.123] foo bar Tetsuo does this \n lookup (on every vprintk_buffered) and split lines (via memove) for "printk injection", so the output will be [123.123] foo [123.124] bar Which makes it simpler to "inject" printk origin into every printed line. Without it we can just do len = vsprintf(); if (len && text[len - 1] == '\n' || overflow) flush(); Can't we? -ss
On Thu 2018-11-08 21:30:49, Sergey Senozhatsky wrote: > On (11/08/18 12:24), Petr Mladek wrote: > > > - It seems that buffered printk attempts to solve too many problems. > > > I'd prefer it to address just one. > > > > This API tries to handle continuous lines more reliably. > > Do I miss anything, please? > > This part: > > + /* Flush already completed lines if any. */ > + for (pos = ptr->len - 1; pos >= 0; pos--) { > + if (ptr->buf[pos] != '\n') > + continue; > + ptr->buf[pos++] = '\0'; > + printk("%s\n", ptr->buf); > + ptr->len -= pos; > + memmove(ptr->buf, ptr->buf + pos, ptr->len); > + /* This '\0' will be overwritten by next vsnprintf() above. */ > + ptr->buf[ptr->len] = '\0'; > + break; > + } > + return r; > > If I'm not mistaken, this is for the futute "printk injection" work. No, it does not make sense to distinguish the context at this level. The buffer is passed as an argument. It should not get passed to another task or context. The above code only tries to push complete lines to the main log buffer and consoles ASAP. It sounds like a Good Idea(tm). > Right now printk("foo\nbar\n") will end up to be a single logbuf > entry, with \n in the middle and at the end. So it will look like > two lines on the serial console: > > [123.123] foo > bar > > Tetsuo does this \n lookup (on every vprintk_buffered) and split lines > (via memove) for "printk injection", so the output will be > > [123.123] foo > [123.124] bar No, please note that the for cycle searches for '\n' from the end of the string. > Which makes it simpler to "inject" printk origin into every printed > line. > > Without it we can just do > > len = vsprintf(); > if (len && text[len - 1] == '\n' || overflow) > flush(); I had the same idea. Tetsuo ignored it. I looked for more arguments and found that '\n' is used in the middle of several pr_cont() calls, see git grep 'pr_cont.*\\n.*[^\\n]"' Best Regards, Petr
On (11/09/18 15:10), Petr Mladek wrote: > > > > If I'm not mistaken, this is for the futute "printk injection" work. > > The above code only tries to push complete lines to the main log buffer > and consoles ASAP. It sounds like a Good Idea(tm). Probably it is. So *quite likely* I'm wrong here. Hmm... Thinking out loud. At the same time, splitting a single logbuf entry gives a chance to other events to mix in. Example: pr_cont("Foo:"); pr_cont("\nbar"); pr_cont("\n"); Previously it could been stored in one logbuf entry (one logstore, one console_trylock_spinning), which means that it could have been printed in one go: call_console_drivers() spin_trylock_irqsave(&port->lock, flags); uart_console_write(.... "Foo:\nbar\n"); spin_unlock_irqrestore(&port->lock, flags); While with buffered printk, it will be store in two logbuf entries, and printed in two goes: call_console_drivers() spin_trylock_irqsave(&port->lock, flags); uart_console_write(.... "Foo:\nbar\n"); spin_unlock_irqrestore(&port->lock, flags); << ... console driver IRQ, TX port->state->xmit chars ... >>> call_console_drivers() spin_trylock_irqsave(&port->lock, flags); uart_console_write(.... "Foo:\nbar\n"); spin_unlock_irqrestore(&port->lock, flags); So, *technically*, we now let more events to happens between printk-s. But, let's look at some of pr_cont() usage examples. E.g. dump() from arch/h8300/kernel/traps.c. The code in question looks as follows: pr_info("\nKERNEL STACK:"); tp = ((unsigned char *) fp) - 0x40; for (sp = (unsigned long *) tp, i = 0; (i < 0x40); i += 4) { if ((i % 0x10) == 0) pr_info("\n%08x: ", (int) (tp + i)); pr_info("%08x ", (int) *sp++); } pr_info("\n"); dmesg [ 15.260099] 0000: 00000000 00000010 00000040 00000090 0010: 00000100 00000190 00000240 00000310 0020: 00000400 00000510 00000640 00000790 0030: 00000900 00000a90 00000c40 00000e10 So we have the entire KERNEL STACK stored as a single line, in a single logbuf entry. cat /dev/kmsg 4,687,15260099,c;\x0a0000: 00000000 00000010 00000040 00000090 \x0a0010: 00000100 00000190 00000240 00000310 \x0a0020: 00000400 00000510 00000640 00000790 \x0a0030: 00000900 00000a90 00000c40 00000e10 Shall we consider this as a "reference" representation: data that pr_cont(), ideally, would have stored as a single logbuf entry? And then compare the "reference" representation and what buffered printk does. Buffered printk always stores this as several lines, IOW several logbuf entries. cat /dev/kmsg 4,690,15260152,-;0000: 00000000 00000010 00000040 00000090 4,691,15260154,-;0010: 00000100 00000190 00000240 00000310 4,692,15260157,-;0020: 00000400 00000510 00000640 00000790 4,694,15260161,-;0030: 00000900 00000a90 00000c40 00000e10 So if we will have concurrent printk()-s happening on other CPUs, then the KERNEL STACK data block still can be a bit hard to read: [ 15.260152] 0000: 00000000 00000010 00000040 00000090 <printk CPU1> ... foo bar foo bar <printk CPU2> ... foo bar foo bar ... [ 15.260154] 0010: 00000100 00000190 00000240 00000310 <printk CPU3> ... foo bar foo bar <printk CPU2> ... foo bar foo bar ... ... and so on; you got the idea. > No, please note that the for cycle searches for '\n' from the end > of the string. You are right, I didn't notice that. Indeed. Tetsuo, lockdep report with buffered printks looks a bit different: kernel: Possible unsafe locking scenario: kernel: CPU0 CPU1 kernel: ---- ---- kernel: lock(bar_lock); kernel: lock( kernel: foo_lock); kernel: lock(bar_lock); kernel: lock(foo_lock); kernel: > > len = vsprintf(); > > if (len && text[len - 1] == '\n' || overflow) > > flush(); > > I had the same idea. Tetsuo ignored it. I looked for more arguments > and found that '\n' is used in the middle of several pr_cont() > calls OK, so we probably can have that new semantics. But we might split something that possibly was meant to be a single line with a bunch of \n in the middle. -ss
On 2018/11/12 16:59, Sergey Senozhatsky wrote: > Tetsuo, lockdep report with buffered printks looks a bit different: > > kernel: Possible unsafe locking scenario: > kernel: CPU0 CPU1 > kernel: ---- ---- > kernel: lock(bar_lock); > kernel: lock( > kernel: foo_lock); > kernel: lock(bar_lock); > kernel: lock(foo_lock); > kernel: > Yes. That is because vprintk_buffered() eliminated KERN_CONT. Since there was pending partial printk() output, vprintk_buffered() should not eliminate KERN_CONT. Petr Mladek commented 1. The mixing of normal and buffered printk calls is a bit confusing and error prone. It would make sense to use the buffered printk everywhere in the given section of code even when it is not strictly needed. and I made a draft version for how the code would look like if we try to avoid the mixing of normal and buffered printk calls. The required changes seem to be too large to apply tree wide. And I suggested try_buffered_printk().
On 2018/11/12 16:59, Sergey Senozhatsky wrote: > On (11/09/18 15:10), Petr Mladek wrote: >>> >>> If I'm not mistaken, this is for the futute "printk injection" work. >> >> The above code only tries to push complete lines to the main log buffer >> and consoles ASAP. It sounds like a Good Idea(tm). > > Probably it is. So *quite likely* I'm wrong here. In the future, I want to inject caller information to standard console output, for syzbot is using serial console [ 0.000000] Linux version 4.20.0-rc2-next-20181116+ (syzkaller@ci) (gcc version 8.0.1 20180413 (experimental) (GCC)) #120 SMP PREEMPT Fri Nov 16 20:21:42 UTC 2018 [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz root=/dev/sda1 console=ttyS0 earlyprintk=serial vsyscall=native rodata=n oops=panic panic_on_warn=1 nmi_watchdog=panic panic=86400 security=apparmor ima_policy=tcb workqueue.watchdog_thresh=140 kvm-intel.nested=1 nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000 nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000 nf-conntrack-sane.ports=20000 vivid.n_devs=16 vivid.multiplanar=1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2 nopcid and is experiencing confusion due to concurrent printk() like [ 552.907865] syz-executor0: vmalloc: allocation failure: 256 bytes, mode:0x6000c0(GFP_KERNEL), nodemask=(null) [ 552.921666] ------------[ cut here ]------------ [ 552.926434] DEBUG_LOCKS_WARN_ON(depth <= 0) [ 552.926563] WARNING: CPU: 0 PID: 24211 at kernel/locking/lockdep.c:3595 lock_release+0x740/0xa10 [ 552.927489] syz-executor0 cpuset=syz0 mems_allowed=0 [ 552.930893] Kernel panic - not syncing: panic_on_warn set ... [ 552.943750] CPU: 1 PID: 24208 Comm: syz-executor0 Not tainted 4.20.0-rc1-next-20181109+ #110 [ 552.959345] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 [ 552.968694] Call Trace: [ 552.971296] dump_stack+0x244/0x39d [ 552.974934] ? dump_stack_print_info.cold.1+0x20/0x20 [ 552.980138] ? rcu_lockdep_current_cpu_online+0x1a4/0x210 [ 552.985685] warn_alloc.cold.116+0xb7/0x1bd [ 552.990011] ? zone_watermark_ok_safe+0x3f0/0x3f0 [ 552.994884] ? __get_vm_area_node+0x130/0x3a0 [ 552.999387] ? rcu_read_lock_sched_held+0x14f/0x180 [ 553.004414] ? __might_fault+0x12b/0x1e0 [ 553.008487] ? __get_vm_area_node+0x2e5/0x3a0 [ 553.013000] __vmalloc_node_range+0x472/0x750 [ 553.017514] ? do_replace+0x23b/0x4c0 [ 553.021327] vmalloc+0x6f/0x80 (...snipped...) [ 553.296333] __warn.cold.8+0x20/0x45 [ 553.296353] ? lock_release+0x740/0xa10 [ 553.329854] Node 0 DMA free:15908kB min:164kB low:204kB high:244kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB [ 553.332731] report_bug+0x254/0x2d0 [ 553.332796] do_error_trap+0x11b/0x200 [ 553.332822] do_invalid_op+0x36/0x40 [ 553.336825] lowmem_reserve[]: 0 2818 6321 6321 [ 553.362888] ? lock_release+0x740/0xa10 [ 553.362906] invalid_op+0x14/0x20 [ 553.362920] RIP: 0010:lock_release+0x740/0xa10 [ 553.362935] Code: 03 38 d0 7c 08 84 d2 0f 85 da 02 00 00 8b 35 a7 95 b3 08 85 f6 75 15 48 c7 c6 20 66 2b 88 48 c7 c7 c0 33 2b 88 e8 10 36 e7 ff <0f> 0b 48 8b 95 e8 fe ff ff 4c 89 f7 48 8b b5 f0 fe ff ff e8 e8 58 [ 553.362941] RSP: 0018:ffff8801839ef868 EFLAGS: 00010086 [ 553.362950] RAX: 0000000000000000 RBX: 1ffff1003073df12 RCX: ffffc9000be41000 [ 553.362957] RDX: 00000000000087b4 RSI: ffffffff8165ba15 RDI: 0000000000000006 [ 553.362966] RBP: ffff8801839ef998 R08: ffff88017c010600 R09: fffffbfff12b2254 [ 553.362974] R10: fffffbfff12b2254 R11: ffffffff895912a3 R12: ffffffff8b0e27a0 [ 553.362989] R13: ffff8801839ef970 R14: ffff88017c010600 R15: ffff8801839ef8b0 [ 553.373138] Node 0 DMA32 free:2887356kB min:30052kB low:37564kB high:45076kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129332kB managed:2888772kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:1416kB local_pcp:0kB free_cma:0kB [ 553.374209] ? vprintk_func+0x85/0x181 [ 553.387199] lowmem_reserve[]: 0 0 3503 3503 [ 553.390755] ? lock_release+0x740/0xa10 [ 553.390840] ? loop_control_ioctl+0xf5/0x4e0 in https://syzkaller.appspot.com/bug?id=e7838e1e659de1ba566bb17410438b9d1dc59eb7 . Thus, I appreciate if we could mitigate KERN_CONT problem using try_buffered_printk() and move on to "caller information injection" topic. I'm fine with starting with CONFIG_DEBUG_AID_FOR_SYZBOT=y.
diff --git a/include/linux/printk.h b/include/linux/printk.h index cf3eccf..92af345 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -157,6 +157,7 @@ static inline void printk_nmi_direct_enter(void) { } static inline void printk_nmi_direct_exit(void) { } #endif /* PRINTK_NMI */ +struct printk_buffer; #ifdef CONFIG_PRINTK asmlinkage __printf(5, 0) int vprintk_emit(int facility, int level, @@ -173,6 +174,20 @@ int printk_emit(int facility, int level, asmlinkage __printf(1, 2) __cold int printk(const char *fmt, ...); +struct printk_buffer *get_printk_buffer(void); +bool flush_printk_buffer(struct printk_buffer *ptr); +__printf(2, 3) +int printk_buffered(struct printk_buffer *ptr, const char *fmt, ...); +__printf(2, 0) +int vprintk_buffered(struct printk_buffer *ptr, const char *fmt, va_list args); +/* + * In order to avoid accidentally reusing "ptr" after put_printk_buffer("ptr"), + * put_printk_buffer() is defined as a macro which explicitly resets "ptr" to + * NULL. + */ +void __put_printk_buffer(struct printk_buffer *ptr); +#define put_printk_buffer(ptr) \ + do { __put_printk_buffer(ptr); ptr = NULL; } while (0) /* * Special printk facility for scheduler/timekeeping use only, _DO_NOT_USE_ ! @@ -220,6 +235,17 @@ int printk(const char *s, ...) { return 0; } +static inline struct printk_buffer *get_printk_buffer(void) +{ + return NULL; +} +static inline bool flush_printk_buffer(struct printk_buffer *ptr) +{ + return false; +} +#define printk_buffered(ptr, fmt, ...) printk(fmt, ##__VA_ARGS__) +#define vprintk_buffered(ptr, fmt, args) vprintk(fmt, args) +#define put_printk_buffer(ptr) do { ptr = NULL; } while (0) static inline __printf(1, 2) __cold int printk_deferred(const char *s, ...) { @@ -330,6 +356,23 @@ static inline void printk_safe_flush_on_panic(void) no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__) #endif +#define bpr_emerg(ptr, fmt, ...) \ + printk_buffered(ptr, KERN_EMERG pr_fmt(fmt), ##__VA_ARGS__) +#define bpr_alert(ptr, fmt, ...) \ + printk_buffered(ptr, KERN_ALERT pr_fmt(fmt), ##__VA_ARGS__) +#define bpr_crit(ptr, fmt, ...) \ + printk_buffered(ptr, KERN_CRIT pr_fmt(fmt), ##__VA_ARGS__) +#define bpr_err(ptr, fmt, ...) \ + printk_buffered(ptr, KERN_ERR pr_fmt(fmt), ##__VA_ARGS__) +#define bpr_warning(ptr, fmt, ...) \ + printk_buffered(ptr, KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__) +#define bpr_warn bpr_warning +#define bpr_notice(ptr, fmt, ...) \ + printk_buffered(ptr, KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__) +#define bpr_info(ptr, fmt, ...) \ + printk_buffered(ptr, KERN_INFO pr_fmt(fmt), ##__VA_ARGS__) +#define bpr_cont(ptr, fmt, ...) \ + printk_buffered(ptr, KERN_CONT fmt, ##__VA_ARGS__) /* If you are writing a driver, please use dev_dbg instead */ #if defined(CONFIG_DYNAMIC_DEBUG) diff --git a/kernel/printk/Makefile b/kernel/printk/Makefile index 4a2ffc3..23b1547 100644 --- a/kernel/printk/Makefile +++ b/kernel/printk/Makefile @@ -1,3 +1,3 @@ obj-y = printk.o -obj-$(CONFIG_PRINTK) += printk_safe.o +obj-$(CONFIG_PRINTK) += printk_safe.o printk_buffered.o obj-$(CONFIG_A11Y_BRAILLE_CONSOLE) += braille.o diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h index 0f18988..5e8c048 100644 --- a/kernel/printk/internal.h +++ b/kernel/printk/internal.h @@ -22,6 +22,9 @@ #define PRINTK_NMI_DIRECT_CONTEXT_MASK 0x40000000 #define PRINTK_NMI_CONTEXT_MASK 0x80000000 +#define PREFIX_MAX 32 +#define LOG_LINE_MAX (1024 - PREFIX_MAX) + extern raw_spinlock_t logbuf_lock; __printf(5, 0) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 1b2a029..0b06211 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -421,9 +421,6 @@ __packed __aligned(4) static u64 clear_seq; static u32 clear_idx; -#define PREFIX_MAX 32 -#define LOG_LINE_MAX (1024 - PREFIX_MAX) - #define LOG_LEVEL(v) ((v) & 0x07) #define LOG_FACILITY(v) ((v) >> 3 & 0xff) diff --git a/kernel/printk/printk_buffered.c b/kernel/printk/printk_buffered.c new file mode 100644 index 0000000..d181d31 --- /dev/null +++ b/kernel/printk/printk_buffered.c @@ -0,0 +1,179 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ + +#include <linux/types.h> /* DECLARE_BITMAP() */ +#include <linux/printk.h> +#include "internal.h" + +/* A structure for line-buffered printk() output. */ +struct printk_buffer { + unsigned short int len; /* Valid bytes in buf[]. */ + char buf[LOG_LINE_MAX]; +} __aligned(1024); + +/* + * Number of statically preallocated buffers. + * + * We can introduce a kernel config option if someone wants to tune this value. + * But since "struct printk_buffer" makes difference only when there are + * multiple threads concurrently calling printk() which does not end with '\n', + * and this API will fallback to normal printk() when all buffers are in use, + * it is possible that nobody needs to tune this value. + */ +#define NUM_LINE_BUFFERS 16 + +static struct printk_buffer printk_buffers[NUM_LINE_BUFFERS]; +static DECLARE_BITMAP(printk_buffers_in_use, NUM_LINE_BUFFERS); + +/** + * get_printk_buffer - Try to get printk_buffer. + * + * Returns pointer to "struct printk_buffer" on success, NULL otherwise. + * + * If this function returned "struct printk_buffer", the caller is responsible + * for passing it to put_printk_buffer() so that "struct printk_buffer" can be + * reused in the future. + * + * Even if this function returned NULL, the caller does not need to check for + * NULL, for passing NULL to printk_buffered() simply acts like normal printk() + * and passing NULL to flush_printk_buffer()/put_printk_buffer() is a no-op. + */ +struct printk_buffer *get_printk_buffer(void) +{ + long i; + + for (i = 0; i < NUM_LINE_BUFFERS; i++) { + if (test_and_set_bit_lock(i, printk_buffers_in_use)) + continue; + printk_buffers[i].len = 0; + return &printk_buffers[i]; + } + return NULL; +} +EXPORT_SYMBOL(get_printk_buffer); + +/** + * vprintk_buffered - Try to vprintk() in line buffered mode. + * + * @ptr: Pointer to "struct printk_buffer". It can be NULL. + * @fmt: printk() format string. + * @args: va_list structure. + * + * Returns the return value of vprintk(). + * + * Try to store to @ptr first. If it fails, flush @ptr and then try to store to + * @ptr again. If it still fails, use unbuffered printing. + */ +int vprintk_buffered(struct printk_buffer *ptr, const char *fmt, va_list args) +{ + va_list tmp_args; + int r; + int pos; + + if (!ptr) + return vprintk(fmt, args); + /* + * Skip KERN_CONT here based on an assumption that KERN_CONT will be + * given via "fmt" argument when KERN_CONT is given. + */ + pos = (printk_get_level(fmt) == 'c') ? 2 : 0; + while (true) { + va_copy(tmp_args, args); + r = vsnprintf(ptr->buf + ptr->len, sizeof(ptr->buf) - ptr->len, + fmt + pos, tmp_args); + va_end(tmp_args); + if (likely(r + ptr->len < sizeof(ptr->buf))) + break; + if (!flush_printk_buffer(ptr)) + return vprintk(fmt, args); + } + ptr->len += r; + /* Flush already completed lines if any. */ + for (pos = ptr->len - 1; pos >= 0; pos--) { + if (ptr->buf[pos] != '\n') + continue; + ptr->buf[pos++] = '\0'; + printk("%s\n", ptr->buf); + ptr->len -= pos; + memmove(ptr->buf, ptr->buf + pos, ptr->len); + /* This '\0' will be overwritten by next vsnprintf() above. */ + ptr->buf[ptr->len] = '\0'; + break; + } + return r; +} + +/** + * printk_buffered - Try to printk() in line buffered mode. + * + * @ptr: Pointer to "struct printk_buffer". It can be NULL. + * @fmt: printk() format string, followed by arguments. + * + * Returns the return value of printk(). + * + * Try to store to @ptr first. If it fails, flush @ptr and then try to store to + * @ptr again. If it still fails, use unbuffered printing. + */ +int printk_buffered(struct printk_buffer *ptr, const char *fmt, ...) +{ + va_list args; + int r; + + va_start(args, fmt); + r = vprintk_buffered(ptr, fmt, args); + va_end(args); + return r; +} +EXPORT_SYMBOL(printk_buffered); + +/** + * flush_printk_buffer - Flush incomplete line in printk_buffer. + * + * @ptr: Pointer to "struct printk_buffer". It can be NULL. + * + * Returns true if flushed something, false otherwise. + * + * Flush if @ptr contains partial data. But usually there is no need to call + * this function because @ptr is flushed by put_printk_buffer(). + */ +bool flush_printk_buffer(struct printk_buffer *ptr) +{ + if (!ptr || !ptr->len) + return false; + /* + * vprintk_buffered() keeps 0 <= ptr->len < sizeof(ptr->buf) true. + * But ptr->buf[ptr->len] != '\0' if this function is called due to + * vsnprintf() + ptr->len >= sizeof(ptr->buf). + */ + ptr->buf[ptr->len] = '\0'; + printk("%s", ptr->buf); + ptr->len = 0; + return true; +} +EXPORT_SYMBOL(flush_printk_buffer); + +/** + * __put_printk_buffer - Release printk_buffer. + * + * @ptr: Pointer to "struct printk_buffer". It can be NULL. + * + * Returns nothing. + * + * Flush and release @ptr. + * Please use put_printk_buffer() in order to catch use-after-free bugs. + */ +void __put_printk_buffer(struct printk_buffer *ptr) +{ + long i = (unsigned long) ptr - (unsigned long) printk_buffers; + + if (!ptr) + return; + if (WARN_ON_ONCE(i % sizeof(struct printk_buffer))) + return; + i /= sizeof(struct printk_buffer); + if (WARN_ON_ONCE(i < 0 || i >= NUM_LINE_BUFFERS)) + return; + if (ptr->len) + flush_printk_buffer(ptr); + clear_bit_unlock(i, printk_buffers_in_use); +} +EXPORT_SYMBOL(__put_printk_buffer);
Sometimes we want to print a whole line without being disturbed by concurrent printk() from interrupts and/or other threads, for printk() which does not end with '\n' can be disturbed. Since mixed printk() output makes it hard to interpret, this patch introduces API for line-buffered printk() output (so that we can make sure that printk() ends with '\n'). Since functions introduced by this patch are merely wrapping printk()/vprintk() calls in order to minimize possibility of using "struct cont", it is safe to replace printk()/vprintk() with this API. Since we want to remove "struct cont" eventually, we will try to remove both "implicit printk() users who are expecting KERN_CONT behavior" and "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to this API is recommended. Details: A structure named "struct printk_buffer" is introduced for buffering up to LOG_LINE_MAX bytes of printk() output which did not end with '\n'. get_printk_buffer() tries to assign a "struct printk_buffer" from statically preallocated array. get_printk_buffer() returns NULL if all "struct printk_buffer" are in use, but the caller does not need to check for NULL. put_printk_buffer() flushes and releases the "struct printk_buffer". put_printk_buffer() must match corresponding get_printk_buffer() as with rcu_read_unlock() must match corresponding rcu_read_lock(). Three functions vprintk_buffered(), printk_buffered() and flush_printk_buffer() are provided for using "struct printk_buffer". These are like vfprintf(), fprintf(), fflush() except that these receive "struct printk_buffer *" for the first argument. vprintk_buffered() and printk_buffered() fall back to vprintk() and printk() respectively if "struct printk_buffer *" argument is NULL. flush_printk_buffer() and put_printk_buffer() become no-op if "struct printk_buffer *" argument is NULL. Therefore, the caller of get_printk_buffer() does not need to check for NULL. How to use this API: (1) Call get_printk_buffer() and acquire "struct printk_buffer *". (2) Rewrite printk() calls in the following way. The "ptr" is "struct printk_buffer *" obtained in step (1). printk(fmt, ...) => printk_buffered(ptr, fmt, ...) vprintk(fmt, args) => vprintk_buffered(ptr, fmt, args) pr_emerg(fmt, ...) => bpr_emerg(ptr, fmt, ...) pr_alert(fmt, ...) => bpr_alert(ptr, fmt, ...) pr_crit(fmt, ...) => bpr_crit(ptr, fmt, ...) pr_err(fmt, ...) => bpr_err(ptr, fmt, ...) pr_warning(fmt, ...) => bpr_warning(ptr, fmt, ...) pr_warn(fmt, ...) => bpr_warn(ptr, fmt, ...) pr_notice(fmt, ...) => bpr_notice(ptr, fmt, ...) pr_info(fmt, ...) => bpr_info(ptr, fmt, ...) pr_cont(fmt, ...) => bpr_cont(ptr, fmt, ...) (3) Release "struct printk_buffer" by calling put_printk_buffer(). Note that since "struct printk_buffer" buffers only up to one line, there is no need to rewrite if it is known that the "struct printk_buffer" is empty and printk() ends with '\n'. Good example: printk("Hello "); => buf = get_printk_buffer(); pr_cont("world.\n"); printk_buffered(buf, "Hello "); printk_buffered(buf, "world.\n"); put_printk_buffer(buf); Pointless example: printk("Hello\n"); => buf = get_printk_buffer(); printk("World.\n"); printk_buffered(buf, "Hello\n"); printk_buffered(buf, "World.\n"); put_printk_buffer(buf); Note that bpr_devel() and bpr_debug() are not defined. This is because pr_devel()/pr_debug() should not be followed by pr_cont() because pr_devel()/pr_debug() are conditionally enabled; output from pr_devel()/pr_debug() should always end with '\n'. Previous version was proposed at https://lkml.kernel.org/r/1540375870-6235-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp . Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> --- include/linux/printk.h | 43 ++++++++++ kernel/printk/Makefile | 2 +- kernel/printk/internal.h | 3 + kernel/printk/printk.c | 3 - kernel/printk/printk_buffered.c | 179 ++++++++++++++++++++++++++++++++++++++++ 5 files changed, 226 insertions(+), 4 deletions(-) create mode 100644 kernel/printk/printk_buffered.c