Message ID | 28d8dff56bc15b2a41f0d2035701ccb11df22610.1730133890.git.chris@chrisdown.name |
---|---|
State | New |
Headers | show |
Series | printk: console: Per-console loglevels | expand |
On Mon 2024-10-28 16:45:37, Chris Down wrote: > Consoles can have vastly different latencies and throughputs. For > example, writing a message to the serial console can take on the order > of tens of milliseconds to get the UART to successfully write a message. > While this might be fine for a single, one-off message, this can cause > significant application-level stalls in situations where the kernel > writes large amounts of information to the console. > > This means that while you might want to send at least INFO level > messages to (for example) netconsole, which is relatively fast, you may > only want to send at least WARN level messages to the serial console. > Such an implementation would permit debugging using the serial console > in cases that netconsole doesn't receive messages during particularly > bad system issues, while still keeping the noise low enough to avoid > inducing latency in userspace applications. To mitigate this, add such > an interface, extending the existing console loglevel controls to allow > each console to have its own loglevel. > > One can't just disable the serial console, because one may actually need > it in situations where the machine is in a bad enough state that nothing > is received on netconsole. One also can't just bump the loglevel at > runtime after the issue, because usually the machine is already so > wedged by this point that it isn't responsive to such requests. The motivation is described nicely. > The sysfs and kernel command line interfaces to set the per-console > loglevel will be added later. For now, simply add the necessary internal > infrastructure to be used by later patches. It would be nice to describe the basic logic in the commit message. And also the handling when the global console_loglevel is modified via sysrq. Something like: <proposal> This commit adds the internal infrastructure to support per-console log levels, which will be configurable through sysfs and the kernel command line in future commits. The global `console_loglevel` is preserved and used as the default log level for all consoles. Each console can override this global level with its own specific log level stored in `struct console`. To override the global level, the per-console log level must be greater than 0; otherwise, the default value of -1 ensures the global level is used. The existing `ignore_loglevel` command line parameter will override both the global and per-console log levels. </proposal> > --- a/drivers/tty/sysrq.c > +++ b/drivers/tty/sysrq.c > @@ -101,11 +102,25 @@ __setup("sysrq_always_enabled", sysrq_always_enabled_setup); > static void sysrq_handle_loglevel(u8 key) > { > u8 loglevel = key - '0'; > + int cookie; > + int warned = 0; > + struct console *con; > > console_loglevel = CONSOLE_LOGLEVEL_DEFAULT; > pr_info("Loglevel set to %u\n", loglevel); > console_loglevel = loglevel; > + > + cookie = console_srcu_read_lock(); > + for_each_console_srcu(con) { > + if (!warned && per_console_loglevel_is_set(con)) { > + warned = 1; > + pr_warn("Overriding per-console loglevel from sysrq\n"); > + } > + WRITE_ONCE(con->level, -1); > + } > + console_srcu_read_unlock(cookie); > } > + I would prefer to move this into a separate patch. It is similar to the syslog interface which is handled separately as well. I would keep this patch only for the basic logic. > static const struct sysrq_key_op sysrq_loglevel_op = { > .handler = sysrq_handle_loglevel, > .help_msg = "loglevel(0-9)", > diff --git a/include/linux/console.h b/include/linux/console.h > index eba367bf605d..3ff22bfeef2a 100644 > --- a/include/linux/console.h > +++ b/include/linux/console.h > @@ -349,6 +350,7 @@ struct console { > unsigned long dropped; > void *data; > struct hlist_node node; > + int level; > > /* nbcon console specific members */ > > diff --git a/include/linux/printk.h b/include/linux/printk.h > index eca9bb2ee637..5fbd6b7f1ab4 100644 > --- a/include/linux/printk.h > +++ b/include/linux/printk.h > @@ -303,6 +304,10 @@ static inline void nbcon_device_release(struct console *con) > static inline void nbcon_atomic_flush_unsafe(void) > { > } > +static inline bool per_console_loglevel_is_set(const struct console *con) There are similar functions in the printk API, for example, is_printk_legacy_deferred(). It would be nice to use a similar naming scheme. I would call it, something like: has_per_console_loglevel(con) > +{ > + return false; > +} > > #endif > > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -1287,9 +1287,62 @@ module_param(ignore_loglevel, bool, S_IRUGO | S_IWUSR); > MODULE_PARM_DESC(ignore_loglevel, > "ignore loglevel setting (prints all kernel messages to the console)"); > > -static bool suppress_message_printing(int level) > +bool per_console_loglevel_is_set(const struct console *con) > { > - return (level >= console_loglevel && !ignore_loglevel); > + return con && (READ_ONCE(con->level) > 0); We should document why READ_ONCE() is used. Also it would be great to allow checking of locking dependencies to make sure that the struct console could not disappear. Similiar to console_srcu_read_flags(con). In the followup patches, the funtion is mostly used the console_srcu_read_lock(). The exception seems to be the sysfs interface, e.g. effective_loglevel_source_show(). Hmm, we need to distinguish these two callers either by serapate API or a parameter. Both is a bit ugly, so I would prefer to reduce the number of these ugly APIs by passing @con_loglevel instead of @con. Anyway, we should create something similar to console_srcu_read_flags(). /** * console_srcu_read_loglevel - Locklessly read the console specific loglevel * of a possibly registered console * @con: struct console pointer of console to read flags from * * Locklessly reading @con->loglevel provides a consistent read value because * there is at most one CPU modifying @con->loglevel and that CPU is using only * read-modify-write operations to do so. * * Requires console_srcu_read_lock to be held, which implies that @con might * be a registered console. The purpose of holding console_srcu_read_lock is * to guarantee that the console state is valid (CON_SUSPENDED/CON_ENABLED) * and that no exit/cleanup routines will run if the console is currently * undergoing unregistration. * * If the caller is holding the console_list_lock or it is _certain_ that * @con is not and will not become registered, the caller may read * @con->loglevel directly instead. * * Context: Any context. * Return: The current value of the @con->level field. */ static inline int console_srcu_read_loglevel(const struct console *con) { WARN_ON_ONCE(!console_srcu_read_lock_is_held()); /* The READ_ONCE() matches the WRITE_ONCE() when @flags are modified. */ return data_race(READ_ONCE(con->level)); } Then I would suggest to pass @con_loglevel directly in most the other API. For example, this function would be: bool is_valid_per_console_loglevel(int con_level) { return (con_level > 0); } > +} > + > +/* > + * Hierarchy of loglevel authority: > + * > + * 1. con->level. The locally set, console-specific loglevel. Optional, only > + * valid if >0. > + * 2. console_loglevel. The default global console loglevel, always present. I would remove these comments. It is obvious from the code. > + */ > +enum loglevel_source > +console_effective_loglevel_source(const struct console *con) > +{ > + if (WARN_ON_ONCE(!con)) > + return LLS_GLOBAL; > + > + if (ignore_loglevel) > + return LLS_IGNORE_LOGLEVEL; > + > + if (per_console_loglevel_is_set(con)) > + return LLS_LOCAL; > + > + return LLS_GLOBAL; > +} The @con parameter is nice. But it makes it hard to add lockdep checks. So, I suggest to pass the console specific loglevel as the parameter, like: enum loglevel_source console_effective_loglevel_source(int con_level) { if (ignore_loglevel) return LLS_IGNORE_LOGLEVEL; if (is_valid_per_console_loglevel(con_level)) return LLS_LOCAL; return LLS_GLOBAL; } IMHO, it is not that bad after all. > +int console_effective_loglevel(const struct console *con) > +{ > + enum loglevel_source source; > + int level; > + > + source = console_effective_loglevel_source(con); > + > + switch (source) { > + case LLS_IGNORE_LOGLEVEL: > + level = CONSOLE_LOGLEVEL_MOTORMOUTH; > + break; > + case LLS_LOCAL: > + level = READ_ONCE(con->level); > + break; > + case LLS_GLOBAL: > + level = console_loglevel; > + break; > + default: > + pr_warn("Unhandled console loglevel source: %d", source); > + level = console_loglevel; > + break; > + } > + > + return level; > +} and similar here: int console_effective_loglevel(int con_level) { enum loglevel_source source; int level; source = console_effective_loglevel_source(con_level); switch (source) { case LLS_IGNORE_LOGLEVEL: level = CONSOLE_LOGLEVEL_MOTORMOUTH; break; case LLS_LOCAL: level = con_level; break; case LLS_GLOBAL: level = console_level; break; default: pr_warn("Unhandled console loglevel source: %d", source); level = console_level; break; } return level; } > + > +static bool suppress_message_printing(int level, struct console *con) > +{ > + return level >= console_effective_loglevel(con); > } and here: static bool suppress_message_printing(int level, int con_level) { return level >= console_effective_loglevel(con_level); } > > #ifdef CONFIG_BOOT_PRINTK_DELAY > @@ -2122,7 +2175,21 @@ int printk_delay_msec __read_mostly; > > static inline void printk_delay(int level) > { > - if (suppress_message_printing(level)) > + bool will_emit = false; > + int cookie; > + struct console *con; > + > + cookie = console_srcu_read_lock(); > + > + for_each_console_srcu(con) { > + if (!suppress_message_printing(level, con)) { This is called under srcu read lock so that we could use the srcu_read API here: for_each_console_srcu(con) { int con_level = console_srcu_read_loglevel(con); if (!suppress_message_printing(level, con_level)) { ... > + will_emit = true; > + break; > + } > + } > + console_srcu_read_unlock(cookie); > + > + if (!will_emit) > return; > > boot_delay_msec(); > @@ -2975,7 +3042,7 @@ bool printk_get_next_message(struct printk_message *pmsg, struct console *con, > pmsg->dropped = r.info->seq - seq; > > /* Never suppress when used in devkmsg_read() */ > - if (con && suppress_message_printing(r.info->level)) > + if (con && suppress_message_printing(r.info->level, con)) We could read the con_level at the beginning of the funcion. We already read there the CON_EXTENDED attribute: int con_level: if (con) { is_extended = console_srcu_read_flags(con) & CON_EXTENDED; con_level = console_srcu_read_loglevel(con); } else { /* Used only by devkmsg_read(). */ is_extended = true; con_level = LOGLEVEL_DEFAULT; } Note that I have used LOGLEVEL_DEFAULT instead of the hardcoded -1. IMHO, it is better to use a NAME and LOGLEVEL_DEFAULT is defined as -1 and the name more or less fits here. and then do: if (con && suppress_message_printing(r.info->level, con_level)) > goto out; > > if (is_extended) { > @@ -3789,6 +3856,9 @@ static int try_enable_preferred_console(struct console *newcon, > if (newcon->index < 0) > newcon->index = c->index; > > + // TODO: Will be configurable in a later patch I would remove the comment. It is enough to mention this in the commit message. > + newcon->level = -1; newcon->level = LOGLEVEL_DEFAULT; > + > if (_braille_register_console(newcon, c)) > return 0; > Best Regards, Petr PS: I am sorry for suggesting so many changes. I think that you did it as we agreed in the discussion about v5. v5 was discussed a long time ago and I have learned some new tricks with lockdep when working on the printk kthreads.
On Fri 2024-11-08 17:10:31, Petr Mladek wrote: > On Mon 2024-10-28 16:45:37, Chris Down wrote: > It would be nice to describe the basic logic in the commit message. > And also the handling when the global console_loglevel is modified > via sysrq. Something like: > > <proposal> > This commit adds the internal infrastructure to support per-console > log levels, which will be configurable through sysfs and the kernel > command line in future commits. > > The global `console_loglevel` is preserved and used as the default log > level for all consoles. Each console can override this global level > with its own specific log level stored in `struct console`. To > override the global level, the per-console log level must be greater > than 0; otherwise, the default value of -1 ensures the global level > is used. > > The existing `ignore_loglevel` command line parameter will override > both the global and per-console log levels. > </proposal> > > > --- a/kernel/printk/printk.c > > +++ b/kernel/printk/printk.c > > @@ -1287,9 +1287,62 @@ module_param(ignore_loglevel, bool, S_IRUGO | S_IWUSR); > > +enum loglevel_source > > +console_effective_loglevel_source(const struct console *con) > > +{ > > + if (WARN_ON_ONCE(!con)) > > + return LLS_GLOBAL; > > + > > + if (ignore_loglevel) > > + return LLS_IGNORE_LOGLEVEL; > > + > > + if (per_console_loglevel_is_set(con)) > > + return LLS_LOCAL; > > + > > + return LLS_GLOBAL; > > +} > > The @con parameter is nice. But it makes it hard to add lockdep > checks. So, I suggest to pass the console specific loglevel > as the parameter, like: > > enum loglevel_source > console_effective_loglevel_source(int con_level) > { > if (ignore_loglevel) > return LLS_IGNORE_LOGLEVEL; > > if (is_valid_per_console_loglevel(con_level)) > return LLS_LOCAL; > > return LLS_GLOBAL; > } > > int console_effective_loglevel(int con_level) > { > enum loglevel_source source; > int level; > > source = console_effective_loglevel_source(con_level); > > switch (source) { > case LLS_IGNORE_LOGLEVEL: > level = CONSOLE_LOGLEVEL_MOTORMOUTH; > break; > case LLS_LOCAL: > level = con_level; > break; > case LLS_GLOBAL: > level = console_level; > break; > default: > pr_warn("Unhandled console loglevel source: %d", source); > level = console_level; > break; > } > > return level; > } > > static bool suppress_message_printing(int level, int con_level) > { > return level >= console_effective_loglevel(con_level); > } > > > > > #ifdef CONFIG_BOOT_PRINTK_DELAY > > @@ -2975,7 +3042,7 @@ bool printk_get_next_message(struct printk_message *pmsg, struct console *con, > > pmsg->dropped = r.info->seq - seq; > > > > /* Never suppress when used in devkmsg_read() */ > > - if (con && suppress_message_printing(r.info->level)) > > + if (con && suppress_message_printing(r.info->level, con)) > > We could read the con_level at the beginning of the funcion. We > already read there the CON_EXTENDED attribute: > > int con_level: > > if (con) { > is_extended = console_srcu_read_flags(con) & CON_EXTENDED; > con_level = console_srcu_read_loglevel(con); > } else { > /* Used only by devkmsg_read(). */ > is_extended = true; > con_level = LOGLEVEL_DEFAULT; Or we could do: } else { /* Used only by devkmsg_read(). Show all messages there. */ is_extended = true; con_level = CONSOLE_LOGLEVEL_MOTORMOUTH; The LOGLEVE_DEFAULT would cause using the global loglevel. But we want to force showing the message. > } > > Note that I have used LOGLEVEL_DEFAULT instead of the hardcoded -1. > IMHO, it is better to use a NAME and LOGLEVEL_DEFAULT is defined as -1 > and the name more or less fits here. > > and then do: > > if (con && suppress_message_printing(r.info->level, con_level)) and then we do not need to check the @con here. This idea came when I was looking at resolving conflicts against the patchset which removed the hack in sysrq code, see https://lore.kernel.org/r/20241105-printk-loud-con-v2-0-bd3ecdf7b0e4@suse.com Best Regards, Petr
On Fri 2024-11-08 17:10:31, Petr Mladek wrote: > On Mon 2024-10-28 16:45:37, Chris Down wrote: > > Consoles can have vastly different latencies and throughputs. For > > example, writing a message to the serial console can take on the order > > of tens of milliseconds to get the UART to successfully write a message. > > While this might be fine for a single, one-off message, this can cause > > significant application-level stalls in situations where the kernel > > writes large amounts of information to the console. > > > > --- a/drivers/tty/sysrq.c > > +++ b/drivers/tty/sysrq.c > > @@ -101,11 +102,25 @@ __setup("sysrq_always_enabled", sysrq_always_enabled_setup); > > static void sysrq_handle_loglevel(u8 key) > > { > > u8 loglevel = key - '0'; > > + int cookie; > > + int warned = 0; > > + struct console *con; > > > > console_loglevel = CONSOLE_LOGLEVEL_DEFAULT; > > pr_info("Loglevel set to %u\n", loglevel); > > console_loglevel = loglevel; > > + > > + cookie = console_srcu_read_lock(); > > + for_each_console_srcu(con) { > > + if (!warned && per_console_loglevel_is_set(con)) { > > + warned = 1; > > + pr_warn("Overriding per-console loglevel from sysrq\n"); It just came to my mind. We could use pr_warn_once() and get rid of the @warned variable. It is slightly less optimal. But this is a slow path and the code would be easier. > > + } > > + WRITE_ONCE(con->level, -1); > > + } > > + console_srcu_read_unlock(cookie); > > } > > + > Best Regards, Petr
diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c index 930b04e3d148..daa9fe7dad54 100644 --- a/drivers/tty/sysrq.c +++ b/drivers/tty/sysrq.c @@ -51,6 +51,7 @@ #include <linux/syscalls.h> #include <linux/of.h> #include <linux/rcupdate.h> +#include <linux/console.h> #include <asm/ptrace.h> #include <asm/irq_regs.h> @@ -101,11 +102,25 @@ __setup("sysrq_always_enabled", sysrq_always_enabled_setup); static void sysrq_handle_loglevel(u8 key) { u8 loglevel = key - '0'; + int cookie; + int warned = 0; + struct console *con; console_loglevel = CONSOLE_LOGLEVEL_DEFAULT; pr_info("Loglevel set to %u\n", loglevel); console_loglevel = loglevel; + + cookie = console_srcu_read_lock(); + for_each_console_srcu(con) { + if (!warned && per_console_loglevel_is_set(con)) { + warned = 1; + pr_warn("Overriding per-console loglevel from sysrq\n"); + } + WRITE_ONCE(con->level, -1); + } + console_srcu_read_unlock(cookie); } + static const struct sysrq_key_op sysrq_loglevel_op = { .handler = sysrq_handle_loglevel, .help_msg = "loglevel(0-9)", diff --git a/include/linux/console.h b/include/linux/console.h index eba367bf605d..3ff22bfeef2a 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -321,6 +321,7 @@ struct nbcon_write_context { * @dropped: Number of unreported dropped ringbuffer records * @data: Driver private data * @node: hlist node for the console list + * @level: Console-specific loglevel * * @nbcon_state: State for nbcon consoles * @nbcon_seq: Sequence number of the next record for nbcon to print @@ -349,6 +350,7 @@ struct console { unsigned long dropped; void *data; struct hlist_node node; + int level; /* nbcon console specific members */ diff --git a/include/linux/printk.h b/include/linux/printk.h index eca9bb2ee637..5fbd6b7f1ab4 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -204,6 +204,7 @@ void printk_legacy_allow_panic_sync(void); extern bool nbcon_device_try_acquire(struct console *con); extern void nbcon_device_release(struct console *con); void nbcon_atomic_flush_unsafe(void); +bool per_console_loglevel_is_set(const struct console *con); #else static inline __printf(1, 0) int vprintk(const char *s, va_list args) @@ -303,6 +304,10 @@ static inline void nbcon_device_release(struct console *con) static inline void nbcon_atomic_flush_unsafe(void) { } +static inline bool per_console_loglevel_is_set(const struct console *con) +{ + return false; +} #endif diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h index 58ad209e0310..9303839d0551 100644 --- a/kernel/printk/internal.h +++ b/kernel/printk/internal.h @@ -15,6 +15,16 @@ int devkmsg_sysctl_set_loglvl(const struct ctl_table *table, int write, #define printk_sysctl_init() do { } while (0) #endif +enum loglevel_source { + LLS_GLOBAL, + LLS_LOCAL, + LLS_IGNORE_LOGLEVEL, +}; + +enum loglevel_source +console_effective_loglevel_source(const struct console *con); +int console_effective_loglevel(const struct console *con); + #define con_printk(lvl, con, fmt, ...) \ printk(lvl pr_fmt("%s%sconsole [%s%d] " fmt), \ (con->flags & CON_NBCON) ? "" : "legacy ", \ diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index dfe7011b863a..2e99b63efb46 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1287,9 +1287,62 @@ module_param(ignore_loglevel, bool, S_IRUGO | S_IWUSR); MODULE_PARM_DESC(ignore_loglevel, "ignore loglevel setting (prints all kernel messages to the console)"); -static bool suppress_message_printing(int level) +bool per_console_loglevel_is_set(const struct console *con) { - return (level >= console_loglevel && !ignore_loglevel); + return con && (READ_ONCE(con->level) > 0); +} + +/* + * Hierarchy of loglevel authority: + * + * 1. con->level. The locally set, console-specific loglevel. Optional, only + * valid if >0. + * 2. console_loglevel. The default global console loglevel, always present. + */ +enum loglevel_source +console_effective_loglevel_source(const struct console *con) +{ + if (WARN_ON_ONCE(!con)) + return LLS_GLOBAL; + + if (ignore_loglevel) + return LLS_IGNORE_LOGLEVEL; + + if (per_console_loglevel_is_set(con)) + return LLS_LOCAL; + + return LLS_GLOBAL; +} + +int console_effective_loglevel(const struct console *con) +{ + enum loglevel_source source; + int level; + + source = console_effective_loglevel_source(con); + + switch (source) { + case LLS_IGNORE_LOGLEVEL: + level = CONSOLE_LOGLEVEL_MOTORMOUTH; + break; + case LLS_LOCAL: + level = READ_ONCE(con->level); + break; + case LLS_GLOBAL: + level = console_loglevel; + break; + default: + pr_warn("Unhandled console loglevel source: %d", source); + level = console_loglevel; + break; + } + + return level; +} + +static bool suppress_message_printing(int level, struct console *con) +{ + return level >= console_effective_loglevel(con); } #ifdef CONFIG_BOOT_PRINTK_DELAY @@ -2122,7 +2175,21 @@ int printk_delay_msec __read_mostly; static inline void printk_delay(int level) { - if (suppress_message_printing(level)) + bool will_emit = false; + int cookie; + struct console *con; + + cookie = console_srcu_read_lock(); + + for_each_console_srcu(con) { + if (!suppress_message_printing(level, con)) { + will_emit = true; + break; + } + } + console_srcu_read_unlock(cookie); + + if (!will_emit) return; boot_delay_msec(); @@ -2975,7 +3042,7 @@ bool printk_get_next_message(struct printk_message *pmsg, struct console *con, pmsg->dropped = r.info->seq - seq; /* Never suppress when used in devkmsg_read() */ - if (con && suppress_message_printing(r.info->level)) + if (con && suppress_message_printing(r.info->level, con)) goto out; if (is_extended) { @@ -3789,6 +3856,9 @@ static int try_enable_preferred_console(struct console *newcon, if (newcon->index < 0) newcon->index = c->index; + // TODO: Will be configurable in a later patch + newcon->level = -1; + if (_braille_register_console(newcon, c)) return 0;
Consoles can have vastly different latencies and throughputs. For example, writing a message to the serial console can take on the order of tens of milliseconds to get the UART to successfully write a message. While this might be fine for a single, one-off message, this can cause significant application-level stalls in situations where the kernel writes large amounts of information to the console. This means that while you might want to send at least INFO level messages to (for example) netconsole, which is relatively fast, you may only want to send at least WARN level messages to the serial console. Such an implementation would permit debugging using the serial console in cases that netconsole doesn't receive messages during particularly bad system issues, while still keeping the noise low enough to avoid inducing latency in userspace applications. To mitigate this, add such an interface, extending the existing console loglevel controls to allow each console to have its own loglevel. One can't just disable the serial console, because one may actually need it in situations where the machine is in a bad enough state that nothing is received on netconsole. One also can't just bump the loglevel at runtime after the issue, because usually the machine is already so wedged by this point that it isn't responsive to such requests. The sysfs and kernel command line interfaces to set the per-console loglevel will be added later. For now, simply add the necessary internal infrastructure to be used by later patches. Signed-off-by: Chris Down <chris@chrisdown.name> --- drivers/tty/sysrq.c | 15 ++++++++ include/linux/console.h | 2 ++ include/linux/printk.h | 5 +++ kernel/printk/internal.h | 10 ++++++ kernel/printk/printk.c | 78 +++++++++++++++++++++++++++++++++++++--- 5 files changed, 106 insertions(+), 4 deletions(-)