Message ID | 20240710130335.765885-1-daniel.vetter@ffwll.ch (mailing list archive) |
---|---|
State | New, archived |
Series | bcachefs: no console_lock in bch2_print_string_as_lines |
On 2024-07-10, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> console_lock is the outermost subsystem lock for a lot of subsystems,
> which means get/put_user must nest within. Which means it cannot be
> acquired somewhere deeply nested in other locks, and most definitely
> not while holding fs locks potentially needed to resolve faults.
>
> console_trylock is the best we can do here. But John pointed out on a
> previous version that this is futile:
>
> "Using the console lock here at all is wrong. The console lock does not
> prevent other CPUs from calling printk() and inserting lines in between.
>
> "There is no way to guarantee a contiguous ringbuffer block using
> multiple printk() calls.
>
> "The console_lock usage should be removed."
>
> https://lore.kernel.org/lkml/87frsh33xp.fsf@jogness.linutronix.de/
>
> Do that.
>
> Reported-by: syzbot+6cebc1af246fe020a2f0@syzkaller.appspotmail.com
> References: https://lore.kernel.org/dri-devel/00000000000026c1ff061cd0de12@google.com/
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Fixes: a8f354284304 ("bcachefs: bch2_print_string_as_lines()")

Reviewed-by: John Ogness <john.ogness@linutronix.de>

On 2024-07-10, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> console_lock is the outermost subsystem lock for a lot of subsystems,
> which means get/put_user must nest within. Which means it cannot be
> acquired somewhere deeply nested in other locks, and most definitely
> not while holding fs locks potentially needed to resolve faults.
>
> console_trylock is the best we can do here. But John pointed out on a
> previous version that this is futile:
>
> "Using the console lock here at all is wrong. The console lock does not
> prevent other CPUs from calling printk() and inserting lines in between.
>
> "There is no way to guarantee a contiguous ringbuffer block using
> multiple printk() calls.
>
> "The console_lock usage should be removed."
>
> https://lore.kernel.org/lkml/87frsh33xp.fsf@jogness.linutronix.de/
>
> Do that.

Note that there is more of this incorrect usage of console lock in:

fs/bcachefs/debug.c:bch2_btree_verify_replica()

fs/bcachefs/bset.c:bch2_dump_btree_node()

from commit 1c6fdbd8f246("bcachefs: Initial commit")

... and its parent bcache:

drivers/md/bcache/debug.c:bch_btree_verify()

drivers/md/bcache/bset.c:bch_dump_bucket()

from commit cafe56359144("bcache: A block layer cache")

These should also be removed. Although Kent should verify that the
console lock is not providing some sort of necessary side-effect
synchronization.

John Ogness

On Wed, Jul 10, 2024 at 04:19:53PM +0206, John Ogness wrote:
> On 2024-07-10, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > console_lock is the outermost subsystem lock for a lot of subsystems,
> > which means get/put_user must nest within. Which means it cannot be
> > acquired somewhere deeply nested in other locks, and most definitely
> > not while holding fs locks potentially needed to resolve faults.
> >
> > console_trylock is the best we can do here. But John pointed out on a
> > previous version that this is futile:
> >
> > "Using the console lock here at all is wrong. The console lock does not
> > prevent other CPUs from calling printk() and inserting lines in between.
> >
> > "There is no way to guarantee a contiguous ringbuffer block using
> > multiple printk() calls.
> >
> > "The console_lock usage should be removed."
> >
> > https://lore.kernel.org/lkml/87frsh33xp.fsf@jogness.linutronix.de/
> >
> > Do that.
>
> Note that there is more of this incorrect usage of console lock in:
>
> fs/bcachefs/debug.c:bch2_btree_verify_replica()
>
> fs/bcachefs/bset.c:bch2_dump_btree_node()
>
> from commit 1c6fdbd8f246("bcachefs: Initial commit")
>
> ... and its parent bcache:
>
> drivers/md/bcache/debug.c:bch_btree_verify()
>
> drivers/md/bcache/bset.c:bch_dump_bucket()
>
> from commit cafe56359144("bcache: A block layer cache")
>
> These should also be removed. Although Kent should verify that the
> console lock is not providing some sort of necessary side-effect
> synchronization.

I'll take a look, at least some of them seem doable to audit without
deep bcachefs understanding. Thanks for pointing them out, I should have
looked a bit more at git grep ...
-Sima

diff --git a/fs/bcachefs/util.c b/fs/bcachefs/util.c
index de331dec2a99..dc891563d502 100644
--- a/fs/bcachefs/util.c
+++ b/fs/bcachefs/util.c
@@ -8,7 +8,6 @@
 
 #include <linux/bio.h>
 #include <linux/blkdev.h>
-#include <linux/console.h>
 #include <linux/ctype.h>
 #include <linux/debugfs.h>
 #include <linux/freezer.h>
@@ -261,7 +260,6 @@ void bch2_print_string_as_lines(const char *prefix, const char *lines)
 		return;
 	}
 
-	console_lock();
 	while (1) {
 		p = strchrnul(lines, '\n');
 		printk("%s%.*s\n", prefix, (int) (p - lines), lines);
@@ -269,7 +267,6 @@ void bch2_print_string_as_lines(const char *prefix, const char *lines)
 			break;
 		lines = p + 1;
 	}
-	console_unlock();
 }
 
 int bch2_save_backtrace(bch_stacktrace *stack, struct task_struct *task, unsigned skipnr,
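
For readers who want to poke at the loop outside the kernel, here is a
rough userspace sketch of what remains after the patch. It is only an
approximation, not the bcachefs code: fprintf() stands in for printk(),
find_or_end() is a hypothetical portable stand-in for strchrnul(), the
FILE * argument and the returned line count are additions made for
illustration, and the assert() replaces whatever argument checking the
real function does before the loop.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for strchrnul(): returns a pointer to the first
 * occurrence of c in s, or to the terminating NUL if c is not found.
 */
static const char *find_or_end(const char *s, int c)
{
	const char *p = strchr(s, c);

	return p ? p : s + strlen(s);
}

/*
 * Userspace approximation of the loop in bch2_print_string_as_lines()
 * after the patch. Each line is emitted by its own fprintf() call, just
 * as each line gets its own printk() in the kernel, so nothing here
 * makes the lines contiguous with respect to other writers.
 *
 * Returns the number of lines written (an addition for illustration).
 */
static int print_string_as_lines(FILE *out, const char *prefix,
				 const char *lines)
{
	const char *p;
	int n = 0;

	assert(out && prefix && lines);

	while (1) {
		p = find_or_end(lines, '\n');
		fprintf(out, "%s%.*s\n", prefix, (int) (p - lines), lines);
		n++;
		if (!*p)
			break;
		lines = p + 1;
	}

	return n;
}
```

Since every line goes out through a separate call, taking a lock around
the loop would not stop other callers from getting lines in between,
which is the point John made above about multiple printk() calls.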