Message ID | ca0022886e8f211a323a716653a1396a3bc91653.1722365899.git.daniel@makrotopia.org (mailing list archive) |
---|---|
State | New, archived |
Series | block: preparations for NVMEM provider |
Same NAK as last time. Random modules should not be able to hook directly into block device / partition probing. What you want to do can be done trivially in userspace in initramfs, please do that as recommended multiple times before.
On Tue, Jul 30, 2024 at 12:36:59PM -0700, Christoph Hellwig wrote:
> Same NAK as last time. Random modules should not be able to hook
> directly into block device / partition probing.

Would using delayed_work be indirect enough for your taste? If so, that would of course be rather easy to implement.

>
> What you want to do can be done trivially in userspace in initramfs,
> please do that as recommended multiple times before.
>

While the average desktop or server **general purpose** Linux distribution uses an initramfs, often generated dynamically on the target system during installation or kernel updates, this is NOT how things work in the embedded Linux world and for OpenWrt specifically.

For the OpenWrt community, the great thing is that the Linux Kernel, and even an identical userland, can run on embedded devices with as little as 8 megabytes of NOR flash as well as on much more resourceful systems with a large eMMC or even NVMe disks, but almost always with exactly one non-volatile storage device. All of those devices come without complex boot firmware, so no ACPI, no UEFI, ... just U-Boot and a DT blob which gets glued to the kernel in one way or another. And it would of course be nice if they would all wake up with correct MAC addresses and working WiFi, even if they come with larger (typically block-oriented) storage.

In terms of hardware such boards are often just two or three IC packages: SoC (sometimes including RAM) and some sort of non-volatile memory big enough to store a Linux-based firmware, factory data (MAC addresses, WiFi calibration, serial number) and user settings.

The same Linux Kernel source tree is also used to build kernels running on countless large servers (and a comparatively small number of desktop systems) with complex (proprietary) boot firmware and typically a handful of flashes and EEPROMs on the motherboard alone. On such systems, Ethernet NICs are dedicated chips or even PCIe cards, sometimes even with dedicated EEPROMs storing their MAC addresses. Or they are virtual machines where the host takes care of all of that.

Coexistence of all those different scales, without forcing the ways of large systems onto the small ones (and vice versa), has been a huge strength in my opinion.

When it comes to the small (sub $100, often much less) boards for plastic-case network appliances such as routers and access points, the exact same board can often be bought either with on-board SPI-NAND (used with UBI) or an eMMC. Of course, the vendors keep things as similar as possible, so the layout used for the NVMEM bits is often identical, just that in one case those (typically less than a memory page full of) bits are stored on an MTD partition or directly inside a UBI volume, and in the other case they are stored either at a fixed offset on the mmcblk0boot[01] device or inside a GPT partition.

This is just what reality for this class of devices already looks like today. In previous iterations of the series I've provided multiple examples of mainstream device vendors (Adtran, ASUS, GL.iNet, ...) to illustrate that.

Hence I fail to understand why different rules should apply to block devices than to EEPROMs, e-fuses, raw or SPI-connected NOR or NAND flashes, or UBI. Especially as this is about something completely optional, and disabled by default.

Effectively, if an interface to reference and access block-oriented storage devices as NVMEM providers in the same way as MTD, UBI, ... is rejected by the Linux kernel, it just means we will have to carry that as a downstream patch in OpenWrt in order to support those devices in a decent way.

Generating a device-specific initramfs for each and every device would not be decent imho. Carrying information about all devices in the filesystem used on every device is also not decent. Our goal is exactly to get rid of the board-specific switch-case shell script madness in userspace instead of having more of it...

Traversing DT in userspace (via /sys/firmware/) would of course be possible, but it's often simply too late (i.e. after rootfs has been mounted, and that includes initramfs) for many use-cases (e.g. nfsroot), and it would be a redundant implementation of things which are already implemented in the kernel. We don't like to repeat ourselves, nor do we like to deal with board-specific details in userland.

Having a complex do-it-all initramfs like on the common x86-centric desktop or server distribution is also not an option; it would never fit into the storage of lower-end devices with only a few megabytes of NOR flash. You'd need two copies of libc and busybox (one in initramfs and one in the actual rootfs), and even the extreme case of a single static ELF binary used as initrd would still occupy hundreds of kilobytes of storage, and be hell to maintain. If that sounds like very little to you, that means you haven't been dealing with that class of devices.

Thank you for your consideration

Daniel
diff --git a/block/Kconfig b/block/Kconfig
index 5b623b876d3b4..67cd4f92378af 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -209,6 +209,12 @@ config BLK_INLINE_ENCRYPTION_FALLBACK
 	  by falling back to the kernel crypto API when inline
 	  encryption hardware is not present.
 
+config BLOCK_NOTIFIERS
+	bool "Enable support for notifications in block layer"
+	help
+	  Enable this option to provide notifiers for other subsystems
+	  upon addition or removal of block devices.
+
 source "block/partitions/Kconfig"
 
 config BLK_MQ_PCI
diff --git a/block/Makefile b/block/Makefile
index ddfd21c1a9ffc..a131fa7d6b26e 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -38,3 +38,4 @@ obj-$(CONFIG_BLK_INLINE_ENCRYPTION)	+= blk-crypto.o blk-crypto-profile.o \
 					   blk-crypto-sysfs.o
 obj-$(CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK)	+= blk-crypto-fallback.o
 obj-$(CONFIG_BLOCK_HOLDER_DEPRECATED)	+= holder.o
+obj-$(CONFIG_BLOCK_NOTIFIERS)		+= blk-notify.o
diff --git a/block/blk-notify.c b/block/blk-notify.c
new file mode 100644
index 0000000000000..fd727288ea19a
--- /dev/null
+++ b/block/blk-notify.c
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Notifiers for addition and removal of block devices
+ *
+ * Copyright (c) 2024 Daniel Golle <daniel@makrotopia.org>
+ */
+
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/notifier.h>
+
+#include "blk.h"
+
+struct blk_device_list {
+	struct device *dev;
+	struct list_head list;
+};
+
+static RAW_NOTIFIER_HEAD(blk_notifier_list);
+static DEFINE_MUTEX(blk_notifier_lock);
+static LIST_HEAD(blk_devices);
+
+void blk_register_notify(struct notifier_block *nb)
+{
+	struct blk_device_list *existing_blkdev;
+
+	mutex_lock(&blk_notifier_lock);
+	raw_notifier_chain_register(&blk_notifier_list, nb);
+
+	list_for_each_entry(existing_blkdev, &blk_devices, list)
+		nb->notifier_call(nb, BLK_DEVICE_ADD, existing_blkdev->dev);
+
+	mutex_unlock(&blk_notifier_lock);
+}
+EXPORT_SYMBOL_GPL(blk_register_notify);
+
+void blk_unregister_notify(struct notifier_block *nb)
+{
+	mutex_lock(&blk_notifier_lock);
+	raw_notifier_chain_unregister(&blk_notifier_list, nb);
+	mutex_unlock(&blk_notifier_lock);
+}
+EXPORT_SYMBOL_GPL(blk_unregister_notify);
+
+static int blk_call_notifier_add(struct device *dev)
+{
+	struct blk_device_list *new_blkdev;
+
+	new_blkdev = kmalloc(sizeof(*new_blkdev), GFP_KERNEL);
+	if (!new_blkdev)
+		return -ENOMEM;
+
+	new_blkdev->dev = dev;
+	mutex_lock(&blk_notifier_lock);
+	list_add_tail(&new_blkdev->list, &blk_devices);
+	raw_notifier_call_chain(&blk_notifier_list, BLK_DEVICE_ADD, dev);
+	mutex_unlock(&blk_notifier_lock);
+	return 0;
+}
+
+static void blk_call_notifier_remove(struct device *dev)
+{
+	struct blk_device_list *old_blkdev, *tmp;
+
+	mutex_lock(&blk_notifier_lock);
+	list_for_each_entry_safe(old_blkdev, tmp, &blk_devices, list) {
+		if (old_blkdev->dev != dev)
+			continue;
+
+		list_del(&old_blkdev->list);
+		kfree(old_blkdev);
+	}
+	raw_notifier_call_chain(&blk_notifier_list, BLK_DEVICE_REMOVE, dev);
+	mutex_unlock(&blk_notifier_lock);
+}
+
+static struct class_interface blk_notifications_bus_interface __refdata = {
+	.class = &block_class,
+	.add_dev = &blk_call_notifier_add,
+	.remove_dev = &blk_call_notifier_remove,
+};
+
+static int __init blk_notifications_init(void)
+{
+	return class_interface_register(&blk_notifications_bus_interface);
+}
+device_initcall(blk_notifications_init);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e85ec73a07d57..2f871158d2860 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1682,4 +1682,15 @@ static inline bool bdev_can_atomic_write(struct block_device *bdev)
 
 #define DEFINE_IO_COMP_BATCH(name)	struct io_comp_batch name = { }
 
+
+#define BLK_DEVICE_ADD		1
+#define BLK_DEVICE_REMOVE	2
+#if defined(CONFIG_BLOCK_NOTIFIERS)
+void blk_register_notify(struct notifier_block *nb);
+void blk_unregister_notify(struct notifier_block *nb);
+#else
+static inline void blk_register_notify(struct notifier_block *nb) { };
+static inline void blk_unregister_notify(struct notifier_block *nb) { };
+#endif
+
 #endif /* _LINUX_BLKDEV_H */
Add notifier block to notify other subsystems about the addition or removal of block devices.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
---
 block/Kconfig          |  6 +++
 block/Makefile         |  1 +
 block/blk-notify.c     | 87 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/blkdev.h | 11 ++++++
 4 files changed, 105 insertions(+)
 create mode 100644 block/blk-notify.c
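
One detail of the patch that is easy to miss is that blk_register_notify() replays a BLK_DEVICE_ADD event for every block device that already exists, so a listener registering late still sees all devices. The following is a minimal userspace sketch of that register-with-replay pattern, not kernel code: every name in it (model_register, model_add_device, model_remove_device, count_events, the DEV_ADD/DEV_REMOVE constants) is hypothetical and made up for illustration. It is single-threaded, so the mutex the patch holds around list and chain access is deliberately left out, and it does not model notifier priorities.

```c
/* Userspace model of the register-with-replay pattern; names are illustrative. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { DEV_ADD = 1, DEV_REMOVE = 2 };	/* stand-ins for BLK_DEVICE_ADD/REMOVE */

struct notifier {
	void (*call)(struct notifier *nb, int event, const char *dev);
	struct notifier *next;
};

struct dev_entry {
	const char *name;
	struct dev_entry *next;
};

static struct notifier *chain;		/* registered listeners */
static struct dev_entry *devices;	/* devices seen so far */

/* Deliver one event to every registered listener. */
static void notify_all(int event, const char *dev)
{
	for (struct notifier *nb = chain; nb; nb = nb->next)
		nb->call(nb, event, dev);
}

/* Register a listener, then replay an ADD event for each known device. */
void model_register(struct notifier *nb)
{
	assert(nb && nb->call);
	nb->next = chain;
	chain = nb;
	for (struct dev_entry *d = devices; d; d = d->next)
		nb->call(nb, DEV_ADD, d->name);
}

/* Remember a new device and tell current listeners about it. */
int model_add_device(const char *name)
{
	struct dev_entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->name = name;
	e->next = devices;
	devices = e;
	notify_all(DEV_ADD, name);
	return 0;
}

/* Forget a device (all matching entries) and notify listeners. */
void model_remove_device(const char *name)
{
	struct dev_entry **pp = &devices;

	while (*pp) {
		if (strcmp((*pp)->name, name) == 0) {
			struct dev_entry *dead = *pp;

			*pp = dead->next;
			free(dead);
		} else {
			pp = &(*pp)->next;
		}
	}
	notify_all(DEV_REMOVE, name);
}

/* Example listener: counts the events it receives. */
int adds_seen, removes_seen;

void count_events(struct notifier *nb, int event, const char *dev)
{
	(void)nb;
	(void)dev;
	if (event == DEV_ADD)
		adds_seen++;
	else if (event == DEV_REMOVE)
		removes_seen++;
}
```

In this model, a listener registered after one device has been added receives that device's ADD event immediately at registration time, and later additions and removals arrive as they happen. In the kernel patch the same guarantee is what would let a consumer such as an NVMEM provider be loaded after the block devices have already been probed.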