Message ID | 20200330115535.3215-2-vbabka@suse.cz (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | support setting sysctl parameters from kernel command line | expand |
On 3/30/20 1:55 PM, Vlastimil Babka wrote: > A recently proposed patch to add vm_swappiness command line parameter in > addition to existing sysctl [1] made me wonder why we don't have a general > support for passing sysctl parameters via command line. Googling found only > somebody else wondering the same [2], but I haven't found any prior discussion > with reasons why not to do this. > > Settings the vm_swappiness issue aside (the underlying issue might be solved in > a different way), quick search of kernel-parameters.txt shows there are already > some that exist as both sysctl and kernel parameter - hung_task_panic, > nmi_watchdog, numa_zonelist_order, traceoff_on_warning. A general mechanism > would remove the need to add more of those one-offs and might be handy in > situations where configuration by e.g. /etc/sysctl.d/ is impractical. > > Hence, this patch adds a new parse_args() pass that looks for parameters > prefixed by 'sysctl.' and tries to interpret them as writes to the > corresponding sys/ files using an temporary in-kernel procfs mount. This > mechanism was suggested by Eric W. Biederman [3], as it handles all dynamically > registered sysctl tables. Errors due to e.g. invalid parameter name or value > are reported in the kernel log. > > The processing is hooked right before the init process is loaded, as some > handlers might be more complicated than simple setters and might need some > subsystems to be initialized. At the moment the init process can be started and > eventually execute a process writing to /proc/sys/ then it should be also fine > to do that from the kernel. > > Sysctls registered later on module load time are not set by this mechanism - > it's expected that in such scenarios, setting sysctl values from userspace is > practical enough. > > [1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/ > [2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter > [3] https://lore.kernel.org/r/87bloj2skm.fsf@x220.int.ebiederm.org/ > > Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Boo, all the error prints should terminate with \n Will wait for feedback before resend.
On Mon, Mar 30, 2020 at 01:55:33PM +0200, Vlastimil Babka wrote: > A recently proposed patch to add vm_swappiness command line parameter in > addition to existing sysctl [1] made me wonder why we don't have a general > support for passing sysctl parameters via command line. Googling found only > somebody else wondering the same [2], but I haven't found any prior discussion > with reasons why not to do this. > > Settings the vm_swappiness issue aside (the underlying issue might be solved in > a different way), quick search of kernel-parameters.txt shows there are already > some that exist as both sysctl and kernel parameter - hung_task_panic, > nmi_watchdog, numa_zonelist_order, traceoff_on_warning. A general mechanism > would remove the need to add more of those one-offs and might be handy in > situations where configuration by e.g. /etc/sysctl.d/ is impractical. > > Hence, this patch adds a new parse_args() pass that looks for parameters > prefixed by 'sysctl.' and tries to interpret them as writes to the > corresponding sys/ files using an temporary in-kernel procfs mount. This > mechanism was suggested by Eric W. Biederman [3], as it handles all dynamically > registered sysctl tables. Errors due to e.g. invalid parameter name or value > are reported in the kernel log. > > The processing is hooked right before the init process is loaded, as some > handlers might be more complicated than simple setters and might need some > subsystems to be initialized. At the moment the init process can be started and > eventually execute a process writing to /proc/sys/ then it should be also fine > to do that from the kernel. > > Sysctls registered later on module load time are not set by this mechanism - > it's expected that in such scenarios, setting sysctl values from userspace is > practical enough. > > [1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/ > [2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter > [3] https://lore.kernel.org/r/87bloj2skm.fsf@x220.int.ebiederm.org/ > > Signed-off-by: Vlastimil Babka <vbabka@suse.cz> > --- > .../admin-guide/kernel-parameters.txt | 9 ++ > fs/proc/proc_sysctl.c | 100 ++++++++++++++++++ > include/linux/sysctl.h | 4 + > init/main.c | 2 + > 4 files changed, 115 insertions(+) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > index c07815d230bc..81ff626fc700 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -4793,6 +4793,15 @@ > > switches= [HW,M68k] > > + sysctl.*= [KNL] > + Set a sysctl parameter, right before loading the init > + process, as if the value was written to the respective > + /proc/sys/... file. Both '.' and '/' are recognized as > + separators. Unrecognized parameters and invalid values > + are reported in the kernel log. Sysctls registered > + later by a loaded module cannot be set this way. > + Example: sysctl.vm.swappiness=40 > + > sysfs.deprecated=0|1 [KNL] > Enable/disable old style sysfs layout for old udev > on older distributions. When this option is enabled > diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c > index c75bb4632ed1..653188c9c4c9 100644 > --- a/fs/proc/proc_sysctl.c > +++ b/fs/proc/proc_sysctl.c > @@ -14,6 +14,7 @@ > #include <linux/mm.h> > #include <linux/module.h> > #include <linux/bpf-cgroup.h> > +#include <linux/mount.h> > #include "internal.h" > > static const struct dentry_operations proc_sys_dentry_operations; > @@ -1725,3 +1726,102 @@ int __init proc_sys_init(void) > > return sysctl_init(); > } > + > +/* Set sysctl value passed on kernel command line. */ > +static int process_sysctl_arg(char *param, char *val, > + const char *unused, void *arg) > +{ > + char *path; > + struct vfsmount *proc_mnt = *((struct vfsmount **)arg); I would just make this: struct vfsmount **proc_mnt = (struct vfsmount **)arg; > + struct file_system_type *proc_fs_type; > + struct file *file; > + int len; > + int err; > + loff_t pos = 0; > + ssize_t wret; > + > + if (strncmp(param, "sysctl", sizeof("sysctl") - 1)) > + return 0; > + > + param += sizeof("sysctl") - 1; > + > + if (param[0] != '/' && param[0] != '.') > + return 0; > + > + param++; > + > + if (!proc_mnt) { if (!*proc_mnt) { I would also add a comment here to explain that this is doing an on-demand mount so that it doesn't have to mount proc if there are not sysctl parameters. > + proc_fs_type = get_fs_type("proc"); > + if (!proc_fs_type) { > + pr_err("Failed to find procfs to set sysctl from command line"); > + return 0; > + } > + proc_mnt = kern_mount(proc_fs_type); *proc_mnt = kern_mount(proc_fs_type); > + put_filesystem(proc_fs_type); > + if (IS_ERR(proc_mnt)) { if (IS_ERR(*proc_mnt)) { > + pr_err("Failed to mount procfs to set sysctl from command line"); > + return 0; > + } > + *((struct vfsmount **)arg) = proc_mnt; Then drop this line. > + } > + > + path = kasprintf(GFP_KERNEL, "sys/%s", param); > + if (!path) > + panic("%s: Failed to allocate path for %s\n", __func__, param); > + strreplace(path, '.', '/'); > + > + file = file_open_root(proc_mnt->mnt_root, proc_mnt, path, O_WRONLY, 0); file = file_open_root((*proc_mnt)->mnt_root, *proc_mnt, path, O_WRONLY, 0); > + if (IS_ERR(file)) { > + err = PTR_ERR(file); > + if (err == -ENOENT) > + pr_err("Failed to set sysctl parameter '%s=%s': parameter not found", > + param, val); > + else if (err == -EACCES) > + pr_err("Failed to set sysctl parameter '%s=%s': permission denied (read-only?)", > + param, val); > + else > + pr_err("Error %pe opening proc file to set sysctl parameter '%s=%s'", > + file, param, val); > + goto out; > + } > + len = strlen(val); > + wret = kernel_write(file, val, len, &pos); > + if (wret < 0) { > + err = wret; > + if (err == -EINVAL) > + pr_err("Failed to set sysctl parameter '%s=%s': invalid value", > + param, val); > + else > + pr_err("Error %pe writing to proc file to set sysctl parameter '%s=%s'", > + ERR_PTR(err), param, val); > + } else if (wret != len) { > + pr_err("Wrote only %ld bytes of %d writing to proc file %s to set sysctl parameter '%s=%s'", > + wret, len, path, param, val); > + } > + > + err = filp_close(file, NULL); > + if (err) > + pr_err("Error %pe closing proc file to set sysctl parameter '%s=%s'", > + ERR_PTR(err), param, val); > +out: > + kfree(path); > + return 0; > +} > + > +void do_sysctl_args(void) > +{ > + char *command_line; > + struct vfsmount *proc_mnt = NULL; > + > + command_line = kstrdup(saved_command_line, GFP_KERNEL); > + if (!command_line) > + panic("%s: Failed to allocate copy of command line\n", __func__); > + > + parse_args("Setting sysctl args", command_line, > + NULL, 0, -1, -1, &proc_mnt, process_sysctl_arg); > + > + if (proc_mnt) > + kern_unmount(proc_mnt); > + > + kfree(command_line); > +} > diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h > index 02fa84493f23..5f3f2a00d75f 100644 > --- a/include/linux/sysctl.h > +++ b/include/linux/sysctl.h > @@ -206,6 +206,7 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path, > void unregister_sysctl_table(struct ctl_table_header * table); > > extern int sysctl_init(void); > +void do_sysctl_args(void); > > extern struct ctl_table sysctl_mount_point[]; > > @@ -236,6 +237,9 @@ static inline void setup_sysctl_set(struct ctl_table_set *p, > { > } > > +void do_sysctl_args(void) As with the others in the no-op case: static inline void do_sysctl_args(void) > +{ > +} > #endif /* CONFIG_SYSCTL */ > > int sysctl_max_threads(struct ctl_table *table, int write, > diff --git a/init/main.c b/init/main.c > index ee4947af823f..a91ea166a731 100644 > --- a/init/main.c > +++ b/init/main.c > @@ -1367,6 +1367,8 @@ static int __ref kernel_init(void *unused) > > rcu_end_inkernel_boot(); > > + do_sysctl_args(); > + > if (ramdisk_execute_command) { > ret = run_init_process(ramdisk_execute_command); > if (!ret) > -- > 2.25.1 > Otherwise, yes, looks good!
On Mon, Mar 30, 2020 at 06:23:57PM +0200, Vlastimil Babka wrote: > Boo, all the error prints should terminate with \n > Will wait for feedback before resend. eek, yes, good catch. :)
Sorry to be late to the apocalypse review party for this, feedback below. On Mon, Mar 30, 2020 at 01:55:33PM +0200, Vlastimil Babka wrote: > A recently proposed patch to add vm_swappiness command line parameter in > addition to existing sysctl [1] made me wonder why we don't have a general > support for passing sysctl parameters via command line. Googling found only > somebody else wondering the same [2], but I haven't found any prior discussion > with reasons why not to do this. > > Settings the vm_swappiness issue aside (the underlying issue might be solved in > a different way), quick search of kernel-parameters.txt shows there are already > some that exist as both sysctl and kernel parameter - hung_task_panic, > nmi_watchdog, numa_zonelist_order, traceoff_on_warning. A general mechanism > would remove the need to add more of those one-offs and might be handy in > situations where configuration by e.g. /etc/sysctl.d/ is impractical. > > Hence, this patch adds a new parse_args() pass that looks for parameters > prefixed by 'sysctl.' and tries to interpret them as writes to the > corresponding sys/ files using an temporary in-kernel procfs mount. This > mechanism was suggested by Eric W. Biederman [3], as it handles all dynamically > registered sysctl tables. "even though we don't handle modular sysctls" might be safer to add. > Errors due to e.g. invalid parameter name or value > are reported in the kernel log. > > The processing is hooked right before the init process is loaded, as some > handlers might be more complicated than simple setters and might need some > subsystems to be initialized. At the moment the init process can be started and > eventually execute a process writing to /proc/sys/ then it should be also fine > to do that from the kernel. This is wonderful when we think about existing sysctls which have corresponding silly boot params that do the same thing. However, shoving a boot param capability down every possible built-in sysctl brings forward support considerations we should take serious, as this would add a new user interface and we'll have to support it. Simply put, not all sysctls should be born to be boot params. I suggest we white-list which ones we can process, so that only sysctls we *do* review and agree are good candidates get allowed to also be boot params. Calling a proc hanlder early might seem functional, but if the subsystem defers evaluation of a setting later, then any boot param set would be lifted anyway. I think each syscl would need to be reviewed for this to be supported in a way that doesn't create odd unexpected system settings which we later have to support forever. Should we not do this, we'll have to live with the consequences of supporting the full swoop of sysctls are boot params, whatever consequences those may be. > Sysctls registered later on module load time are not set by this mechanism - > it's expected that in such scenarios, setting sysctl values from userspace is > practical enough. I'm just not sure if its worth supporting these, for modules we have module params, but those with more creative userspace might have a better idea as to why we'd want to support this. I just can't see it yet. > [1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/ > [2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter > [3] https://lore.kernel.org/r/87bloj2skm.fsf@x220.int.ebiederm.org/ > > Signed-off-by: Vlastimil Babka <vbabka@suse.cz> > --- > .../admin-guide/kernel-parameters.txt | 9 ++ > fs/proc/proc_sysctl.c | 100 ++++++++++++++++++ > include/linux/sysctl.h | 4 + > init/main.c | 2 + > 4 files changed, 115 insertions(+) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > index c07815d230bc..81ff626fc700 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -4793,6 +4793,15 @@ > > switches= [HW,M68k] > > + sysctl.*= [KNL] > + Set a sysctl parameter, right before loading the init > + process, as if the value was written to the respective > + /proc/sys/... file. Both '.' and '/' are recognized as > + separators. Unrecognized parameters and invalid values > + are reported in the kernel log. Sysctls registered > + later by a loaded module cannot be set this way. > + Example: sysctl.vm.swappiness=40 > + > sysfs.deprecated=0|1 [KNL] > Enable/disable old style sysfs layout for old udev > on older distributions. When this option is enabled > diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c > index c75bb4632ed1..653188c9c4c9 100644 > --- a/fs/proc/proc_sysctl.c > +++ b/fs/proc/proc_sysctl.c > @@ -14,6 +14,7 @@ > #include <linux/mm.h> > #include <linux/module.h> > #include <linux/bpf-cgroup.h> > +#include <linux/mount.h> > #include "internal.h" > > static const struct dentry_operations proc_sys_dentry_operations; > @@ -1725,3 +1726,102 @@ int __init proc_sys_init(void) > > return sysctl_init(); > } > + > +/* Set sysctl value passed on kernel command line. */ > +static int process_sysctl_arg(char *param, char *val, > + const char *unused, void *arg) > +{ > + char *path; > + struct vfsmount *proc_mnt = *((struct vfsmount **)arg); > + struct file_system_type *proc_fs_type; > + struct file *file; > + int len; > + int err; > + loff_t pos = 0; > + ssize_t wret; > + > + if (strncmp(param, "sysctl", sizeof("sysctl") - 1)) > + return 0; > + > + param += sizeof("sysctl") - 1; > + > + if (param[0] != '/' && param[0] != '.') > + return 0; > + > + param++; > + > + if (!proc_mnt) { > + proc_fs_type = get_fs_type("proc"); > + if (!proc_fs_type) { > + pr_err("Failed to find procfs to set sysctl from command line"); > + return 0; > + } > + proc_mnt = kern_mount(proc_fs_type); > + put_filesystem(proc_fs_type); > + if (IS_ERR(proc_mnt)) { > + pr_err("Failed to mount procfs to set sysctl from command line"); > + return 0; > + } > + *((struct vfsmount **)arg) = proc_mnt; > + } > + > + path = kasprintf(GFP_KERNEL, "sys/%s", param); > + if (!path) > + panic("%s: Failed to allocate path for %s\n", __func__, param); > + strreplace(path, '.', '/'); > + > + file = file_open_root(proc_mnt->mnt_root, proc_mnt, path, O_WRONLY, 0); > + if (IS_ERR(file)) { > + err = PTR_ERR(file); > + if (err == -ENOENT) > + pr_err("Failed to set sysctl parameter '%s=%s': parameter not found", > + param, val); > + else if (err == -EACCES) > + pr_err("Failed to set sysctl parameter '%s=%s': permission denied (read-only?)", > + param, val); > + else > + pr_err("Error %pe opening proc file to set sysctl parameter '%s=%s'", > + file, param, val); > + goto out; > + } > + len = strlen(val); > + wret = kernel_write(file, val, len, &pos); > + if (wret < 0) { > + err = wret; > + if (err == -EINVAL) > + pr_err("Failed to set sysctl parameter '%s=%s': invalid value", > + param, val); > + else > + pr_err("Error %pe writing to proc file to set sysctl parameter '%s=%s'", > + ERR_PTR(err), param, val); > + } else if (wret != len) { > + pr_err("Wrote only %ld bytes of %d writing to proc file %s to set sysctl parameter '%s=%s'", > + wret, len, path, param, val); > + } > + > + err = filp_close(file, NULL); > + if (err) > + pr_err("Error %pe closing proc file to set sysctl parameter '%s=%s'", > + ERR_PTR(err), param, val); > +out: > + kfree(path); > + return 0; > +} > + > +void do_sysctl_args(void) > +{ > + char *command_line; > + struct vfsmount *proc_mnt = NULL; > + > + command_line = kstrdup(saved_command_line, GFP_KERNEL); can you use kstrndup() ? And then use kfree_const()? Yes, feel free to move __kstrncpy() to a generic kstrncpy(). > + if (!command_line) > + panic("%s: Failed to allocate copy of command line\n", __func__); > + > + parse_args("Setting sysctl args", command_line, > + NULL, 0, -1, -1, &proc_mnt, process_sysctl_arg); > + > + if (proc_mnt) > + kern_unmount(proc_mnt); > + > + kfree(command_line); > +} Then, can we get this tested as part of lib/test_sysctl.c with its respective tools/testing/selftests/sysctl/sysctl.sh ? Luis
On 3/31/20 12:44 AM, Luis Chamberlain wrote: > Sorry to be late to the apocalypse review party for this, feedback below. > > On Mon, Mar 30, 2020 at 01:55:33PM +0200, Vlastimil Babka wrote: >> A recently proposed patch to add vm_swappiness command line parameter in >> addition to existing sysctl [1] made me wonder why we don't have a general >> support for passing sysctl parameters via command line. Googling found only >> somebody else wondering the same [2], but I haven't found any prior discussion >> with reasons why not to do this. >> >> Settings the vm_swappiness issue aside (the underlying issue might be solved in >> a different way), quick search of kernel-parameters.txt shows there are already >> some that exist as both sysctl and kernel parameter - hung_task_panic, >> nmi_watchdog, numa_zonelist_order, traceoff_on_warning. A general mechanism >> would remove the need to add more of those one-offs and might be handy in >> situations where configuration by e.g. /etc/sysctl.d/ is impractical. >> >> Hence, this patch adds a new parse_args() pass that looks for parameters >> prefixed by 'sysctl.' and tries to interpret them as writes to the >> corresponding sys/ files using an temporary in-kernel procfs mount. This >> mechanism was suggested by Eric W. Biederman [3], as it handles all dynamically >> registered sysctl tables. > > "even though we don't handle modular sysctls" might be safer to add. OK >> Errors due to e.g. invalid parameter name or value >> are reported in the kernel log. >> >> The processing is hooked right before the init process is loaded, as some >> handlers might be more complicated than simple setters and might need some >> subsystems to be initialized. At the moment the init process can be started and >> eventually execute a process writing to /proc/sys/ then it should be also fine >> to do that from the kernel. > > This is wonderful when we think about existing sysctls which have > corresponding silly boot params that do the same thing. However, shoving > a boot param capability down every possible built-in sysctl brings > forward support considerations we should take serious, as this would > add a new user interface and we'll have to support it. Hmm, if I boot with an initramfs with init process that does mount /proc and set some sysctl there as the very first thing, then this will be almost the same moment as my patch does it. There is no further kernel initialization in between. So with your logic we already do support all non-modular sysctls to be set so early. > Simply put, not all sysctls should be born to be boot params. I suggest > we white-list which ones we can process, so that only sysctls we *do* > review and agree are good candidates get allowed to also be boot params. By above, the nuber of sysctls that will be problematic with this boot param mechanism, but work properly when set from init process immediately, should be near zero, and I would expect truly zero. As such, whitelist approach seems excessive to me and it would take a lot of effort to build it, and it will be a lottery which sysctl would work as boot param on which kernel version. Sounds like a lot of trouble for little benefit to me. > Calling a proc hanlder early might seem functional, but if the subsystem > defers evaluation of a setting later, then any boot param set would be > lifted anyway. I'm not sure I understand, can you show me some example please? > I think each syscl would need to be reviewed for this to > be supported in a way that doesn't create odd unexpected system settings > which we later have to support forever. We would already do per the initramfs argument. > Should we not do this, we'll have to live with the consequences of > supporting the full swoop of sysctls are boot params, whatever > consequences those may be. Of course when the first user tries to set some particular sysctl as boot param and finds and reports it doesn't work as intended, then it can be fixed or blacklisted and it can't break anyone else? >> Sysctls registered later on module load time are not set by this mechanism - >> it's expected that in such scenarios, setting sysctl values from userspace is >> practical enough. > > I'm just not sure if its worth supporting these, for modules we have > module params, but those with more creative userspace might have a > better idea as to why we'd want to support this. I just can't see it > yet. Sure, I can defer that part for later now. >> [1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/ >> [2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter >> [3] https://lore.kernel.org/r/87bloj2skm.fsf@x220.int.ebiederm.org/ >> >> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> >> --- >
On Tue 31-03-20 09:42:46, Vlastimil Babka wrote: [...] > > Should we not do this, we'll have to live with the consequences of > > supporting the full swoop of sysctls are boot params, whatever > > consequences those may be. > > Of course when the first user tries to set some particular sysctl as boot param > and finds and reports it doesn't work as intended, then it can be fixed or > blacklisted and it can't break anyone else? Absolutely agreed. I would be really careful to not overengineer this whole thing.
On Tue, Mar 31, 2020 at 09:42:46AM +0200, Vlastimil Babka wrote: > On 3/31/20 12:44 AM, Luis Chamberlain wrote: > > > > This is wonderful when we think about existing sysctls which have > > corresponding silly boot params that do the same thing. However, shoving > > a boot param capability down every possible built-in sysctl brings > > forward support considerations we should take serious, as this would > > add a new user interface and we'll have to support it. > > Hmm, if I boot with an initramfs with init process that does mount /proc and set > some sysctl there as the very first thing, then this will be almost the same > moment as my patch does it. There is no further kernel initialization in > between. So with your logic we already do support all non-modular sysctls to be > set so early. Yes, true. Then by all means: Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Luis
On Tue, Mar 31, 2020 at 09:48:36AM +0200, Michal Hocko wrote: > On Tue 31-03-20 09:42:46, Vlastimil Babka wrote: > [...] > > > Should we not do this, we'll have to live with the consequences of > > > supporting the full swoop of sysctls are boot params, whatever > > > consequences those may be. > > > > Of course when the first user tries to set some particular sysctl as boot param > > and finds and reports it doesn't work as intended, then it can be fixed or > > blacklisted and it can't break anyone else? > > Absolutely agreed. I would be really careful to not overengineer this > whole thing. Right -- this is supposed to be _simple_, and I think that's the primary benefit here. If we encounter problems we can fix those cases. The careful place, I think, needs to be with converting existing boot params to be aliases. That's when timing considerations need to be taken into account carefully.
On 3/31/20 12:44 AM, Luis Chamberlain wrote: >> + } else if (wret != len) { >> + pr_err("Wrote only %ld bytes of %d writing to proc file %s to set sysctl parameter '%s=%s'", >> + wret, len, path, param, val); >> + } >> + >> + err = filp_close(file, NULL); >> + if (err) >> + pr_err("Error %pe closing proc file to set sysctl parameter '%s=%s'", >> + ERR_PTR(err), param, val); >> +out: >> + kfree(path); >> + return 0; >> +} >> + >> +void do_sysctl_args(void) >> +{ >> + char *command_line; >> + struct vfsmount *proc_mnt = NULL; >> + >> + command_line = kstrdup(saved_command_line, GFP_KERNEL); > > can you use kstrndup() ? And then use kfree_const()? Yes, feel free to I don't follow, what am I missing? Do you mean this? size_t len = strlen(saved_command_line); command_line = kstrndup(saved_command_line, len, GFP_KERNEL); What would be the advantage over plain kstrdup()? As for kfree_const(), when would command_line be .rodata? I don't see using kstrndup() resulting in that. > move __kstrncpy() to a generic kstrncpy(). > >> + if (!command_line) >> + panic("%s: Failed to allocate copy of command line\n", __func__); >> + >> + parse_args("Setting sysctl args", command_line, >> + NULL, 0, -1, -1, &proc_mnt, process_sysctl_arg); >> + >> + if (proc_mnt) >> + kern_unmount(proc_mnt); >> + >> + kfree(command_line); >> +} > > Then, can we get this tested as part of lib/test_sysctl.c with its > respective tools/testing/selftests/sysctl/sysctl.sh ? Hmm so I add some sysctl to the test "module" (in fact the 'config' file says it should be build with 'y', which would be needed anyway) and expand the test instructions so that the test kernel boot has to include it on the command line, and then I verify it has been set? Or do you see a better way? Thanks, Vlastimil > Luis >
On Wed, Apr 01, 2020 at 01:01:47PM +0200, Vlastimil Babka wrote: > On 3/31/20 12:44 AM, Luis Chamberlain wrote: > >> + } else if (wret != len) { > >> + pr_err("Wrote only %ld bytes of %d writing to proc file %s to set sysctl parameter '%s=%s'", > >> + wret, len, path, param, val); > >> + } > >> + > >> + err = filp_close(file, NULL); > >> + if (err) > >> + pr_err("Error %pe closing proc file to set sysctl parameter '%s=%s'", > >> + ERR_PTR(err), param, val); > >> +out: > >> + kfree(path); > >> + return 0; > >> +} > >> + > >> +void do_sysctl_args(void) > >> +{ > >> + char *command_line; > >> + struct vfsmount *proc_mnt = NULL; > >> + > >> + command_line = kstrdup(saved_command_line, GFP_KERNEL); > > > > can you use kstrndup() ? And then use kfree_const()? Yes, feel free to > > I don't follow, what am I missing? Do you mean this? > > size_t len = strlen(saved_command_line); > command_line = kstrndup(saved_command_line, len, GFP_KERNEL); > > What would be the advantage over plain kstrdup()? > As for kfree_const(), when would command_line be .rodata? I don't see using > kstrndup() resulting in that. The const nature of using kstrdup() comes with using const for your purpose. ie: const char *const_command_line = saved_command_line; The point of a kstrncpy() then is to ensure force a const throughout your use if you know you don't need modifications. > > move __kstrncpy() to a generic kstrncpy(). > > > >> + if (!command_line) > >> + panic("%s: Failed to allocate copy of command line\n", __func__); > >> + > >> + parse_args("Setting sysctl args", command_line, > >> + NULL, 0, -1, -1, &proc_mnt, process_sysctl_arg); > >> + > >> + if (proc_mnt) > >> + kern_unmount(proc_mnt); > >> + > >> + kfree(command_line); > >> +} > > > > Then, can we get this tested as part of lib/test_sysctl.c with its > > respective tools/testing/selftests/sysctl/sysctl.sh ? > > Hmm so I add some sysctl to the test "module" (in fact the 'config' file says it > should be build with 'y', which would be needed anyway) and expand the test > instructions so that the test kernel boot has to include it on the command line, > and then I verify it has been set? Or do you see a better way? We don't necessarily have a way to test the use boot params today. That reveals an are which we should eventually put some focus on in the future. In the meantime we have to deal with what we have. So let's think about this: You are adding a new cmdline sysctl boot param, and also a wrapper for those old boot bootparams to also work using both new sysctl path and old path. Testing just these both should suffice. How about this: For testing the new feature you are adding, can you extend the default boot params *always* if a new CONFIG_TEST_SYSCTL_CMDLINE is set? Then upon boot we can verify the proc handlers for these new boot params got kicked, and likewise some other proc handlers which also can be used from the cmdline are *not* set. For this later set, we already have a series of test syctls you can use. In fact, you can use the existing syctls for both cases already I believe, its just a matter of adding this new CONFIG_TEST_SYSCTL_CMDLINE which would extend the cmdline, and these tests would take place *first* on the script. That would test both cases with one kernel. You could then also add a bogus new sysctl which also expands to a silly raw boot param to test the wrapper you are providing. That would be the only new test syctl you would need to add. If you can think of a way to *break* your new functionality and ensure it doesn't break the kernel, even better. Luis
On Thu, Apr 02, 2020 at 04:04:42PM +0000, Luis Chamberlain wrote: > On Wed, Apr 01, 2020 at 01:01:47PM +0200, Vlastimil Babka wrote: > > On 3/31/20 12:44 AM, Luis Chamberlain wrote: > > >> + } else if (wret != len) { > > >> + pr_err("Wrote only %ld bytes of %d writing to proc file %s to set sysctl parameter '%s=%s'", > > >> + wret, len, path, param, val); > > >> + } > > >> + > > >> + err = filp_close(file, NULL); > > >> + if (err) > > >> + pr_err("Error %pe closing proc file to set sysctl parameter '%s=%s'", > > >> + ERR_PTR(err), param, val); > > >> +out: > > >> + kfree(path); > > >> + return 0; > > >> +} > > >> + > > >> +void do_sysctl_args(void) > > >> +{ > > >> + char *command_line; > > >> + struct vfsmount *proc_mnt = NULL; > > >> + > > >> + command_line = kstrdup(saved_command_line, GFP_KERNEL); > > > > > > can you use kstrndup() ? And then use kfree_const()? Yes, feel free to > > > > I don't follow, what am I missing? Do you mean this? > > > > size_t len = strlen(saved_command_line); > > command_line = kstrndup(saved_command_line, len, GFP_KERNEL); > > > > What would be the advantage over plain kstrdup()? > > As for kfree_const(), when would command_line be .rodata? I don't see using > > kstrndup() resulting in that. > > The const nature of using kstrdup() comes with using const for your > purpose. ie: > > const char *const_command_line = saved_command_line; > > The point of a kstrncpy() then is to ensure force a const throughout > your use if you know you don't need modifications. I'm not following this suggestion. It _is_ modifying it. That's why it's making a copy. What am I missing? > > >> + parse_args("Setting sysctl args", command_line, > > >> + NULL, 0, -1, -1, &proc_mnt, process_sysctl_arg); > > >> + > > >> + if (proc_mnt) > > >> + kern_unmount(proc_mnt); > > >> + > > >> + kfree(command_line); > > >> +} > > > > > > Then, can we get this tested as part of lib/test_sysctl.c with its > > > respective tools/testing/selftests/sysctl/sysctl.sh ? > > > > Hmm so I add some sysctl to the test "module" (in fact the 'config' file says it > > should be build with 'y', which would be needed anyway) and expand the test > > instructions so that the test kernel boot has to include it on the command line, > > and then I verify it has been set? Or do you see a better way? > > We don't necessarily have a way to test the use boot params today. > That reveals an are which we should eventually put some focus on > in the future. In the meantime we have to deal with what we have. > > So let's think about this: > > You are adding a new cmdline sysctl boot param, and also a wrapper > for those old boot bootparams to also work using both new sysctl > path and old path. Testing just these both should suffice. > > How about this: > > For testing the new feature you are adding, can you extend the default > boot params *always* if a new CONFIG_TEST_SYSCTL_CMDLINE is set? Then > upon boot we can verify the proc handlers for these new boot params got > kicked, and likewise some other proc handlers which also can be used > from the cmdline are *not* set. For this later set, we already have > a series of test syctls you can use. In fact, you can use the existing > syctls for both cases already I believe, its just a matter of adding > this new CONFIG_TEST_SYSCTL_CMDLINE which would extend the cmdline, > and these tests would take place *first* on the script. This seems... messy. I'm all for testing this, but I'd rather this not be internally driven. This is an external interface (boot params), so I'd rather an external driver handle that testing. We don't have a common method to do that with the kernel, though. > That would test both cases with one kernel. > > You could then also add a bogus new sysctl which also expands to a silly > raw boot param to test the wrapper you are providing. That would be the > only new test syctl you would need to add. Sure, that seems reasonable. Supporting externally driven testing makes sense for this.
On Thu, Apr 02, 2020 at 10:23:13AM -0700, Kees Cook wrote: > On Thu, Apr 02, 2020 at 04:04:42PM +0000, Luis Chamberlain wrote: > > On Wed, Apr 01, 2020 at 01:01:47PM +0200, Vlastimil Babka wrote: > > > On 3/31/20 12:44 AM, Luis Chamberlain wrote: > > > >> + } else if (wret != len) { > > > >> + pr_err("Wrote only %ld bytes of %d writing to proc file %s to set sysctl parameter '%s=%s'", > > > >> + wret, len, path, param, val); > > > >> + } > > > >> + > > > >> + err = filp_close(file, NULL); > > > >> + if (err) > > > >> + pr_err("Error %pe closing proc file to set sysctl parameter '%s=%s'", > > > >> + ERR_PTR(err), param, val); > > > >> +out: > > > >> + kfree(path); > > > >> + return 0; > > > >> +} > > > >> + > > > >> +void do_sysctl_args(void) > > > >> +{ > > > >> + char *command_line; > > > >> + struct vfsmount *proc_mnt = NULL; > > > >> + > > > >> + command_line = kstrdup(saved_command_line, GFP_KERNEL); > > > > > > > > can you use kstrndup() ? And then use kfree_const()? Yes, feel free to > > > > > > I don't follow, what am I missing? Do you mean this? > > > > > > size_t len = strlen(saved_command_line); > > > command_line = kstrndup(saved_command_line, len, GFP_KERNEL); > > > > > > What would be the advantage over plain kstrdup()? > > > As for kfree_const(), when would command_line be .rodata? I don't see using > > > kstrndup() resulting in that. > > > > The const nature of using kstrdup() comes with using const for your > > purpose. ie: > > > > const char *const_command_line = saved_command_line; > > > > The point of a kstrncpy() then is to ensure force a const throughout > > your use if you know you don't need modifications. > > I'm not following this suggestion. It _is_ modifying it. That's why it's > making a copy. What am I missing? We modify the copied bootparams to allow new sysctls to map to old boot params? If so, then yes, this cannot be used. > > > >> + parse_args("Setting sysctl args", command_line, > > > >> + NULL, 0, -1, -1, &proc_mnt, process_sysctl_arg); > > > >> + > > > >> + if (proc_mnt) > > > >> + kern_unmount(proc_mnt); > > > >> + > > > >> + kfree(command_line); > > > >> +} > > > > > > > > Then, can we get this tested as part of lib/test_sysctl.c with its > > > > respective tools/testing/selftests/sysctl/sysctl.sh ? > > > > > > Hmm so I add some sysctl to the test "module" (in fact the 'config' file says it > > > should be build with 'y', which would be needed anyway) and expand the test > > > instructions so that the test kernel boot has to include it on the command line, > > > and then I verify it has been set? Or do you see a better way? > > > > We don't necessarily have a way to test the use boot params today. > > That reveals an are which we should eventually put some focus on > > in the future. In the meantime we have to deal with what we have. > > > > So let's think about this: > > > > You are adding a new cmdline sysctl boot param, and also a wrapper > > for those old boot bootparams to also work using both new sysctl > > path and old path. Testing just these both should suffice. > > > > How about this: > > > > For testing the new feature you are adding, can you extend the default > > boot params *always* if a new CONFIG_TEST_SYSCTL_CMDLINE is set? Then > > upon boot we can verify the proc handlers for these new boot params got > > kicked, and likewise some other proc handlers which also can be used > > from the cmdline are *not* set. For this later set, we already have > > a series of test syctls you can use. In fact, you can use the existing > > syctls for both cases already I believe, its just a matter of adding > > this new CONFIG_TEST_SYSCTL_CMDLINE which would extend the cmdline, > > and these tests would take place *first* on the script. > > This seems... messy. It is all we have. > I'm all for testing this, OK so we do want to test it. > but I'd rather this not be internally driven. This is the least cumbersome solution I could think of. Other things would require things like using qemu, etc. That seems much more messsy. > This is an external interface (boot params), so > I'd rather an external driver handle that testing. We don't have a > common method to do that with the kernel, though. Right... which begs the question now -- how do we test this sort of stuff? The above would at least get us coverage while we iron something more generic out for boot params. > > That would test both cases with one kernel. > > > > You could then also add a bogus new sysctl which also expands to a silly > > raw boot param to test the wrapper you are providing. That would be the > > only new test syctl you would need to add. > > Sure, that seems reasonable. Supporting externally driven testing makes > sense for this. But again, what exactly? Luis
On Thu, Apr 02, 2020 at 08:59:32PM +0000, Luis Chamberlain wrote: > On Thu, Apr 02, 2020 at 10:23:13AM -0700, Kees Cook wrote: > > On Thu, Apr 02, 2020 at 04:04:42PM +0000, Luis Chamberlain wrote: > > > On Wed, Apr 01, 2020 at 01:01:47PM +0200, Vlastimil Babka wrote: > > > > On 3/31/20 12:44 AM, Luis Chamberlain wrote: > > > > >> + } else if (wret != len) { > > > > >> + pr_err("Wrote only %ld bytes of %d writing to proc file %s to set sysctl parameter '%s=%s'", > > > > >> + wret, len, path, param, val); > > > > >> + } > > > > >> + > > > > >> + err = filp_close(file, NULL); > > > > >> + if (err) > > > > >> + pr_err("Error %pe closing proc file to set sysctl parameter '%s=%s'", > > > > >> + ERR_PTR(err), param, val); > > > > >> +out: > > > > >> + kfree(path); > > > > >> + return 0; > > > > >> +} > > > > >> + > > > > >> +void do_sysctl_args(void) > > > > >> +{ > > > > >> + char *command_line; > > > > >> + struct vfsmount *proc_mnt = NULL; > > > > >> + > > > > >> + command_line = kstrdup(saved_command_line, GFP_KERNEL); > > > > > > > > > > can you use kstrndup() ? And then use kfree_const()? Yes, feel free to > > > > > > > > I don't follow, what am I missing? Do you mean this? > > > > > > > > size_t len = strlen(saved_command_line); > > > > command_line = kstrndup(saved_command_line, len, GFP_KERNEL); > > > > > > > > What would be the advantage over plain kstrdup()? > > > > As for kfree_const(), when would command_line be .rodata? I don't see using > > > > kstrndup() resulting in that. > > > > > > The const nature of using kstrdup() comes with using const for your > > > purpose. ie: > > > > > > const char *const_command_line = saved_command_line; > > > > > > The point of a kstrncpy() then is to ensure force a const throughout > > > your use if you know you don't need modifications. > > > > I'm not following this suggestion. It _is_ modifying it. That's why it's > > making a copy. What am I missing? > > We modify the copied bootparams to allow new sysctls to map to old boot params? > > If so, then yes, this cannot be used. I feel like I've lost track of this thread. This strdup is so that the command line can have '\0's injected while it steps through the args (and for doing the . and / replacement). I don't know what you mean by "map" here: this is standard parse_args() usage. > > > > >> + parse_args("Setting sysctl args", command_line, > > > > >> + NULL, 0, -1, -1, &proc_mnt, process_sysctl_arg); > > > > >> + > > > > >> + if (proc_mnt) > > > > >> + kern_unmount(proc_mnt); > > > > >> + > > > > >> + kfree(command_line); > > > > >> +} > > > > > > > > > > Then, can we get this tested as part of lib/test_sysctl.c with its > > > > > respective tools/testing/selftests/sysctl/sysctl.sh ? > > > > > > > > Hmm so I add some sysctl to the test "module" (in fact the 'config' file says it > > > > should be build with 'y', which would be needed anyway) and expand the test > > > > instructions so that the test kernel boot has to include it on the command line, > > > > and then I verify it has been set? Or do you see a better way? > > > > > > We don't necessarily have a way to test the use boot params today. > > > That reveals an are which we should eventually put some focus on > > > in the future. In the meantime we have to deal with what we have. > > > > > > So let's think about this: > > > > > > You are adding a new cmdline sysctl boot param, and also a wrapper > > > for those old boot bootparams to also work using both new sysctl > > > path and old path. Testing just these both should suffice. > > > > > > How about this: > > > > > > For testing the new feature you are adding, can you extend the default > > > boot params *always* if a new CONFIG_TEST_SYSCTL_CMDLINE is set? Then > > > upon boot we can verify the proc handlers for these new boot params got > > > kicked, and likewise some other proc handlers which also can be used > > > from the cmdline are *not* set. For this later set, we already have > > > a series of test syctls you can use. In fact, you can use the existing > > > syctls for both cases already I believe, its just a matter of adding > > > this new CONFIG_TEST_SYSCTL_CMDLINE which would extend the cmdline, > > > and these tests would take place *first* on the script. > > > > This seems... messy. > > It is all we have. > > I'm all for testing this, > > OK so we do want to test it. > > > but I'd rather this not be internally driven. > > This is the least cumbersome solution I could think of. Other things > would require things like using qemu, etc. That seems much more messsy. Yes. Doing an internal extension isn't testing the actual code. > > > This is an external interface (boot params), so > > I'd rather an external driver handle that testing. We don't have a > > common method to do that with the kernel, though. > > Right... which begs the question now -- how do we test this sort of > stuff? The above would at least get us coverage while we iron something > more generic out for boot params. > > > > That would test both cases with one kernel. > > > > > > You could then also add a bogus new sysctl which also expands to a silly > > > raw boot param to test the wrapper you are providing. That would be the > > > only new test syctl you would need to add. > > > > Sure, that seems reasonable. Supporting externally driven testing makes > > sense for this. > > But again, what exactly? I don't think anything is needed for this series. It can be boot tested manually.
On Fri, Apr 03, 2020 at 04:57:51PM -0700, Kees Cook wrote: > On Thu, Apr 02, 2020 at 08:59:32PM +0000, Luis Chamberlain wrote: > > We modify the copied bootparams to allow new sysctls to map to old boot params? > > This strdup is so that the > command line can have '\0's injected while it steps through the args > (and for doing the . and / replacement). Please ignore the const feedback then. > > This is the least cumbersome solution I could think of. Other things > > would require things like using qemu, etc. That seems much more messsy. > > Yes. Doing an internal extension isn't testing the actual code. But it would. > > > This is an external interface (boot params), so > > > I'd rather an external driver handle that testing. We don't have a > > > common method to do that with the kernel, though. > > > > Right... which begs the question now -- how do we test this sort of > > stuff? The above would at least get us coverage while we iron something > > more generic out for boot params. > > > > > > That would test both cases with one kernel. > > > > > > > > You could then also add a bogus new sysctl which also expands to a silly > > > > raw boot param to test the wrapper you are providing. That would be the > > > > only new test syctl you would need to add. > > > > > > Sure, that seems reasonable. Supporting externally driven testing makes > > > sense for this. > > > > But again, what exactly? > > I don't think anything is needed for this series. It can be boot tested > manually. Why test it manually when it could be tested automatically with a new kconfig? Luis
On Mon, Apr 06, 2020 at 02:08:36PM +0000, Luis Chamberlain wrote: > > Yes. Doing an internal extension isn't testing the actual code. > > But it would. > > [...] > > I don't think anything is needed for this series. It can be boot tested > > manually. > > Why test it manually when it could be tested automatically with a new kconfig? So, my impression is that adding code to the internals to test the internals isn't a valid test (or at least makes it fragile) because the test would depend on the changes to the internals (or at least depend on non-default non-production CONFIGs). Can you send a patch for what you think this should look like? Perhaps I'm not correctly imagining what you're describing? Regardless of testing, I think this series is ready for -mm.
On Mon, Apr 06, 2020 at 08:58:50AM -0700, Kees Cook wrote: > On Mon, Apr 06, 2020 at 02:08:36PM +0000, Luis Chamberlain wrote: > > > Yes. Doing an internal extension isn't testing the actual code. > > > > But it would. > > > > [...] > > > I don't think anything is needed for this series. It can be boot tested > > > manually. > > > > Why test it manually when it could be tested automatically with a new kconfig? > > So, my impression is that adding code to the internals to test the > internals isn't a valid test (or at least makes it fragile) because the > test would depend on the changes to the internals (or at least depend on > non-default non-production CONFIGs). The *internal* aspect here is an extension to boot params under a kconfig which would simply append to it, as if the user would have added some more params. Since we already have test sysctl params the only one we'd need to add on the test driver would be a dummy one which tests the alias, on the second patch. We should have enough sysctls to already test dummy values. Nothing else would be needed as the sysctl test driver would just need to test that the values expected when this is enabled is set. > Can you send a patch for what you think this should look like? Perhaps > I'm not correctly imagining what you're describing? I rather get the person involved in the changes to do the testing so as they're the ones designing the feature. If however it is not clear what I mean I'm happy to elaborate. Vlastimil do you get what I mean? > Regardless of testing, I think this series is ready for -mm. I'm happy for it to go in provided we at least devise a follow up plan for testing. Otherwise -- like other things, it won't get done. Luis
On 4/6/20 7:08 PM, Luis Chamberlain wrote: > On Mon, Apr 06, 2020 at 08:58:50AM -0700, Kees Cook wrote: >> On Mon, Apr 06, 2020 at 02:08:36PM +0000, Luis Chamberlain wrote: >> > > Yes. Doing an internal extension isn't testing the actual code. >> > >> > But it would. >> > >> > [...] >> > > I don't think anything is needed for this series. It can be boot tested >> > > manually. >> > >> > Why test it manually when it could be tested automatically with a new kconfig? >> >> So, my impression is that adding code to the internals to test the >> internals isn't a valid test (or at least makes it fragile) because the >> test would depend on the changes to the internals (or at least depend on >> non-default non-production CONFIGs). > > The *internal* aspect here is an extension to boot params under a > kconfig which would simply append to it, as if the user would have So there's no such kconfig yet to apply boot parameters specified by configure, right? That would itself be a new feature. Or could we use bootconfig? (CC Masami) > added some more params. Since we already have test sysctl params the > only one we'd need to add on the test driver would be a dummy one which > tests the alias, on the second patch. We should have enough sysctls to > already test dummy values. Right. > Nothing else would be needed as the sysctl test driver would just need > to test that the values expected when this is enabled is set. > >> Can you send a patch for what you think this should look like? Perhaps >> I'm not correctly imagining what you're describing? > > I rather get the person involved in the changes to do the testing so > as they're the ones designing the feature. If however it is not clear > what I mean I'm happy to elaborate. > > Vlastimil do you get what I mean? Hopefully :) >> Regardless of testing, I think this series is ready for -mm. > > I'm happy for it to go in provided we at least devise a follow up plan > for testing. Otherwise -- like other things, it won't get done. OK I'll send a v2 and we can discuss the test driver details. I don't want to neglect testing, but also not block the new functionality, in case it means testing means adding another new functionality. Thanks, Vlastimil > Luis >
On Tue, 14 Apr 2020 13:25:07 +0200 Vlastimil Babka <vbabka@suse.cz> wrote: > On 4/6/20 7:08 PM, Luis Chamberlain wrote: > > On Mon, Apr 06, 2020 at 08:58:50AM -0700, Kees Cook wrote: > >> On Mon, Apr 06, 2020 at 02:08:36PM +0000, Luis Chamberlain wrote: > >> > > Yes. Doing an internal extension isn't testing the actual code. > >> > > >> > But it would. > >> > > >> > [...] > >> > > I don't think anything is needed for this series. It can be boot tested > >> > > manually. > >> > > >> > Why test it manually when it could be tested automatically with a new kconfig? > >> > >> So, my impression is that adding code to the internals to test the > >> internals isn't a valid test (or at least makes it fragile) because the > >> test would depend on the changes to the internals (or at least depend on > >> non-default non-production CONFIGs). > > > > The *internal* aspect here is an extension to boot params under a > > kconfig which would simply append to it, as if the user would have > > So there's no such kconfig yet to apply boot parameters specified by configure, > right? That would itself be a new feature. Or could we use bootconfig? (CC Masami) Yes, I think you can use bootconfig to add this feature more flexibly. I think your patch is easily modified to use bootconfig. :) Thank you,
On Tue, Apr 14, 2020 at 01:25:07PM +0200, Vlastimil Babka wrote: > On 4/6/20 7:08 PM, Luis Chamberlain wrote: > > On Mon, Apr 06, 2020 at 08:58:50AM -0700, Kees Cook wrote: > >> On Mon, Apr 06, 2020 at 02:08:36PM +0000, Luis Chamberlain wrote: > >> > > Yes. Doing an internal extension isn't testing the actual code. > >> > > >> > But it would. > >> > > >> > [...] > >> > > I don't think anything is needed for this series. It can be boot tested > >> > > manually. > >> > > >> > Why test it manually when it could be tested automatically with a new kconfig? > >> > >> So, my impression is that adding code to the internals to test the > >> internals isn't a valid test (or at least makes it fragile) because the > >> test would depend on the changes to the internals (or at least depend on > >> non-default non-production CONFIGs). > > > > The *internal* aspect here is an extension to boot params under a > > kconfig which would simply append to it, as if the user would have > > So there's no such kconfig yet to apply boot parameters specified by configure, > right? That would itself be a new feature. I cannot say, there is no easy grammatical expression for this. For kernel testing, no, I am not aware of one. > Or could we use bootconfig? (CC Masami) Neat, yeah, that seems like a set of helpers of which could help for sure. > >> Regardless of testing, I think this series is ready for -mm. > > > > I'm happy for it to go in provided we at least devise a follow up plan > > for testing. Otherwise -- like other things, it won't get done. > > OK I'll send a v2 and we can discuss the test driver details. I don't want to > neglect testing, but also not block the new functionality, So long as the testing part gets done, I'm happy :) > in case it means > testing means adding another new functionality. Yeah, I can see how some new code may need to be added for that, its a good point. But think about this for a second, if we *didn't* have such code added, how could it have easily been tested before? The question is rhetorical, as I'm alluding to that a lot of old code wasn't designed for easy unit testing either, and I'd invite one to consider the value of the change in philosophy on first code design when one *does* take that into consideration. There are certain best practices that I'd wager are probably good for us to evaluate for the long term of kernel evolution, and I think that always advocating designing testing around new functionality would be one. Once and if you do add your testing, and if such testing is expanded to test parsing the cmdline further, and to purposely break it, who knows what other bugs bugs we'll find. But maybe Masami already uncovered all the fun bugs. Luis
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index c07815d230bc..81ff626fc700 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -4793,6 +4793,15 @@ switches= [HW,M68k] + sysctl.*= [KNL] + Set a sysctl parameter, right before loading the init + process, as if the value was written to the respective + /proc/sys/... file. Both '.' and '/' are recognized as + separators. Unrecognized parameters and invalid values + are reported in the kernel log. Sysctls registered + later by a loaded module cannot be set this way. + Example: sysctl.vm.swappiness=40 + sysfs.deprecated=0|1 [KNL] Enable/disable old style sysfs layout for old udev on older distributions. When this option is enabled diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c index c75bb4632ed1..653188c9c4c9 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -14,6 +14,7 @@ #include <linux/mm.h> #include <linux/module.h> #include <linux/bpf-cgroup.h> +#include <linux/mount.h> #include "internal.h" static const struct dentry_operations proc_sys_dentry_operations; @@ -1725,3 +1726,102 @@ int __init proc_sys_init(void) return sysctl_init(); } + +/* Set sysctl value passed on kernel command line. */ +static int process_sysctl_arg(char *param, char *val, + const char *unused, void *arg) +{ + char *path; + struct vfsmount *proc_mnt = *((struct vfsmount **)arg); + struct file_system_type *proc_fs_type; + struct file *file; + int len; + int err; + loff_t pos = 0; + ssize_t wret; + + if (strncmp(param, "sysctl", sizeof("sysctl") - 1)) + return 0; + + param += sizeof("sysctl") - 1; + + if (param[0] != '/' && param[0] != '.') + return 0; + + param++; + + if (!proc_mnt) { + proc_fs_type = get_fs_type("proc"); + if (!proc_fs_type) { + pr_err("Failed to find procfs to set sysctl from command line"); + return 0; + } + proc_mnt = kern_mount(proc_fs_type); + put_filesystem(proc_fs_type); + if (IS_ERR(proc_mnt)) { + pr_err("Failed to mount procfs to set sysctl from command line"); + return 0; + } + *((struct vfsmount **)arg) = proc_mnt; + } + + path = kasprintf(GFP_KERNEL, "sys/%s", param); + if (!path) + panic("%s: Failed to allocate path for %s\n", __func__, param); + strreplace(path, '.', '/'); + + file = file_open_root(proc_mnt->mnt_root, proc_mnt, path, O_WRONLY, 0); + if (IS_ERR(file)) { + err = PTR_ERR(file); + if (err == -ENOENT) + pr_err("Failed to set sysctl parameter '%s=%s': parameter not found", + param, val); + else if (err == -EACCES) + pr_err("Failed to set sysctl parameter '%s=%s': permission denied (read-only?)", + param, val); + else + pr_err("Error %pe opening proc file to set sysctl parameter '%s=%s'", + file, param, val); + goto out; + } + len = strlen(val); + wret = kernel_write(file, val, len, &pos); + if (wret < 0) { + err = wret; + if (err == -EINVAL) + pr_err("Failed to set sysctl parameter '%s=%s': invalid value", + param, val); + else + pr_err("Error %pe writing to proc file to set sysctl parameter '%s=%s'", + ERR_PTR(err), param, val); + } else if (wret != len) { + pr_err("Wrote only %ld bytes of %d writing to proc file %s to set sysctl parameter '%s=%s'", + wret, len, path, param, val); + } + + err = filp_close(file, NULL); + if (err) + pr_err("Error %pe closing proc file to set sysctl parameter '%s=%s'", + ERR_PTR(err), param, val); +out: + kfree(path); + return 0; +} + +void do_sysctl_args(void) +{ + char *command_line; + struct vfsmount *proc_mnt = NULL; + + command_line = kstrdup(saved_command_line, GFP_KERNEL); + if (!command_line) + panic("%s: Failed to allocate copy of command line\n", __func__); + + parse_args("Setting sysctl args", command_line, + NULL, 0, -1, -1, &proc_mnt, process_sysctl_arg); + + if (proc_mnt) + kern_unmount(proc_mnt); + + kfree(command_line); +} diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h index 02fa84493f23..5f3f2a00d75f 100644 --- a/include/linux/sysctl.h +++ b/include/linux/sysctl.h @@ -206,6 +206,7 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path, void unregister_sysctl_table(struct ctl_table_header * table); extern int sysctl_init(void); +void do_sysctl_args(void); extern struct ctl_table sysctl_mount_point[]; @@ -236,6 +237,9 @@ static inline void setup_sysctl_set(struct ctl_table_set *p, { } +void do_sysctl_args(void) +{ +} #endif /* CONFIG_SYSCTL */ int sysctl_max_threads(struct ctl_table *table, int write, diff --git a/init/main.c b/init/main.c index ee4947af823f..a91ea166a731 100644 --- a/init/main.c +++ b/init/main.c @@ -1367,6 +1367,8 @@ static int __ref kernel_init(void *unused) rcu_end_inkernel_boot(); + do_sysctl_args(); + if (ramdisk_execute_command) { ret = run_init_process(ramdisk_execute_command); if (!ret)
A recently proposed patch to add vm_swappiness command line parameter in addition to existing sysctl [1] made me wonder why we don't have a general support for passing sysctl parameters via command line. Googling found only somebody else wondering the same [2], but I haven't found any prior discussion with reasons why not to do this. Settings the vm_swappiness issue aside (the underlying issue might be solved in a different way), quick search of kernel-parameters.txt shows there are already some that exist as both sysctl and kernel parameter - hung_task_panic, nmi_watchdog, numa_zonelist_order, traceoff_on_warning. A general mechanism would remove the need to add more of those one-offs and might be handy in situations where configuration by e.g. /etc/sysctl.d/ is impractical. Hence, this patch adds a new parse_args() pass that looks for parameters prefixed by 'sysctl.' and tries to interpret them as writes to the corresponding sys/ files using an temporary in-kernel procfs mount. This mechanism was suggested by Eric W. Biederman [3], as it handles all dynamically registered sysctl tables. Errors due to e.g. invalid parameter name or value are reported in the kernel log. The processing is hooked right before the init process is loaded, as some handlers might be more complicated than simple setters and might need some subsystems to be initialized. At the moment the init process can be started and eventually execute a process writing to /proc/sys/ then it should be also fine to do that from the kernel. Sysctls registered later on module load time are not set by this mechanism - it's expected that in such scenarios, setting sysctl values from userspace is practical enough. [1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/ [2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter [3] https://lore.kernel.org/r/87bloj2skm.fsf@x220.int.ebiederm.org/ Signed-off-by: Vlastimil Babka <vbabka@suse.cz> --- .../admin-guide/kernel-parameters.txt | 9 ++ fs/proc/proc_sysctl.c | 100 ++++++++++++++++++ include/linux/sysctl.h | 4 + init/main.c | 2 + 4 files changed, 115 insertions(+)