diff mbox

[v3,00/10] sysctl: Remove sentinel elements from kernel dir

Message ID 20240328-jag-sysctl_remove_empty_elem_kernel-v3-0-285d273912fe@samsung.com (mailing list archive)
State New
Headers show

Commit Message

Joel Granados via B4 Relay March 28, 2024, 3:44 p.m. UTC
From: Joel Granados <j.granados@samsung.com>

What?
These commits remove the sentinel element (last empty element) from the
sysctl arrays of all the files under the "kernel/" directory that use a
sysctl array for registration. The merging of the preparation patches
[1] to mainline allows us to remove sentinel elements without changing
behavior. This is safe because the sysctl registration code
(register_sysctl() and friends) use the array size in addition to
checking for a sentinel [2].

Why?
By removing the sysctl sentinel elements we avoid kernel bloat as
ctl_table arrays get moved out of kernel/sysctl.c into their own
respective subsystems. This move was started long ago to avoid merge
conflicts; the sentinel removal bit came after Mathew Wilcox suggested
it to avoid bloating the kernel by one element as arrays moved out. This
patchset will reduce the overall build time size of the kernel and run
time memory bloat by about ~64 bytes per declared ctl_table array (more
info here [5]).

When are we done?
There are 4 patchests (25 commits [3]) that are still outstanding to
completely remove the sentinels: files under "net/", files under
"kernel/" (this patchset) dir, misc dirs (files under mm/ security/ and
others) and the final set that removes the unneeded check for ->procname
== NULL.

Testing:
* Ran sysctl selftests (./tools/testing/selftests/sysctl/sysctl.sh)
* Ran this through 0-day with no errors or warnings

Savings in vmlinux:
  A total of 64 bytes per sentinel is saved after removal; I measured in
  x86_64 to give an idea of the aggregated savings. The actual savings
  will depend on individual kernel configuration.
    * bloat-o-meter
        - The "yesall" config saves 1984 bytes [6]
        - A reduced config [4] saves 1027 bytes [7]

Savings in allocated memory:
  None in this set but will occur when the superfluous allocations are
  removed from proc_sysctl.c. I include it here for context. The
  estimated savings during boot for config [3] are 6272 bytes. See [8]
  for how to measure it.

Comments/feedback greatly appreciated

Changes in v3:
- Rebased to v6.9-rc1
- wrote a shorter cover letter
- Removed willy@infradead.org from cc
- Link to v2: https://lore.kernel.org/r/20240104-jag-sysctl_remove_empty_elem_kernel-v2-0-836cc04e00ec@samsung.com

Changes in v2:
- No functional changes; I resent it as I did not see it in the latest
  sysctl-next. It might be a bit too late to include it in 6.7 version,
  but this v2 can be used for 6.8 when it comes out.
- Rebased on top of v6.7-rc6
- Added trailers to the relevant commits.
- Link to v1: https://lore.kernel.org/r/20231107-jag-sysctl_remove_empty_elem_kernel-v1-0-e4ce1388dfa0@samsung.com
Best

Joel

[1] https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/
[2] https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/
[3] https://git.kernel.org/pub/scm/linux/kernel/git/joel.granados/linux.git/tag/?h=sysctl_remove_empty_elem_v5
[4] https://gist.github.com/Joelgranados/feaca7af5537156ca9b73aeaec093171

[5]
Links Related to the ctl_table sentinel removal:
* Good summaries from Luis:
  https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/
  https://lore.kernel.org/all/ZMFizKFkVxUFtSqa@bombadil.infradead.org/
* Patches adjusting sysctl register calls:
  https://lore.kernel.org/all/20230302204612.782387-1-mcgrof@kernel.org/
  https://lore.kernel.org/all/20230302202826.776286-1-mcgrof@kernel.org/
* Discussions about expectations and approach
  https://lore.kernel.org/all/20230321130908.6972-1-frank.li@vivo.com
  https://lore.kernel.org/all/20220220060626.15885-1-tangmeng@uniontech.com

[6]
add/remove: 0/0 grow/shrink: 0/31 up/down: 0/-1984 (-1984)
Function                                     old     new   delta
watchdog_sysctls                             576     512     -64
watchdog_hardlockup_sysctl                   128      64     -64
vm_table                                    1344    1280     -64
uts_kern_table                               448     384     -64
usermodehelper_table                         192     128     -64
user_table                                   832     768     -64
user_event_sysctls                           128      64     -64
timer_sysctl                                 128      64     -64
signal_debug_table                           128      64     -64
seccomp_sysctl_table                         192     128     -64
sched_rt_sysctls                             256     192     -64
sched_fair_sysctls                           256     192     -64
sched_energy_aware_sysctls                   128      64     -64
sched_dl_sysctls                             192     128     -64
sched_core_sysctls                           384     320     -64
sched_autogroup_sysctls                      128      64     -64
printk_sysctls                               512     448     -64
pid_ns_ctl_table_vm                          128      64     -64
pid_ns_ctl_table                             128      64     -64
latencytop_sysctl                            128      64     -64
kprobe_sysctls                               128      64     -64
kexec_core_sysctls                           256     192     -64
kern_table                                  2560    2496     -64
kern_reboot_table                            192     128     -64
kern_panic_table                             192     128     -64
kern_exit_table                              128      64     -64
kern_delayacct_table                         128      64     -64
kern_acct_table                              128      64     -64
hung_task_sysctls                            448     384     -64
ftrace_sysctls                               128      64     -64
bpf_syscall_table                            192     128     -64
Total: Before=429912331, After=429910347, chg -0.00%

[7]
add/remove: 0/1 grow/shrink: 0/16 up/down: 0/-1027 (-1027)
Function                                     old     new   delta
sched_core_sysctl_init                        39      36      -3
vm_table                                    1024     960     -64
uts_kern_table                               448     384     -64
usermodehelper_table                         192     128     -64
user_table                                   704     640     -64
signal_debug_table                           128      64     -64
seccomp_sysctl_table                         192     128     -64
sched_rt_sysctls                             256     192     -64
sched_fair_sysctls                           128      64     -64
sched_dl_sysctls                             192     128     -64
sched_core_sysctls                            64       -     -64
printk_sysctls                               512     448     -64
pid_ns_ctl_table_vm                          128      64     -64
kern_table                                  1920    1856     -64
kern_reboot_table                            192     128     -64
kern_panic_table                             128      64     -64
kern_exit_table                              128      64     -64
Total: Before=8522228, After=8521201, chg -0.01%

[8]
To measure the in memory savings apply this on top of this patchset.

"
"
"
and then run the following bash script in the kernel:

accum=0
for n in $(dmesg | grep kzalloc | awk '{print $3}') ; do
    accum=$(calc "$accum + $n")
done
echo $accum

---
Signed-off-by: Joel Granados <j.granados@samsung.com>

---
Joel Granados (10):
      kernel misc:  Remove the now superfluous sentinel elements from ctl_table array
      umh:  Remove the now superfluous sentinel elements from ctl_table array
      ftrace: Remove the now superfluous sentinel elements from ctl_table array
      timekeeping:  Remove the now superfluous sentinel elements from ctl_table array
      seccomp: Remove the now superfluous sentinel elements from ctl_table array
      scheduler: Remove the now superfluous sentinel elements from ctl_table array
      printk: Remove the now superfluous sentinel elements from ctl_table array
      kprobes: Remove the now superfluous sentinel elements from ctl_table array
      delayacct:  Remove the now superfluous sentinel elements from ctl_table array
      bpf: Remove the now superfluous sentinel elements from ctl_table array

 kernel/acct.c                    | 1 -
 kernel/bpf/syscall.c             | 1 -
 kernel/delayacct.c               | 1 -
 kernel/exit.c                    | 1 -
 kernel/hung_task.c               | 1 -
 kernel/kexec_core.c              | 1 -
 kernel/kprobes.c                 | 1 -
 kernel/latencytop.c              | 1 -
 kernel/panic.c                   | 1 -
 kernel/pid_namespace.c           | 1 -
 kernel/pid_sysctl.h              | 1 -
 kernel/printk/sysctl.c           | 1 -
 kernel/reboot.c                  | 1 -
 kernel/sched/autogroup.c         | 1 -
 kernel/sched/core.c              | 1 -
 kernel/sched/deadline.c          | 1 -
 kernel/sched/fair.c              | 1 -
 kernel/sched/rt.c                | 1 -
 kernel/sched/topology.c          | 1 -
 kernel/seccomp.c                 | 1 -
 kernel/signal.c                  | 1 -
 kernel/stackleak.c               | 1 -
 kernel/sysctl.c                  | 2 --
 kernel/time/timer.c              | 1 -
 kernel/trace/ftrace.c            | 1 -
 kernel/trace/trace_events_user.c | 1 -
 kernel/ucount.c                  | 3 +--
 kernel/umh.c                     | 1 -
 kernel/utsname_sysctl.c          | 1 -
 kernel/watchdog.c                | 2 --
 30 files changed, 1 insertion(+), 33 deletions(-)
---
base-commit: 4cece764965020c22cff7665b18a012006359095
change-id: 20231107-jag-sysctl_remove_empty_elem_kernel-7de90cfd0c0a

Best regards,

Comments

Joel Granados April 15, 2024, 1:12 p.m. UTC | #1
Just a heads up: will add this to the sysctl-next tree
https://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl.git/log/?h=sysctl-next
so it has time to soak there for next merge window. If you are a
maintainer and are thinking of including any of these patches in your
tree, let me know and I'll remove when upstreaming

On Thu, Mar 28, 2024 at 04:44:01PM +0100, Joel Granados via B4 Relay wrote:
> From: Joel Granados <j.granados@samsung.com>
> 
> What?
> These commits remove the sentinel element (last empty element) from the
> sysctl arrays of all the files under the "kernel/" directory that use a
> sysctl array for registration. The merging of the preparation patches
> [1] to mainline allows us to remove sentinel elements without changing
> behavior. This is safe because the sysctl registration code
> (register_sysctl() and friends) use the array size in addition to
> checking for a sentinel [2].
...

Best
Konrad Dybcio April 22, 2024, 2:27 p.m. UTC | #2
On 3/28/24 16:44, Joel Granados wrote:
> What?
> These commits remove the sentinel element (last empty element) from the
> sysctl arrays of all the files under the "kernel/" directory that use a
> sysctl array for registration. The merging of the preparation patches
> [1] to mainline allows us to remove sentinel elements without changing
> behavior. This is safe because the sysctl registration code
> (register_sysctl() and friends) use the array size in addition to
> checking for a sentinel [2].

Hi,

looks like *this* "patch" made it to the sysctl tree [1], breaking b4
for everyone else (as there's a "--- b4-submit-tracking ---" magic in
the tree history now) on next-20240422

Please drop it (again, I'm only talking about this empty cover letter).

Konrad

[1] https://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl.git/commit/?h=sysctl-next&id=ec04a7fa09ddedc1d6c8b86ae281897256c7fdf0
Krzysztof Kozlowski April 22, 2024, 2:49 p.m. UTC | #3
On 22/04/2024 16:27, Konrad Dybcio wrote:
> 
> 
> On 3/28/24 16:44, Joel Granados wrote:
>> What?
>> These commits remove the sentinel element (last empty element) from the
>> sysctl arrays of all the files under the "kernel/" directory that use a
>> sysctl array for registration. The merging of the preparation patches
>> [1] to mainline allows us to remove sentinel elements without changing
>> behavior. This is safe because the sysctl registration code
>> (register_sysctl() and friends) use the array size in addition to
>> checking for a sentinel [2].
> 
> Hi,
> 
> looks like *this* "patch" made it to the sysctl tree [1], breaking b4
> for everyone else (as there's a "--- b4-submit-tracking ---" magic in
> the tree history now) on next-20240422
> 
> Please drop it (again, I'm only talking about this empty cover letter).

Just to clarify, in case it is not obvious:
Please *do not merge your own trees* into kernel.org repos. Instead use
b4 shazam to pick up entire patchset, even if it is yours. b4 allows to
merge/apply also the cover letter, if this is your intention.

With b4 shazam you would get proper Link tags and not break everyone's
b4 workflow on next. :/

Best regards,
Krzysztof
Konstantin Ryabitsev April 22, 2024, 2:57 p.m. UTC | #4
On Mon, Apr 22, 2024 at 04:49:27PM +0200, Krzysztof Kozlowski wrote:
> >> These commits remove the sentinel element (last empty element) from 
> >> the
> >> sysctl arrays of all the files under the "kernel/" directory that use a
> >> sysctl array for registration. The merging of the preparation patches
> >> [1] to mainline allows us to remove sentinel elements without changing
> >> behavior. This is safe because the sysctl registration code
> >> (register_sysctl() and friends) use the array size in addition to
> >> checking for a sentinel [2].
> > 
> > Hi,
> > 
> > looks like *this* "patch" made it to the sysctl tree [1], breaking b4
> > for everyone else (as there's a "--- b4-submit-tracking ---" magic in
> > the tree history now) on next-20240422
> > 
> > Please drop it (again, I'm only talking about this empty cover letter).
> 
> Just to clarify, in case it is not obvious:
> Please *do not merge your own trees* into kernel.org repos. Instead use
> b4 shazam to pick up entire patchset, even if it is yours. b4 allows to
> merge/apply also the cover letter, if this is your intention.
> 
> With b4 shazam you would get proper Link tags and not break everyone's
> b4 workflow on next. :/

I was expecting this to happen at some point. :/

Note, that you can still use b4 and merge your own trees, but you need 
to switch to using a different cover letter strategy:

  [b4]
  prep-cover-strategy = branch-description

-K
Krzysztof Kozlowski April 22, 2024, 3:07 p.m. UTC | #5
On 22/04/2024 16:57, Konstantin Ryabitsev wrote:
> On Mon, Apr 22, 2024 at 04:49:27PM +0200, Krzysztof Kozlowski wrote:
>>>> These commits remove the sentinel element (last empty element) from 
>>>> the
>>>> sysctl arrays of all the files under the "kernel/" directory that use a
>>>> sysctl array for registration. The merging of the preparation patches
>>>> [1] to mainline allows us to remove sentinel elements without changing
>>>> behavior. This is safe because the sysctl registration code
>>>> (register_sysctl() and friends) use the array size in addition to
>>>> checking for a sentinel [2].
>>>
>>> Hi,
>>>
>>> looks like *this* "patch" made it to the sysctl tree [1], breaking b4
>>> for everyone else (as there's a "--- b4-submit-tracking ---" magic in
>>> the tree history now) on next-20240422
>>>
>>> Please drop it (again, I'm only talking about this empty cover letter).
>>
>> Just to clarify, in case it is not obvious:
>> Please *do not merge your own trees* into kernel.org repos. Instead use
>> b4 shazam to pick up entire patchset, even if it is yours. b4 allows to
>> merge/apply also the cover letter, if this is your intention.
>>
>> With b4 shazam you would get proper Link tags and not break everyone's
>> b4 workflow on next. :/
> 
> I was expecting this to happen at some point. :/
> 
> Note, that you can still use b4 and merge your own trees, but you need 
> to switch to using a different cover letter strategy:
> 
>   [b4]
>   prep-cover-strategy = branch-description

Yes, but you still won't have:
1. Link tags
2. Nice thank-you letters
3. Auto-collecting review/tested/ack tags

So sure, maintainer can even cherry-pick patches, use patch or manually
edit git objects and then update git refs, but that's not the point. :)

Just use b4 shazam, it's so awesome tool.

Best regards,
Krzysztof
Joel Granados April 24, 2024, 7:41 a.m. UTC | #6
On Mon, Apr 22, 2024 at 04:27:47PM +0200, Konrad Dybcio wrote:
> 
> 
> On 3/28/24 16:44, Joel Granados wrote:
> > What?
> > These commits remove the sentinel element (last empty element) from the
> > sysctl arrays of all the files under the "kernel/" directory that use a
> > sysctl array for registration. The merging of the preparation patches
> > [1] to mainline allows us to remove sentinel elements without changing
> > behavior. This is safe because the sysctl registration code
> > (register_sysctl() and friends) use the array size in addition to
> > checking for a sentinel [2].
> 
> Hi,
> 
> looks like *this* "patch" made it to the sysctl tree [1], breaking b4
> for everyone else (as there's a "--- b4-submit-tracking ---" magic in
> the tree history now) on next-20240422
> 
> Please drop it (again, I'm only talking about this empty cover letter).
I see it. Will remove it from sysctl-next

Thx for pointing it out.

Best

> 
> Konrad
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl.git/commit/?h=sysctl-next&id=ec04a7fa09ddedc1d6c8b86ae281897256c7fdf0
Joel Granados April 24, 2024, 7:46 a.m. UTC | #7
On Mon, Apr 22, 2024 at 04:49:27PM +0200, Krzysztof Kozlowski wrote:
> On 22/04/2024 16:27, Konrad Dybcio wrote:
> > 
> > 
> > On 3/28/24 16:44, Joel Granados wrote:
> >> What?
> >> These commits remove the sentinel element (last empty element) from the
> >> sysctl arrays of all the files under the "kernel/" directory that use a
> >> sysctl array for registration. The merging of the preparation patches
> >> [1] to mainline allows us to remove sentinel elements without changing
> >> behavior. This is safe because the sysctl registration code
> >> (register_sysctl() and friends) use the array size in addition to
> >> checking for a sentinel [2].
> > 
> > Hi,
> > 
> > looks like *this* "patch" made it to the sysctl tree [1], breaking b4
> > for everyone else (as there's a "--- b4-submit-tracking ---" magic in
> > the tree history now) on next-20240422
> > 
> > Please drop it (again, I'm only talking about this empty cover letter).
> 
> Just to clarify, in case it is not obvious:
> Please *do not merge your own trees* into kernel.org repos. Instead use
> b4 shazam to pick up entire patchset, even if it is yours. b4 allows to
> merge/apply also the cover letter, if this is your intention.
Noted. Will adjust my workflow to just use B4 to bring stuff into the
sysctl-next tree.

> 
> With b4 shazam you would get proper Link tags and not break everyone's
> b4 workflow on next. :/
Ok. Sorry for the noise.

> 
> Best regards,
> Krzysztof
>
Joel Granados April 24, 2024, 7:52 a.m. UTC | #8
On Mon, Apr 22, 2024 at 04:27:47PM +0200, Konrad Dybcio wrote:
> 
> 
> On 3/28/24 16:44, Joel Granados wrote:
> > What?
> > These commits remove the sentinel element (last empty element) from the
> > sysctl arrays of all the files under the "kernel/" directory that use a
> > sysctl array for registration. The merging of the preparation patches
> > [1] to mainline allows us to remove sentinel elements without changing
> > behavior. This is safe because the sysctl registration code
> > (register_sysctl() and friends) use the array size in addition to
> > checking for a sentinel [2].
> 
> Hi,
> 
> looks like *this* "patch" made it to the sysctl tree [1], breaking b4
> for everyone else (as there's a "--- b4-submit-tracking ---" magic in
> the tree history now) on next-20240422
> 
> Please drop it (again, I'm only talking about this empty cover letter).
Here do you mean revert? or do you mean force-push without the cover
letter commit?

I did the later, but if the former is necessary I can always go back to
the old HEAD, add a revert commit and then push that.

Best

> 
> Konrad
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl.git/commit/?h=sysctl-next&id=ec04a7fa09ddedc1d6c8b86ae281897256c7fdf0
Joel Granados April 24, 2024, 7:55 a.m. UTC | #9
On Mon, Apr 22, 2024 at 05:07:59PM +0200, Krzysztof Kozlowski wrote:
> On 22/04/2024 16:57, Konstantin Ryabitsev wrote:
> > On Mon, Apr 22, 2024 at 04:49:27PM +0200, Krzysztof Kozlowski wrote:
> >>>> These commits remove the sentinel element (last empty element) from 
> >>>> the
> >>>> sysctl arrays of all the files under the "kernel/" directory that use a
> >>>> sysctl array for registration. The merging of the preparation patches
> >>>> [1] to mainline allows us to remove sentinel elements without changing
> >>>> behavior. This is safe because the sysctl registration code
> >>>> (register_sysctl() and friends) use the array size in addition to
> >>>> checking for a sentinel [2].
> >>>
> >>> Hi,
> >>>
> >>> looks like *this* "patch" made it to the sysctl tree [1], breaking b4
> >>> for everyone else (as there's a "--- b4-submit-tracking ---" magic in
> >>> the tree history now) on next-20240422
> >>>
> >>> Please drop it (again, I'm only talking about this empty cover letter).
> >>
> >> Just to clarify, in case it is not obvious:
> >> Please *do not merge your own trees* into kernel.org repos. Instead use
> >> b4 shazam to pick up entire patchset, even if it is yours. b4 allows to
> >> merge/apply also the cover letter, if this is your intention.
> >>
> >> With b4 shazam you would get proper Link tags and not break everyone's
> >> b4 workflow on next. :/
> > 
> > I was expecting this to happen at some point. :/
> > 
> > Note, that you can still use b4 and merge your own trees, but you need 
> > to switch to using a different cover letter strategy:
> > 
> >   [b4]
> >   prep-cover-strategy = branch-description
> 
> Yes, but you still won't have:
> 1. Link tags
> 2. Nice thank-you letters
> 3. Auto-collecting review/tested/ack tags
> 
> So sure, maintainer can even cherry-pick patches, use patch or manually
> edit git objects and then update git refs, but that's not the point. :)
> 
> Just use b4 shazam, it's so awesome tool.
I'll try this out going forward (instead of chaning the cover letter
strategy)

Thx again.

Best
diff mbox

Patch

diff --git i/fs/proc/proc_sysctl.c w/fs/proc/proc_sysctl.c
index 37cde0efee57..896c498600e8 100644
--- i/fs/proc/proc_sysctl.c
+++ w/fs/proc/proc_sysctl.c
@@ -966,6 +966,7 @@  static struct ctl_dir *new_dir(struct ctl_table_set *set,
        table[0].procname = new_name;
        table[0].mode = S_IFDIR|S_IRUGO|S_IXUGO;
        init_header(&new->header, set->dir.header.root, set, node, table, 1);
+       printk("%ld sysctl saved mem kzalloc\n", sizeof(struct ctl_table));

        return new;
 }
@@ -1189,6 +1190,7 @@  static struct ctl_table_header *new_links(struct ctl_dir *dir, s>
                link_name += len;
                link++;
        }
+       printk("%ld sysctl saved mem kzalloc\n", sizeof(struct ctl_table));
        init_header(links, dir->header.root, dir->header.set, node, link_table,
                    head->ctl_table_size);
        links->nreg = nr_entries;