mm: optionally disable brk()

Message ID 20201002171921.3053-1-toiwoton@gmail.com (mailing list archive)
State New, archived
Series mm: optionally disable brk()

Commit Message

Topi Miettinen Oct. 2, 2020, 5:19 p.m. UTC
The brk() system call allows changing the data segment size (heap). It
is mainly used by glibc for memory allocation, but glibc can use mmap()
instead, which results in more randomized memory mappings: the heap is
always located at a fixed offset from the program, while mmap()ed memory
is randomized.

Signed-off-by: Topi Miettinen <toiwoton@gmail.com>
---
 init/Kconfig    | 15 +++++++++++++++
 kernel/sys_ni.c |  2 ++
 mm/mmap.c       |  2 ++
 3 files changed, 19 insertions(+)

Comments

David Hildenbrand Oct. 2, 2020, 5:52 p.m. UTC | #1
On 02.10.20 19:19, Topi Miettinen wrote:
> The brk() system call allows to change data segment size (heap). This
> is mainly used by glibc for memory allocation, but it can use mmap()
> and that results in more randomized memory mappings since the heap is
> always located at fixed offset to program while mmap()ed memory is
> randomized.

Want to take more Unix out of Linux?

Honestly, why care about disabling? User space can happily use mmap() if
it prefers.
David Laight Oct. 2, 2020, 9:19 p.m. UTC | #2
From: David Hildenbrand
> Sent: 02 October 2020 18:52
> 
> On 02.10.20 19:19, Topi Miettinen wrote:
> > The brk() system call allows to change data segment size (heap). This
> > is mainly used by glibc for memory allocation, but it can use mmap()
> > and that results in more randomized memory mappings since the heap is
> > always located at fixed offset to program while mmap()ed memory is
> > randomized.
> 
> Want to take more Unix out of Linux?
> 
> Honestly, why care about disabling? User space can happily use mmap() if
> it prefers.

I bet some obscure applications rely on it.

Although hopefully nothing still does heap allocation
by just increasing the VA and calling brk() in response
to SIGSEGV.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Topi Miettinen Oct. 2, 2020, 9:44 p.m. UTC | #3
On 2.10.2020 20.52, David Hildenbrand wrote:
> On 02.10.20 19:19, Topi Miettinen wrote:
>> The brk() system call allows to change data segment size (heap). This
>> is mainly used by glibc for memory allocation, but it can use mmap()
>> and that results in more randomized memory mappings since the heap is
>> always located at fixed offset to program while mmap()ed memory is
>> randomized.
> 
> Want to take more Unix out of Linux?
> 
> Honestly, why care about disabling? User space can happily use mmap() if
> it prefers.

brk() interface doesn't seem to be used much and glibc is happy to 
switch to mmap() if brk() fails, so why not allow disabling it 
optionally? If you don't care to disable, don't do it and this is even 
the default.

-Topi
Michal Hocko Oct. 5, 2020, 6:12 a.m. UTC | #4
On Sat 03-10-20 00:44:09, Topi Miettinen wrote:
> On 2.10.2020 20.52, David Hildenbrand wrote:
> > On 02.10.20 19:19, Topi Miettinen wrote:
> > > The brk() system call allows to change data segment size (heap). This
> > > is mainly used by glibc for memory allocation, but it can use mmap()
> > > and that results in more randomized memory mappings since the heap is
> > > always located at fixed offset to program while mmap()ed memory is
> > > randomized.
> > 
> > Want to take more Unix out of Linux?
> > 
> > Honestly, why care about disabling? User space can happily use mmap() if
> > it prefers.
> 
> brk() interface doesn't seem to be used much and glibc is happy to switch to
> mmap() if brk() fails, so why not allow disabling it optionally? If you
> don't care to disable, don't do it and this is even the default.

I do not think we want to have config per syscall, do we? There are many
other syscalls which are rarely used. Your changelog is actually missing
the most important part. Why do we care so much to increase the config
space and make the kernel even more tricky for users to configure? How
do I know that something won't break? brk() is one of those syscalls
that has been here for ever and a lot of userspace might depend on it.
I haven't checked but the code size is very unlikely to be shrunk much
as this is mostly a tiny wrapper around mmap code. We are not going to
get rid of any complexity.

So what is the point?
Topi Miettinen Oct. 5, 2020, 8:11 a.m. UTC | #5
On 5.10.2020 9.12, Michal Hocko wrote:
> On Sat 03-10-20 00:44:09, Topi Miettinen wrote:
>> On 2.10.2020 20.52, David Hildenbrand wrote:
>>> On 02.10.20 19:19, Topi Miettinen wrote:
>>>> The brk() system call allows to change data segment size (heap). This
>>>> is mainly used by glibc for memory allocation, but it can use mmap()
>>>> and that results in more randomized memory mappings since the heap is
>>>> always located at fixed offset to program while mmap()ed memory is
>>>> randomized.
>>>
>>> Want to take more Unix out of Linux?
>>>
>>> Honestly, why care about disabling? User space can happily use mmap() if
>>> it prefers.
>>
>> brk() interface doesn't seem to be used much and glibc is happy to switch to
>> mmap() if brk() fails, so why not allow disabling it optionally? If you
>> don't care to disable, don't do it and this is even the default.
> 
> I do not think we want to have config per syscall, do we? There are many
> other syscalls which are rarely used. Your changelog is actually missing
> the most important part. Why do we care so much to increase the config
> space and make the kernel even more tricky for users to configure?

Maybe, I didn't know this was an important priority since there are 
other similar config options. Can you suggest some other config option 
which could trigger this? This option is already buried under CONFIG_EXPERT.

> How
> do I know that something won't break? brk() is one of those syscalls
> that has been here for ever and a lot of userspace might depend on it.

1. brk() is glibc's primary choice for malloc(), with mmap(NULL, ...) as 
the secondary one. But malloc() switches to using only mmap() as soon 
as brk() fails the first time, without breakage.

2. brk() is also used for initializing glibc's internal thread structures. 
The only program I saw having problems was ldconfig, which indeed 
segfaults due to an unsafe assumption that sbrk() will never fail. This 
is easily fixable by switching to an internal version of mmap().

3. The dynamic loader uses brk(), but this is only done to help malloc() 
and nothing breaks there if brk() returns ENOSYS.

I've sent RFC patches to the glibc list which switch to mmap() completely. 
This improves the randomization for malloc()ated memory and the location 
of the thread structures.
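
To illustrate the fallback described in point 1, here is a minimal sketch of an allocator backend that tries sbrk() first and switches permanently to mmap() after the first failure. This is not glibc's code; grow_heap() and the brk_failed flag are hypothetical names made up for the example, and real malloc() does considerably more bookkeeping.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for sbrk() and MAP_ANONYMOUS */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical sticky flag: set once sbrk() has failed. */
int brk_failed;

/*
 * Hypothetical helper: obtain len bytes of fresh memory.
 * Tries sbrk() until it fails once, then uses only mmap().
 * Returns NULL if neither interface can provide memory.
 */
void *grow_heap(size_t len)
{
        if (!brk_failed) {
                void *p = sbrk((intptr_t)len);

                if (p != (void *)-1)
                        return p;       /* old break = start of new area */
                brk_failed = 1;         /* e.g. ENOSYS or ENOMEM: stop trying */
        }

        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        return p == MAP_FAILED ? NULL : p;
}
```

A caller would simply use grow_heap(4096) and check for NULL; on a kernel where brk() returns ENOSYS, the first call would set brk_failed and every later call would go straight to mmap().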

> I haven't checked but the code size is very unlikely to be shrunk much
> as this is mostly a tiny wrapper around mmap code. We are not going to
> get rid of any complexity.
> 
> So what is the point?

The point is not to shrink the kernel (it will shrink by one small 
function) or get rid of complexity. The point is to disable an inferior 
interface. Memory returned by mmap() is at a random location but with 
brk() it is located near the data segment, so the address is more easily 
predictable.

I think hardened, security oriented systems should disable brk() 
completely because it will increase the randomization of the process 
address space (ASLR). This wouldn't be a good option to enable for 
systems where maximum compatibility with legacy software is more 
important than any hardening.
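
One way to look at the difference is to compare the current program break with the address of a fresh anonymous mapping. The following is only a rough sketch: sample_layout() and data_marker are hypothetical names, the actual distances depend on the architecture and on the randomize_va_space setting, and a single run proves nothing about entropy.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for sbrk() and MAP_ANONYMOUS */
#endif
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical marker that lives in the program's own data/bss. */
int data_marker;

/*
 * Hypothetical helper: report an address in the data segment, the
 * current program break and the address chosen for a new anonymous
 * mapping. Returns 0 on success, -1 if sbrk() or mmap() failed.
 */
int sample_layout(void **data, void **brk_now, void **mapped)
{
        void *b = sbrk(0);      /* query the break without moving it */
        void *m = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        if (b == (void *)-1 || m == MAP_FAILED) {
                if (m != MAP_FAILED)
                        munmap(m, 4096);
                return -1;
        }

        *data = &data_marker;
        *brk_now = b;
        *mapped = m;
        munmap(m, 4096);        /* only the address is of interest */
        return 0;
}

/* Print the three addresses so that several runs can be compared. */
void print_layout(void)
{
        void *d, *b, *m;

        if (sample_layout(&d, &b, &m) == 0)
                printf("data %p  brk %p  mmap %p\n", d, b, m);
}
```

Calling print_layout() from main() in several separate runs shows how far each address moves between executions.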

-Topi
Michal Hocko Oct. 5, 2020, 8:22 a.m. UTC | #6
On Mon 05-10-20 11:11:35, Topi Miettinen wrote:
[...]
> I think hardened, security oriented systems should disable brk() completely
> because it will increase the randomization of the process address space
> (ASLR). This wouldn't be a good option to enable for systems where maximum
> compatibility with legacy software is more important than any hardening.

I believe we already do have means to filter syscalls from userspace for
security hardened environments. Or is there any reason to duplicate
that and control during the configuration time?
Topi Miettinen Oct. 5, 2020, 9:03 a.m. UTC | #7
On 5.10.2020 11.22, Michal Hocko wrote:
> On Mon 05-10-20 11:11:35, Topi Miettinen wrote:
> [...]
>> I think hardened, security oriented systems should disable brk() completely
>> because it will increase the randomization of the process address space
>> (ASLR). This wouldn't be a good option to enable for systems where maximum
>> compatibility with legacy software is more important than any hardening.
> 
> I believe we already do have means to filter syscalls from userspace for
> security hardened environments. Or is there any reason to duplicate
> that and control during the configuration time?

This is true, but seccomp can't be used in cases where NoNewPrivileges 
can't be enabled (setuid/setgid binaries are present, which sadly is still 
often the case even in otherwise hardened systems), so it's typically not 
possible to install a filter for the whole system.

-Topi
David Hildenbrand Oct. 5, 2020, 9:13 a.m. UTC | #8
On 05.10.20 08:12, Michal Hocko wrote:
> On Sat 03-10-20 00:44:09, Topi Miettinen wrote:
>> On 2.10.2020 20.52, David Hildenbrand wrote:
>>> On 02.10.20 19:19, Topi Miettinen wrote:
>>>> The brk() system call allows to change data segment size (heap). This
>>>> is mainly used by glibc for memory allocation, but it can use mmap()
>>>> and that results in more randomized memory mappings since the heap is
>>>> always located at fixed offset to program while mmap()ed memory is
>>>> randomized.
>>>
>>> Want to take more Unix out of Linux?
>>>
>>> Honestly, why care about disabling? User space can happily use mmap() if
>>> it prefers.
>>
>> brk() interface doesn't seem to be used much and glibc is happy to switch to
>> mmap() if brk() fails, so why not allow disabling it optionally? If you
>> don't care to disable, don't do it and this is even the default.
> 
> I do not think we want to have config per syscall, do we? 

I do wonder if grouping would be a better option then (finding a proper
level of abstraction ...).
Michal Hocko Oct. 5, 2020, 9:20 a.m. UTC | #9
On Mon 05-10-20 11:13:48, David Hildenbrand wrote:
> On 05.10.20 08:12, Michal Hocko wrote:
> > On Sat 03-10-20 00:44:09, Topi Miettinen wrote:
> >> On 2.10.2020 20.52, David Hildenbrand wrote:
> >>> On 02.10.20 19:19, Topi Miettinen wrote:
> >>>> The brk() system call allows to change data segment size (heap). This
> >>>> is mainly used by glibc for memory allocation, but it can use mmap()
> >>>> and that results in more randomized memory mappings since the heap is
> >>>> always located at fixed offset to program while mmap()ed memory is
> >>>> randomized.
> >>>
> >>> Want to take more Unix out of Linux?
> >>>
> >>> Honestly, why care about disabling? User space can happily use mmap() if
> >>> it prefers.
> >>
> >> brk() interface doesn't seem to be used much and glibc is happy to switch to
> >> mmap() if brk() fails, so why not allow disabling it optionally? If you
> >> don't care to disable, don't do it and this is even the default.
> > 
> > I do not think we want to have config per syscall, do we? 
> 
> I do wonder if grouping would be a better option then (finding a proper
> level of abstraction ...).

I have a vague recollection that the kernel tinification project was
aiming in that direction. No idea what the current state is or whether
somebody is pursuing it.
Topi Miettinen Oct. 5, 2020, 9:47 a.m. UTC | #10
On 5.10.2020 12.13, David Hildenbrand wrote:
> On 05.10.20 08:12, Michal Hocko wrote:
>> On Sat 03-10-20 00:44:09, Topi Miettinen wrote:
>>> On 2.10.2020 20.52, David Hildenbrand wrote:
>>>> On 02.10.20 19:19, Topi Miettinen wrote:
>>>>> The brk() system call allows to change data segment size (heap). This
>>>>> is mainly used by glibc for memory allocation, but it can use mmap()
>>>>> and that results in more randomized memory mappings since the heap is
>>>>> always located at fixed offset to program while mmap()ed memory is
>>>>> randomized.
>>>>
>>>> Want to take more Unix out of Linux?
>>>>
>>>> Honestly, why care about disabling? User space can happily use mmap() if
>>>> it prefers.
>>>
>>> brk() interface doesn't seem to be used much and glibc is happy to switch to
>>> mmap() if brk() fails, so why not allow disabling it optionally? If you
>>> don't care to disable, don't do it and this is even the default.
>>
>> I do not think we want to have config per syscall, do we?
> 
> I do wonder if grouping would be a better option then (finding a proper
> level of abstraction ...).

If hardening and compatibility are seen as tradeoffs, perhaps there 
could be a top level config choice (CONFIG_HARDENING_TRADEOFF) for this. 
It would have options
- "compatibility" (default) to gear questions for maximum compatibility, 
deselecting any hardening options which reduce compatibility
- "hardening" to gear questions for maximum hardening, deselecting any 
compatibility options which reduce hardening
- "none/manual": ask all questions like before

-Topi
David Hildenbrand Oct. 5, 2020, 9:55 a.m. UTC | #11
On 05.10.20 11:47, Topi Miettinen wrote:
> On 5.10.2020 12.13, David Hildenbrand wrote:
>> On 05.10.20 08:12, Michal Hocko wrote:
>>> On Sat 03-10-20 00:44:09, Topi Miettinen wrote:
>>>> On 2.10.2020 20.52, David Hildenbrand wrote:
>>>>> On 02.10.20 19:19, Topi Miettinen wrote:
>>>>>> The brk() system call allows to change data segment size (heap). This
>>>>>> is mainly used by glibc for memory allocation, but it can use mmap()
>>>>>> and that results in more randomized memory mappings since the heap is
>>>>>> always located at fixed offset to program while mmap()ed memory is
>>>>>> randomized.
>>>>>
>>>>> Want to take more Unix out of Linux?
>>>>>
>>>>> Honestly, why care about disabling? User space can happily use mmap() if
>>>>> it prefers.
>>>>
>>>> brk() interface doesn't seem to be used much and glibc is happy to switch to
>>>> mmap() if brk() fails, so why not allow disabling it optionally? If you
>>>> don't care to disable, don't do it and this is even the default.
>>>
>>> I do not think we want to have config per syscall, do we?
>>
>> I do wonder if grouping would be a better option then (finding a proper
>> level of abstraction ...).
> 
> If hardening and compatibility are seen as tradeoffs, perhaps there 
> could be a top level config choice (CONFIG_HARDENING_TRADEOFF) for this. 
> It would have options
> - "compatibility" (default) to gear questions for maximum compatibility, 
> deselecting any hardening options which reduce compatibility
> - "hardening" to gear questions for maximum hardening, deselecting any 
> compatibility options which reduce hardening
> - "none/manual": ask all questions like before

I think the general direction is to avoid an exploding set of config
options. So if there isn't a *real* demand, I guess gluing this to a
single option ("CONFIG_SECURITY_HARDENING") might be good enough.
David Laight Oct. 5, 2020, 11:21 a.m. UTC | #12
From: David Hildenbrand
> Sent: 05 October 2020 10:55
...
> > If hardening and compatibility are seen as tradeoffs, perhaps there
> > could be a top level config choice (CONFIG_HARDENING_TRADEOFF) for this.
> > It would have options
> > - "compatibility" (default) to gear questions for maximum compatibility,
> > deselecting any hardening options which reduce compatibility
> > - "hardening" to gear questions for maximum hardening, deselecting any
> > compatibility options which reduce hardening
> > - "none/manual": ask all questions like before
> 
> I think the general direction is to avoid an exploding set of config
> options. So if there isn't a *real* demand, I guess gluing this to a
> single option ("CONFIG_SECURITY_HARDENING") might be good enough.

Wouldn't that be better achieved by run-time clobbering
of the syscall vectors?

	David

David Hildenbrand Oct. 5, 2020, 12:18 p.m. UTC | #13
On 05.10.20 13:21, David Laight wrote:
> From: David Hildenbrand
>> Sent: 05 October 2020 10:55
> ...
>>> If hardening and compatibility are seen as tradeoffs, perhaps there
>>> could be a top level config choice (CONFIG_HARDENING_TRADEOFF) for this.
>>> It would have options
>>> - "compatibility" (default) to gear questions for maximum compatibility,
>>> deselecting any hardening options which reduce compatibility
>>> - "hardening" to gear questions for maximum hardening, deselecting any
>>> compatibility options which reduce hardening
>>> - "none/manual": ask all questions like before
>>
>> I think the general direction is to avoid an exploding set of config
>> options. So if there isn't a *real* demand, I guess gluing this to a
>> single option ("CONFIG_SECURITY_HARDENING") might be good enough.
> 
> Wouldn't that be better achieved by run-time clobbering
> of the syscall vectors?

You mean via something like a boot parameter? Possibly yes.
David Laight Oct. 5, 2020, 12:25 p.m. UTC | #14
From: David Hildenbrand
> Sent: 05 October 2020 13:19
> 
> On 05.10.20 13:21, David Laight wrote:
> > From: David Hildenbrand
> >> Sent: 05 October 2020 10:55
> > ...
> >>> If hardening and compatibility are seen as tradeoffs, perhaps there
> >>> could be a top level config choice (CONFIG_HARDENING_TRADEOFF) for this.
> >>> It would have options
> >>> - "compatibility" (default) to gear questions for maximum compatibility,
> >>> deselecting any hardening options which reduce compatibility
> >>> - "hardening" to gear questions for maximum hardening, deselecting any
> >>> compatibility options which reduce hardening
> >>> - "none/manual": ask all questions like before
> >>
> >> I think the general direction is to avoid an exploding set of config
> >> options. So if there isn't a *real* demand, I guess gluing this to a
> >> single option ("CONFIG_SECURITY_HARDENING") might be good enough.
> >
> > Wouldn't that be better achieved by run-time clobbering
> > of the syscall vectors?
> 
> You mean via something like a boot parameter? Possibly yes.

I was thinking of later.
Some kind of restricted system might want to 'clobber'
mount() after everything is running.

	David

Jonathan Corbet Oct. 5, 2020, 2:12 p.m. UTC | #15
On Mon, 5 Oct 2020 11:11:35 +0300
Topi Miettinen <toiwoton@gmail.com> wrote:

> The point is not to shrink the kernel (it will shrink by one small 
> function) or get rid of complexity. The point is to disable an inferior 
> interface. Memory returned by mmap() is at a random location but with 
> brk() it is located near the data segment, so the address is more easily 
> predictable.

So if your true objective is to get glibc to allocate memory differently,
perhaps the right thing to do is to patch glibc?

Thanks,

jon
Topi Miettinen Oct. 5, 2020, 4:14 p.m. UTC | #16
On 5.10.2020 17.12, Jonathan Corbet wrote:
> On Mon, 5 Oct 2020 11:11:35 +0300
> Topi Miettinen <toiwoton@gmail.com> wrote:
> 
>> The point is not to shrink the kernel (it will shrink by one small
>> function) or get rid of complexity. The point is to disable an inferior
>> interface. Memory returned by mmap() is at a random location but with
>> brk() it is located near the data segment, so the address is more easily
>> predictable.
> 
> So if your true objective is to get glibc to allocate memory differently,
> perhaps the right thing to do is to patch glibc?

Of course:
https://sourceware.org/pipermail/libc-alpha/2020-October/118319.html

But since glibc is pretty much the only user of brk(), it also makes 
sense to disable it in the kernel if nothing uses it anymore.

-Topi
Topi Miettinen Oct. 7, 2020, 9:43 a.m. UTC | #17
On 5.10.2020 15.25, David Laight wrote:
> From: David Hildenbrand
>> Sent: 05 October 2020 13:19
>>
>> On 05.10.20 13:21, David Laight wrote:
>>> From: David Hildenbrand
>>>> Sent: 05 October 2020 10:55
>>> ...
>>>>> If hardening and compatibility are seen as tradeoffs, perhaps there
>>>>> could be a top level config choice (CONFIG_HARDENING_TRADEOFF) for this.
>>>>> It would have options
>>>>> - "compatibility" (default) to gear questions for maximum compatibility,
>>>>> deselecting any hardening options which reduce compatibility
>>>>> - "hardening" to gear questions for maximum hardening, deselecting any
>>>>> compatibility options which reduce hardening
>>>>> - "none/manual": ask all questions like before
>>>>
>>>> I think the general direction is to avoid an exploding set of config
>>>> options. So if there isn't a *real* demand, I guess gluing this to a
>>>> single option ("CONFIG_SECURITY_HARDENING") might be good enough.
>>>
>>> Wouldn't that be better achieved by run-time clobbering
>>> of the syscall vectors?
>>
>> You mean via something like a boot parameter? Possibly yes.
> 
> I was thinking of later.
> Some kind of restricted system might want to 'clobber'
> mount() after everything is running.

Perhaps suitably privileged tasks should be able to install global 
seccomp filters which would disregard any NoNewPrivileges requirements 
and would apply immediately to all tasks. The boot parameter would 
also be nice so that initrd and PID1 would be restricted as well. Seccomp 
would also allow more specific filtering than messing with the syscall 
tables.

-Topi
Topi Miettinen Nov. 1, 2020, 11:41 a.m. UTC | #18
On 5.10.2020 15.18, David Hildenbrand wrote:
> On 05.10.20 13:21, David Laight wrote:
>> From: David Hildenbrand
>>> Sent: 05 October 2020 10:55
>> ...
>>>> If hardening and compatibility are seen as tradeoffs, perhaps there
>>>> could be a top level config choice (CONFIG_HARDENING_TRADEOFF) for this.
>>>> It would have options
>>>> - "compatibility" (default) to gear questions for maximum compatibility,
>>>> deselecting any hardening options which reduce compatibility
>>>> - "hardening" to gear questions for maximum hardening, deselecting any
>>>> compatibility options which reduce hardening
>>>> - "none/manual": ask all questions like before
>>>
>>> I think the general direction is to avoid an exploding set of config
>>> options. So if there isn't a *real* demand, I guess gluing this to a
>>> single option ("CONFIG_SECURITY_HARDENING") might be good enough.
>>
>> Wouldn't that be better achieved by run-time clobbering
>> of the syscall vectors?
> 
> You mean via something like a boot parameter? Possibly yes.
> 

This may be obvious, but a global seccomp filter which doesn't affect 
NNP can be installed in initrd with a simple program with no changes to 
kernel:

#include <errno.h>
#include <seccomp.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv) {
        if (argc < 3) {
                fprintf(stderr, "Usage: %s syscall [syscall]... program\n", argv[0]);
                return EXIT_FAILURE;
        }

        /* Allow everything by default; only the listed syscalls are denied. */
        scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ALLOW);

        if (ctx == NULL) {
                fprintf(stderr, "failed to init filter\n");
                return EXIT_FAILURE;
        }

        int r;
        /* Do not set NoNewPrivileges on load (this requires CAP_SYS_ADMIN). */
        r = seccomp_attr_set(ctx, SCMP_FLTATR_CTL_NNP, 0);
        if (r != 0) {
                fprintf(stderr, "failed to disable NNP\n");
                return EXIT_FAILURE;
        }

        fprintf(stderr, "filtering");
        /* All arguments except the last one are syscall names. */
        for (int i = 1; i < argc - 1; i++) {
                const char *syscall = argv[i];

                int syscall_nr = seccomp_syscall_resolve_name(syscall);

                if (syscall_nr == __NR_SCMP_ERROR) {
                        //fprintf(stderr, "unknown syscall %s, ignoring\n", syscall);
                        continue;
                }
                /* Make the syscall fail with ENOSYS, as if it did not exist. */
                r = seccomp_rule_add_exact(ctx, SCMP_ACT_ERRNO(ENOSYS), syscall_nr, 0);
                if (r != 0) {
                        //fprintf(stderr, "failed to filter syscall %s, ignoring\n", syscall);
                        continue;
                }
                fprintf(stderr, " %s", syscall);
        }
        fprintf(stderr, "\n");
        r = seccomp_load(ctx);
        if (r != 0) {
                fprintf(stderr, "failed to apply filter\n");
                return EXIT_FAILURE;
        }

        seccomp_release(ctx);

        /* The last argument is the program to run under the filter. */
        char *program = argv[argc - 1];
        char *new_argv[] = { program, NULL };

        /* The loaded filter is kept across execv() and inherited by children. */
        execv(program, new_argv);

        fprintf(stderr, "failed to exec %s\n", program);
        return EXIT_FAILURE;
}

This can be inserted in initrd to disable some obsolete and old system 
calls like this:
#!/bin/sh

exec /usr/local/sbin/seccomp-exec _sysctl afs_syscall bdflush break \
    create_module ftime get_kernel_syms getpmsg gtty idle lock mpx prof \
    profil putpmsg query_module security sgetmask ssetmask stty sysfs \
    tuxcall ulimit uselib ustat vserver epoll_ctl_old epoll_wait_old \
    old_adjtimex old_getpagesize oldfstat oldlstat oldolduname oldstat \
    oldumount olduname osf_old_creat osf_old_fstat osf_old_getpgrp \
    osf_old_killpg osf_old_lstat osf_old_open osf_old_sigaction \
    osf_old_sigblock osf_old_sigreturn osf_old_sigsetmask osf_old_sigvec \
    osf_old_stat osf_old_vadvise osf_old_vtrace osf_old_wait osf_oldquota \
    vm86old brk /init

-Topi
Patch

diff --git a/init/Kconfig b/init/Kconfig
index c5ea2e694f6a..53735ac305d8 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1851,6 +1851,20 @@  config SLUB_MEMCG_SYSFS_ON
 	  controlled by slub_memcg_sysfs boot parameter and this
 	  config option determines the parameter's default value.
 
+config BRK_SYSCALL
+	bool "Enable brk() system call" if EXPERT
+	default y
+	help
+	  Enable the brk() system call that allows to change data
+	  segment size (heap). This is mainly used by glibc for memory
+	  allocation, but it can use mmap() and that results in more
+	  randomized memory mappings since the heap is always located
+	  at fixed offset to program while mmap()ed memory is
+	  randomized.
+
+	  If unsure, say Y for maximum compatibility.
+
+if BRK_SYSCALL
 config COMPAT_BRK
 	bool "Disable heap randomization"
 	default y
@@ -1862,6 +1876,7 @@  config COMPAT_BRK
 	  /proc/sys/kernel/randomize_va_space to 2 or 3.
 
 	  On non-ancient distros (post-2000 ones) N is usually a safe choice.
+endif # BRK_SYSCALL
 
 choice
 	prompt "Choose SLAB allocator"
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 4d59775ea79c..3ffa5c4002e1 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -299,6 +299,8 @@  COND_SYSCALL(recvmmsg_time32);
 COND_SYSCALL_COMPAT(recvmmsg_time32);
 COND_SYSCALL_COMPAT(recvmmsg_time64);
 
+COND_SYSCALL(brk);
+
 /*
  * Architecture specific syscalls: see further below
  */
diff --git a/mm/mmap.c b/mm/mmap.c
index 489368f43af1..653be2c8982a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -188,6 +188,7 @@  static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
 
 static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long flags,
 		struct list_head *uf);
+#ifdef CONFIG_BRK_SYSCALL
 SYSCALL_DEFINE1(brk, unsigned long, brk)
 {
 	unsigned long retval;
@@ -286,6 +287,7 @@  SYSCALL_DEFINE1(brk, unsigned long, brk)
 	mmap_write_unlock(mm);
 	return retval;
 }
+#endif
 
 static inline unsigned long vma_compute_gap(struct vm_area_struct *vma)
 {