mmap.2: fix description of treatment of the hint

Message ID 20190211163203.33477-1-jannh@google.com (mailing list archive)
State New, archived
Series mmap.2: fix description of treatment of the hint

Commit Message

Jann Horn Feb. 11, 2019, 4:32 p.m. UTC
The current manpage reads to me as if the kernel will always pick a free
space close to the requested address, but that's not the case:

mmap(0x600000000000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0x600000000000
mmap(0x600000000000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0x7f5042859000

You can also see this in the various implementations of
->get_unmapped_area() - if the specified address isn't available, the
kernel basically ignores the hint (apart from the 5level paging hack).

Clarify how this works a bit.

Signed-off-by: Jann Horn <jannh@google.com>
---
 man2/mmap.2 | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
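
The two mmap() calls quoted in the commit message can be reproduced with a small program. The following is a minimal sketch, assuming 64-bit Linux; the helper names map_with_hint() and demo_hint_collision() are made up for illustration and are not part of the patch. The only behavior it relies on is that, without MAP_FIXED, the second mapping cannot land on top of the first one. Where it lands instead is up to the kernel.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical helper: request one anonymous private mapping at `hint`.
 * MAP_FIXED is deliberately not set, so `hint` is only a hint. */
static void *map_with_hint(uintptr_t hint, size_t len)
{
    return mmap((void *)hint, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Hypothetical helper: map twice with the same hint and print both
 * results. Returns 0 on success, -1 if either mmap() call fails. */
static int demo_hint_collision(uintptr_t hint, size_t len,
                               void **first, void **second)
{
    *first = map_with_hint(hint, len);
    if (*first == MAP_FAILED)
        return -1;

    /* The hinted range is now occupied (if the hint was honored at all),
     * so the kernel has to choose some other address for this call. */
    *second = map_with_hint(hint, len);
    if (*second == MAP_FAILED) {
        munmap(*first, len);
        return -1;
    }

    printf("hint   = %p\n", (void *)hint);
    printf("first  = %p\n", *first);
    printf("second = %p\n", *second);
    return 0;
}
```

Calling demo_hint_collision((uintptr_t)0x600000000000ULL, 4096, &a, &b) from main() mirrors the strace output above; the concrete addresses printed will differ between systems and runs.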

Comments

Michal Hocko Feb. 13, 2019, 11:47 a.m. UTC | #1
On Mon 11-02-19 17:32:03, Jann Horn wrote:
> The current manpage reads to me as if the kernel will always pick a free
> space close to the requested address, but that's not the case:
> 
> mmap(0x600000000000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
> -1, 0) = 0x600000000000
> mmap(0x600000000000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
> -1, 0) = 0x7f5042859000
> 
> You can also see this in the various implementations of
> ->get_unmapped_area() - if the specified address isn't available, the
> kernel basically ignores the hint (apart from the 5level paging hack).
> 
> Clarify how this works a bit.

Do we really want to be that specific? What if a future implementation
would like to ignore the mapping even if there is no colliding mapping
already? E.g. because of fragmentation avoidance or whatever other
reason. If we are explicit about the current implementation we might
give a recipe to userspace to depend on that behavior.

> Signed-off-by: Jann Horn <jannh@google.com>
> ---
>  man2/mmap.2 | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/man2/mmap.2 b/man2/mmap.2
> index fccfb9b3e..8556bbfeb 100644
> --- a/man2/mmap.2
> +++ b/man2/mmap.2
> @@ -71,7 +71,12 @@ If
>  .I addr
>  is not NULL,
>  then the kernel takes it as a hint about where to place the mapping;
> -on Linux, the mapping will be created at a nearby page boundary.
> +on Linux, the kernel will pick a nearby page boundary (but always above
> +or equal to the value specified by
> +.IR /proc/sys/vm/mmap_min_addr )
> +and attempt to create the mapping there.
> +If another mapping already exists there, the kernel picks a new
> +address, independent of the hint.
>  .\" Before Linux 2.6.24, the address was rounded up to the next page
>  .\" boundary; since 2.6.24, it is rounded down!
>  The address of the new mapping is returned as the result of the call.
> -- 
> 2.20.1.791.gb4d0f1c61a-goog

Jann Horn Feb. 13, 2019, 11:53 a.m. UTC | #2
On Wed, Feb 13, 2019 at 12:47 PM Michal Hocko <mhocko@kernel.org> wrote:
> On Mon 11-02-19 17:32:03, Jann Horn wrote:
> > The current manpage reads to me as if the kernel will always pick a free
> > space close to the requested address, but that's not the case:
> >
> > mmap(0x600000000000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
> > -1, 0) = 0x600000000000
> > mmap(0x600000000000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
> > -1, 0) = 0x7f5042859000
> >
> > You can also see this in the various implementations of
> > ->get_unmapped_area() - if the specified address isn't available, the
> > kernel basically ignores the hint (apart from the 5level paging hack).
> >
> > Clarify how this works a bit.
>
> Do we really want to be that specific? What if a future implementation
> would like to ignore the mapping even if there is no colliding mapping
> already? E.g. because of fragmentation avoidance or whatever other
> reason. If we are explicit about the current implementation we might
> give a recipe to userspace to depend on that behavior.

You have a point. So I guess we want something like this?

"If another mapping already exists there, the kernel picks a new
address that may or may not depend on the hint."

Unless someone can come up with a nicer wording for this?

Michal Hocko Feb. 13, 2019, 12:22 p.m. UTC | #3
On Wed 13-02-19 12:53:15, Jann Horn wrote:
> On Wed, Feb 13, 2019 at 12:47 PM Michal Hocko <mhocko@kernel.org> wrote:
> > On Mon 11-02-19 17:32:03, Jann Horn wrote:
> > > The current manpage reads to me as if the kernel will always pick a free
> > > space close to the requested address, but that's not the case:
> > >
> > > mmap(0x600000000000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
> > > -1, 0) = 0x600000000000
> > > mmap(0x600000000000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
> > > -1, 0) = 0x7f5042859000
> > >
> > > You can also see this in the various implementations of
> > > ->get_unmapped_area() - if the specified address isn't available, the
> > > kernel basically ignores the hint (apart from the 5level paging hack).
> > >
> > > Clarify how this works a bit.
> >
> > Do we really want to be that specific? What if a future implementation
> > would like to ignore the mapping even if there is no colliding mapping
> > already? E.g. because of fragmentation avoidance or whatever other
> > reason. If we are explicit about the current implementation we might
> > give a recipe to userspace to depend on that behavior.
> 
> You have a point. So I guess we want something like this?
> 
> "If another mapping already exists there, the kernel picks a new
> address that may or may not depend on the hint."

Yes, this sounds good to me.
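
One practical consequence of the concern raised in this thread is that userspace should check the address mmap() returns rather than assume anything about how the hint is treated. The following is a minimal sketch of that check, assuming Linux; map_exactly_at() is a hypothetical helper name, not something proposed in the patch or provided by libc.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper: try to map `len` bytes at exactly `want`
 * (which should be page-aligned). The address is passed only as a
 * hint, without MAP_FIXED, so an existing mapping is never clobbered.
 * Returns the mapping on success, or NULL if mmap() failed or the
 * kernel placed the mapping anywhere other than `want`. */
static void *map_exactly_at(uintptr_t want, size_t len)
{
    void *got = mmap((void *)want, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED)
        return NULL;

    if ((uintptr_t)got != want) {
        /* The kernel chose a different address; undo and report it. */
        munmap(got, len);
        return NULL;
    }
    return got;
}
```

This makes no assumption about whether the fallback address is near the hint, which matches the "may or may not depend on the hint" wording agreed on above.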

Patch

diff --git a/man2/mmap.2 b/man2/mmap.2
index fccfb9b3e..8556bbfeb 100644
--- a/man2/mmap.2
+++ b/man2/mmap.2
@@ -71,7 +71,12 @@  If
 .I addr
 is not NULL,
 then the kernel takes it as a hint about where to place the mapping;
-on Linux, the mapping will be created at a nearby page boundary.
+on Linux, the kernel will pick a nearby page boundary (but always above
+or equal to the value specified by
+.IR /proc/sys/vm/mmap_min_addr )
+and attempt to create the mapping there.
+If another mapping already exists there, the kernel picks a new
+address, independent of the hint.
 .\" Before Linux 2.6.24, the address was rounded up to the next page
 .\" boundary; since 2.6.24, it is rounded down!
 The address of the new mapping is returned as the result of the call.