Message ID | 20250409200316.1555164-1-jannh@google.com (mailing list archive) |
---|---|
State | New |
Series | [man] mmap.2: Document danger of mappings larger than PTRDIFF_MAX |
* Jann Horn <jannh@google.com>, 2025-04-09 22:03:
> - glibc malloc restricts object size to <=PTRDIFF_MAX in
>   checked_request2size()

FWIW, this is done only since glibc v2.30 (released in 2019):
https://sourceware.org/cgit/glibc/commit/?id=9bf8e29ca136094f
Hi Jann,

On Wed, Apr 09, 2025 at 10:03:16PM +0200, Jann Horn wrote:
> References:
> - C99 draft: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf
>   section "6.5.6 Additive operators", paragraph 9
> - object size restriction in GCC:
>   https://gcc.gnu.org/legacy-ml/gcc/2011-08/msg00221.html
> - glibc malloc restricts object size to <=PTRDIFF_MAX in
>   checked_request2size()
> ---
> I'm not sure if we can reasonably do anything about this in the kernel,
> given that the kernel does not really have any idea of what userspace
> object sizes look like,

Hmmm. Maybe it could reject PTRDIFF_MAX within the kernel, which would
at least work for cases where user-space ptrdiff_t matches the kernel's
ptrdiff_t? Then only users where they don't match would be unprotected,
but those are hopefully extra careful.

> or whether userspace even wants C semantics.

I guess any language will have to link to C at some point, or have
inherent limitations similar to those of C.

> But we can at least document it...

Yep. Most people are unaware of this, and believe they can get
SIZE_MAX.

>
> @man-pages maintainer: Please wait a few days before applying this;
> I imagine there might be some discussion about this.

Okay; see some minor comments below.

>
> man/man2/mmap.2 | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/man/man2/mmap.2 b/man/man2/mmap.2
> index caf822103..9cb7dacf3 100644
> --- a/man/man2/mmap.2
> +++ b/man/man2/mmap.2
> @@ -785,6 +785,23 @@ correspond to added or removed regions of the file is unspecified.
>  An application can determine which pages of a mapping are
>  currently resident in the buffer/page cache using
>  .BR mincore (2).
> +.P
> +Unlike typical
> +.BR malloc (3)
> +implementations,
> +.BR mmap ()
> +does not prevent creating objects larger than
> +.B PTRDIFF_MAX.

.BR PTRDIFF_MAX .

(since you want the '.' not bold, but roman)

> +Objects that are larger than
> +.B PTRDIFF_MAX
> +only work in limited ways in standard C (in particular, pointer subtraction

Please break the line also before the '('.

> +results in undefined behavior if the result would be bigger than
> +.B PTRDIFF_MAX).

.BR PTRDIFF_MAX ).

(same reasons)

> +On top of that, GCC also assumes that no object is bigger than
> +.B PTRDIFF_MAX.

.BR PTRDIFF_MAX .

> +.B PTRDIFF_MAX
> +is usually half of the address space size; so for 32-bit processes, it is

Please break the line after ';' and after ',' (and not after 'is').
See also man-pages(7):

$ MANWIDTH=72 man man-pages | sed -n '/Use semantic newlines/,/^$/p'
   Use semantic newlines
       In the source of a manual page, new sentences should be started
       on new lines, long sentences should be split into lines at clause
       breaks (commas, semicolons, colons, and so on), and long clauses
       should be split at phrase boundaries. This convention, sometimes
       known as "semantic newlines", makes it easier to see the effect
       of patches, which often operate at the level of individual
       sentences, clauses, or phrases.

Have a lovely night!
Alex

> +usually 0x7fffffff (almost 2 GiB).
>  .\"
>  .SS Using MAP_FIXED safely
>  The only safe use for
>
> base-commit: 4c4d9f0f5148caf1271394018d0f7381c1b8b400
> --
> 2.49.0.504.g3bcea36a83-goog
>
On Wed, Apr 9, 2025 at 10:41 PM Alejandro Colomar <alx@kernel.org> wrote:
> On Wed, Apr 09, 2025 at 10:03:16PM +0200, Jann Horn wrote:
> > References:
> > - C99 draft: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf
> >   section "6.5.6 Additive operators", paragraph 9
> > - object size restriction in GCC:
> >   https://gcc.gnu.org/legacy-ml/gcc/2011-08/msg00221.html
> > - glibc malloc restricts object size to <=PTRDIFF_MAX in
> >   checked_request2size()
> > ---
> > I'm not sure if we can reasonably do anything about this in the kernel,
> > given that the kernel does not really have any idea of what userspace
> > object sizes look like,
>
> Hmmm. Maybe it could reject PTRDIFF_MAX within the kernel, which would
> at least work for cases where user-space ptrdiff_t matches the kernel's
> ptrdiff_t? Then only users where they don't match would be unprotected,
> but those are hopefully extra careful.

Perhaps. But then some tricky things are:

1. How many existing users would we be breaking with such a change?
   Probably _someone_ out there is deliberately mapping files over 2G
   into 32-bit processes and it sorta worked until now...
2. We don't really have a concept of object size in the kernel, and it
   might be hard to reason about whether mmap() is used logically to
   create a new object or extend an existing object. I guess we could
   limit VMA sizes for 32-bit userspace to 0x7ffff000 and enforce a
   1-page gap around mappings that are at least half that size, or
   something like that, but that would probably get a bit ugly on the
   kernel side...

The first point is really the main concern for me - we might end up
breaking existing users.

> > or whether userspace even wants C semantics.
>
> I guess any language will have to link to C at some point, or have
> inherent limitations similar to those of C.

This limitation is really a result of deciding to make pointer
subtraction return a signed value, so that you can subtract a bigger
pointer from a smaller pointer. I don't know whether other languages
do that.

> > But we can at least document it...
>
> Yep. Most people are unaware of this, and believe they can get
> SIZE_MAX.
>
> >
> > @man-pages maintainer: Please wait a few days before applying this;
> > I imagine there might be some discussion about this.
>
> Okay; see some minor comments below.

Thanks. (I'll probably be out for the next two weeks or so, I'll
probably get back to this afterwards.)
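
To illustrate the point about signed pointer subtraction, here is a minimal sketch. The helper names `byte_distance` and `distance_fits_ptrdiff` are hypothetical and come from neither the patch nor any libc. It assumes a flat address space in which converting a pointer to `uintptr_t` preserves byte offsets, which the C standard does not guarantee. The distance is computed in unsigned arithmetic and then checked against the range of `ptrdiff_t`, instead of evaluating `hi - lo` directly.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: byte distance from lo to hi, where both point into
 * the same object and hi >= lo. The subtraction is done on uintptr_t values
 * (assumption: flat address space), so it cannot overflow ptrdiff_t.
 */
static size_t byte_distance(const char *lo, const char *hi)
{
    return (size_t)((uintptr_t)hi - (uintptr_t)lo);
}

/*
 * Hypothetical helper: reports whether a distance is representable as
 * ptrdiff_t, i.e. whether the plain expression `hi - lo` would be defined.
 */
static int distance_fits_ptrdiff(size_t distance)
{
    return distance <= (size_t)PTRDIFF_MAX;
}
```

This only sidesteps the subtraction itself. It does not make an oversized object safe to use, because, as the patch text notes, GCC assumes that no object is bigger than PTRDIFF_MAX.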
Hi Jann,

On Thu, Apr 10, 2025 at 08:08:41PM +0200, Jann Horn wrote:
> > Hmmm. Maybe it could reject PTRDIFF_MAX within the kernel, which would
> > at least work for cases where user-space ptrdiff_t matches the kernel's
> > ptrdiff_t? Then only users where they don't match would be unprotected,
> > but those are hopefully extra careful.
>
> Perhaps. But then some tricky things are:
>
> 1. How many existing users would we be breaking with such a change?
>    Probably _someone_ out there is deliberately mapping files over 2G
>    into 32-bit processes and it sorta worked until now...
> 2. We don't really have a concept of object size in the kernel, and it
>    might be hard to reason about whether mmap() is used logically to
>    create a new object or extend an existing object. I guess we could
>    limit VMA sizes for 32-bit userspace to 0x7ffff000 and enforce a
>    1-page gap around mappings that are at least half that size, or
>    something like that, but that would probably get a bit ugly on the
>    kernel side...
>
> The first point is really the main concern for me - we might end up
> breaking existing users.

Hmmm, okay. If it ends up being too complex, it also would be bad.
It's easier for careful programmers to just check the size before the
call. So it's fine to not do the check in the kernel.

> > > or whether userspace even wants C semantics.
> >
> > I guess any language will have to link to C at some point, or have
> > inherent limitations similar to those of C.
>
> This limitation is really a result of deciding to make pointer
> subtraction return a signed value, so that you can subtract a bigger
> pointer from a smaller pointer. I don't know whether other languages
> do that.
>
> > > But we can at least document it...
> >
> > Yep. Most people are unaware of this, and believe they can get
> > SIZE_MAX.
> >
> > >
> > > @man-pages maintainer: Please wait a few days before applying this;
> > > I imagine there might be some discussion about this.
> >
> > Okay; see some minor comments below.
>
> Thanks. (I'll probably be out for the next two weeks or so, I'll
> probably get back to this afterwards.)

Okay, no problem
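
The suggestion that careful programmers check the size before the call could look roughly like the sketch below. `length_fits_ptrdiff` and `mmap_checked` are hypothetical names, not an existing libc or kernel interface, and failing with ENOMEM is an assumption chosen for illustration (the thread does not specify an error code).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper: true if a mapping of this length stays within PTRDIFF_MAX. */
static int length_fits_ptrdiff(size_t length)
{
    return length <= (size_t)PTRDIFF_MAX;
}

/*
 * Hypothetical wrapper around mmap(): refuses lengths above PTRDIFF_MAX in
 * userspace, then forwards all arguments unchanged. ENOMEM is an assumed
 * choice of error code for the refusal.
 */
static void *mmap_checked(void *addr, size_t length, int prot, int flags,
                          int fd, off_t offset)
{
    if (!length_fits_ptrdiff(length)) {
        errno = ENOMEM;
        return MAP_FAILED;
    }
    return mmap(addr, length, prot, flags, fd, offset);
}
```

A per-call check like this does not address the second concern raised in the thread: adjacent mappings that a program treats as one logical object can still add up to more than PTRDIFF_MAX.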
diff --git a/man/man2/mmap.2 b/man/man2/mmap.2
index caf822103..9cb7dacf3 100644
--- a/man/man2/mmap.2
+++ b/man/man2/mmap.2
@@ -785,6 +785,23 @@ correspond to added or removed regions of the file is unspecified.
 An application can determine which pages of a mapping are
 currently resident in the buffer/page cache using
 .BR mincore (2).
+.P
+Unlike typical
+.BR malloc (3)
+implementations,
+.BR mmap ()
+does not prevent creating objects larger than
+.B PTRDIFF_MAX.
+Objects that are larger than
+.B PTRDIFF_MAX
+only work in limited ways in standard C (in particular, pointer subtraction
+results in undefined behavior if the result would be bigger than
+.B PTRDIFF_MAX).
+On top of that, GCC also assumes that no object is bigger than
+.B PTRDIFF_MAX.
+.B PTRDIFF_MAX
+is usually half of the address space size; so for 32-bit processes, it is
+usually 0x7fffffff (almost 2 GiB).
 .\"
 .SS Using MAP_FIXED safely
 The only safe use for
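
As a small illustration of the last sentence added by the patch, the sketch below checks the "half of the address space" relationship on the build platform and the 32-bit figure it quotes. The function names are hypothetical, and the first check is expected to hold on common platforms only; the C standard does not require it.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: does PTRDIFF_MAX equal SIZE_MAX / 2 here?
 * This is the usual case (ptrdiff_t and size_t of the same width),
 * not something the C standard guarantees.
 */
static int ptrdiff_max_is_half_of_size_max(void)
{
    return (uintmax_t)PTRDIFF_MAX == (uintmax_t)SIZE_MAX / 2;
}

/*
 * The 32-bit figure from the patch: with a 32-bit signed ptrdiff_t the
 * maximum is 0x7fffffff, which is one byte short of 2 GiB.
 */
static int int32_max_is_one_below_2gib(void)
{
    return INT32_MAX == 0x7fffffff &&
           (uint64_t)INT32_MAX + 1 == (uint64_t)2 * 1024 * 1024 * 1024;
}
```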