diff mbox series

setup: avoid unconditional open with write flags

Message ID 20221205190019.52829-1-cgzones@googlemail.com (mailing list archive)
State New, archived
Headers show
Series setup: avoid unconditional open with write flags | expand

Commit Message

Christian Göttsche Dec. 5, 2022, 7 p.m. UTC
Commit 57f5d52a942 ("common-main: call sanitize_stdfds()") added the
sanitization for standard file descriptors (stdin, stdout, stderr) to
all binaries.  The lead to all binaries unconditionally opening
/dev/null with the flag O_RDWR (read and write).  Most of the time the
standard file descriptors should be set up properly and the sanitization
ends up doing nothing.

There are many non modifying git operations, like `git status` or `git
stash list`, which might be called by a parent to gather information
about the repository.  That parent might run under a seccomp filter to
avoid accidental modification or unwanted command execution on memory
corruptions.  As part of that seccomp filter open(2) and openat(2) might
be only allowed in read-only mode (O_RDONLY), thus preventing git's
sanitation and stopping the application.

Check before opening /dev/null to populate a possible non-present
standard file descriptor if actually any is missing.

Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
---
Alternatively one could add a command line argument
(`--no-stdfd-sanitization`).
---
 setup.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

brian m. carlson Dec. 5, 2022, 10:13 p.m. UTC | #1
On 2022-12-05 at 19:00:19, Christian Göttsche wrote:
> Commit 57f5d52a942 ("common-main: call sanitize_stdfds()") added the
> sanitization for standard file descriptors (stdin, stdout, stderr) to
> all binaries.  The lead to all binaries unconditionally opening
> /dev/null with the flag O_RDWR (read and write).  Most of the time the
> standard file descriptors should be set up properly and the sanitization
> ends up doing nothing.
> 
> There are many non modifying git operations, like `git status` or `git
> stash list`, which might be called by a parent to gather information
> about the repository.  That parent might run under a seccomp filter to
> avoid accidental modification or unwanted command execution on memory
> corruptions.  As part of that seccomp filter open(2) and openat(2) might
> be only allowed in read-only mode (O_RDONLY), thus preventing git's
> sanitation and stopping the application.
> 
> Check before opening /dev/null to populate a possible non-present
> standard file descriptor if actually any is missing.

I don't think this patch makes anything worse, and so I think it should
be fine as it is.

_However_, I will say that `git status` is not a read-only command
because it can write the index, and we aren't, in general, going to be
able to promise that any portion of Git will work with only O_RDONLY
file descriptors.  I suspect such a sandbox is going to result in a
highly broken Git in general, and so it wouldn't be a good idea.
Taylor Blau Dec. 5, 2022, 10:59 p.m. UTC | #2
On Mon, Dec 05, 2022 at 08:00:19PM +0100, Christian Göttsche wrote:
> @@ -1669,7 +1669,14 @@ const char *resolve_gitdir_gently(const char *suspect, int *return_error_code)
>  /* if any standard file descriptor is missing open it to /dev/null */
>  void sanitize_stdfds(void)
>  {
> -	int fd = xopen("/dev/null", O_RDWR);

So before we would do one syscall here unconditionally. Then in the
common case, we'll do another syscall to close the descriptor we just
opened. IOW, ordinarily we expect this function to execute two syscalls.

> +	int fd;
> +
> +	if (fcntl(0, F_GETFD) != -1 &&
> +	    fcntl(1, F_GETFD) != -1 &&
> +	    fcntl(2, F_GETFD) != -1)
> +		return;

But under the same circumstances (i.e., where all three descriptors are
already valid), we now have to make three syscalls to determine the same
fact.

...So I'm not sure that I agree with brian's "this isn't making anything
worse" statement earlier in the thread.

In practice, however, it appears to be basically undetectable. Here,
'git.old' is the pre-image of this patch, and 'git.new' is the
post-image. I figure that running 'git -h' is about the fastest thing I
could do:

    $ hyperfine -N -L V old,new './git.{V} -h'
    Benchmark 1: ./git.old -h
      Time (mean ± σ):       1.6 ms ±   0.2 ms    [User: 1.1 ms, System: 0.4 ms]
      Range (min … max):     0.8 ms …   2.2 ms    1589 runs

    Benchmark 2: ./git.new -h
      Time (mean ± σ):       1.6 ms ±   0.2 ms    [User: 1.1 ms, System: 0.4 ms]
      Range (min … max):     0.9 ms …   2.2 ms    1746 runs

    Summary
      './git.old -h' ran
        1.00 ± 0.14 times faster than './git.new -h'

So it appears that the old version is ever-so-slightly faster than the
new one. But it's so noisy, and the regression is so small that it's
hard to notice it at all.

So I wouldn't strongly oppose the patch based on those numbers, but in
principle it seems flawed.

Thanks,
Taylor
Junio C Hamano Dec. 6, 2022, 12:10 a.m. UTC | #3
Taylor Blau <me@ttaylorr.com> writes:

> So it appears that the old version is ever-so-slightly faster than the
> new one. But it's so noisy, and the regression is so small that it's
> hard to notice it at all.
>
> So I wouldn't strongly oppose the patch based on those numbers, but in
> principle it seems flawed.

Thanks for writing and reviewing.

As long as we were touching the function, I suspect that
the logic should become more like

    if (fd #0 is not open)
	open /dev/null read-only and give it to fd #0
    if (fd #1 is not open)
	open /dev/null write-only and give it to fd #1
    if (fd #2 is not open)
	open /dev/null write-only and give it to fd #2

with opening of /dev/null optimized not to happen when not needed.
Taylor Blau Dec. 6, 2022, 12:31 a.m. UTC | #4
On Tue, Dec 06, 2022 at 09:10:51AM +0900, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > So it appears that the old version is ever-so-slightly faster than the
> > new one. But it's so noisy, and the regression is so small that it's
> > hard to notice it at all.
> >
> > So I wouldn't strongly oppose the patch based on those numbers, but in
> > principle it seems flawed.
>
> Thanks for writing and reviewing.
>
> As long as we were touching the function, I suspect that
> the logic should become more like
>
>     if (fd #0 is not open)
> 	open /dev/null read-only and give it to fd #0
>     if (fd #1 is not open)
> 	open /dev/null write-only and give it to fd #1
>     if (fd #2 is not open)
> 	open /dev/null write-only and give it to fd #2
>
> with opening of /dev/null optimized not to happen when not needed.

Yeah, that would work, and it has the added benefit of not opening fd #0
with O_RDWR (though I kind of doubt that such a thing matters in
practice).

But it's still no better than the patch here in the happy case, since we
still have to perform three fcntl() checks to figure out that all three
descriptors are initialized as-expected (versus just one open() and
close()).

So I think your version is a slight improvement on Christian's, but I
would just as soon stick with what we have.

Thanks,
Taylor
Junio C Hamano Dec. 6, 2022, 12:40 a.m. UTC | #5
Taylor Blau <me@ttaylorr.com> writes:

> But it's still no better than the patch here in the happy case, since we
> still have to perform three fcntl() checks to figure out that all three
> descriptors are initialized as-expected (versus just one open() and
> close()).
>
> So I think your version is a slight improvement on Christian's, but I
> would just as soon stick with what we have.

I am OK as long as there is a workaround available to Christian's
use case without changing "git" at all.  If Christian can tighten
the environment into somewhat unnatural "opening writable FD is a
failure" way, I suspect such a jail can be augmented to further to
allow opening /dev/null and other "selected" files writable, so I
wouldn't worry too much if we dropped this patch entirely.

Thanks.
Jeff King Dec. 6, 2022, 1:38 a.m. UTC | #6
On Mon, Dec 05, 2022 at 10:13:44PM +0000, brian m. carlson wrote:

> _However_, I will say that `git status` is not a read-only command
> because it can write the index, and we aren't, in general, going to be
> able to promise that any portion of Git will work with only O_RDONLY
> file descriptors.  I suspect such a sandbox is going to result in a
> highly broken Git in general, and so it wouldn't be a good idea.

I wonder if "git status" might work OK in a sandbox, because it should
quietly skip the on-disk index refresh if something fails. That is, it's
supposed to work in a read-only repository. As long as the sandbox just
returns an error when opening the lockfile (and not, say, killing the
process), it would look the same to Git.

-Peff
Christian Göttsche Dec. 6, 2022, 4:15 p.m. UTC | #7
> But it's still no better than the patch here in the happy case, since we
> still have to perform three fcntl() checks to figure out that all three
> descriptors are initialized as-expected (versus just one open() and
> close()).

An alternative to performing three syscalls fot the check one could call
open(2) with O_RDONLY (O_PATH would also work, but seems not
yet to be used in the git source) on a common path ("/", "/dev/null", ...)
and skip the sanitization if the returned descriptor is greater than 2.
This would lead to two (open + close) syscalls in the common case,
same as current.

> If Christian can tighten
> the environment into somewhat unnatural "opening writable FD is a
> failure" way, I suspect such a jail can be augmented to further to
> allow opening /dev/null and other "selected" files writable, so I
> wouldn't worry too much if we dropped this patch entirely.

The seccomp filter only gets the address of the memory where the path
is stored, so simple allow-listing paths is not possible.  And even on
inspection of the path one would need to avoid toctou attacks (the
filter seeing a different memory content at check time than the kernel
at use time).
René Scharfe Dec. 6, 2022, 7:47 p.m. UTC | #8
Am 05.12.22 um 20:00 schrieb Christian Göttsche:
> Commit 57f5d52a942 ("common-main: call sanitize_stdfds()") added the
> sanitization for standard file descriptors (stdin, stdout, stderr) to
> all binaries.  The lead to all binaries unconditionally opening
> /dev/null with the flag O_RDWR (read and write).  Most of the time the
> standard file descriptors should be set up properly and the sanitization
> ends up doing nothing.
>
> There are many non modifying git operations, like `git status` or `git
> stash list`, which might be called by a parent to gather information
> about the repository.  That parent might run under a seccomp filter to
> avoid accidental modification or unwanted command execution on memory
> corruptions.  As part of that seccomp filter open(2) and openat(2) might
> be only allowed in read-only mode (O_RDONLY), thus preventing git's
> sanitation and stopping the application.
>
> Check before opening /dev/null to populate a possible non-present
> standard file descriptor if actually any is missing.
>
> Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
> ---
> Alternatively one could add a command line argument
> (`--no-stdfd-sanitization`).
> ---
>  setup.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/setup.c b/setup.c
> index cefd5f6..2af7170 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -1669,7 +1669,14 @@ const char *resolve_gitdir_gently(const char *suspect, int *return_error_code)
>  /* if any standard file descriptor is missing open it to /dev/null */
>  void sanitize_stdfds(void)
>  {
> -	int fd = xopen("/dev/null", O_RDWR);
> +	int fd;
> +
> +	if (fcntl(0, F_GETFD) != -1 &&
> +	    fcntl(1, F_GETFD) != -1 &&
> +	    fcntl(2, F_GETFD) != -1)
> +		return;
> +
> +	fd = xopen("/dev/null", O_RDWR);
>  	while (fd < 2)
>  		fd = xdup(fd);
>  	if (fd > 2)

If read-only access is allowed, how about this?

diff --git a/setup.c b/setup.c
index cefd5f63c4..0f52c51759 100644
--- a/setup.c
+++ b/setup.c
@@ -1669,7 +1669,12 @@ const char *resolve_gitdir_gently(const char *suspect, int *return_error_code)
 /* if any standard file descriptor is missing open it to /dev/null */
 void sanitize_stdfds(void)
 {
-	int fd = xopen("/dev/null", O_RDWR);
+	int fd = xopen("/dev/null", O_RDONLY);
+	if (fd > 0)
+		close(fd);
+	if (fd > 2)
+		return;
+	fd = xopen("/dev/null", O_WRONLY);
 	while (fd < 2)
 		fd = xdup(fd);
 	if (fd > 2)

Requires an extra open/close pair if fd 0 is already open, but no extra
syscalls if 0, 1 and 2 are all open.

Can opening /dev/null (or NUL on Windows) multiple times instead of dup'ing
cause issues?  Can we e.g. lock ourselves out?

René
Junio C Hamano Dec. 6, 2022, 11:39 p.m. UTC | #9
René Scharfe <l.s.r@web.de> writes:

> Can opening /dev/null (or NUL on Windows) multiple times instead of dup'ing
> cause issues?  Can we e.g. lock ourselves out?

258e93a1 (daemon: if one of the standard fds is missing open it to
/dev/null, 2006-07-13) unfortunately does not explain why it was
done this way.

Given that "command </dev/null >/dev/null 2>/dev/null" should work fine,
I suspect that the "once we open a throw-away descriptor, instead of
opening the same file again, dup the fd and reuse" was done not for
correctness but for performance, under the assumption that open() is
more expensive?
diff mbox series

Patch

diff --git a/setup.c b/setup.c
index cefd5f6..2af7170 100644
--- a/setup.c
+++ b/setup.c
@@ -1669,7 +1669,14 @@  const char *resolve_gitdir_gently(const char *suspect, int *return_error_code)
 /* if any standard file descriptor is missing open it to /dev/null */
 void sanitize_stdfds(void)
 {
-	int fd = xopen("/dev/null", O_RDWR);
+	int fd;
+
+	if (fcntl(0, F_GETFD) != -1 &&
+	    fcntl(1, F_GETFD) != -1 &&
+	    fcntl(2, F_GETFD) != -1)
+		return;
+
+	fd = xopen("/dev/null", O_RDWR);
 	while (fd < 2)
 		fd = xdup(fd);
 	if (fd > 2)