git-compat-util: avoid failing dir ownership checks if running privileged

Message ID	20220427000522.15637-1-carenas@gmail.com (mailing list archive)
State	Superseded
Headers	show Return-Path: <git-owner@kernel.org> From: =?utf-8?q?Carlo_Marcelo_Arenas_Bel=C3=B3n?= <carenas@gmail.com> To: git@vger.kernel.org Cc: philipoakley@iee.email, me@ttaylorr.com, guy.j@maurel.de, szeder.dev@gmail.com, johannes.Schindelin@gmx.de, gitster@pobox.com, derrickstolee@github.com, =?utf-8?q?Carlo_Marcelo_Arenas_Bel=C3=B3n?= <carenas@gmail.com>, Randall Becker <rsbecker@nexbridge.com>, Johannes Schindelin <Johannes.Schindelin@gmx.de> Subject: [PATCH] git-compat-util: avoid failing dir ownership checks if running privileged Date: Tue, 26 Apr 2022 17:05:22 -0700 Message-Id: <20220427000522.15637-1-carenas@gmail.com> In-Reply-To: <20220426183105.99779-1-carenas@gmail.com> References: <20220426183105.99779-1-carenas@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	git-compat-util: avoid failing dir ownership checks if running privileged \| expand git-compat-util: avoid failing dir ownership checks if running privileged

Carlo Marcelo Arenas Belón April 27, 2022, 12:05 a.m. UTC

bdc77d1d685 (Add a function to determine whether a path is owned by the
current user, 2022-03-02) checks for the effective uid of the running
process using geteuid() but didn't account for cases where that uid was
root (because git was invoked through sudo or a compatible tool) and the
original user that repository trusted for its config was different,
therefore failing the following common use case:

  guy@renard ~/Software/uncrustify $ sudo git describe --always --dirty
  [sudo] password for guy:
  fatal: unsafe repository ('/home/guy/Software/uncrustify' is owned by someone else)

Attempt to detect those cases by using the environment variables that
sudo or compatible tools create to keep track of the original user id,
and do the ownership check using that instead.

This assumes the environment the user is running with after going
privileged can't be tampered with, and also does the check only for
root to keep the most common case less complicated, but as a side effect
will miss cases where sudo (or an equivalent) was used to change to
another unprivileged user or where the equivalent tool used to raise
privileges didn't track the original id in a sudo compatible way.

Reported-by: Guy Maurel <guy.j@maurel.de>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Randall Becker <rsbecker@nexbridge.com>
Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
Changes since RFC
* Addresses all spelling errors, even the ones not reported and even if I
  am sure "priviledged" is a nicer sounding word even if obsoleted.
* Uses strtol instead of atoi as suggested by Randall and Junio, the extra
  error checking was too much to handle inline so a new helper function
  was added.
* Removes checks for DOAS_UID, in an attempt to make the change smaller
  and because that is part of an extention that might not be that common.
  This means `doas` is still broken, but that was punted for now.
* Has been tested a little more, but is still missing a test case, but
  as Derrick pointed out doing so is not trivial, so punted for now.

 git-compat-util.h | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

Phillip Wood April 27, 2022, 9:33 a.m. UTC | #1

Hi Carlo

On 27/04/2022 01:05, Carlo Marcelo Arenas Belón wrote:
> bdc77d1d685 (Add a function to determine whether a path is owned by the
> current user, 2022-03-02) checks for the effective uid of the running
> process using geteuid() but didn't account for cases where that uid was
> root (because git was invoked through sudo or a compatible tool) and the
> original user that repository trusted for its config was different,
> therefore failing the following common use case:
> 
>    guy@renard ~/Software/uncrustify $ sudo git describe --always --dirty
>    [sudo] password for guy:
>    fatal: unsafe repository ('/home/guy/Software/uncrustify' is owned by someone else)
> 
> Attempt to detect those cases by using the environment variables that
> sudo or compatible tools create to keep track of the original user id,
> and do the ownership check using that instead.
> 
> This assumes the environment the user is running with after going
> privileged can't be tampered with, and also does the check only for
> root to keep the most common case less complicated, but as a side effect
> will miss cases where sudo (or an equivalent) was used to change to
> another unprivileged user or where the equivalent tool used to raise
> privileges didn't track the original id in a sudo compatible way.
> 
> Reported-by: Guy Maurel <guy.j@maurel.de>
> Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
> Helped-by: Randall Becker <rsbecker@nexbridge.com>
> Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---
> Changes since RFC
> * Addresses all spelling errors, even the ones not reported and even if I
>    am sure "priviledged" is a nicer sounding word even if obsoleted.
> * Uses strtol instead of atoi as suggested by Randall and Junio, the extra
>    error checking was too much to handle inline so a new helper function
>    was added.
> * Removes checks for DOAS_UID, in an attempt to make the change smaller
>    and because that is part of an extention that might not be that common.
>    This means `doas` is still broken, but that was punted for now.
> * Has been tested a little more, but is still missing a test case, but
>    as Derrick pointed out doing so is not trivial, so punted for now.
> 
>   git-compat-util.h | 38 +++++++++++++++++++++++++++++++++++++-
>   1 file changed, 37 insertions(+), 1 deletion(-)
> 
> diff --git a/git-compat-util.h b/git-compat-util.h
> index 58fd813bd01..9bb0eb5087a 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -437,12 +437,48 @@ static inline int git_offset_1st_component(const char *path)
>   #endif
>   
>   #ifndef is_path_owned_by_current_user
> +
> +#ifdef __TANDEM
> +#define ROOT_UID 65535
> +#else
> +#define ROOT_UID 0
> +#endif
> +
> +/*
> + * this helper function overrides a ROOT_UID with the one provided by
> + * an environment variable, do not use unless the original uid is
> + * root
> + */
> +static inline int extract_id_from_env(const char *env, uid_t *id)

Do we really want this living in git-compat-util.h?

> +{
> +	const char *real_uid = getenv(env);
> +
> +	if (real_uid && *real_uid) {
> +		char *error;
> +		long extracted_id = strtol(real_uid, &error, 10);
> +		if (!*error && LONG_MIN < extracted_id &&
> +				extracted_id < LONG_MAX) {

strtol() returns a long so the last two checks are redundant. The 
standard is silent on what happens to error when the value is out of 
range. The way to check that all the string was consumed without 
underflow/overflow is

	errno = 0;
	val = strtol(str, &endp, 10);
	if (errno || !*endp)
		error(...)

In this case I think negative values make no sense as I believe uids are 
always positive integers so we should add a check for "extracted_id < 0"

> +			*id = (uid_t)extracted_id;

There is a potential integer truncation here which could lead to false 
positives. I'm not sure that there is a way to query the maximum valid 
uid but we could do

	#if sizeof(uid_t) == sizeof(int)
		if (extracted_id > INT_MAX)
			error(...)
	#endif

Unfortunately I think we need to use #if rather than if to prevent 
compilers complaining that the condition is always true or always false.

Checking the real user id from an environment variable makes me feel a 
bit queasy but as others have pointed out if an attacker can mess with 
the users environment then they can already override $PATH.

Best Wishes

Phillip

> +			return 1;
> +		}
> +	}
> +	return 0;
> +}
> +
>   static inline int is_path_owned_by_current_uid(const char *path)
>   {
>   	struct stat st;
> +	uid_t euid;
> +
>   	if (lstat(path, &st))
>   		return 0;
> -	return st.st_uid == geteuid();
> +
> +	euid = geteuid();
> +	if (euid == ROOT_UID) {
> +		/* we might have raised our privileges with sudo */
> +		extract_id_from_env("SUDO_UID", &euid);
> +	}
> +	return st.st_uid == euid;
>   }
>   
>   #define is_path_owned_by_current_user is_path_owned_by_current_uid

Phillip Wood April 27, 2022, 12:30 p.m. UTC | #2

On 27/04/2022 10:33, Phillip Wood wrote:
> Hi Carlo
> 
> On 27/04/2022 01:05, Carlo Marcelo Arenas Belón wrote:

> [...]
>> +{
>> +    const char *real_uid = getenv(env);
>> +
>> +    if (real_uid && *real_uid) {
>> +        char *error;
>> +        long extracted_id = strtol(real_uid, &error, 10);
>> +        if (!*error && LONG_MIN < extracted_id &&
>> +                extracted_id < LONG_MAX) {
> 
> strtol() returns a long so the last two checks are redundant. The 
> standard is silent on what happens to error when the value is out of 
> range. The way to check that all the string was consumed without 
> underflow/overflow is
> 
>      errno = 0;
>      val = strtol(str, &endp, 10);
>      if (errno || !*endp)

Sorry that should be "if (errno || *endp)" to check for an error

 > [...]
 >     #if sizeof(uid_t) == sizeof(int)
 >         if (extracted_id > INT_MAX)
 >             error(...)
 >     #endif

I think we should probably check if uid_t is a short integer as well as

 > +#ifdef __TANDEM
 > +#define ROOT_UID 65535

suggests it may be an unsigned short on NonStop.

Not knowing the size of uid_t or if it is signed or not (as far as I can 
tell posix just says it's an integer) makes the limit checks tricky - 
maybe we make euid a long (or unsigned long) here and it the function 
below rather than casting it to uid_t and possibly truncating it.

 > [...]
>> +
>>   static inline int is_path_owned_by_current_uid(const char *path)
>>   {
>>       struct stat st;
>> +    uid_t euid;
>> +
>>       if (lstat(path, &st))
>>           return 0;
>> -    return st.st_uid == geteuid();
>> +
>> +    euid = geteuid();
>> +    if (euid == ROOT_UID) {
>> +        /* we might have raised our privileges with sudo */
>> +        extract_id_from_env("SUDO_UID", &euid);

You are ignoring any errors when parsing the environment variable - that 
is not a good idea in a security check.

Best Wishes

Phillip

>> +    }
>> +    return st.st_uid == euid;
>>   }
>>   #define is_path_owned_by_current_user is_path_owned_by_current_uid
>

Randall S. Becker April 27, 2022, 2:15 p.m. UTC | #3

On April 27, 2022 8:31 AM, Phillip Wood wrote:
>On 27/04/2022 10:33, Phillip Wood wrote:
>> Hi Carlo
>>
>> On 27/04/2022 01:05, Carlo Marcelo Arenas Belón wrote:
>
>> [...]
>>> +{
>>> +    const char *real_uid = getenv(env);
>>> +
>>> +    if (real_uid && *real_uid) {
>>> +        char *error;
>>> +        long extracted_id = strtol(real_uid, &error, 10);
>>> +        if (!*error && LONG_MIN < extracted_id &&
>>> +                extracted_id < LONG_MAX) {
>>
>> strtol() returns a long so the last two checks are redundant. The
>> standard is silent on what happens to error when the value is out of
>> range. The way to check that all the string was consumed without
>> underflow/overflow is
>>
>>      errno = 0;
>>      val = strtol(str, &endp, 10);
>>      if (errno || !*endp)
>
>Sorry that should be "if (errno || *endp)" to check for an error
>
> > [...]
> >     #if sizeof(uid_t) == sizeof(int)
> >         if (extracted_id > INT_MAX)
> >             error(...)
> >     #endif
>
>I think we should probably check if uid_t is a short integer as well as
>
> > +#ifdef __TANDEM
> > +#define ROOT_UID 65535
>
>suggests it may be an unsigned short on NonStop.
>
>Not knowing the size of uid_t or if it is signed or not (as far as I can tell posix just
>says it's an integer) makes the limit checks tricky - maybe we make euid a long (or
>unsigned long) here and it the function below rather than casting it to uid_t and
>possibly truncating it.

It depends on the compile. If LP64 is used - long file pointers, then it is a long (32-bit) or an int (also 32-bit). Not unsigned in either case, but specifying unsigned would be helpful. Nice catch.

> > [...]
>>> +
>>>   static inline int is_path_owned_by_current_uid(const char *path)
>>>   {
>>>       struct stat st;
>>> +    uid_t euid;
>>> +
>>>       if (lstat(path, &st))
>>>           return 0;
>>> -    return st.st_uid == geteuid();
>>> +
>>> +    euid = geteuid();
>>> +    if (euid == ROOT_UID) {
>>> +        /* we might have raised our privileges with sudo */
>>> +        extract_id_from_env("SUDO_UID", &euid);
>
>You are ignoring any errors when parsing the environment variable - that is not a
>good idea in a security check.

Carlo Marcelo Arenas Belón April 27, 2022, 3:38 p.m. UTC | #4

On Wed, Apr 27, 2022 at 2:33 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> On 27/04/2022 01:05, Carlo Marcelo Arenas Belón wrote:
> > diff --git a/git-compat-util.h b/git-compat-util.h
> > index 58fd813bd01..9bb0eb5087a 100644
> > --- a/git-compat-util.h
> > +++ b/git-compat-util.h
> > @@ -437,12 +437,48 @@ static inline int git_offset_1st_component(const char *path)
> >   #endif
> >
> >   #ifndef is_path_owned_by_current_user
> > +
> > +#ifdef __TANDEM
> > +#define ROOT_UID 65535
> > +#else
> > +#define ROOT_UID 0
> > +#endif
> > +
> > +/*
> > + * this helper function overrides a ROOT_UID with the one provided by
> > + * an environment variable, do not use unless the original uid is
> > + * root
> > + */
> > +static inline int extract_id_from_env(const char *env, uid_t *id)
>
> Do we really want this living in git-compat-util.h?

No; but IMHO the same applies to is_path_owned_by_current_uid(), and
as I mentioned in my original proposal refactoring this code to do so
has been punted until later since the objective here was to keep the
change as small as possible for clean backporting.

My intention with that comment was not only to warn people that might
want to reuse that helper but to indicate it was just a hack that
should be refactored ASAP.

FWIW, I still think that using atoi with a check to skip "" is
probably as safe as doing all this extra checking as no one has shown
yet a system where sizeof(uid_t) > sizeof(uint32_t), but agree with
Junio that using long instead avoids issues with the systems where
sizeof(uid_t) > sizeof(int) and unless sizeof(int) == sizeof(long)
(ex: 32-bit Linux) which is then covered by the cast.

> > +{
> > +     const char *real_uid = getenv(env);
> > +
> > +     if (real_uid && *real_uid) {
> > +             char *error;
> > +             long extracted_id = strtol(real_uid, &error, 10);
> > +             if (!*error && LONG_MIN < extracted_id &&
> > +                             extracted_id < LONG_MAX) {
>
> strtol() returns a long so the last two checks are redundant.

The last two checks were to check for underflow or overflow and to
make sure that the bogus values this function returns in case of those
errors are not taken into consideration.

>The standard is silent on what happens to error when the value is out of
> range.

Which is why instead I was avoiding LONG_{MIN,MAX} which are
described[1] as the bogus values that will be used in that case
Agree with you that we could also add a check for >=0 as uid_t is
usually unsigned, but it is not warranted to be so, and that is I was
aiming for the wider possible range so we don't have to worry that
much and let it be settled to a valid value through the cast.

Carlo

[1] https://pubs.opengroup.org/onlinepubs/007904875/functions/strtol.html

Randall S. Becker April 27, 2022, 3:50 p.m. UTC | #5

On April 27, 2022 11:39 AM, Carlo Arenas wrote:
>On Wed, Apr 27, 2022 at 2:33 AM Phillip Wood <phillip.wood123@gmail.com>
>wrote:
>> On 27/04/2022 01:05, Carlo Marcelo Arenas Belón wrote:
>> > diff --git a/git-compat-util.h b/git-compat-util.h index
>> > 58fd813bd01..9bb0eb5087a 100644
>> > --- a/git-compat-util.h
>> > +++ b/git-compat-util.h
>> > @@ -437,12 +437,48 @@ static inline int git_offset_1st_component(const char
>*path)
>> >   #endif
>> >
>> >   #ifndef is_path_owned_by_current_user
>> > +
>> > +#ifdef __TANDEM
>> > +#define ROOT_UID 65535
>> > +#else
>> > +#define ROOT_UID 0
>> > +#endif
>> > +
>> > +/*
>> > + * this helper function overrides a ROOT_UID with the one provided
>> > +by
>> > + * an environment variable, do not use unless the original uid is
>> > + * root
>> > + */
>> > +static inline int extract_id_from_env(const char *env, uid_t *id)
>>
>> Do we really want this living in git-compat-util.h?
>
>No; but IMHO the same applies to is_path_owned_by_current_uid(), and as I
>mentioned in my original proposal refactoring this code to do so has been punted
>until later since the objective here was to keep the change as small as possible for
>clean backporting.
>
>My intention with that comment was not only to warn people that might want to
>reuse that helper but to indicate it was just a hack that should be refactored ASAP.
>
>FWIW, I still think that using atoi with a check to skip "" is probably as safe as doing
>all this extra checking as no one has shown yet a system where sizeof(uid_t) >
>sizeof(uint32_t), but agree with Junio that using long instead avoids issues with
>the systems where
>sizeof(uid_t) > sizeof(int) and unless sizeof(int) == sizeof(long)
>(ex: 32-bit Linux) which is then covered by the cast.
>
>> > +{
>> > +     const char *real_uid = getenv(env);
>> > +
>> > +     if (real_uid && *real_uid) {
>> > +             char *error;
>> > +             long extracted_id = strtol(real_uid, &error, 10);
>> > +             if (!*error && LONG_MIN < extracted_id &&
>> > +                             extracted_id < LONG_MAX) {
>>
>> strtol() returns a long so the last two checks are redundant.
>
>The last two checks were to check for underflow or overflow and to make sure
>that the bogus values this function returns in case of those errors are not taken
>into consideration.
>
>>The standard is silent on what happens to error when the value is out
>>of  range.
>
>Which is why instead I was avoiding LONG_{MIN,MAX} which are described[1] as
>the bogus values that will be used in that case Agree with you that we could also
>add a check for >=0 as uid_t is usually unsigned, but it is not warranted to be so,
>and that is I was aiming for the wider possible range so we don't have to worry
>that much and let it be settled to a valid value through the cast.

Somehow, the root user id is a compat issue. Perhaps we need a uid_t getRootUid() in git-compat-util.h, and hide the details of the proc from the interface.

Carlo Marcelo Arenas Belón April 27, 2022, 3:58 p.m. UTC | #6

On Wed, Apr 27, 2022 at 5:30 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> You are ignoring any errors when parsing the environment variable - that
> is not a good idea in a security check.

which errors are you concerned about?, if anything in this code
worries me from a security point of view is the fact that we are
relying in getenv not being racy (as mentioned in the original RFC),
but there are no errors set there AFAIK.

not ignoring errno in strtol is an option, but as mentioned before I
decided instead to reject bogus values and therefore not the clobber a
previous errno, since I was using strtol as a wider version of atoi.

Carlo

Phillip Wood April 27, 2022, 4:14 p.m. UTC | #7

On 27/04/2022 16:58, Carlo Arenas wrote:
> On Wed, Apr 27, 2022 at 5:30 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>> You are ignoring any errors when parsing the environment variable - that
>> is not a good idea in a security check.
> 
> which errors are you concerned about?, if anything in this code

I was confused by the fact that the helper function returns a 
success/failure value which we ignore. However euid is not overwritten 
if strtol fails so it is safe I think.

> worries me from a security point of view is the fact that we are
> relying in getenv not being racy (as mentioned in the original RFC),
> but there are no errors set there AFAIK.
> 
> not ignoring errno in strtol is an option, but as mentioned before I
> decided instead to reject bogus values and therefore not the clobber a
> previous errno,

strtol() will set errno if there is a range error ignoring it does not 
change that. In any case is_path_owned_by_current_uid() already clobbers 
errno if stat() fails.

Best Wishes

Phillip

> since I was using strtol as a wider version of atoi.

> Carlo

Junio C Hamano April 27, 2022, 4:19 p.m. UTC | #8

Carlo Arenas <carenas@gmail.com> writes:

>> > +     if (real_uid && *real_uid) {
>> > +             char *error;
>> > +             long extracted_id = strtol(real_uid, &error, 10);
>> > +             if (!*error && LONG_MIN < extracted_id &&
>> > +                             extracted_id < LONG_MAX) {
>>
>> strtol() returns a long so the last two checks are redundant.
>
> The last two checks were to check for underflow or overflow and to
> make sure that the bogus values this function returns in case of those
> errors are not taken into consideration.
>
>>The standard is silent on what happens to error when the value is out of
>> range.

Actually the standard is is very clear what happens to endptr (no,
don't call it "error", that is not the point of the parameter).

    A pointer to the final string shall be stored in the object
    pointed to by endptr, provided that endptr is not a null
    pointer.

where "final string" has a precise definition much earlier:

    First, they decompose the input string into three parts:

    1. An initial, possibly empty, sequence of white-space
       characters (as specified by isspace())

    2. A subject sequence interpreted as an integer represented in
       some radix determined by the value of base

    3. A final string of one or more unrecognized characters,
       including the terminating null byte of the input string.

    Then they shall attempt to convert the subject sequence to an
    integer, and return the result.

So, leading whitespace is stripped, then "subject sequence" that is
the sequence of digits (with optional +/- sign) to be turned into a
long is recognised, and what remains is the "final string".  endptr
is made to point at that "final string", and it does not matter what
kind of value the interpretation of "subject sequence" yields.

And that is why it is wrong to call it "error".  It is not about
errors, but is about the shape of the input string, i.e. where the
run of digits ends.

> Which is why instead I was avoiding LONG_{MIN,MAX} which are
> described[1] as the bogus values that will be used in that case

strtol() is designed to return LONG_MIN and LONG_MAX when values
overflow or underflow, but that does not mean it returns these two
values only when the input overflows or underflows.  If it were
strtouint8(), giving it "255" will give 255, while you would get
UCHAR_MAX when the input makes it overflow, e.g. "1000", and from
the returned value you cannot tell which is which because UCHAR_MAX
is 255 ;-)

IOW, you are reading the standard wrong.  It does not say that
LONG_MIN and LONG_MAX are bogus values at all.

And theree is this note:

    Since 0, {LONG_MIN} or {LLONG_MIN}, and {LONG_MAX} or {LLONG_MAX}
    are returned on error and are also valid returns on success, an
    application wishing to check for error situations should set errno
    to 0, then call strtol() or strtoll(), then check errno.

So, no, I do not think the range check gives us anything
meaningful.

At this point, it is tempting to say that it is not worth trying to
do it right by avoiding atoi(), as we are bound to make more of this
kind of mistakes X-<.  But at the same time, I hope we can learn
together and get this right ;-).

Phillip Wood April 27, 2022, 4:31 p.m. UTC | #9

On 27/04/2022 16:38, Carlo Arenas wrote:
> On Wed, Apr 27, 2022 at 2:33 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>> On 27/04/2022 01:05, Carlo Marcelo Arenas Belón wrote:
> [...]
> FWIW, I still think that using atoi with a check to skip "" is
> probably as safe as doing all this extra checking as no one has shown
> yet a system where sizeof(uid_t) > sizeof(uint32_t), but agree with
> Junio that using long instead avoids issues with the systems where
> sizeof(uid_t) > sizeof(int) and unless sizeof(int) == sizeof(long)
> (ex: 32-bit Linux) which is then covered by the cast.

if sizeof(uid_t) < sizeof(long) then the cast will truncate the value 
returned by strtol() which means we are trusting that SUDO_UID is a 
valid uid otherwise it will be truncated.

>>> +{
>>> +     const char *real_uid = getenv(env);
>>> +
>>> +     if (real_uid && *real_uid) {
>>> +             char *error;
>>> +             long extracted_id = strtol(real_uid, &error, 10);
>>> +             if (!*error && LONG_MIN < extracted_id &&
>>> +                             extracted_id < LONG_MAX) {
>>
>> strtol() returns a long so the last two checks are redundant.
> 
> The last two checks were to check for underflow or overflow and to
> make sure that the bogus values this function returns in case of those
> errors are not taken into consideration.

Sorry I misread the code - this is checking if the value is valid, not 
if there is an error. On platforms where uid_t and long are the same 
size (e.g. where int and long are both 32 bits) the "< LONG_MAX" check 
is rejecting a valid uid. The fact that we are doing these checks makes 
me think we should care about the possible truncation above.

Best Wishes

Phillip

>> The standard is silent on what happens to error when the value is out of
>> range.
> 
> Which is why instead I was avoiding LONG_{MIN,MAX} which are
> described[1] as the bogus values that will be used in that case
> Agree with you that we could also add a check for >=0 as uid_t is
> usually unsigned, but it is not warranted to be so, and that is I was
> aiming for the wider possible range so we don't have to worry that
> much and let it be settled to a valid value through the cast.
> 
> Carlo
> 
> [1] https://pubs.opengroup.org/onlinepubs/007904875/functions/strtol.html

Carlo Marcelo Arenas Belón April 27, 2022, 4:45 p.m. UTC | #10

On Wed, Apr 27, 2022 at 9:19 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> And that is why it is wrong to call it "error".  It is not about
> errors, but is about the shape of the input string, i.e. where the
> run of digits ends.

I called it error, because I only cared about it to signal that the
string it converted was an erroneous one (ex: "1a" or "a")

> And there is this note:
>
>     Since 0, {LONG_MIN} or {LLONG_MIN}, and {LONG_MAX} or {LLONG_MAX}
>     are returned on error and are also valid returns on success, an
>     application wishing to check for error situations should set errno
>     to 0, then call strtol() or strtoll(), then check errno.
>
> So, no, I do not think the range check gives us anything
> meaningful.

the meaningful part of it was meant to be:

this function will try to get the value of an UID from SUDO_UID, if it
gets anything insane like LONG_MIN or LONG_MAX or anything above or
below that, it would consider you are just trying to play a bad joke
with it and ignore you.

> At this point, it is tempting to say that it is not worth trying to
> do it right by avoiding atoi(), as we are bound to make more of this
> kind of mistakes X-<.  But at the same time, I hope we can learn
> together and get this right ;-).

except it wasn't a mistake, but maybe poorly documented since I was
already going over my quota of added lines for what should have been a
3 line patch.

apologies for making the code so cryptic and thanks for spending so
much time reviewing it and commenting on it, has been definitely very
instructive so far.

any specific suggestions you would like to have in a reroll?

Carlo

Carlo Marcelo Arenas Belón April 27, 2022, 4:54 p.m. UTC | #11

On Wed, Apr 27, 2022 at 9:31 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> On 27/04/2022 16:38, Carlo Arenas wrote:
> > FWIW, I still think that using atoi with a check to skip "" is
> > probably as safe as doing all this extra checking as no one has shown
> > yet a system where sizeof(uid_t) > sizeof(uint32_t), but agree with
> > Junio that using long instead avoids issues with the systems where
> > sizeof(uid_t) > sizeof(int) and unless sizeof(int) == sizeof(long)
> > (ex: 32-bit Linux) which is then covered by the cast.
>
> if sizeof(uid_t) < sizeof(long) then the cast will truncate the value
> returned by strtol() which means we are trusting that SUDO_UID is a
> valid uid otherwise it will be truncated.

correct, this whole procedure relies on the fact that SUDO_UID is not
a bogus value (ex: it was produced by a non buggy sudo and hasn't been
tampered with)

in systems where sizeof(uid_t) < sizeof(long), it is expected that the
id we got should be able to fit in an uid_t so no truncation will ever
happen.

the only thing that worries me is sign extension but that is why I put
a specific cast.  for all practical reasons I expect uid_t to be
uint32_t and therefore using long should be better than using int
(through atoi)

Carlo

Phillip Wood April 27, 2022, 5:22 p.m. UTC | #12

On 27/04/2022 17:19, Junio C Hamano wrote:
> Carlo Arenas <carenas@gmail.com> writes:
>>> The standard is silent on what happens to error when the value is out of
>>> range.
> 
> Actually the standard is is very clear what happens to endptr (no,
> don't call it "error", that is not the point of the parameter).
 >
>      A pointer to the final string shall be stored in the object
>      pointed to by endptr, provided that endptr is not a null
>      pointer.
> 
> where "final string" has a precise definition much earlier:
> 
>      First, they decompose the input string into three parts:
> 
>      1. An initial, possibly empty, sequence of white-space
>         characters (as specified by isspace())
> 
>      2. A subject sequence interpreted as an integer represented in
>         some radix determined by the value of base
> 
>      3. A final string of one or more unrecognized characters,
>         including the terminating null byte of the input string.
> 
>      Then they shall attempt to convert the subject sequence to an
>      integer, and return the result.
> 
> So, leading whitespace is stripped, then "subject sequence" that is
> the sequence of digits (with optional +/- sign) to be turned into a
> long is recognised, and what remains is the "final string".  endptr
> is made to point at that "final string", and it does not matter what
> kind of value the interpretation of "subject sequence" yields.

Oh I think I misunderstood when I read the standard this morning. Just 
to check I do understand now, in the case of overflow entptr points to 
the final string (i.e. to the character following the last digit) even 
though the "subject sequence" cannot be successfully converted?

Thanks

Phillip

Phillip Wood April 27, 2022, 5:28 p.m. UTC | #13

On 27/04/2022 17:54, Carlo Arenas wrote:
> On Wed, Apr 27, 2022 at 9:31 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>> On 27/04/2022 16:38, Carlo Arenas wrote:
>>> FWIW, I still think that using atoi with a check to skip "" is
>>> probably as safe as doing all this extra checking as no one has shown
>>> yet a system where sizeof(uid_t) > sizeof(uint32_t), but agree with
>>> Junio that using long instead avoids issues with the systems where
>>> sizeof(uid_t) > sizeof(int) and unless sizeof(int) == sizeof(long)
>>> (ex: 32-bit Linux) which is then covered by the cast.
>>
>> if sizeof(uid_t) < sizeof(long) then the cast will truncate the value
>> returned by strtol() which means we are trusting that SUDO_UID is a
>> valid uid otherwise it will be truncated.
> 
> correct, this whole procedure relies on the fact that SUDO_UID is not
> a bogus value (ex: it was produced by a non buggy sudo and hasn't been
> tampered with)
> 
> in systems where sizeof(uid_t) < sizeof(long), it is expected that the
> id we got should be able to fit in an uid_t so no truncation will ever
> happen.
> 
> the only thing that worries me is sign extension but that is why I put
> a specific cast.  for all practical reasons I expect uid_t to be
> uint32_t and therefore using long should be better than using int
> (through atoi)


If we think uid_t is a uint32_t then should we be using strtoul() to 
make sure we cover the whole uid range where sizeof(long) == 
sizeof(uint32_t)?

Best Wishes

Phillip

> Carlo

Carlo Marcelo Arenas Belón April 27, 2022, 5:49 p.m. UTC | #14

On Wed, Apr 27, 2022 at 10:28 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> On 27/04/2022 17:54, Carlo Arenas wrote:
> >
> > the only thing that worries me is sign extension but that is why I put
> > a specific cast.  for all practical reasons I expect uid_t to be
> > uint32_t and therefore using long should be better than using int
> > (through atoi)

Well, just because I think that is the most likely option, doesn't
mean it will be so since it is not defined as such in the standard.
I should have documented though that I was (probably incorrectly)
prioritizing the possibility of supporting negative uids instead of
positive uids > INT_MAX.

This of course only matters in 32bit, but looking at sudo's sources
they use "%u" to set the UID in the environment and therefore we
should change our approach to match.

> If we think uid_t is a uint32_t then should we be using strtoul() to
> make sure we cover the whole uid range where sizeof(long) ==
> sizeof(uint32_t)?

strtoul is sadly not very portable, but I think you are correct that
it should be used instead
originally I thought it would be better to do strtoimax but that will
also require moving this function around.

Carlo

Randall S. Becker April 27, 2022, 5:49 p.m. UTC | #15

On April 27, 2022 1:22 PM, Phillip Wood wrote:
>On 27/04/2022 17:19, Junio C Hamano wrote:
>> Carlo Arenas <carenas@gmail.com> writes:
>>>> The standard is silent on what happens to error when the value is
>>>> out of range.
>>
>> Actually the standard is is very clear what happens to endptr (no,
>> don't call it "error", that is not the point of the parameter).
> >
>>      A pointer to the final string shall be stored in the object
>>      pointed to by endptr, provided that endptr is not a null
>>      pointer.
>>
>> where "final string" has a precise definition much earlier:
>>
>>      First, they decompose the input string into three parts:
>>
>>      1. An initial, possibly empty, sequence of white-space
>>         characters (as specified by isspace())
>>
>>      2. A subject sequence interpreted as an integer represented in
>>         some radix determined by the value of base
>>
>>      3. A final string of one or more unrecognized characters,
>>         including the terminating null byte of the input string.
>>
>>      Then they shall attempt to convert the subject sequence to an
>>      integer, and return the result.
>>
>> So, leading whitespace is stripped, then "subject sequence" that is
>> the sequence of digits (with optional +/- sign) to be turned into a
>> long is recognised, and what remains is the "final string".  endptr is
>> made to point at that "final string", and it does not matter what kind
>> of value the interpretation of "subject sequence" yields.
>
>Oh I think I misunderstood when I read the standard this morning. Just to check I
>do understand now, in the case of overflow entptr points to the final string (i.e. to
>the character following the last digit) even though the "subject sequence" cannot
>be successfully converted?

I can confirm this on multiple 32-bit platforms. With strtol("123456789012345678", &endptr, 10) returns 2147483647 and *endptr == '\0' just beyond the last 8.

Carlo Marcelo Arenas Belón April 27, 2022, 5:54 p.m. UTC | #16

On Wed, Apr 27, 2022 at 10:49 AM <rsbecker@nexbridge.com> wrote:
>
> I can confirm this on multiple 32-bit platforms. With strtol("123456789012345678", &endptr, 10) returns 2147483647 and *endptr == '\0' just beyond the last 8.

can you also confirm that you have strtoul and works the same, before
I break NON-STOP in my next reroll?

Carlo

Randall S. Becker April 27, 2022, 6:05 p.m. UTC | #17

On April 27, 2022 1:54 PM, Carlo Arenas wrote:
>To: rsbecker@nexbridge.com
>Cc: phillip.wood@dunelm.org.uk; Junio C Hamano <gitster@pobox.com>;
>git@vger.kernel.org; philipoakley@iee.email; me@ttaylorr.com;
>guy.j@maurel.de; szeder.dev@gmail.com; johannes.Schindelin@gmx.de;
>derrickstolee@github.com
>Subject: Re: [PATCH] git-compat-util: avoid failing dir ownership checks if running
>privileged
>
>On Wed, Apr 27, 2022 at 10:49 AM <rsbecker@nexbridge.com> wrote:
>>
>> I can confirm this on multiple 32-bit platforms. With
>strtol("123456789012345678", &endptr, 10) returns 2147483647 and *endptr == '\0'
>just beyond the last 8.
>
>can you also confirm that you have strtoul and works the same, before I break
>NON-STOP in my next reroll?

Results on NonStop: strtol("123456789012345678", &endptr, 10) returns 4294967295 and *endptr == '\0' just beyond the last 8.

Carlo Marcelo Arenas Belón April 27, 2022, 6:11 p.m. UTC | #18

On Wed, Apr 27, 2022 at 11:06 AM <rsbecker@nexbridge.com> wrote:
> Results on NonStop: strtol("123456789012345678", &endptr, 10) returns 4294967295 and *endptr == '\0' just beyond the last 8.

Thanks and errno == ERANGE as POSIX requires, right?, the code
(untested) I was planning to use instead looks like :

static inline void extract_id_from_env(const char *env, uid_t *id)
{
        const char *real_uid = getenv(env);

        /* discard any empty values */
        if (real_uid && *real_uid) {
                unsigned long extracted_id;
                int saved_errno = errno;

                errno = 0;
                extracted_id = strtoul(real_uid, NULL, 10);
                if (!errno)
                        *id = (uid_t)extracted_id;
                errno = saved_errno;
        }
        return 0;
}

Carlo

Randall S. Becker April 27, 2022, 6:16 p.m. UTC | #19

On April 27, 2022 2:12 PM, Carlo Arenas wrote:
>On Wed, Apr 27, 2022 at 11:06 AM <rsbecker@nexbridge.com> wrote:
>> Results on NonStop: strtol("123456789012345678", &endptr, 10) returns
>4294967295 and *endptr == '\0' just beyond the last 8.
>
>Thanks and errno == ERANGE as POSIX requires, right?, the code
>(untested) I was planning to use instead looks like :
>
>static inline void extract_id_from_env(const char *env, uid_t *id) {
>        const char *real_uid = getenv(env);
>
>        /* discard any empty values */
>        if (real_uid && *real_uid) {
>                unsigned long extracted_id;
>                int saved_errno = errno;
>
>                errno = 0;
>                extracted_id = strtoul(real_uid, NULL, 10);
>                if (!errno)
>                        *id = (uid_t)extracted_id;
>                errno = saved_errno;
>        }
>        return 0;
>}

errno = 4034 (Value out of range), which is the ERANGE #define.
--Randall

Junio C Hamano April 27, 2022, 6:54 p.m. UTC | #20

Phillip Wood <phillip.wood123@gmail.com> writes:

> On 27/04/2022 16:58, Carlo Arenas wrote:
>> On Wed, Apr 27, 2022 at 5:30 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>> You are ignoring any errors when parsing the environment variable - that
>>> is not a good idea in a security check.
>> which errors are you concerned about?, if anything in this code
>
> I was confused by the fact that the helper function returns a
> success/failure value which we ignore. However euid is not overwritten 
> if strtol fails so it is safe I think.
>
>> worries me from a security point of view is the fact that we are
>> relying in getenv not being racy (as mentioned in the original RFC),
>> but there are no errors set there AFAIK.
>> not ignoring errno in strtol is an option, but as mentioned before I
>> decided instead to reject bogus values and therefore not the clobber a
>> previous errno,
>
> strtol() will set errno if there is a range error ignoring it does not
> change that. In any case is_path_owned_by_current_uid() already
> clobbers errno if stat() fails.

IOW we are not preserving errno anyway, so ignoring errno (or
resetting before calling strtol) does not by us anything.

We probably should step back a bit, take a deep breath and think
what we are trying to protect our users against.  When I said early
in the discussion how much we trust the value in SUDO_UID, I hoped
people would do so, and I was especially hoping such a discussion to
happen when they realized that the width of uid_t is unknown and
there can be truncations in either direction.

So let's start one now.

If we didn't trust the value in SUDO_UID, what valid use case do we
help, and what avenue of attack do we open ourselves to?  This is
not a suggestion of an alternative, but is a discussion starter to
illustrate the line along which we want to think about the issues.

The original "sudo make install" (which runs "git describe" inside)
use case does not really care.  The process running with SUDO_UID is
not spelunking into random directories uncontrollably without being
aware that there may be hostile repositories.  The directories they
visit are part of legitimate journey "make install" takes.

The "sudo sh to get a shell, chdir to /var/tmp/foo to do something"
use case does care---it needs to make sure that whereever it goes is
not part of a hostile repository.  So "if SUDO_UID is there, anything
goes" is not a good protection for such a use case.

If we read SUDO_UID into an int16_t and uid_t happens to be much
wider than that type, then what happens?  Again, I am not advocating
to deliberately use shorter type to cause truncation.  This is
merely to illustrate how much truncation matters.

The "sudo make install" use case may be harmed unless the comparison
is done carefully.  If the repository owner's UID is beyond 32k then
asking if (st.st_uid == SUDO_UID) would say "no" due to truncation
and refuse access to the legitimate user.  If we compare both after
casting them to int16_t then the repository owner will be allowed in
again.  A friendly stranger who happens to share the low 16-bit of
UID will also be allowed to install from the repository, but that is
not an intended consequence.

The "sudo sh and go spelunking" use case is weakend by truncation.
It is protected only if the victim's UID does not share the lower
16-bit with the owner of hostile repository.

So what attack does this allow?  An attacker needs to learn their
own UID, find another user whose UID is the same modulo 16-bit as a
potential victim, and the victim has to be among those who can run
"sudo".  Then the victim somehow has to be talked into running stuff
in a repository the attacker prepares.

Possible?  Yes.  Practical?  I dunno.  Especially if we do not
deliberately truncate by reading into int16_t but use "int" (32-bit
almost everywhere we care about) or "long".  Now how likely is uid_t
narrower than long?  If it is not likely, then we probably are OK to
use long and not worry about the truncation at all.  Or use strtoll()
to make it even less likely.

So, in short, I think it is reasonable if we did:

 - do the extra "if SUDO_UID exists, then..." only when we are root.

 - use strtol to ensure SUDO_UID is in good shape (all digits) and
   read into long; use errno=0 trick to notice overflows.  err on
   the safe side and do not allow if the value cannot be read
   correctly.

 - do the comparison between st.st_uid and strtol() result by
   casting both to (long).  Without this, st.st_uid has a value that
   needs wider than long, we cannot allow the owner of such a
   repository in (and I somehow feel that is unlikely to happen).

 - or omit the third one --- tough luck for those whose UID is
   beyond what a long can represent (and I somehow feel that this is
   probably OK).

Another thing I'd suggest doing is to have a helper function to do
the "we do a bit more than geteuid() to come up with the allowed
owner of repositories we encounter", and call that function instead
of geteuid().

Thanks.

Carlo Marcelo Arenas Belón April 27, 2022, 8:59 p.m. UTC | #21

On Wed, Apr 27, 2022 at 11:54 AM Junio C Hamano <gitster@pobox.com> wrote:

> The "sudo sh to get a shell, chdir to /var/tmp/foo to do something"
> use case does care---it needs to make sure that whereever it goes is
> not part of a hostile repository.  So "if SUDO_UID is there, anything
> goes" is not a good protection for such a use case.

FWIW that was never part of the proposal, indeed making git aware of
SUDO_ID when it is running as only as root was part of the design to
avoid other users probably allowing repositories they don't control by
having an evil SUDO_ID.

as per the root user, I was expecting that he could be trusted more
and that wouldn't accidentally get an evil SUDO_ID on their session
because it is something that gets generated from their original
account and they should be aware of it and even might need to tweak it
(ex: by un setting it if it gets in the way).

To make the `sudo make install` use case possible I think it make
sense to trust SUDO_ID by default, but I would be open to add a
configuration setting that would be required to enable it (it indeed
might become useful for root users that might want to disable it as an
alternative to unset SUDO_ID).

> Possible?  Yes.  Practical?  I dunno.  Especially if we do not
> deliberately truncate by reading into int16_t but use "int" (32-bit
> almost everywhere we care about) or "long".  Now how likely is uid_t
> narrower than long?  If it is not likely, then we probably are OK to
> use long and not worry about the truncation at all.  Or use strtoll()
> to make it even less likely.

strtoll (or my suggested alternative of strtoimax) suffer from
portability issues which are handled later in the git-compat-util.h
file, so while I think it is worth doing, I would prefer doing so
later as part of moving this code into a proper compat module.

for this iteration, the suggestion by Phillip to use strtoul makes
more sense (specially after confirmation by Randall that it works in
NON-STOP), it is more portable and doesn't require later in the file
git-compat magic to work, it allows for the full range of 32bit that
might be needed in 32bit *NIX, it is should fit find even in 32bit
environments with a 32bit unsigned uid_t (which is fairly common), it
doesn't sign extend if applied into a wider uid_t and would be likely
able to be assigned even without a cast and without the risk of
truncation for most uid_t types.  Lastly, since now this code is
focused only in sudo compatibility, and sudo casts their uid to an
unsigned int and prints it into that variable with "%u" it is the
right "API".

For comparison, I would also think it is better to let the compiler
warn if needed so at least the users would know their uid_t type is
narrower than what we expect and sudo sets, and might work as a
weather balloon to find those systems.

>
> So, in short, I think it is reasonable if we did:
>
>  - do the extra "if SUDO_UID exists, then..." only when we are root.

already in since the RFC, see discussion above for an option to
tighten it further if really needed, but even in that case I would
rather do it in the next iteration.

>  - use strtol to ensure SUDO_UID is in good shape (all digits) and
>    read into long; use errno=0 trick to notice overflows.  err on
>    the safe side and do not allow if the value cannot be read
>    correctly.

already implemented as part of v2, except it is using strtoul/unsigned long.
also in the line of "err on the safe side" I was also discarding any values
that would overflow an uid_t (which is assumed to be unsigned), which
affect the next bullet point.

>  - do the comparison between st.st_uid and strtol() result by
>    casting both to (long).  Without this, st.st_uid has a value that
>    needs wider than long, we cannot allow the owner of such a
>    repository in (and I somehow feel that is unlikely to happen).

do you still want to cast both sides even if discarding values that would
overflow an uid_t?

FWIW, as Phillip reported, using long (or another signed type) would
restrict the valid ids in 32-bit *NIX to half of what the expected uint32_t
range most of them should have.

>  - or omit the third one --- tough luck for those whose UID is
>    beyond what a long can represent (and I somehow feel that this is
>    probably OK).

That was what I proposed in the current RFC (at least in 32bit *NIX)
but I have to admit that i'd seen a lot more issues with uid larger than
INT_MAX that I expected when doing google for it, so that is what I
thought it would be safer to use the full 32-bit and an unsigned type
instead.

> Another thing I'd suggest doing is to have a helper function to do
> the "we do a bit more than geteuid() to come up with the allowed
> owner of repositories we encounter", and call that function instead
> of geteuid().

I did this originally, but discarded it quickly because it was a
little odd and easy to reuse, for something that I was hoping wouldn't
stay in git-compat for too long.

The use of the current helper function covers that use case but is by
design harder to reuse while keeping the overall logic of the 2
combined functions more visible.

Once we move this code into a proper compat module, I would be happy
to add a geteuid_or_sudo() function but that would be probably after
we also fix doas.

Carlo

Randall S. Becker April 27, 2022, 9:09 p.m. UTC | #22

On April 27, 2022 5:00 PM, Carlo Arenas wrote:
>On Wed, Apr 27, 2022 at 11:54 AM Junio C Hamano <gitster@pobox.com> wrote:
>
>> The "sudo sh to get a shell, chdir to /var/tmp/foo to do something"
>> use case does care---it needs to make sure that whereever it goes is
>> not part of a hostile repository.  So "if SUDO_UID is there, anything
>> goes" is not a good protection for such a use case.
>
>FWIW that was never part of the proposal, indeed making git aware of SUDO_ID
>when it is running as only as root was part of the design to avoid other users
>probably allowing repositories they don't control by having an evil SUDO_ID.
>
>as per the root user, I was expecting that he could be trusted more and that
>wouldn't accidentally get an evil SUDO_ID on their session because it is something
>that gets generated from their original account and they should be aware of it and
>even might need to tweak it
>(ex: by un setting it if it gets in the way).
<snip>

For perspective, the root user is specifically only trusted so far my community. Mucking about with repositories is frowned upon except for special system configuration repositories (ones for /etc, ssl certs, for example, and they have to sign those). Commit signing is being deployed to detect and as much practical prevent root (or any other user, elevated or not) from inappropriate repo history operations.

Just sharing a different POV.
--Randall

Junio C Hamano April 27, 2022, 9:25 p.m. UTC | #23

Carlo Arenas <carenas@gmail.com> writes:

> On Wed, Apr 27, 2022 at 11:54 AM Junio C Hamano <gitster@pobox.com> wrote:
>
>> The "sudo sh to get a shell, chdir to /var/tmp/foo to do something"
>> use case does care---it needs to make sure that whereever it goes is
>> not part of a hostile repository.  So "if SUDO_UID is there, anything
>> goes" is not a good protection for such a use case.
>
> FWIW that was never part of the proposal, indeed making git aware of
> SUDO_ID when it is running as only as root was part of the design to
> avoid other users probably allowing repositories they don't control by
> having an evil SUDO_ID.

You forgot to quote:

    This is not a suggestion of an alternative, but is a discussion
    starter to illustrate the line along which we want to think
    about the issues.

that came before it.

>>  - do the comparison between st.st_uid and strtol() result by
>>    casting both to (long).  Without this, st.st_uid has a value that
>>    needs wider than long, we cannot allow the owner of such a
>>    repository in (and I somehow feel that is unlikely to happen).
>
> do you still want to cast both sides even if discarding values that would
> overflow an uid_t?

The variable used with strtol (or strtoull or whatever) does not
have to be casted.  The "both sides" was primarily for the case
where st.st_uid is wider than whatever integral type we used to
parse and hold the value we took from the environment, to force
the same truncation on st.st_uid as the truncation the environment
parsing may have caused.  But as I keep saying I do not think it is
worth it, which means ...

>>  - or omit the third one --- tough luck for those whose UID is
>>    beyond what a long can represent (and I somehow feel that this is
>>    probably OK).

... this one.

Phillip Wood April 28, 2022, 5:56 p.m. UTC | #24

On 27/04/2022 19:54, Junio C Hamano wrote:

[ I've trimmed your discussion for brevity but thanks for taking the 
time to write it out, it was helpful in clarifying what we're trying to 
protect ourselves against]

> So, in short, I think it is reasonable if we did:
> 
>   - do the extra "if SUDO_UID exists, then..." only when we are root.
> 
>   - use strtol to ensure SUDO_UID is in good shape (all digits) and
>     read into long; use errno=0 trick to notice overflows.  err on
>     the safe side and do not allow if the value cannot be read
>     correctly.
> 
>   - do the comparison between st.st_uid and strtol() result by
>     casting both to (long).  Without this, st.st_uid has a value that
>     needs wider than long, we cannot allow the owner of such a
>     repository in (and I somehow feel that is unlikely to happen).
> 
>   - or omit the third one --- tough luck for those whose UID is
>     beyond what a long can represent (and I somehow feel that this is
>     probably OK).

I think that unsigned long should be wide enough. On x86 linux uid_t is 
uint32_t for both 32 & 64 bit processors. That means that we could 
truncate the value we read from SUDO_UID when long is 64 bits if it is 
cast to uid_t but we're supposed to be able to trust that SUDO_UID is a 
valid id so that shouldn't be a problem. If an attacker can change 
what's in SUDO_UID we're doomed anyway.

Best Wishes

Phillip

git-compat-util: avoid failing dir ownership checks if running privileged

Commit Message

Comments

Patch