Message ID | 20200422153347.40018-1-jrtc27@jrtc27.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | config.mak.uname: Define FREAD_READS_DIRECTORIES for GNU/Hurd | expand |
Jessica Clarke wrote: > GNU/Hurd is another platform that behaves like this. Set it to > UnfortunatelyYes so that config directory files are correctly processed. > This fixes the corresponding 'proper error on directory "files"' test in > t1308-config-set.sh. > > Thanks-to: Jeff King <peff@peff.net> > Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com> > --- > config.mak.uname | 1 + > 1 file changed, 1 insertion(+) Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Thanks. > diff --git a/config.mak.uname b/config.mak.uname > index 0ab8e00938..3e526f6b9f 100644 > --- a/config.mak.uname > +++ b/config.mak.uname > @@ -308,6 +308,7 @@ ifeq ($(uname_S),GNU) > NO_STRLCPY = YesPlease > HAVE_PATHS_H = YesPlease > LIBC_CONTAINS_LIBINTL = YesPlease > + FREAD_READS_DIRECTORIES = UnfortunatelyYes > endif I wonder why we set up this knob this way. A lot of operating systems support fopen(..., "r") of a directory --- wouldn't it make sense for FREAD_READS_DIRECTORIES to be the default and for users on stricter platforms to be able to set FREAD_DOES_NOT_READ_DIRECTORIES if they want to speed Git up by taking advantage of their saner fread? Thanks, Jonathan
Jessica Clarke <jrtc27@jrtc27.com> writes: > GNU/Hurd is another platform that behaves like this. Set it to > UnfortunatelyYes so that config directory files are correctly processed. > This fixes the corresponding 'proper error on directory "files"' test in > t1308-config-set.sh. > > Thanks-to: Jeff King <peff@peff.net> > Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com> > --- > config.mak.uname | 1 + > 1 file changed, 1 insertion(+) I'd tweak s/Thanks-to:/Helped-by:/ while queuing. Thanks for a quick turnaround after reporting the issue and getting a response. The way collaboration is working feels wonderful. > > diff --git a/config.mak.uname b/config.mak.uname > index 0ab8e00938..3e526f6b9f 100644 > --- a/config.mak.uname > +++ b/config.mak.uname > @@ -308,6 +308,7 @@ ifeq ($(uname_S),GNU) > NO_STRLCPY = YesPlease > HAVE_PATHS_H = YesPlease > LIBC_CONTAINS_LIBINTL = YesPlease > + FREAD_READS_DIRECTORIES = UnfortunatelyYes > endif > ifeq ($(uname_S),IRIX) > NO_SETENV = YesPlease
Jonathan Nieder <jrnieder@gmail.com> writes: > I wonder why we set up this knob this way. A lot of operating systems > support fopen(..., "r") of a directory --- wouldn't it make sense for > FREAD_READS_DIRECTORIES to be the default and for users on stricter > platforms to be able to set FREAD_DOES_NOT_READ_DIRECTORIES if they > want to speed Git up by taking advantage of their saner fread? It would have been helped to hear that when we accepted cba22528 (Add compat/fopen.c which returns NULL on attempt to open directory, 2008-02-08). Perhaps back then it was more common not to allow fopen() on a directory? I dunno. Because we do not very often hear "oops, this system also needs the READS_DIRECTORIES knob set" these days, I consider it a fair game to toggle the polarity of it, once the Hurd patch that started this thread lands, as the vicinity of the code would become quiescent again.
On Wed, Apr 22, 2020 at 9:41 AM Jonathan Nieder <jrnieder@gmail.com> wrote: > > Jessica Clarke wrote: > > > GNU/Hurd is another platform that behaves like this. Set it to > > UnfortunatelyYes so that config directory files are correctly processed. > > This fixes the corresponding 'proper error on directory "files"' test in > > t1308-config-set.sh. > > > > Thanks-to: Jeff King <peff@peff.net> > > Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com> > > --- > > config.mak.uname | 1 + > > 1 file changed, 1 insertion(+) > > Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> > Thanks. > > > diff --git a/config.mak.uname b/config.mak.uname > > index 0ab8e00938..3e526f6b9f 100644 > > --- a/config.mak.uname > > +++ b/config.mak.uname > > @@ -308,6 +308,7 @@ ifeq ($(uname_S),GNU) > > NO_STRLCPY = YesPlease > > HAVE_PATHS_H = YesPlease > > LIBC_CONTAINS_LIBINTL = YesPlease > > + FREAD_READS_DIRECTORIES = UnfortunatelyYes > > endif > > I wonder why we set up this knob this way. A lot of operating systems > support fopen(..., "r") of a directory --- wouldn't it make sense for > FREAD_READS_DIRECTORIES to be the default and for users on stricter > platforms to be able to set FREAD_DOES_NOT_READ_DIRECTORIES if they > want to speed Git up by taking advantage of their saner fread? At the time it was written, the assumption in the code was that an fread() on a directory would produce a failure, and that that was the sane and common behavior. fopen(..., "r") on a directory was expected to be successful on most platforms, but does fail on some. I don't recall if it failed on any of the platforms I had access to at the time (Solaris, IRIX), but it does fail on Windows. So I introduced this feature that would make fopen() fail when opening a directory for use on the platforms where fread() of a directory did not fail, instead of trying to wrap fread(). I just looked in config.mak.uname, and I'm surprised to see FREAD_READS_DIRECTORIES set for so many platforms. And it's set for Linux and Darwin?!?!? Junio added it for Darwin (8e178ec4d072da4cd8f4449e17aef3aff5b57f6a) and Nguyễn Thái Ngọc Duy added it for Linux (e2d90fd1c33ae57e4a6da5729ae53876107b3463), but also seemed to mistake the intention of FREAD_FREADS_DIRECTORIES as being about the fopen(..., "r") of a directory rather than about an fread() of a directory. I just wrote a test program and tested on Linux, Darwin, and Windows. Linux and Darwin both succeed to fopen() a directory and fail to fread() it, as expected. Windows fails to fopen() a directory. I notice this earlier commit mentions a failure of t1308 (4e3ecbd43958b1400d6cb85fe5529beda1630e3a). I wonder if this is the reason FREAD_READS_DIRECTORIES was added to so many platforms? -Brandon
On 22 Apr 2020, at 19:48, Brandon Casey <drafnel@gmail.com> wrote: > On Wed, Apr 22, 2020 at 9:41 AM Jonathan Nieder <jrnieder@gmail.com> wrote: >> >> Jessica Clarke wrote: >> >>> GNU/Hurd is another platform that behaves like this. Set it to >>> UnfortunatelyYes so that config directory files are correctly processed. >>> This fixes the corresponding 'proper error on directory "files"' test in >>> t1308-config-set.sh. >>> >>> Thanks-to: Jeff King <peff@peff.net> >>> Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com> >>> --- >>> config.mak.uname | 1 + >>> 1 file changed, 1 insertion(+) >> >> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> >> Thanks. >> >>> diff --git a/config.mak.uname b/config.mak.uname >>> index 0ab8e00938..3e526f6b9f 100644 >>> --- a/config.mak.uname >>> +++ b/config.mak.uname >>> @@ -308,6 +308,7 @@ ifeq ($(uname_S),GNU) >>> NO_STRLCPY = YesPlease >>> HAVE_PATHS_H = YesPlease >>> LIBC_CONTAINS_LIBINTL = YesPlease >>> + FREAD_READS_DIRECTORIES = UnfortunatelyYes >>> endif >> >> I wonder why we set up this knob this way. A lot of operating systems >> support fopen(..., "r") of a directory --- wouldn't it make sense for >> FREAD_READS_DIRECTORIES to be the default and for users on stricter >> platforms to be able to set FREAD_DOES_NOT_READ_DIRECTORIES if they >> want to speed Git up by taking advantage of their saner fread? > > At the time it was written, the assumption in the code was that an > fread() on a directory would produce a failure, and that that was the > sane and common behavior. fopen(..., "r") on a directory was expected > to be successful on most platforms, but does fail on some. I don't > recall if it failed on any of the platforms I had access to at the > time (Solaris, IRIX), but it does fail on Windows. So I introduced > this feature that would make fopen() fail when opening a directory for > use on the platforms where fread() of a directory did not fail, > instead of trying to wrap fread(). > > I just looked in config.mak.uname, and I'm surprised to see > FREAD_READS_DIRECTORIES set for so many platforms. And it's set for > Linux and Darwin?!?!? Junio added it for Darwin > (8e178ec4d072da4cd8f4449e17aef3aff5b57f6a) and Nguyễn Thái Ngọc Duy > added it for Linux (e2d90fd1c33ae57e4a6da5729ae53876107b3463), but > also seemed to mistake the intention of FREAD_FREADS_DIRECTORIES as > being about the fopen(..., "r") of a directory rather than about an > fread() of a directory. > > I just wrote a test program and tested on Linux, Darwin, and Windows. > Linux and Darwin both succeed to fopen() a directory and fail to > fread() it, as expected. Windows fails to fopen() a directory. > > I notice this earlier commit mentions a failure of t1308 > (4e3ecbd43958b1400d6cb85fe5529beda1630e3a). I wonder if this is the > reason FREAD_READS_DIRECTORIES was added to so many platforms? Then the current autoconf test is wrong and likely causing confusion: > AC_RUN_IFELSE( > [AC_LANG_PROGRAM([AC_INCLUDES_DEFAULT], > [[ > FILE *f = fopen(".", "r"); > return f != NULL;]])], > [ac_cv_fread_reads_directories=no], > [ac_cv_fread_reads_directories=yes]) > ]) Jess
On Wed, Apr 22, 2020 at 11:48 AM Brandon Casey <drafnel@gmail.com> wrote: > I just wrote a test program and tested on Linux, Darwin, and Windows. > Linux and Darwin both succeed to fopen() a directory and fail to > fread() it, as expected. Windows fails to fopen() a directory. I just want to be clear here, so based on the above, Linux and Darwin should _not_ need to have FREAD_READS_DIRECTORIES set. Windows, as always, is a stickier subject. Based on the above, Windows should not need FREAD_READS_DIRECTORIES either. It's not set when MSVC is the compiler, but it is set for cygwin. -Brandon
On Wed, Apr 22, 2020 at 11:50 AM Jessica Clarke <jrtc27@jrtc27.com> wrote: > > On 22 Apr 2020, at 19:48, Brandon Casey <drafnel@gmail.com> wrote: > > introduced > > this feature that would make fopen() fail when opening a directory for > > use on the platforms where fread() of a directory did not fail, > > instead of trying to wrap fread(). > > Then the current autoconf test is wrong and likely causing confusion: > > > AC_RUN_IFELSE( > > [AC_LANG_PROGRAM([AC_INCLUDES_DEFAULT], > > [[ > > FILE *f = fopen(".", "r"); > > return f != NULL;]])], > > [ac_cv_fread_reads_directories=no], > > [ac_cv_fread_reads_directories=yes]) > > ]) Yes, we should attempt to call fread() there. If either the fopen() fails or the fread() fails, then that should mean that FREAD_READS_DIRECTORIES is not necessary. -Brandon
On Wed, Apr 22, 2020 at 11:48 AM Brandon Casey <drafnel@gmail.com> wrote: > > I just looked in config.mak.uname, and I'm surprised to see > FREAD_READS_DIRECTORIES set for so many platforms. And it's set for > Linux and Darwin?!?!? Junio added it for Darwin > (8e178ec4d072da4cd8f4449e17aef3aff5b57f6a) and Nguyễn Thái Ngọc Duy > added it for Linux (e2d90fd1c33ae57e4a6da5729ae53876107b3463), but > also seemed to mistake the intention of FREAD_FREADS_DIRECTORIES as > being about the fopen(..., "r") of a directory rather than about an > fread() of a directory. > > I just wrote a test program and tested on Linux, Darwin, and Windows. > Linux and Darwin both succeed to fopen() a directory and fail to > fread() it, as expected. Windows fails to fopen() a directory. > > I notice this earlier commit mentions a failure of t1308 > (4e3ecbd43958b1400d6cb85fe5529beda1630e3a). I wonder if this is the > reason FREAD_READS_DIRECTORIES was added to so many platforms? Whoops, I got the order of e2d90fd1c33ae57e4a6da5729ae53876107b3463 and 4e3ecbd43958b1400d6cb85fe5529beda1630e3a wrong. Looks like the misunderstanding of FREAD_READS_DIRECTORIES in e2d90fd could have been the cause of all of this. That commit introduced the test t1308 and added FREAD_READS... to Linux, kFreeBSD, and FreeBSD, and the other additions followed shortly after. -Brandon
On Wed, Apr 22, 2020 at 12:13:50PM -0700, Brandon Casey wrote: > On Wed, Apr 22, 2020 at 11:48 AM Brandon Casey <drafnel@gmail.com> wrote: > > > > I just looked in config.mak.uname, and I'm surprised to see > > FREAD_READS_DIRECTORIES set for so many platforms. And it's set for > > Linux and Darwin?!?!? Junio added it for Darwin > > (8e178ec4d072da4cd8f4449e17aef3aff5b57f6a) and Nguyễn Thái Ngọc Duy > > added it for Linux (e2d90fd1c33ae57e4a6da5729ae53876107b3463), but > > also seemed to mistake the intention of FREAD_FREADS_DIRECTORIES as > > being about the fopen(..., "r") of a directory rather than about an > > fread() of a directory. > > > > I just wrote a test program and tested on Linux, Darwin, and Windows. > > Linux and Darwin both succeed to fopen() a directory and fail to > > fread() it, as expected. Windows fails to fopen() a directory. > > > > I notice this earlier commit mentions a failure of t1308 > > (4e3ecbd43958b1400d6cb85fe5529beda1630e3a). I wonder if this is the > > reason FREAD_READS_DIRECTORIES was added to so many platforms? > > Whoops, I got the order of e2d90fd1c33ae57e4a6da5729ae53876107b3463 > and 4e3ecbd43958b1400d6cb85fe5529beda1630e3a wrong. Looks like the > misunderstanding of FREAD_READS_DIRECTORIES in e2d90fd could have been > the cause of all of this. That commit introduced the test t1308 and > added FREAD_READS... to Linux, kFreeBSD, and FreeBSD, and the other > additions followed shortly after. I think the code is actually doing the right thing (or at least something useful), and the "FREAD" in the name is the confusing part. Because there's now code doing: fh = fopen(fn, "r"); if (!fh) { if (errno == ENOENT || errno == EISDIR) { /* not actually a file; treat as a gentle noop */ return 0; } else { die_errno("omg, a real open error"); } } if (!fread(..., fh)) die_errno("omg, a real read error"); which is exactly what the failing test in t1308 is doing. I know that wasn't the original intent of the flag, but I think it was a conscious decision to build on around the time of e2d90fd1c, when we started actually checking fopen() return values (as opposed to just segfaulting). And in practice, do we really care about cases that can fopen a directory but refuse to read from it? It's simpler and more efficient to catch this case up front. So I think the knob has de facto become "do we need to use our compat wrapper to make opening a directory fail with EISDIR". And any attempt to change that will mean adapting all of the callers to handle that case themselves. I think what we have now is the useful knob we want; it's just misnamed (and obviously I don't blame your original patch; it was adapted over time). -Peff
On Wed, Apr 22, 2020 at 12:58 PM Jeff King <peff@peff.net> wrote: > > On Wed, Apr 22, 2020 at 12:13:50PM -0700, Brandon Casey wrote: > > > On Wed, Apr 22, 2020 at 11:48 AM Brandon Casey <drafnel@gmail.com> wrote: > > > > > > I just looked in config.mak.uname, and I'm surprised to see > > > FREAD_READS_DIRECTORIES set for so many platforms. And it's set for > > > Linux and Darwin?!?!? Junio added it for Darwin > > > (8e178ec4d072da4cd8f4449e17aef3aff5b57f6a) and Nguyễn Thái Ngọc Duy > > > added it for Linux (e2d90fd1c33ae57e4a6da5729ae53876107b3463), but > > > also seemed to mistake the intention of FREAD_FREADS_DIRECTORIES as > > > being about the fopen(..., "r") of a directory rather than about an > > > fread() of a directory. > > > > > > I just wrote a test program and tested on Linux, Darwin, and Windows. > > > Linux and Darwin both succeed to fopen() a directory and fail to > > > fread() it, as expected. Windows fails to fopen() a directory. > > > > > > I notice this earlier commit mentions a failure of t1308 > > > (4e3ecbd43958b1400d6cb85fe5529beda1630e3a). I wonder if this is the > > > reason FREAD_READS_DIRECTORIES was added to so many platforms? > > > > Whoops, I got the order of e2d90fd1c33ae57e4a6da5729ae53876107b3463 > > and 4e3ecbd43958b1400d6cb85fe5529beda1630e3a wrong. Looks like the > > misunderstanding of FREAD_READS_DIRECTORIES in e2d90fd could have been > > the cause of all of this. That commit introduced the test t1308 and > > added FREAD_READS... to Linux, kFreeBSD, and FreeBSD, and the other > > additions followed shortly after. > > I think the code is actually doing the right thing (or at least > something useful), and the "FREAD" in the name is the confusing part. > Because there's now code doing: > > fh = fopen(fn, "r"); > if (!fh) { > if (errno == ENOENT || errno == EISDIR) { > /* not actually a file; treat as a gentle noop */ > return 0; > } else { > die_errno("omg, a real open error"); > } > } > if (!fread(..., fh)) > die_errno("omg, a real read error"); > > which is exactly what the failing test in t1308 is doing. > > I know that wasn't the original intent of the flag, but I think it was a > conscious decision to build on around the time of e2d90fd1c, when we > started actually checking fopen() return values (as opposed to just > segfaulting). Our policy has always been to check return values hasn't it? > And in practice, do we really care about cases that can fopen a > directory but refuse to read from it? It's simpler and more efficient to > catch this case up front. I'm not sure I agree that it's simpler and more efficient to catch this up front. Catching the case where a directory is supplied when only a file is valid is an error path which we generally do not optimize for, and it requires us to add an extra stat to every fopen() call. We should have always been checking both the fopen() and any reads and handle a failure in either one. So normally we wouldn't have to do anything special to produce an error for the case of a directory being supplied. Now, if you want to ignore when a directory is supplied, and not produce an error, I would expect the code to actually check for that explicitly/clearly and not depend on fopen() failing, since that is not a common behavior. > So I think the knob has de facto become "do we need to use our compat > wrapper to make opening a directory fail with EISDIR". And any attempt > to change that will mean adapting all of the callers to handle that case > themselves. I think what we have now is the useful knob we want; it's > just misnamed (and obviously I don't blame your original patch; it was > adapted over time). We've generally taken the approach that there is an expected "normal" behavior for the c library, generally the linux/glibc behavior. Then, for platforms that behaved differently from what we've defined as "normal", we'd introduce a compatibility function to make them behave the way we wanted them to behave, or as close to that as possible. But what you're suggesting here seems different. You're suggesting that we should modify the behavior of fopen from what is commonly considered "normal" so that it behaves in a new uncommon way. That doesn't seem like the right thing to do. Instead, I would think it would be better to introduce a new function that has the behavior we want and to explicitly call that function instead of pretending that we're calling fopen(). Otherwise it just leads to confusion since _our_ fopen() doesn't actually work the way fopen() "normally" works on our common platform. Maybe call it fopen_file_only() or something. I've been away from git development for too long to know whether most fopen callsites would need to use an fopen_file_only() function or whether it would just be a few special instances, and the rest could just use a regular fopen(). -Brandon
On Wed, Apr 22, 2020 at 02:18:15PM -0700, Brandon Casey wrote: > > I know that wasn't the original intent of the flag, but I think it was a > > conscious decision to build on around the time of e2d90fd1c, when we > > started actually checking fopen() return values (as opposed to just > > segfaulting). > > Our policy has always been to check return values hasn't it? Policy perhaps, but practice didn't always follow it. :) At any rate, my point was just that as part of a larger cleanup, we found it useful at that time to have an fopen() which behaved predictably on all platforms. > > And in practice, do we really care about cases that can fopen a > > directory but refuse to read from it? It's simpler and more efficient to > > catch this case up front. > > I'm not sure I agree that it's simpler and more efficient to catch > this up front. Catching the case where a directory is supplied when > only a file is valid is an error path which we generally do not > optimize for, and it requires us to add an extra stat to every fopen() > call. We should have always been checking both the fopen() and any > reads and handle a failure in either one. So normally we wouldn't have > to do anything special to produce an error for the case of a directory > being supplied. True, I guess it's not really more efficient (I was thinking that we had to deal with it only once, not on each fread). I do think it's simpler, though, as it means fopen() behaves the same everywhere. > Now, if you want to ignore when a directory is supplied, and not > produce an error, I would expect the code to actually check for that > explicitly/clearly and not depend on fopen() failing, since that is > not a common behavior. Sure, we could be stat()-ing each file before fopen(). And if you think it's worth going back to that direction, you can try to work up a patch. My suspicion is that it will end up making the callers worse, for no real gain. > We've generally taken the approach that there is an expected "normal" > behavior for the c library, generally the linux/glibc behavior. Then, > for platforms that behaved differently from what we've defined as > "normal", we'd introduce a compatibility function to make them behave > the way we wanted them to behave, or as close to that as possible. But > what you're suggesting here seems different. You're suggesting that we > should modify the behavior of fopen from what is commonly considered > "normal" so that it behaves in a new uncommon way. That doesn't seem > like the right thing to do. I think this is just defining a different "normal". And FWIW, I'm not suggesting any particular change now. I think we went in this direction a few years ago, and I'm just trying to explain the current state in terms of that decision. > Instead, I would think it would be better to introduce a new function > that has the behavior we want and to explicitly call that function > instead of pretending that we're calling fopen(). Otherwise it just > leads to confusion since _our_ fopen() doesn't actually work the way > fopen() "normally" works on our common platform. Maybe call it > fopen_file_only() or something. I've been away from git development > for too long to know whether most fopen callsites would need to use an > fopen_file_only() function or whether it would just be a few special > instances, and the rest could just use a regular fopen(). Yeah, I think that would be fine. Though again, I'm really not sure that it gains us all that much, and it would require auditing each fopen() site to see whether it's depending on the current compat behavior or not. -Peff
diff --git a/config.mak.uname b/config.mak.uname index 0ab8e00938..3e526f6b9f 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -308,6 +308,7 @@ ifeq ($(uname_S),GNU) NO_STRLCPY = YesPlease HAVE_PATHS_H = YesPlease LIBC_CONTAINS_LIBINTL = YesPlease + FREAD_READS_DIRECTORIES = UnfortunatelyYes endif ifeq ($(uname_S),IRIX) NO_SETENV = YesPlease
GNU/Hurd is another platform that behaves like this. Set it to UnfortunatelyYes so that config directory files are correctly processed. This fixes the corresponding 'proper error on directory "files"' test in t1308-config-set.sh. Thanks-to: Jeff King <peff@peff.net> Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com> --- config.mak.uname | 1 + 1 file changed, 1 insertion(+)