[v2,6/8] t7063: drop non-POSIX argument "-ls" from find(1)
diff mbox series

Message ID 59e3f73784b2a3bd9ccec87412e6178411c3708e.1584838148.git.congdanhqx@gmail.com
State New
Headers show
Series
  • fix test failure with busybox
Related show

Commit Message

Đoàn Trần Công Danh March 22, 2020, 12:55 a.m. UTC
Since commit 6b7728db81, (t7063: work around FreeBSD's lazy mtime
update feature, 2016-08-03), we started to use ls as a trick to update
directory's mtime.

However, `-ls` flag isn't required by POSIX's find(1), and
busybox(1) doesn't implement it.

From the original conversation, it seems like find(1) with "-type d"
could trigger enough "lstat(2)" to ask FreeBSD update mtime.

Use only filter "-type d" for now.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 t/t7063-status-untracked-cache.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Johannes Schindelin March 23, 2020, 2:11 p.m. UTC | #1
Hi,

On Sun, 22 Mar 2020, Đoàn Trần Công Danh wrote:

> Since commit 6b7728db81, (t7063: work around FreeBSD's lazy mtime
> update feature, 2016-08-03), we started to use ls as a trick to update
> directory's mtime.
>
> However, `-ls` flag isn't required by POSIX's find(1), and
> busybox(1) doesn't implement it.
>
> >From the original conversation, it seems like find(1) with "-type d"
> could trigger enough "lstat(2)" to ask FreeBSD update mtime.

This rationale  makes me uneasy: why did Duy add _both_ `-type d` *and*
`-ls` if the former would have been enough?

> Use only filter "-type d" for now.
>
> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> ---
>  t/t7063-status-untracked-cache.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/t/t7063-status-untracked-cache.sh b/t/t7063-status-untracked-cache.sh
> index 190ae149cf..6791c6b95a 100755
> --- a/t/t7063-status-untracked-cache.sh
> +++ b/t/t7063-status-untracked-cache.sh
> @@ -18,7 +18,7 @@ GIT_FORCE_UNTRACKED_CACHE=true
>  export GIT_FORCE_UNTRACKED_CACHE
>
>  sync_mtime () {
> -	find . -type d -ls >/dev/null
> +	find . -type d >/dev/null

A more conservative patch would be the following:

-- snip --
commit 1680a64fae24b1073dbf1b844889a9953823b7a2
Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Date:   Wed Jul 19 22:13:16 2017 +0200

    t7063: when running under BusyBox, avoid unsupported find option

    BusyBox' find implementation does not understand the -ls option, so
    let's not use it when we're running inside BusyBox.

    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

diff --git a/t/t7063-status-untracked-cache.sh
b/t/t7063-status-untracked-cache.sh
index 190ae149cf3c..ab7e8b5fea01 100755
--- a/t/t7063-status-untracked-cache.sh
+++ b/t/t7063-status-untracked-cache.sh
@@ -18,7 +18,12 @@ GIT_FORCE_UNTRACKED_CACHE=true
 export GIT_FORCE_UNTRACKED_CACHE

 sync_mtime () {
-       find . -type d -ls >/dev/null
+       if test_have_prereq BUSYBOX
+       then
+               find . -type d -print0 | xargs -0r ls -ld >/dev/null
+       else
+               find . -type d -ls >/dev/null
+       fi
 }

 avoid_racy() {
-- snap --

I have this in Git for Windows' fork, although I have to admit that there
is no CI set up to verify that this is all working as I expect it to.

Ciao,
Dscho

>  }
>
>  avoid_racy() {
> --
> 2.26.0.rc2.310.g2932bb562d
>
>
>
Torsten Bögershausen March 23, 2020, 2:37 p.m. UTC | #2
On Mon, Mar 23, 2020 at 03:11:50PM +0100, Johannes Schindelin wrote:
> Hi,
>
> On Sun, 22 Mar 2020, Đoàn Trần Công Danh wrote:
>
> > Since commit 6b7728db81, (t7063: work around FreeBSD's lazy mtime
> > update feature, 2016-08-03), we started to use ls as a trick to update
> > directory's mtime.
> >
> > However, `-ls` flag isn't required by POSIX's find(1), and
> > busybox(1) doesn't implement it.
> >
> > >From the original conversation, it seems like find(1) with "-type d"
> > could trigger enough "lstat(2)" to ask FreeBSD update mtime.
>
> This rationale  makes me uneasy: why did Duy add _both_ `-type d` *and*
> `-ls` if the former would have been enough?

man readdir on my Linux gives:
[]
DESCRIPTION
       The  readdir() function returns a pointer to a dirent structure representing the next
       directory entry in the directory stream pointed to by dirp.  It returns
       NULL on reaching the end of the directory stream or if an error occurred.

       In the glibc implementation, the dirent structure is defined as follows:

           struct dirent {
               ino_t          d_ino;       /* Inode number */
               off_t          d_off;       /* Not an offset; see below */
               unsigned short d_reclen;    /* Length of this record */
               unsigned char  d_type;      /* Type of file; not supported
                                              by all filesystem types */
               char           d_name[256]; /* Null-terminated filename */
           };

So we could imagine that `find` is clever enough to extract "-type d" already
when calling readdir(), and the "good old" following stat() is not needed here.

So I would strongly agree with Dscho to keep the `ls`

[snip]
Đoàn Trần Công Danh March 23, 2020, 3:11 p.m. UTC | #3
On 2020-03-23 15:11:50+0100, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Hi,
> 
> On Sun, 22 Mar 2020, Đoàn Trần Công Danh wrote:
> 
> > Since commit 6b7728db81, (t7063: work around FreeBSD's lazy mtime
> > update feature, 2016-08-03), we started to use ls as a trick to update
> > directory's mtime.
> >
> > However, `-ls` flag isn't required by POSIX's find(1), and
> > busybox(1) doesn't implement it.
> >
> > >From the original conversation, it seems like find(1) with "-type d"
> > could trigger enough "lstat(2)" to ask FreeBSD update mtime.
> 
> This rationale  makes me uneasy: why did Duy add _both_ `-type d` *and*
> `-ls` if the former would have been enough?
> 
> > Use only filter "-type d" for now.
> >
> > Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> > ---
> >  t/t7063-status-untracked-cache.sh | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/t/t7063-status-untracked-cache.sh b/t/t7063-status-untracked-cache.sh
> > index 190ae149cf..6791c6b95a 100755
> > --- a/t/t7063-status-untracked-cache.sh
> > +++ b/t/t7063-status-untracked-cache.sh
> > @@ -18,7 +18,7 @@ GIT_FORCE_UNTRACKED_CACHE=true
> >  export GIT_FORCE_UNTRACKED_CACHE
> >
> >  sync_mtime () {
> > -	find . -type d -ls >/dev/null
> > +	find . -type d >/dev/null
> 
> A more conservative patch would be the following:
> 
> -- snip --
> commit 1680a64fae24b1073dbf1b844889a9953823b7a2
> Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
> Date:   Wed Jul 19 22:13:16 2017 +0200
> 
>     t7063: when running under BusyBox, avoid unsupported find option
> 
>     BusyBox' find implementation does not understand the -ls option, so
>     let's not use it when we're running inside BusyBox.

Yes, this patch is a conservative patch.
In v1, I went with "-exec ls -dils {} +".

And, Jeff worries about a lot of flags passed to ls may run into
compatibility issue.

> 
>     Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> diff --git a/t/t7063-status-untracked-cache.sh
> b/t/t7063-status-untracked-cache.sh
> index 190ae149cf3c..ab7e8b5fea01 100755
> --- a/t/t7063-status-untracked-cache.sh
> +++ b/t/t7063-status-untracked-cache.sh
> @@ -18,7 +18,12 @@ GIT_FORCE_UNTRACKED_CACHE=true
>  export GIT_FORCE_UNTRACKED_CACHE
> 
>  sync_mtime () {
> -       find . -type d -ls >/dev/null
> +       if test_have_prereq BUSYBOX
> +       then
> +               find . -type d -print0 | xargs -0r ls -ld >/dev/null

Can we just change back to what Duy proposed time ago:

	find . -type d -exec ls -ld {} \;

> +       else
> +               find . -type d -ls >/dev/null
> +       fi
>  }
> 
>  avoid_racy() {
> -- snap --
> 
> I have this in Git for Windows' fork, although I have to admit that there
> is no CI set up to verify that this is all working as I expect it to.

I'd find some time (later) to setup a Travis build with Alpine and VoidLinux,
mainly for musl check.
Junio C Hamano March 23, 2020, 3:55 p.m. UTC | #4
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> diff --git a/t/t7063-status-untracked-cache.sh
> b/t/t7063-status-untracked-cache.sh
> index 190ae149cf3c..ab7e8b5fea01 100755
> --- a/t/t7063-status-untracked-cache.sh
> +++ b/t/t7063-status-untracked-cache.sh
> @@ -18,7 +18,12 @@ GIT_FORCE_UNTRACKED_CACHE=true
>  export GIT_FORCE_UNTRACKED_CACHE
>
>  sync_mtime () {
> -       find . -type d -ls >/dev/null
> +       if test_have_prereq BUSYBOX
> +       then
> +               find . -type d -print0 | xargs -0r ls -ld >/dev/null
> +       else
> +               find . -type d -ls >/dev/null
> +       fi

Given that this is only to work around the lazy mtime sync found on
BSDs, if we were to have any if/then/else, shouldn't we be rather
doing

	if test_have_prereq NEEDS_SYNC_MTIME_BECAUSE_WE_ARE_ON_BSD
	then
		find . -type d -ls >/dev/null
	fi

to make it a no-op for most everybody (including Windows)?
Junio C Hamano March 23, 2020, 8:30 p.m. UTC | #5
Danh Doan <congdanhqx@gmail.com> writes:

> Can we just change back to what Duy proposed time ago:
>
> 	find . -type d -exec ls -ld {} \;

Yup, that sounds just as conservative as it can get.  On platforms
that are not FreeBSD, anything done in this helper function is an
expensive noop, but provided that the working tree is small enough
and the overhead would not matter, something very simple like the
above sounds like the way to go.  If the overhead is too much, on
the other hand, then we can do it only conditionally on certain
filesystems where the issue exists.

> I'd find some time (later) to setup a Travis build with Alpine and VoidLinux,
> mainly for musl check.

Thanks.
Johannes Schindelin March 24, 2020, 10:31 p.m. UTC | #6
Hi Junio,

On Mon, 23 Mar 2020, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > diff --git a/t/t7063-status-untracked-cache.sh
> > b/t/t7063-status-untracked-cache.sh
> > index 190ae149cf3c..ab7e8b5fea01 100755
> > --- a/t/t7063-status-untracked-cache.sh
> > +++ b/t/t7063-status-untracked-cache.sh
> > @@ -18,7 +18,12 @@ GIT_FORCE_UNTRACKED_CACHE=true
> >  export GIT_FORCE_UNTRACKED_CACHE
> >
> >  sync_mtime () {
> > -       find . -type d -ls >/dev/null
> > +       if test_have_prereq BUSYBOX
> > +       then
> > +               find . -type d -print0 | xargs -0r ls -ld >/dev/null
> > +       else
> > +               find . -type d -ls >/dev/null
> > +       fi
>
> Given that this is only to work around the lazy mtime sync found on
> BSDs, if we were to have any if/then/else, shouldn't we be rather
> doing
>
> 	if test_have_prereq NEEDS_SYNC_MTIME_BECAUSE_WE_ARE_ON_BSD
> 	then
> 		find . -type d -ls >/dev/null
> 	fi
>
> to make it a no-op for most everybody (including Windows)?

I like that approach.

Thank you,
Dscho
Junio C Hamano March 24, 2020, 11:46 p.m. UTC | #7
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Junio,
>
> On Mon, 23 Mar 2020, Junio C Hamano wrote:
>
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>>
>> > diff --git a/t/t7063-status-untracked-cache.sh
>> > b/t/t7063-status-untracked-cache.sh
>> > index 190ae149cf3c..ab7e8b5fea01 100755
>> > --- a/t/t7063-status-untracked-cache.sh
>> > +++ b/t/t7063-status-untracked-cache.sh
>> > @@ -18,7 +18,12 @@ GIT_FORCE_UNTRACKED_CACHE=true
>> >  export GIT_FORCE_UNTRACKED_CACHE
>> >
>> >  sync_mtime () {
>> > -       find . -type d -ls >/dev/null
>> > +       if test_have_prereq BUSYBOX
>> > +       then
>> > +               find . -type d -print0 | xargs -0r ls -ld >/dev/null
>> > +       else
>> > +               find . -type d -ls >/dev/null
>> > +       fi
>>
>> Given that this is only to work around the lazy mtime sync found on
>> BSDs, if we were to have any if/then/else, shouldn't we be rather
>> doing
>>
>> 	if test_have_prereq NEEDS_SYNC_MTIME_BECAUSE_WE_ARE_ON_BSD
>> 	then
>> 		find . -type d -ls >/dev/null
>> 	fi
>>
>> to make it a no-op for most everybody (including Windows)?
>
> I like that approach.

I actually think I was being half stupid --- what are we going to
tell those who want a stripped down system with busybox based on
BSD?  I think the condition for "if" is OK (as long as we have a
real prereq suitable for the approach), but the actual "let's make
sure the inodes are stat(2)ed" should avoid relying on "find -ls".

If we can assume that a typical busybox build offers find with -print0
and xargs with -0, I think the version you sent is probably the best
(with or without "do so only on BSD" I suggested) approach.

Thanks.

Patch
diff mbox series

diff --git a/t/t7063-status-untracked-cache.sh b/t/t7063-status-untracked-cache.sh
index 190ae149cf..6791c6b95a 100755
--- a/t/t7063-status-untracked-cache.sh
+++ b/t/t7063-status-untracked-cache.sh
@@ -18,7 +18,7 @@  GIT_FORCE_UNTRACKED_CACHE=true
 export GIT_FORCE_UNTRACKED_CACHE
 
 sync_mtime () {
-	find . -type d -ls >/dev/null
+	find . -type d >/dev/null
 }
 
 avoid_racy() {