diff mbox series

[GSoC] t/t5000-tar-tree: add helper function

Message ID 20230202202557.19297-1-kostya.farber@gmail.com (mailing list archive)
State New, archived
Headers show
Series [GSoC] t/t5000-tar-tree: add helper function | expand

Commit Message

Kostya Farber Feb. 2, 2023, 8:25 p.m. UTC
Add the helper function test_file_path_exists to the
interpret pax header test. This change makes it clearer
as to what the test is trying to check, in this case whether
a file path exists.

Signed-off-by: Kostya Farber <kostya.farber@gmail.com>
---
Hello all. I'd like to introduce myself. My name is 
Kostya, I am new to the git open source community.

I am a data engineer by trade, but have recently gotten
into (and am really enjoying) the open source world.
My main contributions have been in the scientific 
python space (https://github.com/kostyafarber).

I have read MyFirstContribution, documentation in 
t/README and the general GSoC page. I have really 
started to enjoy low-level programming and have
been going through The C Programming Language (Second
Edition) and the Linux Programming Interface.

I am planning to submit a proposal for GSoC (hopefully) and 
want to try help out and contribute to a tool I've been using
for a long time and have love for.

Keen to speak to you all soon. 

 t/t5000-tar-tree.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Junio C Hamano Feb. 2, 2023, 10:36 p.m. UTC | #1
Kostya Farber <kostya.farber@gmail.com> writes:

> Add the helper function test_file_path_exists to the
> interpret pax header test. This change makes it clearer
> as to what the test is trying to check, in this case whether
> a file path exists.

Really?  

The code with "test -e" is already clear that it is checking if the
path $data exists.  This change does not make it any clearer.  What
it helps is that it gives a message upon failure, when the test is
run with the "-v" option.

> diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
> index d473048138..19d5bd0c04 100755
> --- a/t/t5000-tar-tree.sh
> +++ b/t/t5000-tar-tree.sh
> @@ -73,7 +73,7 @@ check_tar() {
>  			for header in *.paxheader
>  			do
>  				data=${header%.paxheader}.data &&
> -				if test -h $data || test -e $data
> +				if test -h $data || test_file_path_exists $data
>  				then
>  					path=$(get_pax_header $header path) &&
>  					if test -n "$path"

Nothing seems to be adding a new helper whose name is
test_file_path_exists; the patch expects such a helper already
exists and uses it in place for existing "test -e".

Perhaps you meant to say "use test_path_exists" not "add helper" on
the title, and use that function in the patch instead?

Thanks.
Eric Sunshine Feb. 2, 2023, 11:09 p.m. UTC | #2
On Thu, Feb 2, 2023 at 5:37 PM Junio C Hamano <gitster@pobox.com> wrote:
> Kostya Farber <kostya.farber@gmail.com> writes:
> > Add the helper function test_file_path_exists to the
> > interpret pax header test. This change makes it clearer
> > as to what the test is trying to check, in this case whether
> > a file path exists.
> >
> > -                             if test -h $data || test -e $data
> > +                             if test -h $data || test_file_path_exists $data
>
> Nothing seems to be adding a new helper whose name is
> test_file_path_exists; the patch expects such a helper already
> exists and uses it in place for existing "test -e".
>
> Perhaps you meant to say "use test_path_exists" not "add helper" on
> the title, and use that function in the patch instead?

A couple comments...

The test framework does not define a function named
"test_file_path_exists". Probably "test_path_exists" was intended.

Delving more deeply, though, this change seems undesirable from a
clarity viewpoint. The function "test_path_exists" is an assertion;
its purpose is to make the test fail if the path is expected to exist
but doesn't. However, in the original code from t5000:

    if test -h $data || test -e $data
    then
        path=$(get_pax_header $header path) &&
        if test -n "$path"
        then
            mv "$data" "$path" || exit 1
        fi
    fi

it is perfectly fine if the path is neither a symbolic link nor an
actual file; that's not considered an error. Therefore, using an
assertion function -- which suggests test failure -- muddles the
intent of the code rather than clarifying it.

Additionally, t/test-lib-functions.sh also defines the function
"test_path_is_symlink" which would seem to be the obvious complement
to "test_path_exists", thus one might have expected the patch to
change the code to:

    if test_path_is_symlink $data || test_path_exists $data
    then
        ...

however, "test_path_is_symlink" is also an assertion, thus not really
suitable for this case in which it is acceptable (not an error) if
neither condition holds true.

So, t5000 seems to be one of those relatively rare cases in which the
raw "test" command is more correct than the higher-level helper
functions.
Eric Sunshine Feb. 2, 2023, 11:19 p.m. UTC | #3
On Thu, Feb 2, 2023 at 6:09 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> So, t5000 seems to be one of those relatively rare cases in which the
> raw "test" command is more correct than the higher-level helper
> functions.

By the way, although the change made by this patch is probably
undesirable, if you would like to try a different submission, there is
a bit of modernization that could be applied to t5000. In particular,
the "archive and :(glob)" test does not match present-day style
guidelines:

        git archive -v HEAD -- ":(glob)**/sh" >/dev/null 2>actual &&
        cat >expect <<EOF &&
    a/
    a/bin/
    a/bin/sh
    EOF
        test_cmp expect actual

These days, we would use `<<-EOF` rather than `<<EOF` to allow the
here-doc body to be indented to the same depth as the `cat` command
itself. Furthermore, we would use `\EOF` to indicate to the reader
that no interpolation is expected within the body. Taken together, the
final result would be:

    git archive -v HEAD -- ":(glob)**/sh" >/dev/null 2>actual &&
    cat >expect <<-\EOF &&
    a/
    a/bin/
    a/bin/sh
    EOF
    test_cmp expect actual

Both cleanups can be done via a single patch; the commit message
should mention both modernizations.
Kostya Farber Feb. 4, 2023, 3:04 p.m. UTC | #4
On Thu, Feb 2, 2023 at 10:36 PM Junio C Hamano <gitster@pobox.com> wrote:

> Kostya Farber <kostya.farber@gmail.com> writes:
>
> > Add the helper function test_file_path_exists to the
> > interpret pax header test. This change makes it clearer
> > as to what the test is trying to check, in this case whether
> > a file path exists.
>
> Really?
>
> The code with "test -e" is already clear that it is checking if the
> path $data exists.  This change does not make it any clearer.  What
> it helps is that it gives a message upon failure, when the test is
> run with the "-v" option.

Okay, noted. I will look into the helper functions more closely to
understand why and how they useful compared to other methods (i.e test
-e in this instance)

> > diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
> > index d473048138..19d5bd0c04 100755
> > --- a/t/t5000-tar-tree.sh
> > +++ b/t/t5000-tar-tree.sh
> > @@ -73,7 +73,7 @@ check_tar() {
> >                       for header in *.paxheader
> >                       do
> >                               data=${header%.paxheader}.data &&
> > -                             if test -h $data || test -e $data
> > +                             if test -h $data || test_file_path_exists $data
> >                               then
> >                                       path=$(get_pax_header $header path) &&
> >                                       if test -n "$path"
>
> Nothing seems to be adding a new helper whose name is
> test_file_path_exists; the patch expects such a helper already
> exists and uses it in place for existing "test -e".
>
> Perhaps you meant to say "use test_path_exists" not "add helper" on
> the title, and use that function in the patch instead?

Yes you are right. I made a mistake by using the wrong function name
and I think "use test_path_exists" was my intention as a title name.

> Thanks.
Kostya Farber Feb. 4, 2023, 3:12 p.m. UTC | #5
On Thu, Feb 2, 2023 at 11:09 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Thu, Feb 2, 2023 at 5:37 PM Junio C Hamano <gitster@pobox.com> wrote:
> > Kostya Farber <kostya.farber@gmail.com> writes:
> > > Add the helper function test_file_path_exists to the
> > > interpret pax header test. This change makes it clearer
> > > as to what the test is trying to check, in this case whether
> > > a file path exists.
> > >
> > > -                             if test -h $data || test -e $data
> > > +                             if test -h $data || test_file_path_exists $data
> >
> > Nothing seems to be adding a new helper whose name is
> > test_file_path_exists; the patch expects such a helper already
> > exists and uses it in place for existing "test -e".
> >
> > Perhaps you meant to say "use test_path_exists" not "add helper" on
> > the title, and use that function in the patch instead?
>
> A couple comments...
>
> The test framework does not define a function named
> "test_file_path_exists". Probably "test_path_exists" was intended.
>
> Delving more deeply, though, this change seems undesirable from a
> clarity viewpoint. The function "test_path_exists" is an assertion;
> its purpose is to make the test fail if the path is expected to exist
> but doesn't. However, in the original code from t5000:
>
>     if test -h $data || test -e $data
>     then
>         path=$(get_pax_header $header path) &&
>         if test -n "$path"
>         then
>             mv "$data" "$path" || exit 1
>         fi
>     fi
>
> it is perfectly fine if the path is neither a symbolic link nor an
> actual file; that's not considered an error. Therefore, using an
> assertion function -- which suggests test failure -- muddles the
> intent of the code rather than clarifying it.

Okay, that makes sense. Thanks for explaining why this function isn't
desirable in this instance, and why it actually reduces the clarity of
the test.

> Additionally, t/test-lib-functions.sh also defines the function
> "test_path_is_symlink" which would seem to be the obvious complement
> to "test_path_exists", thus one might have expected the patch to
> change the code to:
>
>     if test_path_is_symlink $data || test_path_exists $data
>     then
>         ...
>
> however, "test_path_is_symlink" is also an assertion, thus not really
> suitable for this case in which it is acceptable (not an error) if
> neither condition holds true.
>
> So, t5000 seems to be one of those relatively rare cases in which the
> raw "test" command is more correct than the higher-level helper
> functions.

Yes on first glance it would look like test_path_is_symlink would be
the obvious complement, but as you say I guess it doesn't make sense
in this case for the same reasons as using test_path_exists.
It's a good learning experience for me to understand when, and in
which contexts, to use the helper functions.

Thanks
Kostya Farber Feb. 4, 2023, 3:16 p.m. UTC | #6
On Thu, Feb 2, 2023 at 11:19 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Thu, Feb 2, 2023 at 6:09 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> > So, t5000 seems to be one of those relatively rare cases in which the
> > raw "test" command is more correct than the higher-level helper
> > functions.
>
> By the way, although the change made by this patch is probably
> undesirable, if you would like to try a different submission, there is
> a bit of modernization that could be applied to t5000. In particular,
> the "archive and :(glob)" test does not match present-day style
> guidelines:
>
>         git archive -v HEAD -- ":(glob)**/sh" >/dev/null 2>actual &&
>         cat >expect <<EOF &&
>     a/
>     a/bin/
>     a/bin/sh
>     EOF
>         test_cmp expect actual
>
> These days, we would use `<<-EOF` rather than `<<EOF` to allow the
> here-doc body to be indented to the same depth as the `cat` command
> itself. Furthermore, we would use `\EOF` to indicate to the reader
> that no interpolation is expected within the body. Taken together, the
> final result would be:
>
>     git archive -v HEAD -- ":(glob)**/sh" >/dev/null 2>actual &&
>     cat >expect <<-\EOF &&
>     a/
>     a/bin/
>     a/bin/sh
>     EOF
>     test_cmp expect actual
>
> Both cleanups can be done via a single patch; the commit message
> should mention both modernizations.

I would be happy to help and submit another patch for this test based
on your observations above. Thanks for the suggestion. I am trying to
get used to the development workflow of emailing patches and generally
getting familiar with the code base and this seems like a small but
important step in the right direction.

Thanks
Eric Sunshine Feb. 5, 2023, 5:59 p.m. UTC | #7
On Sat, Feb 4, 2023 at 10:17 AM Kostya Farber <kostya.farber@gmail.com> wrote:
> On Thu, Feb 2, 2023 at 11:19 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> > On Thu, Feb 2, 2023 at 6:09 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> > > So, t5000 seems to be one of those relatively rare cases in which the
> > > raw "test" command is more correct than the higher-level helper
> > > functions.
> >
> > By the way, although the change made by this patch is probably
> > undesirable, if you would like to try a different submission, there is
> > a bit of modernization that could be applied to t5000. [...]
>
> I would be happy to help and submit another patch for this test based
> on your observations above. Thanks for the suggestion. I am trying to
> get used to the development workflow of emailing patches and generally
> getting familiar with the code base and this seems like a small but
> important step in the right direction.

The goal of the microproject isn't so much to get a change accepted
into the project, but rather to get experience with the workflow and
review process. Responding to reviewer comments, as you did, is part
of that process, so you're doing fine.
diff mbox series

Patch

diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index d473048138..19d5bd0c04 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -73,7 +73,7 @@  check_tar() {
 			for header in *.paxheader
 			do
 				data=${header%.paxheader}.data &&
-				if test -h $data || test -e $data
+				if test -h $data || test_file_path_exists $data
 				then
 					path=$(get_pax_header $header path) &&
 					if test -n "$path"