Message ID | 20190609044907.32477-3-chriscool@tuxfamily.org (mailing list archive) |
---|---|
State | New, archived |
Series | Test oidmap |
On Sun, Jun 09, 2019 at 06:49:06AM +0200, Christian Couder wrote:
> From: Christian Couder <christian.couder@gmail.com>
>
> Add actual tests for operations using `struct oidmap` from oidmap.{c,h}.
>
> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
>  t/t0016-oidmap.sh | 100 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 100 insertions(+)
>  create mode 100755 t/t0016-oidmap.sh
>
> diff --git a/t/t0016-oidmap.sh b/t/t0016-oidmap.sh
> new file mode 100755
> index 0000000000..3a8e8bdb3d
> --- /dev/null
> +++ b/t/t0016-oidmap.sh
> @@ -0,0 +1,100 @@
> +#!/bin/sh
> +
> +test_description='test oidmap'
> +. ./test-lib.sh
> +
> +# This purposefully is very similar to t0011-hashmap.sh
> +
> +test_oidmap() {
> +	echo "$1" | test-tool oidmap $3 > actual &&
> +	echo "$2" > expect &&

Style nit: space between redirection op and filename.

> +	test_cmp expect actual
> +}
> +
> +
> +test_expect_success 'setup' '
> +
> +	test_commit one &&
> +	test_commit two &&
> +	test_commit three &&
> +	test_commit four
> +
> +'
> +
> +test_oidhash() {
> +	git rev-parse "$1" | perl -ne 'print hex("$4$3$2$1") . "\n" if m/^(..)(..)(..)(..).*/;'

New Perl dependencies always make Dscho sad... :)

So, 'test oidmap' from the previous patch prints the value we want to
check with:

	printf("%u\n", sha1hash(oid.hash));

First, since object ids inherently make more sense as hex values, it
would be more appropriate to print that hash with the '%x' format
specifier, and then we wouldn't need Perl's hex() anymore, and thus
could swap the order of the first four bytes in oidmap's hash without
relying on Perl, e.g. with:

	sed -e 's/^\(..\)\(..\)\(..\)\(..\).*/\4\3\2\1/'

Second, and more importantly, the need for swapping the byte order
indicates that this test would fail on big-endian systems, I'm afraid.
So I think we need an additional bswap32() on the printing side, and
then could further simplify 'test_oidhash':

diff --git a/t/helper/test-oidmap.c b/t/helper/test-oidmap.c
index 0ba122a264..4177912f9a 100644
--- a/t/helper/test-oidmap.c
+++ b/t/helper/test-oidmap.c
@@ -51,7 +51,7 @@ int cmd__oidmap(int argc, const char **argv)
 
 			/* print hash of oid */
 			if (!get_oid(p1, &oid))
-				printf("%u\n", sha1hash(oid.hash));
+				printf("%x\n", bswap32(sha1hash(oid.hash)));
 			else
 				printf("Unknown oid: %s\n", p1);
 
diff --git a/t/t0016-oidmap.sh b/t/t0016-oidmap.sh
index 3a8e8bdb3d..9c0d88a316 100755
--- a/t/t0016-oidmap.sh
+++ b/t/t0016-oidmap.sh
@@ -22,10 +22,10 @@ test_expect_success 'setup' '
 '
 
 test_oidhash() {
-	git rev-parse "$1" | perl -ne 'print hex("$4$3$2$1") . "\n" if m/^(..)(..)(..)(..).*/;'
+	git rev-parse "$1" | cut -c1-8
 }
 
-test_expect_success PERL 'hash' '
+test_expect_success 'hash' '
 
 test_oidmap "hash one
 hash two

> +}
> +
> +test_expect_success PERL 'hash' '
> +
> +test_oidmap "hash one
> +hash two
> +hash invalidOid
> +hash three" "$(test_oidhash one)
> +$(test_oidhash two)
> +Unknown oid: invalidOid
> +$(test_oidhash three)"
> +
> +'
> +
> +test_expect_success 'put' '
> +
> +test_oidmap "put one 1
> +put two 2
> +put invalidOid 4
> +put three 3" "NULL
> +NULL
> +Unknown oid: invalidOid
> +NULL"
> +
> +'
> +
> +test_expect_success 'replace' '
> +
> +test_oidmap "put one 1
> +put two 2
> +put three 3
> +put invalidOid 4
> +put two deux
> +put one un" "NULL
> +NULL
> +NULL
> +Unknown oid: invalidOid
> +2
> +1"
> +
> +'
> +
> +test_expect_success 'get' '
> +
> +test_oidmap "put one 1
> +put two 2
> +put three 3
> +get two
> +get four
> +get invalidOid
> +get one" "NULL
> +NULL
> +NULL
> +2
> +NULL
> +Unknown oid: invalidOid
> +1"
> +
> +'
> +
> +test_expect_success 'iterate' '
> +
> +test_oidmap "put one 1
> +put two 2
> +put three 3
> +iterate" "NULL
> +NULL
> +NULL
> +$(git rev-parse two) 2
> +$(git rev-parse one) 1
> +$(git rev-parse three) 3"
> +
> +'
> +
> +test_done
> --
> 2.22.0.14.g9023ccb50a
>

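To make the two pipelines from the review above concrete, here is a minimal sketch of what the `sed` byte swap and the proposed `cut -c1-8` each print. It assumes a made-up 40-character object id rather than the output of `git rev-parse`, and the function names `swap_first_four_bytes` and `first_four_bytes` are illustrative only; neither appears in the patch.

```shell
#!/bin/sh
# Hypothetical object id, used only for illustration.
oid=a1b2c3d4e5f60718293a4b5c6d7e8f9012345678

# Reverse the order of the first four bytes (eight hex digits) and drop
# the rest of the line, as the sed expression from the review does.
swap_first_four_bytes () {
	printf '%s\n' "$1" | sed -e 's/^\(..\)\(..\)\(..\)\(..\).*/\4\3\2\1/'
}

# Keep the first four bytes in the order they appear in the object id.
first_four_bytes () {
	printf '%s\n' "$1" | cut -c1-8
}

swap_first_four_bytes "$oid"	# prints d4c3b2a1
first_four_bytes "$oid"		# prints a1b2c3d4
```

The second form needs no byte shuffling in the test script, which is why the review pairs it with a byte swap on the printing side of the helper.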
On Sun, Jun 9, 2019 at 11:23 AM SZEDER Gábor <szeder.dev@gmail.com> wrote:
>
> On Sun, Jun 09, 2019 at 06:49:06AM +0200, Christian Couder wrote:

> > +
> > +test_oidmap() {
> > +	echo "$1" | test-tool oidmap $3 > actual &&
> > +	echo "$2" > expect &&
>
> Style nit: space between redirection op and filename.

Thanks for spotting this. It's fixed in my current version.

> > +test_oidhash() {
> > +	git rev-parse "$1" | perl -ne 'print hex("$4$3$2$1") . "\n" if m/^(..)(..)(..)(..).*/;'
>
> New Perl dependencies always make Dscho sad... :)

Yeah, I was not sure how to do it properly in shell so I was hoping I
would get suggestions about this. Thanks for looking at this!

I could have hardcoded the values as it is done in t0011-hashmap.sh,
but I thought it was better to find a function that does the job.

> So, 'test oidmap' from the previous patch prints the value we want to
> check with:
>
>	printf("%u\n", sha1hash(oid.hash));

Yeah, I did it this way because "test-hashmap.c" does the same kind of
thing to print hashes:

	printf("%u %u %u %u\n",
	       strhash(p1), memhash(p1, strlen(p1)),
	       strihash(p1), memihash(p1, strlen(p1)));

> First, since object ids inherently make more sense as hex values, it
> would be more appropriate to print that hash with the '%x' format
> specifier,

I would be ok with that, but then I think it would make sense to also
print hex values in "test-hashmap.c".

> and then we wouldn't need Perl's hex() anymore, and thus
> could swap the order of the first four bytes in oidmap's hash without
> relying on Perl, e.g. with:
>
>	sed -e 's/^\(..\)\(..\)\(..\)\(..\).*/\4\3\2\1/'
>
> Second, and more importantly, the need for swapping the byte order
> indicates that this test would fail on big-endian systems, I'm afraid.
> So I think we need an additional bswap32() on the printing side,

Ok, but then shouldn't we also use bswap32() in "test-hashmap.c"?

By the way it seems that we use ntohl() or htonl() instead of
bswap32() in the source code.

> and then could further simplify 'test_oidhash':
>
> diff --git a/t/helper/test-oidmap.c b/t/helper/test-oidmap.c
> index 0ba122a264..4177912f9a 100644
> --- a/t/helper/test-oidmap.c
> +++ b/t/helper/test-oidmap.c
> @@ -51,7 +51,7 @@ int cmd__oidmap(int argc, const char **argv)
>
>			/* print hash of oid */
>			if (!get_oid(p1, &oid))
> -				printf("%u\n", sha1hash(oid.hash));
> +				printf("%x\n", bswap32(sha1hash(oid.hash)));
>			else
>				printf("Unknown oid: %s\n", p1);
>
> diff --git a/t/t0016-oidmap.sh b/t/t0016-oidmap.sh
> index 3a8e8bdb3d..9c0d88a316 100755
> --- a/t/t0016-oidmap.sh
> +++ b/t/t0016-oidmap.sh
> @@ -22,10 +22,10 @@ test_expect_success 'setup' '
>  '
>
>  test_oidhash() {
> -	git rev-parse "$1" | perl -ne 'print hex("$4$3$2$1") . "\n" if m/^(..)(..)(..)(..).*/;'
> +	git rev-parse "$1" | cut -c1-8
>  }
>
> -test_expect_success PERL 'hash' '
> +test_expect_success 'hash' '

Yeah, I agree that it seems better to me this way.

On Sun, Jun 09, 2019 at 10:24:55PM +0200, Christian Couder wrote:
> On Sun, Jun 9, 2019 at 11:23 AM SZEDER Gábor <szeder.dev@gmail.com> wrote:
> >
> > On Sun, Jun 09, 2019 at 06:49:06AM +0200, Christian Couder wrote:
>
> > > +
> > > +test_oidmap() {
> > > +	echo "$1" | test-tool oidmap $3 > actual &&
> > > +	echo "$2" > expect &&
> >
> > Style nit: space between redirection op and filename.
>
> Thanks for spotting this. It's fixed in my current version.
>
> > > +test_oidhash() {
> > > +	git rev-parse "$1" | perl -ne 'print hex("$4$3$2$1") . "\n" if m/^(..)(..)(..)(..).*/;'
> >
> > New Perl dependencies always make Dscho sad... :)
>
> Yeah, I was not sure how to do it properly in shell so I was hoping I
> would get suggestions about this. Thanks for looking at this!
>
> I could have hardcoded the values as it is done in t0011-hashmap.sh,
> but I thought it was better to find a function that does the job.

Well, I'm fine with hardcoding the expected hash values (in network
byte order) as well, because then we won't add another git process
upstream of a pipe that would pop up during audit later...

> > So, 'test oidmap' from the previous patch prints the value we want to
> > check with:
> >
> >	printf("%u\n", sha1hash(oid.hash));
>
> Yeah, I did it this way because "test-hashmap.c" does the same kind of
> thing to print hashes:
>
>	printf("%u %u %u %u\n",
>	       strhash(p1), memhash(p1, strlen(p1)),
>	       strihash(p1), memihash(p1, strlen(p1)));
>
> > First, since object ids inherently make more sense as hex values, it
> > would be more appropriate to print that hash with the '%x' format
> > specifier,
>
> I would be ok with that, but then I think it would make sense to also
> print hex values in "test-hashmap.c".
>
> > and then we wouldn't need Perl's hex() anymore, and thus
> > could swap the order of the first four bytes in oidmap's hash without
> > relying on Perl, e.g. with:
> >
> >	sed -e 's/^\(..\)\(..\)\(..\)\(..\).*/\4\3\2\1/'
> >
> > Second, and more importantly, the need for swapping the byte order
> > indicates that this test would fail on big-endian systems, I'm afraid.
> > So I think we need an additional bswap32() on the printing side,
>
> Ok, but then shouldn't we also use bswap32() in "test-hashmap.c"?

No. The two test scripts/helpers work with different hashes. t0011
and 'test-hashmap.c' use the various FNV-1-based hash functions
(strhash(), memhash(), ...) to calculate an unsigned int hash of the
items stored in the hashmap, therefore their hashes will be the same
regardless of endianness. In an oidmap, however, the hash is simply
the first four bytes of the object id as an unsigned int as is, and
look at how sha1hash() does it, and indeed at the last sentence of the
comment in front of it:

	 * [...] Note that
	 * the results will be different on big-endian and little-endian
	 * platforms, so they should not be stored or transferred over the net.
	 */
	static inline unsigned int sha1hash(const unsigned char *sha1)
	{
		/*
		 * Equivalent to 'return *(unsigned int *)sha1;', but safe on
		 * platforms that don't support unaligned reads.
		 */
		unsigned int hash;
		memcpy(&hash, sha1, sizeof(hash));
		return hash;
	}

> By the way it seems that we use ntohl() or htonl() instead of
> bswap32() in the source code.

OK.

> > and then could further simplify 'test_oidhash':
> >
> > diff --git a/t/helper/test-oidmap.c b/t/helper/test-oidmap.c
> > index 0ba122a264..4177912f9a 100644
> > --- a/t/helper/test-oidmap.c
> > +++ b/t/helper/test-oidmap.c
> > @@ -51,7 +51,7 @@ int cmd__oidmap(int argc, const char **argv)
> >
> >			/* print hash of oid */
> >			if (!get_oid(p1, &oid))
> > -				printf("%u\n", sha1hash(oid.hash));
> > +				printf("%x\n", bswap32(sha1hash(oid.hash)));
> >			else
> >				printf("Unknown oid: %s\n", p1);
> >
> > diff --git a/t/t0016-oidmap.sh b/t/t0016-oidmap.sh
> > index 3a8e8bdb3d..9c0d88a316 100755
> > --- a/t/t0016-oidmap.sh
> > +++ b/t/t0016-oidmap.sh
> > @@ -22,10 +22,10 @@ test_expect_success 'setup' '
> >  '
> >
> >  test_oidhash() {
> > -	git rev-parse "$1" | perl -ne 'print hex("$4$3$2$1") . "\n" if m/^(..)(..)(..)(..).*/;'
> > +	git rev-parse "$1" | cut -c1-8
> >  }
> >
> > -test_expect_success PERL 'hash' '
> > +test_expect_success 'hash' '
>
> Yeah, I agree that it seems better to me this way.

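As a rough illustration of the endianness point above (this is not code from Git): the sketch below takes the first four bytes of a made-up object id and computes the unsigned value that a memcpy()-style read of those bytes would yield on a big-endian and on a little-endian host. The helper names `hash_big_endian` and `hash_little_endian` are hypothetical, and the sketch assumes a `printf` that accepts `0x`-prefixed arguments for `%u`, as POSIX describes.

```shell
#!/bin/sh
# First four bytes (eight hex digits) of a made-up object id.
prefix=a1b2c3d4

# Big-endian host: the bytes are read most significant first, so the
# value is the hex prefix taken as written.
hash_big_endian () {
	printf '%u\n' "0x$1"
}

# Little-endian host: the same bytes are read least significant first,
# so reverse their order before converting.
hash_little_endian () {
	swapped=$(printf '%s\n' "$1" |
		sed -e 's/^\(..\)\(..\)\(..\)\(..\)$/\4\3\2\1/')
	printf '%u\n' "0x$swapped"
}

hash_big_endian "$prefix"	# prints 2712847316
hash_little_endian "$prefix"	# prints 3569595041
```

The little-endian value is the one the Perl `hex("$4$3$2$1")` expression in the original patch reproduces, which is why the test as posted would only be expected to pass on little-endian machines.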
On Sun, Jun 9, 2019 at 11:21 PM SZEDER Gábor <szeder.dev@gmail.com> wrote:
>
> On Sun, Jun 09, 2019 at 10:24:55PM +0200, Christian Couder wrote:
> > On Sun, Jun 9, 2019 at 11:23 AM SZEDER Gábor <szeder.dev@gmail.com> wrote:
> > >
> > > New Perl dependencies always make Dscho sad... :)
> >
> > Yeah, I was not sure how to do it properly in shell so I was hoping I
> > would get suggestions about this. Thanks for looking at this!
> >
> > I could have hardcoded the values as it is done in t0011-hashmap.sh,
> > but I thought it was better to find a function that does the job.
>
> Well, I'm fine with hardcoding the expected hash values (in network
> byte order) as well, because then we won't add another git process
> upstream of a pipe that would pop up during audit later...

Ok, I think I will do that then.

> > > So, 'test oidmap' from the previous patch prints the value we want to
> > > check with:
> > >
> > >	printf("%u\n", sha1hash(oid.hash));
> >
> > Yeah, I did it this way because "test-hashmap.c" does the same kind of
> > thing to print hashes:
> >
> >	printf("%u %u %u %u\n",
> >	       strhash(p1), memhash(p1, strlen(p1)),
> >	       strihash(p1), memihash(p1, strlen(p1)));
> >
> > > First, since object ids inherently make more sense as hex values, it
> > > would be more appropriate to print that hash with the '%x' format
> > > specifier,
> >
> > I would be ok with that, but then I think it would make sense to also
> > print hex values in "test-hashmap.c".
> >
> > > and then we wouldn't need Perl's hex() anymore, and thus
> > > could swap the order of the first four bytes in oidmap's hash without
> > > relying on Perl, e.g. with:
> > >
> > >	sed -e 's/^\(..\)\(..\)\(..\)\(..\).*/\4\3\2\1/'
> > >
> > > Second, and more importantly, the need for swapping the byte order
> > > indicates that this test would fail on big-endian systems, I'm afraid.
> > > So I think we need an additional bswap32() on the printing side,
> >
> > Ok, but then shouldn't we also use bswap32() in "test-hashmap.c"?
>
> No. The two test scripts/helpers work with different hashes. t0011
> and 'test-hashmap.c' uses the various FNV-1-based hash functions
> (strhash(), memhash(), ...) to calculate an unsigned int hash of the
> items stored in the hashmap, therefore their hashes will be the same
> regardless of endianness.

I see. Thanks for explaining that.

> In an oidmap, however, the hash is simply
> the first four bytes of the object id as an unsigned int as is,

Yeah, I had realized that.

Thanks,
Christian.

SZEDER Gábor <szeder.dev@gmail.com> writes:

> So, 'test oidmap' from the previous patch prints the value we want to
> check with:
>
>	printf("%u\n", sha1hash(oid.hash));
>
> First, since object ids inherently make more sense as hex values, it
> would be more appropriate to print that hash with the '%x' format
> specifier, and then we wouldn't need Perl's hex() anymore, and thus
> could swap the order of the first four bytes in oidmap's hash without
> relying on Perl, e.g. with:
>
>	sed -e 's/^\(..\)\(..\)\(..\)\(..\).*/\4\3\2\1/'
>
> Second, and more importantly, the need for swapping the byte order
> indicates that this test would fail on big-endian systems, I'm afraid.
> So I think we need an additional bswap32() on the printing side, and
> then could further simplify 'test_oidhash':

Yup, if we are doing an ad-hoc t/helper/ command, we should strive
to make it help the driving scripts around it to become simpler, and
your suggestion to do s/%u/%x/ is a good example of doing so.

Thanks for a dose of sanity. The goal of the series may be
worthwhile, and helping hands in improving its execution is very
much appreciated.

On Sun, Jun 09, 2019 at 11:22:59AM +0200, SZEDER Gábor wrote:

> So, 'test oidmap' from the previous patch prints the value we want to
> check with:
>
>	printf("%u\n", sha1hash(oid.hash));
>
> First, since object ids inherently make more sense as hex values, it
> would be more appropriate to print that hash with the '%x' format
> specifier, and then we wouldn't need Perl's hex() anymore, and thus
> could swap the order of the first four bytes in oidmap's hash without
> relying on Perl, e.g. with:
>
>	sed -e 's/^\(..\)\(..\)\(..\)\(..\).*/\4\3\2\1/'
>
> Second, and more importantly, the need for swapping the byte order
> indicates that this test would fail on big-endian systems, I'm afraid.
> So I think we need an additional bswap32() on the printing side, and
> then could further simplify 'test_oidhash':

I agree with all your points about using hex and pushing the logic into
test-oidmap.c. BUT.

At the point where we are normalizing byte order of the hashes, I have
to wonder: why do we care about testing the hash value in the first
place? We care that oidmap can store and retrieve values, and that it
performs well. But as long as it does those things, I don't think
anybody cares if it uses the first 4 bytes of the sha1 or the last 4.

I know there are testing philosophies that go to this level of
white-box testing, but I don't think we usually do in Git. A unit
test of oidmap's externally visible behavior seems like the right
level to me.

-Peff

On Thu, Jun 13, 2019 at 01:19:13PM -0400, Jeff King wrote:
> On Sun, Jun 09, 2019 at 11:22:59AM +0200, SZEDER Gábor wrote:
>
> > So, 'test oidmap' from the previous patch prints the value we want to
> > check with:
> >
> >	printf("%u\n", sha1hash(oid.hash));
> >
> > First, since object ids inherently make more sense as hex values, it
> > would be more appropriate to print that hash with the '%x' format
> > specifier, and then we wouldn't need Perl's hex() anymore, and thus
> > could swap the order of the first four bytes in oidmap's hash without
> > relying on Perl, e.g. with:
> >
> >	sed -e 's/^\(..\)\(..\)\(..\)\(..\).*/\4\3\2\1/'
> >
> > Second, and more importantly, the need for swapping the byte order
> > indicates that this test would fail on big-endian systems, I'm afraid.
> > So I think we need an additional bswap32() on the printing side, and
> > then could further simplify 'test_oidhash':
>
> I agree with all your points about using hex and pushing the logic into
> test-oidmap.c. BUT.
>
> At the point where we are normalizing byte order of the hashes, I have
> to wonder: why do we care about testing the hash value in the first
> place? We care that oidmap can store and retrieve values, and that it
> performs well. But as long as it does those things, I don't think
> anybody cares if it uses the first 4 bytes of the sha1 or the last 4.
>
> I know there are testing philosophies that go to this level of
> white-box testing, but I don't think we usually do in Git. A unit
> test of oidmap's externally visible behavior seems like the right
> level to me.

That's a good point... but then why does 't0011-hashmap.sh' do it in
the first place? As far as I understood this t0016 mainly follows
suit of t0011.

On Thu, Jun 13, 2019 at 07:52:36PM +0200, SZEDER Gábor wrote:

> > At the point where we are normalizing byte order of the hashes, I have
> > to wonder: why do we care about testing the hash value in the first
> > place? We care that oidmap can store and retrieve values, and that it
> > performs well. But as long as it does those things, I don't think
> > anybody cares if it uses the first 4 bytes of the sha1 or the last 4.
> >
> > I know there are testing philosophies that go to this level of
> > white-box testing, but I don't think we usually do in Git. A unit
> > test of oidmap's externally visible behavior seems like the right
> > level to me.
>
> That's a good point... but then why does 't0011-hashmap.sh' do it in
> the first place? As far as I understood this t0016 mainly follows
> suit of t0011.

I'd make the same argument against t0011. :)

I think there it at least made a little more sense because we truly are
hashing ourselves, rather than just copying out some sha1 bytes. But I
think I'd still argue that if I updated strhash() to use a different
hash, I should not have to be updating t0011 to change out the hashes.

-Peff

Jeff King <peff@peff.net> writes:

>> > I know there are testing philosophies that go to this level of
>> > white-box testing, but I don't think we usually do in Git. A unit
>> > test of oidmap's externally visible behavior seems like the right
>> > level to me.
>>
>> That's a good point... but then why does 't0011-hashmap.sh' do it in
>> the first place? As far as I understood this t0016 mainly follows
>> suit of t0011.
>
> I'd make the same argument against t0011. :)

Yeah, I tend to agree. It is not a good excuse that somebody else
already has made a mistake.

> I think there it at least made a little more sense because we truly are
> hashing ourselves, rather than just copying out some sha1 bytes. But I
> think I'd still argue that if I updated strhash() to use a different
> hash, I should not have to be updating t0011 to change out the hashes.

True, too.

On Fri, Jun 14, 2019 at 12:22 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jeff King <peff@peff.net> writes:
>
> >> > I know there are testing philosophies that go to this level of
> >> > white-box testing, but I don't think we usually do in Git. A unit
> >> > test of oidmap's externally visible behavior seems like the right
> >> > level to me.
> >>
> >> That's a good point... but then why does 't0011-hashmap.sh' do it in
> >> the first place? As far as I understood this t0016 mainly follows
> >> suit of t0011.
> >
> > I'd make the same argument against t0011. :)
>
> Yeah, I tend to agree. It is not a good excuse that somebody else
> already has made a mistake.

Ok, I will remove the "hash" test in t0016 and the corresponding code
in test-oidmap.c.

> > I think there it at least made a little more sense because we truly are
> > hashing ourselves, rather than just copying out some sha1 bytes. But I
> > think I'd still argue that if I updated strhash() to use a different
> > hash, I should not have to be updating t0011 to change out the hashes.
>
> True, too.

I will also send an additional patch to remove similar code in t0011
and test-hashmap.c.

diff --git a/t/t0016-oidmap.sh b/t/t0016-oidmap.sh
new file mode 100755
index 0000000000..3a8e8bdb3d
--- /dev/null
+++ b/t/t0016-oidmap.sh
@@ -0,0 +1,100 @@
+#!/bin/sh
+
+test_description='test oidmap'
+. ./test-lib.sh
+
+# This purposefully is very similar to t0011-hashmap.sh
+
+test_oidmap() {
+	echo "$1" | test-tool oidmap $3 > actual &&
+	echo "$2" > expect &&
+	test_cmp expect actual
+}
+
+
+test_expect_success 'setup' '
+
+	test_commit one &&
+	test_commit two &&
+	test_commit three &&
+	test_commit four
+
+'
+
+test_oidhash() {
+	git rev-parse "$1" | perl -ne 'print hex("$4$3$2$1") . "\n" if m/^(..)(..)(..)(..).*/;'
+}
+
+test_expect_success PERL 'hash' '
+
+test_oidmap "hash one
+hash two
+hash invalidOid
+hash three" "$(test_oidhash one)
+$(test_oidhash two)
+Unknown oid: invalidOid
+$(test_oidhash three)"
+
+'
+
+test_expect_success 'put' '
+
+test_oidmap "put one 1
+put two 2
+put invalidOid 4
+put three 3" "NULL
+NULL
+Unknown oid: invalidOid
+NULL"
+
+'
+
+test_expect_success 'replace' '
+
+test_oidmap "put one 1
+put two 2
+put three 3
+put invalidOid 4
+put two deux
+put one un" "NULL
+NULL
+NULL
+Unknown oid: invalidOid
+2
+1"
+
+'
+
+test_expect_success 'get' '
+
+test_oidmap "put one 1
+put two 2
+put three 3
+get two
+get four
+get invalidOid
+get one" "NULL
+NULL
+NULL
+2
+NULL
+Unknown oid: invalidOid
+1"
+
+'
+
+test_expect_success 'iterate' '
+
+test_oidmap "put one 1
+put two 2
+put three 3
+iterate" "NULL
+NULL
+NULL
+$(git rev-parse two) 2
+$(git rev-parse one) 1
+$(git rev-parse three) 3"
+
+'
+
+test_done