Message ID | cover.1560898723.git.matheus.bernardino@usp.br (mailing list archive) |
---|---|
Headers | show |
Series | clone: dir-iterator refactoring with tests | expand |
On Tue, Jun 18, 2019 at 8:28 PM Matheus Tavares <matheus.bernardino@usp.br> wrote: > > This patchset contains: > - tests to the dir-iterator API; > - dir-iterator refactoring to make its state machine simpler > and feature adding with tests; > - a replacement of explicit recursive dir iteration at > copy_or_link_directory for the dir-iterator API; > - some refactoring and behavior changes at local clone, mainly to > take care of symlinks and hidden files at .git/objects, together > with tests for these types of files. > > Changes since v6: > - Rebased with master; > - Added to dir-iterator documentation that ENOENT errors and hence broken > symlinks are both ignored. > > With the changes brought by this patchset, dir_iterator_begin() may now > return NULL (setting errno) when it finds an error. Also, it's possible > to pass a pedantic flag to it so that dir_iterator_advance() return > immediately on errors. But at refs/files-backend.c, the only user of > the API so far, the flag wasn't used and an empty iterator is > returned in case of errors at dir_iterator_begin(). These actions were > taken in order to keep the files-backend's behavior as close as > possible to the one it previously had. But since it already has guards > for possible errors at dir_iterator_advance(), I'm wondering whether I > should send a follow-up patch making it use the pedantic flag. > > Also, should I perhaps call die_errno() on dir_iterator_begin() errors > at files-backend? I mean, we should continue returning an empty > iterator on ENOENT errors since ".git/logs", the dir it iterates over, > may not be present. But we could possibly abort on other errors, just > to be sure... I got ahead of myself in this last paragraph. ".git/logs" is one of the dirs that files-backend.c is used to iterate over, but it doesn't mean it's the only one. This dir, in particular, is iterated when we run 'git rev-list --reflog', for example. And upon ENOENTs, the iteration is expected to end successfully but with no entries. (also adding Michael and Daniel to CC, in case they have some input on these ideas)
Matheus Tavares <matheus.bernardino@usp.br> writes: > Daniel Ferreira (1): > dir-iterator: add tests for dir-iterator API > > Matheus Tavares (8): > clone: better handle symlinked files at .git/objects/ > dir-iterator: use warning_errno when possible > dir-iterator: refactor state machine model > dir-iterator: add flags parameter to dir_iterator_begin > clone: copy hidden paths at local clone > clone: extract function from copy_or_link_directory > clone: use dir-iterator to avoid explicit dir traversal > clone: replace strcmp by fspathcmp > > Ævar Arnfjörð Bjarmason (1): > clone: test for our behavior on odd objects/* content > > Makefile | 1 + > builtin/clone.c | 75 +++++---- > dir-iterator.c | 289 +++++++++++++++++++++-------------- > dir-iterator.h | 60 ++++++-- > refs/files-backend.c | 17 ++- > t/helper/test-dir-iterator.c | 58 +++++++ > t/helper/test-tool.c | 1 + > t/helper/test-tool.h | 1 + > t/t0066-dir-iterator.sh | 163 ++++++++++++++++++++ > t/t5604-clone-reference.sh | 133 ++++++++++++++++ > 10 files changed, 635 insertions(+), 163 deletions(-) > create mode 100644 t/helper/test-dir-iterator.c > create mode 100755 t/t0066-dir-iterator.sh A higher level question is what's the benefit of using dir-iterator API in the first place. After subtracting 356 added lines to t/, it still adds 279 lines while removing only 163 lines, so it is not like "we have a perfect dir-iterator API that can be applied as-is but an older code that predates dir-iterator API was still using an old way, so let's make the latter use the former."
On Thu, Jun 20, 2019 at 5:18 PM Junio C Hamano <gitster@pobox.com> wrote: > > Matheus Tavares <matheus.bernardino@usp.br> writes: > > > Daniel Ferreira (1): > > dir-iterator: add tests for dir-iterator API > > > > Matheus Tavares (8): > > clone: better handle symlinked files at .git/objects/ > > dir-iterator: use warning_errno when possible > > dir-iterator: refactor state machine model > > dir-iterator: add flags parameter to dir_iterator_begin > > clone: copy hidden paths at local clone > > clone: extract function from copy_or_link_directory > > clone: use dir-iterator to avoid explicit dir traversal > > clone: replace strcmp by fspathcmp > > > > Ævar Arnfjörð Bjarmason (1): > > clone: test for our behavior on odd objects/* content > > > > Makefile | 1 + > > builtin/clone.c | 75 +++++---- > > dir-iterator.c | 289 +++++++++++++++++++++-------------- > > dir-iterator.h | 60 ++++++-- > > refs/files-backend.c | 17 ++- > > t/helper/test-dir-iterator.c | 58 +++++++ > > t/helper/test-tool.c | 1 + > > t/helper/test-tool.h | 1 + > > t/t0066-dir-iterator.sh | 163 ++++++++++++++++++++ > > t/t5604-clone-reference.sh | 133 ++++++++++++++++ > > 10 files changed, 635 insertions(+), 163 deletions(-) > > create mode 100644 t/helper/test-dir-iterator.c > > create mode 100755 t/t0066-dir-iterator.sh > > A higher level question is what's the benefit of using dir-iterator > API in the first place. After subtracting 356 added lines to t/, > it still adds 279 lines while removing only 163 lines, so it is not > like "we have a perfect dir-iterator API that can be applied as-is > but an older code that predates dir-iterator API was still using an > old way, so let's make the latter use the former." > Yes, indeed the dir-iterator API didn't nicely fit in clone without some tweaking. Yet I think most of those line additions were not only to adjust the API, but also trying to improve both dir-iterator and local clone (I should have maybe split those changes into other patchsets, though). For example, these changes make local clone better handle possible symlinks and hidden files at git dir. And the API changes should make it easier to apply it as-is in other sections of the codebase from now on. As for the benefit of using the API here, I think it mainly resides in the security it brings, avoiding recursive iteration (even though it should be shallow in local clone) and more carefully handling symlinks.