Message ID | 20190213205554.4086-1-matheus.bernardino@usp.br (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [GSoC,RFC] clone: use dir-iterator to avoid explicit dir traversal | expand |
On Thu, Feb 14, 2019 at 1:16 PM Matheus Tavares <matheus.bernardino@usp.br> wrote: > > Replace usage of opendir/readdir/closedir API to traverse directories > recursively, at copy_or_link_directory function, by the dir-iterator > API. > > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> > --- > This is my microproject for GSoC 2019. It's still a RFC because I have > some questions. Any help will be much appreciated. Thanks for working on a microproject! > There're three places inside copy_or_link_directory's loop where > die_errno() is called. Should I call dir_iterator_abort, at these > places, before die_errno() is called (to free resources)? I don't think it's necessary. We care about freeing resources when we report errors (for example by returning an error code from inside a function), but not when we are exiting. > -static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, > - const char *src_repo, int src_baselen) > +static void mkdir_if_missing(const char *pathname, mode_t mode) It looks like your patch is both splitting copy_or_link_directory() into 2 functions and converting it to use the dir-iterator API. Maybe it would be better to have 2 patches instead, one for splitting it and one for converting it. > { > - struct dirent *de; > + /* > + * Tries to create a dir at pathname. If pathname already exists and > + * is a dir, do nothing. > + */ I think we usually put such comments just before the function. And maybe it could be shortened to "Create a dir at pathname unless there's already one"
On Thu, Feb 14, 2019 at 7:16 PM Christian Couder <christian.couder@gmail.com> wrote: > > On Thu, Feb 14, 2019 at 1:16 PM Matheus Tavares > <matheus.bernardino@usp.br> wrote: > > > > Replace usage of opendir/readdir/closedir API to traverse directories > > recursively, at copy_or_link_directory function, by the dir-iterator > > API. > > > > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> > > --- > > This is my microproject for GSoC 2019. It's still a RFC because I have > > some questions. Any help will be much appreciated. > > Thanks for working on a microproject! > Hi, Christian. Thank you for the review and comments. > > There're three places inside copy_or_link_directory's loop where > > die_errno() is called. Should I call dir_iterator_abort, at these > > places, before die_errno() is called (to free resources)? > > I don't think it's necessary. We care about freeing resources when we > report errors (for example by returning an error code from inside a > function), but not when we are exiting. > Ok, thanks! > > -static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, > > - const char *src_repo, int src_baselen) > > +static void mkdir_if_missing(const char *pathname, mode_t mode) > > It looks like your patch is both splitting copy_or_link_directory() > into 2 functions and converting it to use the dir-iterator API. Maybe > it would be better to have 2 patches instead, one for splitting it and > one for converting it. > Got it. As the justification for splitting the function was to use the extracted part in the section that was previously recursive, I thought both changes should be in the same patch. But I really was in doubt about that. Should I split it into two patches and mention that justification at the first one? Or just split? > > { > > - struct dirent *de; > > + /* > > + * Tries to create a dir at pathname. If pathname already exists and > > + * is a dir, do nothing. > > + */ > > I think we usually put such comments just before the function. And > maybe it could be shortened to "Create a dir at pathname unless > there's already one" Right, the shortened version does sounds much better, thanks! About the comment placement, I followed what I saw in other functions from the same file ("copy_alternates", for example). But also, I couldn't find any instruction about that at Documentation/CodingGuidelines. So should I move it as you suggested?
On Thu, Feb 14, 2019 at 11:04 PM Matheus Tavares Bernardino <matheus.bernardino@usp.br> wrote: > > On Thu, Feb 14, 2019 at 7:16 PM Christian Couder > <christian.couder@gmail.com> wrote: > > > -static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, > > > - const char *src_repo, int src_baselen) > > > +static void mkdir_if_missing(const char *pathname, mode_t mode) > > > > It looks like your patch is both splitting copy_or_link_directory() > > into 2 functions and converting it to use the dir-iterator API. Maybe > > it would be better to have 2 patches instead, one for splitting it and > > one for converting it. > > > > Got it. As the justification for splitting the function was to use the > extracted part in the section that was previously recursive, I thought > both changes should be in the same patch. But I really was in doubt > about that. Should I split it into two patches and mention that > justification at the first one? Or just split? If 2 patches instead of 1 makes it easier to review and understand what's going on, then I think you should indeed send 2 patches and mention the justification for splitting the function in the commit message of the first patch. > > I think we usually put such comments just before the function. And > > maybe it could be shortened to "Create a dir at pathname unless > > there's already one" > > Right, the shortened version does sounds much better, thanks! About > the comment placement, I followed what I saw in other functions from > the same file ("copy_alternates", for example). Then it's ok to place it like you did. Sorry about the noise. > But also, I couldn't > find any instruction about that at Documentation/CodingGuidelines. So > should I move it as you suggested? I think we have not been very consistent over the years. Recently I think we have tried to add or move API documentation inside header files, and in general before functions in the code, but yeah it might not have been documented.
diff --git a/builtin/clone.c b/builtin/clone.c index 50bde99618..f5208ad9ca 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -23,6 +23,8 @@ #include "transport.h" #include "strbuf.h" #include "dir.h" +#include "dir-iterator.h" +#include "iterator.h" #include "sigchain.h" #include "branch.h" #include "remote.h" @@ -392,50 +394,55 @@ static void copy_alternates(struct strbuf *src, struct strbuf *dst, fclose(in); } -static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, - const char *src_repo, int src_baselen) +static void mkdir_if_missing(const char *pathname, mode_t mode) { - struct dirent *de; + /* + * Tries to create a dir at pathname. If pathname already exists and + * is a dir, do nothing. + */ struct stat buf; - int src_len, dest_len; - DIR *dir; - dir = opendir(src->buf); - if (!dir) - die_errno(_("failed to open '%s'"), src->buf); - - if (mkdir(dest->buf, 0777)) { + if (mkdir(pathname, mode)) { if (errno != EEXIST) - die_errno(_("failed to create directory '%s'"), dest->buf); - else if (stat(dest->buf, &buf)) - die_errno(_("failed to stat '%s'"), dest->buf); + die_errno(_("failed to create directory '%s'"), + pathname); + else if (stat(pathname, &buf)) + die_errno(_("failed to stat '%s'"), pathname); else if (!S_ISDIR(buf.st_mode)) - die(_("%s exists and is not a directory"), dest->buf); + die(_("%s exists and is not a directory"), pathname); } +} + +static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, + const char *src_repo, int src_baselen) +{ + int src_len, dest_len; + struct dir_iterator *iter; + int iter_status; + + mkdir_if_missing(dest->buf, 0777); + + iter = dir_iterator_begin(src->buf); strbuf_addch(src, '/'); src_len = src->len; strbuf_addch(dest, '/'); dest_len = dest->len; - while ((de = readdir(dir)) != NULL) { + while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) { strbuf_setlen(src, src_len); - strbuf_addstr(src, de->d_name); + strbuf_addstr(src, iter->relative_path); strbuf_setlen(dest, dest_len); - strbuf_addstr(dest, de->d_name); - if (stat(src->buf, &buf)) { - warning (_("failed to stat %s\n"), src->buf); - continue; - } - if (S_ISDIR(buf.st_mode)) { - if (de->d_name[0] != '.') - copy_or_link_directory(src, dest, - src_repo, src_baselen); + strbuf_addstr(dest, iter->relative_path); + + if (S_ISDIR(iter->st.st_mode)) { + if (iter->basename[0] != '.') + mkdir_if_missing(dest->buf, 0777); continue; } /* Files that cannot be copied bit-for-bit... */ - if (!strcmp(src->buf + src_baselen, "/info/alternates")) { + if (!strcmp(iter->relative_path, "info/alternates")) { copy_alternates(src, dest, src_repo); continue; } @@ -452,7 +459,11 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, if (copy_file_with_time(dest->buf, src->buf, 0666)) die_errno(_("failed to copy file to '%s'"), dest->buf); } - closedir(dir); + + if (iter_status != ITER_DONE) { + strbuf_setlen(src, src_len); + die(_("failed to iterate over '%s'"), src->buf); + } } static void clone_local(const char *src_repo, const char *dest_repo)
Replace usage of opendir/readdir/closedir API to traverse directories recursively, at copy_or_link_directory function, by the dir-iterator API. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> --- This is my microproject for GSoC 2019. It's still a RFC because I have some questions. Any help will be much appreciated. There're three places inside copy_or_link_directory's loop where die_errno() is called. Should I call dir_iterator_abort, at these places, before die_errno() is called (to free resources)? And if so, should I check dir_iterator_abort's return code? It's said at dir-iterator.h that dir_iterator_abort returns ITER_ERROR on error, but by the code, we can see that it only possibly returns ITER_DONE. Finally, if this call and check both needs to be done, there'll be code replication in this three places. Should I add a goto and do all this stuff at the function end? (But then, I would have to store die_errno's error messages and errno code in temporary variables. Is this approach good?) builtin/clone.c | 65 +++++++++++++++++++++++++++++-------------------- 1 file changed, 38 insertions(+), 27 deletions(-)