Message ID | 20240905082749.GA88220@coredump.intra.peff.net (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | sparse-checkout: use fdopen_lock_file() instead of xfdopen() | expand |
On Thu, Sep 05, 2024 at 04:27:49AM -0400, Jeff King wrote: > diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c > index 2604ab04df..f1bd31b2f7 100644 > --- a/builtin/sparse-checkout.c > +++ b/builtin/sparse-checkout.c > @@ -336,8 +335,7 @@ static int write_patterns_and_update(struct pattern_list *pl) > if (safe_create_leading_directories(sparse_filename)) > die(_("failed to create directory for sparse-checkout file")); > > - fd = hold_lock_file_for_update(&lk, sparse_filename, > - LOCK_DIE_ON_ERROR); > + hold_lock_file_for_update(&lk, sparse_filename, LOCK_DIE_ON_ERROR); > free(sparse_filename); > > result = update_working_directory(pl); Okay, we die on an error so there is no need to check `fd`, either. Makes sense. > @@ -348,15 +346,17 @@ static int write_patterns_and_update(struct pattern_list *pl) > return result; > } > > - fp = xfdopen(fd, "w"); > + fp = fdopen_lock_file(&lk, "w"); > + if (!fp) > + die_errno(_("unable to fdopen %s"), get_lock_file_path(&lk)); > > if (core_sparse_checkout_cone) > write_cone_to_file(fp, pl); > else > write_patterns_to_file(fp, pl); > > - fflush(fp); > - commit_lock_file(&lk); > + if (commit_lock_file(&lk)) > + die_errno(_("unable to write %s"), get_locked_file_path(&lk)); > > clear_pattern_list(pl); I think the error handling is broken. `commit_lock_file()` calls `rename_tempfile()`, which deletes the temporary file even in the error case. The consequence is that `lk->tempfile` will be set to the `NULL` pointer. When we call `get_locked_file_path()` we then dereference it unconditionally and would thus segfault. Other than that this patch looks sensible. Also kind of curious that we never checked that `commit_lock_file()` was successful, so we wouldn't even notice when renaming the file into place failed. Patrick
Jeff King <peff@peff.net> writes: > We do have to adjust the code a bit: > > - we have to handle errors ourselves; we can just die(), since that's > what xfdopen() would have done (and we can even provide a more > specific error message). > > - we no longer need to call fflush(); committing the lock-file > auto-closes it, which will now do the flush for us. As a bonus, this > will actually check that the flush was successful before renaming > the file into place. Let's likewise report when committing the lock > fails (rather than quietly returning success from the command). > > - we can get rid of the local "fd" variable, since we never look at it > ourselves now OK. The neessary change is surprisingly small. > I found this because I was building git on an Android system,... Sounds like fun. Will queue. Thanks.
Patrick Steinhardt <ps@pks.im> writes: >> + if (commit_lock_file(&lk)) >> + die_errno(_("unable to write %s"), get_locked_file_path(&lk)); >> >> clear_pattern_list(pl); > > I think the error handling is broken. `commit_lock_file()` calls > `rename_tempfile()`, which deletes the temporary file even in the error > case. The consequence is that `lk->tempfile` will be set to the `NULL` > pointer. When we call `get_locked_file_path()` we then dereference it > unconditionally and would thus segfault. Hmph. Would this be sufficient as a band-aid, then? builtin/sparse-checkout.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git c/builtin/sparse-checkout.c w/builtin/sparse-checkout.c index f1bd31b2f7..60363fd056 100644 --- c/builtin/sparse-checkout.c +++ w/builtin/sparse-checkout.c @@ -356,7 +356,7 @@ static int write_patterns_and_update(struct pattern_list *pl) write_patterns_to_file(fp, pl); if (commit_lock_file(&lk)) - die_errno(_("unable to write %s"), get_locked_file_path(&lk)); + die_errno(_("unable to write %s"), sparse_filename); clear_pattern_list(pl);
Junio C Hamano <gitster@pobox.com> writes: > Patrick Steinhardt <ps@pks.im> writes: > >>> + if (commit_lock_file(&lk)) >>> + die_errno(_("unable to write %s"), get_locked_file_path(&lk)); >>> >>> clear_pattern_list(pl); >> >> I think the error handling is broken. `commit_lock_file()` calls >> `rename_tempfile()`, which deletes the temporary file even in the error >> case. The consequence is that `lk->tempfile` will be set to the `NULL` >> pointer. When we call `get_locked_file_path()` we then dereference it >> unconditionally and would thus segfault. > > Hmph. Would this be sufficient as a band-aid, then? Of course not. That would refer to a piece of memory that we already free'ed in this function. Perhaps like this? diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index f1bd31b2f7..27d181a612 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -328,7 +328,7 @@ static int write_patterns_and_update(struct pattern_list *pl) char *sparse_filename; FILE *fp; struct lock_file lk = LOCK_INIT; - int result; + int result = 0; sparse_filename = get_sparse_checkout_filename(); @@ -336,19 +336,18 @@ static int write_patterns_and_update(struct pattern_list *pl) die(_("failed to create directory for sparse-checkout file")); hold_lock_file_for_update(&lk, sparse_filename, LOCK_DIE_ON_ERROR); - free(sparse_filename); result = update_working_directory(pl); if (result) { rollback_lock_file(&lk); clear_pattern_list(pl); update_working_directory(NULL); - return result; + goto out; } fp = fdopen_lock_file(&lk, "w"); if (!fp) - die_errno(_("unable to fdopen %s"), get_lock_file_path(&lk)); + die_errno(_("unable to fdopen %s"), sparse_filename); if (core_sparse_checkout_cone) write_cone_to_file(fp, pl); @@ -356,11 +355,13 @@ static int write_patterns_and_update(struct pattern_list *pl) write_patterns_to_file(fp, pl); if (commit_lock_file(&lk)) - die_errno(_("unable to write %s"), get_locked_file_path(&lk)); + die_errno(_("unable to write %s"), sparse_filename); clear_pattern_list(pl); - return 0; +out: + free(sparse_filename); + return result; } enum sparse_checkout_mode {
On Thu, Sep 05, 2024 at 02:20:18PM -0700, Junio C Hamano wrote: > >> I think the error handling is broken. `commit_lock_file()` calls > >> `rename_tempfile()`, which deletes the temporary file even in the error > >> case. The consequence is that `lk->tempfile` will be set to the `NULL` > >> pointer. When we call `get_locked_file_path()` we then dereference it > >> unconditionally and would thus segfault. > > > > Hmph. Would this be sufficient as a band-aid, then? > > Of course not. That would refer to a piece of memory that we > already free'ed in this function. > > Perhaps like this? That works, though... > fp = fdopen_lock_file(&lk, "w"); > if (!fp) > - die_errno(_("unable to fdopen %s"), get_lock_file_path(&lk)); > + die_errno(_("unable to fdopen %s"), sparse_filename); > > if (core_sparse_checkout_cone) > write_cone_to_file(fp, pl); > @@ -356,11 +355,13 @@ static int write_patterns_and_update(struct pattern_list *pl) > write_patterns_to_file(fp, pl); > > if (commit_lock_file(&lk)) > - die_errno(_("unable to write %s"), get_locked_file_path(&lk)); > + die_errno(_("unable to write %s"), sparse_filename); Note the difference between "get_lock" and "get_locked" in these two. The first will mention the tempfile name, and the second the destination filename (and sparse_filename is the latter). I don't know that it really matters much in practice, though. If fdopen() fails it probably has nothing to do with the file itself, and is either lack of memory or a programming bug (e.g., bogus descriptor). I'll pick up this change, but I'll split the whole error-handling fix into its own patch, since it's getting more complicated. Will send v2 later tonight. Thanks, Patrick, for noticing the problem in the first place. -Peff
Jeff King <peff@peff.net> writes: >> fp = fdopen_lock_file(&lk, "w"); >> if (!fp) >> - die_errno(_("unable to fdopen %s"), get_lock_file_path(&lk)); >> + die_errno(_("unable to fdopen %s"), sparse_filename); >> >> if (core_sparse_checkout_cone) >> write_cone_to_file(fp, pl); >> @@ -356,11 +355,13 @@ static int write_patterns_and_update(struct pattern_list *pl) >> write_patterns_to_file(fp, pl); >> >> if (commit_lock_file(&lk)) >> - die_errno(_("unable to write %s"), get_locked_file_path(&lk)); >> + die_errno(_("unable to write %s"), sparse_filename); > > Note the difference between "get_lock" and "get_locked" in these two. > The first will mention the tempfile name, and the second the destination > filename (and sparse_filename is the latter). I did consider ... to write %s.%s", sparse_filename, LOCK_SUFFIX) but then thought that the final name is what is more relevant to the end user. Yes, while correcting unrelated error, I shouldn't have tried to improve unrelated end-user experience ;-). > Will send v2 later tonight. Thanks, Patrick, for noticing the problem in > the first place. Yeah, thanks, both.
diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index 2604ab04df..f1bd31b2f7 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -327,7 +327,6 @@ static int write_patterns_and_update(struct pattern_list *pl) { char *sparse_filename; FILE *fp; - int fd; struct lock_file lk = LOCK_INIT; int result; @@ -336,8 +335,7 @@ static int write_patterns_and_update(struct pattern_list *pl) if (safe_create_leading_directories(sparse_filename)) die(_("failed to create directory for sparse-checkout file")); - fd = hold_lock_file_for_update(&lk, sparse_filename, - LOCK_DIE_ON_ERROR); + hold_lock_file_for_update(&lk, sparse_filename, LOCK_DIE_ON_ERROR); free(sparse_filename); result = update_working_directory(pl); @@ -348,15 +346,17 @@ static int write_patterns_and_update(struct pattern_list *pl) return result; } - fp = xfdopen(fd, "w"); + fp = fdopen_lock_file(&lk, "w"); + if (!fp) + die_errno(_("unable to fdopen %s"), get_lock_file_path(&lk)); if (core_sparse_checkout_cone) write_cone_to_file(fp, pl); else write_patterns_to_file(fp, pl); - fflush(fp); - commit_lock_file(&lk); + if (commit_lock_file(&lk)) + die_errno(_("unable to write %s"), get_locked_file_path(&lk)); clear_pattern_list(pl);
When updating sparse patterns, we open a lock_file to write out the new data. The lock_file struct holds the file descriptor, but we call fdopen() to get a stdio handle to do the actual write. After we finish writing, we fflush() so that all of the data is on disk, and then call commit_lock_file() which closes the descriptor. But we never fclose() the stdio handle, leaking it. The obvious solution seems like it would be to just call fclose(). But when? If we do it before commit_lock_file(), then the lock_file code is left thinking it owns the now-closed file descriptor, and will do an extra close() on the descriptor. But if we do it before, we have the opposite problem: the lock_file code will close the descriptor, and fclose() will do the extra close(). We can handle this correctly by using fdopen_lock_file(). That leaves ownership of the stdio handle with the lock_file, which knows not to double-close it. We do have to adjust the code a bit: - we have to handle errors ourselves; we can just die(), since that's what xfdopen() would have done (and we can even provide a more specific error message). - we no longer need to call fflush(); committing the lock-file auto-closes it, which will now do the flush for us. As a bonus, this will actually check that the flush was successful before renaming the file into place. Let's likewise report when committing the lock fails (rather than quietly returning success from the command). - we can get rid of the local "fd" variable, since we never look at it ourselves now Signed-off-by: Jeff King <peff@peff.net> --- Curiously, this is not reported as a leak by LSan, and it is included in the "reachable" set by valgrind. I imagine this is because libc has to maintain a list of open writable handles, since fflush(NULL) is supposed to flush all of them. I peeked at other fdopen(..., "w") calls to see if there were other similar spots, but didn't notice any. I found this because I was building git on an Android system, and they have an "fdsan" that complains when the underlying descriptor of a handle is closed. It flagged some other spots, too, but the rest that I looked at involved the tempfile atexit() handler closing the descriptors manually (we don't care about flushing there, as our goal is to just prevent the "can't delete an open file" problem on Windows). builtin/sparse-checkout.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)