Message ID | b2c72097e5e8985e7fdd8e3eee66cdf43d1b65c0.1619710329.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | refs: cleanup errno sideband ref related functions | expand |
I 100% agree with you that errno is cumbersome to use and carries far less information than we want (we do not learn what operation failed on what path) over a long distance. It only is useful when the callchain still knows what path was operated on. But... "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes: > For the copy/rename support, calls to lock_ref_oid_basic() in this file are > followed by: > > * lock_ref_oid_basic (copy/rename rollback error path) > > * write_ref_to_lockfile (both in the rollback path and the success path of > copy/rename) > > These calls do not inspect the incoming errno. As they perform I/O, they can > clobber errno. For this reason, callers cannot reliably observe the errno that > lock_ref_oid_basic() generated, so it is unsound for programmatic use. In the latter part of the above, "callers" refers to the callers of "the copy/rename support" (aka files_copy_or_rename_ref())? Then I am not sure why "callers cannot reliably observe the errno that lock_ref_oid_basic() generated" is a problem. They will see the errno from the last system call that failed, if they care. So their performing I/O is perfectly acceptable, too. Hence, I am not sure what change the above justifies, if any. If we can show that no caller of files_copy_or_rename_ref() uses errno, it is a clear indication that lock_ref_oid_basic() is saving and restoring errno for no good reason. I think that is what was done for the other two callers below. So I traced what happens after the copy-rename thing gets called. refs_rename_ref(), rename_ref(), refs_copy_existing_ref() and copy_existing_ref() (all in refs.c) should be the only callers of the function. All callers in builtin/branch.c and builtin/remote.c of these functions (by the way, refs_X() variants do not seem to be called from anywhere---are they over-engineered?) just die() when they signal a failure by returning non-zero, so I think it is safe and much easier to understand to justify this change like so: refs/files-backend.c::lock_ref_oid_basic() tries hard to signal how it failed to its callers using errno. The three callers of this file-scope static function are * files_copy_or_rename_ref() * files_create_symref() * files_reflog_expire() None of them looks at errno after seeing a negative return from lock_ref_oid_basic() to make any decision, and no caller of these three functions looks at errno after they signal a failure by returning a negative value. > For files_create_symref() and files_reflog_expire(), grepping over callers > showed no callers inspecting errno. Yes, this is a lot more relevant justification to allow these two functions, and functions that are called _only_ _by_ these two functions, stop worrying about errno.
On Fri, Apr 30, 2021 at 5:10 AM Junio C Hamano <gitster@pobox.com> wrote: > > I 100% agree with you that errno is cumbersome to use and carries > far less information than we want (we do not learn what operation > failed on what path) over a long distance. It only is useful when > the callchain still knows what path was operated on. > > But... > > "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > For the copy/rename support, calls to lock_ref_oid_basic() in this file are > > followed by: > > > > * lock_ref_oid_basic (copy/rename rollback error path) > > > > * write_ref_to_lockfile (both in the rollback path and the success path of > > copy/rename) > > > > These calls do not inspect the incoming errno. As they perform I/O, they can > > clobber errno. For this reason, callers cannot reliably observe the errno that > > lock_ref_oid_basic() generated, so it is unsound for programmatic use. > > In the latter part of the above, "callers" refers to the callers of > "the copy/rename support" (aka files_copy_or_rename_ref())? > > Then I am not sure why "callers cannot reliably observe the errno > that lock_ref_oid_basic() generated" is a problem. They will see > the errno from the last system call that failed, if they care. So > their performing I/O is perfectly acceptable, too. > > Hence, I am not sure what change the above justifies, if any. > > If we can show that no caller of files_copy_or_rename_ref() uses > errno, it is a clear indication that lock_ref_oid_basic() is saving > and restoring errno for no good reason. I think that is what was > done for the other two callers below. > > So I traced what happens after the copy-rename thing gets called. > > refs_rename_ref(), rename_ref(), refs_copy_existing_ref() and > copy_existing_ref() (all in refs.c) should be the only callers of > the function. All callers in builtin/branch.c and builtin/remote.c > of these functions (by the way, refs_X() variants do not seem to be > called from anywhere---are they over-engineered?) just die() when > they signal a failure by returning non-zero, so I think it is safe > and much easier to understand to justify this change like so: > > refs/files-backend.c::lock_ref_oid_basic() tries hard to signal > how it failed to its callers using errno. The three callers of > this file-scope static function are > > * files_copy_or_rename_ref() > * files_create_symref() > * files_reflog_expire() > > None of them looks at errno after seeing a negative return from > lock_ref_oid_basic() to make any decision, and no caller of > these three functions looks at errno after they signal a failure > by returning a negative value. I stole your message here; hope that's OK. My original message tries to convey that if you do /* should return errno */ int a() { .. } int b() { result = a(); maybe_do_IO(); return result; } then callers of b() can't reason about the errno result of a(), because they can't know if an error code was generated by maybe_do_IO() or a(). This means that the errno result of a() is useless. (This is assuming that b() doesn't inspect errno, which I failed to mention.)
> From: Han-Wen Nienhuys <hanwen@google.com> > > Errno is a global variable written by almost all system calls, and therefore it > is hard to reason about its state. It's also useless for user-visible errors, as > it leaves no place to report the offending file and/or syscall. I don't think this paragraph is useful. > For the copy/rename support, calls to lock_ref_oid_basic() in this file are > followed by: > > * lock_ref_oid_basic (copy/rename rollback error path) > > * write_ref_to_lockfile (both in the rollback path and the success path of > copy/rename) > > These calls do not inspect the incoming errno. As they perform I/O, they can > clobber errno. For this reason, callers cannot reliably observe the errno that > lock_ref_oid_basic() generated, so it is unsound for programmatic use. > > For files_create_symref() and files_reflog_expire(), grepping over callers > showed no callers inspecting errno. This is probably written more clearly as follows: No call to the static function lock_ref_oid_basic() is immediately followed by an errno check, so stopping setting errno is safe. But as a sanity check, lock_ref_oid_basic() is used in these functions: - files_copy_or_rename_ref() - here, calls are followed by error() (which performs I/O) or write_ref_to_lockfile() (which calls parse_object() which may perform I/O) - files_create_symref() - here, calls are followed by error() or create_symref_locked() (which performs I/O and does not inspect errno) - files_reflog_expire() - here, calls are followed by error() or refs_reflog_exists() (which calls a function in a vtable that is not documented to use and/or preserve errno) So it is safe to stop setting errno in lock_ref_oid_basic(). The diff itself looks good.
On Thu, Jun 3, 2021 at 4:33 AM Jonathan Tan <jonathantanmy@google.com> wrote: > > Errno is a global variable written by almost all system calls, and therefore it > > is hard to reason about its state. It's also useless for user-visible errors, as > > it leaves no place to report the offending file and/or syscall. > > I don't think this paragraph is useful. Dropped. > This is probably written more clearly as follows: > > No call to the static function lock_ref_oid_basic() is immediately > followed by an errno check, so stopping setting errno is safe. But as a > sanity check, lock_ref_oid_basic() is used in these functions: > - files_copy_or_rename_ref() - here, calls are followed by error() (which > performs I/O) or write_ref_to_lockfile() (which calls parse_object() which > may perform I/O) > - files_create_symref() - here, calls are followed by error() or > create_symref_locked() (which performs I/O and does not inspect > errno) > - files_reflog_expire() - here, calls are followed by error() or > refs_reflog_exists() (which calls a function in a vtable that is not > documented to use and/or preserve errno) > > So it is safe to stop setting errno in lock_ref_oid_basic(). I used some of the above text.
diff --git a/refs/files-backend.c b/refs/files-backend.c index 119972ee16f8..c9511da1d387 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -910,7 +910,6 @@ static int create_reflock(const char *path, void *cb) /* * Locks a ref returning the lock on success and NULL on failure. - * On failure errno is set to something meaningful. */ static struct ref_lock *lock_ref_oid_basic(struct files_ref_store *refs, const char *refname, @@ -922,7 +921,6 @@ static struct ref_lock *lock_ref_oid_basic(struct files_ref_store *refs, { struct strbuf ref_file = STRBUF_INIT; struct ref_lock *lock; - int last_errno = 0; int mustexist = (old_oid && !is_null_oid(old_oid)); int resolve_flags = RESOLVE_REF_NO_RECURSE; int resolved; @@ -949,7 +947,6 @@ static struct ref_lock *lock_ref_oid_basic(struct files_ref_store *refs, * to remain. */ if (remove_empty_directories(&ref_file)) { - last_errno = errno; if (!refs_verify_refname_available( &refs->base, refname, extras, skip, err)) @@ -962,7 +959,7 @@ static struct ref_lock *lock_ref_oid_basic(struct files_ref_store *refs, &lock->old_oid, type); } if (!resolved) { - last_errno = errno; + int last_errno = errno; if (last_errno != ENOTDIR || !refs_verify_refname_available(&refs->base, refname, extras, skip, err)) @@ -981,20 +978,17 @@ static struct ref_lock *lock_ref_oid_basic(struct files_ref_store *refs, if (is_null_oid(&lock->old_oid) && refs_verify_refname_available(refs->packed_ref_store, refname, extras, skip, err)) { - last_errno = ENOTDIR; goto error_return; } lock->ref_name = xstrdup(refname); if (raceproof_create_file(ref_file.buf, create_reflock, &lock->lk)) { - last_errno = errno; unable_to_lock_message(ref_file.buf, errno, err); goto error_return; } if (verify_lock(&refs->base, lock, old_oid, mustexist, err)) { - last_errno = errno; goto error_return; } goto out; @@ -1005,7 +999,6 @@ static struct ref_lock *lock_ref_oid_basic(struct files_ref_store *refs, out: strbuf_release(&ref_file); - errno = last_errno; return lock; }