Message ID | 20201206225349.3392790-2-sandals@crustytoothpaste.net (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | rev-parse options for absolute or relative paths | expand |
On Sun, Dec 6, 2020 at 5:56 PM brian m. carlson <sandals@crustytoothpaste.net> wrote: > We'd like to canonicalize paths such that we can preserve any number of > trailing components that may be missing. Let's add a function to do > that, allowing either one or an unlimited number of components to > canonicalize, and make strbuf_realpath a wrapper around it that allows > just one component. This commit message is too barebones. As a justification, "We'd like..." is insufficient; it doesn't help the reader understand why this change is desirable. Further, the lack of explanation about the seemingly arbitrary "one or infinite" condition confuses the issue. The first question which popped into this reader's head was "why those two specific choices?". What makes one missing path component special as opposed to any number of missing components? (These questions are mostly rhetorical; I can figure out reasonable answers, but the commit message ought to do a better job of explaining.) > Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> > --- > +/* > + * Just like strbuf_realpath, but allows specifying how many missing components > + * are permitted. If many_missing is true, an arbitrary number of path > + * components may be missing; otherwise, only the last component may be missing. > + */ > +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path, > + int many_missing, int die_on_error) This interface feels odd. Why would a caller ever want to call this function with many_missing=0 when it would be easier, shorter, more direct to simply call strbuf_realpath()? Is the plan to retire strbuf_realpath() down the road? A more orthogonal-feeling interface would be to make this function _always_ allow any number of missing trailing path components (that is, drop `many_missing` altogether). Doing so would simplify the documentation and the signature. If the caller needs the original behavior of only allowing the final path component to be missing, then strbuf_realpath() can be used, as usual. The name of the function is somewhat confusing, especially if you take the suggestion of dropping the `many_missing` argument. Perhaps a name such as strbuf_realpath_forgiving() would be more understandable. Note that the above comments are only about the API. It's perfectly fine if the two functions share an underlying private implementation (making them each one-liners calling the actual worker function).
On 2020-12-07 at 00:02:16, Eric Sunshine wrote: > This commit message is too barebones. As a justification, "We'd > like..." is insufficient; it doesn't help the reader understand why > this change is desirable. > > Further, the lack of explanation about the seemingly arbitrary "one or > infinite" condition confuses the issue. The first question which > popped into this reader's head was "why those two specific choices?". > What makes one missing path component special as opposed to any number > of missing components? (These questions are mostly rhetorical; I can > figure out reasonable answers, but the commit message ought to do a > better job of explaining.) Sure, I can expand the commit message to be a little more descriptive. > The name of the function is somewhat confusing, especially if you take > the suggestion of dropping the `many_missing` argument. Perhaps a name > such as strbuf_realpath_forgiving() would be more understandable. Sure, I agree that's a better name. It shouldn't be surprising to anyone on the list that I am absolutely terrible at naming things, so I appreciate the suggestion.
Am 06.12.20 um 23:53 schrieb brian m. carlson: > We'd like to canonicalize paths such that we can preserve any number of > trailing components that may be missing. Let's add a function to do > that, allowing either one or an unlimited number of components to > canonicalize, and make strbuf_realpath a wrapper around it that allows > just one component. > > Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> > --- > abspath.c | 13 ++++++++++++- > cache.h | 2 ++ > 2 files changed, 14 insertions(+), 1 deletion(-) > > diff --git a/abspath.c b/abspath.c > index 6f15a418bb..e6630b3618 100644 > --- a/abspath.c > +++ b/abspath.c > @@ -80,6 +80,17 @@ static void get_root_part(struct strbuf *resolved, struct strbuf *remaining) > */ > char *strbuf_realpath(struct strbuf *resolved, const char *path, > int die_on_error) > +{ > + return strbuf_realpath_missing(resolved, path, 0, die_on_error); > +} > + > +/* > + * Just like strbuf_realpath, but allows specifying how many missing components > + * are permitted. If many_missing is true, an arbitrary number of path > + * components may be missing; otherwise, only the last component may be missing. > + */ > +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path, > + int many_missing, int die_on_error) So this uses two 32-bit values to pass two bits. Hmm. I find the concept of a "real" path with imaginary components strangely amusing. But perhaps a name like strbuf_resolve_path() would fit better? > { > struct strbuf remaining = STRBUF_INIT; > struct strbuf next = STRBUF_INIT; > @@ -129,7 +140,7 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path, > > if (lstat(resolved->buf, &st)) { > /* error out unless this was the last component */ > - if (errno != ENOENT || remaining.len) { > + if (errno != ENOENT || (!many_missing && remaining.len)) { > if (die_on_error) > die_errno("Invalid path '%s'", > resolved->buf); So the original code errors out if there is a real error (errno != ENOENT). It also errors out if any component except the last one is missing (errno == ENOENT && remaining.len); that's what the comment is about. This patch adds the ability to ignore ENOENT for all components. Perhaps convert many_missing and die_on_error into a single flags parameter and implement the flags DIE_ON_ERR and REQUIRE_BASENAME or similar? Callers would be easier to read because such an interface is self-documenting -- provided we find good flag names. > diff --git a/cache.h b/cache.h > index e986cf4ea9..a1386235fc 100644 > --- a/cache.h > +++ b/cache.h > @@ -1320,6 +1320,8 @@ static inline int is_absolute_path(const char *path) > int is_directory(const char *); > char *strbuf_realpath(struct strbuf *resolved, const char *path, > int die_on_error); > +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path, > + int missing_components, int die_on_error); > char *real_pathdup(const char *path, int die_on_error); > const char *absolute_path(const char *path); > char *absolute_pathdup(const char *path); >
On 2020-12-07 at 17:19:32, René Scharfe wrote: > I find the concept of a "real" path with imaginary components strangely > amusing. But perhaps a name like strbuf_resolve_path() would fit better? I think I'm going to take the strbuf_realpath_forgiving solution from Eric Sunshine because I think having similar names for similar functions helps discoverability. > So the original code errors out if there is a real error > (errno != ENOENT). It also errors out if any component except the last > one is missing (errno == ENOENT && remaining.len); that's what the > comment is about. This patch adds the ability to ignore ENOENT for all > components. > > Perhaps convert many_missing and die_on_error into a single flags > parameter and implement the flags DIE_ON_ERR and REQUIRE_BASENAME or > similar? Callers would be easier to read because such an interface is > self-documenting -- provided we find good flag names. As discussed elsewhere in the thread, this will be moving to an internal function, but I can make that function take two flag parameters.
diff --git a/abspath.c b/abspath.c index 6f15a418bb..e6630b3618 100644 --- a/abspath.c +++ b/abspath.c @@ -80,6 +80,17 @@ static void get_root_part(struct strbuf *resolved, struct strbuf *remaining) */ char *strbuf_realpath(struct strbuf *resolved, const char *path, int die_on_error) +{ + return strbuf_realpath_missing(resolved, path, 0, die_on_error); +} + +/* + * Just like strbuf_realpath, but allows specifying how many missing components + * are permitted. If many_missing is true, an arbitrary number of path + * components may be missing; otherwise, only the last component may be missing. + */ +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path, + int many_missing, int die_on_error) { struct strbuf remaining = STRBUF_INIT; struct strbuf next = STRBUF_INIT; @@ -129,7 +140,7 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path, if (lstat(resolved->buf, &st)) { /* error out unless this was the last component */ - if (errno != ENOENT || remaining.len) { + if (errno != ENOENT || (!many_missing && remaining.len)) { if (die_on_error) die_errno("Invalid path '%s'", resolved->buf); diff --git a/cache.h b/cache.h index e986cf4ea9..a1386235fc 100644 --- a/cache.h +++ b/cache.h @@ -1320,6 +1320,8 @@ static inline int is_absolute_path(const char *path) int is_directory(const char *); char *strbuf_realpath(struct strbuf *resolved, const char *path, int die_on_error); +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path, + int missing_components, int die_on_error); char *real_pathdup(const char *path, int die_on_error); const char *absolute_path(const char *path); char *absolute_pathdup(const char *path);
We'd like to canonicalize paths such that we can preserve any number of trailing components that may be missing. Let's add a function to do that, allowing either one or an unlimited number of components to canonicalize, and make strbuf_realpath a wrapper around it that allows just one component. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> --- abspath.c | 13 ++++++++++++- cache.h | 2 ++ 2 files changed, 14 insertions(+), 1 deletion(-)