Message ID | 20250127-pks-reftable-drop-git-compat-util-v1-1-6e280a564877@pks.im (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | reftable: stop using "git-compat-util.h" | expand |
On 25/01/27 02:04PM, Patrick Steinhardt wrote: > There is a single callsite of `read_in_full()` in the reftable library. > Open-code the function to reduce our dependency on the Git library. > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > --- > reftable/stack.c | 18 ++++++++++++++---- > 1 file changed, 14 insertions(+), 4 deletions(-) > > diff --git a/reftable/stack.c b/reftable/stack.c > index f7c1845e15..9490366795 100644 > --- a/reftable/stack.c > +++ b/reftable/stack.c > @@ -115,13 +115,16 @@ int reftable_new_stack(struct reftable_stack **dest, const char *dir, > > static int fd_read_lines(int fd, char ***namesp) > { > - off_t size = lseek(fd, 0, SEEK_END); > char *buf = NULL; > int err = 0; > + off_t size; > + > + size = lseek(fd, 0, SEEK_END); > if (size < 0) { > err = REFTABLE_IO_ERROR; > goto done; > } > + > err = lseek(fd, 0, SEEK_SET); > if (err < 0) { > err = REFTABLE_IO_ERROR; > @@ -134,9 +137,16 @@ static int fd_read_lines(int fd, char ***namesp) > goto done; > } > > - if (read_in_full(fd, buf, size) != size) { > - err = REFTABLE_IO_ERROR; > - goto done; > + for (size_t total_read = 0; total_read < (size_t) size; ) { The cast from off_t -> size_t matches the currect behavior, but is it always safe to do this? In `git-compat-util.h` it looks like we have `xsize_t()` to safely handle these conversions. Since this series is moving away from `git-compat-util.h` should ideally something similar be implemented? > + ssize_t bytes_read = read(fd, buf + total_read, size - total_read); > + if (bytes_read < 0 && (errno == EAGAIN || errno == EINTR)) The error handling here for EAGAIN doesn't go as far as what `xread()` does via `handle_nonblock()`. In this scenario is that ok? > + continue; > + if (bytes_read < 0 || !bytes_read) { > + err = REFTABLE_IO_ERROR; > + goto done; > + } > + > + total_read += bytes_read; > } > buf[size] = 0; > > > -- > 2.48.1.362.g079036d154.dirty > >
On Mon, Jan 27, 2025 at 10:57:20AM -0600, Justin Tobler wrote: > On 25/01/27 02:04PM, Patrick Steinhardt wrote: > > diff --git a/reftable/stack.c b/reftable/stack.c > > index f7c1845e15..9490366795 100644 > > --- a/reftable/stack.c > > +++ b/reftable/stack.c > > @@ -115,13 +115,16 @@ int reftable_new_stack(struct reftable_stack **dest, const char *dir, > > > > static int fd_read_lines(int fd, char ***namesp) > > { > > - off_t size = lseek(fd, 0, SEEK_END); > > char *buf = NULL; > > int err = 0; > > + off_t size; > > + > > + size = lseek(fd, 0, SEEK_END); > > if (size < 0) { > > err = REFTABLE_IO_ERROR; > > goto done; > > } > > + > > err = lseek(fd, 0, SEEK_SET); > > if (err < 0) { > > err = REFTABLE_IO_ERROR; > > @@ -134,9 +137,16 @@ static int fd_read_lines(int fd, char ***namesp) > > goto done; > > } > > > > - if (read_in_full(fd, buf, size) != size) { > > - err = REFTABLE_IO_ERROR; > > - goto done; > > + for (size_t total_read = 0; total_read < (size_t) size; ) { > > The cast from off_t -> size_t matches the currect behavior, but is it > always safe to do this? In `git-compat-util.h` it looks like we have > `xsize_t()` to safely handle these conversions. Since this series is > moving away from `git-compat-util.h` should ideally something similar be > implemented? It is safe, because a couple lines further up we check for `size < 0` and error out if that is the case. So we know it's a positive integer, and thus it can be represented via `size_t`. > > + ssize_t bytes_read = read(fd, buf + total_read, size - total_read); > > + if (bytes_read < 0 && (errno == EAGAIN || errno == EINTR)) > > The error handling here for EAGAIN doesn't go as far as what `xread()` > does via `handle_nonblock()`. In this scenario is that ok? Yes, because we don't set `O_NONBLOCK` in the reftable library. I'll note that in the commit message. Patrick
Patrick Steinhardt <ps@pks.im> writes: >> The cast from off_t -> size_t matches the currect behavior, but is it >> always safe to do this? In `git-compat-util.h` it looks like we have >> `xsize_t()` to safely handle these conversions. Since this series is >> moving away from `git-compat-util.h` should ideally something similar be >> implemented? > > It is safe, because a couple lines further up we check for `size < 0` > and error out if that is the case. So we know it's a positive integer, > and thus it can be represented via `size_t`. Even where off_t (which measures on-disk file in bytes) may be wider than size_t (which measures in-core piece of memory in bytes)? >> > + ssize_t bytes_read = read(fd, buf + total_read, size - total_read); >> > + if (bytes_read < 0 && (errno == EAGAIN || errno == EINTR)) >> >> The error handling here for EAGAIN doesn't go as far as what `xread()` >> does via `handle_nonblock()`. In this scenario is that ok? > > Yes, because we don't set `O_NONBLOCK` in the reftable library. > > I'll note that in the commit message. Good. Thanks.
On Tue, Jan 28, 2025 at 09:05:44AM -0800, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > >> The cast from off_t -> size_t matches the currect behavior, but is it > >> always safe to do this? In `git-compat-util.h` it looks like we have > >> `xsize_t()` to safely handle these conversions. Since this series is > >> moving away from `git-compat-util.h` should ideally something similar be > >> implemented? > > > > It is safe, because a couple lines further up we check for `size < 0` > > and error out if that is the case. So we know it's a positive integer, > > and thus it can be represented via `size_t`. > > Even where off_t (which measures on-disk file in bytes) may be wider > than size_t (which measures in-core piece of memory in bytes)? Wait, can that actually happen? Hm. I assume it can, for example on 32 bit systems with large-file support enabled. There I assume that `off_t` would be a 64 bit signed integer, whereas `size_t` may be a 32 bit unsigned integer. Will address in v3. Patrick
diff --git a/reftable/stack.c b/reftable/stack.c index f7c1845e15..9490366795 100644 --- a/reftable/stack.c +++ b/reftable/stack.c @@ -115,13 +115,16 @@ int reftable_new_stack(struct reftable_stack **dest, const char *dir, static int fd_read_lines(int fd, char ***namesp) { - off_t size = lseek(fd, 0, SEEK_END); char *buf = NULL; int err = 0; + off_t size; + + size = lseek(fd, 0, SEEK_END); if (size < 0) { err = REFTABLE_IO_ERROR; goto done; } + err = lseek(fd, 0, SEEK_SET); if (err < 0) { err = REFTABLE_IO_ERROR; @@ -134,9 +137,16 @@ static int fd_read_lines(int fd, char ***namesp) goto done; } - if (read_in_full(fd, buf, size) != size) { - err = REFTABLE_IO_ERROR; - goto done; + for (size_t total_read = 0; total_read < (size_t) size; ) { + ssize_t bytes_read = read(fd, buf + total_read, size - total_read); + if (bytes_read < 0 && (errno == EAGAIN || errno == EINTR)) + continue; + if (bytes_read < 0 || !bytes_read) { + err = REFTABLE_IO_ERROR; + goto done; + } + + total_read += bytes_read; } buf[size] = 0;
There is a single callsite of `read_in_full()` in the reftable library. Open-code the function to reduce our dependency on the Git library. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- reftable/stack.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)