Message ID | 20241108162441.50736-1-eric.peijian@gmail.com (mailing list archive) |
---|---|
Series | cat-file: add remote-object-info to batch-command |
Eric Ju <eric.peijian@gmail.com> writes:

> This is a continuation of Calvin Wan's (calvinwan@google.com)
> patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info command at [1].
>
> Sometimes it is useful to get information about an object without having to download
> it completely. The server logic for retrieving size has already been implemented and merged in
> "a2ba162cda (object-info: support for retrieving object info, 2021-04-20)"[2].
> This patch series implement the client option for it.
>
> This patch series add the `remote-object-info` command to `cat-file --batch-command`.
> This command allows the client to make an object-info command request to a server
> that supports protocol v2. If the server is v2, but does not have
> object-info capability, the entire object is fetched and the
> relevant object info is returned.
>
> A few questions open for discussions please:
>
> 1. In the current implementation, if a user puts `remote-object-info` in protocol v1,
> `cat-file --batch-command` will die. Which way do we prefer? "error and exit (i.e. die)"
> or "warn and wait for new command".

In the primary use case envisioned, would it be a program that is
driving the "cat-file --batch-command" process? Can it sensibly
react to "warn and wait" and throw different commands to achieve
what it wanted to do with the remote-object-info command? If the
answer is "no", die would be more appropriate.

> 2. Right now, only the size is supported. If the batch command format
> contains objectsize:disk or deltabase, it will die. The question
> is about objecttype. In the current implementation, it will die too.
> But dying on objecttype breaks the default format. We have changed the
> default format to %(objectname) %(objectsize) when remote-object-info is used.
> Any suggestions on this approach?

Why bend the default format to the shortcoming of the new feature?
What makes it impossible to learn what type of object it is? If the
limitation that makes it impossible cannot be avoided, would it make
more sense to fall back to the "fetch and locally inspect" just like
"the other side does not know how to do object-info" case?

Another thing you did not list, which is related, is where the
"fetch and locally inspect" fallback fetch the object into. Would
we use a quarantine mechanism, so that a mere request for remote
object info for an object will not contaminate our local object
store until the next gc realizes that such an object is dangling?

Thanks.
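
To make the cases under discussion easier to compare, here is a minimal sketch of the decision flow as the cover letter describes it. This is not code from the series: `Action`, `UNSUPPORTED_ATOMS` and `choose_action` are hypothetical names, and the order in which the checks run is an assumption, since the cover letter does not say which check comes first.

```python
from enum import Enum


class Action(Enum):
    DIE = "die"
    OBJECT_INFO_REQUEST = "send object-info request"
    FETCH_AND_INSPECT = "fetch whole object, inspect locally"


# Format atoms the cover letter says currently make the command die.
UNSUPPORTED_ATOMS = {"objectsize:disk", "deltabase", "objecttype"}


def choose_action(protocol_version, server_has_object_info, atoms):
    """Hypothetical model of remote-object-info handling (not the patch itself)."""
    # Question 1: protocol v1 currently dies instead of warning and waiting.
    if protocol_version < 2:
        return Action.DIE
    # Question 2: any atom other than the object name and size currently dies.
    # Check order relative to the capability test is an assumption.
    if UNSUPPORTED_ATOMS & set(atoms):
        return Action.DIE
    # A v2 server that advertises object-info can answer without a download.
    if server_has_object_info:
        return Action.OBJECT_INFO_REQUEST
    # A v2 server without the capability: fall back to a full fetch.
    return Action.FETCH_AND_INSPECT
```

With this model, the second review comment amounts to asking whether the `UNSUPPORTED_ATOMS` branch could return `Action.FETCH_AND_INSPECT` instead of `Action.DIE`, and the quarantine question is about where the object lands when that fallback runs.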