diff mbox series

mktree: Make '--missing' behave as documented

Message ID 1566aed1-a38f-a9ca-241c-21b56d732328@roku.com (mailing list archive)
State New, archived
Headers show
Series mktree: Make '--missing' behave as documented | expand

Commit Message

Richard Oliver June 16, 2022, 3:46 p.m. UTC
On 15/06/2022 22:01, Junio C Hamano wrote:
> We by grave mistake at 31c8221a (mktree: validate entry type in
> input, 2009-05-14) started insisting on inspecting objects even when
> allow-mising was given.  I do not think it was sensible, given why
> we had "--missing" as an option to allow users to say "you do not
> have to be too paranoid".
> 
> The codebase is so distant but I think we should probably do a moral
> revert/reconstruct of that commit so that the extra paranoia of the
> said commit applies only when "--missing" is not in effect, or
> something like that.

-- >8 --
Subject: [PATCH] mktree: Make '--missing' behave as documented

Do not make use of the object store to verify object type for
'allow_missing' objects in mktree_line(). GIT-MKTREE(1) documents
'--missing' as disabling "verif[ication] that each tree entry's sha1
identifies an existing object".

This change retains the check for '<mode>' and '<type>' for
'allow_missing' objects as this check is merely input validation that
avoids interrogating the object store.

Signed-off-by: Richard Oliver <roliver@roku.com>
---
 builtin/mktree.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

Comments

Junio C Hamano June 16, 2022, 5:44 p.m. UTC | #1
Richard Oliver <roliver@roku.com> writes:

> Subject: [PATCH] mktree: Make '--missing' behave as documented

I would intead title the change as

	   mktree: disable object verification under '--missing'

if I were writing this change.  It is a tad longer, but "as
documented" is much less informative when this appears in "git
shortlog --no-merges".

But wait for a bit.  I am not sold on "disable" 100% yet.

> Do not make use of the object store to verify object type for
> 'allow_missing' objects in mktree_line(). GIT-MKTREE(1) documents
> '--missing' as disabling "verif[ication] that each tree entry's sha1
> identifies an existing object".

I am on the fence on this.  The current behaviour is not exactly
forcing verification with "--missing".  Even if oid_object_info(),
after consulting the promisor remote, says that the object is not
available, the code does not complain, and in that sense, it is
working as documented.  It's just that it is doing unnecessary work
that you know you are not going to use.

> This change retains the check for '<mode>' and '<type>' for
> 'allow_missing' objects as this check is merely input validation that
> avoids interrogating the object store.
>
> Signed-off-by: Richard Oliver <roliver@roku.com>
> ---
>  builtin/mktree.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)

31c8221a (mktree: validate entry type in input, 2009-05-14) was
overly eager to check with the local object store. They called the
sha1_object_info() API to obtain the type information, allowed the
call to silently fail when the object was missing locally, but
sanity checked the types opportunistically when the object did
exist.  That implementation is hurting us now, but it is sort of
excusable because back then there was no lazy/on-demand downloading
of individual objects that causes a long delay and materializes the
object, hence defeating the point of using "--missing".

And that is why the code back then was OK but not anymore in today's
world.  Something like the analysis in the above paragraph should be
in the commit log message to explain why we are making this change
today.

This patch would be a good first cut as a starting point, but we
probably can do better by doing oid_object_info_extended() call with
OBJECT_INFO_SKIP_FETCH_OBJECT bit (and probably QUICK bit, too) set,
with the current code structure.

And when we do so, the title would not match the purpose of the
change.  The verification was disabled with "--missing" all along
and that is not what we are changing.  What we will be fixing is the
wasteful implementation.

    mktree: do not check types of remote objects

    With 31c8221a (mktree: validate entry type in input, 2009-05-14), we
    called the sha1_object_info() API to obtain the type information,
    but allowed the call to silently fail when the object was missing
    locally, so that we can sanity-check the types opportunistically
    when the object did exist.

    The implementation is understandable because back then there was no
    lazy/on-demand downloading of individual objects from the promisor
    remotes that causes a long delay and materializes the object, hence
    defeating the point of using "--missing".  The design is hurting us
    now.

    We could bypass the opportunistic type/mode consistency check
    altogether when "--missing" is given, but instead, use the
    oid_object_info_extended() API and tell it that we are only
    interested in objects that locally exist and are immediately
    available by passing OBJECT_INFO_SKIP_FETCH_OBJECT bit to it.  That
    way, we will still retain the cheap and opportunistic sanity check
    for local objects.

or something like that, perhaps?

Thanks.

> diff --git a/builtin/mktree.c b/builtin/mktree.c
> index 902edba6d2..f41fda6e7d 100644
> --- a/builtin/mktree.c
> +++ b/builtin/mktree.c
> @@ -116,15 +116,12 @@ static void mktree_line(char *buf, int nul_term_line, int allow_missing)
>  			path, ptr, type_name(mode_type));
>  	}
>  
> -	/* Check the type of object identified by sha1 */
> -	obj_type = oid_object_info(the_repository, &oid, NULL);
> -	if (obj_type < 0) {
> -		if (allow_missing) {
> -			; /* no problem - missing objects are presumed to be of the right type */
> -		} else {
> +	if (!allow_missing) {
> +		/* Check the type of object identified by sha1 */
> +		obj_type = oid_object_info(the_repository, &oid, NULL);
> +		if (obj_type < 0) {
>  			die("entry '%s' object %s is unavailable", path, oid_to_hex(&oid));
>  		}
> -	} else {
>  		if (obj_type != mode_type) {
>  			/*
>  			 * The object exists but is of the wrong type.
diff mbox series

Patch

diff --git a/builtin/mktree.c b/builtin/mktree.c
index 902edba6d2..f41fda6e7d 100644
--- a/builtin/mktree.c
+++ b/builtin/mktree.c
@@ -116,15 +116,12 @@  static void mktree_line(char *buf, int nul_term_line, int allow_missing)
 			path, ptr, type_name(mode_type));
 	}
 
-	/* Check the type of object identified by sha1 */
-	obj_type = oid_object_info(the_repository, &oid, NULL);
-	if (obj_type < 0) {
-		if (allow_missing) {
-			; /* no problem - missing objects are presumed to be of the right type */
-		} else {
+	if (!allow_missing) {
+		/* Check the type of object identified by sha1 */
+		obj_type = oid_object_info(the_repository, &oid, NULL);
+		if (obj_type < 0) {
 			die("entry '%s' object %s is unavailable", path, oid_to_hex(&oid));
 		}
-	} else {
 		if (obj_type != mode_type) {
 			/*
 			 * The object exists but is of the wrong type.