From patchwork Fri Sep 1 14:24:54 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kousik Sanagavarapu
X-Patchwork-Id: 13372663
From: Kousik Sanagavarapu
To: git@vger.kernel.org
Cc: Kousik Sanagavarapu
Subject: [PATCH] ref-filter: sort numerically when ":size" is used
Date: Fri, 1 Sep 2023 19:54:54 +0530
Message-ID: <20230901142624.12063-1-five231003@gmail.com>
X-Mailer: git-send-email 2.42.0.51.g5dc72c0fbc.dirty
X-Mailing-List: git@vger.kernel.org

Atoms like "raw" and "contents" have a ":size" option which can be used
to know the size of the data. Since these atoms have the cmp_type
FIELD_STR, they are sorted alphabetically from 'a' to 'z' and '0' to
'9'. Meaning, even when the ":size" option is used and what we
ultimately have is numbers, we still sort alphabetically.
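
As a rough illustration outside of Git, the two orderings can be
reproduced with sort(1). This is only a sketch: the sizes are made up,
and the helper names alphabetic_sort and numeric_sort are hypothetical,
not part of Git.

```shell
#!/bin/sh
# Illustrative sizes only; not taken from a real repository.
sizes='1130
300
140'

# Byte-wise comparison, analogous to comparing the formatted size
# strings of a FIELD_STR atom: "1130" sorts before "140" and "300".
alphabetic_sort() {
	printf '%s\n' "$sizes" | LC_ALL=C sort
}

# Numeric comparison, which is what one would expect for ":size".
numeric_sort() {
	printf '%s\n' "$sizes" | LC_ALL=C sort -n
}

echo "alphabetic:"
alphabetic_sort
echo "numeric:"
numeric_sort
```

LC_ALL=C is set so that the byte-wise ordering does not depend on the
caller's locale.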

For example, consider the following case in a repo

    refname               contents:size    raw:size
    =======               =============    ========
    refs/heads/branch1    1130             1210
    refs/heads/master     300              410
    refs/tags/v1.0        140              260

Sorting with

    --format="%(refname) %(contents:size)" --sort=contents:size

would give

    refs/heads/branch1 1130
    refs/tags/v1.0 140
    refs/heads/master 300

which is an alphabetic sort, while what one might really expect is

    refs/tags/v1.0 140
    refs/heads/master 300
    refs/heads/branch1 1130

which is a numeric sort (that is, a "$ sort -n file" as opposed to a
"$ sort file", where "file" contains only the "contents:size" or
"raw:size" info, one value per line).

The same applies to "--sort=raw:size".

So, sort numerically whenever the sort is done with "contents:size" or
"raw:size" and do it the normal alphabetic way when "contents" or "raw"
are used with some other option (they are FIELD_STR anyway).

Signed-off-by: Kousik Sanagavarapu
---
 ref-filter.c            | 20 +++++++++++++++-----
 t/t6300-for-each-ref.sh | 15 +++++++++++++--
 2 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 1bfaf20fbf..5d7bea5f23 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -932,7 +932,13 @@ struct atom_value {
 	ssize_t s_size;
 	int (*handler)(struct atom_value *atomv, struct ref_formatting_state *state,
 		       struct strbuf *err);
-	uintmax_t value; /* used for sorting when not FIELD_STR */
+
+	/*
+	 * Used for sorting when not FIELD_STR or when FIELD_STR but the
+	 * sort should be numeric and not alphabetic.
+	 */
+	uintmax_t value;
+
 	struct used_atom *atom;
 };
 
@@ -1857,7 +1863,8 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct exp
 				v->s = xmemdupz(buf, buf_size);
 				v->s_size = buf_size;
 			} else if (atom->u.raw_data.option == RAW_LENGTH) {
-				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)buf_size);
+				v->value = (uintmax_t)buf_size;
+				v->s = xstrfmt("%"PRIuMAX, v->value);
 			}
 			continue;
 		}
@@ -1883,8 +1890,10 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct exp
 			v->s = strbuf_detach(&sb, NULL);
 		} else if (atom->u.contents.option == C_BODY_DEP)
 			v->s = xmemdupz(bodypos, bodylen);
-		else if (atom->u.contents.option == C_LENGTH)
-			v->s = xstrfmt("%"PRIuMAX, (uintmax_t)strlen(subpos));
+		else if (atom->u.contents.option == C_LENGTH) {
+			v->value = (uintmax_t)strlen(subpos);
+			v->s = xstrfmt("%"PRIuMAX, v->value);
+		}
 		else if (atom->u.contents.option == C_BODY)
 			v->s = xmemdupz(bodypos, nonsiglen);
 		else if (atom->u.contents.option == C_SIG)
@@ -2265,6 +2274,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 
 		v->s_size = ATOM_SIZE_UNSPECIFIED;
 		v->handler = append_atom;
+		v->value = 0;
 		v->atom = atom;
 
 		if (*name == '*') {
@@ -2986,7 +2996,7 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
 		cmp_detached_head = 1;
 	} else if (s->sort_flags & REF_SORTING_VERSION) {
 		cmp = versioncmp(va->s, vb->s);
-	} else if (cmp_type == FIELD_STR) {
+	} else if (cmp_type == FIELD_STR && !va->value && !vb->value) {
 		if (va->s_size < 0 && vb->s_size < 0) {
 			int (*cmp_fn)(const char *, const char *);
 			cmp_fn = s->sort_flags & REF_SORTING_ICASE
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index aa3c7c03c4..7b943fd34c 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -1017,16 +1017,16 @@ test_expect_success 'Verify sorts with raw' '
 test_expect_success 'Verify sorts with raw:size' '
 	cat >expected <<-EOF &&
 	refs/myblobs/blob8
-	refs/myblobs/first
 	refs/myblobs/blob7
-	refs/heads/main
 	refs/myblobs/blob4
 	refs/myblobs/blob1
 	refs/myblobs/blob2
 	refs/myblobs/blob3
 	refs/myblobs/blob5
 	refs/myblobs/blob6
+	refs/myblobs/first
 	refs/mytrees/first
+	refs/heads/main
 	EOF
 	git for-each-ref --format="%(refname)" --sort=raw:size \
 		refs/heads/main refs/myblobs/ refs/mytrees/first >actual &&
@@ -1138,6 +1138,17 @@ test_expect_success 'for-each-ref --format compare with cat-file --batch' '
 	test_cmp expected actual
 '
 
+test_expect_success 'verify sorts with contents:size' '
+	cat >expect <<-\EOF &&
+	refs/heads/main
+	refs/heads/newtag
+	refs/heads/ambiguous
+	EOF
+	git for-each-ref --format="%(refname)" \
+		--sort=contents:size refs/heads/ >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'set up multiple-sort tags' '
 	for when in 100000 200000
 	do