From patchwork Tue Jul 20 17:04:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388835 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 357C3C07E9B for ; Tue, 20 Jul 2021 17:05:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1FFC4610CC for ; Tue, 20 Jul 2021 17:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229708AbhGTQYy (ORCPT ); Tue, 20 Jul 2021 12:24:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49656 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231562AbhGTQYM (ORCPT ); Tue, 20 Jul 2021 12:24:12 -0400 Received: from mail-wr1-x435.google.com (mail-wr1-x435.google.com [IPv6:2a00:1450:4864:20::435]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 49777C061762 for ; Tue, 20 Jul 2021 10:04:50 -0700 (PDT) Received: by mail-wr1-x435.google.com with SMTP id u1so26832446wrs.1 for ; Tue, 20 Jul 2021 10:04:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=6RpFJu2vjGaQd/XxCqy3glY6M+uGGVu+N+4X/7oK2bY=; b=lobke5gH3DKryrZjVcQcxkT1IMTOw/MsLFbQEoTkAhtLtKAkeSem6GaiALXOSBc0gV 48xtOLyNbz2pRsmR6+oE+SUKXUp+2ETeQZQUcxczuEuT1dHQeoZ7a0Sj0L8cwenvB7FW 
6dhCxrwtDBzLI0UvJi8SvWt5aNYSlXwoeOu48cYu+lGeT7p4y7f0ncyE0dbD83Yu5QlP I2WE/mVdsqtCVIk7rpJvqLKf8oxthOXnCdKD+blWx/s7nFaCG8FuhKE9dQeNXbx/jWsF /2WWMdcitb2M4JqwW/bayyQ7FlBmqNX1NZSHQglSIrCJ2GdlNze+sj1jV4hpk0Fll1v1 dMag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=6RpFJu2vjGaQd/XxCqy3glY6M+uGGVu+N+4X/7oK2bY=; b=l0ZxectRrmnkCWtBNV4QNVok9YLMIbqJh1rm8FUInYZGHsb+1lvZjro0/d31tUACwL gYcIAJBuo7LFcwD3q7T8b28sTcKQRc2iJCagGiKyh72OIwdgWfpM30cRl2oSoVV/ddiW 7HbWizVjdHTpsiX464qfnbJHD7bZJceKyc4oIn0FXENRWlPSotuC/bPjwx9BJaO5Sil4 bWMeOkt4ZUSShXrpvYoCcHqkNJnzRmeYTokaXClxXInkeewQvveUUT4i05puq7ewbtR4 qktr6mqhGZhDTCYoMOvcSwNt5H6gARPN13Ea+nUmyGdO/ZonzU4TKLgzjzCFg1Sc4hLg 8T0g== X-Gm-Message-State: AOAM5329JyHx0UxTx/rvdToPoq0ayFQHyXATiyvlcSunaG4OEbeGI0Ue NmLD1rS3DUpulLtxv5bwHNbnFPOs/YY= X-Google-Smtp-Source: ABdhPJw6SfVYVjFpWNxPDjTzOoX7yTQJNAfPDbReI471AHhgkhverjQ/o8+Fz91vZhqaXkIWeeBeuA== X-Received: by 2002:adf:f908:: with SMTP id b8mr5153822wrr.302.1626800688873; Tue, 20 Jul 2021 10:04:48 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id u16sm29292363wrw.36.2021.07.20.10.04.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 10:04:48 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:21 +0000 Subject: [PATCH 01/26] hash.h: provide constants for the hash IDs Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys This will simplify referencing them from code that is not deeply integrated with Git, in particular, the reftable library. 
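As a rough illustration of what the two constants encode, the sketch below packs a four-character tag into a 32-bit value, most significant byte first, which is what the "big-endian" comments in the patch describe. The helper name format_id_from_tag() is hypothetical and is not part of Git or the reftable library; only the two macro values are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values as introduced by this patch in hash.h. */
#define GIT_SHA1_FORMAT_ID 0x73686131
#define GIT_SHA256_FORMAT_ID 0x73323536

/*
 * Hypothetical helper (not in Git): pack the first four bytes of `tag`
 * into a 32-bit integer, most significant byte first.
 */
static uint32_t format_id_from_tag(const char *tag)
{
	return ((uint32_t)(unsigned char)tag[0] << 24) |
	       ((uint32_t)(unsigned char)tag[1] << 16) |
	       ((uint32_t)(unsigned char)tag[2] << 8) |
	       (uint32_t)(unsigned char)tag[3];
}
```

With this helper, "sha1" maps to GIT_SHA1_FORMAT_ID and "s256" maps to GIT_SHA256_FORMAT_ID, so code outside Git proper can compare against the named constants instead of repeating the magic numbers.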
Signed-off-by: Han-Wen Nienhuys --- hash.h | 6 ++++++ object-file.c | 7 ++----- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/hash.h b/hash.h index 9c6df4d9527..45b207f37cc 100644 --- a/hash.h +++ b/hash.h @@ -95,12 +95,18 @@ static inline void git_SHA256_Clone(git_SHA256_CTX *dst, const git_SHA256_CTX *s /* Number of algorithms supported (including unknown). */ #define GIT_HASH_NALGOS (GIT_HASH_SHA256 + 1) +/* "sha1", big-endian */ +#define GIT_SHA1_FORMAT_ID 0x73686131 + /* The length in bytes and in hex digits of an object name (SHA-1 value). */ #define GIT_SHA1_RAWSZ 20 #define GIT_SHA1_HEXSZ (2 * GIT_SHA1_RAWSZ) /* The block size of SHA-1. */ #define GIT_SHA1_BLKSZ 64 +/* "s256", big-endian */ +#define GIT_SHA256_FORMAT_ID 0x73323536 + /* The length in bytes and in hex digits of an object name (SHA-256 value). */ #define GIT_SHA256_RAWSZ 32 #define GIT_SHA256_HEXSZ (2 * GIT_SHA256_RAWSZ) diff --git a/object-file.c b/object-file.c index ecca5a8da00..5f2b271e8bd 100644 --- a/object-file.c +++ b/object-file.c @@ -164,7 +164,6 @@ static void git_hash_unknown_final_oid(struct object_id *oid, git_hash_ctx *ctx) BUG("trying to finalize unknown hash"); } - const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { { NULL, @@ -183,8 +182,7 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { }, { "sha1", - /* "sha1", big-endian */ - 0x73686131, + GIT_SHA1_FORMAT_ID, GIT_SHA1_RAWSZ, GIT_SHA1_HEXSZ, GIT_SHA1_BLKSZ, @@ -199,8 +197,7 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { }, { "sha256", - /* "s256", big-endian */ - 0x73323536, + GIT_SHA256_FORMAT_ID, GIT_SHA256_RAWSZ, GIT_SHA256_HEXSZ, GIT_SHA256_BLKSZ, From patchwork Tue Jul 20 17:04:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388837 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: 
X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1C518C07E95 for ; Tue, 20 Jul 2021 17:05:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 002CF610FB for ; Tue, 20 Jul 2021 17:05:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229921AbhGTQZM (ORCPT ); Tue, 20 Jul 2021 12:25:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49658 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231866AbhGTQYN (ORCPT ); Tue, 20 Jul 2021 12:24:13 -0400 Received: from mail-wr1-x42b.google.com (mail-wr1-x42b.google.com [IPv6:2a00:1450:4864:20::42b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D2020C061766 for ; Tue, 20 Jul 2021 10:04:50 -0700 (PDT) Received: by mail-wr1-x42b.google.com with SMTP id c12so8763039wrt.3 for ; Tue, 20 Jul 2021 10:04:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=M66o8jlMvL4PeeUtekVFtRg/hRORM9wxMnGQIcXA8eM=; b=Arz4Fd5yHi5jHIijws3ptHHK85Dd/hr+aFhZYWv2rEMEE0ng3RcxO3hOqm8OUbpUvS dh8ZuqQpRM7SQ5xp3946Z4VFWyPPR2PWv2cFYLzzwX1tMKjvkG57LRyx36hPCQNwqqZs KKj7Quj+MN95fsdJAP19c76Hvxu9yDrbdE3hy9O4wpKINaZV1he9fX3Y0+4l+LQ0zHtp vUSpfGmwvEPkgV6xxYb73psgAJ9EEAE1dAMZMW+DeM3sUQr8V8NewAMPy15N3RkkxdAX eMglOkJyjiJPbvpkP58QmieJ1+gndR8q5ogU/rAHCmOq9eVRj9DeVOZXDvlhzmJsWa9G nI8w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=M66o8jlMvL4PeeUtekVFtRg/hRORM9wxMnGQIcXA8eM=; b=mooFkXxlteFBZoIysPC2pfa3oL++ypBSiTzkpDEedvHnA56+ToI6nq0lKTaqUH/xqX R+1Y59Qrn1mE20SephPnbGwo7H1AJckS0VbhuEvPyKvdExZPk4Uim/cT/tUHi+9pvUU2 5zLQa/QMVch/wFKMyrep61dsT9uXZqfuSS7W/8q8yo0hpCNESGY8FkQby6HuE1yMtxBc xNCe3IpfeNvLb68O+tCI9+XYOtf++9EzOR0zaK/FKW2NBA25V/cC0nFxOwK3SIpPZ1Eq w5s0kwAMXj03//V3d6frspV/5voPlfDG6hJ6kWh7gvb98fjIvmoJE9dD50uqs9uoYDmF N7Rw== X-Gm-Message-State: AOAM530XQK/bmMUNoQl/57tfRolrgkH+vJLfYN22Twscu3i/7qURtp0Q whn6FxV9c3CjwHl0XRrCOhiXAsMkhfk= X-Google-Smtp-Source: ABdhPJxrYmEmh/IcZ1k6JBAA2NKCmHgQ3m72Volgs21uhjitUI7YHi77S9D2h4nyccgacWyef0kYFg== X-Received: by 2002:adf:de92:: with SMTP id w18mr37475763wrl.42.1626800689429; Tue, 20 Jul 2021 10:04:49 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a8sm24349522wrt.61.2021.07.20.10.04.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 10:04:49 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:22 +0000 Subject: [PATCH 02/26] init-db: set the_repository->hash_algo early on Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys

The reftable backend needs to know the hash algorithm for writing the initialization hash table. The initial reftable contains a symref HEAD => "main" (or "master"), which is agnostic to the size of the hash value, but this is an exceptional circumstance, and the reftable library does not cater to this exception. It insists that all tables in the stack have a consistent format ID for the hash algorithm.

Call repo_set_hash_algo() directly after calling validate_hash_algorithm() (which reads $GIT_DEFAULT_HASH).
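To make the consistency requirement concrete, here is a minimal sketch of the kind of check it implies. It assumes each table records a 32-bit hash format ID. The struct demo_table and the function demo_stack_is_consistent() are hypothetical names invented for this illustration; they are not the reftable library's actual types or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GIT_SHA1_FORMAT_ID 0x73686131
#define GIT_SHA256_FORMAT_ID 0x73323536

/* Hypothetical stand-in for a table header; not the reftable struct. */
struct demo_table {
	uint32_t hash_id;
};

/*
 * Return 1 if every table in the (hypothetical) stack carries the
 * expected hash format ID, and 0 as soon as one table disagrees.
 */
static int demo_stack_is_consistent(const struct demo_table *tables,
				    size_t n, uint32_t expected)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tables[i].hash_id != expected)
			return 0;
	}
	return 1;
}
```

Under a rule like this, even an initial table that holds only a symref has to carry the right format ID, which is why the hash algorithm must be known before the first table is written.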
Helped-by: Junio C Hamano Signed-off-by: Han-Wen Nienhuys --- builtin/init-db.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/builtin/init-db.c b/builtin/init-db.c index 2167796ff2a..c2f03f6018e 100644 --- a/builtin/init-db.c +++ b/builtin/init-db.c @@ -425,6 +425,27 @@ int init_db(const char *git_dir, const char *real_git_dir, validate_hash_algorithm(&repo_fmt, hash); + /* + * At this point, the_repository we have in-core does not look + * anything like one that we would see initialized in an already + * working repository after calling setup_git_directory(). + * + * Calling repository.c::initialize_the_repository() may have + * prepared the .index .objects and .parsed_objects members, but + * other members like .gitdir, .commondir, etc. have not been + * initialized. + * + * Many API functions assume they are working with the_repository + * that has sensibly been initialized, but because we haven't + * really read from an existing repository, we need to hand-craft + * the necessary members of the structure to get out of this + * chicken-and-egg situation. + * + * For now, we update the hash algorithm member to what the + * validate_hash_algorithm() call decided for us. 
+ */ + repo_set_hash_algo(the_repository, repo_fmt.hash_algo); + reinit = create_default_files(template_dir, original_git_dir, initial_branch, &repo_fmt, flags & INIT_DB_QUIET); From patchwork Tue Jul 20 17:04:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388841 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D41DCC07E95 for ; Tue, 20 Jul 2021 17:06:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BC07D610D2 for ; Tue, 20 Jul 2021 17:06:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233694AbhGTQZS (ORCPT ); Tue, 20 Jul 2021 12:25:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49664 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231919AbhGTQYO (ORCPT ); Tue, 20 Jul 2021 12:24:14 -0400 Received: from mail-wm1-x32d.google.com (mail-wm1-x32d.google.com [IPv6:2a00:1450:4864:20::32d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 674C8C061767 for ; Tue, 20 Jul 2021 10:04:51 -0700 (PDT) Received: by mail-wm1-x32d.google.com with SMTP id l17-20020a05600c1d11b029021f84fcaf75so1918039wms.1 for ; Tue, 20 Jul 2021 10:04:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc 
:content-transfer-encoding:mime-version:to:cc; bh=yj9tx1HJN1gjVb7BbFdzwGzB8sZO/EgnY23zy/aVzOc=; b=VsikkGaegfIacPNimQGuYnYnduJhQ8mFBg9xALNt6ZpQTBL6iJ56FgLRYKkICfo4bo Kz9Dp34Us36we8Mj/L5+XSH/fM6vFT9b42w75NAs3MTmqIU0rR0guSn64v9Sx0YIPrG+ eFXIzP3PpQQ4wuI7xLkc3FMpNdRs3Mc/Tvur436CROYcmrK5sjK3vXGPOY/nb7ONp5pb E3OOuqZJh1W3iPiKhafyqepMVsqlrZ4Jzjgw20XMymJ+Dbb4zzdXyPbEFEasU+3VSays nhOHZCEyAItVxpeJr+4SrXZChKS9d4jMrPzJzujyrnXUETvDYQU0i/4y9jskh0bS+WTY LlVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=yj9tx1HJN1gjVb7BbFdzwGzB8sZO/EgnY23zy/aVzOc=; b=TR2iTfDnTYauLM3FZxEmdHjJfUz3UEHKCiNdl8MbSBuGoB4LN3rpoOx7gFynMS8D6B sYSxrrcQVsoRo5HDblU4tAuap+wkRwZqNmbRkWxxDiJCvt10sCAkSOhmlfw+9/MhMLop RUlnUOYjO0ACpxyni3uhfp4GDpzu7sryEG6iNRGUEAcw+jkvMKNaCUobmJF9nXveFZOc iziKGXkhiJRjCgUJSbSoHisA0rOwfcTFi73JnLZrTqWcrn2fza18mF+zIhDelS30TpOJ EgdzC6tlv+KEbx5Mluytc4G6FID5P/shKwNR6PeK1IYwcGRa+0mvsweblBbxVbtwJ9mN 4GuQ== X-Gm-Message-State: AOAM532OQb9gATUICHhBhvUKMo+rBVWdlPG7aVyo7i1hpQ71e13J1TzG KKvTPJkqvlIfO/yDx+QbGJ6tK4SvlP8= X-Google-Smtp-Source: ABdhPJySlmstCPR4lRDuR//HdHhxPH8owA8PJMWJz/nUNrZcN8hnrtDya8l9wVqb28DqurWNB8U1JA== X-Received: by 2002:a1c:7dcf:: with SMTP id y198mr25154280wmc.140.1626800689988; Tue, 20 Jul 2021 10:04:49 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id g3sm24394268wru.95.2021.07.20.10.04.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 10:04:49 -0700 (PDT) Message-Id: <76c2a1005da68babc292eedf05daa29d7f236fcf.1626800686.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:23 +0000 Subject: [PATCH 03/26] reftable: RFC: add LICENSE Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: 
Han-Wen Nienhuys The objective of this code is to be usable as a C library, so it can be reused in libgit2. This is currently using a BSD license as it is the liberal license I could find, but this could be changed to whatever fits the stated goal above. This code is currently imported from github.com/hanwen/reftable. Once this code lands in git.git, the C code will be removed from github.com/hanwen/reftable, and the git.git code will be the source of truth. Signed-off-by: Han-Wen Nienhuys --- reftable/LICENSE | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) create mode 100644 reftable/LICENSE diff --git a/reftable/LICENSE b/reftable/LICENSE new file mode 100644 index 00000000000..402e0f9356b --- /dev/null +++ b/reftable/LICENSE @@ -0,0 +1,31 @@ +BSD License + +Copyright (c) 2020, Google LLC +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +* Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +* Neither the name of Google LLC nor the names of its contributors may +be used to endorse or promote products derived from this software +without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. From patchwork Tue Jul 20 17:04:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388845 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 941C6C636C8 for ; Tue, 20 Jul 2021 17:06:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 78C7D610D2 for ; Tue, 20 Jul 2021 17:06:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229771AbhGTQZV (ORCPT ); Tue, 20 Jul 2021 12:25:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49670 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231993AbhGTQYS (ORCPT ); Tue, 20 Jul 2021 12:24:18 -0400 Received: from mail-wm1-x32b.google.com (mail-wm1-x32b.google.com [IPv6:2a00:1450:4864:20::32b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 557E6C061768 for ; Tue, 20 
Jul 2021 10:04:52 -0700 (PDT) Received: by mail-wm1-x32b.google.com with SMTP id f8-20020a1c1f080000b029022d4c6cfc37so1893843wmf.5 for ; Tue, 20 Jul 2021 10:04:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=/Izbl9wexLtHbO918dhEysjpUsFaAvsDqZEji3I0p88=; b=AtO29haX31OrfXmgjRREahWU8gkdCIHxOTzX0iAaxxUP1mXzt7jPv8PQ3fZCXl6Flk Z4ozurjGZBd+QKGY5GIFk1GedW9ePbIowTNbLObWtx0vQFw2PYTpBLYc0+3J64+kius2 tUg60S0UvR2g3jqIwUBEULmab5bmUbZ91ePgJW0UOTq4luH0/8yRU3xObEfWdLGwjS27 qcFJ8Ju6XuyX3zzFOqTSoriSISY8klmtwhoM2BI+g/3OuOVjB2eK/7F6mwFun1CTpUIi EHBBRJjblnNaBzeDjNT9CfF8xO9W2T5Uhud2Lqs8vXJKhIlF603nya1pStZh6nnbLLXy 8gMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=/Izbl9wexLtHbO918dhEysjpUsFaAvsDqZEji3I0p88=; b=Lwjr0QNak1P+E7FXoirGHVPuM7JlQNjuUab608vJknKaCRwC4IJ3FmZcKa2pfB22/a HMEulcL58s5OjYCqH/fbjpPQMNwuCv5a8hgCeuPE2nfeVsAfA7za8uCXkgFcNDYWtNf0 MfMwO5rJH8rPg6bievXQWTHfzxe635Y2DaS2DoocuBr7M9vBRQgQ0TEUByewlgMcck/i lLYsl0it6PFHKvtBDzTAlByxspqEYpNW5svM3kw/rFH0oef8nLWBIE3iofJM1Wed7RUQ H5IQOEJaulALQT4gqutIpS5ZN+vpaGBCpWh/pFA7yVJcPwj+qQRcD2WPnMLLRBw0xypx 6m8A== X-Gm-Message-State: AOAM531fRQkmRCwH+VEg0OrsaqtDL6oK5X+bA43H8HgE61o2qZeX+he4 lLMyDcOztKYpN+gpVN08aa9bktWS1pU= X-Google-Smtp-Source: ABdhPJy5S7oQ0vo/wcpNZNi+EcefiMI5rucD2bH67ha2ITjj23hQCvaX0HSaHcC3FiMWJrTTz+kGYw== X-Received: by 2002:a05:600c:4101:: with SMTP id j1mr33143182wmi.130.1626800690647; Tue, 20 Jul 2021 10:04:50 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id x9sm24583543wrm.82.2021.07.20.10.04.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 10:04:50 -0700 (PDT) Message-Id: 
<616d6ed89eebe1a9ba55c51656d4426f07b7b153.1626800686.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:24 +0000 Subject: [PATCH 04/26] reftable: add error related functionality Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys

The reftable/ directory is structured as a library, so it cannot crash on misuse. Instead, it returns error codes.

In addition, the error code can be used to signal conditions from lower levels of the library to be handled by higher levels of the library. For example, a transaction might legitimately write an empty reftable file, but in that case, we'd want to shortcut the transaction overhead.

Signed-off-by: Han-Wen Nienhuys
---
 reftable/error.c          | 41 ++++++++++++++++++++++++++
 reftable/reftable-error.h | 62 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+)
 create mode 100644 reftable/error.c
 create mode 100644 reftable/reftable-error.h

diff --git a/reftable/error.c b/reftable/error.c
new file mode 100644
index 00000000000..f6f16def921
--- /dev/null
+++ b/reftable/error.c
@@ -0,0 +1,41 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable-error.h"
+
+#include <stdio.h>
+
+const char *reftable_error_str(int err)
+{
+	static char buf[250];
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return "I/O error";
+	case REFTABLE_FORMAT_ERROR:
+		return "corrupt reftable file";
+	case REFTABLE_NOT_EXIST_ERROR:
+		return "file does not exist";
+	case REFTABLE_LOCK_ERROR:
+		return "data is outdated";
+	case REFTABLE_API_ERROR:
+		return "misuse of the reftable API";
+	case REFTABLE_ZLIB_ERROR:
+		return "zlib failure";
+	case REFTABLE_NAME_CONFLICT:
+		return "file/directory conflict";
+	case
REFTABLE_EMPTY_TABLE_ERROR:
+		return "wrote empty table";
+	case REFTABLE_REFNAME_ERROR:
+		return "invalid refname";
+	case -1:
+		return "general error";
+	default:
+		snprintf(buf, sizeof(buf), "unknown error code %d", err);
+		return buf;
+	}
+}
diff --git a/reftable/reftable-error.h b/reftable/reftable-error.h
new file mode 100644
index 00000000000..6f89bedf1a5
--- /dev/null
+++ b/reftable/reftable-error.h
@@ -0,0 +1,62 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_ERROR_H
+#define REFTABLE_ERROR_H
+
+/*
+ * Errors in reftable calls are signaled with negative integer return values. 0
+ * means success.
+ */
+enum reftable_error {
+	/* Unexpected file system behavior */
+	REFTABLE_IO_ERROR = -2,
+
+	/* Format inconsistency on reading data */
+	REFTABLE_FORMAT_ERROR = -3,
+
+	/* File does not exist. Returned from block_source_from_file(), because
+	 * it needs special handling in stack.
+	 */
+	REFTABLE_NOT_EXIST_ERROR = -4,
+
+	/* Trying to write out-of-date data. */
+	REFTABLE_LOCK_ERROR = -5,
+
+	/* Misuse of the API:
+	 *  - on writing a record with NULL refname.
+	 *  - on writing a reftable_ref_record outside the table limits
+	 *  - on writing a ref or log record before the stack's
+	 *    next_update_index
+	 *  - on writing a log record with multiline message with
+	 *    exact_log_message unset
+	 *  - on reading a reftable_ref_record from log iterator, or vice versa.
+	 *
+	 * When a call misuses the API, the internal state of the library is
+	 * kept unchanged.
+	 */
+	REFTABLE_API_ERROR = -6,
+
+	/* Decompression error */
+	REFTABLE_ZLIB_ERROR = -7,
+
+	/* Wrote a table without blocks. */
+	REFTABLE_EMPTY_TABLE_ERROR = -8,
+
+	/* Dir/file conflict. */
+	REFTABLE_NAME_CONFLICT = -9,
+
+	/* Invalid ref name. */
+	REFTABLE_REFNAME_ERROR = -10,
+};
+
+/* convert the numeric error code to a string.
The string should not be + * deallocated. */ +const char *reftable_error_str(int err); + +#endif From patchwork Tue Jul 20 17:04:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388847 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B17CEC07E9B for ; Tue, 20 Jul 2021 17:06:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9050361186 for ; Tue, 20 Jul 2021 17:06:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233465AbhGTQZa (ORCPT ); Tue, 20 Jul 2021 12:25:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49674 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232018AbhGTQYS (ORCPT ); Tue, 20 Jul 2021 12:24:18 -0400 Received: from mail-wr1-x433.google.com (mail-wr1-x433.google.com [IPv6:2a00:1450:4864:20::433]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E2763C0613DB for ; Tue, 20 Jul 2021 10:04:52 -0700 (PDT) Received: by mail-wr1-x433.google.com with SMTP id f9so26764970wrq.11 for ; Tue, 20 Jul 2021 10:04:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=K4tUjqSFqJTJQCD8vxDGVh2UO9e26qgMzo69csxQmBo=; 
b=uWeuZG4OU+RvvMcp+QMtTGbFGs/vjtLd8PVZXlM6zVUHQCs9q90TJihWc/D6BMhR6o wvO0ZJCS9b3+vrKsoyi2P0oSwrNp7bgTmCijitHuESjzCu6qz43m37DxDzxWs4Y2BpAN X2Znqd+gtVDd0x42n6956Izrw+D0ESDB9geeg4GPOVlHHMEnfQwzJetokJxpx8/Du13h UhmA7F1/LZEZ0GGI1yeBelTBRhnMgCT4n2M5ef8k//MvRyK25hI9K4hXcwq6t1Q8QzWn wpcWD69fbK/ZNdFwCeVwMbl5n4LOc8EqzvU8mtTstDjahaait7ZUDE7cH3lS3pJ9lioP g+8A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=K4tUjqSFqJTJQCD8vxDGVh2UO9e26qgMzo69csxQmBo=; b=mETaNGs4NE5eWe5cGlsfs2tHrZQSI79AmtTXitptT1VgZpT16+9l1LR7WzuFz2sNdk VMcJCEO5BW+xv27He14mFZjtJt2oyPuuj29d70OqCJRYioCtiQo/PfHw+TVz4yaVt+5K VrkVaM0sDxv28DTGGd98bf4/Giehv1+oDA8R9bfiTEb5jvll1pADLpbawVOm5ITtqNFB QMmDmUkqLRsqj1ZFlb+wHLoT0/qspxjmradKSbmzQ3LhbojLN4JAEaCcW0VZJZSdvRRK RnYuJS6eaOaCnky8BjgnaZUGPA7bEH3rRgd+oA4xaEymZLbC9K/hey9kFJjLodV4CMlS rp4Q== X-Gm-Message-State: AOAM533p/NIGiUFC0YOXPDBHgWcDBL9JUhEyWTFWgK2hH++2wiiLJg/5 cHt4VEJhxRNfjX5jxPRRqxBECCB2knU= X-Google-Smtp-Source: ABdhPJyIQ1PMwYUtPHoW0FjuWCYCo9C0Y2MLtErqmEYF6CkyUF0Mr12SGUhbmYxbvLUQtwP6QfW+Vg== X-Received: by 2002:a5d:4010:: with SMTP id n16mr37031172wrp.142.1626800691372; Tue, 20 Jul 2021 10:04:51 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id q19sm2830005wmc.44.2021.07.20.10.04.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 10:04:51 -0700 (PDT) Message-Id: <9d2cdfe3ddd645d5d82f30db184538ad208114f9.1626800686.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:25 +0000 Subject: [PATCH 05/26] reftable: utility functions Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys This commit provides basic utility classes for the reftable library. 
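As an illustration of the kind of helper meant here, the sketch below shows a 24-bit big-endian encode/decode pair and a predicate-driven binary search. All demo_* names are hypothetical and exist only for this example. The search is written as a plain lower-bound loop and is not claimed to match the library's own implementation line for line. It assumes the predicate is monotone: false for a prefix of the indices, then true for the rest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low 24 bits of `i` in out[0..2], most significant byte first. */
static void demo_put_be24(uint8_t *out, uint32_t i)
{
	out[0] = (uint8_t)((i >> 16) & 0xff);
	out[1] = (uint8_t)((i >> 8) & 0xff);
	out[2] = (uint8_t)(i & 0xff);
}

/* Inverse of demo_put_be24(). */
static uint32_t demo_get_be24(const uint8_t *in)
{
	return (uint32_t)in[0] << 16 | (uint32_t)in[1] << 8 | (uint32_t)in[2];
}

/*
 * Return the smallest k in [0, sz) for which f(k, args) is true,
 * or sz if there is none. Invariant: f is false for every index
 * below lo and true for every index from hi upwards.
 */
static size_t demo_binsearch(size_t sz, int (*f)(size_t k, void *args),
			     void *args)
{
	size_t lo = 0;
	size_t hi = sz;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (f(mid, args))
			hi = mid;
		else
			lo = mid + 1;
	}
	return lo;
}

/* Example predicate: is sorted[k] at least `needle`? */
struct demo_args {
	const int *sorted;
	int needle;
};

static int demo_at_least(size_t k, void *args)
{
	struct demo_args *a = args;

	return a->sorted[k] >= a->needle;
}
```

Passing the comparison as a callback keeps the search independent of the record type, at the cost of one indirect call per probe.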
Signed-off-by: Han-Wen Nienhuys Helped-by: Johannes Schindelin --- Makefile | 25 +++++- contrib/buildsystems/CMakeLists.txt | 14 ++- reftable/basics.c | 128 ++++++++++++++++++++++++++++ reftable/basics.h | 60 +++++++++++++ reftable/basics_test.c | 98 +++++++++++++++++++++ reftable/publicbasics.c | 58 +++++++++++++ reftable/reftable-malloc.h | 18 ++++ reftable/reftable-tests.h | 22 +++++ reftable/system.h | 24 ++++++ reftable/test_framework.c | 23 +++++ reftable/test_framework.h | 53 ++++++++++++ t/helper/test-reftable.c | 9 ++ t/helper/test-tool.c | 3 +- t/helper/test-tool.h | 1 + t/t0032-reftable-unittest.sh | 15 ++++ 15 files changed, 545 insertions(+), 6 deletions(-) create mode 100644 reftable/basics.c create mode 100644 reftable/basics.h create mode 100644 reftable/basics_test.c create mode 100644 reftable/publicbasics.c create mode 100644 reftable/reftable-malloc.h create mode 100644 reftable/reftable-tests.h create mode 100644 reftable/system.h create mode 100644 reftable/test_framework.c create mode 100644 reftable/test_framework.h create mode 100644 t/helper/test-reftable.c create mode 100755 t/t0032-reftable-unittest.sh diff --git a/Makefile b/Makefile index c7c46c017d3..ed969b0793f 100644 --- a/Makefile +++ b/Makefile @@ -741,6 +741,7 @@ TEST_BUILTINS_OBJS += test-read-cache.o TEST_BUILTINS_OBJS += test-read-graph.o TEST_BUILTINS_OBJS += test-read-midx.o TEST_BUILTINS_OBJS += test-ref-store.o +TEST_BUILTINS_OBJS += test-reftable.o TEST_BUILTINS_OBJS += test-regex.o TEST_BUILTINS_OBJS += test-repository.o TEST_BUILTINS_OBJS += test-revision-walking.o @@ -819,6 +820,8 @@ TEST_SHELL_PATH = $(SHELL_PATH) LIB_FILE = libgit.a XDIFF_LIB = xdiff/lib.a +REFTABLE_LIB = reftable/libreftable.a +REFTABLE_TEST_LIB = reftable/libreftable_test.a GENERATED_H += command-list.h GENERATED_H += config-list.h @@ -1191,7 +1194,7 @@ THIRD_PARTY_SOURCES += compat/regex/% THIRD_PARTY_SOURCES += sha1collisiondetection/% THIRD_PARTY_SOURCES += sha1dc/% -GITLIBS = common-main.o 
$(LIB_FILE) $(XDIFF_LIB) +GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) EXTLIBS = GIT_USER_AGENT = git/$(GIT_VERSION) @@ -2438,7 +2441,15 @@ XDIFF_OBJS += xdiff/xutils.o .PHONY: xdiff-objs xdiff-objs: $(XDIFF_OBJS) +REFTABLE_OBJS += reftable/basics.o +REFTABLE_OBJS += reftable/error.o +REFTABLE_OBJS += reftable/publicbasics.o + +REFTABLE_TEST_OBJS += reftable/test_framework.o +REFTABLE_TEST_OBJS += reftable/basics_test.o + TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) + .PHONY: test-objs test-objs: $(TEST_OBJS) @@ -2454,6 +2465,8 @@ OBJECTS += $(PROGRAM_OBJS) OBJECTS += $(TEST_OBJS) OBJECTS += $(XDIFF_OBJS) OBJECTS += $(FUZZ_OBJS) +OBJECTS += $(REFTABLE_OBJS) $(REFTABLE_TEST_OBJS) + ifndef NO_CURL OBJECTS += http.o http-walker.o remote-curl.o endif @@ -2604,6 +2617,12 @@ $(LIB_FILE): $(LIB_OBJS) $(XDIFF_LIB): $(XDIFF_OBJS) $(QUIET_AR)$(AR) $(ARFLAGS) $@ $^ +$(REFTABLE_LIB): $(REFTABLE_OBJS) + $(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^ + +$(REFTABLE_TEST_LIB): $(REFTABLE_TEST_OBJS) + $(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^ + export DEFAULT_EDITOR DEFAULT_PAGER Documentation/GIT-EXCLUDED-PROGRAMS: FORCE @@ -2888,7 +2907,7 @@ perf: all t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) -t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) +t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) $(REFTABLE_TEST_LIB) $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS) check-sha1:: t/helper/test-tool$X @@ -3218,7 +3237,7 @@ cocciclean: clean: profile-clean coverage-clean cocciclean $(RM) *.res $(RM) $(OBJECTS) - $(RM) $(LIB_FILE) $(XDIFF_LIB) + $(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X $(RM) $(TEST_PROGRAMS) $(RM) $(FUZZ_PROGRAMS) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 
171b4124afe..c2bf5bdffc6 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -640,6 +640,12 @@ parse_makefile_for_sources(libxdiff_SOURCES "XDIFF_OBJS") list(TRANSFORM libxdiff_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/") add_library(xdiff STATIC ${libxdiff_SOURCES}) +#reftable +parse_makefile_for_sources(reftable_SOURCES "REFTABLE_OBJS") + +list(TRANSFORM reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/") +add_library(reftable STATIC ${reftable_SOURCES}) + if(WIN32) if(NOT MSVC)#use windres when compiling with gcc and clang add_custom_command(OUTPUT ${CMAKE_BINARY_DIR}/git.res @@ -662,7 +668,7 @@ endif() #link all required libraries to common-main add_library(common-main OBJECT ${CMAKE_SOURCE_DIR}/common-main.c) -target_link_libraries(common-main libgit xdiff ${ZLIB_LIBRARIES}) +target_link_libraries(common-main libgit xdiff reftable ${ZLIB_LIBRARIES}) if(Intl_FOUND) target_link_libraries(common-main ${Intl_LIBRARIES}) endif() @@ -902,11 +908,15 @@ if(BUILD_TESTING) add_executable(test-fake-ssh ${CMAKE_SOURCE_DIR}/t/helper/test-fake-ssh.c) target_link_libraries(test-fake-ssh common-main) +#reftable-tests +parse_makefile_for_sources(test-reftable_SOURCES "REFTABLE_TEST_OBJS") +list(TRANSFORM test-reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/") + #test-tool parse_makefile_for_sources(test-tool_SOURCES "TEST_BUILTINS_OBJS") list(TRANSFORM test-tool_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/t/helper/") -add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES}) +add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES} ${test-reftable_SOURCES}) target_link_libraries(test-tool common-main) set_target_properties(test-fake-ssh test-tool diff --git a/reftable/basics.c b/reftable/basics.c new file mode 100644 index 00000000000..f761e48028c --- /dev/null +++ b/reftable/basics.c @@ -0,0 +1,128 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style 
+license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "basics.h" + +void put_be24(uint8_t *out, uint32_t i) +{ + out[0] = (uint8_t)((i >> 16) & 0xff); + out[1] = (uint8_t)((i >> 8) & 0xff); + out[2] = (uint8_t)(i & 0xff); +} + +uint32_t get_be24(uint8_t *in) +{ + return (uint32_t)(in[0]) << 16 | (uint32_t)(in[1]) << 8 | + (uint32_t)(in[2]); +} + +void put_be16(uint8_t *out, uint16_t i) +{ + out[0] = (uint8_t)((i >> 8) & 0xff); + out[1] = (uint8_t)(i & 0xff); +} + +int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args) +{ + size_t lo = 0; + size_t hi = sz; + + /* Invariants: + * + * (hi == sz) || f(hi) == true + * (lo == 0 && f(0) == true) || f(lo) == false + */ + while (hi - lo > 1) { + size_t mid = lo + (hi - lo) / 2; + + if (f(mid, args)) + hi = mid; + else + lo = mid; + } + + if (lo) + return hi; + + return f(0, args) ? 0 : 1; +} + +void free_names(char **a) +{ + char **p; + if (!a) { + return; + } + for (p = a; *p; p++) { + reftable_free(*p); + } + reftable_free(a); +} + +int names_length(char **names) +{ + char **p = names; + for (; *p; p++) { + /* empty */ + } + return p - names; +} + +void parse_names(char *buf, int size, char ***namesp) +{ + char **names = NULL; + size_t names_cap = 0; + size_t names_len = 0; + + char *p = buf; + char *end = buf + size; + while (p < end) { + char *next = strchr(p, '\n'); + if (next && next < end) { + *next = 0; + } else { + next = end; + } + if (p < next) { + if (names_len == names_cap) { + names_cap = 2 * names_cap + 1; + names = reftable_realloc( + names, names_cap * sizeof(*names)); + } + names[names_len++] = xstrdup(p); + } + p = next + 1; + } + + names = reftable_realloc(names, (names_len + 1) * sizeof(*names)); + names[names_len] = NULL; + *namesp = names; +} + +int names_equal(char **a, char **b) +{ + int i = 0; + for (; a[i] && b[i]; i++) { + if (strcmp(a[i], b[i])) { + return 0; + } + } + + return a[i] == b[i]; +} + +int
common_prefix_size(struct strbuf *a, struct strbuf *b) +{ + int p = 0; + for (; p < a->len && p < b->len; p++) { + if (a->buf[p] != b->buf[p]) + break; + } + + return p; +} diff --git a/reftable/basics.h b/reftable/basics.h new file mode 100644 index 00000000000..096b36862b9 --- /dev/null +++ b/reftable/basics.h @@ -0,0 +1,60 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef BASICS_H +#define BASICS_H + +/* + * miscellaneous utilities that are not provided by Git. + */ + +#include "system.h" + +/* Bigendian en/decoding of integers */ + +void put_be24(uint8_t *out, uint32_t i); +uint32_t get_be24(uint8_t *in); +void put_be16(uint8_t *out, uint16_t i); + +/* + * find smallest index i in [0, sz) at which f(i) is true, assuming + * that f is ascending. Return sz if f(i) is false for all indices. + * + * Contrary to bsearch(3), this returns something useful if the argument is not + * found. + */ +int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args); + +/* + * Frees a NULL terminated array of malloced strings. The array itself is also + * freed. + */ +void free_names(char **a); + +/* parse a newline separated list of names. `size` is the length of the buffer, + * without terminating '\0'. Empty names are discarded. */ +void parse_names(char *buf, int size, char ***namesp); + +/* compares two NULL-terminated arrays of strings. */ +int names_equal(char **a, char **b); + +/* returns the array size of a NULL-terminated array of strings. 
*/ +int names_length(char **names); + +/* Allocation routines; they invoke the functions set through + * reftable_set_alloc() */ +void *reftable_malloc(size_t sz); +void *reftable_realloc(void *p, size_t sz); +void reftable_free(void *p); +void *reftable_calloc(size_t sz); + +/* Find the longest shared prefix size of `a` and `b` */ +struct strbuf; +int common_prefix_size(struct strbuf *a, struct strbuf *b); + +#endif diff --git a/reftable/basics_test.c b/reftable/basics_test.c new file mode 100644 index 00000000000..1fcd2297256 --- /dev/null +++ b/reftable/basics_test.c @@ -0,0 +1,98 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" + +#include "basics.h" +#include "test_framework.h" +#include "reftable-tests.h" + +struct binsearch_args { + int key; + int *arr; +}; + +static int binsearch_func(size_t i, void *void_args) +{ + struct binsearch_args *args = void_args; + + return args->key < args->arr[i]; +} + +static void test_binsearch(void) +{ + int arr[] = { 2, 4, 6, 8, 10 }; + size_t sz = ARRAY_SIZE(arr); + struct binsearch_args args = { + .arr = arr, + }; + + int i = 0; + for (i = 1; i < 11; i++) { + int res; + args.key = i; + res = binsearch(sz, &binsearch_func, &args); + + if (res < sz) { + EXPECT(args.key < arr[res]); + if (res > 0) { + EXPECT(args.key >= arr[res - 1]); + } + } else { + EXPECT(args.key == 10 || args.key == 11); + } + } +} + +static void test_names_length(void) +{ + char *a[] = { "a", "b", NULL }; + EXPECT(names_length(a) == 2); +} + +static void test_parse_names_normal(void) +{ + char in[] = "a\nb\n"; + char **out = NULL; + parse_names(in, strlen(in), &out); + EXPECT(!strcmp(out[0], "a")); + EXPECT(!strcmp(out[1], "b")); + EXPECT(!out[2]); + free_names(out); +} + +static void test_parse_names_drop_empty(void) +{ + char in[] = "a\n\n"; + char **out = NULL; + parse_names(in, 
strlen(in), &out); + EXPECT(!strcmp(out[0], "a")); + EXPECT(!out[1]); + free_names(out); +} + +static void test_common_prefix(void) +{ + struct strbuf s1 = STRBUF_INIT; + struct strbuf s2 = STRBUF_INIT; + strbuf_addstr(&s1, "abcdef"); + strbuf_addstr(&s2, "abc"); + EXPECT(common_prefix_size(&s1, &s2) == 3); + strbuf_release(&s1); + strbuf_release(&s2); +} + +int basics_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_common_prefix); + RUN_TEST(test_parse_names_normal); + RUN_TEST(test_parse_names_drop_empty); + RUN_TEST(test_binsearch); + RUN_TEST(test_names_length); + return 0; +} diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c new file mode 100644 index 00000000000..bd0a02d3f68 --- /dev/null +++ b/reftable/publicbasics.c @@ -0,0 +1,58 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "reftable-malloc.h" + +#include "basics.h" +#include "system.h" + +static void *(*reftable_malloc_ptr)(size_t sz) = &malloc; +static void *(*reftable_realloc_ptr)(void *, size_t) = &realloc; +static void (*reftable_free_ptr)(void *) = &free; + +void *reftable_malloc(size_t sz) +{ + return (*reftable_malloc_ptr)(sz); +} + +void *reftable_realloc(void *p, size_t sz) +{ + return (*reftable_realloc_ptr)(p, sz); +} + +void reftable_free(void *p) +{ + reftable_free_ptr(p); +} + +void *reftable_calloc(size_t sz) +{ + void *p = reftable_malloc(sz); + memset(p, 0, sz); + return p; +} + +void reftable_set_alloc(void *(*malloc)(size_t), + void *(*realloc)(void *, size_t), void (*free)(void *)) +{ + reftable_malloc_ptr = malloc; + reftable_realloc_ptr = realloc; + reftable_free_ptr = free; +} + +int hash_size(uint32_t id) +{ + switch (id) { + case 0: + case GIT_SHA1_FORMAT_ID: + return GIT_SHA1_RAWSZ; + case GIT_SHA256_FORMAT_ID: + return GIT_SHA256_RAWSZ; + } + abort(); +} diff --git 
a/reftable/reftable-malloc.h b/reftable/reftable-malloc.h new file mode 100644 index 00000000000..5f2185f1f34 --- /dev/null +++ b/reftable/reftable-malloc.h @@ -0,0 +1,18 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_H +#define REFTABLE_H + +#include + +/* Overrides the functions to use for memory management. */ +void reftable_set_alloc(void *(*malloc)(size_t), + void *(*realloc)(void *, size_t), void (*free)(void *)); + +#endif diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h new file mode 100644 index 00000000000..5e7698ae654 --- /dev/null +++ b/reftable/reftable-tests.h @@ -0,0 +1,22 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_TESTS_H +#define REFTABLE_TESTS_H + +int basics_test_main(int argc, const char **argv); +int block_test_main(int argc, const char **argv); +int merged_test_main(int argc, const char **argv); +int record_test_main(int argc, const char **argv); +int refname_test_main(int argc, const char **argv); +int reftable_test_main(int argc, const char **argv); +int stack_test_main(int argc, const char **argv); +int tree_test_main(int argc, const char **argv); +int reftable_dump_main(int argc, char *const *argv); + +#endif diff --git a/reftable/system.h b/reftable/system.h new file mode 100644 index 00000000000..bf963ee458e --- /dev/null +++ b/reftable/system.h @@ -0,0 +1,24 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef SYSTEM_H +#define SYSTEM_H + +// This header glues the reftable library to the rest of Git + +#include 
"git-compat-util.h" +#include "strbuf.h" +#include "hash.h" /* hash ID, sizes.*/ +#include "dir.h" /* remove_dir_recursively, for tests.*/ + +#include + +struct strbuf; +int hash_size(uint32_t id); + +#endif diff --git a/reftable/test_framework.c b/reftable/test_framework.c new file mode 100644 index 00000000000..84ac972cad0 --- /dev/null +++ b/reftable/test_framework.c @@ -0,0 +1,23 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" +#include "test_framework.h" + +#include "basics.h" + +void set_test_hash(uint8_t *p, int i) +{ + memset(p, (uint8_t)i, hash_size(GIT_SHA1_FORMAT_ID)); +} + +ssize_t strbuf_add_void(void *b, const void *data, size_t sz) +{ + strbuf_add(b, data, sz); + return sz; +} diff --git a/reftable/test_framework.h b/reftable/test_framework.h new file mode 100644 index 00000000000..774cb275bf6 --- /dev/null +++ b/reftable/test_framework.h @@ -0,0 +1,53 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef TEST_FRAMEWORK_H +#define TEST_FRAMEWORK_H + +#include "system.h" +#include "reftable-error.h" + +#define EXPECT_ERR(c) \ + if (c != 0) { \ + fflush(stderr); \ + fflush(stdout); \ + fprintf(stderr, "%s: %d: error == %d (%s), want 0\n", \ + __FILE__, __LINE__, c, reftable_error_str(c)); \ + abort(); \ + } + +#define EXPECT_STREQ(a, b) \ + if (strcmp(a, b)) { \ + fflush(stderr); \ + fflush(stdout); \ + fprintf(stderr, "%s:%d: %s (%s) != %s (%s)\n", __FILE__, \ + __LINE__, #a, a, #b, b); \ + abort(); \ + } + +#define EXPECT(c) \ + if (!(c)) { \ + fflush(stderr); \ + fflush(stdout); \ + fprintf(stderr, "%s: %d: failed assertion %s\n", __FILE__, \ + __LINE__, #c); \ + abort(); \ + } + +#define RUN_TEST(f) \ + fprintf(stderr, 
"running %s\n", #f); \ + fflush(stderr); \ + f(); + +void set_test_hash(uint8_t *p, int i); + +/* Like strbuf_add, but suitable for passing to reftable_new_writer + */ +ssize_t strbuf_add_void(void *b, const void *data, size_t sz); + +#endif diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c new file mode 100644 index 00000000000..3b58e423e7b --- /dev/null +++ b/t/helper/test-reftable.c @@ -0,0 +1,9 @@ +#include "reftable/reftable-tests.h" +#include "test-tool.h" + +int cmd__reftable(int argc, const char **argv) +{ + basics_test_main(argc, argv); + + return 0; +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index b21e8f15190..01201629fca 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -51,13 +51,14 @@ static struct test_cmd cmds[] = { { "pcre2-config", cmd__pcre2_config }, { "pkt-line", cmd__pkt_line }, { "prio-queue", cmd__prio_queue }, - { "proc-receive", cmd__proc_receive}, + { "proc-receive", cmd__proc_receive }, { "progress", cmd__progress }, { "reach", cmd__reach }, { "read-cache", cmd__read_cache }, { "read-graph", cmd__read_graph }, { "read-midx", cmd__read_midx }, { "ref-store", cmd__ref_store }, + { "reftable", cmd__reftable }, { "regex", cmd__regex }, { "repository", cmd__repository }, { "revision-walking", cmd__revision_walking }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index f845ced4b3a..cb90b7f4f7b 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -47,6 +47,7 @@ int cmd__read_cache(int argc, const char **argv); int cmd__read_graph(int argc, const char **argv); int cmd__read_midx(int argc, const char **argv); int cmd__ref_store(int argc, const char **argv); +int cmd__reftable(int argc, const char **argv); int cmd__regex(int argc, const char **argv); int cmd__repository(int argc, const char **argv); int cmd__revision_walking(int argc, const char **argv); diff --git a/t/t0032-reftable-unittest.sh b/t/t0032-reftable-unittest.sh new file mode 100755 index 00000000000..0ed14971a58 --- 
/dev/null +++ b/t/t0032-reftable-unittest.sh @@ -0,0 +1,15 @@ +#!/bin/sh +# +# Copyright (c) 2020 Google LLC +# + +test_description='reftable unittests' + +. ./test-lib.sh + +test_expect_success 'unittests' ' + TMPDIR=$(pwd) && export TMPDIR && + test-tool reftable +' + +test_done
From patchwork Tue Jul 20 17:04:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388843 Message-Id: <5ef4b7040a360f60f33dbe6f30afe1db4e54b384.1626800686.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:26 +0000 Subject: [PATCH 06/26] reftable: add blocksource, an abstraction for random access reads Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen
Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The reftable format is usually stored in files. However, we abstract this away using the blocksource data structure. This has two advantages: * log blocks are zlib compressed, and handling them is simpler if we can discard byte segments from within the block layer. * for unit tests, it is useful to read and write in memory. The blocksource lets us decouple the data from on-disk files. Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + reftable/blocksource.c | 148 ++++++++++++++++++++++++++++++++ reftable/blocksource.h | 22 +++++ reftable/reftable-blocksource.h | 49 +++++++++++ 4 files changed, 220 insertions(+) create mode 100644 reftable/blocksource.c create mode 100644 reftable/blocksource.h create mode 100644 reftable/reftable-blocksource.h diff --git a/Makefile b/Makefile index ed969b0793f..ad10ada9283 100644 --- a/Makefile +++ b/Makefile @@ -2443,6 +2443,7 @@ xdiff-objs: $(XDIFF_OBJS) REFTABLE_OBJS += reftable/basics.o REFTABLE_OBJS += reftable/error.o +REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_TEST_OBJS += reftable/test_framework.o diff --git a/reftable/blocksource.c b/reftable/blocksource.c new file mode 100644 index 00000000000..0044eecd9aa --- /dev/null +++ b/reftable/blocksource.c @@ -0,0 +1,148 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" + +#include "basics.h" +#include "blocksource.h" +#include "reftable-blocksource.h" +#include "reftable-error.h" + +static void strbuf_return_block(void *b, struct reftable_block *dest) +{ + memset(dest->data, 0xff, dest->len); + reftable_free(dest->data); +} + +static void strbuf_close(void *b) +{ +} + +static int strbuf_read_block(void *v, struct
reftable_block *dest, uint64_t off, + uint32_t size) +{ + struct strbuf *b = v; + assert(off + size <= b->len); + dest->data = reftable_calloc(size); + memcpy(dest->data, b->buf + off, size); + dest->len = size; + return size; +} + +static uint64_t strbuf_size(void *b) +{ + return ((struct strbuf *)b)->len; +} + +static struct reftable_block_source_vtable strbuf_vtable = { + .size = &strbuf_size, + .read_block = &strbuf_read_block, + .return_block = &strbuf_return_block, + .close = &strbuf_close, +}; + +void block_source_from_strbuf(struct reftable_block_source *bs, + struct strbuf *buf) +{ + assert(!bs->ops); + bs->ops = &strbuf_vtable; + bs->arg = buf; +} + +static void malloc_return_block(void *b, struct reftable_block *dest) +{ + memset(dest->data, 0xff, dest->len); + reftable_free(dest->data); +} + +static struct reftable_block_source_vtable malloc_vtable = { + .return_block = &malloc_return_block, +}; + +static struct reftable_block_source malloc_block_source_instance = { + .ops = &malloc_vtable, +}; + +struct reftable_block_source malloc_block_source(void) +{ + return malloc_block_source_instance; +} + +struct file_block_source { + int fd; + uint64_t size; +}; + +static uint64_t file_size(void *b) +{ + return ((struct file_block_source *)b)->size; +} + +static void file_return_block(void *b, struct reftable_block *dest) +{ + memset(dest->data, 0xff, dest->len); + reftable_free(dest->data); +} + +static void file_close(void *b) +{ + int fd = ((struct file_block_source *)b)->fd; + if (fd > 0) { + close(fd); + ((struct file_block_source *)b)->fd = 0; + } + + reftable_free(b); +} + +static int file_read_block(void *v, struct reftable_block *dest, uint64_t off, + uint32_t size) +{ + struct file_block_source *b = v; + assert(off + size <= b->size); + dest->data = reftable_malloc(size); + if (pread(b->fd, dest->data, size, off) != size) + return -1; + dest->len = size; + return size; +} + +static struct reftable_block_source_vtable file_vtable = { + .size = 
&file_size, + .read_block = &file_read_block, + .return_block = &file_return_block, + .close = &file_close, +}; + +int reftable_block_source_from_file(struct reftable_block_source *bs, + const char *name) +{ + struct stat st = { 0 }; + int err = 0; + int fd = open(name, O_RDONLY); + struct file_block_source *p = NULL; + if (fd < 0) { + if (errno == ENOENT) { + return REFTABLE_NOT_EXIST_ERROR; + } + return -1; + } + + err = fstat(fd, &st); + if (err < 0) + return -1; + + p = reftable_calloc(sizeof(struct file_block_source)); + p->size = st.st_size; + p->fd = fd; + + assert(!bs->ops); + bs->ops = &file_vtable; + bs->arg = p; + return 0; +} diff --git a/reftable/blocksource.h b/reftable/blocksource.h new file mode 100644 index 00000000000..072e2727ad2 --- /dev/null +++ b/reftable/blocksource.h @@ -0,0 +1,22 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef BLOCKSOURCE_H +#define BLOCKSOURCE_H + +#include "system.h" + +struct reftable_block_source; + +/* Create an in-memory block source for reading reftables */ +void block_source_from_strbuf(struct reftable_block_source *bs, + struct strbuf *buf); + +struct reftable_block_source malloc_block_source(void); + +#endif diff --git a/reftable/reftable-blocksource.h b/reftable/reftable-blocksource.h new file mode 100644 index 00000000000..5aa3990a573 --- /dev/null +++ b/reftable/reftable-blocksource.h @@ -0,0 +1,49 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_BLOCKSOURCE_H +#define REFTABLE_BLOCKSOURCE_H + +#include + +/* block_source is a generic wrapper for a seekable readable file. 
+ */ +struct reftable_block_source { + struct reftable_block_source_vtable *ops; + void *arg; +}; + +/* a contiguous segment of bytes. It keeps track of its generating block_source + * so it can return itself into the pool. */ +struct reftable_block { + uint8_t *data; + int len; + struct reftable_block_source source; +}; + +/* block_source_vtable are the operations that make up block_source */ +struct reftable_block_source_vtable { + /* returns the size of a block source */ + uint64_t (*size)(void *source); + + /* reads a segment from the block source. It is an error to read + beyond the end of the block */ + int (*read_block)(void *source, struct reftable_block *dest, + uint64_t off, uint32_t size); + /* mark the block as read; may return the data back to malloc */ + void (*return_block)(void *source, struct reftable_block *blockp); + + /* release all resources associated with the block source */ + void (*close)(void *source); +}; + +/* opens a file on the file system as a block_source */ +int reftable_block_source_from_file(struct reftable_block_source *block_src, + const char *name); + +#endif
From patchwork Tue Jul 20 17:04:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388851 Message-Id: <13a5cbef0df1a6efb4309b1e9c2f76dcf66ed54b.1626800686.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:27 +0000 Subject: [PATCH 07/26] reftable: (de)serialization for the polymorphic record type. Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The reftable format is structured as a sequence of blocks, and each block contains a sequence of prefix-compressed key-value records. There are 4 types of records, and they have similarities in how they must be handled. This is achieved by introducing a polymorphic 'record' type that encapsulates ref, log, index and object records.
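To illustrate what "prefix-compressed" means for the keys described above, here is a minimal standalone sketch. It is not the code from this series: the names common_prefix_len, struct compressed_key and compress_key are hypothetical, and the on-disk record carries further fields that this sketch leaves out. It only shows the idea that a key can be stored as the number of bytes it shares with the preceding key plus the remaining suffix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the longest common prefix of two strings. */
static size_t common_prefix_len(const char *a, const char *b)
{
	size_t n = 0;

	/* Stops at the first differing byte or at the end of either string. */
	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

/* Hypothetical type: a key expressed relative to the preceding key. */
struct compressed_key {
	size_t prefix_len;  /* bytes shared with the preceding key */
	const char *suffix; /* remaining bytes; points into the original key */
};

/* Express `key` relative to `prev`; pass "" as `prev` for the first key. */
static struct compressed_key compress_key(const char *prev, const char *key)
{
	struct compressed_key out;

	assert(prev && key);
	out.prefix_len = common_prefix_len(prev, key);
	out.suffix = key + out.prefix_len;
	return out;
}
```

With sorted keys such as refs/heads/main followed by refs/heads/maint, the second key reduces to a prefix length of 15 and the one-byte suffix "t", which is why sorted ref names compress well under this scheme.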
Signed-off-by: Han-Wen Nienhuys --- Makefile | 2 + reftable/constants.h | 21 + reftable/record.c | 1200 ++++++++++++++++++++++++++++++++++++ reftable/record.h | 139 +++++ reftable/record_test.c | 407 ++++++++++++ reftable/reftable-record.h | 114 ++++ t/helper/test-reftable.c | 2 +- 7 files changed, 1884 insertions(+), 1 deletion(-) create mode 100644 reftable/constants.h create mode 100644 reftable/record.c create mode 100644 reftable/record.h create mode 100644 reftable/record_test.c create mode 100644 reftable/reftable-record.h diff --git a/Makefile b/Makefile index ad10ada9283..15321edbd2c 100644 --- a/Makefile +++ b/Makefile @@ -2445,7 +2445,9 @@ REFTABLE_OBJS += reftable/basics.o REFTABLE_OBJS += reftable/error.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o +REFTABLE_OBJS += reftable/record.o +REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o REFTABLE_TEST_OBJS += reftable/basics_test.o diff --git a/reftable/constants.h b/reftable/constants.h new file mode 100644 index 00000000000..5eee72c4c11 --- /dev/null +++ b/reftable/constants.h @@ -0,0 +1,21 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef CONSTANTS_H +#define CONSTANTS_H + +#define BLOCK_TYPE_LOG 'g' +#define BLOCK_TYPE_INDEX 'i' +#define BLOCK_TYPE_REF 'r' +#define BLOCK_TYPE_OBJ 'o' +#define BLOCK_TYPE_ANY 0 + +#define MAX_RESTARTS ((1 << 16) - 1) +#define DEFAULT_BLOCK_SIZE 4096 + +#endif diff --git a/reftable/record.c b/reftable/record.c new file mode 100644 index 00000000000..34ed480b257 --- /dev/null +++ b/reftable/record.c @@ -0,0 +1,1200 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +/* record.c - 
methods for different types of records. */ + +#include "record.h" + +#include "system.h" +#include "constants.h" +#include "reftable-error.h" +#include "basics.h" + +int get_var_int(uint64_t *dest, struct string_view *in) +{ + int ptr = 0; + uint64_t val; + + if (in->len == 0) + return -1; + val = in->buf[ptr] & 0x7f; + + while (in->buf[ptr] & 0x80) { + ptr++; + if (ptr >= in->len) { + return -1; + } + val = (val + 1) << 7 | (uint64_t)(in->buf[ptr] & 0x7f); + } + + *dest = val; + return ptr + 1; +} + +int put_var_int(struct string_view *dest, uint64_t val) +{ + uint8_t buf[10] = { 0 }; + int i = 9; + int n = 0; + buf[i] = (uint8_t)(val & 0x7f); + i--; + while (1) { + val >>= 7; + if (!val) { + break; + } + val--; + buf[i] = 0x80 | (uint8_t)(val & 0x7f); + i--; + } + + n = sizeof(buf) - i - 1; + if (dest->len < n) + return -1; + memcpy(dest->buf, &buf[i + 1], n); + return n; +} + +int reftable_is_block_type(uint8_t typ) +{ + switch (typ) { + case BLOCK_TYPE_REF: + case BLOCK_TYPE_LOG: + case BLOCK_TYPE_OBJ: + case BLOCK_TYPE_INDEX: + return 1; + } + return 0; +} + +uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec) +{ + switch (rec->value_type) { + case REFTABLE_REF_VAL1: + return rec->value.val1; + case REFTABLE_REF_VAL2: + return rec->value.val2.value; + default: + return NULL; + } +} + +uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec) +{ + switch (rec->value_type) { + case REFTABLE_REF_VAL2: + return rec->value.val2.target_value; + default: + return NULL; + } +} + +static int decode_string(struct strbuf *dest, struct string_view in) +{ + int start_len = in.len; + uint64_t tsize = 0; + int n = get_var_int(&tsize, &in); + if (n <= 0) + return -1; + string_view_consume(&in, n); + if (in.len < tsize) + return -1; + + strbuf_reset(dest); + strbuf_add(dest, in.buf, tsize); + string_view_consume(&in, tsize); + + return start_len - in.len; +} + +static int encode_string(char *str, struct string_view s) +{ + struct string_view start = s; +
int l = strlen(str); + int n = put_var_int(&s, l); + if (n < 0) + return -1; + string_view_consume(&s, n); + if (s.len < l) + return -1; + memcpy(s.buf, str, l); + string_view_consume(&s, l); + + return start.len - s.len; +} + +int reftable_encode_key(int *restart, struct string_view dest, + struct strbuf prev_key, struct strbuf key, + uint8_t extra) +{ + struct string_view start = dest; + int prefix_len = common_prefix_size(&prev_key, &key); + uint64_t suffix_len = key.len - prefix_len; + int n = put_var_int(&dest, (uint64_t)prefix_len); + if (n < 0) + return -1; + string_view_consume(&dest, n); + + *restart = (prefix_len == 0); + + n = put_var_int(&dest, suffix_len << 3 | (uint64_t)extra); + if (n < 0) + return -1; + string_view_consume(&dest, n); + + if (dest.len < suffix_len) + return -1; + memcpy(dest.buf, key.buf + prefix_len, suffix_len); + string_view_consume(&dest, suffix_len); + + return start.len - dest.len; +} + +int reftable_decode_key(struct strbuf *key, uint8_t *extra, + struct strbuf last_key, struct string_view in) +{ + int start_len = in.len; + uint64_t prefix_len = 0; + uint64_t suffix_len = 0; + int n = get_var_int(&prefix_len, &in); + if (n < 0) + return -1; + string_view_consume(&in, n); + + if (prefix_len > last_key.len) + return -1; + + n = get_var_int(&suffix_len, &in); + if (n <= 0) + return -1; + string_view_consume(&in, n); + + *extra = (uint8_t)(suffix_len & 0x7); + suffix_len >>= 3; + + if (in.len < suffix_len) + return -1; + + strbuf_reset(key); + strbuf_add(key, last_key.buf, prefix_len); + strbuf_add(key, in.buf, suffix_len); + string_view_consume(&in, suffix_len); + + return start_len - in.len; +} + +static void reftable_ref_record_key(const void *r, struct strbuf *dest) +{ + const struct reftable_ref_record *rec = + (const struct reftable_ref_record *)r; + strbuf_reset(dest); + strbuf_addstr(dest, rec->refname); +} + +static void reftable_ref_record_copy_from(void *rec, const void *src_rec, + int hash_size) +{ + struct 
reftable_ref_record *ref = rec; + const struct reftable_ref_record *src = src_rec; + assert(hash_size > 0); + + /* This is simple and correct, but we could probably reuse the hash + * fields. */ + reftable_ref_record_release(ref); + if (src->refname) { + ref->refname = xstrdup(src->refname); + } + ref->update_index = src->update_index; + ref->value_type = src->value_type; + switch (src->value_type) { + case REFTABLE_REF_DELETION: + break; + case REFTABLE_REF_VAL1: + ref->value.val1 = reftable_malloc(hash_size); + memcpy(ref->value.val1, src->value.val1, hash_size); + break; + case REFTABLE_REF_VAL2: + ref->value.val2.value = reftable_malloc(hash_size); + memcpy(ref->value.val2.value, src->value.val2.value, hash_size); + ref->value.val2.target_value = reftable_malloc(hash_size); + memcpy(ref->value.val2.target_value, + src->value.val2.target_value, hash_size); + break; + case REFTABLE_REF_SYMREF: + ref->value.symref = xstrdup(src->value.symref); + break; + } +} + +static char hexdigit(int c) +{ + if (c <= 9) + return '0' + c; + return 'a' + (c - 10); +} + +static void hex_format(char *dest, uint8_t *src, int hash_size) +{ + assert(hash_size > 0); + if (src) { + int i = 0; + for (i = 0; i < hash_size; i++) { + dest[2 * i] = hexdigit(src[i] >> 4); + dest[2 * i + 1] = hexdigit(src[i] & 0xf); + } + dest[2 * hash_size] = 0; + } +} + +void reftable_ref_record_print(struct reftable_ref_record *ref, + uint32_t hash_id) +{ + char hex[2 * GIT_SHA256_RAWSZ + 1] = { 0 }; /* BUG */ + printf("ref{%s(%" PRIu64 ") ", ref->refname, ref->update_index); + switch (ref->value_type) { + case REFTABLE_REF_SYMREF: + printf("=> %s", ref->value.symref); + break; + case REFTABLE_REF_VAL2: + hex_format(hex, ref->value.val2.value, hash_size(hash_id)); + printf("val 2 %s", hex); + hex_format(hex, ref->value.val2.target_value, + hash_size(hash_id)); + printf("(T %s)", hex); + break; + case REFTABLE_REF_VAL1: + hex_format(hex, ref->value.val1, hash_size(hash_id)); + printf("val 1 %s", hex); + 
break; + case REFTABLE_REF_DELETION: + printf("delete"); + break; + } + printf("}\n"); +} + +static void reftable_ref_record_release_void(void *rec) +{ + reftable_ref_record_release(rec); +} + +void reftable_ref_record_release(struct reftable_ref_record *ref) +{ + switch (ref->value_type) { + case REFTABLE_REF_SYMREF: + reftable_free(ref->value.symref); + break; + case REFTABLE_REF_VAL2: + reftable_free(ref->value.val2.target_value); + reftable_free(ref->value.val2.value); + break; + case REFTABLE_REF_VAL1: + reftable_free(ref->value.val1); + break; + case REFTABLE_REF_DELETION: + break; + default: + abort(); + } + + reftable_free(ref->refname); + memset(ref, 0, sizeof(struct reftable_ref_record)); +} + +static uint8_t reftable_ref_record_val_type(const void *rec) +{ + const struct reftable_ref_record *r = + (const struct reftable_ref_record *)rec; + return r->value_type; +} + +static int reftable_ref_record_encode(const void *rec, struct string_view s, + int hash_size) +{ + const struct reftable_ref_record *r = + (const struct reftable_ref_record *)rec; + struct string_view start = s; + int n = put_var_int(&s, r->update_index); + assert(hash_size > 0); + if (n < 0) + return -1; + string_view_consume(&s, n); + + switch (r->value_type) { + case REFTABLE_REF_SYMREF: + n = encode_string(r->value.symref, s); + if (n < 0) { + return -1; + } + string_view_consume(&s, n); + break; + case REFTABLE_REF_VAL2: + if (s.len < 2 * hash_size) { + return -1; + } + memcpy(s.buf, r->value.val2.value, hash_size); + string_view_consume(&s, hash_size); + memcpy(s.buf, r->value.val2.target_value, hash_size); + string_view_consume(&s, hash_size); + break; + case REFTABLE_REF_VAL1: + if (s.len < hash_size) { + return -1; + } + memcpy(s.buf, r->value.val1, hash_size); + string_view_consume(&s, hash_size); + break; + case REFTABLE_REF_DELETION: + break; + default: + abort(); + } + + return start.len - s.len; +} + +static int reftable_ref_record_decode(void *rec, struct strbuf key, + uint8_t 
val_type, struct string_view in, + int hash_size) +{ + struct reftable_ref_record *r = rec; + struct string_view start = in; + uint64_t update_index = 0; + int n = get_var_int(&update_index, &in); + if (n < 0) + return n; + string_view_consume(&in, n); + + reftable_ref_record_release(r); + + assert(hash_size > 0); + + r->refname = reftable_realloc(r->refname, key.len + 1); + memcpy(r->refname, key.buf, key.len); + r->update_index = update_index; + r->refname[key.len] = 0; + r->value_type = val_type; + switch (val_type) { + case REFTABLE_REF_VAL1: + if (in.len < hash_size) { + return -1; + } + + r->value.val1 = reftable_malloc(hash_size); + memcpy(r->value.val1, in.buf, hash_size); + string_view_consume(&in, hash_size); + break; + + case REFTABLE_REF_VAL2: + if (in.len < 2 * hash_size) { + return -1; + } + + r->value.val2.value = reftable_malloc(hash_size); + memcpy(r->value.val2.value, in.buf, hash_size); + string_view_consume(&in, hash_size); + + r->value.val2.target_value = reftable_malloc(hash_size); + memcpy(r->value.val2.target_value, in.buf, hash_size); + string_view_consume(&in, hash_size); + break; + + case REFTABLE_REF_SYMREF: { + struct strbuf dest = STRBUF_INIT; + int n = decode_string(&dest, in); + if (n < 0) { + return -1; + } + string_view_consume(&in, n); + r->value.symref = dest.buf; + } break; + + case REFTABLE_REF_DELETION: + break; + default: + abort(); + break; + } + + return start.len - in.len; +} + +static int reftable_ref_record_is_deletion_void(const void *p) +{ + return reftable_ref_record_is_deletion( + (const struct reftable_ref_record *)p); +} + +static struct reftable_record_vtable reftable_ref_record_vtable = { + .key = &reftable_ref_record_key, + .type = BLOCK_TYPE_REF, + .copy_from = &reftable_ref_record_copy_from, + .val_type = &reftable_ref_record_val_type, + .encode = &reftable_ref_record_encode, + .decode = &reftable_ref_record_decode, + .release = &reftable_ref_record_release_void, + .is_deletion = 
&reftable_ref_record_is_deletion_void, +}; + +static void reftable_obj_record_key(const void *r, struct strbuf *dest) +{ + const struct reftable_obj_record *rec = + (const struct reftable_obj_record *)r; + strbuf_reset(dest); + strbuf_add(dest, rec->hash_prefix, rec->hash_prefix_len); +} + +static void reftable_obj_record_release(void *rec) +{ + struct reftable_obj_record *obj = rec; + FREE_AND_NULL(obj->hash_prefix); + FREE_AND_NULL(obj->offsets); + memset(obj, 0, sizeof(struct reftable_obj_record)); +} + +static void reftable_obj_record_copy_from(void *rec, const void *src_rec, + int hash_size) +{ + struct reftable_obj_record *obj = rec; + const struct reftable_obj_record *src = + (const struct reftable_obj_record *)src_rec; + + reftable_obj_record_release(obj); + *obj = *src; + obj->hash_prefix = reftable_malloc(obj->hash_prefix_len); + memcpy(obj->hash_prefix, src->hash_prefix, obj->hash_prefix_len); + + obj->offsets = reftable_malloc(obj->offset_len * sizeof(uint64_t)); + COPY_ARRAY(obj->offsets, src->offsets, obj->offset_len); +} + +static uint8_t reftable_obj_record_val_type(const void *rec) +{ + const struct reftable_obj_record *r = rec; + if (r->offset_len > 0 && r->offset_len < 8) + return r->offset_len; + return 0; +} + +static int reftable_obj_record_encode(const void *rec, struct string_view s, + int hash_size) +{ + const struct reftable_obj_record *r = rec; + struct string_view start = s; + int i = 0; + int n = 0; + uint64_t last = 0; + if (r->offset_len == 0 || r->offset_len >= 8) { + n = put_var_int(&s, r->offset_len); + if (n < 0) { + return -1; + } + string_view_consume(&s, n); + } + if (r->offset_len == 0) + return start.len - s.len; + n = put_var_int(&s, r->offsets[0]); + if (n < 0) + return -1; + string_view_consume(&s, n); + + last = r->offsets[0]; + for (i = 1; i < r->offset_len; i++) { + int n = put_var_int(&s, r->offsets[i] - last); + if (n < 0) { + return -1; + } + string_view_consume(&s, n); + last = r->offsets[i]; + } + return start.len 
- s.len; +} + +static int reftable_obj_record_decode(void *rec, struct strbuf key, + uint8_t val_type, struct string_view in, + int hash_size) +{ + struct string_view start = in; + struct reftable_obj_record *r = rec; + uint64_t count = val_type; + int n = 0; + uint64_t last; + int j; + r->hash_prefix = reftable_malloc(key.len); + memcpy(r->hash_prefix, key.buf, key.len); + r->hash_prefix_len = key.len; + + if (val_type == 0) { + n = get_var_int(&count, &in); + if (n < 0) { + return n; + } + + string_view_consume(&in, n); + } + + r->offsets = NULL; + r->offset_len = 0; + if (count == 0) + return start.len - in.len; + + r->offsets = reftable_malloc(count * sizeof(uint64_t)); + r->offset_len = count; + + n = get_var_int(&r->offsets[0], &in); + if (n < 0) + return n; + string_view_consume(&in, n); + + last = r->offsets[0]; + j = 1; + while (j < count) { + uint64_t delta = 0; + int n = get_var_int(&delta, &in); + if (n < 0) { + return n; + } + string_view_consume(&in, n); + + last = r->offsets[j] = (delta + last); + j++; + } + return start.len - in.len; +} + +static int not_a_deletion(const void *p) +{ + return 0; +} + +static struct reftable_record_vtable reftable_obj_record_vtable = { + .key = &reftable_obj_record_key, + .type = BLOCK_TYPE_OBJ, + .copy_from = &reftable_obj_record_copy_from, + .val_type = &reftable_obj_record_val_type, + .encode = &reftable_obj_record_encode, + .decode = &reftable_obj_record_decode, + .release = &reftable_obj_record_release, + .is_deletion = not_a_deletion, +}; + +void reftable_log_record_print(struct reftable_log_record *log, + uint32_t hash_id) +{ + char hex[2 * GIT_SHA256_RAWSZ + 1] = { 0 }; + + switch (log->value_type) { + case REFTABLE_LOG_DELETION: + printf("log{%s(%" PRIu64 ") delete", log->refname, + log->update_index); + break; + case REFTABLE_LOG_UPDATE: + printf("log{%s(%" PRIu64 ") %s <%s> %" PRIu64 " %04d\n", + log->refname, log->update_index, log->update.name, + log->update.email, log->update.time, + log->update.tz_offset);
+ hex_format(hex, log->update.old_hash, hash_size(hash_id)); + printf("%s => ", hex); + hex_format(hex, log->update.new_hash, hash_size(hash_id)); + printf("%s\n\n%s\n}\n", hex, log->update.message); + break; + } +} + +static void reftable_log_record_key(const void *r, struct strbuf *dest) +{ + const struct reftable_log_record *rec = + (const struct reftable_log_record *)r; + int len = strlen(rec->refname); + uint8_t i64[8]; + uint64_t ts = 0; + strbuf_reset(dest); + strbuf_add(dest, (uint8_t *)rec->refname, len + 1); + + ts = (~ts) - rec->update_index; + put_be64(&i64[0], ts); + strbuf_add(dest, i64, sizeof(i64)); +} + +static void reftable_log_record_copy_from(void *rec, const void *src_rec, + int hash_size) +{ + struct reftable_log_record *dst = rec; + const struct reftable_log_record *src = + (const struct reftable_log_record *)src_rec; + + reftable_log_record_release(dst); + *dst = *src; + if (dst->refname) { + dst->refname = xstrdup(dst->refname); + } + switch (dst->value_type) { + case REFTABLE_LOG_DELETION: + break; + case REFTABLE_LOG_UPDATE: + if (dst->update.email) { + dst->update.email = xstrdup(dst->update.email); + } + if (dst->update.name) { + dst->update.name = xstrdup(dst->update.name); + } + if (dst->update.message) { + dst->update.message = xstrdup(dst->update.message); + } + + if (dst->update.new_hash) { + dst->update.new_hash = reftable_malloc(hash_size); + memcpy(dst->update.new_hash, src->update.new_hash, + hash_size); + } + if (dst->update.old_hash) { + dst->update.old_hash = reftable_malloc(hash_size); + memcpy(dst->update.old_hash, src->update.old_hash, + hash_size); + } + break; + } +} + +static void reftable_log_record_release_void(void *rec) +{ + struct reftable_log_record *r = rec; + reftable_log_record_release(r); +} + +void reftable_log_record_release(struct reftable_log_record *r) +{ + reftable_free(r->refname); + switch (r->value_type) { + case REFTABLE_LOG_DELETION: + break; + case REFTABLE_LOG_UPDATE: + 
reftable_free(r->update.new_hash); + reftable_free(r->update.old_hash); + reftable_free(r->update.name); + reftable_free(r->update.email); + reftable_free(r->update.message); + break; + } + memset(r, 0, sizeof(struct reftable_log_record)); +} + +static uint8_t reftable_log_record_val_type(const void *rec) +{ + const struct reftable_log_record *log = + (const struct reftable_log_record *)rec; + + return reftable_log_record_is_deletion(log) ? 0 : 1; +} + +static uint8_t zero[GIT_SHA256_RAWSZ] = { 0 }; + +static int reftable_log_record_encode(const void *rec, struct string_view s, + int hash_size) +{ + const struct reftable_log_record *r = rec; + struct string_view start = s; + int n = 0; + uint8_t *oldh = NULL; + uint8_t *newh = NULL; + if (reftable_log_record_is_deletion(r)) + return 0; + + oldh = r->update.old_hash; + newh = r->update.new_hash; + if (!oldh) { + oldh = zero; + } + if (!newh) { + newh = zero; + } + + if (s.len < 2 * hash_size) + return -1; + + memcpy(s.buf, oldh, hash_size); + memcpy(s.buf + hash_size, newh, hash_size); + string_view_consume(&s, 2 * hash_size); + + n = encode_string(r->update.name ? r->update.name : "", s); + if (n < 0) + return -1; + string_view_consume(&s, n); + + n = encode_string(r->update.email ? r->update.email : "", s); + if (n < 0) + return -1; + string_view_consume(&s, n); + + n = put_var_int(&s, r->update.time); + if (n < 0) + return -1; + string_view_consume(&s, n); + + if (s.len < 2) + return -1; + + put_be16(s.buf, r->update.tz_offset); + string_view_consume(&s, 2); + + n = encode_string(r->update.message ? 
r->update.message : "", s); + if (n < 0) + return -1; + string_view_consume(&s, n); + + return start.len - s.len; +} + +static int reftable_log_record_decode(void *rec, struct strbuf key, + uint8_t val_type, struct string_view in, + int hash_size) +{ + struct string_view start = in; + struct reftable_log_record *r = rec; + uint64_t max = 0; + uint64_t ts = 0; + struct strbuf dest = STRBUF_INIT; + int n; + + if (key.len <= 9 || key.buf[key.len - 9] != 0) + return REFTABLE_FORMAT_ERROR; + + r->refname = reftable_realloc(r->refname, key.len - 8); + memcpy(r->refname, key.buf, key.len - 8); + ts = get_be64(key.buf + key.len - 8); + + r->update_index = (~max) - ts; + + if (val_type != r->value_type) { + switch (r->value_type) { + case REFTABLE_LOG_UPDATE: + FREE_AND_NULL(r->update.old_hash); + FREE_AND_NULL(r->update.new_hash); + FREE_AND_NULL(r->update.message); + FREE_AND_NULL(r->update.email); + FREE_AND_NULL(r->update.name); + break; + case REFTABLE_LOG_DELETION: + break; + } + } + + r->value_type = val_type; + if (val_type == REFTABLE_LOG_DELETION) + return 0; + + if (in.len < 2 * hash_size) + return REFTABLE_FORMAT_ERROR; + + r->update.old_hash = reftable_realloc(r->update.old_hash, hash_size); + r->update.new_hash = reftable_realloc(r->update.new_hash, hash_size); + + memcpy(r->update.old_hash, in.buf, hash_size); + memcpy(r->update.new_hash, in.buf + hash_size, hash_size); + + string_view_consume(&in, 2 * hash_size); + + n = decode_string(&dest, in); + if (n < 0) + goto done; + string_view_consume(&in, n); + + r->update.name = reftable_realloc(r->update.name, dest.len + 1); + memcpy(r->update.name, dest.buf, dest.len); + r->update.name[dest.len] = 0; + + strbuf_reset(&dest); + n = decode_string(&dest, in); + if (n < 0) + goto done; + string_view_consume(&in, n); + + r->update.email = reftable_realloc(r->update.email, dest.len + 1); + memcpy(r->update.email, dest.buf, dest.len); + r->update.email[dest.len] = 0; + + ts = 0; + n = get_var_int(&ts, &in); + if (n < 
0) + goto done; + string_view_consume(&in, n); + r->update.time = ts; + if (in.len < 2) + goto done; + + r->update.tz_offset = get_be16(in.buf); + string_view_consume(&in, 2); + + strbuf_reset(&dest); + n = decode_string(&dest, in); + if (n < 0) + goto done; + string_view_consume(&in, n); + + r->update.message = reftable_realloc(r->update.message, dest.len + 1); + memcpy(r->update.message, dest.buf, dest.len); + r->update.message[dest.len] = 0; + + strbuf_release(&dest); + return start.len - in.len; + +done: + strbuf_release(&dest); + return REFTABLE_FORMAT_ERROR; +} + +static int null_streq(char *a, char *b) +{ + char *empty = ""; + if (!a) + a = empty; + + if (!b) + b = empty; + + return 0 == strcmp(a, b); +} + +static int zero_hash_eq(uint8_t *a, uint8_t *b, int sz) +{ + if (!a) + a = zero; + + if (!b) + b = zero; + + return !memcmp(a, b, sz); +} + +int reftable_log_record_equal(struct reftable_log_record *a, + struct reftable_log_record *b, int hash_size) +{ + if (!(null_streq(a->refname, b->refname) && + a->update_index == b->update_index && + a->value_type == b->value_type)) + return 0; + + switch (a->value_type) { + case REFTABLE_LOG_DELETION: + return 1; + case REFTABLE_LOG_UPDATE: + return null_streq(a->update.name, b->update.name) && + a->update.time == b->update.time && + a->update.tz_offset == b->update.tz_offset && + null_streq(a->update.email, b->update.email) && + null_streq(a->update.message, b->update.message) && + zero_hash_eq(a->update.old_hash, b->update.old_hash, + hash_size) && + zero_hash_eq(a->update.new_hash, b->update.new_hash, + hash_size); + } + + abort(); +} + +static int reftable_log_record_is_deletion_void(const void *p) +{ + return reftable_log_record_is_deletion( + (const struct reftable_log_record *)p); +} + +static struct reftable_record_vtable reftable_log_record_vtable = { + .key = &reftable_log_record_key, + .type = BLOCK_TYPE_LOG, + .copy_from = &reftable_log_record_copy_from, + .val_type = &reftable_log_record_val_type, + 
.encode = &reftable_log_record_encode, + .decode = &reftable_log_record_decode, + .release = &reftable_log_record_release_void, + .is_deletion = &reftable_log_record_is_deletion_void, +}; + +struct reftable_record reftable_new_record(uint8_t typ) +{ + struct reftable_record rec = { NULL }; + switch (typ) { + case BLOCK_TYPE_REF: { + struct reftable_ref_record *r = + reftable_calloc(sizeof(struct reftable_ref_record)); + reftable_record_from_ref(&rec, r); + return rec; + } + + case BLOCK_TYPE_OBJ: { + struct reftable_obj_record *r = + reftable_calloc(sizeof(struct reftable_obj_record)); + reftable_record_from_obj(&rec, r); + return rec; + } + case BLOCK_TYPE_LOG: { + struct reftable_log_record *r = + reftable_calloc(sizeof(struct reftable_log_record)); + reftable_record_from_log(&rec, r); + return rec; + } + case BLOCK_TYPE_INDEX: { + struct reftable_index_record empty = { .last_key = + STRBUF_INIT }; + struct reftable_index_record *r = + reftable_calloc(sizeof(struct reftable_index_record)); + *r = empty; + reftable_record_from_index(&rec, r); + return rec; + } + } + abort(); + return rec; +} + +/* clear out the record, yielding the reftable_record data that was + * encapsulated. 
*/ +static void *reftable_record_yield(struct reftable_record *rec) +{ + void *p = rec->data; + rec->data = NULL; + return p; +} + +void reftable_record_destroy(struct reftable_record *rec) +{ + reftable_record_release(rec); + reftable_free(reftable_record_yield(rec)); +} + +static void reftable_index_record_key(const void *r, struct strbuf *dest) +{ + const struct reftable_index_record *rec = r; + strbuf_reset(dest); + strbuf_addbuf(dest, &rec->last_key); +} + +static void reftable_index_record_copy_from(void *rec, const void *src_rec, + int hash_size) +{ + struct reftable_index_record *dst = rec; + const struct reftable_index_record *src = src_rec; + + strbuf_reset(&dst->last_key); + strbuf_addbuf(&dst->last_key, &src->last_key); + dst->offset = src->offset; +} + +static void reftable_index_record_release(void *rec) +{ + struct reftable_index_record *idx = rec; + strbuf_release(&idx->last_key); +} + +static uint8_t reftable_index_record_val_type(const void *rec) +{ + return 0; +} + +static int reftable_index_record_encode(const void *rec, struct string_view out, + int hash_size) +{ + const struct reftable_index_record *r = + (const struct reftable_index_record *)rec; + struct string_view start = out; + + int n = put_var_int(&out, r->offset); + if (n < 0) + return n; + + string_view_consume(&out, n); + + return start.len - out.len; +} + +static int reftable_index_record_decode(void *rec, struct strbuf key, + uint8_t val_type, struct string_view in, + int hash_size) +{ + struct string_view start = in; + struct reftable_index_record *r = rec; + int n = 0; + + strbuf_reset(&r->last_key); + strbuf_addbuf(&r->last_key, &key); + + n = get_var_int(&r->offset, &in); + if (n < 0) + return n; + + string_view_consume(&in, n); + return start.len - in.len; +} + +static struct reftable_record_vtable reftable_index_record_vtable = { + .key = &reftable_index_record_key, + .type = BLOCK_TYPE_INDEX, + .copy_from = &reftable_index_record_copy_from, + .val_type = 
&reftable_index_record_val_type, + .encode = &reftable_index_record_encode, + .decode = &reftable_index_record_decode, + .release = &reftable_index_record_release, + .is_deletion = &not_a_deletion, +}; + +void reftable_record_key(struct reftable_record *rec, struct strbuf *dest) +{ + rec->ops->key(rec->data, dest); +} + +uint8_t reftable_record_type(struct reftable_record *rec) +{ + return rec->ops->type; +} + +int reftable_record_encode(struct reftable_record *rec, struct string_view dest, + int hash_size) +{ + return rec->ops->encode(rec->data, dest, hash_size); +} + +void reftable_record_copy_from(struct reftable_record *rec, + struct reftable_record *src, int hash_size) +{ + assert(src->ops->type == rec->ops->type); + + rec->ops->copy_from(rec->data, src->data, hash_size); +} + +uint8_t reftable_record_val_type(struct reftable_record *rec) +{ + return rec->ops->val_type(rec->data); +} + +int reftable_record_decode(struct reftable_record *rec, struct strbuf key, + uint8_t extra, struct string_view src, int hash_size) +{ + return rec->ops->decode(rec->data, key, extra, src, hash_size); +} + +void reftable_record_release(struct reftable_record *rec) +{ + rec->ops->release(rec->data); +} + +int reftable_record_is_deletion(struct reftable_record *rec) +{ + return rec->ops->is_deletion(rec->data); +} + +void reftable_record_from_ref(struct reftable_record *rec, + struct reftable_ref_record *ref_rec) +{ + assert(!rec->ops); + rec->data = ref_rec; + rec->ops = &reftable_ref_record_vtable; +} + +void reftable_record_from_obj(struct reftable_record *rec, + struct reftable_obj_record *obj_rec) +{ + assert(!rec->ops); + rec->data = obj_rec; + rec->ops = &reftable_obj_record_vtable; +} + +void reftable_record_from_index(struct reftable_record *rec, + struct reftable_index_record *index_rec) +{ + assert(!rec->ops); + rec->data = index_rec; + rec->ops = &reftable_index_record_vtable; +} + +void reftable_record_from_log(struct reftable_record *rec, + struct reftable_log_record
*log_rec) +{ + assert(!rec->ops); + rec->data = log_rec; + rec->ops = &reftable_log_record_vtable; +} + +struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *rec) +{ + assert(reftable_record_type(rec) == BLOCK_TYPE_REF); + return rec->data; +} + +struct reftable_log_record *reftable_record_as_log(struct reftable_record *rec) +{ + assert(reftable_record_type(rec) == BLOCK_TYPE_LOG); + return rec->data; +} + +static int hash_equal(uint8_t *a, uint8_t *b, int hash_size) +{ + if (a && b) + return !memcmp(a, b, hash_size); + + return a == b; +} + +int reftable_ref_record_equal(struct reftable_ref_record *a, + struct reftable_ref_record *b, int hash_size) +{ + assert(hash_size > 0); + if (!(0 == strcmp(a->refname, b->refname) && + a->update_index == b->update_index && + a->value_type == b->value_type)) + return 0; + + switch (a->value_type) { + case REFTABLE_REF_SYMREF: + return !strcmp(a->value.symref, b->value.symref); + case REFTABLE_REF_VAL2: + return hash_equal(a->value.val2.value, b->value.val2.value, + hash_size) && + hash_equal(a->value.val2.target_value, + b->value.val2.target_value, hash_size); + case REFTABLE_REF_VAL1: + return hash_equal(a->value.val1, b->value.val1, hash_size); + case REFTABLE_REF_DELETION: + return 1; + default: + abort(); + } +} + +int reftable_ref_record_compare_name(const void *a, const void *b) +{ + return strcmp(((struct reftable_ref_record *)a)->refname, + ((struct reftable_ref_record *)b)->refname); +} + +int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref) +{ + return ref->value_type == REFTABLE_REF_DELETION; +} + +int reftable_log_record_compare_key(const void *a, const void *b) +{ + const struct reftable_log_record *la = a; + const struct reftable_log_record *lb = b; + + int cmp = strcmp(la->refname, lb->refname); + if (cmp) + return cmp; + if (la->update_index > lb->update_index) + return -1; + return (la->update_index < lb->update_index) ? 
1 : 0; +} + +int reftable_log_record_is_deletion(const struct reftable_log_record *log) +{ + return (log->value_type == REFTABLE_LOG_DELETION); +} + +void string_view_consume(struct string_view *s, int n) +{ + s->buf += n; + s->len -= n; +} diff --git a/reftable/record.h b/reftable/record.h new file mode 100644 index 00000000000..498e8c50bf4 --- /dev/null +++ b/reftable/record.h @@ -0,0 +1,139 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef RECORD_H +#define RECORD_H + +#include "system.h" + +#include <stdint.h> + +#include "reftable-record.h" + +/* + * A substring of existing string data. This structure takes no responsibility + * for the lifetime of the data it points to. + */ +struct string_view { + uint8_t *buf; + size_t len; +}; + +/* Advance `s.buf` by `n`, and decrease length. */ +void string_view_consume(struct string_view *s, int n); + +/* utilities for de/encoding varints */ + +int get_var_int(uint64_t *dest, struct string_view *in); +int put_var_int(struct string_view *dest, uint64_t val); + +/* Methods for records. */ +struct reftable_record_vtable { + /* encode the key of the record into a strbuf. */ + void (*key)(const void *rec, struct strbuf *dest); + + /* The record type ('r' for ref). */ + uint8_t type; + + void (*copy_from)(void *dest, const void *src, int hash_size); + + /* a value of [0..7], indicating record subvariants (e.g. ref vs. symref + * vs. ref deletion) */ + uint8_t (*val_type)(const void *rec); + + /* encodes rec into dest, returning how much space was used. */ + int (*encode)(const void *rec, struct string_view dest, int hash_size); + + /* decode data from `src` into the record. */ + int (*decode)(void *rec, struct strbuf key, uint8_t extra, + struct string_view src, int hash_size); + + /* deallocate and null the record. */ + void (*release)(void *rec); + + /* is this a tombstone?
*/ + int (*is_deletion)(const void *rec); +}; + +/* record is a generic wrapper for different types of records. */ +struct reftable_record { + void *data; + struct reftable_record_vtable *ops; +}; + +/* returns true for recognized block types. A block starts with its type. */ +int reftable_is_block_type(uint8_t typ); + +/* creates a malloced record of the given type. Dispose with record_destroy */ +struct reftable_record reftable_new_record(uint8_t typ); + +/* Encode `key` into `dest`. Sets `is_restart` to indicate a restart. Returns + * number of bytes written. */ +int reftable_encode_key(int *is_restart, struct string_view dest, + struct strbuf prev_key, struct strbuf key, + uint8_t extra); + +/* Decode into `key` and `extra` from `in` */ +int reftable_decode_key(struct strbuf *key, uint8_t *extra, + struct strbuf last_key, struct string_view in); + +/* reftable_index_record is used internally to speed up lookups. */ +struct reftable_index_record { + uint64_t offset; /* Offset of block */ + struct strbuf last_key; /* Last key of the block. */ +}; + +/* reftable_obj_record stores an object ID => ref mapping. */ +struct reftable_obj_record { + uint8_t *hash_prefix; /* leading bytes of the object ID */ + int hash_prefix_len; /* number of leading bytes. Constant + * across a single table. */ + uint64_t *offsets; /* a vector of file offsets.
*/ + int offset_len; +}; + +/* see struct record_vtable */ + +void reftable_record_key(struct reftable_record *rec, struct strbuf *dest); +uint8_t reftable_record_type(struct reftable_record *rec); +void reftable_record_copy_from(struct reftable_record *rec, + struct reftable_record *src, int hash_size); +uint8_t reftable_record_val_type(struct reftable_record *rec); +int reftable_record_encode(struct reftable_record *rec, struct string_view dest, + int hash_size); +int reftable_record_decode(struct reftable_record *rec, struct strbuf key, + uint8_t extra, struct string_view src, + int hash_size); +int reftable_record_is_deletion(struct reftable_record *rec); + +/* zeroes out the embedded record */ +void reftable_record_release(struct reftable_record *rec); + +/* clear and deallocate embedded record, and zero `rec`. */ +void reftable_record_destroy(struct reftable_record *rec); + +/* initialize generic records from concrete records. The generic record should + * be zeroed out. */ +void reftable_record_from_obj(struct reftable_record *rec, + struct reftable_obj_record *objrec); +void reftable_record_from_index(struct reftable_record *rec, + struct reftable_index_record *idxrec); +void reftable_record_from_ref(struct reftable_record *rec, + struct reftable_ref_record *refrec); +void reftable_record_from_log(struct reftable_record *rec, + struct reftable_log_record *logrec); +struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *ref); +struct reftable_log_record *reftable_record_as_log(struct reftable_record *ref); + +/* for qsort. */ +int reftable_ref_record_compare_name(const void *a, const void *b); + +/* for qsort. 
*/ +int reftable_log_record_compare_key(const void *a, const void *b); + +#endif diff --git a/reftable/record_test.c b/reftable/record_test.c new file mode 100644 index 00000000000..bf5d072b20d --- /dev/null +++ b/reftable/record_test.c @@ -0,0 +1,407 @@ +/* + Copyright 2020 Google LLC + + Use of this source code is governed by a BSD-style + license that can be found in the LICENSE file or at + https://developers.google.com/open-source/licenses/bsd +*/ + +#include "record.h" + +#include "system.h" +#include "basics.h" +#include "constants.h" +#include "test_framework.h" +#include "reftable-tests.h" + +static void test_copy(struct reftable_record *rec) +{ + struct reftable_record copy = + reftable_new_record(reftable_record_type(rec)); + reftable_record_copy_from(&copy, rec, GIT_SHA1_RAWSZ); + /* do it twice to catch memory leaks */ + reftable_record_copy_from(&copy, rec, GIT_SHA1_RAWSZ); + switch (reftable_record_type(&copy)) { + case BLOCK_TYPE_REF: + EXPECT(reftable_ref_record_equal(reftable_record_as_ref(&copy), + reftable_record_as_ref(rec), + GIT_SHA1_RAWSZ)); + break; + case BLOCK_TYPE_LOG: + EXPECT(reftable_log_record_equal(reftable_record_as_log(&copy), + reftable_record_as_log(rec), + GIT_SHA1_RAWSZ)); + break; + } + reftable_record_destroy(&copy); +} + +static void test_varint_roundtrip(void) +{ + uint64_t inputs[] = { 0, + 1, + 27, + 127, + 128, + 257, + 4096, + ((uint64_t)1 << 63), + ((uint64_t)1 << 63) + ((uint64_t)1 << 63) - 1 }; + int i = 0; + for (i = 0; i < ARRAY_SIZE(inputs); i++) { + uint8_t dest[10]; + + struct string_view out = { + .buf = dest, + .len = sizeof(dest), + }; + uint64_t in = inputs[i]; + int n = put_var_int(&out, in); + uint64_t got = 0; + + EXPECT(n > 0); + out.len = n; + n = get_var_int(&got, &out); + EXPECT(n > 0); + + EXPECT(got == in); + } +} + +static void test_common_prefix(void) +{ + struct { + const char *a, *b; + int want; + } cases[] = { + { "abc", "ab", 2 }, + { "", "abc", 0 }, + { "abc", "abd", 2 }, + { "abc", "pqr", 0 }, + }; + + int i = 0;
+ for (i = 0; i < ARRAY_SIZE(cases); i++) { + struct strbuf a = STRBUF_INIT; + struct strbuf b = STRBUF_INIT; + strbuf_addstr(&a, cases[i].a); + strbuf_addstr(&b, cases[i].b); + EXPECT(common_prefix_size(&a, &b) == cases[i].want); + + strbuf_release(&a); + strbuf_release(&b); + } +} + +static void set_hash(uint8_t *h, int j) +{ + int i = 0; + for (i = 0; i < hash_size(GIT_SHA1_FORMAT_ID); i++) { + h[i] = (j >> i) & 0xff; + } +} + +static void test_reftable_ref_record_roundtrip(void) +{ + int i = 0; + + for (i = REFTABLE_REF_DELETION; i < REFTABLE_NR_REF_VALUETYPES; i++) { + struct reftable_ref_record in = { NULL }; + struct reftable_ref_record out = { NULL }; + struct reftable_record rec_out = { NULL }; + struct strbuf key = STRBUF_INIT; + struct reftable_record rec = { NULL }; + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + + int n, m; + + in.value_type = i; + switch (i) { + case REFTABLE_REF_DELETION: + break; + case REFTABLE_REF_VAL1: + in.value.val1 = reftable_malloc(GIT_SHA1_RAWSZ); + set_hash(in.value.val1, 1); + break; + case REFTABLE_REF_VAL2: + in.value.val2.value = reftable_malloc(GIT_SHA1_RAWSZ); + set_hash(in.value.val2.value, 1); + in.value.val2.target_value = + reftable_malloc(GIT_SHA1_RAWSZ); + set_hash(in.value.val2.target_value, 2); + break; + case REFTABLE_REF_SYMREF: + in.value.symref = xstrdup("target"); + break; + } + in.refname = xstrdup("refs/heads/master"); + + reftable_record_from_ref(&rec, &in); + test_copy(&rec); + + EXPECT(reftable_record_val_type(&rec) == i); + + reftable_record_key(&rec, &key); + n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ); + EXPECT(n > 0); + + /* decode into a non-zero reftable_record to test for leaks. 
*/ + + reftable_record_from_ref(&rec_out, &out); + m = reftable_record_decode(&rec_out, key, i, dest, + GIT_SHA1_RAWSZ); + EXPECT(n == m); + + EXPECT(reftable_ref_record_equal(&in, &out, GIT_SHA1_RAWSZ)); + reftable_record_release(&rec_out); + + strbuf_release(&key); + reftable_ref_record_release(&in); + } +} + +static void test_reftable_log_record_equal(void) +{ + struct reftable_log_record in[2] = { + { + .refname = xstrdup("refs/heads/master"), + .update_index = 42, + }, + { + .refname = xstrdup("refs/heads/master"), + .update_index = 22, + } + }; + + EXPECT(!reftable_log_record_equal(&in[0], &in[1], GIT_SHA1_RAWSZ)); + in[1].update_index = in[0].update_index; + EXPECT(reftable_log_record_equal(&in[0], &in[1], GIT_SHA1_RAWSZ)); + reftable_log_record_release(&in[0]); + reftable_log_record_release(&in[1]); +} + +static void test_reftable_log_record_roundtrip(void) +{ + struct reftable_log_record in[2] = { + { + .refname = xstrdup("refs/heads/master"), + .update_index = 42, + .value_type = REFTABLE_LOG_UPDATE, + .update = { + .old_hash = reftable_malloc(GIT_SHA1_RAWSZ), + .new_hash = reftable_malloc(GIT_SHA1_RAWSZ), + .name = xstrdup("han-wen"), + .email = xstrdup("hanwen@google.com"), + .message = xstrdup("test"), + .time = 1577123507, + .tz_offset = 100, + } + }, + { + .refname = xstrdup("refs/heads/master"), + .update_index = 22, + .value_type = REFTABLE_LOG_DELETION, + } + }; + set_test_hash(in[0].update.new_hash, 1); + set_test_hash(in[0].update.old_hash, 2); + for (int i = 0; i < ARRAY_SIZE(in); i++) { + struct reftable_record rec = { NULL }; + struct strbuf key = STRBUF_INIT; + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + /* populate out, to check for leaks. 
*/ + struct reftable_log_record out = { + .refname = xstrdup("old name"), + .value_type = REFTABLE_LOG_UPDATE, + .update = { + .new_hash = reftable_calloc(GIT_SHA1_RAWSZ), + .old_hash = reftable_calloc(GIT_SHA1_RAWSZ), + .name = xstrdup("old name"), + .email = xstrdup("old@email"), + .message = xstrdup("old message"), + }, + }; + struct reftable_record rec_out = { NULL }; + int n, m, valtype; + + reftable_record_from_log(&rec, &in[i]); + + test_copy(&rec); + + reftable_record_key(&rec, &key); + + n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ); + EXPECT(n >= 0); + reftable_record_from_log(&rec_out, &out); + valtype = reftable_record_val_type(&rec); + m = reftable_record_decode(&rec_out, key, valtype, dest, + GIT_SHA1_RAWSZ); + EXPECT(n == m); + + EXPECT(reftable_log_record_equal(&in[i], &out, GIT_SHA1_RAWSZ)); + reftable_log_record_release(&in[i]); + strbuf_release(&key); + reftable_record_release(&rec_out); + } +} + +static void test_u24_roundtrip(void) +{ + uint32_t in = 0x112233; + uint8_t dest[3]; + uint32_t out; + put_be24(dest, in); + out = get_be24(dest); + EXPECT(in == out); +} + +static void test_key_roundtrip(void) +{ + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + struct strbuf last_key = STRBUF_INIT; + struct strbuf key = STRBUF_INIT; + struct strbuf roundtrip = STRBUF_INIT; + int restart; + uint8_t extra; + int n, m; + uint8_t rt_extra; + + strbuf_addstr(&last_key, "refs/heads/master"); + strbuf_addstr(&key, "refs/tags/bla"); + extra = 6; + n = reftable_encode_key(&restart, dest, last_key, key, extra); + EXPECT(!restart); + EXPECT(n > 0); + + m = reftable_decode_key(&roundtrip, &rt_extra, last_key, dest); + EXPECT(n == m); + EXPECT(0 == strbuf_cmp(&key, &roundtrip)); + EXPECT(rt_extra == extra); + + strbuf_release(&last_key); + strbuf_release(&key); + strbuf_release(&roundtrip); +} + +static void test_reftable_obj_record_roundtrip(void) +{ + uint8_t testHash1[GIT_SHA1_RAWSZ] = { 1, 
2, 3, 4, 0 }; + uint64_t till9[] = { 1, 2, 3, 4, 500, 600, 700, 800, 9000 }; + struct reftable_obj_record recs[3] = { { + .hash_prefix = testHash1, + .hash_prefix_len = 5, + .offsets = till9, + .offset_len = 3, + }, + { + .hash_prefix = testHash1, + .hash_prefix_len = 5, + .offsets = till9, + .offset_len = 9, + }, + { + .hash_prefix = testHash1, + .hash_prefix_len = 5, + } }; + int i = 0; + for (i = 0; i < ARRAY_SIZE(recs); i++) { + struct reftable_obj_record in = recs[i]; + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + struct reftable_record rec = { NULL }; + struct strbuf key = STRBUF_INIT; + struct reftable_obj_record out = { NULL }; + struct reftable_record rec_out = { NULL }; + int n, m; + uint8_t extra; + + reftable_record_from_obj(&rec, &in); + test_copy(&rec); + reftable_record_key(&rec, &key); + n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ); + EXPECT(n > 0); + extra = reftable_record_val_type(&rec); + reftable_record_from_obj(&rec_out, &out); + m = reftable_record_decode(&rec_out, key, extra, dest, + GIT_SHA1_RAWSZ); + EXPECT(n == m); + + EXPECT(in.hash_prefix_len == out.hash_prefix_len); + EXPECT(in.offset_len == out.offset_len); + + EXPECT(!memcmp(in.hash_prefix, out.hash_prefix, + in.hash_prefix_len)); + EXPECT(0 == memcmp(in.offsets, out.offsets, + sizeof(uint64_t) * in.offset_len)); + strbuf_release(&key); + reftable_record_release(&rec_out); + } +} + +static void test_reftable_index_record_roundtrip(void) +{ + struct reftable_index_record in = { + .offset = 42, + .last_key = STRBUF_INIT, + }; + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + struct strbuf key = STRBUF_INIT; + struct reftable_record rec = { NULL }; + struct reftable_index_record out = { .last_key = STRBUF_INIT }; + struct reftable_record out_rec = { NULL }; + int n, m; + uint8_t extra; + + strbuf_addstr(&in.last_key, "refs/heads/master"); + 
reftable_record_from_index(&rec, &in); + reftable_record_key(&rec, &key); + test_copy(&rec); + + EXPECT(0 == strbuf_cmp(&key, &in.last_key)); + n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ); + EXPECT(n > 0); + + extra = reftable_record_val_type(&rec); + reftable_record_from_index(&out_rec, &out); + m = reftable_record_decode(&out_rec, key, extra, dest, GIT_SHA1_RAWSZ); + EXPECT(m == n); + + EXPECT(in.offset == out.offset); + + reftable_record_release(&out_rec); + strbuf_release(&key); + strbuf_release(&in.last_key); +} + +int record_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_reftable_log_record_equal); + RUN_TEST(test_reftable_log_record_roundtrip); + RUN_TEST(test_reftable_ref_record_roundtrip); + RUN_TEST(test_varint_roundtrip); + RUN_TEST(test_key_roundtrip); + RUN_TEST(test_common_prefix); + RUN_TEST(test_reftable_obj_record_roundtrip); + RUN_TEST(test_reftable_index_record_roundtrip); + RUN_TEST(test_u24_roundtrip); + return 0; +} diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h new file mode 100644 index 00000000000..7985b94ae2c --- /dev/null +++ b/reftable/reftable-record.h @@ -0,0 +1,114 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_RECORD_H +#define REFTABLE_RECORD_H + +#include <stdint.h> + +/* + * Basic data types + * + * Reftables store the state of each ref in struct reftable_ref_record, and they + * store a sequence of reflog updates in struct reftable_log_record. + */ + +/* reftable_ref_record holds a ref database entry target_value */ +struct reftable_ref_record { + char *refname; /* Name of the ref, malloced.
*/ + uint64_t update_index; /* Logical timestamp at which this value is + * written */ + + enum { + /* tombstone to hide deletions from earlier tables */ + REFTABLE_REF_DELETION = 0x0, + + /* a simple ref */ + REFTABLE_REF_VAL1 = 0x1, + /* a tag, plus its peeled hash */ + REFTABLE_REF_VAL2 = 0x2, + + /* a symbolic reference */ + REFTABLE_REF_SYMREF = 0x3, +#define REFTABLE_NR_REF_VALUETYPES 4 + } value_type; + union { + uint8_t *val1; /* malloced hash. */ + struct { + uint8_t *value; /* first value, malloced hash */ + uint8_t *target_value; /* second value, malloced hash */ + } val2; + char *symref; /* referent, malloced 0-terminated string */ + } value; +}; + +/* Returns the first hash, or NULL if `rec` is not of type + * REFTABLE_REF_VAL1 or REFTABLE_REF_VAL2. */ +uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec); + +/* Returns the second hash, or NULL if `rec` is not of type + * REFTABLE_REF_VAL2. */ +uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec); + +/* returns whether 'ref' represents a deletion */ +int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref); + +/* prints a reftable_ref_record onto stdout. Useful for debugging. */ +void reftable_ref_record_print(struct reftable_ref_record *ref, + uint32_t hash_id); + +/* frees and nulls all pointer values inside `ref`. */ +void reftable_ref_record_release(struct reftable_ref_record *ref); + +/* returns whether two reftable_ref_records are the same. Useful for testing. */ +int reftable_ref_record_equal(struct reftable_ref_record *a, + struct reftable_ref_record *b, int hash_size); + +/* reftable_log_record holds a reflog entry */ +struct reftable_log_record { + char *refname; + uint64_t update_index; /* logical timestamp of a transactional update. 
+ */ + + enum { + /* tombstone to hide deletions from earlier tables */ + REFTABLE_LOG_DELETION = 0x0, + + /* a simple update */ + REFTABLE_LOG_UPDATE = 0x1, +#define REFTABLE_NR_LOG_VALUETYPES 2 + } value_type; + + union { + struct { + uint8_t *new_hash; + uint8_t *old_hash; + char *name; + char *email; + uint64_t time; + int16_t tz_offset; + char *message; + } update; + }; +}; + +/* returns whether 'ref' represents the deletion of a log record. */ +int reftable_log_record_is_deletion(const struct reftable_log_record *log); + +/* frees and nulls all pointer values. */ +void reftable_log_record_release(struct reftable_log_record *log); + +/* returns whether two records are equal. Useful for testing. */ +int reftable_log_record_equal(struct reftable_log_record *a, + struct reftable_log_record *b, int hash_size); + +/* dumps a reftable_log_record on stdout, for debugging/testing. */ +void reftable_log_record_print(struct reftable_log_record *log, + uint32_t hash_id); + +#endif diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 3b58e423e7b..09d4b83ef9b 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -4,6 +4,6 @@ int cmd__reftable(int argc, const char **argv) { basics_test_main(argc, argv); - + record_test_main(argc, argv); return 0; }
From patchwork Tue Jul 20 17:04:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388849 Message-Id: <27abd15965875115f641ed9e0cb1a8732369eddb.1626800686.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:28 +0000 Subject: [PATCH 08/26] Provide zlib's uncompress2 from compat/zlib-compat.c Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys This will be needed for reading reflog blocks in reftable. Signed-off-by: Han-Wen Nienhuys --- Makefile | 7 +++ ci/lib.sh | 1 + compat/.gitattributes | 1 + compat/zlib-uncompress2.c | 92 +++++++++++++++++++++++++++++++++++++++ configure.ac | 13 ++++++ 5 files changed, 114 insertions(+) create mode 100644 compat/.gitattributes create mode 100644 compat/zlib-uncompress2.c diff --git a/Makefile b/Makefile index 15321edbd2c..640a332b481 100644 --- a/Makefile +++ b/Makefile @@ -256,6 +256,8 @@ all:: # # Define NO_DEFLATE_BOUND if your zlib does not have deflateBound. # +# Define NO_UNCOMPRESS2 if your zlib does not have uncompress2.
+# # Define NO_NORETURN if using buggy versions of gcc 4.6+ and profile feedback, # as the compiler can crash (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49299) # @@ -1734,6 +1736,11 @@ ifdef NO_DEFLATE_BOUND BASIC_CFLAGS += -DNO_DEFLATE_BOUND endif +ifdef NO_UNCOMPRESS2 + BASIC_CFLAGS += -DNO_UNCOMPRESS2 + LIB_OBJS += compat/zlib-uncompress2.o +endif + ifdef NO_POSIX_GOODIES BASIC_CFLAGS += -DNO_POSIX_GOODIES endif diff --git a/ci/lib.sh b/ci/lib.sh index 476c3f369f5..5711c63979d 100755 --- a/ci/lib.sh +++ b/ci/lib.sh @@ -224,6 +224,7 @@ linux-gcc-default) ;; Linux32) CC=gcc + MAKEFLAGS="$MAKEFLAGS NO_UNCOMPRESS2=1" ;; linux-musl) CC=gcc diff --git a/compat/.gitattributes b/compat/.gitattributes new file mode 100644 index 00000000000..40dbfb170da --- /dev/null +++ b/compat/.gitattributes @@ -0,0 +1 @@ +/zlib-uncompress2.c whitespace=-indent-with-non-tab,-trailing-space diff --git a/compat/zlib-uncompress2.c b/compat/zlib-uncompress2.c new file mode 100644 index 00000000000..6893bb469ce --- /dev/null +++ b/compat/zlib-uncompress2.c @@ -0,0 +1,92 @@ +/* taken from zlib's uncompr.c + + commit cacf7f1d4e3d44d871b605da3b647f07d718623f + Author: Mark Adler + Date: Sun Jan 15 09:18:46 2017 -0800 + + zlib 1.2.11 + +*/ + +/* + * Copyright (C) 1995-2003, 2010, 2014, 2016 Jean-loup Gailly, Mark Adler + * For conditions of distribution and use, see copyright notice in zlib.h + */ + +#include + +/* clang-format off */ + +/* =========================================================================== + Decompresses the source buffer into the destination buffer. *sourceLen is + the byte length of the source buffer. Upon entry, *destLen is the total size + of the destination buffer, which must be large enough to hold the entire + uncompressed data. (The size of the uncompressed data must have been saved + previously by the compressor and transmitted to the decompressor by some + mechanism outside the scope of this compression library.) 
Upon exit, + *destLen is the size of the decompressed data and *sourceLen is the number + of source bytes consumed. Upon return, source + *sourceLen points to the + first unused input byte. + + uncompress returns Z_OK if success, Z_MEM_ERROR if there was not enough + memory, Z_BUF_ERROR if there was not enough room in the output buffer, or + Z_DATA_ERROR if the input data was corrupted, including if the input data is + an incomplete zlib stream. +*/ +int ZEXPORT uncompress2 ( + Bytef *dest, + uLongf *destLen, + const Bytef *source, + uLong *sourceLen) { + z_stream stream; + int err; + const uInt max = (uInt)-1; + uLong len, left; + Byte buf[1]; /* for detection of incomplete stream when *destLen == 0 */ + + len = *sourceLen; + if (*destLen) { + left = *destLen; + *destLen = 0; + } + else { + left = 1; + dest = buf; + } + + stream.next_in = (z_const Bytef *)source; + stream.avail_in = 0; + stream.zalloc = (alloc_func)0; + stream.zfree = (free_func)0; + stream.opaque = (voidpf)0; + + err = inflateInit(&stream); + if (err != Z_OK) return err; + + stream.next_out = dest; + stream.avail_out = 0; + + do { + if (stream.avail_out == 0) { + stream.avail_out = left > (uLong)max ? max : (uInt)left; + left -= stream.avail_out; + } + if (stream.avail_in == 0) { + stream.avail_in = len > (uLong)max ? max : (uInt)len; + len -= stream.avail_in; + } + err = inflate(&stream, Z_NO_FLUSH); + } while (err == Z_OK); + + *sourceLen -= len + stream.avail_in; + if (dest != buf) + *destLen = stream.total_out; + else if (stream.total_out && err == Z_BUF_ERROR) + left = 1; + + inflateEnd(&stream); + return err == Z_STREAM_END ? Z_OK : + err == Z_NEED_DICT ? Z_DATA_ERROR : + err == Z_BUF_ERROR && left + stream.avail_out ? 
Z_DATA_ERROR : + err; +} diff --git a/configure.ac b/configure.ac index 031e8d3fee8..c3a913103d0 100644 --- a/configure.ac +++ b/configure.ac @@ -672,9 +672,22 @@ AC_LINK_IFELSE([ZLIBTEST_SRC], NO_DEFLATE_BOUND=yes]) LIBS="$old_LIBS" +AC_DEFUN([ZLIBTEST_UNCOMPRESS2_SRC], [ +AC_LANG_PROGRAM([#include <zlib.h>], + [uncompress2(NULL,NULL,NULL,NULL);])]) +AC_MSG_CHECKING([for uncompress2 in -lz]) +old_LIBS="$LIBS" +LIBS="$LIBS -lz" +AC_LINK_IFELSE([ZLIBTEST_UNCOMPRESS2_SRC], + [AC_MSG_RESULT([yes])], + [AC_MSG_RESULT([no]) + NO_UNCOMPRESS2=yes]) +LIBS="$old_LIBS" + GIT_UNSTASH_FLAGS($ZLIB_PATH) GIT_CONF_SUBST([NO_DEFLATE_BOUND]) +GIT_CONF_SUBST([NO_UNCOMPRESS2]) # # Define NEEDS_SOCKET if linking with libc is not enough (SunOS,
From patchwork Tue Jul 20 17:04:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388859 Message-Id: <25aa2bf9b71c80b513bab8183ffad322c1a9444e.1626800687.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:29 +0000 Subject: [PATCH 09/26] reftable: reading/writing blocks Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The reftable format is structured as a sequence of blocks. Within a block, records are prefix compressed, with an index of offsets for fully expanded keys to enable binary search within blocks. This commit provides the logic to read and write these blocks. Signed-off-by: Han-Wen Nienhuys --- Makefile | 2 + reftable/block.c | 446 +++++++++++++++++++++++++++++++++++++++ reftable/block.h | 127 +++++++++++ reftable/block_test.c | 121 +++++++++++ t/helper/test-reftable.c | 1 + 5 files changed, 697 insertions(+) create mode 100644 reftable/block.c create mode 100644 reftable/block.h create mode 100644 reftable/block_test.c diff --git a/Makefile b/Makefile index 640a332b481..91be4b9c27a 100644 --- a/Makefile +++ b/Makefile @@ -2450,10 +2450,12 @@ xdiff-objs: $(XDIFF_OBJS) REFTABLE_OBJS += reftable/basics.o REFTABLE_OBJS += reftable/error.o +REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_OBJS += reftable/record.o +REFTABLE_TEST_OBJS += reftable/block_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o REFTABLE_TEST_OBJS += reftable/basics_test.o diff --git a/reftable/block.c b/reftable/block.c new file mode 100644 index 00000000000..92f8e5abfad --- /dev/null +++ b/reftable/block.c @@ -0,0 +1,446 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in
the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "block.h" + +#include "blocksource.h" +#include "constants.h" +#include "record.h" +#include "reftable-error.h" +#include "system.h" +#include <zlib.h> + +#ifdef NO_UNCOMPRESS2 +/* This is uncompress2, which is only available in zlib as of 2017. + */ +int uncompress2(Bytef *dest, uLongf *destLen, const Bytef *source, + uLong *sourceLen); +#endif + +int header_size(int version) +{ + switch (version) { + case 1: + return 24; + case 2: + return 28; + } + abort(); +} + +int footer_size(int version) +{ + switch (version) { + case 1: + return 68; + case 2: + return 72; + } + abort(); +} + +static int block_writer_register_restart(struct block_writer *w, int n, + int is_restart, struct strbuf *key) +{ + int rlen = w->restart_len; + if (rlen >= MAX_RESTARTS) { + is_restart = 0; + } + + if (is_restart) { + rlen++; + } + if (2 + 3 * rlen + n > w->block_size - w->next) + return -1; + if (is_restart) { + if (w->restart_len == w->restart_cap) { + w->restart_cap = w->restart_cap * 2 + 1; + w->restarts = reftable_realloc( + w->restarts, sizeof(uint32_t) * w->restart_cap); + } + + w->restarts[w->restart_len++] = w->next; + } + + w->next += n; + + strbuf_reset(&w->last_key); + strbuf_addbuf(&w->last_key, key); + w->entries++; + return 0; +} + +void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf, + uint32_t block_size, uint32_t header_off, int hash_size) +{ + bw->buf = buf; + bw->hash_size = hash_size; + bw->block_size = block_size; + bw->header_off = header_off; + bw->buf[header_off] = typ; + bw->next = header_off + 4; + bw->restart_interval = 16; + bw->entries = 0; + bw->restart_len = 0; + bw->last_key.len = 0; +} + +uint8_t block_writer_type(struct block_writer *bw) +{ + return bw->buf[bw->header_off]; +} + +/* adds the reftable_record to the block.
Returns -1 if it does not fit, 0 on + success */ +int block_writer_add(struct block_writer *w, struct reftable_record *rec) +{ + struct strbuf empty = STRBUF_INIT; + struct strbuf last = + w->entries % w->restart_interval == 0 ? empty : w->last_key; + struct string_view out = { + .buf = w->buf + w->next, + .len = w->block_size - w->next, + }; + + struct string_view start = out; + + int is_restart = 0; + struct strbuf key = STRBUF_INIT; + int n = 0; + + reftable_record_key(rec, &key); + n = reftable_encode_key(&is_restart, out, last, key, + reftable_record_val_type(rec)); + if (n < 0) + goto done; + string_view_consume(&out, n); + + n = reftable_record_encode(rec, out, w->hash_size); + if (n < 0) + goto done; + string_view_consume(&out, n); + + if (block_writer_register_restart(w, start.len - out.len, is_restart, + &key) < 0) + goto done; + + strbuf_release(&key); + return 0; + +done: + strbuf_release(&key); + return -1; +} + +int block_writer_finish(struct block_writer *w) +{ + int i = 0; + for (i = 0; i < w->restart_len; i++) { + put_be24(w->buf + w->next, w->restarts[i]); + w->next += 3; + } + + put_be16(w->buf + w->next, w->restart_len); + w->next += 2; + put_be24(w->buf + 1 + w->header_off, w->next); + + if (block_writer_type(w) == BLOCK_TYPE_LOG) { + int block_header_skip = 4 + w->header_off; + uint8_t *compressed = NULL; + int zresult = 0; + uLongf src_len = w->next - block_header_skip; + size_t dest_cap = src_len; + + compressed = reftable_malloc(dest_cap); + while (1) { + uLongf out_dest_len = dest_cap; + + zresult = compress2(compressed, &out_dest_len, + w->buf + block_header_skip, src_len, + 9); + if (zresult == Z_BUF_ERROR) { + dest_cap *= 2; + compressed = + reftable_realloc(compressed, dest_cap); + continue; + } + + if (Z_OK != zresult) { + reftable_free(compressed); + return REFTABLE_ZLIB_ERROR; + } + + memcpy(w->buf + block_header_skip, compressed, + out_dest_len); + w->next = out_dest_len + block_header_skip; + reftable_free(compressed); + break; + 
} + } + return w->next; +} + +uint8_t block_reader_type(struct block_reader *r) +{ + return r->block.data[r->header_off]; +} + +int block_reader_init(struct block_reader *br, struct reftable_block *block, + uint32_t header_off, uint32_t table_block_size, + int hash_size) +{ + uint32_t full_block_size = table_block_size; + uint8_t typ = block->data[header_off]; + uint32_t sz = get_be24(block->data + header_off + 1); + + uint16_t restart_count = 0; + uint32_t restart_start = 0; + uint8_t *restart_bytes = NULL; + + if (!reftable_is_block_type(typ)) + return REFTABLE_FORMAT_ERROR; + + if (typ == BLOCK_TYPE_LOG) { + int block_header_skip = 4 + header_off; + uLongf dst_len = sz - block_header_skip; /* total size of dest + buffer. */ + uLongf src_len = block->len - block_header_skip; + /* Log blocks specify the *uncompressed* size in their header. + */ + uint8_t *uncompressed = reftable_malloc(sz); + + /* Copy over the block header verbatim. It's not compressed. */ + memcpy(uncompressed, block->data, block_header_skip); + + /* Uncompress */ + if (Z_OK != + uncompress2(uncompressed + block_header_skip, &dst_len, + block->data + block_header_skip, &src_len)) { + reftable_free(uncompressed); + return REFTABLE_ZLIB_ERROR; + } + + if (dst_len + block_header_skip != sz) + return REFTABLE_FORMAT_ERROR; + + /* We're done with the input data. */ + reftable_block_done(block); + block->data = uncompressed; + block->len = sz; + block->source = malloc_block_source(); + full_block_size = src_len + block_header_skip; + } else if (full_block_size == 0) { + full_block_size = sz; + } else if (sz < full_block_size && sz < block->len && + block->data[sz] != 0) { + /* If the block is smaller than the full block size, it is + padded (data followed by '\0') or the next block is + unaligned. */ + full_block_size = sz; + } + + restart_count = get_be16(block->data + sz - 2); + restart_start = sz - 2 - 3 * restart_count; + restart_bytes = block->data + restart_start; + + /* transfer ownership. 
*/ + br->block = *block; + block->data = NULL; + block->len = 0; + + br->hash_size = hash_size; + br->block_len = restart_start; + br->full_block_size = full_block_size; + br->header_off = header_off; + br->restart_count = restart_count; + br->restart_bytes = restart_bytes; + + return 0; +} + +static uint32_t block_reader_restart_offset(struct block_reader *br, int i) +{ + return get_be24(br->restart_bytes + 3 * i); +} + +void block_reader_start(struct block_reader *br, struct block_iter *it) +{ + it->br = br; + strbuf_reset(&it->last_key); + it->next_off = br->header_off + 4; +} + +struct restart_find_args { + int error; + struct strbuf key; + struct block_reader *r; +}; + +static int restart_key_less(size_t idx, void *args) +{ + struct restart_find_args *a = args; + uint32_t off = block_reader_restart_offset(a->r, idx); + struct string_view in = { + .buf = a->r->block.data + off, + .len = a->r->block_len - off, + }; + + /* the restart key is verbatim in the block, so this could avoid the + alloc for decoding the key */ + struct strbuf rkey = STRBUF_INIT; + struct strbuf last_key = STRBUF_INIT; + uint8_t unused_extra; + int n = reftable_decode_key(&rkey, &unused_extra, last_key, in); + int result; + if (n < 0) { + a->error = 1; + return -1; + } + + result = strbuf_cmp(&a->key, &rkey); + strbuf_release(&rkey); + return result; +} + +void block_iter_copy_from(struct block_iter *dest, struct block_iter *src) +{ + dest->br = src->br; + dest->next_off = src->next_off; + strbuf_reset(&dest->last_key); + strbuf_addbuf(&dest->last_key, &src->last_key); +} + +int block_iter_next(struct block_iter *it, struct reftable_record *rec) +{ + struct string_view in = { + .buf = it->br->block.data + it->next_off, + .len = it->br->block_len - it->next_off, + }; + struct string_view start = in; + struct strbuf key = STRBUF_INIT; + uint8_t extra = 0; + int n = 0; + + if (it->next_off >= it->br->block_len) + return 1; + + n = reftable_decode_key(&key, &extra, it->last_key, in); + if (n 
< 0) + return -1; + + string_view_consume(&in, n); + n = reftable_record_decode(rec, key, extra, in, it->br->hash_size); + if (n < 0) + return -1; + string_view_consume(&in, n); + + strbuf_reset(&it->last_key); + strbuf_addbuf(&it->last_key, &key); + it->next_off += start.len - in.len; + strbuf_release(&key); + return 0; +} + +int block_reader_first_key(struct block_reader *br, struct strbuf *key) +{ + struct strbuf empty = STRBUF_INIT; + int off = br->header_off + 4; + struct string_view in = { + .buf = br->block.data + off, + .len = br->block_len - off, + }; + + uint8_t extra = 0; + int n = reftable_decode_key(key, &extra, empty, in); + if (n < 0) + return n; + + return 0; +} + +int block_iter_seek(struct block_iter *it, struct strbuf *want) +{ + return block_reader_seek(it->br, it, want); +} + +void block_iter_close(struct block_iter *it) +{ + strbuf_release(&it->last_key); +} + +int block_reader_seek(struct block_reader *br, struct block_iter *it, + struct strbuf *want) +{ + struct restart_find_args args = { + .key = *want, + .r = br, + }; + struct reftable_record rec = reftable_new_record(block_reader_type(br)); + struct strbuf key = STRBUF_INIT; + int err = 0; + struct block_iter next = { + .last_key = STRBUF_INIT, + }; + + int i = binsearch(br->restart_count, &restart_key_less, &args); + if (args.error) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + it->br = br; + if (i > 0) { + i--; + it->next_off = block_reader_restart_offset(br, i); + } else { + it->next_off = br->header_off + 4; + } + + /* We're looking for the last entry less/equal than the wanted key, so + we have to go one entry too far and then back up. 
+ */ + while (1) { + block_iter_copy_from(&next, it); + err = block_iter_next(&next, &rec); + if (err < 0) + goto done; + + reftable_record_key(&rec, &key); + if (err > 0 || strbuf_cmp(&key, want) >= 0) { + err = 0; + goto done; + } + + block_iter_copy_from(it, &next); + } + +done: + strbuf_release(&key); + strbuf_release(&next.last_key); + reftable_record_destroy(&rec); + + return err; +} + +void block_writer_release(struct block_writer *bw) +{ + FREE_AND_NULL(bw->restarts); + strbuf_release(&bw->last_key); + /* the block is not owned. */ +} + +void reftable_block_done(struct reftable_block *blockp) +{ + struct reftable_block_source source = blockp->source; + if (blockp && source.ops) + source.ops->return_block(source.arg, blockp); + blockp->data = NULL; + blockp->len = 0; + blockp->source.ops = NULL; + blockp->source.arg = NULL; +} diff --git a/reftable/block.h b/reftable/block.h new file mode 100644 index 00000000000..e207706a644 --- /dev/null +++ b/reftable/block.h @@ -0,0 +1,127 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef BLOCK_H +#define BLOCK_H + +#include "basics.h" +#include "record.h" +#include "reftable-blocksource.h" + +/* + * Writes reftable blocks. The block_writer is reused across blocks to minimize + * allocation overhead. + */ +struct block_writer { + uint8_t *buf; + uint32_t block_size; + + /* Offset of the global header. Nonzero in the first block only. */ + uint32_t header_off; + + /* How often to restart keys. */ + int restart_interval; + int hash_size; + + /* Offset of next uint8_t to write. */ + uint32_t next; + uint32_t *restarts; + uint32_t restart_len; + uint32_t restart_cap; + + struct strbuf last_key; + int entries; +}; + +/* + * initializes the blockwriter to write `typ` entries, using `buf` as temporary + * storage. `buf` is not owned by the block_writer. 
*/ +void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf, + uint32_t block_size, uint32_t header_off, int hash_size); + +/* returns the block type (eg. 'r' for ref records). */ +uint8_t block_writer_type(struct block_writer *bw); + +/* appends the record; returns 0 on success, or -1 if it doesn't fit. */ +int block_writer_add(struct block_writer *w, struct reftable_record *rec); + +/* appends the key restarts, and compresses the block if necessary. */ +int block_writer_finish(struct block_writer *w); + +/* clears out internally allocated block_writer members. */ +void block_writer_release(struct block_writer *bw); + +/* Read a block. */ +struct block_reader { + /* offset of the block header; nonzero for the first block in a + * reftable. */ + uint32_t header_off; + + /* the memory block */ + struct reftable_block block; + int hash_size; + + /* size of the data, excluding restart data. */ + uint32_t block_len; + uint8_t *restart_bytes; + uint16_t restart_count; + + /* size of the data in the file. For log blocks, this is the compressed + * size. */ + uint32_t full_block_size; +}; + +/* Iterate over entries in a block */ +struct block_iter { + /* offset within the block of the next entry to read. */ + uint32_t next_off; + struct block_reader *br; + + /* key for last entry we read. */ + struct strbuf last_key; +}; + +/* initializes a block reader. */ +int block_reader_init(struct block_reader *br, struct reftable_block *bl, + uint32_t header_off, uint32_t table_block_size, + int hash_size); + +/* Position `it` at start of the block */ +void block_reader_start(struct block_reader *br, struct block_iter *it); + +/* Position `it` to the `want` key in the block */ +int block_reader_seek(struct block_reader *br, struct block_iter *it, + struct strbuf *want); + +/* Returns the block type (eg. 
'r' for refs) */ +uint8_t block_reader_type(struct block_reader *r); + +/* Decodes the first key in the block */ +int block_reader_first_key(struct block_reader *br, struct strbuf *key); + +void block_iter_copy_from(struct block_iter *dest, struct block_iter *src); + +/* return < 0 for error, 0 for OK, > 0 for EOF. */ +int block_iter_next(struct block_iter *it, struct reftable_record *rec); + +/* Seek to `want` within the block pointed to by `it` */ +int block_iter_seek(struct block_iter *it, struct strbuf *want); + +/* deallocate memory for `it`. The block reader and its block are left intact. */ +void block_iter_close(struct block_iter *it); + +/* size of file header, depending on format version */ +int header_size(int version); + +/* size of file footer, depending on format version */ +int footer_size(int version); + +/* returns a block to its source. */ +void reftable_block_done(struct reftable_block *ret); + +#endif diff --git a/reftable/block_test.c b/reftable/block_test.c new file mode 100644 index 00000000000..c3d35eedb98 --- /dev/null +++ b/reftable/block_test.c @@ -0,0 +1,121 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "block.h" + +#include "system.h" + +#include "blocksource.h" +#include "basics.h" +#include "constants.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-tests.h" + +static void test_block_read_write(void) +{ + const int header_off = 21; /* random */ + char *names[30]; + const int N = ARRAY_SIZE(names); + const int block_size = 1024; + struct reftable_block block = { NULL }; + struct block_writer bw = { + .last_key = STRBUF_INIT, + }; + struct reftable_ref_record ref = { NULL }; + struct reftable_record rec = { NULL }; + int i = 0; + int n; + struct block_reader br = { 0 }; + struct block_iter it = { .last_key = STRBUF_INIT }; + int j = 0; + struct 
strbuf want = STRBUF_INIT; + + block.data = reftable_calloc(block_size); + block.len = block_size; + block.source = malloc_block_source(); + block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size, + header_off, hash_size(GIT_SHA1_FORMAT_ID)); + reftable_record_from_ref(&rec, &ref); + + for (i = 0; i < N; i++) { + char name[100]; + uint8_t hash[GIT_SHA1_RAWSZ]; + snprintf(name, sizeof(name), "branch%02d", i); + memset(hash, i, sizeof(hash)); + + ref.refname = name; + ref.value_type = REFTABLE_REF_VAL1; + ref.value.val1 = hash; + + names[i] = xstrdup(name); + n = block_writer_add(&bw, &rec); + ref.refname = NULL; + ref.value_type = REFTABLE_REF_DELETION; + EXPECT(n == 0); + } + + n = block_writer_finish(&bw); + EXPECT(n > 0); + + block_writer_release(&bw); + + block_reader_init(&br, &block, header_off, block_size, GIT_SHA1_RAWSZ); + + block_reader_start(&br, &it); + + while (1) { + int r = block_iter_next(&it, &rec); + EXPECT(r >= 0); + if (r > 0) { + break; + } + EXPECT_STREQ(names[j], ref.refname); + j++; + } + + reftable_record_release(&rec); + block_iter_close(&it); + + for (i = 0; i < N; i++) { + struct block_iter it = { .last_key = STRBUF_INIT }; + strbuf_reset(&want); + strbuf_addstr(&want, names[i]); + + n = block_reader_seek(&br, &it, &want); + EXPECT(n == 0); + + n = block_iter_next(&it, &rec); + EXPECT(n == 0); + + EXPECT_STREQ(names[i], ref.refname); + + want.len--; + n = block_reader_seek(&br, &it, &want); + EXPECT(n == 0); + + n = block_iter_next(&it, &rec); + EXPECT(n == 0); + EXPECT_STREQ(names[10 * (i / 10)], ref.refname); + + block_iter_close(&it); + } + + reftable_record_release(&rec); + reftable_block_done(&br.block); + strbuf_release(&want); + for (i = 0; i < N; i++) { + reftable_free(names[i]); + } +} + +int block_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_block_read_write); + return 0; +} diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 09d4b83ef9b..c9deeaf08c7 100644 --- a/t/helper/test-reftable.c 
+++ b/t/helper/test-reftable.c @@ -4,6 +4,7 @@ int cmd__reftable(int argc, const char **argv) { basics_test_main(argc, argv); + block_test_main(argc, argv); record_test_main(argc, argv); return 0; } From patchwork Tue Jul 20 17:04:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388853 Message-Id: <92970da9cb9105913753211c2b3bb98c1c577bc6.1626800687.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:30 +0000 Subject: [PATCH 10/26] reftable: a generic binary tree implementation Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: 
Han-Wen Nienhuys From: Han-Wen Nienhuys The reftable format includes support for an (OID => ref) map. This map can speed up visibility and reachability checks. In particular, various operations along the fetch/push path within Gerrit have been sped up by using this structure. The map is constructed with the help of a binary tree. Object IDs are hashes, so they are uniformly distributed. Hence, the tree does not attempt forced rebalancing. Signed-off-by: Han-Wen Nienhuys --- Makefile | 4 ++- reftable/tree.c | 63 ++++++++++++++++++++++++++++++++++++++++ reftable/tree.h | 34 ++++++++++++++++++++++ reftable/tree_test.c | 61 ++++++++++++++++++++++++++++++++++++++ t/helper/test-reftable.c | 1 + 5 files changed, 162 insertions(+), 1 deletion(-) create mode 100644 reftable/tree.c create mode 100644 reftable/tree.h create mode 100644 reftable/tree_test.c diff --git a/Makefile b/Makefile index 91be4b9c27a..12bd12328b5 100644 --- a/Makefile +++ b/Makefile @@ -2454,11 +2454,13 @@ REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_OBJS += reftable/record.o +REFTABLE_OBJS += reftable/tree.o +REFTABLE_TEST_OBJS += reftable/basics_test.o REFTABLE_TEST_OBJS += reftable/block_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o -REFTABLE_TEST_OBJS += reftable/basics_test.o +REFTABLE_TEST_OBJS += reftable/tree_test.o TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) diff --git a/reftable/tree.c b/reftable/tree.c new file mode 100644 index 00000000000..82db7995dd6 --- /dev/null +++ b/reftable/tree.c @@ -0,0 +1,63 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "tree.h" + +#include "basics.h" +#include "system.h" + +struct tree_node *tree_search(void *key, struct 
tree_node **rootp, + int (*compare)(const void *, const void *), + int insert) +{ + int res; + if (*rootp == NULL) { + if (!insert) { + return NULL; + } else { + struct tree_node *n = + reftable_calloc(sizeof(struct tree_node)); + n->key = key; + *rootp = n; + return *rootp; + } + } + + res = compare(key, (*rootp)->key); + if (res < 0) + return tree_search(key, &(*rootp)->left, compare, insert); + else if (res > 0) + return tree_search(key, &(*rootp)->right, compare, insert); + return *rootp; +} + +void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key), + void *arg) +{ + if (t->left) { + infix_walk(t->left, action, arg); + } + action(arg, t->key); + if (t->right) { + infix_walk(t->right, action, arg); + } +} + +void tree_free(struct tree_node *t) +{ + if (t == NULL) { + return; + } + if (t->left) { + tree_free(t->left); + } + if (t->right) { + tree_free(t->right); + } + reftable_free(t); +} diff --git a/reftable/tree.h b/reftable/tree.h new file mode 100644 index 00000000000..fbdd002e23a --- /dev/null +++ b/reftable/tree.h @@ -0,0 +1,34 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef TREE_H +#define TREE_H + +/* tree_node is a generic binary search tree. */ +struct tree_node { + void *key; + struct tree_node *left, *right; +}; + +/* looks for `key` in `rootp` using `compare` as comparison function. If insert + * is set, insert the key if it's not found. Else, return NULL. + */ +struct tree_node *tree_search(void *key, struct tree_node **rootp, + int (*compare)(const void *, const void *), + int insert); + +/* performs an infix walk of the tree. */ +void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key), + void *arg); + +/* + * deallocates the tree nodes recursively. Keys should be deallocated separately + * by walking over the tree. 
*/ +void tree_free(struct tree_node *t); + +#endif diff --git a/reftable/tree_test.c b/reftable/tree_test.c new file mode 100644 index 00000000000..09a970e17b9 --- /dev/null +++ b/reftable/tree_test.c @@ -0,0 +1,61 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "tree.h" + +#include "basics.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-tests.h" + +static int test_compare(const void *a, const void *b) +{ + return (char *)a - (char *)b; +} + +struct curry { + void *last; +}; + +static void check_increasing(void *arg, void *key) +{ + struct curry *c = arg; + if (c->last) { + assert(test_compare(c->last, key) < 0); + } + c->last = key; +} + +static void test_tree(void) +{ + struct tree_node *root = NULL; + + void *values[11] = { NULL }; + struct tree_node *nodes[11] = { NULL }; + int i = 1; + struct curry c = { NULL }; + do { + nodes[i] = tree_search(values + i, &root, &test_compare, 1); + i = (i * 7) % 11; + } while (i != 1); + + for (i = 1; i < ARRAY_SIZE(nodes); i++) { + assert(values + i == nodes[i]->key); + assert(nodes[i] == + tree_search(values + i, &root, &test_compare, 0)); + } + + infix_walk(root, check_increasing, &c); + tree_free(root); +} + +int tree_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_tree); + return 0; +} diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index c9deeaf08c7..050551fa698 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -6,5 +6,6 @@ int cmd__reftable(int argc, const char **argv) basics_test_main(argc, argv); block_test_main(argc, argv); record_test_main(argc, argv); + tree_test_main(argc, argv); return 0; } From patchwork Tue Jul 20 17:04:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 
12388865 Message-Id: <3b817f37a01def9337ae60107a3e082cf8da5937.1626800687.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:31 +0000 Subject: [PATCH 11/26] reftable: write reftable files Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + reftable/reftable-writer.h | 148 ++++++++ reftable/writer.c | 690 +++++++++++++++++++++++++++++++++++++ reftable/writer.h | 50 +++ 4 files changed, 889 insertions(+) create mode 100644 reftable/reftable-writer.h create mode 100644 reftable/writer.c create mode 100644 reftable/writer.h diff --git 
a/Makefile b/Makefile index 12bd12328b5..af553fc227a 100644 --- a/Makefile +++ b/Makefile @@ -2455,6 +2455,7 @@ REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_OBJS += reftable/record.o REFTABLE_OBJS += reftable/tree.o +REFTABLE_OBJS += reftable/writer.o REFTABLE_TEST_OBJS += reftable/basics_test.o REFTABLE_TEST_OBJS += reftable/block_test.o diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h new file mode 100644 index 00000000000..af36462ced5 --- /dev/null +++ b/reftable/reftable-writer.h @@ -0,0 +1,148 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_WRITER_H +#define REFTABLE_WRITER_H + +#include "reftable-record.h" + +#include <stdint.h> +#include <unistd.h> /* ssize_t */ + +/* Writing single reftables */ + +/* reftable_write_options sets options for writing a single reftable. */ +struct reftable_write_options { + /* boolean: do not pad out blocks to block size. */ + unsigned unpadded : 1; + + /* the blocksize. Should be less than 2^24. */ + uint32_t block_size; + + /* boolean: do not generate a SHA1 => ref index. */ + unsigned skip_index_objects : 1; + + /* how often to write complete keys in each block. */ + int restart_interval; + + /* 4-byte identifier ("sha1", "s256") of the hash. + * Defaults to SHA1 if unset + */ + uint32_t hash_id; + + /* boolean: do not check ref names for validity or dir/file conflicts. + */ + unsigned skip_name_check : 1; + + /* boolean: copy log messages exactly. If unset, check that the message + * is a single line, and add '\n' if missing. 
+ */ + unsigned exact_log_message : 1; +}; + +/* reftable_block_stats holds statistics for a single block type */ +struct reftable_block_stats { + /* total number of entries written */ + int entries; + /* total number of key restarts */ + int restarts; + /* total number of blocks */ + int blocks; + /* total number of index blocks */ + int index_blocks; + /* depth of the index */ + int max_index_level; + + /* offset of the first block for this type */ + uint64_t offset; + /* offset of the top level index block for this type, or 0 if not + * present */ + uint64_t index_offset; +}; + +/* stats holds overall statistics for a single reftable */ +struct reftable_stats { + /* total number of blocks written. */ + int blocks; + /* stats for ref data */ + struct reftable_block_stats ref_stats; + /* stats for the SHA1 to ref map. */ + struct reftable_block_stats obj_stats; + /* stats for index blocks */ + struct reftable_block_stats idx_stats; + /* stats for log blocks */ + struct reftable_block_stats log_stats; + + /* disambiguation length of shortened object IDs. */ + int object_id_len; +}; + +/* reftable_new_writer creates a new writer */ +struct reftable_writer * +reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t), + void *writer_arg, struct reftable_write_options *opts); + +/* Set the range of update indices for the records we will add. When writing a + table into a stack, the min should be at least + reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned. + + For transactional updates to a stack, typically min==max, and the + update_index can be obtained by inspecting the stack. When converting an + existing ref database into a single reftable, this would be a range of + update-index timestamps. + */ +void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, + uint64_t max); + +/* + Add a reftable_ref_record. The record should have names that come after + already added records. 
+ + The update_index must be within the limits set by + reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned. It is a + REFTABLE_API_ERROR to write a ref record after a log record. +*/ +int reftable_writer_add_ref(struct reftable_writer *w, + struct reftable_ref_record *ref); + +/* + Convenience function to add multiple reftable_ref_records; the function sorts + the records before adding them, reordering the records array passed in. +*/ +int reftable_writer_add_refs(struct reftable_writer *w, + struct reftable_ref_record *refs, int n); + +/* + adds reftable_log_records. Log records are keyed by (refname, decreasing + update_index). The key for the record added must come after the already added + log records. +*/ +int reftable_writer_add_log(struct reftable_writer *w, + struct reftable_log_record *log); + +/* + Convenience function to add multiple reftable_log_records; the function sorts + the records before adding them, reordering the records array passed in. +*/ +int reftable_writer_add_logs(struct reftable_writer *w, + struct reftable_log_record *logs, int n); + +/* reftable_writer_close finalizes the reftable. The writer is retained so + * statistics can be inspected. */ +int reftable_writer_close(struct reftable_writer *w); + +/* writer_stats returns the statistics on the reftable being written. + + This struct becomes invalid when the writer is freed. 
+ */ +const struct reftable_stats *writer_stats(struct reftable_writer *w); + +/* reftable_writer_free deallocates memory for the writer */ +void reftable_writer_free(struct reftable_writer *w); + +#endif diff --git a/reftable/writer.c b/reftable/writer.c new file mode 100644 index 00000000000..1baad069b64 --- /dev/null +++ b/reftable/writer.c @@ -0,0 +1,690 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "writer.h" + +#include "system.h" + +#include "block.h" +#include "constants.h" +#include "record.h" +#include "tree.h" +#include "reftable-error.h" + +/* finishes a block, and writes it to storage */ +static int writer_flush_block(struct reftable_writer *w); + +/* deallocates memory related to the index */ +static void writer_clear_index(struct reftable_writer *w); + +/* finishes writing a 'r' (refs) or 'g' (reflogs) section */ +static int writer_finish_public_section(struct reftable_writer *w); + +static struct reftable_block_stats * +writer_reftable_block_stats(struct reftable_writer *w, uint8_t typ) +{ + switch (typ) { + case 'r': + return &w->stats.ref_stats; + case 'o': + return &w->stats.obj_stats; + case 'i': + return &w->stats.idx_stats; + case 'g': + return &w->stats.log_stats; + } + abort(); + return NULL; +} + +/* write data, queuing the padding for the next write. Returns negative for + * error. 
*/ +static int padded_write(struct reftable_writer *w, uint8_t *data, size_t len, + int padding) +{ + int n = 0; + if (w->pending_padding > 0) { + uint8_t *zeroed = reftable_calloc(w->pending_padding); + n = w->write(w->write_arg, zeroed, w->pending_padding); + reftable_free(zeroed); + if (n < 0) + return n; + + w->pending_padding = 0; + } + + w->pending_padding = padding; + n = w->write(w->write_arg, data, len); + if (n < 0) + return n; + n += padding; + return 0; +} + +static void options_set_defaults(struct reftable_write_options *opts) +{ + if (opts->restart_interval == 0) { + opts->restart_interval = 16; + } + + if (opts->hash_id == 0) { + opts->hash_id = GIT_SHA1_FORMAT_ID; + } + if (opts->block_size == 0) { + opts->block_size = DEFAULT_BLOCK_SIZE; + } +} + +static int writer_version(struct reftable_writer *w) +{ + return (w->opts.hash_id == 0 || w->opts.hash_id == GIT_SHA1_FORMAT_ID) ? + 1 : + 2; +} + +static int writer_write_header(struct reftable_writer *w, uint8_t *dest) +{ + memcpy(dest, "REFT", 4); + + dest[4] = writer_version(w); + + put_be24(dest + 5, w->opts.block_size); + put_be64(dest + 8, w->min_update_index); + put_be64(dest + 16, w->max_update_index); + if (writer_version(w) == 2) { + put_be32(dest + 24, w->opts.hash_id); + } + return header_size(writer_version(w)); +} + +static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ) +{ + int block_start = 0; + if (w->next == 0) { + block_start = header_size(writer_version(w)); + } + + strbuf_release(&w->last_key); + block_writer_init(&w->block_writer_data, typ, w->block, + w->opts.block_size, block_start, + hash_size(w->opts.hash_id)); + w->block_writer = &w->block_writer_data; + w->block_writer->restart_interval = w->opts.restart_interval; +} + +static struct strbuf reftable_empty_strbuf = STRBUF_INIT; + +struct reftable_writer * +reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t), + void *writer_arg, struct reftable_write_options *opts) +{ + struct
reftable_writer *wp = + reftable_calloc(sizeof(struct reftable_writer)); + strbuf_init(&wp->block_writer_data.last_key, 0); + options_set_defaults(opts); + if (opts->block_size >= (1 << 24)) { + /* TODO - error return? */ + abort(); + } + wp->last_key = reftable_empty_strbuf; + wp->block = reftable_calloc(opts->block_size); + wp->write = writer_func; + wp->write_arg = writer_arg; + wp->opts = *opts; + writer_reinit_block_writer(wp, BLOCK_TYPE_REF); + + return wp; +} + +void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, + uint64_t max) +{ + w->min_update_index = min; + w->max_update_index = max; +} + +void reftable_writer_free(struct reftable_writer *w) +{ + reftable_free(w->block); + reftable_free(w); +} + +struct obj_index_tree_node { + struct strbuf hash; + uint64_t *offsets; + size_t offset_len; + size_t offset_cap; +}; + +#define OBJ_INDEX_TREE_NODE_INIT \ + { \ + .hash = STRBUF_INIT \ + } + +static int obj_index_tree_node_compare(const void *a, const void *b) +{ + return strbuf_cmp(&((const struct obj_index_tree_node *)a)->hash, + &((const struct obj_index_tree_node *)b)->hash); +} + +static void writer_index_hash(struct reftable_writer *w, struct strbuf *hash) +{ + uint64_t off = w->next; + + struct obj_index_tree_node want = { .hash = *hash }; + + struct tree_node *node = tree_search(&want, &w->obj_index_tree, + &obj_index_tree_node_compare, 0); + struct obj_index_tree_node *key = NULL; + if (node == NULL) { + struct obj_index_tree_node empty = OBJ_INDEX_TREE_NODE_INIT; + key = reftable_malloc(sizeof(struct obj_index_tree_node)); + *key = empty; + + strbuf_reset(&key->hash); + strbuf_addbuf(&key->hash, hash); + tree_search((void *)key, &w->obj_index_tree, + &obj_index_tree_node_compare, 1); + } else { + key = node->key; + } + + if (key->offset_len > 0 && key->offsets[key->offset_len - 1] == off) { + return; + } + + if (key->offset_len == key->offset_cap) { + key->offset_cap = 2 * key->offset_cap + 1; + key->offsets = reftable_realloc( + 
key->offsets, sizeof(uint64_t) * key->offset_cap); + } + + key->offsets[key->offset_len++] = off; +} + +static int writer_add_record(struct reftable_writer *w, + struct reftable_record *rec) +{ + struct strbuf key = STRBUF_INIT; + int err = -1; + reftable_record_key(rec, &key); + if (strbuf_cmp(&w->last_key, &key) >= 0) { + err = REFTABLE_API_ERROR; + goto done; + } + + strbuf_reset(&w->last_key); + strbuf_addbuf(&w->last_key, &key); + if (w->block_writer == NULL) { + writer_reinit_block_writer(w, reftable_record_type(rec)); + } + + assert(block_writer_type(w->block_writer) == reftable_record_type(rec)); + + if (block_writer_add(w->block_writer, rec) == 0) { + err = 0; + goto done; + } + + err = writer_flush_block(w); + if (err < 0) { + goto done; + } + + writer_reinit_block_writer(w, reftable_record_type(rec)); + err = block_writer_add(w->block_writer, rec); + if (err < 0) { + goto done; + } + + err = 0; +done: + strbuf_release(&key); + return err; +} + +int reftable_writer_add_ref(struct reftable_writer *w, + struct reftable_ref_record *ref) +{ + struct reftable_record rec = { NULL }; + struct reftable_ref_record copy = *ref; + int err = 0; + + if (ref->refname == NULL) + return REFTABLE_API_ERROR; + if (ref->update_index < w->min_update_index || + ref->update_index > w->max_update_index) + return REFTABLE_API_ERROR; + + reftable_record_from_ref(&rec, &copy); + copy.update_index -= w->min_update_index; + + err = writer_add_record(w, &rec); + if (err < 0) + return err; + + if (!w->opts.skip_index_objects && reftable_ref_record_val1(ref)) { + struct strbuf h = STRBUF_INIT; + strbuf_add(&h, (char *)reftable_ref_record_val1(ref), + hash_size(w->opts.hash_id)); + writer_index_hash(w, &h); + strbuf_release(&h); + } + + if (!w->opts.skip_index_objects && reftable_ref_record_val2(ref)) { + struct strbuf h = STRBUF_INIT; + strbuf_add(&h, reftable_ref_record_val2(ref), + hash_size(w->opts.hash_id)); + writer_index_hash(w, &h); + strbuf_release(&h); + } + return 0; +} + +int
reftable_writer_add_refs(struct reftable_writer *w, + struct reftable_ref_record *refs, int n) +{ + int err = 0; + int i = 0; + QSORT(refs, n, reftable_ref_record_compare_name); + for (i = 0; err == 0 && i < n; i++) { + err = reftable_writer_add_ref(w, &refs[i]); + } + return err; +} + +static int reftable_writer_add_log_verbatim(struct reftable_writer *w, + struct reftable_log_record *log) +{ + struct reftable_record rec = { NULL }; + if (w->block_writer && + block_writer_type(w->block_writer) == BLOCK_TYPE_REF) { + int err = writer_finish_public_section(w); + if (err < 0) + return err; + } + + w->next -= w->pending_padding; + w->pending_padding = 0; + + reftable_record_from_log(&rec, log); + return writer_add_record(w, &rec); +} + +int reftable_writer_add_log(struct reftable_writer *w, + struct reftable_log_record *log) +{ + char *input_log_message = NULL; + struct strbuf cleaned_message = STRBUF_INIT; + int err = 0; + + if (log->value_type == REFTABLE_LOG_DELETION) + return reftable_writer_add_log_verbatim(w, log); + + if (log->refname == NULL) + return REFTABLE_API_ERROR; + + input_log_message = log->update.message; + if (!w->opts.exact_log_message && log->update.message) { + strbuf_addstr(&cleaned_message, log->update.message); + while (cleaned_message.len && + cleaned_message.buf[cleaned_message.len - 1] == '\n') + strbuf_setlen(&cleaned_message, + cleaned_message.len - 1); + if (strchr(cleaned_message.buf, '\n')) { + // multiple lines not allowed. 
+ err = REFTABLE_API_ERROR; + goto done; + } + strbuf_addstr(&cleaned_message, "\n"); + log->update.message = cleaned_message.buf; + } + + err = reftable_writer_add_log_verbatim(w, log); + log->update.message = input_log_message; +done: + strbuf_release(&cleaned_message); + return err; +} + +int reftable_writer_add_logs(struct reftable_writer *w, + struct reftable_log_record *logs, int n) +{ + int err = 0; + int i = 0; + QSORT(logs, n, reftable_log_record_compare_key); + + for (i = 0; err == 0 && i < n; i++) { + err = reftable_writer_add_log(w, &logs[i]); + } + return err; +} + +static int writer_finish_section(struct reftable_writer *w) +{ + uint8_t typ = block_writer_type(w->block_writer); + uint64_t index_start = 0; + int max_level = 0; + int threshold = w->opts.unpadded ? 1 : 3; + int before_blocks = w->stats.idx_stats.blocks; + int err = writer_flush_block(w); + int i = 0; + struct reftable_block_stats *bstats = NULL; + if (err < 0) + return err; + + while (w->index_len > threshold) { + struct reftable_index_record *idx = NULL; + int idx_len = 0; + + max_level++; + index_start = w->next; + writer_reinit_block_writer(w, BLOCK_TYPE_INDEX); + + idx = w->index; + idx_len = w->index_len; + + w->index = NULL; + w->index_len = 0; + w->index_cap = 0; + for (i = 0; i < idx_len; i++) { + struct reftable_record rec = { NULL }; + reftable_record_from_index(&rec, idx + i); + if (block_writer_add(w->block_writer, &rec) == 0) { + continue; + } + + err = writer_flush_block(w); + if (err < 0) + return err; + + writer_reinit_block_writer(w, BLOCK_TYPE_INDEX); + + err = block_writer_add(w->block_writer, &rec); + if (err != 0) { + /* write into fresh block should always succeed + */ + abort(); + } + } + for (i = 0; i < idx_len; i++) { + strbuf_release(&idx[i].last_key); + } + reftable_free(idx); + } + + writer_clear_index(w); + + err = writer_flush_block(w); + if (err < 0) + return err; + + bstats = writer_reftable_block_stats(w, typ); + bstats->index_blocks = 
w->stats.idx_stats.blocks - before_blocks; + bstats->index_offset = index_start; + bstats->max_index_level = max_level; + + /* Reset last_key, as the next section can start with any key. */ + w->last_key.len = 0; + + return 0; +} + +struct common_prefix_arg { + struct strbuf *last; + int max; +}; + +static void update_common(void *void_arg, void *key) +{ + struct common_prefix_arg *arg = void_arg; + struct obj_index_tree_node *entry = key; + if (arg->last) { + int n = common_prefix_size(&entry->hash, arg->last); + if (n > arg->max) { + arg->max = n; + } + } + arg->last = &entry->hash; +} + +struct write_record_arg { + struct reftable_writer *w; + int err; +}; + +static void write_object_record(void *void_arg, void *key) +{ + struct write_record_arg *arg = void_arg; + struct obj_index_tree_node *entry = key; + struct reftable_obj_record obj_rec = { + .hash_prefix = (uint8_t *)entry->hash.buf, + .hash_prefix_len = arg->w->stats.object_id_len, + .offsets = entry->offsets, + .offset_len = entry->offset_len, + }; + struct reftable_record rec = { NULL }; + if (arg->err < 0) + goto done; + + reftable_record_from_obj(&rec, &obj_rec); + arg->err = block_writer_add(arg->w->block_writer, &rec); + if (arg->err == 0) + goto done; + + arg->err = writer_flush_block(arg->w); + if (arg->err < 0) + goto done; + + writer_reinit_block_writer(arg->w, BLOCK_TYPE_OBJ); + arg->err = block_writer_add(arg->w->block_writer, &rec); + if (arg->err == 0) + goto done; + obj_rec.offset_len = 0; + arg->err = block_writer_add(arg->w->block_writer, &rec); + + /* Should be able to write into a fresh block.
*/ + assert(arg->err == 0); + +done:; +} + +static void object_record_free(void *void_arg, void *key) +{ + struct obj_index_tree_node *entry = key; + + FREE_AND_NULL(entry->offsets); + strbuf_release(&entry->hash); + reftable_free(entry); +} + +static int writer_dump_object_index(struct reftable_writer *w) +{ + struct write_record_arg closure = { .w = w }; + struct common_prefix_arg common = { NULL }; + if (w->obj_index_tree) { + infix_walk(w->obj_index_tree, &update_common, &common); + } + w->stats.object_id_len = common.max + 1; + + writer_reinit_block_writer(w, BLOCK_TYPE_OBJ); + + if (w->obj_index_tree) { + infix_walk(w->obj_index_tree, &write_object_record, &closure); + } + + if (closure.err < 0) + return closure.err; + return writer_finish_section(w); +} + +static int writer_finish_public_section(struct reftable_writer *w) +{ + uint8_t typ = 0; + int err = 0; + + if (w->block_writer == NULL) + return 0; + + typ = block_writer_type(w->block_writer); + err = writer_finish_section(w); + if (err < 0) + return err; + if (typ == BLOCK_TYPE_REF && !w->opts.skip_index_objects && + w->stats.ref_stats.index_blocks > 0) { + err = writer_dump_object_index(w); + if (err < 0) + return err; + } + + if (w->obj_index_tree) { + infix_walk(w->obj_index_tree, &object_record_free, NULL); + tree_free(w->obj_index_tree); + w->obj_index_tree = NULL; + } + + w->block_writer = NULL; + return 0; +} + +int reftable_writer_close(struct reftable_writer *w) +{ + uint8_t footer[72]; + uint8_t *p = footer; + int err = writer_finish_public_section(w); + int empty_table = w->next == 0; + if (err != 0) + goto done; + w->pending_padding = 0; + if (empty_table) { + /* Empty tables need a header anyway. 
*/ + uint8_t header[28]; + int n = writer_write_header(w, header); + err = padded_write(w, header, n, 0); + if (err < 0) + goto done; + } + + p += writer_write_header(w, footer); + put_be64(p, w->stats.ref_stats.index_offset); + p += 8; + put_be64(p, (w->stats.obj_stats.offset) << 5 | w->stats.object_id_len); + p += 8; + put_be64(p, w->stats.obj_stats.index_offset); + p += 8; + + put_be64(p, w->stats.log_stats.offset); + p += 8; + put_be64(p, w->stats.log_stats.index_offset); + p += 8; + + put_be32(p, crc32(0, footer, p - footer)); + p += 4; + + err = padded_write(w, footer, footer_size(writer_version(w)), 0); + if (err < 0) + goto done; + + if (empty_table) { + err = REFTABLE_EMPTY_TABLE_ERROR; + goto done; + } + +done: + /* free up memory. */ + block_writer_release(&w->block_writer_data); + writer_clear_index(w); + strbuf_release(&w->last_key); + return err; +} + +static void writer_clear_index(struct reftable_writer *w) +{ + int i = 0; + for (i = 0; i < w->index_len; i++) { + strbuf_release(&w->index[i].last_key); + } + + FREE_AND_NULL(w->index); + w->index_len = 0; + w->index_cap = 0; +} + +static const int debug = 0; + +static int writer_flush_nonempty_block(struct reftable_writer *w) +{ + uint8_t typ = block_writer_type(w->block_writer); + struct reftable_block_stats *bstats = + writer_reftable_block_stats(w, typ); + uint64_t block_typ_off = (bstats->blocks == 0) ? 
w->next : 0; + int raw_bytes = block_writer_finish(w->block_writer); + int padding = 0; + int err = 0; + struct reftable_index_record ir = { .last_key = STRBUF_INIT }; + if (raw_bytes < 0) + return raw_bytes; + + if (!w->opts.unpadded && typ != BLOCK_TYPE_LOG) { + padding = w->opts.block_size - raw_bytes; + } + + if (block_typ_off > 0) { + bstats->offset = block_typ_off; + } + + bstats->entries += w->block_writer->entries; + bstats->restarts += w->block_writer->restart_len; + bstats->blocks++; + w->stats.blocks++; + + if (debug) { + fprintf(stderr, "block %c off %" PRIu64 " sz %d (%d)\n", typ, + w->next, raw_bytes, + get_be24(w->block + w->block_writer->header_off + 1)); + } + + if (w->next == 0) { + writer_write_header(w, w->block); + } + + err = padded_write(w, w->block, raw_bytes, padding); + if (err < 0) + return err; + + if (w->index_cap == w->index_len) { + w->index_cap = 2 * w->index_cap + 1; + w->index = reftable_realloc( + w->index, + sizeof(struct reftable_index_record) * w->index_cap); + } + + ir.offset = w->next; + strbuf_reset(&ir.last_key); + strbuf_addbuf(&ir.last_key, &w->block_writer->last_key); + w->index[w->index_len] = ir; + + w->index_len++; + w->next += padding + raw_bytes; + w->block_writer = NULL; + return 0; +} + +static int writer_flush_block(struct reftable_writer *w) +{ + if (w->block_writer == NULL) + return 0; + if (w->block_writer->entries == 0) + return 0; + return writer_flush_nonempty_block(w); +} + +const struct reftable_stats *writer_stats(struct reftable_writer *w) +{ + return &w->stats; +} diff --git a/reftable/writer.h b/reftable/writer.h new file mode 100644 index 00000000000..09b88673d97 --- /dev/null +++ b/reftable/writer.h @@ -0,0 +1,50 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef WRITER_H +#define WRITER_H + +#include "basics.h" +#include "block.h" +#include 
"tree.h" +#include "reftable-writer.h" + +struct reftable_writer { + ssize_t (*write)(void *, const void *, size_t); + void *write_arg; + int pending_padding; + struct strbuf last_key; + + /* offset of next block to write. */ + uint64_t next; + uint64_t min_update_index, max_update_index; + struct reftable_write_options opts; + + /* memory buffer for writing */ + uint8_t *block; + + /* writer for the current section. NULL or points to + * block_writer_data */ + struct block_writer *block_writer; + + struct block_writer block_writer_data; + + /* pending index records for the current section */ + struct reftable_index_record *index; + size_t index_len; + size_t index_cap; + + /* + * tree for use with tsearch; used to populate the 'o' inverse OID + * map */ + struct tree_node *obj_index_tree; + + struct reftable_stats stats; +}; + +#endif From patchwork Tue Jul 20 17:04:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388861
Message-Id: <99708f408b0f51c12741ce5a486476aeb895bc2a.1626800687.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:32 +0000 Subject: [PATCH 12/26] reftable: generic interface to tables Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys Signed-off-by: Han-Wen Nienhuys --- Makefile | 3 + reftable/generic.c | 169 +++++++++++++++++++++++++++++++++++ reftable/generic.h | 32 +++++++ reftable/reftable-generic.h | 47 ++++++++++ reftable/reftable-iterator.h | 39 ++++++++ reftable/reftable.c | 115 ++++++++++++++++++++++++ 6 files changed, 405 insertions(+) create mode 100644 reftable/generic.c create mode 100644 reftable/generic.h create mode 100644 reftable/reftable-generic.h create mode 100644 reftable/reftable-iterator.h create mode 100644 reftable/reftable.c diff --git a/Makefile b/Makefile index af553fc227a..9e0aefd96f7 100644 --- a/Makefile +++ b/Makefile @@ -2454,6 +2454,9 @@ REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_OBJS += reftable/record.o +REFTABLE_OBJS += reftable/refname.o +REFTABLE_OBJS += reftable/generic.o +REFTABLE_OBJS += reftable/stack.o REFTABLE_OBJS += reftable/tree.o REFTABLE_OBJS += reftable/writer.o diff --git a/reftable/generic.c b/reftable/generic.c new file mode 100644 index 00000000000..7a8a738d860 --- /dev/null +++ b/reftable/generic.c @@ -0,0 +1,169 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "basics.h"
+#include "record.h" +#include "generic.h" +#include "reftable-iterator.h" +#include "reftable-generic.h" + +int reftable_table_seek_ref(struct reftable_table *tab, + struct reftable_iterator *it, const char *name) +{ + struct reftable_ref_record ref = { + .refname = (char *)name, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, &ref); + return tab->ops->seek_record(tab->table_arg, it, &rec); +} + +int reftable_table_seek_log(struct reftable_table *tab, + struct reftable_iterator *it, const char *name) +{ + struct reftable_log_record log = { + .refname = (char *)name, + .update_index = ~((uint64_t)0), + }; + struct reftable_record rec = { NULL }; + reftable_record_from_log(&rec, &log); + return tab->ops->seek_record(tab->table_arg, it, &rec); +} + +int reftable_table_read_ref(struct reftable_table *tab, const char *name, + struct reftable_ref_record *ref) +{ + struct reftable_iterator it = { NULL }; + int err = reftable_table_seek_ref(tab, &it, name); + if (err) + goto done; + + err = reftable_iterator_next_ref(&it, ref); + if (err) + goto done; + + if (strcmp(ref->refname, name) || + reftable_ref_record_is_deletion(ref)) { + reftable_ref_record_release(ref); + err = 1; + goto done; + } + +done: + reftable_iterator_destroy(&it); + return err; +} + +int reftable_table_print(struct reftable_table *tab) { + struct reftable_iterator it = { NULL }; + struct reftable_ref_record ref = { NULL }; + struct reftable_log_record log = { NULL }; + uint32_t hash_id = reftable_table_hash_id(tab); + int err = reftable_table_seek_ref(tab, &it, ""); + if (err < 0) { + return err; + } + + while (1) { + err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + break; + } + if (err < 0) { + return err; + } + reftable_ref_record_print(&ref, hash_id); + } + reftable_iterator_destroy(&it); + reftable_ref_record_release(&ref); + + err = reftable_table_seek_log(tab, &it, ""); + if (err < 0) { + return err; + } + while (1) { + err = 
reftable_iterator_next_log(&it, &log); + if (err > 0) { + break; + } + if (err < 0) { + return err; + } + reftable_log_record_print(&log, hash_id); + } + reftable_iterator_destroy(&it); + reftable_log_record_release(&log); + return 0; +} + +uint64_t reftable_table_max_update_index(struct reftable_table *tab) +{ + return tab->ops->max_update_index(tab->table_arg); +} + +uint64_t reftable_table_min_update_index(struct reftable_table *tab) +{ + return tab->ops->min_update_index(tab->table_arg); +} + +uint32_t reftable_table_hash_id(struct reftable_table *tab) +{ + return tab->ops->hash_id(tab->table_arg); +} + +void reftable_iterator_destroy(struct reftable_iterator *it) +{ + if (!it->ops) { + return; + } + it->ops->close(it->iter_arg); + it->ops = NULL; + FREE_AND_NULL(it->iter_arg); +} + +int reftable_iterator_next_ref(struct reftable_iterator *it, + struct reftable_ref_record *ref) +{ + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, ref); + return iterator_next(it, &rec); +} + +int reftable_iterator_next_log(struct reftable_iterator *it, + struct reftable_log_record *log) +{ + struct reftable_record rec = { NULL }; + reftable_record_from_log(&rec, log); + return iterator_next(it, &rec); +} + +int iterator_next(struct reftable_iterator *it, struct reftable_record *rec) +{ + return it->ops->next(it->iter_arg, rec); +} + +static int empty_iterator_next(void *arg, struct reftable_record *rec) +{ + return 1; +} + +static void empty_iterator_close(void *arg) +{ +} + +static struct reftable_iterator_vtable empty_vtable = { + .next = &empty_iterator_next, + .close = &empty_iterator_close, +}; + +void iterator_set_empty(struct reftable_iterator *it) +{ + assert(!it->ops); + it->iter_arg = NULL; + it->ops = &empty_vtable; +} diff --git a/reftable/generic.h b/reftable/generic.h new file mode 100644 index 00000000000..98886a06402 --- /dev/null +++ b/reftable/generic.h @@ -0,0 +1,32 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is 
governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef GENERIC_H +#define GENERIC_H + +#include "record.h" +#include "reftable-generic.h" + +/* generic interface to reftables */ +struct reftable_table_vtable { + int (*seek_record)(void *tab, struct reftable_iterator *it, + struct reftable_record *); + uint32_t (*hash_id)(void *tab); + uint64_t (*min_update_index)(void *tab); + uint64_t (*max_update_index)(void *tab); +}; + +struct reftable_iterator_vtable { + int (*next)(void *iter_arg, struct reftable_record *rec); + void (*close)(void *iter_arg); +}; + +void iterator_set_empty(struct reftable_iterator *it); +int iterator_next(struct reftable_iterator *it, struct reftable_record *rec); + +#endif diff --git a/reftable/reftable-generic.h b/reftable/reftable-generic.h new file mode 100644 index 00000000000..d239751a778 --- /dev/null +++ b/reftable/reftable-generic.h @@ -0,0 +1,47 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_GENERIC_H +#define REFTABLE_GENERIC_H + +#include "reftable-iterator.h" + +struct reftable_table_vtable; + +/* + * Provides a unified API for reading tables, either merged tables, or single + * readers. */ +struct reftable_table { + struct reftable_table_vtable *ops; + void *table_arg; +}; + +int reftable_table_seek_log(struct reftable_table *tab, + struct reftable_iterator *it, const char *name); + +int reftable_table_seek_ref(struct reftable_table *tab, + struct reftable_iterator *it, const char *name); + +/* returns the hash ID from a generic reftable_table */ +uint32_t reftable_table_hash_id(struct reftable_table *tab); + +/* returns the max update_index covered by this table. 
*/ +uint64_t reftable_table_max_update_index(struct reftable_table *tab); + +/* returns the min update_index covered by this table. */ +uint64_t reftable_table_min_update_index(struct reftable_table *tab); + +/* convenience function to read a single ref. Returns < 0 for error, 0 + for success, and 1 if ref not found. */ +int reftable_table_read_ref(struct reftable_table *tab, const char *name, + struct reftable_ref_record *ref); + +/* dump table contents onto stdout for debugging */ +int reftable_table_print(struct reftable_table *tab); + +#endif diff --git a/reftable/reftable-iterator.h b/reftable/reftable-iterator.h new file mode 100644 index 00000000000..d3eee7af357 --- /dev/null +++ b/reftable/reftable-iterator.h @@ -0,0 +1,39 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_ITERATOR_H +#define REFTABLE_ITERATOR_H + +#include "reftable-record.h" + +struct reftable_iterator_vtable; + +/* iterator is the generic interface for walking over data stored in a + * reftable. + */ +struct reftable_iterator { + struct reftable_iterator_vtable *ops; + void *iter_arg; +}; + +/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0: + * end of iteration. + */ +int reftable_iterator_next_ref(struct reftable_iterator *it, + struct reftable_ref_record *ref); + +/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0: + * end of iteration. + */ +int reftable_iterator_next_log(struct reftable_iterator *it, + struct reftable_log_record *log); + +/* releases resources associated with an iterator. 
*/ +void reftable_iterator_destroy(struct reftable_iterator *it); + +#endif diff --git a/reftable/reftable.c b/reftable/reftable.c new file mode 100644 index 00000000000..0e4607a7cd6 --- /dev/null +++ b/reftable/reftable.c @@ -0,0 +1,115 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "basics.h" +#include "record.h" +#include "generic.h" +#include "reftable-iterator.h" +#include "reftable-generic.h" + +int reftable_table_seek_ref(struct reftable_table *tab, + struct reftable_iterator *it, const char *name) +{ + struct reftable_ref_record ref = { + .refname = (char *)name, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, &ref); + return tab->ops->seek_record(tab->table_arg, it, &rec); +} + +int reftable_table_read_ref(struct reftable_table *tab, const char *name, + struct reftable_ref_record *ref) +{ + struct reftable_iterator it = { NULL }; + int err = reftable_table_seek_ref(tab, &it, name); + if (err) + goto done; + + err = reftable_iterator_next_ref(&it, ref); + if (err) + goto done; + + if (strcmp(ref->refname, name) || + reftable_ref_record_is_deletion(ref)) { + reftable_ref_record_release(ref); + err = 1; + goto done; + } + +done: + reftable_iterator_destroy(&it); + return err; +} + +uint64_t reftable_table_max_update_index(struct reftable_table *tab) +{ + return tab->ops->max_update_index(tab->table_arg); +} + +uint64_t reftable_table_min_update_index(struct reftable_table *tab) +{ + return tab->ops->min_update_index(tab->table_arg); +} + +uint32_t reftable_table_hash_id(struct reftable_table *tab) +{ + return tab->ops->hash_id(tab->table_arg); +} + +void reftable_iterator_destroy(struct reftable_iterator *it) +{ + if (!it->ops) { + return; + } + it->ops->close(it->iter_arg); + it->ops = NULL; + FREE_AND_NULL(it->iter_arg); +} + +int reftable_iterator_next_ref(struct 
reftable_iterator *it, + struct reftable_ref_record *ref) +{ + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, ref); + return iterator_next(it, &rec); +} + +int reftable_iterator_next_log(struct reftable_iterator *it, + struct reftable_log_record *log) +{ + struct reftable_record rec = { NULL }; + reftable_record_from_log(&rec, log); + return iterator_next(it, &rec); +} + +int iterator_next(struct reftable_iterator *it, struct reftable_record *rec) +{ + return it->ops->next(it->iter_arg, rec); +} + +static int empty_iterator_next(void *arg, struct reftable_record *rec) +{ + return 1; +} + +static void empty_iterator_close(void *arg) +{ +} + +static struct reftable_iterator_vtable empty_vtable = { + .next = &empty_iterator_next, + .close = &empty_iterator_close, +}; + +void iterator_set_empty(struct reftable_iterator *it) +{ + assert(!it->ops); + it->iter_arg = NULL; + it->ops = &empty_vtable; +}
From patchwork Tue Jul 20 17:04:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388855 Date: Tue, 20 Jul 2021 17:04:33 +0000 Subject: [PATCH 13/26] reftable: read reftable files To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys From: Han-Wen Nienhuys
This supports reading a single reftable file. The commit introduces an abstract iterator type, which captures the usecases both of reading individual refs, and iterating over a segment of the ref namespace. Signed-off-by: Han-Wen Nienhuys --- Makefile | 2 + reftable/iter.c | 194 +++++++++ reftable/iter.h | 69 ++++ reftable/reader.c | 801 +++++++++++++++++++++++++++++++++++++ reftable/reader.h | 66 +++ reftable/reftable-reader.h | 101 +++++ 6 files changed, 1233 insertions(+) create mode 100644 reftable/iter.c create mode 100644 reftable/iter.h create mode 100644 reftable/reader.c create mode 100644 reftable/reader.h create mode 100644 reftable/reftable-reader.h diff --git a/Makefile b/Makefile index 9e0aefd96f7..ef4bf0f6a49 100644 --- a/Makefile +++ b/Makefile @@ -2452,7 +2452,9 @@ REFTABLE_OBJS += reftable/basics.o REFTABLE_OBJS += reftable/error.o REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o +REFTABLE_OBJS += reftable/iter.o REFTABLE_OBJS += reftable/publicbasics.o +REFTABLE_OBJS += reftable/reader.o REFTABLE_OBJS += reftable/record.o REFTABLE_OBJS += reftable/refname.o REFTABLE_OBJS += reftable/generic.o diff --git a/reftable/iter.c b/reftable/iter.c new file mode 100644 index 00000000000..93d04f735b8 --- /dev/null +++ b/reftable/iter.c @@ -0,0 +1,194 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is 
governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "iter.h" + +#include "system.h" + +#include "block.h" +#include "generic.h" +#include "constants.h" +#include "reader.h" +#include "reftable-error.h" + +int iterator_is_null(struct reftable_iterator *it) +{ + return !it->ops; +} + +static void filtering_ref_iterator_close(void *iter_arg) +{ + struct filtering_ref_iterator *fri = iter_arg; + strbuf_release(&fri->oid); + reftable_iterator_destroy(&fri->it); +} + +static int filtering_ref_iterator_next(void *iter_arg, + struct reftable_record *rec) +{ + struct filtering_ref_iterator *fri = iter_arg; + struct reftable_ref_record *ref = rec->data; + int err = 0; + while (1) { + err = reftable_iterator_next_ref(&fri->it, ref); + if (err != 0) { + break; + } + + if (fri->double_check) { + struct reftable_iterator it = { NULL }; + + err = reftable_table_seek_ref(&fri->tab, &it, + ref->refname); + if (err == 0) { + err = reftable_iterator_next_ref(&it, ref); + } + + reftable_iterator_destroy(&it); + + if (err < 0) { + break; + } + + if (err > 0) { + continue; + } + } + + if (ref->value_type == REFTABLE_REF_VAL2 && + (!memcmp(fri->oid.buf, ref->value.val2.target_value, + fri->oid.len) || + !memcmp(fri->oid.buf, ref->value.val2.value, + fri->oid.len))) + return 0; + + if (ref->value_type == REFTABLE_REF_VAL1 && + !memcmp(fri->oid.buf, ref->value.val1, fri->oid.len)) { + return 0; + } + } + + reftable_ref_record_release(ref); + return err; +} + +static struct reftable_iterator_vtable filtering_ref_iterator_vtable = { + .next = &filtering_ref_iterator_next, + .close = &filtering_ref_iterator_close, +}; + +void iterator_from_filtering_ref_iterator(struct reftable_iterator *it, + struct filtering_ref_iterator *fri) +{ + assert(!it->ops); + it->iter_arg = fri; + it->ops = &filtering_ref_iterator_vtable; +} + +static void indexed_table_ref_iter_close(void *p) +{ + struct 
indexed_table_ref_iter *it = p; + block_iter_close(&it->cur); + reftable_block_done(&it->block_reader.block); + reftable_free(it->offsets); + strbuf_release(&it->oid); +} + +static int indexed_table_ref_iter_next_block(struct indexed_table_ref_iter *it) +{ + uint64_t off; + int err = 0; + if (it->offset_idx == it->offset_len) { + it->is_finished = 1; + return 1; + } + + reftable_block_done(&it->block_reader.block); + + off = it->offsets[it->offset_idx++]; + err = reader_init_block_reader(it->r, &it->block_reader, off, + BLOCK_TYPE_REF); + if (err < 0) { + return err; + } + if (err > 0) { + /* indexed block does not exist. */ + return REFTABLE_FORMAT_ERROR; + } + block_reader_start(&it->block_reader, &it->cur); + return 0; +} + +static int indexed_table_ref_iter_next(void *p, struct reftable_record *rec) +{ + struct indexed_table_ref_iter *it = p; + struct reftable_ref_record *ref = rec->data; + + while (1) { + int err = block_iter_next(&it->cur, rec); + if (err < 0) { + return err; + } + + if (err > 0) { + err = indexed_table_ref_iter_next_block(it); + if (err < 0) { + return err; + } + + if (it->is_finished) { + return 1; + } + continue; + } + /* BUG */ + if (!memcmp(it->oid.buf, ref->value.val2.target_value, + it->oid.len) || + !memcmp(it->oid.buf, ref->value.val2.value, it->oid.len)) { + return 0; + } + } +} + +int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest, + struct reftable_reader *r, uint8_t *oid, + int oid_len, uint64_t *offsets, int offset_len) +{ + struct indexed_table_ref_iter empty = INDEXED_TABLE_REF_ITER_INIT; + struct indexed_table_ref_iter *itr = + reftable_calloc(sizeof(struct indexed_table_ref_iter)); + int err = 0; + + *itr = empty; + itr->r = r; + strbuf_add(&itr->oid, oid, oid_len); + + itr->offsets = offsets; + itr->offset_len = offset_len; + + err = indexed_table_ref_iter_next_block(itr); + if (err < 0) { + reftable_free(itr); + } else { + *dest = itr; + } + return err; +} + +static struct reftable_iterator_vtable 
indexed_table_ref_iter_vtable = { + .next = &indexed_table_ref_iter_next, + .close = &indexed_table_ref_iter_close, +}; + +void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it, + struct indexed_table_ref_iter *itr) +{ + assert(!it->ops); + it->iter_arg = itr; + it->ops = &indexed_table_ref_iter_vtable; +} diff --git a/reftable/iter.h b/reftable/iter.h new file mode 100644 index 00000000000..09eb0cbfa59 --- /dev/null +++ b/reftable/iter.h @@ -0,0 +1,69 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef ITER_H +#define ITER_H + +#include "system.h" +#include "block.h" +#include "record.h" + +#include "reftable-iterator.h" +#include "reftable-generic.h" + +/* Returns true for a zeroed out iterator, such as the one returned from + * iterator_destroy. */ +int iterator_is_null(struct reftable_iterator *it); + +/* iterator that produces only ref records that point to `oid` */ +struct filtering_ref_iterator { + int double_check; + struct reftable_table tab; + struct strbuf oid; + struct reftable_iterator it; +}; +#define FILTERING_REF_ITERATOR_INIT \ + { \ + .oid = STRBUF_INIT \ + } + +void iterator_from_filtering_ref_iterator(struct reftable_iterator *, + struct filtering_ref_iterator *); + +/* iterator that produces only ref records that point to `oid`, + * but using the object index. + */ +struct indexed_table_ref_iter { + struct reftable_reader *r; + struct strbuf oid; + + /* mutable */ + uint64_t *offsets; + + /* Points to the next offset to read. 
*/ + int offset_idx; + int offset_len; + struct block_reader block_reader; + struct block_iter cur; + int is_finished; +}; + +#define INDEXED_TABLE_REF_ITER_INIT \ + { \ + .cur = { .last_key = STRBUF_INIT }, .oid = STRBUF_INIT, \ + } + +void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it, + struct indexed_table_ref_iter *itr); + +/* Takes ownership of `offsets` */ +int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest, + struct reftable_reader *r, uint8_t *oid, + int oid_len, uint64_t *offsets, int offset_len); + +#endif diff --git a/reftable/reader.c b/reftable/reader.c new file mode 100644 index 00000000000..49f4ec070e6 --- /dev/null +++ b/reftable/reader.c @@ -0,0 +1,801 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "reader.h" + +#include "system.h" +#include "block.h" +#include "constants.h" +#include "generic.h" +#include "iter.h" +#include "record.h" +#include "reftable-error.h" +#include "reftable-generic.h" +#include "tree.h" + +uint64_t block_source_size(struct reftable_block_source *source) +{ + return source->ops->size(source->arg); +} + +int block_source_read_block(struct reftable_block_source *source, + struct reftable_block *dest, uint64_t off, + uint32_t size) +{ + int result = source->ops->read_block(source->arg, dest, off, size); + dest->source = *source; + return result; +} + +void block_source_close(struct reftable_block_source *source) +{ + if (!source->ops) { + return; + } + + source->ops->close(source->arg); + source->ops = NULL; +} + +static struct reftable_reader_offsets * +reader_offsets_for(struct reftable_reader *r, uint8_t typ) +{ + switch (typ) { + case BLOCK_TYPE_REF: + return &r->ref_offsets; + case BLOCK_TYPE_LOG: + return &r->log_offsets; + case BLOCK_TYPE_OBJ: + return &r->obj_offsets; + } + abort(); +} + +static int 
reader_get_block(struct reftable_reader *r, + struct reftable_block *dest, uint64_t off, + uint32_t sz) +{ + if (off >= r->size) + return 0; + + if (off + sz > r->size) { + sz = r->size - off; + } + + return block_source_read_block(&r->source, dest, off, sz); +} + +uint32_t reftable_reader_hash_id(struct reftable_reader *r) +{ + return r->hash_id; +} + +const char *reader_name(struct reftable_reader *r) +{ + return r->name; +} + +static int parse_footer(struct reftable_reader *r, uint8_t *footer, + uint8_t *header) +{ + uint8_t *f = footer; + uint8_t first_block_typ; + int err = 0; + uint32_t computed_crc; + uint32_t file_crc; + + if (memcmp(f, "REFT", 4)) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + f += 4; + + if (memcmp(footer, header, header_size(r->version))) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + f++; + r->block_size = get_be24(f); + + f += 3; + r->min_update_index = get_be64(f); + f += 8; + r->max_update_index = get_be64(f); + f += 8; + + if (r->version == 1) { + r->hash_id = GIT_SHA1_FORMAT_ID; + } else { + r->hash_id = get_be32(f); + switch (r->hash_id) { + case GIT_SHA1_FORMAT_ID: + break; + case GIT_SHA256_FORMAT_ID: + break; + default: + err = REFTABLE_FORMAT_ERROR; + goto done; + } + f += 4; + } + + r->ref_offsets.index_offset = get_be64(f); + f += 8; + + r->obj_offsets.offset = get_be64(f); + f += 8; + + r->object_id_len = r->obj_offsets.offset & ((1 << 5) - 1); + r->obj_offsets.offset >>= 5; + + r->obj_offsets.index_offset = get_be64(f); + f += 8; + r->log_offsets.offset = get_be64(f); + f += 8; + r->log_offsets.index_offset = get_be64(f); + f += 8; + + computed_crc = crc32(0, footer, f - footer); + file_crc = get_be32(f); + f += 4; + if (computed_crc != file_crc) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + first_block_typ = header[header_size(r->version)]; + r->ref_offsets.is_present = (first_block_typ == BLOCK_TYPE_REF); + r->ref_offsets.offset = 0; + r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG || 
+ r->log_offsets.offset > 0); + r->obj_offsets.is_present = r->obj_offsets.offset > 0; + err = 0; +done: + return err; +} + +int init_reader(struct reftable_reader *r, struct reftable_block_source *source, + const char *name) +{ + struct reftable_block footer = { NULL }; + struct reftable_block header = { NULL }; + int err = 0; + uint64_t file_size = block_source_size(source); + + /* Need +1 to read type of first block. */ + uint32_t read_size = header_size(2) + 1; /* read v2 because it's larger. */ + memset(r, 0, sizeof(struct reftable_reader)); + + if (read_size > file_size) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + err = block_source_read_block(source, &header, 0, read_size); + if (err != header_size(2) + 1) { + err = REFTABLE_IO_ERROR; + goto done; + } + + if (memcmp(header.data, "REFT", 4)) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + r->version = header.data[4]; + if (r->version != 1 && r->version != 2) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + r->size = file_size - footer_size(r->version); + r->source = *source; + r->name = xstrdup(name); + r->hash_id = 0; + + err = block_source_read_block(source, &footer, r->size, + footer_size(r->version)); + if (err != footer_size(r->version)) { + err = REFTABLE_IO_ERROR; + goto done; + } + + err = parse_footer(r, footer.data, header.data); +done: + reftable_block_done(&footer); + reftable_block_done(&header); + return err; +} + +struct table_iter { + struct reftable_reader *r; + uint8_t typ; + uint64_t block_off; + struct block_iter bi; + int is_finished; +}; +#define TABLE_ITER_INIT \ + { \ + .bi = {.last_key = STRBUF_INIT } \ + } + +static void table_iter_copy_from(struct table_iter *dest, + struct table_iter *src) +{ + dest->r = src->r; + dest->typ = src->typ; + dest->block_off = src->block_off; + dest->is_finished = src->is_finished; + block_iter_copy_from(&dest->bi, &src->bi); +} + +static int table_iter_next_in_block(struct table_iter *ti, + struct reftable_record *rec) +{ + int res = 
block_iter_next(&ti->bi, rec); + if (res == 0 && reftable_record_type(rec) == BLOCK_TYPE_REF) { + ((struct reftable_ref_record *)rec->data)->update_index += + ti->r->min_update_index; + } + + return res; +} + +static void table_iter_block_done(struct table_iter *ti) +{ + if (!ti->bi.br) { + return; + } + reftable_block_done(&ti->bi.br->block); + FREE_AND_NULL(ti->bi.br); + + ti->bi.last_key.len = 0; + ti->bi.next_off = 0; +} + +static int32_t extract_block_size(uint8_t *data, uint8_t *typ, uint64_t off, + int version) +{ + int32_t result = 0; + + if (off == 0) { + data += header_size(version); + } + + *typ = data[0]; + if (reftable_is_block_type(*typ)) { + result = get_be24(data + 1); + } + return result; +} + +int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br, + uint64_t next_off, uint8_t want_typ) +{ + int32_t guess_block_size = r->block_size ? r->block_size : + DEFAULT_BLOCK_SIZE; + struct reftable_block block = { NULL }; + uint8_t block_typ = 0; + int err = 0; + uint32_t header_off = next_off ? 
0 : header_size(r->version); + int32_t block_size = 0; + + if (next_off >= r->size) + return 1; + + err = reader_get_block(r, &block, next_off, guess_block_size); + if (err < 0) + return err; + + block_size = extract_block_size(block.data, &block_typ, next_off, + r->version); + if (block_size < 0) + return block_size; + + if (want_typ != BLOCK_TYPE_ANY && block_typ != want_typ) { + reftable_block_done(&block); + return 1; + } + + if (block_size > guess_block_size) { + reftable_block_done(&block); + err = reader_get_block(r, &block, next_off, block_size); + if (err < 0) { + return err; + } + } + + return block_reader_init(br, &block, header_off, r->block_size, + hash_size(r->hash_id)); +} + +static int table_iter_next_block(struct table_iter *dest, + struct table_iter *src) +{ + uint64_t next_block_off = src->block_off + src->bi.br->full_block_size; + struct block_reader br = { 0 }; + int err = 0; + + dest->r = src->r; + dest->typ = src->typ; + dest->block_off = next_block_off; + + err = reader_init_block_reader(src->r, &br, next_block_off, src->typ); + if (err > 0) { + dest->is_finished = 1; + return 1; + } + if (err != 0) + return err; + else { + struct block_reader *brp = + reftable_malloc(sizeof(struct block_reader)); + *brp = br; + + dest->is_finished = 0; + block_reader_start(brp, &dest->bi); + } + return 0; +} + +static int table_iter_next(struct table_iter *ti, struct reftable_record *rec) +{ + if (reftable_record_type(rec) != ti->typ) + return REFTABLE_API_ERROR; + + while (1) { + struct table_iter next = TABLE_ITER_INIT; + int err = 0; + if (ti->is_finished) { + return 1; + } + + err = table_iter_next_in_block(ti, rec); + if (err <= 0) { + return err; + } + + err = table_iter_next_block(&next, ti); + if (err != 0) { + ti->is_finished = 1; + } + table_iter_block_done(ti); + if (err != 0) { + return err; + } + table_iter_copy_from(ti, &next); + block_iter_close(&next.bi); + } +} + +static int table_iter_next_void(void *ti, struct reftable_record *rec) +{ + 
return table_iter_next(ti, rec); +} + +static void table_iter_close(void *p) +{ + struct table_iter *ti = p; + table_iter_block_done(ti); + block_iter_close(&ti->bi); +} + +static struct reftable_iterator_vtable table_iter_vtable = { + .next = &table_iter_next_void, + .close = &table_iter_close, +}; + +static void iterator_from_table_iter(struct reftable_iterator *it, + struct table_iter *ti) +{ + assert(!it->ops); + it->iter_arg = ti; + it->ops = &table_iter_vtable; +} + +static int reader_table_iter_at(struct reftable_reader *r, + struct table_iter *ti, uint64_t off, + uint8_t typ) +{ + struct block_reader br = { 0 }; + struct block_reader *brp = NULL; + + int err = reader_init_block_reader(r, &br, off, typ); + if (err != 0) + return err; + + brp = reftable_malloc(sizeof(struct block_reader)); + *brp = br; + ti->r = r; + ti->typ = block_reader_type(brp); + ti->block_off = off; + block_reader_start(brp, &ti->bi); + return 0; +} + +static int reader_start(struct reftable_reader *r, struct table_iter *ti, + uint8_t typ, int index) +{ + struct reftable_reader_offsets *offs = reader_offsets_for(r, typ); + uint64_t off = offs->offset; + if (index) { + off = offs->index_offset; + if (off == 0) { + return 1; + } + typ = BLOCK_TYPE_INDEX; + } + + return reader_table_iter_at(r, ti, off, typ); +} + +static int reader_seek_linear(struct reftable_reader *r, struct table_iter *ti, + struct reftable_record *want) +{ + struct reftable_record rec = + reftable_new_record(reftable_record_type(want)); + struct strbuf want_key = STRBUF_INIT; + struct strbuf got_key = STRBUF_INIT; + struct table_iter next = TABLE_ITER_INIT; + int err = -1; + + reftable_record_key(want, &want_key); + + while (1) { + err = table_iter_next_block(&next, ti); + if (err < 0) + goto done; + + if (err > 0) { + break; + } + + err = block_reader_first_key(next.bi.br, &got_key); + if (err < 0) + goto done; + + if (strbuf_cmp(&got_key, &want_key) > 0) { + table_iter_block_done(&next); + break; + } + + 
table_iter_block_done(ti); + table_iter_copy_from(ti, &next); + } + + err = block_iter_seek(&ti->bi, &want_key); + if (err < 0) + goto done; + err = 0; + +done: + block_iter_close(&next.bi); + reftable_record_destroy(&rec); + strbuf_release(&want_key); + strbuf_release(&got_key); + return err; +} + +static int reader_seek_indexed(struct reftable_reader *r, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + struct reftable_index_record want_index = { .last_key = STRBUF_INIT }; + struct reftable_record want_index_rec = { NULL }; + struct reftable_index_record index_result = { .last_key = STRBUF_INIT }; + struct reftable_record index_result_rec = { NULL }; + struct table_iter index_iter = TABLE_ITER_INIT; + struct table_iter next = TABLE_ITER_INIT; + int err = 0; + + reftable_record_key(rec, &want_index.last_key); + reftable_record_from_index(&want_index_rec, &want_index); + reftable_record_from_index(&index_result_rec, &index_result); + + err = reader_start(r, &index_iter, reftable_record_type(rec), 1); + if (err < 0) + goto done; + + err = reader_seek_linear(r, &index_iter, &want_index_rec); + while (1) { + err = table_iter_next(&index_iter, &index_result_rec); + table_iter_block_done(&index_iter); + if (err != 0) + goto done; + + err = reader_table_iter_at(r, &next, index_result.offset, 0); + if (err != 0) + goto done; + + err = block_iter_seek(&next.bi, &want_index.last_key); + if (err < 0) + goto done; + + if (next.typ == reftable_record_type(rec)) { + err = 0; + break; + } + + if (next.typ != BLOCK_TYPE_INDEX) { + err = REFTABLE_FORMAT_ERROR; + break; + } + + table_iter_copy_from(&index_iter, &next); + } + + if (err == 0) { + struct table_iter empty = TABLE_ITER_INIT; + struct table_iter *malloced = + reftable_calloc(sizeof(struct table_iter)); + *malloced = empty; + table_iter_copy_from(malloced, &next); + iterator_from_table_iter(it, malloced); + } +done: + block_iter_close(&next.bi); + table_iter_close(&index_iter); + 
reftable_record_release(&want_index_rec); + reftable_record_release(&index_result_rec); + return err; +} + +static int reader_seek_internal(struct reftable_reader *r, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + struct reftable_reader_offsets *offs = + reader_offsets_for(r, reftable_record_type(rec)); + uint64_t idx = offs->index_offset; + struct table_iter ti = TABLE_ITER_INIT; + int err = 0; + if (idx > 0) + return reader_seek_indexed(r, it, rec); + + err = reader_start(r, &ti, reftable_record_type(rec), 0); + if (err < 0) + return err; + err = reader_seek_linear(r, &ti, rec); + if (err < 0) + return err; + else { + struct table_iter *p = + reftable_malloc(sizeof(struct table_iter)); + *p = ti; + iterator_from_table_iter(it, p); + } + + return 0; +} + +int reader_seek(struct reftable_reader *r, struct reftable_iterator *it, + struct reftable_record *rec) +{ + uint8_t typ = reftable_record_type(rec); + + struct reftable_reader_offsets *offs = reader_offsets_for(r, typ); + if (!offs->is_present) { + iterator_set_empty(it); + return 0; + } + + return reader_seek_internal(r, it, rec); +} + +int reftable_reader_seek_ref(struct reftable_reader *r, + struct reftable_iterator *it, const char *name) +{ + struct reftable_ref_record ref = { + .refname = (char *)name, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, &ref); + return reader_seek(r, it, &rec); +} + +int reftable_reader_seek_log_at(struct reftable_reader *r, + struct reftable_iterator *it, const char *name, + uint64_t update_index) +{ + struct reftable_log_record log = { + .refname = (char *)name, + .update_index = update_index, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_log(&rec, &log); + return reader_seek(r, it, &rec); +} + +int reftable_reader_seek_log(struct reftable_reader *r, + struct reftable_iterator *it, const char *name) +{ + uint64_t max = ~((uint64_t)0); + return reftable_reader_seek_log_at(r, it, name, max); +} + +void 
reader_close(struct reftable_reader *r) +{ + block_source_close(&r->source); + FREE_AND_NULL(r->name); +} + +int reftable_new_reader(struct reftable_reader **p, + struct reftable_block_source *src, char const *name) +{ + struct reftable_reader *rd = + reftable_calloc(sizeof(struct reftable_reader)); + int err = init_reader(rd, src, name); + if (err == 0) { + *p = rd; + } else { + block_source_close(src); + reftable_free(rd); + } + return err; +} + +void reftable_reader_free(struct reftable_reader *r) +{ + reader_close(r); + reftable_free(r); +} + +static int reftable_reader_refs_for_indexed(struct reftable_reader *r, + struct reftable_iterator *it, + uint8_t *oid) +{ + struct reftable_obj_record want = { + .hash_prefix = oid, + .hash_prefix_len = r->object_id_len, + }; + struct reftable_record want_rec = { NULL }; + struct reftable_iterator oit = { NULL }; + struct reftable_obj_record got = { NULL }; + struct reftable_record got_rec = { NULL }; + int err = 0; + struct indexed_table_ref_iter *itr = NULL; + + /* Look through the reverse index. 
*/ + reftable_record_from_obj(&want_rec, &want); + err = reader_seek(r, &oit, &want_rec); + if (err != 0) + goto done; + + /* read out the reftable_obj_record */ + reftable_record_from_obj(&got_rec, &got); + err = iterator_next(&oit, &got_rec); + if (err < 0) + goto done; + + if (err > 0 || + memcmp(want.hash_prefix, got.hash_prefix, r->object_id_len)) { + /* didn't find it; return empty iterator */ + iterator_set_empty(it); + err = 0; + goto done; + } + + err = new_indexed_table_ref_iter(&itr, r, oid, hash_size(r->hash_id), + got.offsets, got.offset_len); + if (err < 0) + goto done; + got.offsets = NULL; + iterator_from_indexed_table_ref_iter(it, itr); + +done: + reftable_iterator_destroy(&oit); + reftable_record_release(&got_rec); + return err; +} + +static int reftable_reader_refs_for_unindexed(struct reftable_reader *r, + struct reftable_iterator *it, + uint8_t *oid) +{ + struct table_iter ti_empty = TABLE_ITER_INIT; + struct table_iter *ti = reftable_calloc(sizeof(struct table_iter)); + struct filtering_ref_iterator *filter = NULL; + struct filtering_ref_iterator empty = FILTERING_REF_ITERATOR_INIT; + int oid_len = hash_size(r->hash_id); + int err; + + *ti = ti_empty; + err = reader_start(r, ti, BLOCK_TYPE_REF, 0); + if (err < 0) { + reftable_free(ti); + return err; + } + + filter = reftable_malloc(sizeof(struct filtering_ref_iterator)); + *filter = empty; + + strbuf_add(&filter->oid, oid, oid_len); + reftable_table_from_reader(&filter->tab, r); + filter->double_check = 0; + iterator_from_table_iter(&filter->it, ti); + + iterator_from_filtering_ref_iterator(it, filter); + return 0; +} + +int reftable_reader_refs_for(struct reftable_reader *r, + struct reftable_iterator *it, uint8_t *oid) +{ + if (r->obj_offsets.is_present) + return reftable_reader_refs_for_indexed(r, it, oid); + return reftable_reader_refs_for_unindexed(r, it, oid); +} + +uint64_t reftable_reader_max_update_index(struct reftable_reader *r) +{ + return r->max_update_index; +} + +uint64_t 
reftable_reader_min_update_index(struct reftable_reader *r) +{ + return r->min_update_index; +} + +/* generic table interface. */ + +static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it, + struct reftable_record *rec) +{ + return reader_seek(tab, it, rec); +} + +static uint32_t reftable_reader_hash_id_void(void *tab) +{ + return reftable_reader_hash_id(tab); +} + +static uint64_t reftable_reader_min_update_index_void(void *tab) +{ + return reftable_reader_min_update_index(tab); +} + +static uint64_t reftable_reader_max_update_index_void(void *tab) +{ + return reftable_reader_max_update_index(tab); +} + +static struct reftable_table_vtable reader_vtable = { + .seek_record = reftable_reader_seek_void, + .hash_id = reftable_reader_hash_id_void, + .min_update_index = reftable_reader_min_update_index_void, + .max_update_index = reftable_reader_max_update_index_void, +}; + +void reftable_table_from_reader(struct reftable_table *tab, + struct reftable_reader *reader) +{ + assert(!tab->ops); + tab->ops = &reader_vtable; + tab->table_arg = reader; +} + + +int reftable_reader_print_file(const char *tablename) +{ + struct reftable_block_source src = { NULL }; + int err = reftable_block_source_from_file(&src, tablename); + struct reftable_reader *r = NULL; + struct reftable_table tab = { NULL }; + if (err < 0) + goto done; + + err = reftable_new_reader(&r, &src, tablename); + if (err < 0) + goto done; + + reftable_table_from_reader(&tab, r); + err = reftable_table_print(&tab); +done: + reftable_reader_free(r); + return err; +} diff --git a/reftable/reader.h b/reftable/reader.h new file mode 100644 index 00000000000..39583e5dbcd --- /dev/null +++ b/reftable/reader.h @@ -0,0 +1,66 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef READER_H +#define READER_H + +#include "block.h" +#include 
"record.h" +#include "reftable-iterator.h" +#include "reftable-reader.h" + +uint64_t block_source_size(struct reftable_block_source *source); + +int block_source_read_block(struct reftable_block_source *source, + struct reftable_block *dest, uint64_t off, + uint32_t size); +void block_source_close(struct reftable_block_source *source); + +/* metadata for a block type */ +struct reftable_reader_offsets { + int is_present; + uint64_t offset; + uint64_t index_offset; +}; + +/* The state for reading a reftable file. */ +struct reftable_reader { + /* for convenience, associate a name with the instance. */ + char *name; + struct reftable_block_source source; + + /* Size of the file, excluding the footer. */ + uint64_t size; + + /* 'sha1' for SHA1, 's256' for SHA-256 */ + uint32_t hash_id; + + uint32_t block_size; + uint64_t min_update_index; + uint64_t max_update_index; + /* Length of the OID keys in the 'o' section */ + int object_id_len; + int version; + + struct reftable_reader_offsets ref_offsets; + struct reftable_reader_offsets obj_offsets; + struct reftable_reader_offsets log_offsets; +}; + +int init_reader(struct reftable_reader *r, struct reftable_block_source *source, + const char *name); +int reader_seek(struct reftable_reader *r, struct reftable_iterator *it, + struct reftable_record *rec); +void reader_close(struct reftable_reader *r); +const char *reader_name(struct reftable_reader *r); + +/* initialize a block reader to read from `r` */ +int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br, + uint64_t next_off, uint8_t want_typ); + +#endif diff --git a/reftable/reftable-reader.h b/reftable/reftable-reader.h new file mode 100644 index 00000000000..4a4bc2fdf85 --- /dev/null +++ b/reftable/reftable-reader.h @@ -0,0 +1,101 @@ +/* + Copyright 2020 Google LLC + + Use of this source code is governed by a BSD-style + license that can be found in the LICENSE file or at + https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef 
REFTABLE_READER_H +#define REFTABLE_READER_H + +#include "reftable-iterator.h" +#include "reftable-blocksource.h" + +/* + * Reading single tables + * + * The follow routines are for reading single files. For an + * application-level interface, skip ahead to struct + * reftable_merged_table and struct reftable_stack. + */ + +/* The reader struct is a handle to an open reftable file. */ +struct reftable_reader; + +/* Generic table. */ +struct reftable_table; + +/* reftable_new_reader opens a reftable for reading. If successful, + * returns 0 code and sets pp. The name is used for creating a + * stack. Typically, it is the basename of the file. The block source + * `src` is owned by the reader, and is closed on calling + * reftable_reader_destroy(). On error, the block source `src` is + * closed as well. + */ +int reftable_new_reader(struct reftable_reader **pp, + struct reftable_block_source *src, const char *name); + +/* reftable_reader_seek_ref returns an iterator where 'name' would be inserted + in the table. To seek to the start of the table, use name = "". + + example: + + struct reftable_reader *r = NULL; + int err = reftable_new_reader(&r, &src, "filename"); + if (err < 0) { ... } + struct reftable_iterator it = {0}; + err = reftable_reader_seek_ref(r, &it, "refs/heads/master"); + if (err < 0) { ... } + struct reftable_ref_record ref = {0}; + while (1) { + err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + break; + } + if (err < 0) { + ..error handling.. + } + ..found.. + } + reftable_iterator_destroy(&it); + reftable_ref_record_release(&ref); +*/ +int reftable_reader_seek_ref(struct reftable_reader *r, + struct reftable_iterator *it, const char *name); + +/* returns the hash ID used in this table. */ +uint32_t reftable_reader_hash_id(struct reftable_reader *r); + +/* seek to logs for the given name, older than update_index. To seek to the + start of the table, use name = "". 
+*/
+int reftable_reader_seek_log_at(struct reftable_reader *r,
+				struct reftable_iterator *it, const char *name,
+				uint64_t update_index);
+
+/* seek to newest log entry for given name. */
+int reftable_reader_seek_log(struct reftable_reader *r,
+			     struct reftable_iterator *it, const char *name);
+
+/* closes and deallocates a reader. */
+void reftable_reader_free(struct reftable_reader *);
+
+/* return an iterator for the refs pointing to `oid`. */
+int reftable_reader_refs_for(struct reftable_reader *r,
+			     struct reftable_iterator *it, uint8_t *oid);
+
+/* return the max_update_index for a table */
+uint64_t reftable_reader_max_update_index(struct reftable_reader *r);
+
+/* return the min_update_index for a table */
+uint64_t reftable_reader_min_update_index(struct reftable_reader *r);
+
+/* creates a generic table from a file reader. */
+void reftable_table_from_reader(struct reftable_table *tab,
+				struct reftable_reader *reader);
+
+/* print table onto stdout for debugging. */
+int reftable_reader_print_file(const char *tablename);
+
+#endif

From patchwork Tue Jul 20 17:04:34 2021
X-Patchwork-Submitter: Han-Wen Nienhuys
X-Patchwork-Id: 12388863
Message-Id: <3eeefa5665f2b9e6c89e78584d3912bdf0362ae4.1626800687.git.gitgitgadget@gmail.com>
Date: Tue, 20 Jul 2021 17:04:34 +0000
Subject: [PATCH 14/26] reftable: reftable file level tests
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys, Han-Wen Nienhuys
From: Han-Wen Nienhuys

From: Han-Wen Nienhuys

With support for reading and writing files in place, we can construct
files (in memory) and attempt to read them back. Because some sections
of the format are optional (e.g.
indices, log entries), we have to exercise this code using multiple
sizes of input data

Signed-off-by: Han-Wen Nienhuys
---
 Makefile                  |   1 +
 reftable/readwrite_test.c | 650 ++++++++++++++++++++++++++++++++++++++
 reftable/reftable-tests.h |   2 +-
 t/helper/test-reftable.c  |   1 +
 4 files changed, 653 insertions(+), 1 deletion(-)
 create mode 100644 reftable/readwrite_test.c

diff --git a/Makefile b/Makefile
index ef4bf0f6a49..235d30b55d6 100644
--- a/Makefile
+++ b/Makefile
@@ -2465,6 +2465,7 @@ REFTABLE_OBJS += reftable/writer.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/readwrite_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
new file mode 100644
index 00000000000..42ca48f83c4
--- /dev/null
+++ b/reftable/readwrite_test.c
@@ -0,0 +1,650 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "block.h"
+#include "blocksource.h"
+#include "constants.h"
+#include "reader.h"
+#include "record.h"
+#include "test_framework.h"
+#include "reftable-tests.h"
+#include "reftable-writer.h"
+
+static const int update_index = 5;
+
+static void test_buffer(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_block_source source = { NULL };
+	struct reftable_block out = { NULL };
+	int n;
+	uint8_t in[] = "hello";
+	strbuf_add(&buf, in, sizeof(in));
+	block_source_from_strbuf(&source, &buf);
+	EXPECT(block_source_size(&source) == 6);
+	n = block_source_read_block(&source, &out, 0, sizeof(in));
+	EXPECT(n == sizeof(in));
+	EXPECT(!memcmp(in, out.data, n));
+	reftable_block_done(&out);
+
+	n = block_source_read_block(&source, &out, 1, 2);
+	EXPECT(n == 2);
+	EXPECT(!memcmp(out.data, "el", 2));
+
+	reftable_block_done(&out);
+	block_source_close(&source);
+	strbuf_release(&buf);
+}
+
+static void write_table(char ***names, struct strbuf *buf, int N,
+			int block_size, uint32_t hash_id)
+{
+	struct reftable_write_options opts = {
+		.block_size = block_size,
+		.hash_id = hash_id,
+	};
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, buf, &opts);
+	struct reftable_ref_record ref = { NULL };
+	int i = 0, n;
+	struct reftable_log_record log = { NULL };
+	const struct reftable_stats *stats = NULL;
+	*names = reftable_calloc(sizeof(char *) * (N + 1));
+	reftable_writer_set_limits(w, update_index, update_index);
+	for (i = 0; i < N; i++) {
+		uint8_t hash[GIT_SHA256_RAWSZ] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		ref.refname = name;
+		ref.update_index = update_index;
+		ref.value_type = REFTABLE_REF_VAL1;
+		ref.value.val1 = hash;
+		(*names)[i] = xstrdup(name);
+
+		n = reftable_writer_add_ref(w, &ref);
+		EXPECT(n == 0);
+	}
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[GIT_SHA256_RAWSZ] = { 0 };
+		char name[100];
+		int n;
+
+		set_test_hash(hash, i);
+
+		snprintf(name, sizeof(name), "refs/heads/branch%02d", i);
+
+		log.refname = name;
+		log.update_index = update_index;
+		log.value_type = REFTABLE_LOG_UPDATE;
+		log.update.new_hash = hash;
+		log.update.message = "message";
+
+		n = reftable_writer_add_log(w, &log);
+		EXPECT(n == 0);
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	stats = writer_stats(w);
+	for (i = 0; i < stats->ref_stats.blocks; i++) {
+		int off = i * opts.block_size;
+		if (off == 0) {
+			off = header_size((hash_id == GIT_SHA256_FORMAT_ID) ? 2 :
+									      1);
+		}
+		EXPECT(buf->buf[off] == 'r');
+	}
+
+	EXPECT(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+}
+
+static void test_log_buffer_size(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_write_options opts = {
+		.block_size = 4096,
+	};
+	int err;
+	struct reftable_log_record log = { .refname = "refs/heads/master",
+					   .update_index = 0xa,
+					   .value_type = REFTABLE_LOG_UPDATE,
+					   .update = {
+						   .name = "Han-Wen Nienhuys",
+						   .email = "hanwen@google.com",
+						   .tz_offset = 100,
+						   .time = 0x5e430672,
+						   .message = "commit: 9\n",
+					   } };
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	/* This tests buffer extension for log compression. Must use a random
+	   hash, to ensure that the compressed part is larger than the original.
+	*/
+	uint8_t hash1[GIT_SHA1_RAWSZ], hash2[GIT_SHA1_RAWSZ];
+	for (int i = 0; i < GIT_SHA1_RAWSZ; i++) {
+		hash1[i] = (uint8_t)(rand() % 256);
+		hash2[i] = (uint8_t)(rand() % 256);
+	}
+	log.update.old_hash = hash1;
+	log.update.new_hash = hash2;
+	reftable_writer_set_limits(w, update_index, update_index);
+	err = reftable_writer_add_log(w, &log);
+	EXPECT_ERR(err);
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
+static void test_log_write_read(void)
+{
+	int N = 2;
+	char **names = reftable_calloc(sizeof(char *) * (N + 1));
+	int err;
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	struct reftable_log_record log = { NULL };
+	int n;
+	struct reftable_iterator it = { NULL };
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	const struct reftable_stats *stats = NULL;
+	reftable_writer_set_limits(w, 0, N);
+	for (i = 0; i < N; i++) {
+		char name[256];
+		struct reftable_ref_record ref = { NULL };
+		snprintf(name, sizeof(name), "b%02d%0*d", i, 130, 7);
+		names[i] = xstrdup(name);
+		ref.refname = name;
+		ref.update_index = i;
+
+		err = reftable_writer_add_ref(w, &ref);
+		EXPECT_ERR(err);
+	}
+	for (i = 0; i < N; i++) {
+		uint8_t hash1[GIT_SHA1_RAWSZ], hash2[GIT_SHA1_RAWSZ];
+		struct reftable_log_record log = { NULL };
+		set_test_hash(hash1, i);
+		set_test_hash(hash2, i + 1);
+
+		log.refname = names[i];
+		log.update_index = i;
+		log.value_type = REFTABLE_LOG_UPDATE;
+		log.update.old_hash = hash1;
+		log.update.new_hash = hash2;
+
+		err = reftable_writer_add_log(w, &log);
+		EXPECT_ERR(err);
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	stats = writer_stats(w);
+	EXPECT(stats->log_stats.blocks > 0);
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.log");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[N - 1]);
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT_ERR(err);
+
+	/* end of iteration. */
+	err = reftable_iterator_next_ref(&it, &ref);
+	EXPECT(0 < err);
+
+	reftable_iterator_destroy(&it);
+	reftable_ref_record_release(&ref);
+
+	err = reftable_reader_seek_log(&rd, &it, "");
+	EXPECT_ERR(err);
+
+	i = 0;
+	while (1) {
+		int err = reftable_iterator_next_log(&it, &log);
+		if (err > 0) {
+			break;
+		}
+
+		EXPECT_ERR(err);
+		EXPECT_STREQ(names[i], log.refname);
+		EXPECT(i == log.update_index);
+		i++;
+		reftable_log_record_release(&log);
+	}
+
+	EXPECT(i == N);
+	reftable_iterator_destroy(&it);
+
+	/* cleanup. */
+	strbuf_release(&buf);
+	free_names(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_sequential(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_iterator it = { NULL };
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader rd = { NULL };
+	int err = 0;
+	int j = 0;
+
+	write_table(&names, &buf, N, 256, GIT_SHA1_FORMAT_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	EXPECT_ERR(err);
+
+	while (1) {
+		struct reftable_ref_record ref = { NULL };
+		int r = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(r >= 0);
+		if (r > 0) {
+			break;
+		}
+		EXPECT(0 == strcmp(names[j], ref.refname));
+		EXPECT(update_index == ref.update_index);
+
+		j++;
+		reftable_ref_record_release(&ref);
+	}
+	EXPECT(j == N);
+	reftable_iterator_destroy(&it);
+	strbuf_release(&buf);
+	free_names(names);
+
+	reader_close(&rd);
+}
+
+static void test_table_write_small_table(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 1;
+	write_table(&names, &buf, N, 4096, GIT_SHA1_FORMAT_ID);
+	EXPECT(buf.len < 200);
+	strbuf_release(&buf);
+	free_names(names);
+}
+
+static void test_table_read_api(void)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i;
+	struct reftable_log_record log = { NULL };
+	struct reftable_iterator it = { NULL };
+
+	write_table(&names, &buf, N, 256, GIT_SHA1_FORMAT_ID);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(&rd, &it, names[0]);
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_log(&it, &log);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_iterator_destroy(&it);
+	reftable_free(names);
+	reader_close(&rd);
+	strbuf_release(&buf);
+}
+
+static void test_table_read_write_seek(int index, int hash_id)
+{
+	char **names;
+	struct strbuf buf = STRBUF_INIT;
+	int N = 50;
+	struct reftable_reader rd = { NULL };
+	struct reftable_block_source source = { NULL };
+	int err;
+	int i = 0;
+
+	struct reftable_iterator it = { NULL };
+	struct strbuf pastLast = STRBUF_INIT;
+	struct reftable_ref_record ref = { NULL };
+
+	write_table(&names, &buf, N, 256, hash_id);
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+	EXPECT(hash_id == reftable_reader_hash_id(&rd));
+
+	if (!index) {
+		rd.ref_offsets.index_offset = 0;
+	} else {
+		EXPECT(rd.ref_offsets.index_offset > 0);
+	}
+
+	for (i = 1; i < N; i++) {
+		int err = reftable_reader_seek_ref(&rd, &it, names[i]);
+		EXPECT_ERR(err);
+		err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT_ERR(err);
+		EXPECT(0 == strcmp(names[i], ref.refname));
+		EXPECT(REFTABLE_REF_VAL1 == ref.value_type);
+		EXPECT(i == ref.value.val1[0]);
+
+		reftable_ref_record_release(&ref);
+		reftable_iterator_destroy(&it);
+	}
+
+	strbuf_addstr(&pastLast, names[N - 1]);
+	strbuf_addstr(&pastLast, "/");
+
+	err = reftable_reader_seek_ref(&rd, &it, pastLast.buf);
+	if (err == 0) {
+		struct reftable_ref_record ref = { NULL };
+		int err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(err > 0);
+	} else {
+		EXPECT(err > 0);
+	}
+
+	strbuf_release(&pastLast);
+	reftable_iterator_destroy(&it);
+
+	strbuf_release(&buf);
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+	reftable_free(names);
+	reader_close(&rd);
+}
+
+static void test_table_read_write_seek_linear(void)
+{
+	test_table_read_write_seek(0, GIT_SHA1_FORMAT_ID);
+}
+
+static void test_table_read_write_seek_linear_sha256(void)
+{
+	test_table_read_write_seek(0, GIT_SHA256_FORMAT_ID);
+}
+
+static void test_table_read_write_seek_index(void)
+{
+	test_table_read_write_seek(1, GIT_SHA1_FORMAT_ID);
+}
+
+static void test_table_refs_for(int indexed)
+{
+	int N = 50;
+	char **want_names = reftable_calloc(sizeof(char *) * (N + 1));
+	int want_names_len = 0;
+	uint8_t want_hash[GIT_SHA1_RAWSZ];
+
+	struct reftable_write_options opts = {
+		.block_size = 256,
+	};
+	struct reftable_ref_record ref = { NULL };
+	int i = 0;
+	int n;
+	int err;
+	struct reftable_reader rd;
+	struct reftable_block_source source = { NULL };
+
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+
+	struct reftable_iterator it = { NULL };
+	int j;
+
+	set_test_hash(want_hash, 4);
+
+	for (i = 0; i < N; i++) {
+		uint8_t hash[GIT_SHA1_RAWSZ];
+		char fill[51] = { 0 };
+		char name[100];
+		uint8_t hash1[GIT_SHA1_RAWSZ];
+		uint8_t hash2[GIT_SHA1_RAWSZ];
+		struct reftable_ref_record ref = { NULL };
+
+		memset(hash, i, sizeof(hash));
+		memset(fill, 'x', 50);
+		/* Put the variable part in the start */
+		snprintf(name, sizeof(name), "br%02d%s", i, fill);
+		name[40] = 0;
+		ref.refname = name;
+
+		set_test_hash(hash1, i / 4);
+		set_test_hash(hash2, 3 + i / 4);
+		ref.value_type = REFTABLE_REF_VAL2;
+		ref.value.val2.value = hash1;
+		ref.value.val2.target_value = hash2;
+
+		/* 80 bytes / entry, so 3 entries per block. Yields 17
+		 */
+		/* blocks. */
+		n = reftable_writer_add_ref(w, &ref);
+		EXPECT(n == 0);
+
+		if (!memcmp(hash1, want_hash, GIT_SHA1_RAWSZ) ||
+		    !memcmp(hash2, want_hash, GIT_SHA1_RAWSZ)) {
+			want_names[want_names_len++] = xstrdup(name);
+		}
+	}
+
+	n = reftable_writer_close(w);
+	EXPECT(n == 0);
+
+	reftable_writer_free(w);
+	w = NULL;
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = init_reader(&rd, &source, "file.ref");
+	EXPECT_ERR(err);
+	if (!indexed) {
+		rd.obj_offsets.is_present = 0;
+	}
+
+	err = reftable_reader_seek_ref(&rd, &it, "");
+	EXPECT_ERR(err);
+	reftable_iterator_destroy(&it);
+
+	err = reftable_reader_refs_for(&rd, &it, want_hash);
+	EXPECT_ERR(err);
+
+	j = 0;
+	while (1) {
+		int err = reftable_iterator_next_ref(&it, &ref);
+		EXPECT(err >= 0);
+		if (err > 0) {
+			break;
+		}
+
+		EXPECT(j < want_names_len);
+		EXPECT(0 == strcmp(ref.refname, want_names[j]));
+		j++;
+		reftable_ref_record_release(&ref);
+	}
+	EXPECT(j == want_names_len);
+
+	strbuf_release(&buf);
+	free_names(want_names);
+	reftable_iterator_destroy(&it);
+	reader_close(&rd);
+}
+
+static void test_table_refs_for_no_index(void)
+{
+	test_table_refs_for(0);
+}
+
+static void test_table_refs_for_obj_index(void)
+{
+	test_table_refs_for(1);
+}
+
+static void test_write_empty_table(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader *rd = NULL;
+	struct reftable_ref_record rec = { NULL };
+	struct reftable_iterator it = { NULL };
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	err = reftable_writer_close(w);
+	EXPECT(err == REFTABLE_EMPTY_TABLE_ERROR);
+	reftable_writer_free(w);
+
+	EXPECT(buf.len == header_size(1) + footer_size(1));
+
+	block_source_from_strbuf(&source, &buf);
+
+	err = reftable_new_reader(&rd, &source, "filename");
+	EXPECT_ERR(err);
+
+	err = reftable_reader_seek_ref(rd, &it, "");
+	EXPECT_ERR(err);
+
+	err = reftable_iterator_next_ref(&it, &rec);
+	EXPECT(err > 0);
+
+	reftable_iterator_destroy(&it);
+	reftable_reader_free(rd);
+	strbuf_release(&buf);
+}
+
+static void test_write_key_order(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_ref_record refs[2] = {
+		{
+			.refname = "b",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_SYMREF,
+			.value = {
+				.symref = "target",
+			},
+		}, {
+			.refname = "a",
+			.update_index = 1,
+			.value_type = REFTABLE_REF_SYMREF,
+			.value = {
+				.symref = "target",
+			},
+		}
+	};
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+	err = reftable_writer_add_ref(w, &refs[0]);
+	EXPECT_ERR(err);
+	err = reftable_writer_add_ref(w, &refs[1]);
+	printf("%d\n", err);
+	EXPECT(err == REFTABLE_API_ERROR);
+	reftable_writer_close(w);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
+static void test_corrupt_table_empty(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader rd = { NULL };
+	int err;
+
+	block_source_from_strbuf(&source, &buf);
+	err = init_reader(&rd, &source, "file.log");
+	EXPECT(err == REFTABLE_FORMAT_ERROR);
+}
+
+static void test_corrupt_table(void)
+{
+	uint8_t zeros[1024] = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_block_source source = { NULL };
+	struct reftable_reader rd = { NULL };
+	int err;
+	strbuf_add(&buf, zeros, sizeof(zeros));
+
+	block_source_from_strbuf(&source, &buf);
+	err = init_reader(&rd, &source, "file.log");
+	EXPECT(err == REFTABLE_FORMAT_ERROR);
+	strbuf_release(&buf);
+}
+
+int readwrite_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_corrupt_table);
+	RUN_TEST(test_corrupt_table_empty);
+	RUN_TEST(test_log_write_read);
+	RUN_TEST(test_write_key_order);
+	RUN_TEST(test_table_read_write_seek_linear_sha256);
+	RUN_TEST(test_log_buffer_size);
+	RUN_TEST(test_table_write_small_table);
+	RUN_TEST(test_buffer);
+	RUN_TEST(test_table_read_api);
+	RUN_TEST(test_table_read_write_sequential);
+	RUN_TEST(test_table_read_write_seek_linear);
+	RUN_TEST(test_table_read_write_seek_index);
+	RUN_TEST(test_table_refs_for_no_index);
+	RUN_TEST(test_table_refs_for_obj_index);
+	RUN_TEST(test_write_empty_table);
+	return 0;
+}
diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h
index 5e7698ae654..3d541fa5c0c 100644
--- a/reftable/reftable-tests.h
+++ b/reftable/reftable-tests.h
@@ -14,7 +14,7 @@ int block_test_main(int argc, const char **argv);
 int merged_test_main(int argc, const char **argv);
 int record_test_main(int argc, const char **argv);
 int refname_test_main(int argc, const char **argv);
-int reftable_test_main(int argc, const char **argv);
+int readwrite_test_main(int argc, const char **argv);
 int stack_test_main(int argc, const char **argv);
 int tree_test_main(int argc, const char **argv);
 int reftable_dump_main(int argc, char *const *argv);
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 050551fa698..898aba836fd 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -6,6 +6,7 @@ int cmd__reftable(int argc, const char **argv)
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
 	record_test_main(argc, argv);
+	readwrite_test_main(argc, argv);
 	tree_test_main(argc, argv);
 	return 0;
 }

From patchwork Tue Jul 20 17:04:35 2021
X-Patchwork-Submitter: Han-Wen Nienhuys
X-Patchwork-Id: 12388857
Message-Id: <4d7745e65529c3e77cc6d016ca75084e265c96fb.1626800687.git.gitgitgadget@gmail.com>
Date: Tue, 20 Jul 2021 17:04:35 +0000
Subject: [PATCH 15/26] reftable: add a heap-based priority queue for reftable records
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys, Han-Wen Nienhuys
From: Han-Wen Nienhuys

From: Han-Wen Nienhuys

This is needed to create a merged view of multiple reftables.

Signed-off-by: Han-Wen Nienhuys
---
 Makefile                  |   2 +
 reftable/pq.c             | 115 ++++++++++++++++++++++++++++++++++++++
 reftable/pq.h             |  32 +++++++++++
 reftable/pq_test.c        |  72 ++++++++++++++++++++++++
 reftable/reftable-tests.h |   1 +
 t/helper/test-reftable.c  |   1 +
 6 files changed, 223 insertions(+)
 create mode 100644 reftable/pq.c
 create mode 100644 reftable/pq.h
 create mode 100644 reftable/pq_test.c

diff --git a/Makefile b/Makefile
index 235d30b55d6..2123aa782c9 100644
--- a/Makefile
+++ b/Makefile
@@ -2454,6 +2454,7 @@ REFTABLE_OBJS += reftable/block.o
 REFTABLE_OBJS += reftable/blocksource.o
 REFTABLE_OBJS += reftable/iter.o
 REFTABLE_OBJS += reftable/publicbasics.o
+REFTABLE_OBJS += reftable/pq.o
 REFTABLE_OBJS += reftable/reader.o
 REFTABLE_OBJS += reftable/record.o
 REFTABLE_OBJS += reftable/refname.o
@@ -2464,6 +2465,7 @@ REFTABLE_OBJS += reftable/writer.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/pq_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
 REFTABLE_TEST_OBJS += reftable/readwrite_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
diff --git a/reftable/pq.c b/reftable/pq.c
new file mode 100644
index 00000000000..8918d158e2d
--- /dev/null
+++ b/reftable/pq.c
@@ -0,0 +1,115 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "pq.h"
+
+#include "reftable-record.h"
+#include "system.h"
+#include "basics.h"
+
+static int pq_less(struct pq_entry a, struct pq_entry b)
+{
+	struct strbuf ak = STRBUF_INIT;
+	struct strbuf bk = STRBUF_INIT;
+	int cmp = 0;
+	reftable_record_key(&a.rec, &ak);
+	reftable_record_key(&b.rec, &bk);
+
+	cmp = strbuf_cmp(&ak, &bk);
+
+	strbuf_release(&ak);
+	strbuf_release(&bk);
+
+	if (cmp == 0)
+		return a.index > b.index;
+
+	return cmp < 0;
+}
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq)
+{
+	return pq.heap[0];
+}
+
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq)
+{
+	return pq.len == 0;
+}
+
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq)
+{
+	int i = 0;
+	for (i = 1; i < pq.len; i++) {
+		int parent = (i - 1) / 2;
+
+		assert(pq_less(pq.heap[parent], pq.heap[i]));
+	}
+}
+
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	struct pq_entry e = pq->heap[0];
+	pq->heap[0] = pq->heap[pq->len - 1];
+	pq->len--;
+
+	i = 0;
+	while (i < pq->len) {
+		int min = i;
+		int j = 2 * i + 1;
+		int k = 2 * i + 2;
+		if (j < pq->len && pq_less(pq->heap[j], pq->heap[i])) {
+			min = j;
+		}
+		if (k < pq->len && pq_less(pq->heap[k], pq->heap[min])) {
+			min = k;
+		}
+
+		if (min == i) {
+			break;
+		}
+
+		SWAP(pq->heap[i], pq->heap[min]);
+		i = min;
+	}
+
+	return e;
+}
+
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e)
+{
+	int i = 0;
+	if (pq->len == pq->cap) {
+		pq->cap = 2 * pq->cap + 1;
+		pq->heap = reftable_realloc(pq->heap,
+					    pq->cap * sizeof(struct pq_entry));
+	}
+
+	pq->heap[pq->len++] = e;
+	i = pq->len - 1;
+	while (i > 0) {
+		int j = (i - 1) / 2;
+		if (pq_less(pq->heap[j], pq->heap[i])) {
+			break;
+		}
+
+		SWAP(pq->heap[j], pq->heap[i]);
+
+		i = j;
+	}
+}
+
+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq)
+{
+	int i = 0;
+	for (i = 0; i < pq->len; i++) {
+		reftable_record_destroy(&pq->heap[i].rec);
+	}
+	FREE_AND_NULL(pq->heap);
+	pq->len = pq->cap = 0;
+}
diff --git a/reftable/pq.h b/reftable/pq.h
new file mode 100644
index 00000000000..385d2fb139a
--- /dev/null
+++ b/reftable/pq.h
@@ -0,0 +1,32 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef PQ_H
+#define PQ_H
+
+#include "record.h"
+
+struct pq_entry {
+	int index;
+	struct reftable_record rec;
+};
+
+struct merged_iter_pqueue {
+	struct pq_entry *heap;
+	size_t len;
+	size_t cap;
+};
+
+struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq);
+int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq);
+void merged_iter_pqueue_check(struct merged_iter_pqueue pq);
+struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq);
+void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e);
+void merged_iter_pqueue_release(struct merged_iter_pqueue *pq);
+
+#endif
diff --git a/reftable/pq_test.c b/reftable/pq_test.c
new file mode 100644
index 00000000000..ad21673e854
--- /dev/null
+++ b/reftable/pq_test.c
@@ -0,0 +1,72 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "system.h"
+
+#include "basics.h"
+#include "constants.h"
+#include "pq.h"
+#include "record.h"
+#include "reftable-tests.h"
+#include "test_framework.h"
+
+static void test_pq(void)
+{
+	char *names[54] = { NULL };
+	int N = ARRAY_SIZE(names) - 1;
+
+	struct merged_iter_pqueue pq = { NULL };
+	const char *last = NULL;
+
+	int i = 0;
+	for (i = 0; i < N; i++) {
+		char name[100];
+		snprintf(name, sizeof(name), "%02d", i);
+		names[i] = xstrdup(name);
+	}
+
+	i = 1;
+	do {
+		struct reftable_record rec =
+			reftable_new_record(BLOCK_TYPE_REF);
+		struct pq_entry e = { 0 };
+
+		reftable_record_as_ref(&rec)->refname = names[i];
+		e.rec = rec;
+		merged_iter_pqueue_add(&pq, e);
+		merged_iter_pqueue_check(pq);
+		i = (i * 7) % N;
+	} while (i != 1);
+
+	while (!merged_iter_pqueue_is_empty(pq)) {
+		struct pq_entry e = merged_iter_pqueue_remove(&pq);
+		struct reftable_ref_record *ref =
+			reftable_record_as_ref(&e.rec);
+
+		merged_iter_pqueue_check(pq);
+
+		if (last) {
+			assert(strcmp(last, ref->refname) < 0);
+		}
+		last = ref->refname;
+		ref->refname = NULL;
+		reftable_free(ref);
+	}
+
+	for (i = 0; i < N; i++) {
+		reftable_free(names[i]);
+	}
+
+	merged_iter_pqueue_release(&pq);
+}
+
+int pq_test_main(int argc, const char *argv[])
+{
+	RUN_TEST(test_pq);
+	return 0;
+}
diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h
index 3d541fa5c0c..0019cbcfa49 100644
--- a/reftable/reftable-tests.h
+++ b/reftable/reftable-tests.h
@@ -12,6 +12,7 @@ https://developers.google.com/open-source/licenses/bsd
 int basics_test_main(int argc, const char **argv);
 int block_test_main(int argc, const char **argv);
 int merged_test_main(int argc, const char **argv);
+int pq_test_main(int argc, const char **argv);
 int record_test_main(int argc, const char **argv);
 int refname_test_main(int argc, const char **argv);
 int readwrite_test_main(int argc, const char **argv);
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 898aba836fd..0b5a1701df1 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -5,6 +5,7 @@ int cmd__reftable(int argc, const char **argv)
 {
 	basics_test_main(argc, argv);
 	block_test_main(argc, argv);
+	pq_test_main(argc, argv);
 	record_test_main(argc, argv);
 	readwrite_test_main(argc, argv);
 	tree_test_main(argc, argv);

From patchwork Tue Jul 20 17:04:36 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Han-Wen Nienhuys
X-Patchwork-Id: 12388871
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
	HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id DB8E6C07E95
	for ; Tue, 20 Jul 2021 17:08:09 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id C029361106
	for ; Tue, 20 Jul 2021 17:08:09 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S235245AbhGTQ1Q (ORCPT ); Tue, 20 Jul 2021 12:27:16 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49718
	"EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S232461AbhGTQYY (ORCPT );
	Tue, 20 Jul 2021 12:24:24 -0400
Received: from mail-wm1-x331.google.com (mail-wm1-x331.google.com
	[IPv6:2a00:1450:4864:20::331])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3764BC061762
	for ; Tue, 20 Jul 2021 10:05:00 -0700 (PDT)
Received:
by mail-wm1-x331.google.com with SMTP id q19-20020a05600c2e53b0290249f2904453so61897wmf.1 for ; Tue, 20 Jul 2021 10:05:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=6scdmBNv3eCTFhlKLn9bJWjLI9RwmCDyk/DILPaAyJ0=; b=FIFv3GazsQGFZdaatOBeiboIdvLk0DWYxljkUSnlUqLzrxkHMWNzJ+aBTEfN8lQYWn ojxXfsXStEvRwcyZwTzJuJl2FZP+ECveNBIkUHHhtv89OzlVm8dQdeMzC+g+ErRjWmZt NygxpBtaXcyJ32LkppckHtniD7SvcitOgcGC0SatdnybkXmSatkiGrztiHq577s9toJi 8EBdBdsUTor5RXVybhJ3Sa0nuJEDVjB0qwcXoRstanaX1cbpGt6KDBaTZzxlipxYTbyb taKZzKqpGxYRao77guaI0jc4xhI9GYVXpdO3eYKkfsyMX1mtsQUMPsq6exh0RZvtqNEN hHNg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=6scdmBNv3eCTFhlKLn9bJWjLI9RwmCDyk/DILPaAyJ0=; b=RDlKm4VGXVtRD6AT4Myfsdp2+BXRIWbqNBB0XYe133bKrnIYgV+qzWaclCRzrAnOzm KZ5PJoLi/UmuahN2Wkn01RUU4X/XQ63YOBzIDUSINdnQeeQOjf0mRQTBXsbvziFajhw/ vo3L/OQ7CJsvzNxOnpZIl55yk0DzS6jQqIATLf1exDJtU8+GflLyzegcmUcCCEQt1bIj bT1R84+S1uHgQKFbdau4VPM2nncmgfEyCpNdCb9yghLBNt98SZRAlgq2H0YEgHuUZne0 YL4R5Iimchmjoq7BE+xghQJTADhQQG/kGG+/0bXSd6Sn8iWaXKYUtJDyWxJ0TR6Kr7PM SPOA== X-Gm-Message-State: AOAM530x10MQbGliKv/P42pFtX/eJ8Z/3KfoL57MO0FjaRqZAz6Veua0 ie2hVzfgw0oABsYhRq+ZIO0QilqukL8= X-Google-Smtp-Source: ABdhPJxQAN13YLlRN2NYFpOv/MzPjjXxI874Qsl5+LeVxS/Hh/1VKlKntsnVYQa4T6Vw/XGtOkWXTQ== X-Received: by 2002:a05:600c:190a:: with SMTP id j10mr37995324wmq.109.1626800698702; Tue, 20 Jul 2021 10:04:58 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id j4sm10413165wrt.24.2021.07.20.10.04.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 10:04:58 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:36 +0000 Subject: [PATCH 16/26] reftable: add 
merged table view Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys This adds an abstract, read-only interface to the ref database. This primitive is used to construct the read view of the ref database (the read view is constructed by merging several *.ref files). It also provides the mechanism to provide a unified view of the refs in the main repository and the per-worktree refs. Signed-off-by: Han-Wen Nienhuys --- Makefile | 2 + reftable/merged.c | 362 +++++++++++++++++++++++++++++++++++++ reftable/merged.h | 35 ++++ reftable/merged_test.c | 292 ++++++++++++++++++++++++++++++ reftable/reftable-merged.h | 72 ++++++++ t/helper/test-reftable.c | 1 + 6 files changed, 764 insertions(+) create mode 100644 reftable/merged.c create mode 100644 reftable/merged.h create mode 100644 reftable/merged_test.c create mode 100644 reftable/reftable-merged.h diff --git a/Makefile b/Makefile index 2123aa782c9..9369013daed 100644 --- a/Makefile +++ b/Makefile @@ -2454,6 +2454,7 @@ REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/iter.o REFTABLE_OBJS += reftable/publicbasics.o +REFTABLE_OBJS += reftable/merged.o REFTABLE_OBJS += reftable/pq.o REFTABLE_OBJS += reftable/reader.o REFTABLE_OBJS += reftable/record.o @@ -2465,6 +2466,7 @@ REFTABLE_OBJS += reftable/writer.o REFTABLE_TEST_OBJS += reftable/basics_test.o REFTABLE_TEST_OBJS += reftable/block_test.o +REFTABLE_TEST_OBJS += reftable/merged_test.o REFTABLE_TEST_OBJS += reftable/pq_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/readwrite_test.o diff --git a/reftable/merged.c b/reftable/merged.c new file mode 100644 index 00000000000..e5b53da6db3 --- /dev/null +++ b/reftable/merged.c @@ -0,0 +1,362 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can 
be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "merged.h" + +#include "constants.h" +#include "iter.h" +#include "pq.h" +#include "reader.h" +#include "record.h" +#include "generic.h" +#include "reftable-merged.h" +#include "reftable-error.h" +#include "system.h" + +static int merged_iter_init(struct merged_iter *mi) +{ + int i = 0; + for (i = 0; i < mi->stack_len; i++) { + struct reftable_record rec = reftable_new_record(mi->typ); + int err = iterator_next(&mi->stack[i], &rec); + if (err < 0) { + return err; + } + + if (err > 0) { + reftable_iterator_destroy(&mi->stack[i]); + reftable_record_destroy(&rec); + } else { + struct pq_entry e = { + .rec = rec, + .index = i, + }; + merged_iter_pqueue_add(&mi->pq, e); + } + } + + return 0; +} + +static void merged_iter_close(void *p) +{ + struct merged_iter *mi = p; + int i = 0; + merged_iter_pqueue_release(&mi->pq); + for (i = 0; i < mi->stack_len; i++) { + reftable_iterator_destroy(&mi->stack[i]); + } + reftable_free(mi->stack); +} + +static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi, + size_t idx) +{ + struct reftable_record rec = reftable_new_record(mi->typ); + struct pq_entry e = { + .rec = rec, + .index = idx, + }; + int err = iterator_next(&mi->stack[idx], &rec); + if (err < 0) + return err; + + if (err > 0) { + reftable_iterator_destroy(&mi->stack[idx]); + reftable_record_destroy(&rec); + return 0; + } + + merged_iter_pqueue_add(&mi->pq, e); + return 0; +} + +static int merged_iter_advance_subiter(struct merged_iter *mi, size_t idx) +{ + if (iterator_is_null(&mi->stack[idx])) + return 0; + return merged_iter_advance_nonnull_subiter(mi, idx); +} + +static int merged_iter_next_entry(struct merged_iter *mi, + struct reftable_record *rec) +{ + struct strbuf entry_key = STRBUF_INIT; + struct pq_entry entry = { 0 }; + int err = 0; + + if (merged_iter_pqueue_is_empty(mi->pq)) + return 1; + + entry = merged_iter_pqueue_remove(&mi->pq); + err = 
merged_iter_advance_subiter(mi, entry.index); + if (err < 0) + return err; + + /* + One can also use reftable as datacenter-local storage, where the ref + database is maintained in a globally consistent database (e.g. + CockroachDB or Spanner). In this scenario, replication delays together + with compaction may cause newer tables to contain older entries. In + such a deployment, the loop below must be changed to collect all + entries for the same key, and return the newest one. + */ + reftable_record_key(&entry.rec, &entry_key); + while (!merged_iter_pqueue_is_empty(mi->pq)) { + struct pq_entry top = merged_iter_pqueue_top(mi->pq); + struct strbuf k = STRBUF_INIT; + int err = 0, cmp = 0; + + reftable_record_key(&top.rec, &k); + + cmp = strbuf_cmp(&k, &entry_key); + strbuf_release(&k); + + if (cmp > 0) { + break; + } + + merged_iter_pqueue_remove(&mi->pq); + err = merged_iter_advance_subiter(mi, top.index); + if (err < 0) { + return err; + } + reftable_record_destroy(&top.rec); + } + + reftable_record_copy_from(rec, &entry.rec, hash_size(mi->hash_id)); + reftable_record_destroy(&entry.rec); + strbuf_release(&entry_key); + return 0; +} + +static int merged_iter_next(struct merged_iter *mi, struct reftable_record *rec) +{ + while (1) { + int err = merged_iter_next_entry(mi, rec); + if (err == 0 && mi->suppress_deletions && + reftable_record_is_deletion(rec)) { + continue; + } + + return err; + } +} + +static int merged_iter_next_void(void *p, struct reftable_record *rec) +{ + struct merged_iter *mi = p; + if (merged_iter_pqueue_is_empty(mi->pq)) + return 1; + + return merged_iter_next(mi, rec); +} + +static struct reftable_iterator_vtable merged_iter_vtable = { + .next = &merged_iter_next_void, + .close = &merged_iter_close, +}; + +static void iterator_from_merged_iter(struct reftable_iterator *it, + struct merged_iter *mi) +{ + assert(!it->ops); + it->iter_arg = mi; + it->ops = &merged_iter_vtable; +} + +int reftable_new_merged_table(struct reftable_merged_table 
**dest, + struct reftable_table *stack, int n, + uint32_t hash_id) +{ + struct reftable_merged_table *m = NULL; + uint64_t last_max = 0; + uint64_t first_min = 0; + int i = 0; + for (i = 0; i < n; i++) { + uint64_t min = reftable_table_min_update_index(&stack[i]); + uint64_t max = reftable_table_max_update_index(&stack[i]); + + if (reftable_table_hash_id(&stack[i]) != hash_id) { + return REFTABLE_FORMAT_ERROR; + } + if (i == 0 || min < first_min) { + first_min = min; + } + if (i == 0 || max > last_max) { + last_max = max; + } + } + + m = reftable_calloc(sizeof(struct reftable_merged_table)); + m->stack = stack; + m->stack_len = n; + m->min = first_min; + m->max = last_max; + m->hash_id = hash_id; + *dest = m; + return 0; +} + +/* clears the list of subtable, without affecting the readers themselves. */ +void merged_table_release(struct reftable_merged_table *mt) +{ + FREE_AND_NULL(mt->stack); + mt->stack_len = 0; +} + +void reftable_merged_table_free(struct reftable_merged_table *mt) +{ + if (!mt) { + return; + } + merged_table_release(mt); + reftable_free(mt); +} + +uint64_t +reftable_merged_table_max_update_index(struct reftable_merged_table *mt) +{ + return mt->max; +} + +uint64_t +reftable_merged_table_min_update_index(struct reftable_merged_table *mt) +{ + return mt->min; +} + +static int reftable_table_seek_record(struct reftable_table *tab, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + return tab->ops->seek_record(tab->table_arg, it, rec); +} + +static int merged_table_seek_record(struct reftable_merged_table *mt, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + struct reftable_iterator *iters = reftable_calloc( + sizeof(struct reftable_iterator) * mt->stack_len); + struct merged_iter merged = { + .stack = iters, + .typ = reftable_record_type(rec), + .hash_id = mt->hash_id, + .suppress_deletions = mt->suppress_deletions, + }; + int n = 0; + int err = 0; + int i = 0; + for (i = 0; i < mt->stack_len && err == 0; i++) { 
+ int e = reftable_table_seek_record(&mt->stack[i], &iters[n], + rec); + if (e < 0) { + err = e; + } + if (e == 0) { + n++; + } + } + if (err < 0) { + int i = 0; + for (i = 0; i < n; i++) { + reftable_iterator_destroy(&iters[i]); + } + reftable_free(iters); + return err; + } + + merged.stack_len = n; + err = merged_iter_init(&merged); + if (err < 0) { + merged_iter_close(&merged); + return err; + } else { + struct merged_iter *p = + reftable_malloc(sizeof(struct merged_iter)); + *p = merged; + iterator_from_merged_iter(it, p); + } + return 0; +} + +int reftable_merged_table_seek_ref(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name) +{ + struct reftable_ref_record ref = { + .refname = (char *)name, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, &ref); + return merged_table_seek_record(mt, it, &rec); +} + +int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name, uint64_t update_index) +{ + struct reftable_log_record log = { + .refname = (char *)name, + .update_index = update_index, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_log(&rec, &log); + return merged_table_seek_record(mt, it, &rec); +} + +int reftable_merged_table_seek_log(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name) +{ + uint64_t max = ~((uint64_t)0); + return reftable_merged_table_seek_log_at(mt, it, name, max); +} + +uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *mt) +{ + return mt->hash_id; +} + +static int reftable_merged_table_seek_void(void *tab, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + return merged_table_seek_record(tab, it, rec); +} + +static uint32_t reftable_merged_table_hash_id_void(void *tab) +{ + return reftable_merged_table_hash_id(tab); +} + +static uint64_t reftable_merged_table_min_update_index_void(void *tab) +{ + return 
reftable_merged_table_min_update_index(tab); +} + +static uint64_t reftable_merged_table_max_update_index_void(void *tab) +{ + return reftable_merged_table_max_update_index(tab); +} + +static struct reftable_table_vtable merged_table_vtable = { + .seek_record = reftable_merged_table_seek_void, + .hash_id = reftable_merged_table_hash_id_void, + .min_update_index = reftable_merged_table_min_update_index_void, + .max_update_index = reftable_merged_table_max_update_index_void, +}; + +void reftable_table_from_merged_table(struct reftable_table *tab, + struct reftable_merged_table *merged) +{ + assert(!tab->ops); + tab->ops = &merged_table_vtable; + tab->table_arg = merged; +} diff --git a/reftable/merged.h b/reftable/merged.h new file mode 100644 index 00000000000..8c4d4d58d77 --- /dev/null +++ b/reftable/merged.h @@ -0,0 +1,35 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef MERGED_H +#define MERGED_H + +#include "pq.h" + +struct reftable_merged_table { + struct reftable_table *stack; + size_t stack_len; + uint32_t hash_id; + int suppress_deletions; + + uint64_t min; + uint64_t max; +}; + +struct merged_iter { + struct reftable_iterator *stack; + uint32_t hash_id; + size_t stack_len; + uint8_t typ; + int suppress_deletions; + struct merged_iter_pqueue pq; +}; + +void merged_table_release(struct reftable_merged_table *mt); + +#endif diff --git a/reftable/merged_test.c b/reftable/merged_test.c new file mode 100644 index 00000000000..1e2afe37b8b --- /dev/null +++ b/reftable/merged_test.c @@ -0,0 +1,292 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "merged.h" + +#include "system.h" + +#include "basics.h" +#include "blocksource.h" +#include 
"constants.h" +#include "reader.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-merged.h" +#include "reftable-tests.h" +#include "reftable-generic.h" +#include "reftable-writer.h" + +static void write_test_table(struct strbuf *buf, + struct reftable_ref_record refs[], int n) +{ + int min = 0xffffffff; + int max = 0; + int i = 0; + int err; + + struct reftable_write_options opts = { + .block_size = 256, + }; + struct reftable_writer *w = NULL; + for (i = 0; i < n; i++) { + uint64_t ui = refs[i].update_index; + if (ui > max) { + max = ui; + } + if (ui < min) { + min = ui; + } + } + + w = reftable_new_writer(&strbuf_add_void, buf, &opts); + reftable_writer_set_limits(w, min, max); + + for (i = 0; i < n; i++) { + uint64_t before = refs[i].update_index; + int n = reftable_writer_add_ref(w, &refs[i]); + assert(n == 0); + assert(before == refs[i].update_index); + } + + err = reftable_writer_close(w); + EXPECT_ERR(err); + + reftable_writer_free(w); +} + +static struct reftable_merged_table * +merged_table_from_records(struct reftable_ref_record **refs, + struct reftable_block_source **source, + struct reftable_reader ***readers, int *sizes, + struct strbuf *buf, int n) +{ + int i = 0; + struct reftable_merged_table *mt = NULL; + int err; + struct reftable_table *tabs = + reftable_calloc(n * sizeof(struct reftable_table)); + *readers = reftable_calloc(n * sizeof(struct reftable_reader *)); + *source = reftable_calloc(n * sizeof(**source)); + for (i = 0; i < n; i++) { + write_test_table(&buf[i], refs[i], sizes[i]); + block_source_from_strbuf(&(*source)[i], &buf[i]); + + err = reftable_new_reader(&(*readers)[i], &(*source)[i], + "name"); + EXPECT_ERR(err); + reftable_table_from_reader(&tabs[i], (*readers)[i]); + } + + err = reftable_new_merged_table(&mt, tabs, n, GIT_SHA1_FORMAT_ID); + EXPECT_ERR(err); + return mt; +} + +static void readers_destroy(struct reftable_reader **readers, size_t n) +{ + int i = 0; + for (; i < n; i++) + 
reftable_reader_free(readers[i]); + reftable_free(readers); +} + +static void test_merged_between(void) +{ + uint8_t hash1[GIT_SHA1_RAWSZ] = { 1, 2, 3, 0 }; + + struct reftable_ref_record r1[] = { { + .refname = "b", + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + } }; + struct reftable_ref_record r2[] = { { + .refname = "a", + .update_index = 2, + .value_type = REFTABLE_REF_DELETION, + } }; + + struct reftable_ref_record *refs[] = { r1, r2 }; + int sizes[] = { 1, 1 }; + struct strbuf bufs[2] = { STRBUF_INIT, STRBUF_INIT }; + struct reftable_block_source *bs = NULL; + struct reftable_reader **readers = NULL; + struct reftable_merged_table *mt = + merged_table_from_records(refs, &bs, &readers, sizes, bufs, 2); + int i; + struct reftable_ref_record ref = { NULL }; + struct reftable_iterator it = { NULL }; + int err = reftable_merged_table_seek_ref(mt, &it, "a"); + EXPECT_ERR(err); + + err = reftable_iterator_next_ref(&it, &ref); + EXPECT_ERR(err); + EXPECT(ref.update_index == 2); + reftable_ref_record_release(&ref); + reftable_iterator_destroy(&it); + readers_destroy(readers, 2); + reftable_merged_table_free(mt); + for (i = 0; i < ARRAY_SIZE(bufs); i++) { + strbuf_release(&bufs[i]); + } + reftable_free(bs); +} + +static void test_merged(void) +{ + uint8_t hash1[GIT_SHA1_RAWSZ] = { 1 }; + uint8_t hash2[GIT_SHA1_RAWSZ] = { 2 }; + struct reftable_ref_record r1[] = { + { + .refname = "a", + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + }, + { + .refname = "b", + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + }, + { + .refname = "c", + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + } + }; + struct reftable_ref_record r2[] = { { + .refname = "a", + .update_index = 2, + .value_type = REFTABLE_REF_DELETION, + } }; + struct reftable_ref_record r3[] = { + { + .refname = "c", + .update_index = 3, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = 
hash2, + }, + { + .refname = "d", + .update_index = 3, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + }, + }; + + struct reftable_ref_record want[] = { + r2[0], + r1[1], + r3[0], + r3[1], + }; + + struct reftable_ref_record *refs[] = { r1, r2, r3 }; + int sizes[3] = { 3, 1, 2 }; + struct strbuf bufs[3] = { STRBUF_INIT, STRBUF_INIT, STRBUF_INIT }; + struct reftable_block_source *bs = NULL; + struct reftable_reader **readers = NULL; + struct reftable_merged_table *mt = + merged_table_from_records(refs, &bs, &readers, sizes, bufs, 3); + + struct reftable_iterator it = { NULL }; + int err = reftable_merged_table_seek_ref(mt, &it, "a"); + struct reftable_ref_record *out = NULL; + size_t len = 0; + size_t cap = 0; + int i = 0; + + EXPECT_ERR(err); + while (len < 100) { /* cap loops/recursion. */ + struct reftable_ref_record ref = { NULL }; + int err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + break; + } + if (len == cap) { + cap = 2 * cap + 1; + out = reftable_realloc( + out, sizeof(struct reftable_ref_record) * cap); + } + out[len++] = ref; + } + reftable_iterator_destroy(&it); + + assert(ARRAY_SIZE(want) == len); + for (i = 0; i < len; i++) { + assert(reftable_ref_record_equal(&want[i], &out[i], + GIT_SHA1_RAWSZ)); + } + for (i = 0; i < len; i++) { + reftable_ref_record_release(&out[i]); + } + reftable_free(out); + + for (i = 0; i < 3; i++) { + strbuf_release(&bufs[i]); + } + readers_destroy(readers, 3); + reftable_merged_table_free(mt); + reftable_free(bs); +} + +static void test_default_write_opts(void) +{ + struct reftable_write_options opts = { 0 }; + struct strbuf buf = STRBUF_INIT; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + + struct reftable_ref_record rec = { + .refname = "master", + .update_index = 1, + }; + int err; + struct reftable_block_source source = { NULL }; + struct reftable_table *tab = reftable_calloc(sizeof(*tab) * 1); + uint32_t hash_id; + struct reftable_reader *rd = NULL; + 
struct reftable_merged_table *merged = NULL; + + reftable_writer_set_limits(w, 1, 1); + + err = reftable_writer_add_ref(w, &rec); + EXPECT_ERR(err); + + err = reftable_writer_close(w); + EXPECT_ERR(err); + reftable_writer_free(w); + + block_source_from_strbuf(&source, &buf); + + err = reftable_new_reader(&rd, &source, "filename"); + EXPECT_ERR(err); + + hash_id = reftable_reader_hash_id(rd); + assert(hash_id == GIT_SHA1_FORMAT_ID); + + reftable_table_from_reader(&tab[0], rd); + err = reftable_new_merged_table(&merged, tab, 1, GIT_SHA1_FORMAT_ID); + EXPECT_ERR(err); + + reftable_reader_free(rd); + reftable_merged_table_free(merged); + strbuf_release(&buf); +} + +/* XXX test refs_for(oid) */ + +int merged_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_merged_between); + RUN_TEST(test_merged); + RUN_TEST(test_default_write_opts); + return 0; +} diff --git a/reftable/reftable-merged.h b/reftable/reftable-merged.h new file mode 100644 index 00000000000..1a6d16915ab --- /dev/null +++ b/reftable/reftable-merged.h @@ -0,0 +1,72 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_MERGED_H +#define REFTABLE_MERGED_H + +#include "reftable-iterator.h" + +/* + * Merged tables + * + * A ref database is kept in a sequence of table files. The merged_table presents a + * unified view for reading (seeking, iterating) a sequence of immutable tables. + * + * The merged tables are on purpose kept disconnected from their actual storage + * (e.g. files on disk), because it is useful to merge tables that aren't files. For + * example, the per-workspace and global ref namespace can be implemented as a + * merged table of two stacks of file-backed reftables. + */ + +/* A merged table implements seeking/iterating over a stack of tables. */ +struct reftable_merged_table; + +/* A generic reftable; see below. 
*/ +struct reftable_table; + +/* reftable_new_merged_table creates a new merged table. It takes ownership of + the stack array. +*/ +int reftable_new_merged_table(struct reftable_merged_table **dest, + struct reftable_table *stack, int n, + uint32_t hash_id); + +/* returns an iterator positioned just before 'name' */ +int reftable_merged_table_seek_ref(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name); + +/* returns an iterator for log entry, at given update_index */ +int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name, uint64_t update_index); + +/* like reftable_merged_table_seek_log_at but look for the newest entry. */ +int reftable_merged_table_seek_log(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name); + +/* returns the max update_index covered by this merged table. */ +uint64_t +reftable_merged_table_max_update_index(struct reftable_merged_table *mt); + +/* returns the min update_index covered by this merged table. */ +uint64_t +reftable_merged_table_min_update_index(struct reftable_merged_table *mt); + +/* releases memory for the merged_table */ +void reftable_merged_table_free(struct reftable_merged_table *m); + +/* return the hash ID of the merged table. 
*/ +uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m); + +/* create a generic table from reftable_merged_table */ +void reftable_table_from_merged_table(struct reftable_table *tab, + struct reftable_merged_table *table); + +#endif diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 0b5a1701df1..8087f2da4e6 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -5,6 +5,7 @@ int cmd__reftable(int argc, const char **argv) { basics_test_main(argc, argv); block_test_main(argc, argv); + merged_test_main(argc, argv); pq_test_main(argc, argv); record_test_main(argc, argv); readwrite_test_main(argc, argv); From patchwork Tue Jul 20 17:04:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388869 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5E0FC636C8 for ; Tue, 20 Jul 2021 17:08:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CE6B961165 for ; Tue, 20 Jul 2021 17:08:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232461AbhGTQ1U (ORCPT ); Tue, 20 Jul 2021 12:27:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49720 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232482AbhGTQYY (ORCPT ); Tue, 20 Jul 2021 12:24:24 -0400 Received: from 
mail-wr1-x432.google.com (mail-wr1-x432.google.com [IPv6:2a00:1450:4864:20::432]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BCAD5C061768 for ; Tue, 20 Jul 2021 10:05:00 -0700 (PDT) Received: by mail-wr1-x432.google.com with SMTP id k4so26768950wrc.8 for ; Tue, 20 Jul 2021 10:05:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=B23JSgAWmeDjRErqpqUH/UHJgQS/0Zg8LHDXUJ+u5YY=; b=nsbJOE3Dbl1Zfh4VB+6IrPvaaaS6v8+pTC6c4qqCK7jNfJm3z+uDfHuRxKmS/pczH2 Lw7sZRbuCTrTcAOhob7kJQYNVxpLZMoByiKAN8AUsZuKHJx+WJZ0Fj6Akieg+UWon1T3 PalrEo2SW14Pm0BJ31kdg7kLK/SdJtgRDZyigOS6rpcALg0QgYfcym+UpPXY8Q996H0Z pZPqb3L3+uo2l80va4FMrH8qffpdL6s0biFTQaFgzZDWVpjpAUtCkQNS/WL0ze5vqzYC 9KbGNvWXsxUsJQuOBR7jjSUymfOtNDG1bk/Ik4aM0JBtmSR943LbKNnYC9GaM3M19fCi OwXQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=B23JSgAWmeDjRErqpqUH/UHJgQS/0Zg8LHDXUJ+u5YY=; b=QCeP02TE5Une0UoS6VpVLKGvhW6zvWtMIA+HzUNeiQx7CaYo29Z0+P/Kk11J2vdMPi 4hLENEQWj4lwXDx7Ik3/yB9FpjmL2GvKhcSgFElOxTRgthXis8Uh3JZ+UPA1g0t5XuUf TY5BOUkVaqJChNmn6qeCpCBF6FuOXLaLpJcin98popKhPl0Pta/p7B4C95WMWV05glQX NgGgNoF+wBT4ch9QmVt6ulSB6y3wXnOxo9570TGMcZuapy3gG3u/4ISL1sekkcD8ZzRQ kP+aQj9JBIrrH5vZFtZc0dsAqF72dxJx2deQA4ADk4PIp6bek2u9t19J7zRLXa20byyI xHbg== X-Gm-Message-State: AOAM530cYCT2qjcnAU5CHx1kkWlvidw1TIVQT/kvcNL0bvD4DPlkwZEP uuQVA48I8/n12JGdBNWedsNATaCDqxo= X-Google-Smtp-Source: ABdhPJxFcI5byrr8mvAXexUntNeJt/I4prqfKLJyaJ+Cu33+CqQtmCEFN5ae56XCA84fK6C6Dd1VbQ== X-Received: by 2002:a5d:4449:: with SMTP id x9mr37136095wrr.52.1626800699306; Tue, 20 Jul 2021 10:04:59 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id z16sm25342697wrl.8.2021.07.20.10.04.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 
bits=256/256); Tue, 20 Jul 2021 10:04:59 -0700 (PDT) Message-Id: <806c2e04392fa4af7f3d5abf2d1e4401898ab29f.1626800687.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:37 +0000 Subject: [PATCH 17/26] reftable: implement refname validation Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The packed/loose format has restrictions on refnames: a and a/b cannot coexist. This limitation does not apply to reftable per se, but must be maintained for interoperability. This code adds validation routines to abort transactions that are trying to add invalid names. Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + reftable/refname.c | 209 +++++++++++++++++++++++++++++++++++++++ reftable/refname.h | 29 ++++++ reftable/refname_test.c | 102 +++++++++++++++++++ t/helper/test-reftable.c | 1 + 5 files changed, 342 insertions(+) create mode 100644 reftable/refname.c create mode 100644 reftable/refname.h create mode 100644 reftable/refname_test.c diff --git a/Makefile b/Makefile index 9369013daed..06a5fb9103b 100644 --- a/Makefile +++ b/Makefile @@ -2470,6 +2470,7 @@ REFTABLE_TEST_OBJS += reftable/merged_test.o REFTABLE_TEST_OBJS += reftable/pq_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/readwrite_test.o +REFTABLE_TEST_OBJS += reftable/refname_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o REFTABLE_TEST_OBJS += reftable/tree_test.o diff --git a/reftable/refname.c b/reftable/refname.c new file mode 100644 index 00000000000..95734969324 --- /dev/null +++ b/reftable/refname.c @@ -0,0 +1,209 @@ +/* + Copyright 2020 Google LLC + + Use of this source code is governed by a BSD-style + license that can be found in the LICENSE file or at + https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" +#include "reftable-error.h" +#include 
"basics.h" +#include "refname.h" +#include "reftable-iterator.h" + +struct find_arg { + char **names; + const char *want; +}; + +static int find_name(size_t k, void *arg) +{ + struct find_arg *f_arg = arg; + return strcmp(f_arg->names[k], f_arg->want) >= 0; +} + +static int modification_has_ref(struct modification *mod, const char *name) +{ + struct reftable_ref_record ref = { NULL }; + int err = 0; + + if (mod->add_len > 0) { + struct find_arg arg = { + .names = mod->add, + .want = name, + }; + int idx = binsearch(mod->add_len, find_name, &arg); + if (idx < mod->add_len && !strcmp(mod->add[idx], name)) { + return 0; + } + } + + if (mod->del_len > 0) { + struct find_arg arg = { + .names = mod->del, + .want = name, + }; + int idx = binsearch(mod->del_len, find_name, &arg); + if (idx < mod->del_len && !strcmp(mod->del[idx], name)) { + return 1; + } + } + + err = reftable_table_read_ref(&mod->tab, name, &ref); + reftable_ref_record_release(&ref); + return err; +} + +static void modification_release(struct modification *mod) +{ + /* don't delete the strings themselves; they're owned by ref records. 
+ */ + FREE_AND_NULL(mod->add); + FREE_AND_NULL(mod->del); + mod->add_len = 0; + mod->del_len = 0; +} + +static int modification_has_ref_with_prefix(struct modification *mod, + const char *prefix) +{ + struct reftable_iterator it = { NULL }; + struct reftable_ref_record ref = { NULL }; + int err = 0; + + if (mod->add_len > 0) { + struct find_arg arg = { + .names = mod->add, + .want = prefix, + }; + int idx = binsearch(mod->add_len, find_name, &arg); + if (idx < mod->add_len && + !strncmp(prefix, mod->add[idx], strlen(prefix))) + goto done; + } + err = reftable_table_seek_ref(&mod->tab, &it, prefix); + if (err) + goto done; + + while (1) { + err = reftable_iterator_next_ref(&it, &ref); + if (err) + goto done; + + if (mod->del_len > 0) { + struct find_arg arg = { + .names = mod->del, + .want = ref.refname, + }; + int idx = binsearch(mod->del_len, find_name, &arg); + if (idx < mod->del_len && + !strcmp(ref.refname, mod->del[idx])) { + continue; + } + } + + if (strncmp(ref.refname, prefix, strlen(prefix))) { + err = 1; + goto done; + } + err = 0; + goto done; + } + +done: + reftable_ref_record_release(&ref); + reftable_iterator_destroy(&it); + return err; +} + +static int validate_refname(const char *name) +{ + while (1) { + char *next = strchr(name, '/'); + if (!*name) { + return REFTABLE_REFNAME_ERROR; + } + if (!next) { + return 0; + } + if (next - name == 0 || (next - name == 1 && *name == '.') || + (next - name == 2 && name[0] == '.' 
&& name[1] == '.')) + return REFTABLE_REFNAME_ERROR; + name = next + 1; + } + return 0; +} + +int validate_ref_record_addition(struct reftable_table tab, + struct reftable_ref_record *recs, size_t sz) +{ + struct modification mod = { + .tab = tab, + .add = reftable_calloc(sizeof(char *) * sz), + .del = reftable_calloc(sizeof(char *) * sz), + }; + int i = 0; + int err = 0; + for (; i < sz; i++) { + if (reftable_ref_record_is_deletion(&recs[i])) { + mod.del[mod.del_len++] = recs[i].refname; + } else { + mod.add[mod.add_len++] = recs[i].refname; + } + } + + err = modification_validate(&mod); + modification_release(&mod); + return err; +} + +static void strbuf_trim_component(struct strbuf *sl) +{ + while (sl->len > 0) { + int is_slash = (sl->buf[sl->len - 1] == '/'); + strbuf_setlen(sl, sl->len - 1); + if (is_slash) + break; + } +} + +int modification_validate(struct modification *mod) +{ + struct strbuf slashed = STRBUF_INIT; + int err = 0; + int i = 0; + for (; i < mod->add_len; i++) { + err = validate_refname(mod->add[i]); + if (err) + goto done; + strbuf_reset(&slashed); + strbuf_addstr(&slashed, mod->add[i]); + strbuf_addstr(&slashed, "/"); + + err = modification_has_ref_with_prefix(mod, slashed.buf); + if (err == 0) { + err = REFTABLE_NAME_CONFLICT; + goto done; + } + if (err < 0) + goto done; + + strbuf_reset(&slashed); + strbuf_addstr(&slashed, mod->add[i]); + while (slashed.len) { + strbuf_trim_component(&slashed); + err = modification_has_ref(mod, slashed.buf); + if (err == 0) { + err = REFTABLE_NAME_CONFLICT; + goto done; + } + if (err < 0) + goto done; + } + } + err = 0; +done: + strbuf_release(&slashed); + return err; +} diff --git a/reftable/refname.h b/reftable/refname.h new file mode 100644 index 00000000000..a24b40fcb42 --- /dev/null +++ b/reftable/refname.h @@ -0,0 +1,29 @@ +/* + Copyright 2020 Google LLC + + Use of this source code is governed by a BSD-style + license that can be found in the LICENSE file or at + 
https://developers.google.com/open-source/licenses/bsd +*/ +#ifndef REFNAME_H +#define REFNAME_H + +#include "reftable-record.h" +#include "reftable-generic.h" + +struct modification { + struct reftable_table tab; + + char **add; + size_t add_len; + + char **del; + size_t del_len; +}; + +int validate_ref_record_addition(struct reftable_table tab, + struct reftable_ref_record *recs, size_t sz); + +int modification_validate(struct modification *mod); + +#endif diff --git a/reftable/refname_test.c b/reftable/refname_test.c new file mode 100644 index 00000000000..8645cd93bbd --- /dev/null +++ b/reftable/refname_test.c @@ -0,0 +1,102 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "basics.h" +#include "block.h" +#include "blocksource.h" +#include "constants.h" +#include "reader.h" +#include "record.h" +#include "refname.h" +#include "reftable-error.h" +#include "reftable-writer.h" +#include "system.h" + +#include "test_framework.h" +#include "reftable-tests.h" + +struct testcase { + char *add; + char *del; + int error_code; +}; + +static void test_conflict(void) +{ + struct reftable_write_options opts = { 0 }; + struct strbuf buf = STRBUF_INIT; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + struct reftable_ref_record rec = { + .refname = "a/b", + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "destination", /* make sure it's not a symref. 
+ */ + .update_index = 1, + }; + int err; + int i; + struct reftable_block_source source = { NULL }; + struct reftable_reader *rd = NULL; + struct reftable_table tab = { NULL }; + struct testcase cases[] = { + { "a/b/c", NULL, REFTABLE_NAME_CONFLICT }, + { "b", NULL, 0 }, + { "a", NULL, REFTABLE_NAME_CONFLICT }, + { "a", "a/b", 0 }, + + { "p/", NULL, REFTABLE_REFNAME_ERROR }, + { "p//q", NULL, REFTABLE_REFNAME_ERROR }, + { "p/./q", NULL, REFTABLE_REFNAME_ERROR }, + { "p/../q", NULL, REFTABLE_REFNAME_ERROR }, + + { "a/b/c", "a/b", 0 }, + { NULL, "a//b", 0 }, + }; + reftable_writer_set_limits(w, 1, 1); + + err = reftable_writer_add_ref(w, &rec); + EXPECT_ERR(err); + + err = reftable_writer_close(w); + EXPECT_ERR(err); + reftable_writer_free(w); + + block_source_from_strbuf(&source, &buf); + err = reftable_new_reader(&rd, &source, "filename"); + EXPECT_ERR(err); + + reftable_table_from_reader(&tab, rd); + + for (i = 0; i < ARRAY_SIZE(cases); i++) { + struct modification mod = { + .tab = tab, + }; + + if (cases[i].add) { + mod.add = &cases[i].add; + mod.add_len = 1; + } + if (cases[i].del) { + mod.del = &cases[i].del; + mod.del_len = 1; + } + + err = modification_validate(&mod); + EXPECT(err == cases[i].error_code); + } + + reftable_reader_free(rd); + strbuf_release(&buf); +} + +int refname_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_conflict); + return 0; +} diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 8087f2da4e6..c8db6852c35 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -8,6 +8,7 @@ int cmd__reftable(int argc, const char **argv) merged_test_main(argc, argv); pq_test_main(argc, argv); record_test_main(argc, argv); + refname_test_main(argc, argv); readwrite_test_main(argc, argv); tree_test_main(argc, argv); return 0; From patchwork Tue Jul 20 17:04:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 
12388887 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AD5D1C07E95 for ; Tue, 20 Jul 2021 17:09:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 809E261003 for ; Tue, 20 Jul 2021 17:09:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235344AbhGTQ1Z (ORCPT ); Tue, 20 Jul 2021 12:27:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49698 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232660AbhGTQYZ (ORCPT ); Tue, 20 Jul 2021 12:24:25 -0400 Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E6F5EC0613DB for ; Tue, 20 Jul 2021 10:05:01 -0700 (PDT) Received: by mail-wr1-x42d.google.com with SMTP id c15so3186126wrs.5 for ; Tue, 20 Jul 2021 10:05:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=DYcCOpkFvq9DycqOSNGPDITAsSotQiist3u4fmoqQyU=; b=Q24a4TA3f1PB2y7iQbMdZk5nxZPItqGGHF7cCciBwgoVPNgtdlrQV/5SQpidgX6pZi kGX5e0RMEnYMasDJ/OOdxgd7faLCaxraY725ZdsQs/a3tX7d3LdgAKm0gWqqHyR2j7+z qP8yfYFzj6BXXtLMrm7I80NnjchoXxrSe/IRdGxZ0+XgBfTxLVbBTykutSv22v9DJ3PT N6fR3+i94mZvGG78fY4Sv/XwyhATCcnlbaMjg/OfASnWy1Woq2D6VgUq5jhZyZtX8277 
hnP3+XIfAGV134zSBK37zWCxxK1m6Z4JVN7tsDtbX3ot7djth8DeAV8KaJCUmWO2NcOW tduQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=DYcCOpkFvq9DycqOSNGPDITAsSotQiist3u4fmoqQyU=; b=cDBL1l4aovQCqJ43mz0GOgNsymC37Z552Y3buFbHGNeE71xQzGDvSFvG8LevDGJqAV jwmjkZdeUkYN/uYg6/1gIThZP0FESVhGNzUL66A4tz/HaHA19aSCn961k6MVckcHzk2W MenWCBOl6uGOQwKtzMnI70zAFV5UsR7tOlIlOKqEKsrEQvViNeL36z8wZ/SY+rPEWa7B TkQ9D/d/Wdds+fc6gN4gHYZreRDweHq78Y2Q09zY/ctYMnNOqajXsFRFidYjTL6Y4xct NS6N0CwYQuDkA9XSfyYmQMDUKC32m627wEv2qAeeucLaeJaAZjI/YPZHKsxUpiVmxEre hX1g== X-Gm-Message-State: AOAM5307jywPGB2Th29Todv5/yijOA92eU+MnZavVfGOnEEJbdxK4Xdz kq52iyRdsIKJlBEEiL8XH6ca4KfrdQQ= X-Google-Smtp-Source: ABdhPJxaPlzzj9aCCSBhN0HS2RsFcHcIr6D2HjZg9IJ9YtPYqIsTxstEJSEAiaXGMhJFfieD0OnZUA== X-Received: by 2002:a5d:6850:: with SMTP id o16mr36095411wrw.319.1626800700068; Tue, 20 Jul 2021 10:05:00 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id x19sm20291268wmi.10.2021.07.20.10.04.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 10:04:59 -0700 (PDT) Message-Id: <67f1282469d21a88eb253dced69e069d84f1be4f.1626800687.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:38 +0000 Subject: [PATCH 18/26] reftable: implement stack, a mutable database of reftable files. 
Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + reftable/reftable-stack.h | 128 ++++ reftable/stack.c | 1396 +++++++++++++++++++++++++++++++++++++ reftable/stack.h | 41 ++ reftable/stack_test.c | 947 +++++++++++++++++++++++++ t/helper/test-reftable.c | 1 + 6 files changed, 2514 insertions(+) create mode 100644 reftable/reftable-stack.h create mode 100644 reftable/stack.c create mode 100644 reftable/stack.h create mode 100644 reftable/stack_test.c diff --git a/Makefile b/Makefile index 06a5fb9103b..c18042929c8 100644 --- a/Makefile +++ b/Makefile @@ -2471,6 +2471,7 @@ REFTABLE_TEST_OBJS += reftable/pq_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/readwrite_test.o REFTABLE_TEST_OBJS += reftable/refname_test.o +REFTABLE_TEST_OBJS += reftable/stack_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o REFTABLE_TEST_OBJS += reftable/tree_test.o diff --git a/reftable/reftable-stack.h b/reftable/reftable-stack.h new file mode 100644 index 00000000000..1b602dda58a --- /dev/null +++ b/reftable/reftable-stack.h @@ -0,0 +1,128 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_STACK_H +#define REFTABLE_STACK_H + +#include "reftable-writer.h" + +/* + * The stack presents an interface to a mutable sequence of reftables. + + * A stack can be mutated by pushing a table to the top of the stack. + + * The reftable_stack automatically compacts files on disk to ensure good + * amortized performance. 
+ * + * For Windows and other platforms that cannot have open files as rename + * destinations, concurrent access from multiple processes requires the rand() + * seed to be randomized. + */ +struct reftable_stack; + +/* open a new reftable stack. The tables along with the table list will be + * stored in 'dir'. Typically, this should be .git/reftables. + */ +int reftable_new_stack(struct reftable_stack **dest, const char *dir, + struct reftable_write_options config); + +/* returns the update_index at which the next table should be written. */ +uint64_t reftable_stack_next_update_index(struct reftable_stack *st); + +/* holds a transaction to add tables at the top of a stack. */ +struct reftable_addition; + +/* + * returns a new transaction to add reftables to the given stack. As a side + * effect, the ref database is locked. + */ +int reftable_stack_new_addition(struct reftable_addition **dest, + struct reftable_stack *st); + +/* Adds a reftable to the transaction. */ +int reftable_addition_add(struct reftable_addition *add, + int (*write_table)(struct reftable_writer *wr, + void *arg), + void *arg); + +/* Commits the transaction, releasing the lock. After calling this, + * reftable_addition_destroy should still be called. + */ +int reftable_addition_commit(struct reftable_addition *add); + +/* Release all non-committed data from the transaction, and deallocate the + * transaction. Releases the lock if held. */ +void reftable_addition_destroy(struct reftable_addition *add); + +/* add a new table to the stack. The write_table function must call + * reftable_writer_set_limits, add refs and return an error value. */ +int reftable_stack_add(struct reftable_stack *st, + int (*write_table)(struct reftable_writer *wr, + void *write_arg), + void *write_arg); + +/* returns the merged_table for seeking. This table is valid until the + * next write or reload, and should not be closed or deleted.
+ */ +struct reftable_merged_table * +reftable_stack_merged_table(struct reftable_stack *st); + +/* frees all resources associated with the stack. */ +void reftable_stack_destroy(struct reftable_stack *st); + +/* Reloads the stack if necessary. This is very cheap to run if the stack was up + * to date */ +int reftable_stack_reload(struct reftable_stack *st); + +/* Policy for expiring reflog entries. */ +struct reftable_log_expiry_config { + /* Drop entries older than this timestamp */ + uint64_t time; + + /* Drop older entries */ + uint64_t min_update_index; +}; + +/* compacts all reftables into a giant table. Expire reflog entries if config is + * non-NULL */ +int reftable_stack_compact_all(struct reftable_stack *st, + struct reftable_log_expiry_config *config); + +/* heuristically compact unbalanced table stack. */ +int reftable_stack_auto_compact(struct reftable_stack *st); + +/* delete stale .ref tables. */ +int reftable_stack_clean(struct reftable_stack *st); + +/* convenience function to read a single ref. Returns < 0 for error, 0 for + * success, and 1 if ref not found. */ +int reftable_stack_read_ref(struct reftable_stack *st, const char *refname, + struct reftable_ref_record *ref); + +/* convenience function to read a single log. Returns < 0 for error, 0 for + * success, and 1 if ref not found. */ +int reftable_stack_read_log(struct reftable_stack *st, const char *refname, + struct reftable_log_record *log); + +/* statistics on past compactions. */ +struct reftable_compaction_stats { + uint64_t bytes; /* total number of bytes written */ + uint64_t entries_written; /* total number of entries written, including + failures. */ + int attempts; /* how often we tried to compact */ + int failures; /* failures happen on concurrent updates */ +}; + +/* return statistics for compaction up till now. 
*/ +struct reftable_compaction_stats * +reftable_stack_compaction_stats(struct reftable_stack *st); + +/* print the entire stack represented by the directory */ +int reftable_stack_print_directory(const char *stackdir, uint32_t hash_id); + +#endif diff --git a/reftable/stack.c b/reftable/stack.c new file mode 100644 index 00000000000..cf3b11ac998 --- /dev/null +++ b/reftable/stack.c @@ -0,0 +1,1396 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "stack.h" + +#include "system.h" +#include "merged.h" +#include "reader.h" +#include "refname.h" +#include "reftable-error.h" +#include "reftable-record.h" +#include "reftable-merged.h" +#include "writer.h" + +static int stack_try_add(struct reftable_stack *st, + int (*write_table)(struct reftable_writer *wr, + void *arg), + void *arg); +static int stack_write_compact(struct reftable_stack *st, + struct reftable_writer *wr, int first, int last, + struct reftable_log_expiry_config *config); +static int stack_check_addition(struct reftable_stack *st, + const char *new_tab_name); +static void reftable_addition_close(struct reftable_addition *add); +static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st, + int reuse_open); + +static void stack_filename(struct strbuf *dest, struct reftable_stack *st, + const char *name) +{ + strbuf_reset(dest); + strbuf_addstr(dest, st->reftable_dir); + strbuf_addstr(dest, "/"); + strbuf_addstr(dest, name); +} + +static ssize_t reftable_fd_write(void *arg, const void *data, size_t sz) +{ + int *fdp = (int *)arg; + return write(*fdp, data, sz); +} + +int reftable_new_stack(struct reftable_stack **dest, const char *dir, + struct reftable_write_options config) +{ + struct reftable_stack *p = + reftable_calloc(sizeof(struct reftable_stack)); + struct strbuf list_file_name = STRBUF_INIT; + int err = 0; + + if 
(config.hash_id == 0) { + config.hash_id = GIT_SHA1_FORMAT_ID; + } + + *dest = NULL; + + strbuf_reset(&list_file_name); + strbuf_addstr(&list_file_name, dir); + strbuf_addstr(&list_file_name, "/tables.list"); + + p->list_file = strbuf_detach(&list_file_name, NULL); + p->reftable_dir = xstrdup(dir); + p->config = config; + + err = reftable_stack_reload_maybe_reuse(p, 1); + if (err < 0) { + reftable_stack_destroy(p); + } else { + *dest = p; + } + return err; +} + +static int fd_read_lines(int fd, char ***namesp) +{ + off_t size = lseek(fd, 0, SEEK_END); + char *buf = NULL; + int err = 0; + if (size < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + err = lseek(fd, 0, SEEK_SET); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + buf = reftable_malloc(size + 1); + if (read(fd, buf, size) != size) { + err = REFTABLE_IO_ERROR; + goto done; + } + buf[size] = 0; + + parse_names(buf, size, namesp); + +done: + reftable_free(buf); + return err; +} + +int read_lines(const char *filename, char ***namesp) +{ + int fd = open(filename, O_RDONLY, 0644); + int err = 0; + if (fd < 0) { + if (errno == ENOENT) { + *namesp = reftable_calloc(sizeof(char *)); + return 0; + } + + return REFTABLE_IO_ERROR; + } + err = fd_read_lines(fd, namesp); + close(fd); + return err; +} + +struct reftable_merged_table * +reftable_stack_merged_table(struct reftable_stack *st) +{ + return st->merged; +} + +static int has_name(char **names, const char *name) +{ + while (*names) { + if (!strcmp(*names, name)) + return 1; + names++; + } + return 0; +} + +/* Close and free the stack */ +void reftable_stack_destroy(struct reftable_stack *st) +{ + char **names = NULL; + int err = 0; + if (st->merged) { + reftable_merged_table_free(st->merged); + st->merged = NULL; + } + + err = read_lines(st->list_file, &names); + if (err < 0) { + FREE_AND_NULL(names); + } + + if (st->readers) { + int i = 0; + struct strbuf filename = STRBUF_INIT; + for (i = 0; i < st->readers_len; i++) { + const char *name = 
reader_name(st->readers[i]); + strbuf_reset(&filename); + if (names && !has_name(names, name)) { + stack_filename(&filename, st, name); + } + reftable_reader_free(st->readers[i]); + + if (filename.len) { + // On Windows, can only unlink after closing. + unlink(filename.buf); + } + } + strbuf_release(&filename); + st->readers_len = 0; + FREE_AND_NULL(st->readers); + } + FREE_AND_NULL(st->list_file); + FREE_AND_NULL(st->reftable_dir); + reftable_free(st); + free_names(names); +} + +static struct reftable_reader **stack_copy_readers(struct reftable_stack *st, + int cur_len) +{ + struct reftable_reader **cur = + reftable_calloc(sizeof(struct reftable_reader *) * cur_len); + int i = 0; + for (i = 0; i < cur_len; i++) { + cur[i] = st->readers[i]; + } + return cur; +} + +static int reftable_stack_reload_once(struct reftable_stack *st, char **names, + int reuse_open) +{ + int cur_len = !st->merged ? 0 : st->merged->stack_len; + struct reftable_reader **cur = stack_copy_readers(st, cur_len); + int err = 0; + int names_len = names_length(names); + struct reftable_reader **new_readers = + reftable_calloc(sizeof(struct reftable_reader *) * names_len); + struct reftable_table *new_tables = + reftable_calloc(sizeof(struct reftable_table) * names_len); + int new_readers_len = 0; + struct reftable_merged_table *new_merged = NULL; + int i; + + while (*names) { + struct reftable_reader *rd = NULL; + char *name = *names++; + + /* this is linear; we assume compaction keeps the number of + tables under control so this is not quadratic. 
*/ + int j = 0; + for (j = 0; reuse_open && j < cur_len; j++) { + if (cur[j] && 0 == strcmp(cur[j]->name, name)) { + rd = cur[j]; + cur[j] = NULL; + break; + } + } + + if (!rd) { + struct reftable_block_source src = { NULL }; + struct strbuf table_path = STRBUF_INIT; + stack_filename(&table_path, st, name); + + err = reftable_block_source_from_file(&src, + table_path.buf); + strbuf_release(&table_path); + + if (err < 0) + goto done; + + err = reftable_new_reader(&rd, &src, name); + if (err < 0) + goto done; + } + + new_readers[new_readers_len] = rd; + reftable_table_from_reader(&new_tables[new_readers_len], rd); + new_readers_len++; + } + + /* success! */ + err = reftable_new_merged_table(&new_merged, new_tables, + new_readers_len, st->config.hash_id); + if (err < 0) + goto done; + + new_tables = NULL; + st->readers_len = new_readers_len; + if (st->merged) { + merged_table_release(st->merged); + reftable_merged_table_free(st->merged); + } + if (st->readers) { + reftable_free(st->readers); + } + st->readers = new_readers; + new_readers = NULL; + new_readers_len = 0; + + new_merged->suppress_deletions = 1; + st->merged = new_merged; + for (i = 0; i < cur_len; i++) { + if (cur[i]) { + const char *name = reader_name(cur[i]); + struct strbuf filename = STRBUF_INIT; + stack_filename(&filename, st, name); + + reader_close(cur[i]); + reftable_reader_free(cur[i]); + + // On Windows, can only unlink after closing. + unlink(filename.buf); + + strbuf_release(&filename); + } + } + +done: + for (i = 0; i < new_readers_len; i++) { + reader_close(new_readers[i]); + reftable_reader_free(new_readers[i]); + } + reftable_free(new_readers); + reftable_free(new_tables); + reftable_free(cur); + return err; +} + +/* return negative if a before b. 
*/ +static int tv_cmp(struct timeval *a, struct timeval *b) +{ + time_t diff = a->tv_sec - b->tv_sec; + int udiff = a->tv_usec - b->tv_usec; + + if (diff != 0) + return diff; + + return udiff; +} + +static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st, + int reuse_open) +{ + struct timeval deadline = { 0 }; + int err = gettimeofday(&deadline, NULL); + int64_t delay = 0; + int tries = 0; + if (err < 0) + return err; + + deadline.tv_sec += 3; + while (1) { + char **names = NULL; + char **names_after = NULL; + struct timeval now = { 0 }; + int err = gettimeofday(&now, NULL); + int err2 = 0; + if (err < 0) { + return err; + } + + /* Only look at deadlines after the first few times. This + simplifies debugging in GDB */ + tries++; + if (tries > 3 && tv_cmp(&now, &deadline) >= 0) { + break; + } + + err = read_lines(st->list_file, &names); + if (err < 0) { + free_names(names); + return err; + } + err = reftable_stack_reload_once(st, names, reuse_open); + if (err == 0) { + free_names(names); + break; + } + if (err != REFTABLE_NOT_EXIST_ERROR) { + free_names(names); + return err; + } + + /* err == REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent + writer. Check if there was one by checking if the name list + changed. + */ + err2 = read_lines(st->list_file, &names_after); + if (err2 < 0) { + free_names(names); + return err2; + } + + if (names_equal(names_after, names)) { + free_names(names); + free_names(names_after); + return err; + } + free_names(names); + free_names(names_after); + + delay = delay + (delay * rand()) / RAND_MAX + 1; + sleep_millisec(delay); + } + + return 0; +} + +/* -1 = error + 0 = up to date + 1 = changed. 
*/ +static int stack_uptodate(struct reftable_stack *st) +{ + char **names = NULL; + int err = read_lines(st->list_file, &names); + int i = 0; + if (err < 0) + return err; + + for (i = 0; i < st->readers_len; i++) { + if (!names[i]) { + err = 1; + goto done; + } + + if (strcmp(st->readers[i]->name, names[i])) { + err = 1; + goto done; + } + } + + if (names[st->merged->stack_len]) { + err = 1; + goto done; + } + +done: + free_names(names); + return err; +} + +int reftable_stack_reload(struct reftable_stack *st) +{ + int err = stack_uptodate(st); + if (err > 0) + return reftable_stack_reload_maybe_reuse(st, 1); + return err; +} + +int reftable_stack_add(struct reftable_stack *st, + int (*write)(struct reftable_writer *wr, void *arg), + void *arg) +{ + int err = stack_try_add(st, write, arg); + if (err < 0) { + if (err == REFTABLE_LOCK_ERROR) { + /* Ignore error return, we want to propagate + REFTABLE_LOCK_ERROR. + */ + reftable_stack_reload(st); + } + return err; + } + + if (!st->disable_auto_compact) + return reftable_stack_auto_compact(st); + + return 0; +} + +static void format_name(struct strbuf *dest, uint64_t min, uint64_t max) +{ + char buf[100]; + uint32_t rnd = (uint32_t)rand(); + snprintf(buf, sizeof(buf), "0x%012" PRIx64 "-0x%012" PRIx64 "-%08x", + min, max, rnd); + strbuf_reset(dest); + strbuf_addstr(dest, buf); +} + +struct reftable_addition { + int lock_file_fd; + struct strbuf lock_file_name; + struct reftable_stack *stack; + + char **new_tables; + int new_tables_len; + uint64_t next_update_index; +}; + +#define REFTABLE_ADDITION_INIT \ + { \ + .lock_file_name = STRBUF_INIT \ + } + +static int reftable_stack_init_addition(struct reftable_addition *add, + struct reftable_stack *st) +{ + int err = 0; + add->stack = st; + + strbuf_reset(&add->lock_file_name); + strbuf_addstr(&add->lock_file_name, st->list_file); + strbuf_addstr(&add->lock_file_name, ".lock"); + + add->lock_file_fd = open(add->lock_file_name.buf, + O_EXCL | O_CREAT | O_WRONLY, 0644); + if 
(add->lock_file_fd < 0) { + if (errno == EEXIST) { + err = REFTABLE_LOCK_ERROR; + } else { + err = REFTABLE_IO_ERROR; + } + goto done; + } + err = stack_uptodate(st); + if (err < 0) + goto done; + + if (err > 0) { + err = REFTABLE_LOCK_ERROR; + goto done; + } + + add->next_update_index = reftable_stack_next_update_index(st); +done: + if (err) { + reftable_addition_close(add); + } + return err; +} + +static void reftable_addition_close(struct reftable_addition *add) +{ + int i = 0; + struct strbuf nm = STRBUF_INIT; + for (i = 0; i < add->new_tables_len; i++) { + stack_filename(&nm, add->stack, add->new_tables[i]); + unlink(nm.buf); + reftable_free(add->new_tables[i]); + add->new_tables[i] = NULL; + } + reftable_free(add->new_tables); + add->new_tables = NULL; + add->new_tables_len = 0; + + if (add->lock_file_fd > 0) { + close(add->lock_file_fd); + add->lock_file_fd = 0; + } + if (add->lock_file_name.len > 0) { + unlink(add->lock_file_name.buf); + strbuf_release(&add->lock_file_name); + } + + strbuf_release(&nm); +} + +void reftable_addition_destroy(struct reftable_addition *add) +{ + if (!add) { + return; + } + reftable_addition_close(add); + reftable_free(add); +} + +int reftable_addition_commit(struct reftable_addition *add) +{ + struct strbuf table_list = STRBUF_INIT; + int i = 0; + int err = 0; + if (add->new_tables_len == 0) + goto done; + + for (i = 0; i < add->stack->merged->stack_len; i++) { + strbuf_addstr(&table_list, add->stack->readers[i]->name); + strbuf_addstr(&table_list, "\n"); + } + for (i = 0; i < add->new_tables_len; i++) { + strbuf_addstr(&table_list, add->new_tables[i]); + strbuf_addstr(&table_list, "\n"); + } + + err = write(add->lock_file_fd, table_list.buf, table_list.len); + strbuf_release(&table_list); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + err = close(add->lock_file_fd); + add->lock_file_fd = 0; + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + err = rename(add->lock_file_name.buf,
add->stack->list_file); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + /* success, no more state to clean up. */ + strbuf_release(&add->lock_file_name); + for (i = 0; i < add->new_tables_len; i++) { + reftable_free(add->new_tables[i]); + } + reftable_free(add->new_tables); + add->new_tables = NULL; + add->new_tables_len = 0; + + err = reftable_stack_reload(add->stack); +done: + reftable_addition_close(add); + return err; +} + +int reftable_stack_new_addition(struct reftable_addition **dest, + struct reftable_stack *st) +{ + int err = 0; + struct reftable_addition empty = REFTABLE_ADDITION_INIT; + *dest = reftable_calloc(sizeof(**dest)); + **dest = empty; + err = reftable_stack_init_addition(*dest, st); + if (err) { + reftable_free(*dest); + *dest = NULL; + } + return err; +} + +static int stack_try_add(struct reftable_stack *st, + int (*write_table)(struct reftable_writer *wr, + void *arg), + void *arg) +{ + struct reftable_addition add = REFTABLE_ADDITION_INIT; + int err = reftable_stack_init_addition(&add, st); + if (err < 0) + goto done; + if (err > 0) { + err = REFTABLE_LOCK_ERROR; + goto done; + } + + err = reftable_addition_add(&add, write_table, arg); + if (err < 0) + goto done; + + err = reftable_addition_commit(&add); +done: + reftable_addition_close(&add); + return err; +} + +int reftable_addition_add(struct reftable_addition *add, + int (*write_table)(struct reftable_writer *wr, + void *arg), + void *arg) +{ + struct strbuf temp_tab_file_name = STRBUF_INIT; + struct strbuf tab_file_name = STRBUF_INIT; + struct strbuf next_name = STRBUF_INIT; + struct reftable_writer *wr = NULL; + int err = 0; + int tab_fd = 0; + + strbuf_reset(&next_name); + format_name(&next_name, add->next_update_index, add->next_update_index); + + stack_filename(&temp_tab_file_name, add->stack, next_name.buf); + strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX"); + + tab_fd = mkstemp(temp_tab_file_name.buf); + if (tab_fd < 0) { + err = REFTABLE_IO_ERROR; + goto done; 
+ } + + wr = reftable_new_writer(reftable_fd_write, &tab_fd, + &add->stack->config); + err = write_table(wr, arg); + if (err < 0) + goto done; + + err = reftable_writer_close(wr); + if (err == REFTABLE_EMPTY_TABLE_ERROR) { + err = 0; + goto done; + } + if (err < 0) + goto done; + + err = close(tab_fd); + tab_fd = 0; + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + err = stack_check_addition(add->stack, temp_tab_file_name.buf); + if (err < 0) + goto done; + + if (wr->min_update_index < add->next_update_index) { + err = REFTABLE_API_ERROR; + goto done; + } + + format_name(&next_name, wr->min_update_index, wr->max_update_index); + strbuf_addstr(&next_name, ".ref"); + + stack_filename(&tab_file_name, add->stack, next_name.buf); + + /* + On windows, this relies on rand() picking a unique destination name. + Maybe we should do retry loop as well? + */ + err = rename(temp_tab_file_name.buf, tab_file_name.buf); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + add->new_tables = reftable_realloc(add->new_tables, + sizeof(*add->new_tables) * + (add->new_tables_len + 1)); + add->new_tables[add->new_tables_len] = strbuf_detach(&next_name, NULL); + add->new_tables_len++; +done: + if (tab_fd > 0) { + close(tab_fd); + tab_fd = 0; + } + if (temp_tab_file_name.len > 0) { + unlink(temp_tab_file_name.buf); + } + + strbuf_release(&temp_tab_file_name); + strbuf_release(&tab_file_name); + strbuf_release(&next_name); + reftable_writer_free(wr); + return err; +} + +uint64_t reftable_stack_next_update_index(struct reftable_stack *st) +{ + int sz = st->merged->stack_len; + if (sz > 0) + return reftable_reader_max_update_index(st->readers[sz - 1]) + + 1; + return 1; +} + +static int stack_compact_locked(struct reftable_stack *st, int first, int last, + struct strbuf *temp_tab, + struct reftable_log_expiry_config *config) +{ + struct strbuf next_name = STRBUF_INIT; + int tab_fd = -1; + struct reftable_writer *wr = NULL; + int err = 0; + + format_name(&next_name, 
+ reftable_reader_min_update_index(st->readers[first]), + reftable_reader_max_update_index(st->readers[last])); + + stack_filename(temp_tab, st, next_name.buf); + strbuf_addstr(temp_tab, ".temp.XXXXXX"); + + tab_fd = mkstemp(temp_tab->buf); + wr = reftable_new_writer(reftable_fd_write, &tab_fd, &st->config); + + err = stack_write_compact(st, wr, first, last, config); + if (err < 0) + goto done; + err = reftable_writer_close(wr); + if (err < 0) + goto done; + + err = close(tab_fd); + tab_fd = 0; + +done: + reftable_writer_free(wr); + if (tab_fd > 0) { + close(tab_fd); + tab_fd = 0; + } + if (err != 0 && temp_tab->len > 0) { + unlink(temp_tab->buf); + strbuf_release(temp_tab); + } + strbuf_release(&next_name); + return err; +} + +static int stack_write_compact(struct reftable_stack *st, + struct reftable_writer *wr, int first, int last, + struct reftable_log_expiry_config *config) +{ + int subtabs_len = last - first + 1; + struct reftable_table *subtabs = reftable_calloc( + sizeof(struct reftable_table) * (last - first + 1)); + struct reftable_merged_table *mt = NULL; + int err = 0; + struct reftable_iterator it = { NULL }; + struct reftable_ref_record ref = { NULL }; + struct reftable_log_record log = { NULL }; + + uint64_t entries = 0; + + int i = 0, j = 0; + for (i = first, j = 0; i <= last; i++) { + struct reftable_reader *t = st->readers[i]; + reftable_table_from_reader(&subtabs[j++], t); + st->stats.bytes += t->size; + } + reftable_writer_set_limits(wr, st->readers[first]->min_update_index, + st->readers[last]->max_update_index); + + err = reftable_new_merged_table(&mt, subtabs, subtabs_len, + st->config.hash_id); + if (err < 0) { + reftable_free(subtabs); + goto done; + } + + err = reftable_merged_table_seek_ref(mt, &it, ""); + if (err < 0) + goto done; + + while (1) { + err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + err = 0; + break; + } + if (err < 0) { + break; + } + + if (first == 0 && reftable_ref_record_is_deletion(&ref)) { + continue; + 
} + + err = reftable_writer_add_ref(wr, &ref); + if (err < 0) { + break; + } + entries++; + } + reftable_iterator_destroy(&it); + + err = reftable_merged_table_seek_log(mt, &it, ""); + if (err < 0) + goto done; + + while (1) { + err = reftable_iterator_next_log(&it, &log); + if (err > 0) { + err = 0; + break; + } + if (err < 0) { + break; + } + if (first == 0 && reftable_log_record_is_deletion(&log)) { + continue; + } + + if (config && config->min_update_index > 0 && + log.update_index < config->min_update_index) { + continue; + } + + if (config && config->time > 0 && + log.update.time < config->time) { + continue; + } + + err = reftable_writer_add_log(wr, &log); + if (err < 0) { + break; + } + entries++; + } + +done: + reftable_iterator_destroy(&it); + if (mt) { + merged_table_release(mt); + reftable_merged_table_free(mt); + } + reftable_ref_record_release(&ref); + reftable_log_record_release(&log); + st->stats.entries_written += entries; + return err; +} + +/* < 0: error. 0 == OK, > 0 attempt failed; could retry. 
*/ +static int stack_compact_range(struct reftable_stack *st, int first, int last, + struct reftable_log_expiry_config *expiry) +{ + struct strbuf temp_tab_file_name = STRBUF_INIT; + struct strbuf new_table_name = STRBUF_INIT; + struct strbuf lock_file_name = STRBUF_INIT; + struct strbuf ref_list_contents = STRBUF_INIT; + struct strbuf new_table_path = STRBUF_INIT; + int err = 0; + int have_lock = 0; + int lock_file_fd = 0; + int compact_count = last - first + 1; + char **listp = NULL; + char **delete_on_success = + reftable_calloc(sizeof(char *) * (compact_count + 1)); + char **subtable_locks = + reftable_calloc(sizeof(char *) * (compact_count + 1)); + int i = 0; + int j = 0; + int is_empty_table = 0; + + if (first > last || (!expiry && first == last)) { + err = 0; + goto done; + } + + st->stats.attempts++; + + strbuf_reset(&lock_file_name); + strbuf_addstr(&lock_file_name, st->list_file); + strbuf_addstr(&lock_file_name, ".lock"); + + lock_file_fd = + open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644); + if (lock_file_fd < 0) { + if (errno == EEXIST) { + err = 1; + } else { + err = REFTABLE_IO_ERROR; + } + goto done; + } + /* Don't want to write to the lock for now. 
*/ + close(lock_file_fd); + lock_file_fd = 0; + + have_lock = 1; + err = stack_uptodate(st); + if (err != 0) + goto done; + + for (i = first, j = 0; i <= last; i++) { + struct strbuf subtab_file_name = STRBUF_INIT; + struct strbuf subtab_lock = STRBUF_INIT; + int sublock_file_fd = -1; + + stack_filename(&subtab_file_name, st, + reader_name(st->readers[i])); + + strbuf_reset(&subtab_lock); + strbuf_addbuf(&subtab_lock, &subtab_file_name); + strbuf_addstr(&subtab_lock, ".lock"); + + sublock_file_fd = open(subtab_lock.buf, + O_EXCL | O_CREAT | O_WRONLY, 0644); + if (sublock_file_fd > 0) { + close(sublock_file_fd); + } else if (sublock_file_fd < 0) { + if (errno == EEXIST) { + err = 1; + } else { + err = REFTABLE_IO_ERROR; + } + } + + subtable_locks[j] = subtab_lock.buf; + delete_on_success[j] = subtab_file_name.buf; + j++; + + if (err != 0) + goto done; + } + + err = unlink(lock_file_name.buf); + if (err < 0) + goto done; + have_lock = 0; + + err = stack_compact_locked(st, first, last, &temp_tab_file_name, + expiry); + /* Compaction + tombstones can create an empty table out of non-empty + * tables. */ + is_empty_table = (err == REFTABLE_EMPTY_TABLE_ERROR); + if (is_empty_table) { + err = 0; + } + if (err < 0) + goto done; + + lock_file_fd = + open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644); + if (lock_file_fd < 0) { + if (errno == EEXIST) { + err = 1; + } else { + err = REFTABLE_IO_ERROR; + } + goto done; + } + have_lock = 1; + + format_name(&new_table_name, st->readers[first]->min_update_index, + st->readers[last]->max_update_index); + strbuf_addstr(&new_table_name, ".ref"); + + stack_filename(&new_table_path, st, new_table_name.buf); + + if (!is_empty_table) { + /* retry? 
*/ + err = rename(temp_tab_file_name.buf, new_table_path.buf); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + } + + for (i = 0; i < first; i++) { + strbuf_addstr(&ref_list_contents, st->readers[i]->name); + strbuf_addstr(&ref_list_contents, "\n"); + } + if (!is_empty_table) { + strbuf_addbuf(&ref_list_contents, &new_table_name); + strbuf_addstr(&ref_list_contents, "\n"); + } + for (i = last + 1; i < st->merged->stack_len; i++) { + strbuf_addstr(&ref_list_contents, st->readers[i]->name); + strbuf_addstr(&ref_list_contents, "\n"); + } + + err = write(lock_file_fd, ref_list_contents.buf, ref_list_contents.len); + if (err < 0) { + err = REFTABLE_IO_ERROR; + unlink(new_table_path.buf); + goto done; + } + err = close(lock_file_fd); + lock_file_fd = 0; + if (err < 0) { + err = REFTABLE_IO_ERROR; + unlink(new_table_path.buf); + goto done; + } + + err = rename(lock_file_name.buf, st->list_file); + if (err < 0) { + err = REFTABLE_IO_ERROR; + unlink(new_table_path.buf); + goto done; + } + have_lock = 0; + + /* Reload the stack before deleting. On windows, we can only delete the + files after we closed them. 
+ */ + err = reftable_stack_reload_maybe_reuse(st, first < last); + + listp = delete_on_success; + while (*listp) { + if (strcmp(*listp, new_table_path.buf)) { + unlink(*listp); + } + listp++; + } + +done: + free_names(delete_on_success); + + listp = subtable_locks; + while (*listp) { + unlink(*listp); + listp++; + } + free_names(subtable_locks); + if (lock_file_fd > 0) { + close(lock_file_fd); + lock_file_fd = 0; + } + if (have_lock) { + unlink(lock_file_name.buf); + } + strbuf_release(&new_table_name); + strbuf_release(&new_table_path); + strbuf_release(&ref_list_contents); + strbuf_release(&temp_tab_file_name); + strbuf_release(&lock_file_name); + return err; +} + +int reftable_stack_compact_all(struct reftable_stack *st, + struct reftable_log_expiry_config *config) +{ + return stack_compact_range(st, 0, st->merged->stack_len - 1, config); +} + +static int stack_compact_range_stats(struct reftable_stack *st, int first, + int last, + struct reftable_log_expiry_config *config) +{ + int err = stack_compact_range(st, first, last, config); + if (err > 0) { + st->stats.failures++; + } + return err; +} + +static int segment_size(struct segment *s) +{ + return s->end - s->start; +} + +int fastlog2(uint64_t sz) +{ + int l = 0; + if (sz == 0) + return 0; + for (; sz; sz /= 2) { + l++; + } + return l - 1; +} + +struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n) +{ + struct segment *segs = reftable_calloc(sizeof(struct segment) * n); + int next = 0; + struct segment cur = { 0 }; + int i = 0; + + if (n == 0) { + *seglen = 0; + return segs; + } + for (i = 0; i < n; i++) { + int log = fastlog2(sizes[i]); + if (cur.log != log && cur.bytes > 0) { + struct segment fresh = { + .start = i, + }; + + segs[next++] = cur; + cur = fresh; + } + + cur.log = log; + cur.end = i + 1; + cur.bytes += sizes[i]; + } + segs[next++] = cur; + *seglen = next; + return segs; +} + +struct segment suggest_compaction_segment(uint64_t *sizes, int n) +{ + int seglen = 0; + struct 
segment *segs = sizes_to_segments(&seglen, sizes, n); + struct segment min_seg = { + .log = 64, + }; + int i = 0; + for (i = 0; i < seglen; i++) { + if (segment_size(&segs[i]) == 1) { + continue; + } + + if (segs[i].log < min_seg.log) { + min_seg = segs[i]; + } + } + + while (min_seg.start > 0) { + int prev = min_seg.start - 1; + if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev])) { + break; + } + + min_seg.start = prev; + min_seg.bytes += sizes[prev]; + } + + reftable_free(segs); + return min_seg; +} + +static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st) +{ + uint64_t *sizes = + reftable_calloc(sizeof(uint64_t) * st->merged->stack_len); + int version = (st->config.hash_id == GIT_SHA1_FORMAT_ID) ? 1 : 2; + int overhead = header_size(version) - 1; + int i = 0; + for (i = 0; i < st->merged->stack_len; i++) { + sizes[i] = st->readers[i]->size - overhead; + } + return sizes; +} + +int reftable_stack_auto_compact(struct reftable_stack *st) +{ + uint64_t *sizes = stack_table_sizes_for_compaction(st); + struct segment seg = + suggest_compaction_segment(sizes, st->merged->stack_len); + reftable_free(sizes); + if (segment_size(&seg) > 0) + return stack_compact_range_stats(st, seg.start, seg.end - 1, + NULL); + + return 0; +} + +struct reftable_compaction_stats * +reftable_stack_compaction_stats(struct reftable_stack *st) +{ + return &st->stats; +} + +int reftable_stack_read_ref(struct reftable_stack *st, const char *refname, + struct reftable_ref_record *ref) +{ + struct reftable_table tab = { NULL }; + reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st)); + return reftable_table_read_ref(&tab, refname, ref); +} + +int reftable_stack_read_log(struct reftable_stack *st, const char *refname, + struct reftable_log_record *log) +{ + struct reftable_iterator it = { NULL }; + struct reftable_merged_table *mt = reftable_stack_merged_table(st); + int err = reftable_merged_table_seek_log(mt, &it, refname); + if (err) + goto done; + + err 
= reftable_iterator_next_log(&it, log); + if (err) + goto done; + + if (strcmp(log->refname, refname) || + reftable_log_record_is_deletion(log)) { + err = 1; + goto done; + } + +done: + if (err) { + reftable_log_record_release(log); + } + reftable_iterator_destroy(&it); + return err; +} + +static int stack_check_addition(struct reftable_stack *st, + const char *new_tab_name) +{ + int err = 0; + struct reftable_block_source src = { NULL }; + struct reftable_reader *rd = NULL; + struct reftable_table tab = { NULL }; + struct reftable_ref_record *refs = NULL; + struct reftable_iterator it = { NULL }; + int cap = 0; + int len = 0; + int i = 0; + + if (st->config.skip_name_check) + return 0; + + err = reftable_block_source_from_file(&src, new_tab_name); + if (err < 0) + goto done; + + err = reftable_new_reader(&rd, &src, new_tab_name); + if (err < 0) + goto done; + + err = reftable_reader_seek_ref(rd, &it, ""); + if (err > 0) { + err = 0; + goto done; + } + if (err < 0) + goto done; + + while (1) { + struct reftable_ref_record ref = { NULL }; + err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + break; + } + if (err < 0) + goto done; + + if (len >= cap) { + cap = 2 * cap + 1; + refs = reftable_realloc(refs, cap * sizeof(refs[0])); + } + + refs[len++] = ref; + } + + reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st)); + + err = validate_ref_record_addition(tab, refs, len); + +done: + for (i = 0; i < len; i++) { + reftable_ref_record_release(&refs[i]); + } + + free(refs); + reftable_iterator_destroy(&it); + reftable_reader_free(rd); + return err; +} + +static int is_table_name(const char *s) +{ + const char *dot = strrchr(s, '.'); + return dot && !strcmp(dot, ".ref"); +} + +static void remove_maybe_stale_table(struct reftable_stack *st, uint64_t max, + const char *name) +{ + int err = 0; + uint64_t update_idx = 0; + struct reftable_block_source src = { NULL }; + struct reftable_reader *rd = NULL; + struct strbuf table_path = STRBUF_INIT; + 
stack_filename(&table_path, st, name); + + err = reftable_block_source_from_file(&src, table_path.buf); + if (err < 0) + goto done; + + err = reftable_new_reader(&rd, &src, name); + if (err < 0) + goto done; + + update_idx = reftable_reader_max_update_index(rd); + reftable_reader_free(rd); + + if (update_idx <= max) { + unlink(table_path.buf); + } +done: + strbuf_release(&table_path); +} + +static int reftable_stack_clean_locked(struct reftable_stack *st) +{ + uint64_t max = reftable_merged_table_max_update_index( + reftable_stack_merged_table(st)); + DIR *dir = opendir(st->reftable_dir); + struct dirent *d = NULL; + if (!dir) { + return REFTABLE_IO_ERROR; + } + + while ((d = readdir(dir))) { + int i = 0; + int found = 0; + if (!is_table_name(d->d_name)) + continue; + + for (i = 0; !found && i < st->readers_len; i++) { + found = !strcmp(reader_name(st->readers[i]), d->d_name); + } + if (found) + continue; + + remove_maybe_stale_table(st, max, d->d_name); + } + + closedir(dir); + return 0; +} + +int reftable_stack_clean(struct reftable_stack *st) +{ + struct reftable_addition *add = NULL; + int err = reftable_stack_new_addition(&add, st); + if (err < 0) { + goto done; + } + + err = reftable_stack_reload(st); + if (err < 0) { + goto done; + } + + err = reftable_stack_clean_locked(st); + +done: + reftable_addition_destroy(add); + return err; +} + +int reftable_stack_print_directory(const char *stackdir, uint32_t hash_id) +{ + struct reftable_stack *stack = NULL; + struct reftable_write_options cfg = { .hash_id = hash_id }; + struct reftable_merged_table *merged = NULL; + struct reftable_table table = { NULL }; + + int err = reftable_new_stack(&stack, stackdir, cfg); + if (err < 0) + goto done; + + merged = reftable_stack_merged_table(stack); + reftable_table_from_merged_table(&table, merged); + err = reftable_table_print(&table); +done: + if (stack) + reftable_stack_destroy(stack); + return err; +} diff --git a/reftable/stack.h b/reftable/stack.h new file mode 100644 
index 00000000000..f57005846e5 --- /dev/null +++ b/reftable/stack.h @@ -0,0 +1,41 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef STACK_H +#define STACK_H + +#include "system.h" +#include "reftable-writer.h" +#include "reftable-stack.h" + +struct reftable_stack { + char *list_file; + char *reftable_dir; + int disable_auto_compact; + + struct reftable_write_options config; + + struct reftable_reader **readers; + size_t readers_len; + struct reftable_merged_table *merged; + struct reftable_compaction_stats stats; +}; + +int read_lines(const char *filename, char ***lines); + +struct segment { + int start, end; + int log; + uint64_t bytes; +}; + +int fastlog2(uint64_t sz); +struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n); +struct segment suggest_compaction_segment(uint64_t *sizes, int n); + +#endif diff --git a/reftable/stack_test.c b/reftable/stack_test.c new file mode 100644 index 00000000000..0743defda13 --- /dev/null +++ b/reftable/stack_test.c @@ -0,0 +1,947 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "stack.h" + +#include "system.h" + +#include "reftable-reader.h" +#include "merged.h" +#include "basics.h" +#include "constants.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-tests.h" + +#include <sys/types.h> +#include <dirent.h> + +static void clear_dir(const char *dirname) +{ + struct strbuf path = STRBUF_INIT; + strbuf_addstr(&path, dirname); + remove_dir_recursively(&path, 0); + strbuf_release(&path); +} + +static int count_dir_entries(const char *dirname) +{ + DIR *dir = opendir(dirname); + int len = 0; + struct dirent *d; + if (dir == NULL) + return 0; + + while ((d = readdir(dir))) { + if (!strcmp(d->d_name,
"..") || !strcmp(d->d_name, ".")) + continue; + len++; + } + closedir(dir); + return len; +} + +static char *get_tmp_template(const char *prefix) +{ + const char *tmp = getenv("TMPDIR"); + static char template[1024]; + snprintf(template, sizeof(template) - 1, "%s/%s.XXXXXX", + tmp ? tmp : "/tmp", prefix); + return template; +} + +static char *get_tmp_dir(const char *prefix) +{ + char *dir = get_tmp_template(prefix); + EXPECT(mkdtemp(dir)); + return dir; +} + +static void test_read_file(void) +{ + char *fn = get_tmp_template(__FUNCTION__); + int fd = mkstemp(fn); + char out[1024] = "line1\n\nline2\nline3"; + int n, err; + char **names = NULL; + char *want[] = { "line1", "line2", "line3" }; + int i = 0; + + EXPECT(fd > 0); + n = write(fd, out, strlen(out)); + EXPECT(n == strlen(out)); + err = close(fd); + EXPECT(err >= 0); + + err = read_lines(fn, &names); + EXPECT_ERR(err); + + for (i = 0; names[i]; i++) { + EXPECT(0 == strcmp(want[i], names[i])); + } + free_names(names); + remove(fn); +} + +static void test_parse_names(void) +{ + char buf[] = "line\n"; + char **names = NULL; + parse_names(buf, strlen(buf), &names); + + EXPECT(NULL != names[0]); + EXPECT(0 == strcmp(names[0], "line")); + EXPECT(NULL == names[1]); + free_names(names); +} + +static void test_names_equal(void) +{ + char *a[] = { "a", "b", "c", NULL }; + char *b[] = { "a", "b", "d", NULL }; + char *c[] = { "a", "b", NULL }; + + EXPECT(names_equal(a, a)); + EXPECT(!names_equal(a, b)); + EXPECT(!names_equal(a, c)); +} + +static int write_test_ref(struct reftable_writer *wr, void *arg) +{ + struct reftable_ref_record *ref = arg; + reftable_writer_set_limits(wr, ref->update_index, ref->update_index); + return reftable_writer_add_ref(wr, ref); +} + +struct write_log_arg { + struct reftable_log_record *log; + uint64_t update_index; +}; + +static int write_test_log(struct reftable_writer *wr, void *arg) +{ + struct write_log_arg *wla = arg; + + reftable_writer_set_limits(wr, wla->update_index, 
wla->update_index); + return reftable_writer_add_log(wr, wla->log); +} + +static void test_reftable_stack_add_one(void) +{ + char *dir = get_tmp_dir(__FUNCTION__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + struct reftable_ref_record ref = { + .refname = "HEAD", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + struct reftable_ref_record dest = { NULL }; + + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT_ERR(err); + + err = reftable_stack_read_ref(st, ref.refname, &dest); + EXPECT_ERR(err); + EXPECT(0 == strcmp("master", dest.value.symref)); + + printf("testing print functionality:\n"); + err = reftable_stack_print_directory(dir, GIT_SHA1_FORMAT_ID); + EXPECT_ERR(err); + + err = reftable_stack_print_directory(dir, GIT_SHA256_FORMAT_ID); + EXPECT(err == REFTABLE_FORMAT_ERROR); + + reftable_ref_record_release(&dest); + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_uptodate(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st1 = NULL; + struct reftable_stack *st2 = NULL; + char *dir = get_tmp_dir(__FUNCTION__); + + int err; + struct reftable_ref_record ref1 = { + .refname = "HEAD", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + struct reftable_ref_record ref2 = { + .refname = "branch2", + .update_index = 2, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + + + /* simulate multi-process access to the same stack + by creating two stacks for the same directory. 
+ */ + err = reftable_new_stack(&st1, dir, cfg); + EXPECT_ERR(err); + + err = reftable_new_stack(&st2, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st1, &write_test_ref, &ref1); + EXPECT_ERR(err); + + err = reftable_stack_add(st2, &write_test_ref, &ref2); + EXPECT(err == REFTABLE_LOCK_ERROR); + + err = reftable_stack_reload(st2); + EXPECT_ERR(err); + + err = reftable_stack_add(st2, &write_test_ref, &ref2); + EXPECT_ERR(err); + reftable_stack_destroy(st1); + reftable_stack_destroy(st2); + clear_dir(dir); +} + +static void test_reftable_stack_transaction_api(void) +{ + char *dir = get_tmp_dir(__FUNCTION__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + struct reftable_addition *add = NULL; + + struct reftable_ref_record ref = { + .refname = "HEAD", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + struct reftable_ref_record dest = { NULL }; + + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + reftable_addition_destroy(add); + + err = reftable_stack_new_addition(&add, st); + EXPECT_ERR(err); + + err = reftable_addition_add(add, &write_test_ref, &ref); + EXPECT_ERR(err); + + err = reftable_addition_commit(add); + EXPECT_ERR(err); + + reftable_addition_destroy(add); + + err = reftable_stack_read_ref(st, ref.refname, &dest); + EXPECT_ERR(err); + EXPECT(REFTABLE_REF_SYMREF == dest.value_type); + EXPECT(0 == strcmp("master", dest.value.symref)); + + reftable_ref_record_release(&dest); + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_validate_refname(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + char *dir = get_tmp_dir(__FUNCTION__); + + int i; + struct reftable_ref_record ref = { + .refname = "a/b", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + char *additions[] = { "a", "a/b/c" }; + + err = 
reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT_ERR(err); + + for (i = 0; i < ARRAY_SIZE(additions); i++) { + struct reftable_ref_record ref = { + .refname = additions[i], + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT(err == REFTABLE_NAME_CONFLICT); + } + + reftable_stack_destroy(st); + clear_dir(dir); +} + +static int write_error(struct reftable_writer *wr, void *arg) +{ + return *((int *)arg); +} + +static void test_reftable_stack_update_index_check(void) +{ + char *dir = get_tmp_dir(__FUNCTION__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + struct reftable_ref_record ref1 = { + .refname = "name1", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + struct reftable_ref_record ref2 = { + .refname = "name2", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref1); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref2); + EXPECT(err == REFTABLE_API_ERROR); + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_lock_failure(void) +{ + char *dir = get_tmp_dir(__FUNCTION__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err, i; + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) { + err = reftable_stack_add(st, &write_error, &i); + EXPECT(err == i); + } + + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_add(void) +{ + int i = 0; + int err = 0; + struct reftable_write_options cfg = { + .exact_log_message = 1, + }; + struct reftable_stack *st = NULL; + 
char *dir = get_tmp_dir(__FUNCTION__); + + struct reftable_ref_record refs[2] = { { NULL } }; + struct reftable_log_record logs[2] = { { NULL } }; + int N = ARRAY_SIZE(refs); + + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + st->disable_auto_compact = 1; + + for (i = 0; i < N; i++) { + char buf[256]; + snprintf(buf, sizeof(buf), "branch%02d", i); + refs[i].refname = xstrdup(buf); + refs[i].update_index = i + 1; + refs[i].value_type = REFTABLE_REF_VAL1; + refs[i].value.val1 = reftable_malloc(GIT_SHA1_RAWSZ); + set_test_hash(refs[i].value.val1, i); + + logs[i].refname = xstrdup(buf); + logs[i].update_index = N + i + 1; + logs[i].value_type = REFTABLE_LOG_UPDATE; + + logs[i].update.new_hash = reftable_malloc(GIT_SHA1_RAWSZ); + logs[i].update.email = xstrdup("identity@invalid"); + set_test_hash(logs[i].update.new_hash, i); + } + + for (i = 0; i < N; i++) { + int err = reftable_stack_add(st, &write_test_ref, &refs[i]); + EXPECT_ERR(err); + } + + for (i = 0; i < N; i++) { + struct write_log_arg arg = { + .log = &logs[i], + .update_index = reftable_stack_next_update_index(st), + }; + int err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + } + + err = reftable_stack_compact_all(st, NULL); + EXPECT_ERR(err); + + for (i = 0; i < N; i++) { + struct reftable_ref_record dest = { NULL }; + + int err = reftable_stack_read_ref(st, refs[i].refname, &dest); + EXPECT_ERR(err); + EXPECT(reftable_ref_record_equal(&dest, refs + i, + GIT_SHA1_RAWSZ)); + reftable_ref_record_release(&dest); + } + + for (i = 0; i < N; i++) { + struct reftable_log_record dest = { NULL }; + int err = reftable_stack_read_log(st, refs[i].refname, &dest); + EXPECT_ERR(err); + EXPECT(reftable_log_record_equal(&dest, logs + i, + GIT_SHA1_RAWSZ)); + reftable_log_record_release(&dest); + } + + /* cleanup */ + reftable_stack_destroy(st); + for (i = 0; i < N; i++) { + reftable_ref_record_release(&refs[i]); + reftable_log_record_release(&logs[i]); + } + clear_dir(dir); +} + 
+static void test_reftable_stack_log_normalize(void) +{ + int err = 0; + struct reftable_write_options cfg = { + 0, + }; + struct reftable_stack *st = NULL; + char *dir = get_tmp_dir(__FUNCTION__); + + + uint8_t h1[GIT_SHA1_RAWSZ] = { 0x01 }, h2[GIT_SHA1_RAWSZ] = { 0x02 }; + + struct reftable_log_record input = { .refname = "branch", + .update_index = 1, + .value_type = REFTABLE_LOG_UPDATE, + .update = { + .new_hash = h1, + .old_hash = h2, + } }; + struct reftable_log_record dest = { + .update_index = 0, + }; + struct write_log_arg arg = { + .log = &input, + .update_index = 1, + }; + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + input.update.message = "one\ntwo"; + err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT(err == REFTABLE_API_ERROR); + + input.update.message = "one"; + err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + + err = reftable_stack_read_log(st, input.refname, &dest); + EXPECT_ERR(err); + EXPECT(0 == strcmp(dest.update.message, "one\n")); + + input.update.message = "two\n"; + arg.update_index = 2; + err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + err = reftable_stack_read_log(st, input.refname, &dest); + EXPECT_ERR(err); + EXPECT(0 == strcmp(dest.update.message, "two\n")); + + /* cleanup */ + reftable_stack_destroy(st); + reftable_log_record_release(&dest); + clear_dir(dir); +} + +static void test_reftable_stack_tombstone(void) +{ + int i = 0; + char *dir = get_tmp_dir(__FUNCTION__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + struct reftable_ref_record refs[2] = { { NULL } }; + struct reftable_log_record logs[2] = { { NULL } }; + int N = ARRAY_SIZE(refs); + struct reftable_ref_record dest = { NULL }; + struct reftable_log_record log_dest = { NULL }; + + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + /* even entries add the refs, odd entries delete them. 
*/ + for (i = 0; i < N; i++) { + const char *buf = "branch"; + refs[i].refname = xstrdup(buf); + refs[i].update_index = i + 1; + if (i % 2 == 0) { + refs[i].value_type = REFTABLE_REF_VAL1; + refs[i].value.val1 = reftable_malloc(GIT_SHA1_RAWSZ); + set_test_hash(refs[i].value.val1, i); + } + + logs[i].refname = xstrdup(buf); + /* update_index is part of the key. */ + logs[i].update_index = 42; + if (i % 2 == 0) { + logs[i].value_type = REFTABLE_LOG_UPDATE; + logs[i].update.new_hash = + reftable_malloc(GIT_SHA1_RAWSZ); + set_test_hash(logs[i].update.new_hash, i); + logs[i].update.email = xstrdup("identity@invalid"); + } + } + for (i = 0; i < N; i++) { + int err = reftable_stack_add(st, &write_test_ref, &refs[i]); + EXPECT_ERR(err); + } + + for (i = 0; i < N; i++) { + struct write_log_arg arg = { + .log = &logs[i], + .update_index = reftable_stack_next_update_index(st), + }; + int err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + } + + err = reftable_stack_read_ref(st, "branch", &dest); + EXPECT(err == 1); + reftable_ref_record_release(&dest); + + err = reftable_stack_read_log(st, "branch", &log_dest); + EXPECT(err == 1); + reftable_log_record_release(&log_dest); + + err = reftable_stack_compact_all(st, NULL); + EXPECT_ERR(err); + + err = reftable_stack_read_ref(st, "branch", &dest); + EXPECT(err == 1); + + err = reftable_stack_read_log(st, "branch", &log_dest); + EXPECT(err == 1); + reftable_ref_record_release(&dest); + reftable_log_record_release(&log_dest); + + /* cleanup */ + reftable_stack_destroy(st); + for (i = 0; i < N; i++) { + reftable_ref_record_release(&refs[i]); + reftable_log_record_release(&logs[i]); + } + clear_dir(dir); +} + +static void test_reftable_stack_hash_id(void) +{ + char *dir = get_tmp_dir(__FUNCTION__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + + struct reftable_ref_record ref = { + .refname = "master", + .value_type = REFTABLE_REF_SYMREF, + .value.symref = 
"target", + .update_index = 1, + }; + struct reftable_write_options cfg32 = { .hash_id = GIT_SHA256_FORMAT_ID }; + struct reftable_stack *st32 = NULL; + struct reftable_write_options cfg_default = { 0 }; + struct reftable_stack *st_default = NULL; + struct reftable_ref_record dest = { NULL }; + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT_ERR(err); + + /* can't read it with the wrong hash ID. */ + err = reftable_new_stack(&st32, dir, cfg32); + EXPECT(err == REFTABLE_FORMAT_ERROR); + + /* check that we can read it back with default config too. */ + err = reftable_new_stack(&st_default, dir, cfg_default); + EXPECT_ERR(err); + + err = reftable_stack_read_ref(st_default, "master", &dest); + EXPECT_ERR(err); + + EXPECT(reftable_ref_record_equal(&ref, &dest, GIT_SHA1_RAWSZ)); + reftable_ref_record_release(&dest); + reftable_stack_destroy(st); + reftable_stack_destroy(st_default); + clear_dir(dir); +} + +static void test_log2(void) +{ + EXPECT(1 == fastlog2(3)); + EXPECT(2 == fastlog2(4)); + EXPECT(2 == fastlog2(5)); +} + +static void test_sizes_to_segments(void) +{ + uint64_t sizes[] = { 2, 3, 4, 5, 7, 9 }; + /* .................0 1 2 3 4 5 */ + + int seglen = 0; + struct segment *segs = + sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes)); + EXPECT(segs[2].log == 3); + EXPECT(segs[2].start == 5); + EXPECT(segs[2].end == 6); + + EXPECT(segs[1].log == 2); + EXPECT(segs[1].start == 2); + EXPECT(segs[1].end == 5); + reftable_free(segs); +} + +static void test_sizes_to_segments_empty(void) +{ + int seglen = 0; + struct segment *segs = sizes_to_segments(&seglen, NULL, 0); + EXPECT(seglen == 0); + reftable_free(segs); +} + +static void test_sizes_to_segments_all_equal(void) +{ + uint64_t sizes[] = { 5, 5 }; + + int seglen = 0; + struct segment *segs = + sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes)); + EXPECT(seglen == 1); + EXPECT(segs[0].start == 0); + EXPECT(segs[0].end == 2); + 
reftable_free(segs); +} + +static void test_suggest_compaction_segment(void) +{ + uint64_t sizes[] = { 128, 64, 17, 16, 9, 9, 9, 16, 16 }; + /* .................0 1 2 3 4 5 6 */ + struct segment min = + suggest_compaction_segment(sizes, ARRAY_SIZE(sizes)); + EXPECT(min.start == 2); + EXPECT(min.end == 7); +} + +static void test_suggest_compaction_segment_nothing(void) +{ + uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 }; + struct segment result = + suggest_compaction_segment(sizes, ARRAY_SIZE(sizes)); + EXPECT(result.start == result.end); +} + +static void test_reflog_expire(void) +{ + char *dir = get_tmp_dir(__FUNCTION__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + struct reftable_log_record logs[20] = { { NULL } }; + int N = ARRAY_SIZE(logs) - 1; + int i = 0; + int err; + struct reftable_log_expiry_config expiry = { + .time = 10, + }; + struct reftable_log_record log = { NULL }; + + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + for (i = 1; i <= N; i++) { + char buf[256]; + snprintf(buf, sizeof(buf), "branch%02d", i); + + logs[i].refname = xstrdup(buf); + logs[i].update_index = i; + logs[i].value_type = REFTABLE_LOG_UPDATE; + logs[i].update.time = i; + logs[i].update.new_hash = reftable_malloc(GIT_SHA1_RAWSZ); + logs[i].update.email = xstrdup("identity@invalid"); + set_test_hash(logs[i].update.new_hash, i); + } + + for (i = 1; i <= N; i++) { + struct write_log_arg arg = { + .log = &logs[i], + .update_index = reftable_stack_next_update_index(st), + }; + int err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + } + + err = reftable_stack_compact_all(st, NULL); + EXPECT_ERR(err); + + err = reftable_stack_compact_all(st, &expiry); + EXPECT_ERR(err); + + err = reftable_stack_read_log(st, logs[9].refname, &log); + EXPECT(err == 1); + + err = reftable_stack_read_log(st, logs[11].refname, &log); + EXPECT_ERR(err); + + expiry.min_update_index = 15; + err = reftable_stack_compact_all(st, &expiry); 
+ EXPECT_ERR(err); + + err = reftable_stack_read_log(st, logs[14].refname, &log); + EXPECT(err == 1); + + err = reftable_stack_read_log(st, logs[16].refname, &log); + EXPECT_ERR(err); + + /* cleanup */ + reftable_stack_destroy(st); + for (i = 0; i <= N; i++) { + reftable_log_record_release(&logs[i]); + } + clear_dir(dir); + reftable_log_record_release(&log); +} + +static int write_nothing(struct reftable_writer *wr, void *arg) +{ + reftable_writer_set_limits(wr, 1, 1); + return 0; +} + +static void test_empty_add(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + char *dir = get_tmp_dir(__FUNCTION__); + + struct reftable_stack *st2 = NULL; + + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_nothing, NULL); + EXPECT_ERR(err); + + err = reftable_new_stack(&st2, dir, cfg); + EXPECT_ERR(err); + clear_dir(dir); + reftable_stack_destroy(st); + reftable_stack_destroy(st2); +} + +static void test_reftable_stack_auto_compaction(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + char *dir = get_tmp_dir(__FUNCTION__); + + int err, i; + int N = 100; + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + for (i = 0; i < N; i++) { + char name[100]; + struct reftable_ref_record ref = { + .refname = name, + .update_index = reftable_stack_next_update_index(st), + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + snprintf(name, sizeof(name), "branch%04d", i); + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT_ERR(err); + + EXPECT(i < 3 || st->merged->stack_len < 2 * fastlog2(i)); + } + + EXPECT(reftable_stack_compaction_stats(st)->entries_written < + (uint64_t)(N * fastlog2(N))); + + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_compaction_concurrent(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st1 = NULL, *st2 = NULL; + 
char *dir = get_tmp_dir(__FUNCTION__); + + int err, i; + int N = 3; + + err = reftable_new_stack(&st1, dir, cfg); + EXPECT_ERR(err); + + for (i = 0; i < N; i++) { + char name[100]; + struct reftable_ref_record ref = { + .refname = name, + .update_index = reftable_stack_next_update_index(st1), + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + snprintf(name, sizeof(name), "branch%04d", i); + + err = reftable_stack_add(st1, &write_test_ref, &ref); + EXPECT_ERR(err); + } + + err = reftable_new_stack(&st2, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_compact_all(st1, NULL); + EXPECT_ERR(err); + + reftable_stack_destroy(st1); + reftable_stack_destroy(st2); + + EXPECT(count_dir_entries(dir) == 2); + clear_dir(dir); +} + +static void unclean_stack_close(struct reftable_stack *st) +{ + // break abstraction boundary to simulate unclean shutdown. + int i = 0; + for (; i < st->readers_len; i++) { + reftable_reader_free(st->readers[i]); + } + st->readers_len = 0; + FREE_AND_NULL(st->readers); +} + +static void test_reftable_stack_compaction_concurrent_clean(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st1 = NULL, *st2 = NULL, *st3 = NULL; + char *dir = get_tmp_dir(__FUNCTION__); + + int err, i; + int N = 3; + + err = reftable_new_stack(&st1, dir, cfg); + EXPECT_ERR(err); + + for (i = 0; i < N; i++) { + char name[100]; + struct reftable_ref_record ref = { + .refname = name, + .update_index = reftable_stack_next_update_index(st1), + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + snprintf(name, sizeof(name), "branch%04d", i); + + err = reftable_stack_add(st1, &write_test_ref, &ref); + EXPECT_ERR(err); + } + + err = reftable_new_stack(&st2, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_compact_all(st1, NULL); + EXPECT_ERR(err); + + unclean_stack_close(st1); + unclean_stack_close(st2); + + err = reftable_new_stack(&st3, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_clean(st3); 
+ EXPECT_ERR(err); + EXPECT(count_dir_entries(dir) == 2); + + reftable_stack_destroy(st1); + reftable_stack_destroy(st2); + reftable_stack_destroy(st3); + + clear_dir(dir); +} + +int stack_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_empty_add); + RUN_TEST(test_log2); + RUN_TEST(test_names_equal); + RUN_TEST(test_parse_names); + RUN_TEST(test_read_file); + RUN_TEST(test_reflog_expire); + RUN_TEST(test_reftable_stack_add); + RUN_TEST(test_reftable_stack_add_one); + RUN_TEST(test_reftable_stack_auto_compaction); + RUN_TEST(test_reftable_stack_compaction_concurrent); + RUN_TEST(test_reftable_stack_compaction_concurrent_clean); + RUN_TEST(test_reftable_stack_hash_id); + RUN_TEST(test_reftable_stack_lock_failure); + RUN_TEST(test_reftable_stack_log_normalize); + RUN_TEST(test_reftable_stack_tombstone); + RUN_TEST(test_reftable_stack_transaction_api); + RUN_TEST(test_reftable_stack_update_index_check); + RUN_TEST(test_reftable_stack_uptodate); + RUN_TEST(test_reftable_stack_validate_refname); + RUN_TEST(test_sizes_to_segments); + RUN_TEST(test_sizes_to_segments_all_equal); + RUN_TEST(test_sizes_to_segments_empty); + RUN_TEST(test_suggest_compaction_segment); + RUN_TEST(test_suggest_compaction_segment_nothing); + return 0; +} diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index c8db6852c35..996da85f7b5 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -10,6 +10,7 @@ int cmd__reftable(int argc, const char **argv) record_test_main(argc, argv); refname_test_main(argc, argv); readwrite_test_main(argc, argv); + stack_test_main(argc, argv); tree_test_main(argc, argv); return 0; }
From patchwork Tue Jul 20 17:04:39 2021 X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388867 Message-Id: <05e46f7e1d8ec53a80740c312234065eb65e3fe0.1626800687.git.gitgitgadget@gmail.com> Date: Tue, 20 Jul 2021 17:04:39 +0000 Subject: [PATCH 19/26] reftable: add dump utility To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys From: Han-Wen Nienhuys
From: Han-Wen Nienhuys provide a command-line utility for inspecting individual tables, and inspecting a complete ref database Signed-off-by: Han-Wen Nienhuys --- reftable/dump.c | 105 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 105 insertions(+) create mode 100644 reftable/dump.c diff --git a/reftable/dump.c b/reftable/dump.c new file mode 100644 index 00000000000..668cfa89965 --- /dev/null +++ b/reftable/dump.c @@ -0,0 +1,105 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "reftable-blocksource.h" +#include "reftable-error.h" +#include "reftable-merged.h" +#include "reftable-record.h" +#include "reftable-tests.h" +#include "reftable-writer.h" +#include "reftable-iterator.h" +#include "reftable-reader.h" +#include "reftable-stack.h" +#include "reftable-generic.h" +#include "hash.h" + +#include +#include +#include +#include +#include + +static int compact_stack(const char *stackdir) +{ + struct reftable_stack *stack = NULL; + struct reftable_write_options cfg = { 0 }; + + int err = reftable_new_stack(&stack, stackdir, cfg); + if (err < 0) + goto done; + + err = reftable_stack_compact_all(stack, NULL); + if (err < 0) + goto done; +done: + if (stack) { + reftable_stack_destroy(stack); + } + return err; +} + +static void print_help(void) +{ + printf("usage: dump [-cst] arg\n\n" + "options: \n" + " -c compact\n" + " -t dump table\n" + " -s dump stack\n" + " -6 sha256 hash format\n" + " -h this help\n" + "\n"); +} + +int reftable_dump_main(int argc, char *const *argv) +{ + int err = 0; + int opt_dump_table = 0; + int opt_dump_stack = 0; + int opt_compact = 0; + uint32_t opt_hash_id = GIT_SHA1_FORMAT_ID; + const char *arg = NULL, *argv0 = argv[0]; + + for (; argc > 1; argv++, argc--) + if (*argv[1] != '-') + break; + else if (!strcmp("-t", argv[1])) + opt_dump_table = 1; + else if (!strcmp("-6", argv[1])) + opt_hash_id = GIT_SHA256_FORMAT_ID; + else if (!strcmp("-s", argv[1])) + opt_dump_stack = 1; + else if (!strcmp("-c", argv[1])) + opt_compact = 1; + else if (!strcmp("-?", argv[1]) || !strcmp("-h", argv[1])) { + print_help(); + return 2; + } + + if (argc != 2) { + fprintf(stderr, "need argument\n"); + print_help(); + return 2; + } + + arg = argv[1]; + + if (opt_dump_table) { + err = reftable_reader_print_file(arg); + } else if (opt_dump_stack) { + err = reftable_stack_print_directory(arg, opt_hash_id); + } else if 
(opt_compact) { + err = compact_stack(arg); + } + + if (err < 0) { + fprintf(stderr, "%s: %s: %s\n", argv0, arg, + reftable_error_str(err)); + return 1; + } + return 0; +}
From patchwork Tue Jul 20 17:04:40 2021 X-Patchwork-Submitter: Bruce Perry via GitGitGadget X-Patchwork-Id: 12388873 From: "Han-Wen Nienhuys via GitGitGadget" Date: Tue, 20 Jul 2021 17:04:40 +0000 Subject: [PATCH 20/26] refs: RFC: Reftable support for git-core To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys
From: Han-Wen Nienhuys For background, see Documentation/technical/reftable.txt. This introduces the file refs/reftable-backend.c containing a reftable-powered ref storage backend. It can be activated by setting GIT_TEST_REFTABLE in the environment. When GIT_TEST_REFTABLE is set, the test prerequisite !REFFILES is set. There is no option to git-init for now, as the test suite still shows failures with GIT_TEST_REFTABLE=1. Example use: see t/t0031-reftable.sh Signed-off-by: Han-Wen Nienhuys Signed-off-by: Johannes Schindelin Helped-by: Johannes Schindelin Helped-by: Junio Hamano Helped-by: Patrick Steinhardt Co-authored-by: Jeff King --- Documentation/config/extensions.txt | 9 + .../technical/repository-version.txt | 7 + Makefile | 1 + builtin/clone.c | 5 +- builtin/init-db.c | 39 +- builtin/stash.c | 8 +- builtin/worktree.c | 27 +- cache.h | 8 +- config.mak.uname | 2 +- contrib/buildsystems/Generators/Vcxproj.pm | 11 +- refs.c | 26 +- refs.h | 3 + refs/refs-internal.h | 1 + refs/reftable-backend.c | 1683 +++++++++++++++++ repository.c | 2 + repository.h | 3 + setup.c | 6 + t/t0031-reftable.sh | 291 +++ t/t1409-avoid-packing-refs.sh | 6 + t/t1450-fsck.sh | 6 + t/t3210-pack-refs.sh | 6 + t/test-lib.sh | 7 +- 22 files changed, 2129 insertions(+), 28 deletions(-) create mode 100644 refs/reftable-backend.c create mode 100755 t/t0031-reftable.sh diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt index 4e23d73cdca..82c5940f143 100644 --- a/Documentation/config/extensions.txt +++ b/Documentation/config/extensions.txt @@ -6,3 +6,12 @@ extensions.objectFormat:: Note that this setting should only be set by linkgit:git-init[1] or linkgit:git-clone[1]. Trying to change it after initialization will not work and will produce hard-to-diagnose issues. ++ +extensions.refStorage:: + Specify the ref storage mechanism to use. The acceptable values are `files` and + `reftable`. If not specified, `files` is assumed. 
It is an error to specify + this key unless `core.repositoryFormatVersion` is 1. ++ +Note that this setting should only be set by linkgit:git-init[1] or +linkgit:git-clone[1]. Trying to change it after initialization will not +work and will produce hard-to-diagnose issues. diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt index 7844ef30ffd..72576235833 100644 --- a/Documentation/technical/repository-version.txt +++ b/Documentation/technical/repository-version.txt @@ -100,3 +100,10 @@ If set, by default "git config" reads from both "config" and multiple working directory mode, "config" file is shared while "config.worktree" is per-working directory (i.e., it's in GIT_COMMON_DIR/worktrees//config.worktree) + +==== `refStorage` + +Specifies the file format for the ref database. Values are `files` +(for the traditional packed + loose ref format) and `reftable` for the +binary reftable format. See https://github.com/google/reftable for +more information. diff --git a/Makefile b/Makefile index c18042929c8..19566c661f1 100644 --- a/Makefile +++ b/Makefile @@ -986,6 +986,7 @@ LIB_OBJS += reflog-walk.o LIB_OBJS += refs.o LIB_OBJS += refs/debug.o LIB_OBJS += refs/files-backend.o +LIB_OBJS += refs/reftable-backend.o LIB_OBJS += refs/iterator.o LIB_OBJS += refs/packed-backend.o LIB_OBJS += refs/ref-cache.o diff --git a/builtin/clone.c b/builtin/clone.c index 66fe66679c8..baa1ff4fc60 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -1148,7 +1148,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) } init_db(git_dir, real_git_dir, option_template, GIT_HASH_UNKNOWN, NULL, - INIT_DB_QUIET); + default_ref_storage(), INIT_DB_QUIET); if (real_git_dir) git_dir = real_git_dir; @@ -1299,7 +1299,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) * Now that we know what algorithm the remote side is using, * let's set ours to the same thing. 
*/ - initialize_repository_version(hash_algo, 1); + initialize_repository_version(hash_algo, 1, + default_ref_storage()); repo_set_hash_algo(the_repository, hash_algo); mapped_refs = wanted_peer_refs(refs, &remote->fetch); diff --git a/builtin/init-db.c b/builtin/init-db.c index c2f03f6018e..22b07d2b2fb 100644 --- a/builtin/init-db.c +++ b/builtin/init-db.c @@ -167,12 +167,14 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree) return 1; } -void initialize_repository_version(int hash_algo, int reinit) +void initialize_repository_version(int hash_algo, int reinit, + const char *ref_storage_format) { char repo_version_string[10]; int repo_version = GIT_REPO_VERSION; - if (hash_algo != GIT_HASH_SHA1) + if (hash_algo != GIT_HASH_SHA1 || + !strcmp(ref_storage_format, "reftable")) repo_version = GIT_REPO_VERSION_READ; /* This forces creation of new config file */ @@ -226,6 +228,7 @@ static int create_default_files(const char *template_path, is_bare_repository_cfg = init_is_bare_repository || !work_tree; if (init_shared_repository != -1) set_shared_repository(init_shared_repository); + the_repository->ref_storage_format = xstrdup(fmt->ref_storage); /* * We would have created the above under user's umask -- under @@ -235,6 +238,24 @@ static int create_default_files(const char *template_path, adjust_shared_perm(get_git_dir()); } + /* + * Check to see if .git/HEAD exists; this must happen before + * initializing the ref db, because we want to see if there is an + * existing HEAD. + */ + path = git_path_buf(&buf, "HEAD"); + reinit = (!access(path, R_OK) || + readlink(path, junk, sizeof(junk) - 1) != -1); + + /* + * refs/heads is a file when using reftable. 
We can't reinitialize with + * a reftable because it will overwrite HEAD + */ + if (reinit && (!strcmp(fmt->ref_storage, "reftable")) == + is_directory(git_path_buf(&buf, "refs/heads"))) { + die("cannot switch ref storage format."); + } + /* * We need to create a "refs" dir in any case so that older * versions of git can tell that this is a repository. @@ -249,9 +270,6 @@ static int create_default_files(const char *template_path, * Point the HEAD symref to the initial branch with if HEAD does * not yet exist. */ - path = git_path_buf(&buf, "HEAD"); - reinit = (!access(path, R_OK) - || readlink(path, junk, sizeof(junk)-1) != -1); if (!reinit) { char *ref; @@ -268,7 +286,7 @@ static int create_default_files(const char *template_path, free(ref); } - initialize_repository_version(fmt->hash_algo, 0); + initialize_repository_version(fmt->hash_algo, 0, fmt->ref_storage); /* Check filemode trustability */ path = git_path_buf(&buf, "config"); @@ -383,7 +401,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash int init_db(const char *git_dir, const char *real_git_dir, const char *template_dir, int hash, const char *initial_branch, - unsigned int flags) + const char *ref_storage_format, unsigned int flags) { int reinit; int exist_ok = flags & INIT_DB_EXIST_OK; @@ -422,6 +440,7 @@ int init_db(const char *git_dir, const char *real_git_dir, * is an attempt to reinitialize new repository with an old tool. 
*/ check_repository_format(&repo_fmt); + repo_fmt.ref_storage = xstrdup(ref_storage_format); validate_hash_algorithm(&repo_fmt, hash); @@ -476,6 +495,9 @@ int init_db(const char *git_dir, const char *real_git_dir, git_config_set("receive.denyNonFastforwards", "true"); } + if (!strcmp(ref_storage_format, "reftable")) + git_config_set("extensions.refStorage", ref_storage_format); + if (!(flags & INIT_DB_QUIET)) { int len = strlen(git_dir); @@ -549,6 +571,7 @@ static const char *const init_db_usage[] = { int cmd_init_db(int argc, const char **argv, const char *prefix) { const char *git_dir; + const char *ref_storage_format = default_ref_storage(); const char *real_git_dir = NULL; const char *work_tree; const char *template_dir = NULL; @@ -713,5 +736,5 @@ int cmd_init_db(int argc, const char **argv, const char *prefix) flags |= INIT_DB_EXIST_OK; return init_db(git_dir, real_git_dir, template_dir, hash_algo, - initial_branch, flags); + initial_branch, ref_storage_format, flags); } diff --git a/builtin/stash.c b/builtin/stash.c index 8f42360ca91..d43d6898039 100644 --- a/builtin/stash.c +++ b/builtin/stash.c @@ -207,10 +207,16 @@ static int get_stash_info(struct stash_info *info, int argc, const char **argv) static int do_clear_stash(void) { struct object_id obj; + int result; if (get_oid(ref_stash, &obj)) return 0; - return delete_ref(NULL, ref_stash, &obj, 0); + result = delete_ref(NULL, ref_stash, &obj, 0); + + /* Ignore error; this is necessary for reftable, which keeps reflogs + * even when refs are deleted. 
*/ + delete_reflog(ref_stash); + return result; } static int clear_stash(int argc, const char **argv, const char *prefix) diff --git a/builtin/worktree.c b/builtin/worktree.c index 976bf8ed063..9601ccd5909 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -13,6 +13,7 @@ #include "utf8.h" #include "worktree.h" #include "quote.h" +#include "../refs/refs-internal.h" static const char * const worktree_usage[] = { N_("git worktree add [] []"), @@ -328,9 +329,29 @@ static int add_worktree(const char *path, const char *refname, * worktree. */ strbuf_reset(&sb); - strbuf_addf(&sb, "%s/HEAD", sb_repo.buf); - write_file(sb.buf, "%s", oid_to_hex(null_oid())); - strbuf_reset(&sb); + if (get_main_ref_store(the_repository)->be == &refs_be_reftable) { + /* XXX this is cut & paste from reftable_init_db. */ + strbuf_addf(&sb, "%s/HEAD", sb_repo.buf); + write_file(sb.buf, "%s", "ref: refs/heads/.invalid\n"); + strbuf_reset(&sb); + + strbuf_addf(&sb, "%s/refs", sb_repo.buf); + safe_create_dir(sb.buf, 1); + strbuf_reset(&sb); + + strbuf_addf(&sb, "%s/refs/heads", sb_repo.buf); + write_file(sb.buf, "this repository uses the reftable format"); + strbuf_reset(&sb); + + strbuf_addf(&sb, "%s/reftable", sb_repo.buf); + safe_create_dir(sb.buf, 1); + strbuf_reset(&sb); + } else { + strbuf_addf(&sb, "%s/HEAD", sb_repo.buf); + write_file(sb.buf, "%s", oid_to_hex(null_oid())); + strbuf_reset(&sb); + } + strbuf_addf(&sb, "%s/commondir", sb_repo.buf); write_file(sb.buf, "../.."); diff --git a/cache.h b/cache.h index ba04ff8bd36..f4472afb3fc 100644 --- a/cache.h +++ b/cache.h @@ -647,9 +647,10 @@ int path_inside_repo(const char *prefix, const char *path); #define INIT_DB_EXIST_OK 0x0002 int init_db(const char *git_dir, const char *real_git_dir, - const char *template_dir, int hash_algo, - const char *initial_branch, unsigned int flags); -void initialize_repository_version(int hash_algo, int reinit); + const char *template_dir, int hash_algo, const char *initial_branch, + const char 
*ref_storage_format, unsigned int flags); +void initialize_repository_version(int hash_algo, int reinit, + const char *ref_storage_format); void sanitize_stdfds(void); int daemonize(void); @@ -1067,6 +1068,7 @@ struct repository_format { int hash_algo; int sparse_index; char *work_tree; + char *ref_storage; struct string_list unknown_extensions; struct string_list v1_only_extensions; }; diff --git a/config.mak.uname b/config.mak.uname index 69413fb3dc0..a2b156e338e 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -711,7 +711,7 @@ vcxproj: # Make .vcxproj files and add them unset QUIET_GEN QUIET_BUILT_IN; \ perl contrib/buildsystems/generate -g Vcxproj - git add -f git.sln {*,*/lib,t/helper/*}/*.vcxproj + git add -f git.sln {*,*/lib,*/libreftable,t/helper/*}/*.vcxproj # Generate the LinkOrCopyBuiltins.targets and LinkOrCopyRemoteHttp.targets file (echo '' && \ diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm index d2584450ba1..1a25789d285 100644 --- a/contrib/buildsystems/Generators/Vcxproj.pm +++ b/contrib/buildsystems/Generators/Vcxproj.pm @@ -77,7 +77,7 @@ sub createProject { my $libs_release = "\n "; my $libs_debug = "\n "; if (!$static_library) { - $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}})); + $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib|reftable\/libreftable\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}})); $libs_debug = $libs_release; $libs_debug =~ s/zlib\.lib/zlibd\.lib/g; $libs_debug =~ s/libexpat\.lib/libexpatd\.lib/g; @@ -232,6 +232,7 @@ EOM EOM if (!$static_library || $target =~ 'vcs-svn' || $target =~ 'xdiff') { my $uuid_libgit = $$build_structure{"LIBS_libgit_GUID"}; + my $uuid_libreftable = $$build_structure{"LIBS_reftable/libreftable_GUID"}; my $uuid_xdiff_lib = $$build_structure{"LIBS_xdiff/lib_GUID"}; print F << "EOM"; @@ -241,6 +242,14 @@ EOM false 
EOM + if (!($name =~ /xdiff|libreftable/)) { + print F << "EOM"; + + $uuid_libreftable + false + +EOM + } if (!($name =~ 'xdiff')) { print F << "EOM"; diff --git a/refs.c b/refs.c index 8b9f7c3a80a..00c4d24df6e 100644 --- a/refs.c +++ b/refs.c @@ -19,10 +19,15 @@ #include "repository.h" #include "sigchain.h" +const char *default_ref_storage(void) +{ + return git_env_bool("GIT_TEST_REFTABLE", 0) ? "reftable" : "files"; +} + /* * List of all available backends */ -static struct ref_storage_be *refs_backends = &refs_be_files; +static struct ref_storage_be *refs_backends = &refs_be_reftable; static struct ref_storage_be *find_ref_storage_backend(const char *name) { @@ -1875,13 +1880,13 @@ static struct ref_store *lookup_ref_store_map(struct hashmap *map, * Create, record, and return a ref_store instance for the specified * gitdir. */ -static struct ref_store *ref_store_init(const char *gitdir, +static struct ref_store *ref_store_init(const char *gitdir, const char *be_name, unsigned int flags) { - const char *be_name = "files"; - struct ref_storage_be *be = find_ref_storage_backend(be_name); + struct ref_storage_be *be; struct ref_store *refs; + be = find_ref_storage_backend(be_name); if (!be) BUG("reference backend %s is unknown", be_name); @@ -1897,7 +1902,11 @@ struct ref_store *get_main_ref_store(struct repository *r) if (!r->gitdir) BUG("attempting to get main_ref_store outside of repository"); - r->refs_private = ref_store_init(r->gitdir, REF_STORE_ALL_CAPS); + r->refs_private = ref_store_init(r->gitdir, + r->ref_storage_format ? 
+ r->ref_storage_format : + default_ref_storage(), + REF_STORE_ALL_CAPS); r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private); return r->refs_private; } @@ -1953,7 +1962,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule) goto done; /* assume that add_submodule_odb() has been called */ - refs = ref_store_init(submodule_sb.buf, + refs = ref_store_init(submodule_sb.buf, default_ref_storage(), REF_STORE_READ | REF_STORE_ODB); register_ref_store_map(&submodule_ref_stores, "submodule", refs, submodule); @@ -1967,6 +1976,7 @@ done: struct ref_store *get_worktree_ref_store(const struct worktree *wt) { + const char *format = default_ref_storage(); struct ref_store *refs; const char *id; @@ -1980,9 +1990,9 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt) if (wt->id) refs = ref_store_init(git_common_path("worktrees/%s", wt->id), - REF_STORE_ALL_CAPS); + format, REF_STORE_ALL_CAPS); else - refs = ref_store_init(get_git_common_dir(), + refs = ref_store_init(get_git_common_dir(), format, REF_STORE_ALL_CAPS); if (refs) diff --git a/refs.h b/refs.h index 48970dfc7e0..5a6d4ca9fa8 100644 --- a/refs.h +++ b/refs.h @@ -11,6 +11,9 @@ struct string_list; struct string_list_item; struct worktree; +/* Returns the ref storage backend to use by default. */ +const char *default_ref_storage(void); + /* * Resolve a reference, recursively following symbolic refererences. 
* diff --git a/refs/refs-internal.h b/refs/refs-internal.h index 3155708345f..e36f215067e 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -672,6 +672,7 @@ struct ref_storage_be { }; extern struct ref_storage_be refs_be_files; +extern struct ref_storage_be refs_be_reftable; extern struct ref_storage_be refs_be_packed; /* diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c new file mode 100644 index 00000000000..d7137d12138 --- /dev/null +++ b/refs/reftable-backend.c @@ -0,0 +1,1683 @@ +#include "../cache.h" +#include "../chdir-notify.h" +#include "../config.h" +#include "../iterator.h" +#include "../lockfile.h" +#include "../refs.h" +#include "../reftable/reftable-stack.h" +#include "../reftable/reftable-record.h" +#include "../reftable/reftable-error.h" +#include "../reftable/reftable-blocksource.h" +#include "../reftable/reftable-reader.h" +#include "../reftable/reftable-iterator.h" +#include "../reftable/reftable-merged.h" +#include "../reftable/reftable-generic.h" +#include "../worktree.h" +#include "refs-internal.h" + +extern struct ref_storage_be refs_be_reftable; + +struct git_reftable_ref_store { + struct ref_store base; + unsigned int store_flags; + + int err; + char *repo_dir; + + char *reftable_dir; + + struct reftable_stack *main_stack; + struct reftable_stack *worktree_stack; +}; + +/* + * Some refs are global to the repository (refs/heads/{*}), while others are + * local to the worktree (eg. HEAD, refs/bisect/{*}). We solve this by having + * two separate databases (ie. two reftable/ directories), one for the + * repository, and one for the worktree. For reading, we merge the view (see + * git_reftable_iterator) of both, when necessary. + * + * Unfortunately, the worktrees can also be selected by specifying a magic + * refname (eg. worktree/BLA/refname, even if BLA isn't the current worktree.) 
+ */ +static struct reftable_stack *stack_for(struct git_reftable_ref_store *store, + const char *refname) +{ + const char *wtname = refname; + int wtname_len = 0; + const char *wtref = refname; + + if (refname == NULL) + return store->main_stack; + + if (!parse_worktree_ref(refname, &wtname, &wtname_len, &wtref) && + wtname_len) { + /* this makes me cry. Woe you if you try to access + * worktree/BLA/REF and the current worktree + * from the same process. + */ + struct strbuf wt_dir = STRBUF_INIT; + struct reftable_write_options cfg = { + .block_size = 4096, + .hash_id = the_hash_algo->format_id, + }; + + strbuf_addstr(&wt_dir, store->base.gitdir); + strbuf_addstr(&wt_dir, "/worktrees/"); + strbuf_add(&wt_dir, wtname, wtname_len); + strbuf_addstr(&wt_dir, "/reftable"); + + if (store->worktree_stack) + reftable_stack_destroy(store->worktree_stack); + store->err = reftable_new_stack(&store->worktree_stack, + wt_dir.buf, cfg); + assert(store->err != REFTABLE_API_ERROR); + + return store->worktree_stack; + } + + if (store->worktree_stack == NULL) + return store->main_stack; + + switch (ref_type(refname)) { + case REF_TYPE_PER_WORKTREE: + case REF_TYPE_PSEUDOREF: + case REF_TYPE_OTHER_PSEUDOREF: + return store->worktree_stack; + default: + case REF_TYPE_MAIN_PSEUDOREF: + case REF_TYPE_NORMAL: + return store->main_stack; + } +} + +static const char *bare_ref_name(const char *ref) +{ + const char *out = ref; + int name_len = 0; + if (skip_prefix(ref, "main-worktree/", &out)) + return out; + + if (!parse_worktree_ref(ref, NULL, &name_len, &out) && name_len) { + return out; + } + + return ref; +} + +static int git_reftable_read_raw_ref(struct ref_store *ref_store, + const char *refname, struct object_id *oid, + struct strbuf *referent, + unsigned int *type); + +static void clear_reftable_log_record(struct reftable_log_record *log) +{ + log->refname = NULL; + switch (log->value_type) { + case REFTABLE_LOG_UPDATE: + log->update.old_hash = NULL; + log->update.new_hash = NULL; 
+ log->update.message = NULL; + break; + case REFTABLE_LOG_DELETION: + break; + } + reftable_log_record_release(log); +} + +static void fill_reftable_log_record(struct reftable_log_record *log) +{ + const char *info = git_committer_info(0); + struct ident_split split = { NULL }; + int result = split_ident_line(&split, info, strlen(info)); + int sign = 1; + assert(0 == result); + + reftable_log_record_release(log); + log->value_type = REFTABLE_LOG_UPDATE; + log->update.name = + xstrndup(split.name_begin, split.name_end - split.name_begin); + log->update.email = + xstrndup(split.mail_begin, split.mail_end - split.mail_begin); + log->update.time = atol(split.date_begin); + if (*split.tz_begin == '-') { + sign = -1; + split.tz_begin++; + } + if (*split.tz_begin == '+') { + sign = 1; + split.tz_begin++; + } + + log->update.tz_offset = sign * atoi(split.tz_begin); +} + +static int has_suffix(struct strbuf *b, const char *suffix) +{ + size_t len = strlen(suffix); + + if (len > b->len) { + return 0; + } + + return 0 == strncmp(b->buf + b->len - len, suffix, len); +} + +/* trims the last path component of b. Returns -1 if it is not + * present, or 0 on success + */ +static int trim_component(struct strbuf *b) +{ + char *last; + last = strrchr(b->buf, '/'); + if (!last) + return -1; + strbuf_setlen(b, last - b->buf); + return 0; +} + +/* Returns whether `b` is a worktree path. 
Mutates its arg, trimming it to the + * gitdir + */ +static int is_worktree(struct strbuf *b) +{ + if (trim_component(b) < 0) { + return 0; + } + if (!has_suffix(b, "/worktrees")) { + return 0; + } + trim_component(b); + return 1; +} + +static struct ref_store *git_reftable_ref_store_create(const char *path, + unsigned int store_flags) +{ + struct git_reftable_ref_store *refs = xcalloc(1, sizeof(*refs)); + struct ref_store *ref_store = (struct ref_store *)refs; + struct reftable_write_options cfg = { + .block_size = 4096, + .hash_id = the_hash_algo->format_id, + }; + struct strbuf sb = STRBUF_INIT; + const char *gitdir = path; + struct strbuf wt_buf = STRBUF_INIT; + int wt = 0; + + strbuf_realpath(&wt_buf, path, /*die_on_error=*/0); + + /* this is clumsy, but the official worktree functions (eg. + * get_worktrees()) function will try to initialize a ref storage + * backend, leading to infinite recursion. */ + wt = is_worktree(&wt_buf); + if (wt) { + gitdir = wt_buf.buf; + } + + base_ref_store_init(ref_store, &refs_be_reftable); + ref_store->gitdir = xstrdup(gitdir); + refs->store_flags = store_flags; + strbuf_addf(&sb, "%s/reftable", gitdir); + refs->reftable_dir = xstrdup(sb.buf); + strbuf_reset(&sb); + + refs->err = + reftable_new_stack(&refs->main_stack, refs->reftable_dir, cfg); + assert(refs->err != REFTABLE_API_ERROR); + + if (refs->err == 0 && wt) { + strbuf_addf(&sb, "%s/reftable", path); + + refs->err = + reftable_new_stack(&refs->worktree_stack, sb.buf, cfg); + assert(refs->err != REFTABLE_API_ERROR); + } + + strbuf_release(&sb); + strbuf_release(&wt_buf); + return ref_store; +} + +static int git_reftable_init_db(struct ref_store *ref_store, struct strbuf *err) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct strbuf sb = STRBUF_INIT; + + safe_create_dir(refs->reftable_dir, 1); + + strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir); + write_file(sb.buf, "ref: refs/heads/.invalid"); + strbuf_reset(&sb); + + 
strbuf_addf(&sb, "%s/refs", refs->base.gitdir); + safe_create_dir(sb.buf, 1); + strbuf_reset(&sb); + + strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir); + write_file(sb.buf, "this repository uses the reftable format"); + + return 0; +} + +struct git_reftable_iterator { + struct ref_iterator base; + struct reftable_iterator iter; + struct reftable_ref_record ref; + struct object_id oid; + struct ref_store *ref_store; + + /* In case we must iterate over 2 stacks, this is non-null. */ + struct reftable_merged_table *merged; + unsigned int flags; + int err; + const char *prefix; +}; + +static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator) +{ + struct git_reftable_iterator *ri = + (struct git_reftable_iterator *)ref_iterator; + while (ri->err == 0) { + ri->err = reftable_iterator_next_ref(&ri->iter, &ri->ref); + if (ri->err) { + break; + } + + if (ref_type(ri->ref.refname) == REF_TYPE_PSEUDOREF) { + /* + pseudorefs, eg. HEAD, FETCH_HEAD should not be + produced, by default. 
+ */ + continue; + } + ri->base.refname = ri->ref.refname; + if (ri->prefix != NULL && + strncmp(ri->prefix, ri->ref.refname, strlen(ri->prefix))) { + ri->err = 1; + break; + } + if (ri->flags & DO_FOR_EACH_PER_WORKTREE_ONLY && + ref_type(ri->base.refname) != REF_TYPE_PER_WORKTREE) + continue; + + ri->base.flags = 0; + switch (ri->ref.value_type) { + case REFTABLE_REF_VAL1: + oidread(&ri->oid, ri->ref.value.val1); + break; + case REFTABLE_REF_VAL2: + oidread(&ri->oid, ri->ref.value.val2.value); + break; + case REFTABLE_REF_SYMREF: { + int out_flags = 0; + const char *resolved = refs_resolve_ref_unsafe( + ri->ref_store, ri->ref.refname, + RESOLVE_REF_READING, &ri->oid, &out_flags); + ri->base.flags = out_flags; + if (resolved == NULL && + !(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) && + (ri->base.flags & REF_ISBROKEN)) { + continue; + } + break; + } + default: + abort(); + } + + ri->base.oid = &ri->oid; + if (!(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) && + !ref_resolves_to_object(ri->base.refname, ri->base.oid, + ri->base.flags)) { + continue; + } + + break; + } + + if (ri->err > 0) { + return ITER_DONE; + } + if (ri->err < 0) { + return ITER_ERROR; + } + + return ITER_OK; +} + +static int reftable_ref_iterator_peel(struct ref_iterator *ref_iterator, + struct object_id *peeled) +{ + struct git_reftable_iterator *ri = + (struct git_reftable_iterator *)ref_iterator; + if (ri->ref.value_type == REFTABLE_REF_VAL2) { + oidread(peeled, ri->ref.value.val2.target_value); + return 0; + } + + return 1; +} + +static int reftable_ref_iterator_abort(struct ref_iterator *ref_iterator) +{ + struct git_reftable_iterator *ri = + (struct git_reftable_iterator *)ref_iterator; + reftable_ref_record_release(&ri->ref); + reftable_iterator_destroy(&ri->iter); + if (ri->merged) { + reftable_merged_table_free(ri->merged); + } + return 0; +} + +static struct ref_iterator_vtable reftable_ref_iterator_vtable = { + reftable_ref_iterator_advance, reftable_ref_iterator_peel, + 
reftable_ref_iterator_abort +}; + +static struct ref_iterator * +git_reftable_ref_iterator_begin(struct ref_store *ref_store, const char *prefix, + unsigned int flags) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct git_reftable_iterator *ri = xcalloc(1, sizeof(*ri)); + + if (refs->err < 0) { + ri->err = refs->err; + } else if (refs->worktree_stack == NULL) { + struct reftable_merged_table *mt = + reftable_stack_merged_table(refs->main_stack); + ri->err = reftable_merged_table_seek_ref(mt, &ri->iter, prefix); + } else { + struct reftable_merged_table *mt1 = + reftable_stack_merged_table(refs->main_stack); + struct reftable_merged_table *mt2 = + reftable_stack_merged_table(refs->worktree_stack); + struct reftable_table *tabs = + xcalloc(2, sizeof(struct reftable_table)); + reftable_table_from_merged_table(&tabs[0], mt1); + reftable_table_from_merged_table(&tabs[1], mt2); + ri->err = reftable_new_merged_table(&ri->merged, tabs, 2, + the_hash_algo->format_id); + if (ri->err == 0) + ri->err = reftable_merged_table_seek_ref( + ri->merged, &ri->iter, prefix); + } + + base_ref_iterator_init(&ri->base, &reftable_ref_iterator_vtable, 1); + ri->prefix = prefix; + ri->base.oid = &ri->oid; + ri->flags = flags; + ri->ref_store = ref_store; + return &ri->base; +} + +static int fixup_symrefs(struct ref_store *ref_store, + struct ref_transaction *transaction) +{ + struct strbuf referent = STRBUF_INIT; + int i = 0; + int err = 0; + + for (i = 0; i < transaction->nr; i++) { + struct ref_update *update = transaction->updates[i]; + struct object_id old_oid; + + err = git_reftable_read_raw_ref(ref_store, update->refname, + &old_oid, &referent, + /* mutate input, like + files-backend.c */ + &update->type); + if (err < 0 && errno == ENOENT && + is_null_oid(&update->old_oid)) { + err = 0; + } + if (err < 0) + goto done; + + if (!(update->type & REF_ISSYMREF)) + continue; + + if (update->flags & REF_NO_DEREF) { + /* what should happen here? 
See files-backend.c + * lock_ref_for_update. */ + } else { + /* + If we are updating a symref (eg. HEAD), we should also + update the branch that the symref points to. + + This is generic functionality, and would be better + done in refs.c, but the current implementation is + intertwined with the locking in files-backend.c. + */ + int new_flags = update->flags; + struct ref_update *new_update = NULL; + + /* if this is an update for HEAD, should also record a + log entry for HEAD? See files-backend.c, + split_head_update() + */ + new_update = ref_transaction_add_update( + transaction, referent.buf, new_flags, + &update->new_oid, &update->old_oid, + update->msg); + new_update->parent_update = update; + + /* files-backend sets REF_LOG_ONLY here. */ + update->flags |= REF_NO_DEREF | REF_LOG_ONLY; + update->flags &= ~REF_HAVE_OLD; + } + } + +done: + assert(err != REFTABLE_API_ERROR); + strbuf_release(&referent); + return err; +} + +static int git_reftable_transaction_prepare(struct ref_store *ref_store, + struct ref_transaction *transaction, + struct strbuf *errbuf) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_addition *add = NULL; + struct reftable_stack *stack = stack_for( + refs, + transaction->nr ? 
transaction->updates[0]->refname : NULL); + + int err = refs->err; + if (err < 0) { + goto done; + } + + err = reftable_stack_reload(stack); + if (err) { + goto done; + } + + err = reftable_stack_new_addition(&add, stack); + if (err) { + goto done; + } + + err = fixup_symrefs(ref_store, transaction); + if (err) { + goto done; + } + + transaction->backend_data = add; + transaction->state = REF_TRANSACTION_PREPARED; + +done: + assert(err != REFTABLE_API_ERROR); + if (err < 0) { + transaction->state = REF_TRANSACTION_CLOSED; + strbuf_addf(errbuf, "reftable: transaction prepare: %s", + reftable_error_str(err)); + } + + return err; +} + +static int git_reftable_transaction_abort(struct ref_store *ref_store, + struct ref_transaction *transaction, + struct strbuf *err) +{ + struct reftable_addition *add = + (struct reftable_addition *)transaction->backend_data; + reftable_addition_destroy(add); + transaction->backend_data = NULL; + return 0; +} + +static int reftable_check_old_oid(struct ref_store *refs, const char *refname, + struct object_id *want_oid) +{ + struct object_id out_oid; + int out_flags = 0; + const char *resolved = refs_resolve_ref_unsafe( + refs, refname, RESOLVE_REF_READING, &out_oid, &out_flags); + if (is_null_oid(want_oid) != (resolved == NULL)) { + return REFTABLE_LOCK_ERROR; + } + + if (resolved != NULL && !oideq(&out_oid, want_oid)) { + return REFTABLE_LOCK_ERROR; + } + + return 0; +} + +static int ref_update_cmp(const void *a, const void *b) +{ + return strcmp((*(struct ref_update **)a)->refname, + (*(struct ref_update **)b)->refname); +} + +static int write_transaction_table(struct reftable_writer *writer, void *arg) +{ + struct ref_transaction *transaction = (struct ref_transaction *)arg; + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)transaction->ref_store; + struct reftable_stack *stack = + stack_for(refs, transaction->updates[0]->refname); + uint64_t ts = reftable_stack_next_update_index(stack); + int err = 0; + int 
i = 0; + struct reftable_log_record *logs = + calloc(transaction->nr, sizeof(*logs)); + struct ref_update **sorted = + malloc(transaction->nr * sizeof(struct ref_update *)); + struct reftable_merged_table *mt = reftable_stack_merged_table(stack); + struct reftable_table tab = { NULL }; + struct reftable_ref_record ref = { NULL }; + reftable_table_from_merged_table(&tab, mt); + COPY_ARRAY(sorted, transaction->updates, transaction->nr); + QSORT(sorted, transaction->nr, ref_update_cmp); + reftable_writer_set_limits(writer, ts, ts); + + for (i = 0; i < transaction->nr; i++) { + struct ref_update *u = sorted[i]; + struct reftable_log_record *log = &logs[i]; + struct object_id old_id = *null_oid(); + fill_reftable_log_record(log); + log->update_index = ts; + log->value_type = REFTABLE_LOG_UPDATE; + log->refname = (char *)u->refname; + log->update.new_hash = u->new_oid.hash; + log->update.message = u->msg; + + err = reftable_table_read_ref(&tab, u->refname, &ref); + if (err < 0) + goto done; + else if (err > 0) { + err = 0; + } + + /* XXX if this is a symref (say, HEAD), should we deref the + * symref and check the update.old_hash against the referent? */ + if (ref.value_type == REFTABLE_REF_VAL2 || + ref.value_type == REFTABLE_REF_VAL1) + oidread(&old_id, ref.value.val1); + + /* XXX fold together with the old_id check below? 
*/ + + log->update.old_hash = old_id.hash; + if (u->flags & REF_LOG_ONLY) { + continue; + } + + if (u->flags & REF_HAVE_NEW) { + struct reftable_ref_record ref = { NULL }; + struct object_id peeled; + + int peel_error = peel_object(&u->new_oid, &peeled); + ref.refname = (char *)u->refname; + ref.update_index = ts; + + if (!peel_error) { + ref.value_type = REFTABLE_REF_VAL2; + ref.value.val2.target_value = peeled.hash; + ref.value.val2.value = u->new_oid.hash; + } else if (!is_null_oid(&u->new_oid)) { + ref.value_type = REFTABLE_REF_VAL1; + ref.value.val1 = u->new_oid.hash; + } + + err = reftable_writer_add_ref(writer, &ref); + if (err < 0) { + goto done; + } + } + } + + for (i = 0; i < transaction->nr; i++) { + err = reftable_writer_add_log(writer, &logs[i]); + clear_reftable_log_record(&logs[i]); + if (err < 0) { + goto done; + } + } + +done: + assert(err != REFTABLE_API_ERROR); + reftable_ref_record_release(&ref); + free(logs); + free(sorted); + return err; +} + +static int git_reftable_transaction_finish(struct ref_store *ref_store, + struct ref_transaction *transaction, + struct strbuf *errmsg) +{ + struct reftable_addition *add = + (struct reftable_addition *)transaction->backend_data; + int err = 0; + int i; + + for (i = 0; i < transaction->nr; i++) { + struct ref_update *u = transaction->updates[i]; + if (u->flags & REF_HAVE_OLD) { + err = reftable_check_old_oid(transaction->ref_store, + u->refname, &u->old_oid); + if (err < 0) { + goto done; + } + } + } + if (transaction->nr) { + err = reftable_addition_add(add, &write_transaction_table, + transaction); + if (err < 0) { + goto done; + } + } + + err = reftable_addition_commit(add); + +done: + assert(err != REFTABLE_API_ERROR); + reftable_addition_destroy(add); + transaction->state = REF_TRANSACTION_CLOSED; + transaction->backend_data = NULL; + if (err) { + strbuf_addf(errmsg, "reftable: transaction failure: %s", + reftable_error_str(err)); + return -1; + } + return err; +} + +static int 
+git_reftable_transaction_initial_commit(struct ref_store *ref_store, + struct ref_transaction *transaction, + struct strbuf *errmsg) +{ + int err = git_reftable_transaction_prepare(ref_store, transaction, + errmsg); + if (err) + return err; + + return git_reftable_transaction_finish(ref_store, transaction, errmsg); +} + +struct write_delete_refs_arg { + struct reftable_stack *stack; + struct string_list *refnames; + const char *logmsg; + unsigned int flags; +}; + +static int write_delete_refs_table(struct reftable_writer *writer, void *argv) +{ + struct write_delete_refs_arg *arg = + (struct write_delete_refs_arg *)argv; + uint64_t ts = reftable_stack_next_update_index(arg->stack); + int err = 0; + int i = 0; + + reftable_writer_set_limits(writer, ts, ts); + for (i = 0; i < arg->refnames->nr; i++) { + struct reftable_ref_record ref = { + .refname = (char *)arg->refnames->items[i].string, + .value_type = REFTABLE_REF_DELETION, + .update_index = ts, + }; + err = reftable_writer_add_ref(writer, &ref); + if (err < 0) { + return err; + } + } + + for (i = 0; i < arg->refnames->nr; i++) { + struct reftable_log_record log = { + .update_index = ts, + }; + struct reftable_ref_record current = { NULL }; + fill_reftable_log_record(&log); + log.update_index = ts; + log.refname = (char *)arg->refnames->items[i].string; + + log.update.message = xstrdup(arg->logmsg); + log.update.new_hash = NULL; + log.update.old_hash = NULL; + if (reftable_stack_read_ref(arg->stack, log.refname, + &current) == 0) { + log.update.old_hash = + reftable_ref_record_val1(&current); + } + err = reftable_writer_add_log(writer, &log); + log.update.old_hash = NULL; + reftable_ref_record_release(&current); + + clear_reftable_log_record(&log); + if (err < 0) { + return err; + } + } + return 0; +} + +static int git_reftable_delete_refs(struct ref_store *ref_store, + const char *msg, + struct string_list *refnames, + unsigned int flags) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + 
struct reftable_stack *stack = stack_for( + refs, refnames->nr ? refnames->items[0].string : NULL); + struct write_delete_refs_arg arg = { + .stack = stack, + .refnames = refnames, + .logmsg = msg, + .flags = flags, + }; + int err = refs->err; + if (err < 0) { + goto done; + } + + string_list_sort(refnames); + err = reftable_stack_reload(stack); + if (err) { + goto done; + } + err = reftable_stack_add(stack, &write_delete_refs_table, &arg); +done: + assert(err != REFTABLE_API_ERROR); + return err; +} + +static int git_reftable_pack_refs(struct ref_store *ref_store, + unsigned int flags) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + int err = refs->err; + if (err < 0) { + return err; + } + err = reftable_stack_compact_all(refs->main_stack, NULL); + if (err == 0 && refs->worktree_stack != NULL) + err = reftable_stack_compact_all(refs->worktree_stack, NULL); + if (err == 0) + err = reftable_stack_clean(refs->main_stack); + if (err == 0 && refs->worktree_stack != NULL) + err = reftable_stack_clean(refs->worktree_stack); + + return err; +} + +struct write_create_symref_arg { + struct git_reftable_ref_store *refs; + struct reftable_stack *stack; + const char *refname; + const char *target; + const char *logmsg; +}; + +static int write_create_symref_table(struct reftable_writer *writer, void *arg) +{ + struct write_create_symref_arg *create = + (struct write_create_symref_arg *)arg; + uint64_t ts = reftable_stack_next_update_index(create->stack); + int err = 0; + + struct reftable_ref_record ref = { + .refname = (char *)create->refname, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = (char *)create->target, + .update_index = ts, + }; + reftable_writer_set_limits(writer, ts, ts); + err = reftable_writer_add_ref(writer, &ref); + if (err == 0) { + struct reftable_log_record log = { NULL }; + struct object_id new_oid; + struct object_id old_oid; + + fill_reftable_log_record(&log); + log.refname = (char *)create->refname; + 
log.update_index = ts; + log.update.message = (char *)create->logmsg; + if (refs_resolve_ref_unsafe( + (struct ref_store *)create->refs, create->refname, + RESOLVE_REF_READING, &old_oid, NULL) != NULL) { + log.update.old_hash = old_oid.hash; + } + + if (refs_resolve_ref_unsafe((struct ref_store *)create->refs, + create->target, RESOLVE_REF_READING, + &new_oid, NULL) != NULL) { + log.update.new_hash = new_oid.hash; + } + + if (log.update.old_hash != NULL || + log.update.new_hash != NULL) { + err = reftable_writer_add_log(writer, &log); + } + log.refname = NULL; + log.update.message = NULL; + log.update.old_hash = NULL; + log.update.new_hash = NULL; + clear_reftable_log_record(&log); + } + return err; +} + +static int git_reftable_create_symref(struct ref_store *ref_store, + const char *refname, const char *target, + const char *logmsg) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, refname); + struct write_create_symref_arg arg = { .refs = refs, + .stack = stack, + .refname = refname, + .target = target, + .logmsg = logmsg }; + int err = refs->err; + if (err < 0) { + goto done; + } + err = reftable_stack_reload(stack); + if (err) { + goto done; + } + err = reftable_stack_add(stack, &write_create_symref_table, &arg); +done: + assert(err != REFTABLE_API_ERROR); + return err; +} + +struct write_rename_arg { + struct reftable_stack *stack; + const char *oldname; + const char *newname; + const char *logmsg; +}; + +static int write_rename_table(struct reftable_writer *writer, void *argv) +{ + struct write_rename_arg *arg = (struct write_rename_arg *)argv; + uint64_t ts = reftable_stack_next_update_index(arg->stack); + struct reftable_ref_record old_ref = { NULL }; + struct reftable_ref_record new_ref = { NULL }; + int err = reftable_stack_read_ref(arg->stack, arg->oldname, &old_ref); + + if (err) { + goto done; + } + + /* git-branch supports a --force, but the check is not atomic. 
*/ + if (reftable_stack_read_ref(arg->stack, arg->newname, &new_ref) == 0) { + goto done; + } + + reftable_writer_set_limits(writer, ts, ts); + + { + struct reftable_ref_record todo[2] = { + { + .refname = (char *)arg->oldname, + .update_index = ts, + .value_type = REFTABLE_REF_DELETION, + }, + old_ref, + }; + todo[1].update_index = ts; + todo[1].refname = (char *)arg->newname; + + err = reftable_writer_add_refs(writer, todo, 2); + if (err < 0) { + goto done; + } + } + + if (reftable_ref_record_val1(&old_ref)) { + uint8_t *val1 = reftable_ref_record_val1(&old_ref); + struct reftable_log_record todo[2] = { { NULL } }; + fill_reftable_log_record(&todo[0]); + fill_reftable_log_record(&todo[1]); + + todo[0].refname = (char *)arg->oldname; + todo[0].update_index = ts; + todo[0].update.message = (char *)arg->logmsg; + todo[0].update.old_hash = val1; + todo[0].update.new_hash = NULL; + + todo[1].refname = (char *)arg->newname; + todo[1].update_index = ts; + todo[1].update.old_hash = NULL; + todo[1].update.new_hash = val1; + todo[1].update.message = (char *)arg->logmsg; + + err = reftable_writer_add_logs(writer, todo, 2); + + clear_reftable_log_record(&todo[0]); + clear_reftable_log_record(&todo[1]); + + if (err < 0) { + goto done; + } + + } else { + /* XXX symrefs? */ + } + +done: + assert(err != REFTABLE_API_ERROR); + reftable_ref_record_release(&new_ref); + reftable_ref_record_release(&old_ref); + return err; +} + +static int write_copy_table(struct reftable_writer *writer, void *argv) +{ + struct write_rename_arg *arg = (struct write_rename_arg *)argv; + uint64_t ts = reftable_stack_next_update_index(arg->stack); + struct reftable_ref_record old_ref = { NULL }; + struct reftable_ref_record new_ref = { NULL }; + struct reftable_log_record log = { NULL }; + struct reftable_iterator it = { NULL }; + int err = reftable_stack_read_ref(arg->stack, arg->oldname, &old_ref); + if (err) { + goto done; + } + + /* git-branch supports a --force, but the check is not atomic. 
*/ + if (reftable_stack_read_ref(arg->stack, arg->newname, &new_ref) == 0) { + goto done; + } + + reftable_writer_set_limits(writer, ts, ts); + + FREE_AND_NULL(old_ref.refname); + old_ref.refname = xstrdup(arg->newname); + old_ref.update_index = ts; + err = reftable_writer_add_ref(writer, &old_ref); + if (err < 0) { + goto done; + } + + /* this copies the entire reflog history. Is this the right semantics? + */ + /* XXX should clear out existing reflog entries for oldname? */ + err = reftable_merged_table_seek_log( + reftable_stack_merged_table(arg->stack), &it, arg->oldname); + if (err < 0) { + goto done; + } + while (1) { + int err = reftable_iterator_next_log(&it, &log); + if (err < 0) { + goto done; + } + + if (err > 0 || strcmp(log.refname, arg->oldname)) { + break; + } + FREE_AND_NULL(log.refname); + log.refname = xstrdup(arg->newname); + reftable_writer_add_log(writer, &log); + reftable_log_record_release(&log); + } + +done: + assert(err != REFTABLE_API_ERROR); + reftable_ref_record_release(&new_ref); + reftable_ref_record_release(&old_ref); + reftable_log_record_release(&log); + reftable_iterator_destroy(&it); + return err; +} + +static int git_reftable_rename_ref(struct ref_store *ref_store, + const char *oldrefname, + const char *newrefname, const char *logmsg) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, newrefname); + struct write_rename_arg arg = { + .stack = stack, + .oldname = oldrefname, + .newname = newrefname, + .logmsg = logmsg, + }; + int err = refs->err; + if (err < 0) { + goto done; + } + err = reftable_stack_reload(stack); + if (err) { + goto done; + } + + err = reftable_stack_add(stack, &write_rename_table, &arg); +done: + assert(err != REFTABLE_API_ERROR); + return err; +} + +static int git_reftable_copy_ref(struct ref_store *ref_store, + const char *oldrefname, const char *newrefname, + const char *logmsg) +{ + struct git_reftable_ref_store *refs = 
+ (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, newrefname); + struct write_rename_arg arg = { + .stack = stack, + .oldname = oldrefname, + .newname = newrefname, + .logmsg = logmsg, + }; + int err = refs->err; + if (err < 0) { + goto done; + } + err = reftable_stack_reload(stack); + if (err) { + goto done; + } + + err = reftable_stack_add(stack, &write_copy_table, &arg); +done: + assert(err != REFTABLE_API_ERROR); + return err; +} + +struct git_reftable_reflog_ref_iterator { + struct ref_iterator base; + struct reftable_iterator iter; + struct reftable_log_record log; + struct object_id oid; + + /* Used when iterating over worktree & main */ + struct reftable_merged_table *merged; + char *last_name; +}; + +static int +git_reftable_reflog_ref_iterator_advance(struct ref_iterator *ref_iterator) +{ + struct git_reftable_reflog_ref_iterator *ri = + (struct git_reftable_reflog_ref_iterator *)ref_iterator; + + while (1) { + int err = reftable_iterator_next_log(&ri->iter, &ri->log); + if (err > 0) { + return ITER_DONE; + } + if (err < 0) { + return ITER_ERROR; + } + + ri->base.refname = ri->log.refname; + if (ri->last_name != NULL && + !strcmp(ri->log.refname, ri->last_name)) { + /* we want the refnames that we have reflogs for, so we + * skip if we've already produced this name. This could + * be faster by seeking directly to + * reflog@update_index==0. 
+ */ + continue; + } + + free(ri->last_name); + ri->last_name = xstrdup(ri->log.refname); + oidread(&ri->oid, ri->log.update.new_hash); + return ITER_OK; + } +} + +static int +git_reftable_reflog_ref_iterator_peel(struct ref_iterator *ref_iterator, + struct object_id *peeled) +{ + BUG("not supported."); + return -1; +} + +static int +git_reftable_reflog_ref_iterator_abort(struct ref_iterator *ref_iterator) +{ + struct git_reftable_reflog_ref_iterator *ri = + (struct git_reftable_reflog_ref_iterator *)ref_iterator; + reftable_log_record_release(&ri->log); + reftable_iterator_destroy(&ri->iter); + if (ri->merged) + reftable_merged_table_free(ri->merged); + return 0; +} + +static struct ref_iterator_vtable git_reftable_reflog_ref_iterator_vtable = { + git_reftable_reflog_ref_iterator_advance, + git_reftable_reflog_ref_iterator_peel, + git_reftable_reflog_ref_iterator_abort +}; + +static struct ref_iterator * +git_reftable_reflog_iterator_begin(struct ref_store *ref_store) +{ + struct git_reftable_reflog_ref_iterator *ri = xcalloc(1, sizeof(*ri)); + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + + if (refs->worktree_stack == NULL) { + struct reftable_stack *stack = refs->main_stack; + struct reftable_merged_table *mt = + reftable_stack_merged_table(stack); + int err = reftable_merged_table_seek_log(mt, &ri->iter, ""); + if (err < 0) { + free(ri); + /* XXX is this allowed? 
*/ + return NULL; + } + } else { + struct reftable_merged_table *mt1 = + reftable_stack_merged_table(refs->main_stack); + struct reftable_merged_table *mt2 = + reftable_stack_merged_table(refs->worktree_stack); + struct reftable_table *tabs = + xcalloc(2, sizeof(struct reftable_table)); + int err = 0; + reftable_table_from_merged_table(&tabs[0], mt1); + reftable_table_from_merged_table(&tabs[1], mt2); + err = reftable_new_merged_table(&ri->merged, tabs, 2, + the_hash_algo->format_id); + if (err < 0) { + free(tabs); + /* XXX see above */ + return NULL; + } + err = reftable_merged_table_seek_ref(ri->merged, &ri->iter, ""); + if (err < 0) { + return NULL; + } + } + base_ref_iterator_init(&ri->base, + &git_reftable_reflog_ref_iterator_vtable, 1); + ri->base.oid = &ri->oid; + + return (struct ref_iterator *)ri; +} + +static int git_reftable_for_each_reflog_ent_newest_first( + struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn, + void *cb_data) +{ + struct reftable_iterator it = { NULL }; + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, refname); + struct reftable_merged_table *mt = NULL; + int err = 0; + struct reftable_log_record log = { NULL }; + + if (refs->err < 0) { + return refs->err; + } + refname = bare_ref_name(refname); + + mt = reftable_stack_merged_table(stack); + err = reftable_merged_table_seek_log(mt, &it, refname); + while (err == 0) { + struct object_id old_oid; + struct object_id new_oid; + const char *full_committer = ""; + + err = reftable_iterator_next_log(&it, &log); + if (err > 0) { + err = 0; + break; + } + if (err < 0) { + break; + } + + if (strcmp(log.refname, refname)) { + break; + } + + oidread(&old_oid, log.update.old_hash); + oidread(&new_oid, log.update.new_hash); + + full_committer = fmt_ident(log.update.name, log.update.email, + WANT_COMMITTER_IDENT, + /*date*/ NULL, IDENT_NO_DATE); + err = fn(&old_oid, &new_oid, full_committer, 
log.update.time, + log.update.tz_offset, log.update.message, cb_data); + if (err) + break; + } + + reftable_log_record_release(&log); + reftable_iterator_destroy(&it); + return err; +} + +static int git_reftable_for_each_reflog_ent_oldest_first( + struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn, + void *cb_data) +{ + struct reftable_iterator it = { NULL }; + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, refname); + struct reftable_merged_table *mt = NULL; + struct reftable_log_record *logs = NULL; + int cap = 0; + int len = 0; + int err = 0; + int i = 0; + + if (refs->err < 0) { + return refs->err; + } + refname = bare_ref_name(refname); + mt = reftable_stack_merged_table(stack); + err = reftable_merged_table_seek_log(mt, &it, refname); + + while (err == 0) { + struct reftable_log_record log = { NULL }; + err = reftable_iterator_next_log(&it, &log); + if (err > 0) { + err = 0; + break; + } + if (err < 0) { + break; + } + + if (strcmp(log.refname, refname)) { + break; + } + + if (len == cap) { + cap = 2 * cap + 1; + logs = realloc(logs, cap * sizeof(*logs)); + } + + logs[len++] = log; + } + + for (i = len; i--;) { + struct reftable_log_record *log = &logs[i]; + struct object_id old_oid; + struct object_id new_oid; + const char *full_committer = ""; + + oidread(&old_oid, log->update.old_hash); + oidread(&new_oid, log->update.new_hash); + + full_committer = fmt_ident(log->update.name, log->update.email, + WANT_COMMITTER_IDENT, NULL, + IDENT_NO_DATE); + err = fn(&old_oid, &new_oid, full_committer, log->update.time, + log->update.tz_offset, log->update.message, cb_data); + if (err) { + break; + } + } + + for (i = 0; i < len; i++) { + reftable_log_record_release(&logs[i]); + } + free(logs); + + reftable_iterator_destroy(&it); + return err; +} + +static int git_reftable_reflog_exists(struct ref_store *ref_store, + const char *refname) +{ + struct 
reftable_iterator it = { NULL }; + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, refname); + struct reftable_merged_table *mt = reftable_stack_merged_table(stack); + struct reftable_log_record log = { NULL }; + int err = refs->err; + + if (err < 0) { + goto done; + } + + refname = bare_ref_name(refname); + err = reftable_merged_table_seek_log(mt, &it, refname); + if (err) { + goto done; + } + err = reftable_iterator_next_log(&it, &log); + if (err) { + goto done; + } + + if (strcmp(log.refname, refname)) { + err = 1; + } + +done: + reftable_iterator_destroy(&it); + reftable_log_record_release(&log); + return !err; +} + +static int git_reftable_create_reflog(struct ref_store *ref_store, + const char *refname, int force_create, + struct strbuf *err) +{ + return 0; +} + +struct write_reflog_delete_arg { + struct reftable_stack *stack; + const char *refname; +}; + +static int write_reflog_delete_table(struct reftable_writer *writer, void *argv) +{ + struct write_reflog_delete_arg *arg = argv; + struct reftable_merged_table *mt = + reftable_stack_merged_table(arg->stack); + struct reftable_log_record log = { NULL }; + struct reftable_iterator it = { NULL }; + uint64_t ts = reftable_stack_next_update_index(arg->stack); + int err = reftable_merged_table_seek_log(mt, &it, arg->refname); + + reftable_writer_set_limits(writer, ts, ts); + while (err == 0) { + struct reftable_log_record tombstone = { + .refname = (char *)arg->refname, + .update_index = REFTABLE_LOG_DELETION, + }; + err = reftable_iterator_next_log(&it, &log); + if (err > 0) { + err = 0; + break; + } + + if (err < 0 || strcmp(log.refname, arg->refname)) { + break; + } + tombstone.update_index = log.update_index; + err = reftable_writer_add_log(writer, &tombstone); + } + + return err; +} + +static int git_reftable_delete_reflog(struct ref_store *ref_store, + const char *refname) +{ + struct git_reftable_ref_store *refs = + 
(struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, refname); + struct write_reflog_delete_arg arg = { + .stack = stack, + .refname = refname, + }; + int err = reftable_stack_add(stack, &write_reflog_delete_table, &arg); + assert(err != REFTABLE_API_ERROR); + return err; +} + +struct reflog_expiry_arg { + struct reftable_stack *stack; + struct reftable_log_record *records; + int len; +}; + +static int write_reflog_expiry_table(struct reftable_writer *writer, void *argv) +{ + struct reflog_expiry_arg *arg = (struct reflog_expiry_arg *)argv; + uint64_t ts = reftable_stack_next_update_index(arg->stack); + int i = 0; + reftable_writer_set_limits(writer, ts, ts); + for (i = 0; i < arg->len; i++) { + int err = reftable_writer_add_log(writer, &arg->records[i]); + if (err) { + return err; + } + } + return 0; +} + +static int +git_reftable_reflog_expire(struct ref_store *ref_store, const char *refname, + const struct object_id *oid, unsigned int flags, + reflog_expiry_prepare_fn prepare_fn, + reflog_expiry_should_prune_fn should_prune_fn, + reflog_expiry_cleanup_fn cleanup_fn, + void *policy_cb_data) +{ + /* + For log expiry, we write tombstones in place of the expired entries, + This means that the entries are still retrievable by delving into the + stack, and expiring entries paradoxically takes extra memory. + + This memory is only reclaimed when some operation issues a + git_reftable_pack_refs(), which will compact the entire stack and get + rid of deletion entries. + + It would be better if the refs backend supported an API that sets a + criterion for all refs, passing the criterion to pack_refs(). 
+ */ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, refname); + struct reftable_merged_table *mt = NULL; + struct reflog_expiry_arg arg = { + .stack = stack, + }; + struct reftable_log_record *logs = NULL; + struct reftable_log_record *rewritten = NULL; + int logs_len = 0; + int logs_cap = 0; + int i = 0; + uint8_t *last_hash = NULL; + struct reftable_iterator it = { NULL }; + struct reftable_addition *add = NULL; + int err = 0; + if (refs->err < 0) { + return refs->err; + } + err = reftable_stack_reload(stack); + if (err) { + goto done; + } + + mt = reftable_stack_merged_table(stack); + err = reftable_merged_table_seek_log(mt, &it, refname); + if (err < 0) { + goto done; + } + + err = reftable_stack_new_addition(&add, stack); + if (err) { + goto done; + } + prepare_fn(refname, oid, policy_cb_data); + while (1) { + struct reftable_log_record log = { NULL }; + int err = reftable_iterator_next_log(&it, &log); + if (err < 0) { + goto done; + } + + if (err > 0 || strcmp(log.refname, refname)) { + break; + } + + if (logs_len >= logs_cap) { + int new_cap = logs_cap * 2 + 1; + logs = realloc(logs, new_cap * sizeof(*logs)); + logs_cap = new_cap; + } + logs[logs_len++] = log; + } + + rewritten = calloc(logs_len, sizeof(*rewritten)); + for (i = logs_len - 1; i >= 0; i--) { + struct object_id ooid; + struct object_id noid; + struct reftable_log_record *dest = &rewritten[i]; + + *dest = logs[i]; + oidread(&ooid, logs[i].update.old_hash); + oidread(&noid, logs[i].update.new_hash); + + if (should_prune_fn(&ooid, &noid, logs[i].update.email, + (timestamp_t)logs[i].update.time, + logs[i].update.tz_offset, + logs[i].update.message, policy_cb_data)) { + dest->value_type = REFTABLE_LOG_DELETION; + } else { + if ((flags & EXPIRE_REFLOGS_REWRITE) && + last_hash != NULL) { + dest->update.old_hash = last_hash; + } + last_hash = logs[i].update.new_hash; + } + } + + arg.records = rewritten; + arg.len = 
logs_len; + err = reftable_addition_add(add, &write_reflog_expiry_table, &arg); + if (err < 0) { + goto done; + } + + if (!(flags & EXPIRE_REFLOGS_DRY_RUN)) { + /* XXX - skip writing records that were not changed. */ + err = reftable_addition_commit(add); + } else { + /* XXX - print something */ + } + +done: + if (add) { + cleanup_fn(policy_cb_data); + } + assert(err != REFTABLE_API_ERROR); + reftable_addition_destroy(add); + for (i = 0; i < logs_len; i++) + reftable_log_record_release(&logs[i]); + free(logs); + free(rewritten); + reftable_iterator_destroy(&it); + return err; +} + +static int reftable_error_to_errno(int err) +{ + switch (err) { + case REFTABLE_IO_ERROR: + return EIO; + case REFTABLE_FORMAT_ERROR: + return EFAULT; + case REFTABLE_NOT_EXIST_ERROR: + return ENOENT; + case REFTABLE_LOCK_ERROR: + return EBUSY; + case REFTABLE_API_ERROR: + return EINVAL; + case REFTABLE_ZLIB_ERROR: + return EDOM; + default: + return ERANGE; + } +} + +static int git_reftable_read_raw_ref(struct ref_store *ref_store, + const char *refname, struct object_id *oid, + struct strbuf *referent, + unsigned int *type) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, refname); + struct reftable_ref_record ref = { NULL }; + int err = 0; + + refname = bare_ref_name(refname); /* XXX - in which other cases should + we do this? */ + if (refs->err < 0) { + return refs->err; + } + + /* This is usually not needed, but Git doesn't signal to ref backend if + a subprocess updated the ref DB. So we always check. 
+ */ + err = reftable_stack_reload(stack); + if (err) { + goto done; + } + + err = reftable_stack_read_ref(stack, refname, &ref); + if (err > 0) { + errno = ENOENT; + err = -1; + goto done; + } + if (err < 0) { + errno = reftable_error_to_errno(err); + err = -1; + goto done; + } + + if (ref.value_type == REFTABLE_REF_SYMREF) { + strbuf_reset(referent); + strbuf_addstr(referent, ref.value.symref); + *type |= REF_ISSYMREF; + } else if (reftable_ref_record_val1(&ref) != NULL) { + oidread(oid, reftable_ref_record_val1(&ref)); + } else { + /* We got a tombstone, which should not happen. */ + BUG("Got reftable_ref_record with value type %d", + ref.value_type); + } + +done: + assert(err != REFTABLE_API_ERROR); + reftable_ref_record_release(&ref); + return err; +} + +struct ref_storage_be refs_be_reftable = { + &refs_be_files, + "reftable", + git_reftable_ref_store_create, + git_reftable_init_db, + git_reftable_transaction_prepare, + git_reftable_transaction_finish, + git_reftable_transaction_abort, + git_reftable_transaction_initial_commit, + + git_reftable_pack_refs, + git_reftable_create_symref, + git_reftable_delete_refs, + git_reftable_rename_ref, + git_reftable_copy_ref, + + git_reftable_ref_iterator_begin, + git_reftable_read_raw_ref, + + git_reftable_reflog_iterator_begin, + git_reftable_for_each_reflog_ent_oldest_first, + git_reftable_for_each_reflog_ent_newest_first, + git_reftable_reflog_exists, + git_reftable_create_reflog, + git_reftable_delete_reflog, + git_reftable_reflog_expire, +}; diff --git a/repository.c b/repository.c index b2bf44c6faf..f1cc8df47c4 100644 --- a/repository.c +++ b/repository.c @@ -180,6 +180,8 @@ int repo_init(struct repository *repo, if (worktree) repo_set_worktree(repo, worktree); + repo->ref_storage_format = xstrdup_or_null(format.ref_storage); + clear_repository_format(&format); return 0; diff --git a/repository.h b/repository.h index 3740c93bc0f..1bd9b4d09c8 100644 --- a/repository.h +++ b/repository.h @@ -82,6 +82,9 @@ struct 
repository { */ struct ref_store *refs_private; + /* The format to use for the ref database. */ + char *ref_storage_format; + /* * Contains path to often used file names. */ diff --git a/setup.c b/setup.c index eb9367ca5cb..72d27477560 100644 --- a/setup.c +++ b/setup.c @@ -498,6 +498,9 @@ static enum extension_result handle_extension(const char *var, return error("invalid value for 'extensions.objectformat'"); data->hash_algo = format; return EXTENSION_OK; + } else if (!strcmp(ext, "refstorage")) { + data->ref_storage = xstrdup(value); + return EXTENSION_OK; } return EXTENSION_UNKNOWN; } @@ -648,6 +651,7 @@ void clear_repository_format(struct repository_format *format) string_list_clear(&format->v1_only_extensions, 0); free(format->work_tree); free(format->partial_clone); + free(format->ref_storage); init_repository_format(format); } @@ -1312,6 +1316,8 @@ const char *setup_git_directory_gently(int *nongit_ok) the_repository->repository_format_partial_clone = repo_fmt.partial_clone; repo_fmt.partial_clone = NULL; + the_repository->ref_storage_format = + xstrdup_or_null(repo_fmt.ref_storage); } } /* diff --git a/t/t0031-reftable.sh b/t/t0031-reftable.sh new file mode 100755 index 00000000000..7899a1c580e --- /dev/null +++ b/t/t0031-reftable.sh @@ -0,0 +1,291 @@ +#!/bin/sh +# +# Copyright (c) 2020 Google LLC +# + +test_description='reftable basics' + +. 
./test-lib.sh + +INVALID_SHA1=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + +git_init () { + git init -b primary "$@" +} + +initialize () { + rm -rf .git && + (GIT_TEST_REFTABLE=1; export GIT_TEST_REFTABLE; git_init) && + mv .git/hooks .git/hooks-disabled +} + +write_script fake_editor <<\EOF +echo "$MSG" >"$1" +echo "$MSG" >&2 +EOF +GIT_EDITOR=./fake_editor +export GIT_EDITOR + + +test_expect_success 'using reftable' ' + initialize && + test -d .git/reftable && + test -f .git/reftable/tables.list +' + +test_expect_success 'read existing old OID if REF_HAVE_OLD is not set' ' + initialize && + test_commit 1st && + test_commit 2nd && + MSG=b4 git notes add && + MSG=b3 git notes edit && + echo b4 >expect && + git notes --ref commits@{1} show >actual && + test_cmp expect actual +' + +test_expect_success 'git reflog delete' ' + initialize && + test_commit file && + test_commit file2 && + test_commit file3 && + test_commit file4 && + git reflog delete HEAD@{1} && + git reflog > output && + ! grep file3 output +' + +test_expect_success 'branch -D delete nonexistent branch' ' + initialize && + test_commit file && + test_must_fail git branch -D ../../my-private-file +' + +test_expect_success 'branch copy' ' + initialize && + test_commit file1 && + test_commit file2 && + git branch src && + git reflog src > expect && + git branch -c src dst && + git reflog dst | sed "s/dst/src/g" > actual && + test_cmp expect actual +' + +test_expect_success 'update-ref on corrupted data' ' + initialize && + test_commit file1 && + OLD_SHA1=$(git rev-parse HEAD) && + test_commit file2 && + ls -l .git/reftable && + for f in .git/reftable/*.ref + do + >$f + done && + test_must_fail git update-ref refs/heads/main $OLD_SHA1 +' + +test_expect_success 'git stash' ' + initialize && + test_commit file && + touch actual expected && + git -c status.showStash=true status >expected && + echo hoi >> file.t && + git stash push -m stashed && + git stash clear && + git -c status.showStash=true status >actual && 
+ test_cmp expected actual +' + +test_expect_success 'rename branch' ' + initialize && + git symbolic-ref HEAD refs/heads/before && + test_commit file && + git show-ref | sed s/before/after/g > expected && + git branch -M after && + git show-ref > actual && + test_cmp expected actual +' + +test_expect_success 'SHA256 support, env' ' + rm -rf .git && + GIT_DEFAULT_HASH=sha256 && export GIT_DEFAULT_HASH && + (GIT_TEST_REFTABLE=1 git_init) && + mv .git/hooks .git/hooks-disabled && + test_commit file +' + +test_expect_success 'SHA256 support, option' ' + rm -rf .git && + (GIT_TEST_REFTABLE=1 git_init --object-format=sha256) && + mv .git/hooks .git/hooks-disabled && + test_commit file +' + +test_expect_success 'delete ref' ' + initialize && + test_commit file && + SHA=$(git show-ref -s --verify HEAD) && + test_write_lines "$SHA refs/heads/primary" "$SHA refs/tags/file" >expect && + git show-ref >actual && + ! git update-ref -d refs/tags/file $INVALID_SHA1 && + test_cmp expect actual && + git update-ref -d refs/tags/file $SHA && + test_write_lines "$SHA refs/heads/primary" >expect && + git show-ref >actual && + test_cmp expect actual +' + + +test_expect_success 'clone calls transaction_initial_commit' ' + test_commit message1 file1 && + git clone . 
cloned && + (test -f cloned/file1 || echo "Fixme.") +' + +test_expect_success 'basic operation of reftable storage: commit, show-ref' ' + initialize && + test_commit file && + test_write_lines refs/heads/primary refs/tags/file >expect && + git show-ref && + git show-ref | cut -f2 -d" " >actual && + test_cmp actual expect +' + +test_expect_success 'reflog, repack' ' + initialize && + for count in $(test_seq 1 10) + do + test_commit "number $count" file.t $count number-$count || + return 1 + done && + git pack-refs && + ls -1 .git/reftable >table-files && + test_line_count = 2 table-files && + git reflog refs/heads/primary >output && + test_line_count = 10 output && + grep "commit (initial): number 1" output && + grep "commit: number 10" output && + git gc && + git reflog refs/heads/primary >output && + test_line_count = 0 output +' + +test_expect_success 'branch switch in reflog output' ' + initialize && + test_commit file1 && + git checkout -b branch1 && + test_commit file2 && + git checkout -b branch2 && + git switch - && + git rev-parse --symbolic-full-name HEAD >actual && + echo refs/heads/branch1 >expect && + test_cmp actual expect +' + + +# This matches show-ref's output +print_ref() { + echo "$(git rev-parse "$1") $1" +} + +test_expect_success 'peeled tags are stored' ' + initialize && + test_commit file && + git tag -m "annotated tag" test_tag HEAD && + { + print_ref "refs/heads/primary" && + print_ref "refs/tags/file" && + print_ref "refs/tags/test_tag" && + print_ref "refs/tags/test_tag^{}" + } >expect && + git show-ref -d >actual && + test_cmp expect actual +' + +test_expect_success 'show-ref works on fresh repo' ' + initialize && + rm -rf .git && + (GIT_TEST_REFTABLE=1 git_init) && + >expect && + ! git show-ref >actual && + test_cmp expect actual +' + +test_expect_success 'checkout unborn branch' ' + initialize && + git checkout -b primary +' + + +test_expect_success 'dir/file conflict' ' + initialize && + test_commit file && + ! 
git branch primary/forbidden +' + + +test_expect_success 'do not clobber existing repo' ' + rm -rf .git && + git_init && + cat .git/HEAD >expect && + test_commit file && + (GIT_TEST_REFTABLE=1 git_init || true) && + cat .git/HEAD >actual && + test_cmp expect actual +' + +# cherry-pick uses a pseudo ref. +test_expect_success 'pseudo refs' ' + initialize && + test_commit message1 file1 && + test_commit message2 file2 && + git branch source && + git checkout HEAD^ && + test_commit message3 file3 && + git cherry-pick source && + test -f file2 +' + +# cherry-pick uses a pseudo ref. +test_expect_success 'rebase' ' + initialize && + test_commit message1 file1 && + test_commit message2 file2 && + git branch source && + git checkout HEAD^ && + test_commit message3 file3 && + git rebase source && + test -f file2 +' + +test_expect_success 'worktrees' ' + (GIT_TEST_REFTABLE=1 git_init start) && + (cd start && test_commit file1 && git checkout -b branch1 && + git checkout -b branch2 && + git worktree add ../wt + ) && + cd wt && + git checkout branch1 && + git branch +' + +test_expect_success 'worktrees 2' ' + initialize && + test_commit file1 && + mkdir existing_empty && + git worktree add --detach existing_empty primary +' + +test_expect_success 'FETCH_HEAD' ' + initialize && + test_commit one && + (git_init sub && cd sub && test_commit two) && + git --git-dir sub/.git rev-parse HEAD >expect && + git fetch sub && + git checkout FETCH_HEAD && + git rev-parse HEAD >actual && + test_cmp expect actual +' + +test_done diff --git a/t/t1409-avoid-packing-refs.sh b/t/t1409-avoid-packing-refs.sh index be12fb63506..cdc21bf2dcb 100755 --- a/t/t1409-avoid-packing-refs.sh +++ b/t/t1409-avoid-packing-refs.sh @@ -4,6 +4,12 @@ test_description='avoid rewriting packed-refs unnecessarily' . ./test-lib.sh +if test_have_prereq !REFFILES +then + skip_all='skipping pack-refs tests; need files backend' + test_done +fi + # Add an identifying mark to the packed-refs file header line. 
This # shouldn't upset readers, and it should be omitted if the file is # ever rewritten. diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index 5071ac63a5b..6bdd430dfe3 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -8,6 +8,12 @@ test_description='git fsck random collection of tests . ./test-lib.sh +if test_have_prereq !REFFILES +then + skip_all='skipping tests; incompatible with reftable' + test_done +fi + test_expect_success setup ' git config gc.auto 0 && git config i18n.commitencoding ISO-8859-1 && diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh index 577f32dc71f..e523c3dd624 100755 --- a/t/t3210-pack-refs.sh +++ b/t/t3210-pack-refs.sh @@ -14,6 +14,12 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME . ./test-lib.sh +if test_have_prereq !REFFILES +then + skip_all='skipping pack-refs tests; requires files ref backend' + test_done +fi + test_expect_success 'enable reflogs' ' git config core.logallrefupdates true ' diff --git a/t/test-lib.sh b/t/test-lib.sh index 9e268605449..c9b06f931b4 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1509,7 +1509,12 @@ parisc* | hppa*) ;; esac -test_set_prereq REFFILES +if test -n "$GIT_TEST_REFTABLE" +then + test_set_prereq !REFFILES +else + test_set_prereq REFFILES +fi ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1 test -z "$NO_PERL" && test_set_prereq PERL From patchwork Tue Jul 20 17:04:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?SZEDER_G=C3=A1bor?= X-Patchwork-Id: 12388885 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no 
version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ECFEEC636C8 for ; Tue, 20 Jul 2021 17:09:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D5BF4610CC for ; Tue, 20 Jul 2021 17:09:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230079AbhGTQ1e (ORCPT ); Tue, 20 Jul 2021 12:27:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49692 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232711AbhGTQY0 (ORCPT ); Tue, 20 Jul 2021 12:24:26 -0400 Received: from mail-wr1-x436.google.com (mail-wr1-x436.google.com [IPv6:2a00:1450:4864:20::436]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 62B8CC061574 for ; Tue, 20 Jul 2021 10:05:03 -0700 (PDT) Received: by mail-wr1-x436.google.com with SMTP id m2so26813172wrq.2 for ; Tue, 20 Jul 2021 10:05:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=SQqi9xjGISpoDsU6k7XeqVuwZGxStXX5SNt3T0ckhZg=; b=TpxdjKjQgmJDM15L+v7/dnHcKMOsineD+Xf4Z+wPseD7wAICTZfCbyzX8mIVmF8OiK pMml11kNFwcOGUrssCWT7Fm4013/R26ulJhU4Ind2lUhsJssbOPbYfBVDwqlfYTzUsaa hxnG/Ml7JE8cDUE4cJBYISxcyoauPxqcXhV9ghGEjuAVw3Vu+G0JSbb2yfb6+d+m2m9h xvcCXv2Kfj/zXZ5h90DT0G68s7fDib29KZWOt70wO5y8tQocMo6SmJsTWbHOu9s32XGY EY/FTtZ6mW0G1I6wvET/XWvGvxmeFovT3eIkQFEbnwxd2hzJNZzWdxyZbs+05LYocNRK nvZg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=SQqi9xjGISpoDsU6k7XeqVuwZGxStXX5SNt3T0ckhZg=; b=X6cWTLkgt/zH4VRklD05IpeiPfB3+MdI2P+G3OTmdJa3YYVY5DkCEkPLEthkqEw9Bl bHe+GaMKilv1noP8H9l6gseKFdO1+PeYQ0oQXn9ysY+37hgPIyBBhKcEw4qHaJcsdH/z 
xbaHI2vgV0N+PhW+9kfAjgS5gpor8zhmTffX45ib4dOnY2/nscAPJJmL9i9UkxgvdBgl 5dIrK8o8WEmuxJ0uXPTUBOMb7PLWHWPd0FT/pC0M4BYCFGn06/Vq8VCduxT7Fjr0dlzY ZfHW+vQO4Z2yiUgquvwVhMK8k6Zr+BaA3mCrl246641I8tan7V1FIfTpEcM/nbfCkxfm VI0A== X-Gm-Message-State: AOAM533/RE9GkjgDPeMCM1OJtR8W5atcHT2xtCYQfwpTycjr6AgM0CLD tnskcNRT//vstr6LjrAfiA2yO+4auWk= X-Google-Smtp-Source: ABdhPJy1zuGyibPzsv5Is2eAngsu5ZaycftgHbK6JQOEi9Cjph868OtuX8MwMqjpZUBw2br5fogJWQ== X-Received: by 2002:a5d:5645:: with SMTP id j5mr17969768wrw.426.1626800702055; Tue, 20 Jul 2021 10:05:02 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 19sm20275724wmu.17.2021.07.20.10.05.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 10:05:01 -0700 (PDT) Message-Id: <7541a4b8d6dbdffbad06779e7c295a8e730ed9af.1626800687.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 17:04:41 +0000 Subject: [PATCH 21/26] git-prompt: prepare for reftable refs backend MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: Han-Wen Nienhuys , =?utf-8?q?SZEDER_G=C3=A1bor?= Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: =?utf-8?q?SZEDER_G=C3=A1bor?= From: =?UTF-8?q?SZEDER=20G=C3=A1bor?= In our git-prompt script we strive to use Bash builtins wherever possible, because fork()-ing subshells for command substitutions and fork()+exec()-ing Git commands are expensive on some platforms. We even read and parse '.git/HEAD' using Bash builtins to get the name of the current branch [1]. However, the upcoming reftable refs backend won't use '.git/HEAD' at all, but will write an invalid refname as placeholder for backwards compatibility instead, which will break our git-prompt script. Update the git-prompt script to recognize the placeholder '.git/HEAD' written by the reftable backend (its content is specified in the reftable specs), and then fall back to use 'git symbolic-ref' to get the name of the current branch. 
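The fallback described above can be sketched as a standalone shell function. This is an illustration only, not the code from this patch: the function name current_branch_from_head is hypothetical, and the placeholder value "refs/heads/.invalid" is taken from the check in the diff below rather than quoted from the reftable specs. It prints the branch name, or nothing when HEAD is detached.

```shell
# Hypothetical helper (not part of git-prompt.sh): print the current
# branch for the git directory given as $1, or nothing if HEAD is detached.
current_branch_from_head () {
    local gitdir head branch
    gitdir=$1
    head=
    test -r "$gitdir/HEAD" || return 1
    # Read the first line of HEAD with a shell builtin; no fork needed.
    # "read" fails on a missing trailing newline, so accept a non-empty value.
    read -r head <"$gitdir/HEAD" || test -n "$head" || return 1
    branch=${head#ref: }
    if test "$head" = "$branch"
    then
        # No "ref: " prefix: HEAD holds an object ID (detached HEAD).
        return 0
    elif test "$branch" = "refs/heads/.invalid"
    then
        # Placeholder assumed to come from the reftable backend: only
        # in this case pay for running a Git command.
        branch=$(git --git-dir="$gitdir" symbolic-ref HEAD 2>/dev/null) ||
            return 0
    fi
    printf '%s\n' "$branch"
}
```

With the files backend the function never spawns a process, so the builtin-only fast path is kept. Only repositories whose '.git/HEAD' holds the placeholder take the slower 'git symbolic-ref' route.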
[1] 3a43c4b5bd (bash prompt: use bash builtins to find out current branch, 2011-03-31) Signed-off-by: SZEDER Gábor --- contrib/completion/git-prompt.sh | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh index db7c0068fb5..4177274bea4 100644 --- a/contrib/completion/git-prompt.sh +++ b/contrib/completion/git-prompt.sh @@ -478,10 +478,15 @@ __git_ps1 () if ! __git_eread "$g/HEAD" head; then return $exit fi - # is it a symbolic ref? b="${head#ref: }" if [ "$head" = "$b" ]; then detached=yes + elif [ "$b" = "refs/heads/.invalid" ]; then + # Reftable + b="$(git symbolic-ref HEAD 2>/dev/null)" || + detached=yes + fi + if [ "$detached" = yes ]; then b="$( case "${GIT_PS1_DESCRIBE_STYLE-}" in (contains) From patchwork Tue Jul 20 17:04:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12388883 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0750DC636C9 for ; Tue, 20 Jul 2021 17:09:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E24F16113B for ; Tue, 20 Jul 2021 17:09:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232215AbhGTQ1s (ORCPT ); Tue, 20 Jul 2021 12:27:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49736 "EHLO lindbergh.monkeyblade.net" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232695AbhGTQY0 (ORCPT ); Tue, 20 Jul 2021 12:24:26 -0400 Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com [IPv6:2a00:1450:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ED676C061767 for ; Tue, 20 Jul 2021 10:05:03 -0700 (PDT) Received: by mail-wr1-x434.google.com with SMTP id r11so26758907wro.9 for ; Tue, 20 Jul 2021 10:05:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=OsQSLvt2u3LVkRXrYhUjj9EbYAyMPEEngwbUfGEcNKY=; b=WbT5qXvenQV2oL05syz4DDviooyL/zNH1LjP+QVeO3Og8FK7HQEuVxSqBt5rrKF1og fEnzGLlMiXiBIGz54u2X+QCVip9ilugGRhvtGhNs/p0KNkET0wIgCTlJOyhxfbmXUXTW gEAXA0r7QU7/JUMb6JC2jwK8ZSxh+xyaIPjKFJNvO14KKAE2JFdVPiL/xRD2PS2s6gFb lvpjmPpLl1RxhFG+bL9OWqkgQIOYvdSH3Km5zbnco1zhIdjgOYXSH+47uFuUCk0bhBj6 8weSPUvT3K9zPdBks4fTnR6W2y3z6BplEZa1FFcwkXDsYotTE3ytvaWOx+xSCLRX19q7 90Cw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=OsQSLvt2u3LVkRXrYhUjj9EbYAyMPEEngwbUfGEcNKY=; b=e+HeH7hxCLQcLp9sYRtp49C7MMEBOo2RNAj91aMBMroQ5DIIzHEQKKKEH70DcUofxO guYQLR2wuNYo6dqW8Zj1iNFklRU0RhpAZs0mjqnvz+UAYP0C/5GVT4iPdimM04kvKMJM otfx2pS9bdW/vVkC27951xJzsVEhDGI/id16c5yI3NHoZM50VgzmSNbHqruo06UAxSJF FXLc4livfNO2T5kwFRWGB/qnXNvlq30BsSLlM8ny8YDEwHPLs4Zu48VEReqrB2lHXU9S R3ys0eU5VIHUgdIodc0vMbs286+fkvk9wpawF/4zAlip2hL4+RS1iijNO9Lz/jpRiJPh ojpQ== X-Gm-Message-State: AOAM533VG3FCp7/IIgK/YVhX406H+cftyXGXF9O3tLKwlnv2/uidINKv gQCqs87cLpuy6k/rC00pNlBlTESfNRk= X-Google-Smtp-Source: ABdhPJwDcCK7q3oyT7bVHGxrmI44keo6n/tHSXWMPZGrKDNon4CHEJISK36sicslCZArVBEfmHJ5Pg== X-Received: by 2002:a5d:568a:: with SMTP id f10mr18150526wrv.293.1626800702588; Tue, 20 Jul 2021 10:05:02 -0700 (PDT) Received: from [127.0.0.1] 
Message-Id: <3c9c3a2d56df2ef525072ede5cb420b2690bb979.1626800687.git.gitgitgadget@gmail.com>
Date: Tue, 20 Jul 2021 17:04:42 +0000
Subject: [PATCH 22/26] Add "test-tool dump-reftable" command.
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys, Han-Wen Nienhuys
From: Han-Wen Nienhuys

From: Han-Wen Nienhuys

This command dumps individual tables or a stack of tables.

Signed-off-by: Han-Wen Nienhuys
---
 Makefile                 | 1 +
 t/helper/test-reftable.c | 5 +++++
 t/helper/test-tool.c     | 1 +
 t/helper/test-tool.h     | 1 +
 t/t0031-reftable.sh      | 6 ++++++
 5 files changed, 14 insertions(+)

diff --git a/Makefile b/Makefile
index 19566c661f1..6014f74a1b8 100644
--- a/Makefile
+++ b/Makefile
@@ -2467,6 +2467,7 @@ REFTABLE_OBJS += reftable/writer.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/dump.o
 REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/pq_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c
index 996da85f7b5..26b03d7b789 100644
--- a/t/helper/test-reftable.c
+++ b/t/helper/test-reftable.c
@@ -14,3 +14,8 @@ int cmd__reftable(int argc, const char **argv)
 	tree_test_main(argc, argv);
 	return 0;
 }
+
+int cmd__dump_reftable(int argc, const char **argv)
+{
+	return reftable_dump_main(argc, (char *const *)argv);
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 01201629fca..ed543037bb8 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -59,6 +59,7 @@ static struct test_cmd cmds[] = {
 	{ "read-midx", cmd__read_midx },
 	{ "ref-store", cmd__ref_store },
 	{ "reftable", cmd__reftable },
+	{ "dump-reftable", cmd__dump_reftable },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
 	{ "revision-walking", cmd__revision_walking },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index cb90b7f4f7b..284cfe70d94 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -19,6 +19,7 @@ int cmd__dump_cache_tree(int argc, const char **argv);
 int cmd__dump_fsmonitor(int argc, const char **argv);
 int cmd__dump_split_index(int argc, const char **argv);
 int cmd__dump_untracked_cache(int argc, const char **argv);
+int cmd__dump_reftable(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__fast_rebase(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
diff --git a/t/t0031-reftable.sh b/t/t0031-reftable.sh
index 7899a1c580e..f024968ed66 100755
--- a/t/t0031-reftable.sh
+++ b/t/t0031-reftable.sh
@@ -288,4 +288,10 @@ test_expect_success 'FETCH_HEAD' '
 	test_cmp expect actual
 '
 
+test_expect_success 'dump reftable' '
+	initialize &&
+	hash_id=$(git config extensions.objectformat) &&
+	test-tool dump-reftable $(test "${hash_id}" = "sha256" && echo "-6") -s .git/reftable
+'
+
 test_done

From patchwork Tue Jul 20 17:04:43 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Han-Wen Nienhuys
X-Patchwork-Id: 12388877
Message-Id: <73eece0caacb7519edf338f040f08d381456df88.1626800687.git.gitgitgadget@gmail.com>
Date: Tue, 20 Jul 2021 17:04:43 +0000
Subject: [PATCH 23/26] t1301: document what needs to be done for reftable
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys, Han-Wen Nienhuys
From: Han-Wen Nienhuys

From: Han-Wen Nienhuys

Signed-off-by: Han-Wen Nienhuys
---
 t/t1301-shared-repo.sh | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/t/t1301-shared-repo.sh b/t/t1301-shared-repo.sh
index 84bf1970d8b..a5755b4a434 100755
--- a/t/t1301-shared-repo.sh
+++ b/t/t1301-shared-repo.sh
@@ -22,9 +22,10 @@ test_expect_success 'shared = 0400 (faulty permission u-w)' '
 	)
 '
 
+# TODO(hanwen): for REFTABLE should inspect group-readable of .git/reftable/
 for u in 002 022
 do
-	test_expect_success POSIXPERM "shared=1 does not clear bits preset by umask $u" '
+	test_expect_success REFFILES,POSIXPERM "shared=1 does not clear bits preset by umask $u" '
 		mkdir sub && (
 			cd sub &&
 			umask $u &&
@@ -114,7 +115,8 @@ test_expect_success POSIXPERM 'info/refs respects umask in unshared repo' '
 	test_cmp expect actual
 '
 
-test_expect_success POSIXPERM 'git reflog expire honors core.sharedRepository' '
+# For reftable, the check on .git/reftable/ is sufficient.
+test_expect_success REFFILES,POSIXPERM 'git reflog expire honors core.sharedRepository' '
 	umask 077 &&
 	git config core.sharedRepository group &&
 	git reflog expire --all &&
@@ -201,7 +203,7 @@ test_expect_success POSIXPERM 're-init respects core.sharedrepository (remote)'
 	test_cmp expect actual
 '
 
-test_expect_success POSIXPERM 'template can set core.sharedrepository' '
+test_expect_success REFFILES,POSIXPERM 'template can set core.sharedrepository' '
 	rm -rf child.git &&
 	umask 0022 &&
 	git config core.sharedrepository 0666 &&

From patchwork Tue Jul 20 17:04:44 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Han-Wen Nienhuys
X-Patchwork-Id: 12388875
Date: Tue, 20 Jul 2021 17:04:44 +0000
Subject: [PATCH 24/26] t1401,t2011: parameterize HEAD.lock for REFFILES
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys, Han-Wen Nienhuys
From: Han-Wen Nienhuys

From: Han-Wen Nienhuys

Signed-off-by: Han-Wen Nienhuys
---
 t/t1401-symbolic-ref.sh          | 11 +++++++++--
 t/t2011-checkout-invalid-head.sh | 11 +++++++++--
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/t/t1401-symbolic-ref.sh b/t/t1401-symbolic-ref.sh
index 132a1b885ac..1b51013aded 100755
--- a/t/t1401-symbolic-ref.sh
+++ b/t/t1401-symbolic-ref.sh
@@ -102,9 +102,16 @@ test_expect_success LONG_REF 'we can parse long symbolic ref' '
 	test_cmp expect actual
 '
 
+if test_have_prereq REFFILES
+then
+	HEAD_LOCK=HEAD.lock
+else
+	HEAD_LOCK=reftable/tables.list.lock
+fi
+
 test_expect_success 'symbolic-ref reports failure in exit code' '
-	test_when_finished "rm -f .git/HEAD.lock" &&
-	>.git/HEAD.lock &&
+	test_when_finished "rm -f .git/$HEAD_LOCK" &&
+	>.git/$HEAD_LOCK &&
 	test_must_fail git symbolic-ref HEAD refs/heads/whatever
 '
 
diff --git a/t/t2011-checkout-invalid-head.sh b/t/t2011-checkout-invalid-head.sh
index e52022e1522..a56f7af442c 100755
--- a/t/t2011-checkout-invalid-head.sh
+++ b/t/t2011-checkout-invalid-head.sh
@@ -22,9 +22,16 @@ test_expect_success 'checkout main from invalid HEAD' '
 	git checkout main --
 '
 
+if test_have_prereq REFFILES
+then
+	HEAD_LOCK=HEAD.lock
+else
+	HEAD_LOCK=reftable/tables.list.lock
+fi
+
 test_expect_success 'checkout notices failure to lock HEAD' '
-	test_when_finished "rm -f .git/HEAD.lock" &&
-	>.git/HEAD.lock &&
+	test_when_finished "rm -f .git/$HEAD_LOCK" &&
+	>.git/$HEAD_LOCK &&
 	test_must_fail git checkout -b other
 '
 

From patchwork Tue Jul 20 17:04:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Han-Wen Nienhuys
X-Patchwork-Id: 12388881
Date: Tue, 20 Jul 2021 17:04:45 +0000
Subject: [PATCH 25/26] t1404: annotate test cases with REFFILES
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys, Han-Wen Nienhuys
From: Han-Wen Nienhuys

From: Han-Wen Nienhuys

* Reftable for now lacks detailed error messages for directory/file conflicts.
  Skip message comparisons.

* Mark tests that muck with .git directly as REFFILES.

Signed-off-by: Han-Wen Nienhuys
---
 t/t1404-update-ref-errors.sh | 56 +++++++++++++++++++++++++++---------
 1 file changed, 42 insertions(+), 14 deletions(-)

diff --git a/t/t1404-update-ref-errors.sh b/t/t1404-update-ref-errors.sh
index b729c1f4803..811d5bb56d4 100755
--- a/t/t1404-update-ref-errors.sh
+++ b/t/t1404-update-ref-errors.sh
@@ -27,7 +27,9 @@ test_update_rejected () {
 	fi &&
 	printf "create $prefix/%s $C\n" $create >input &&
 	test_must_fail git update-ref --stdin <input 2>output.err &&
-	test_i18ngrep -F "$error" output.err &&
+	if test_have_prereq REFFILES ; then
+		test_i18ngrep -F "$error" output.err
+	fi &&
 	git for-each-ref $prefix >actual &&
 	test_cmp unchanged actual
 }
@@ -101,7 +103,9 @@ df_test() {
 		printf "%s\n" "delete $delname" "create $addname $D"
 	fi >commands &&
 	test_must_fail git update-ref --stdin <commands 2>output.err &&
-	test_cmp expected-err output.err &&
+	if test_have_prereq REFFILES ; then
+		test_cmp expected-err output.err
+	fi &&
 	printf "%s\n" "$C $delref" >expected-refs &&
 	git for-each-ref --format="%(objectname) %(refname)" $prefix/r >actual-refs &&
 	test_cmp expected-refs actual-refs
@@ -336,7 +340,9 @@ test_expect_success 'missing old value blocks update' '
 	EOF
 	printf "%s\n" "update $prefix/foo $E $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'incorrect old value blocks update' '
@@ -347,7 +353,9 @@ test_expect_success 'incorrect old value blocks update' '
 	EOF
 	printf "%s\n" "update $prefix/foo $E $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'existing old value blocks create' '
@@ -358,7 +366,9 @@ test_expect_success 'existing old value blocks create' '
 	EOF
 	printf "%s\n" "create $prefix/foo $E" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'incorrect old value blocks delete' '
@@ -369,7 +379,9 @@ test_expect_success 'incorrect old value blocks delete' '
 	EOF
 	printf "%s\n" "delete $prefix/foo $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'missing old value blocks indirect update' '
@@ -380,7 +392,9 @@ test_expect_success 'missing old value blocks indirect update' '
 	EOF
 	printf "%s\n" "update $prefix/symref $E $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'incorrect old value blocks indirect update' '
@@ -392,7 +406,9 @@ test_expect_success 'incorrect old value blocks indirect update' '
 	EOF
 	printf "%s\n" "update $prefix/symref $E $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'existing old value blocks indirect create' '
@@ -404,7 +420,9 @@ test_expect_success 'existing old value blocks indirect create' '
 	EOF
 	printf "%s\n" "create $prefix/symref $E" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'incorrect old value blocks indirect delete' '
@@ -416,7 +434,9 @@ test_expect_success 'incorrect old value blocks indirect delete' '
 	EOF
 	printf "%s\n" "delete $prefix/symref $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'missing old value blocks indirect no-deref update' '
@@ -427,7 +447,9 @@ test_expect_success 'missing old value blocks indirect no-deref update' '
 	EOF
 	printf "%s\n" "option no-deref" "update $prefix/symref $E $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'incorrect old value blocks indirect no-deref update' '
@@ -439,7 +461,9 @@ test_expect_success 'incorrect old value blocks indirect no-deref update' '
 	EOF
 	printf "%s\n" "option no-deref" "update $prefix/symref $E $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'existing old value blocks indirect no-deref create' '
@@ -451,7 +475,9 @@ test_expect_success 'existing old value blocks indirect no-deref create' '
 	EOF
 	printf "%s\n" "option no-deref" "create $prefix/symref $E" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success 'incorrect old value blocks indirect no-deref delete' '
@@ -463,7 +489,9 @@ test_expect_success 'incorrect old value blocks indirect no-deref delete' '
 	EOF
 	printf "%s\n" "option no-deref" "delete $prefix/symref $D" |
 	test_must_fail git update-ref --stdin 2>output.err &&
-	test_cmp expected output.err
+	if test_have_prereq REFFILES ; then
+		test_cmp expected output.err
+	fi
 '
 
 test_expect_success REFFILES 'non-empty directory blocks create' '

From patchwork Tue Jul 20 17:04:46 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Han-Wen Nienhuys
X-Patchwork-Id: 12388879
Message-Id: <4a5891fa8cc9514535d70b52a4fce2b21991bfef.1626800687.git.gitgitgadget@gmail.com>
Date: Tue, 20 Jul 2021 17:04:46 +0000
Subject: [PATCH 26/26] t7004: avoid direct filesystem access
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys, Han-Wen Nienhuys
From: Han-Wen Nienhuys

From: Han-Wen Nienhuys

Signed-off-by: Han-Wen Nienhuys
---
 t/t7004-tag.sh | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index 2f72c5c6883..8bd84b0e404 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -97,7 +97,8 @@ test_expect_success 'creating a tag with --create-reflog should create reflog' '
 	test_when_finished "git tag -d tag_with_reflog" &&
 	git tag --create-reflog tag_with_reflog &&
 	git reflog exists refs/tags/tag_with_reflog &&
-	sed -e "s/^.*	//" .git/logs/refs/tags/tag_with_reflog >actual &&
+	git reflog --format="format:tag: tagging %h (%s, %cd)%n" \
+		--date=format:%Y-%m-%d refs/tags/tag_with_reflog >actual &&
 	test_cmp expected actual
 '
 
@@ -108,7 +109,9 @@ test_expect_success 'annotated tag with --create-reflog has correct message' '
 	test_when_finished "git tag -d tag_with_reflog" &&
 	git tag -m "annotated tag" --create-reflog tag_with_reflog &&
 	git reflog exists refs/tags/tag_with_reflog &&
-	sed -e "s/^.*	//" .git/logs/refs/tags/tag_with_reflog >actual &&
+	git reflog \
+		--format="format:tag: tagging %h (%s, %cd)%n" \
+		--date=format:%Y-%m-%d >actual &&
 	test_cmp expected actual
 '
 