From patchwork Wed Dec 9 14:00:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 11961561 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 380E3C4167B for ; Wed, 9 Dec 2020 14:02:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 07A3423B51 for ; Wed, 9 Dec 2020 14:02:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732679AbgLIOBf (ORCPT ); Wed, 9 Dec 2020 09:01:35 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60282 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728541AbgLIOBO (ORCPT ); Wed, 9 Dec 2020 09:01:14 -0500 Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F13DFC0613D6 for ; Wed, 9 Dec 2020 06:00:33 -0800 (PST) Received: by mail-wm1-x32c.google.com with SMTP id v14so1572731wml.1 for ; Wed, 09 Dec 2020 06:00:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=vo6M8ti1rMOOBfY3vJMb/+0k6SEpCQ9PI6s3OSwMbqo=; b=kR/8qolaJxtVaO6D8VMSeFm6E99kTsbDAULOfMhPd+ir1zoLXDrJcL7sfYXmBjLbqT l6wBtu76aKMp+nL+URUobBAtIiHv38q75WWMNqL5UfBrohPKKZ5ugp57Tpbsk1Sz1x6d 
d1hhPbV1alYNFBzdD/jPNEtQ/SE6Xmjx+izk20HMPk2WW9PE/IWWWCtPiF4VlChdaTLJ wf3PhVOvRDRHp3IYXDW9756fSF3jgqggZnGafqwM5IdT7No4sFqQQH2BnTosgqchhB2G Dk1p+azeIHUmkDZtLtChXe8COwNr7zzkl/lR7oX+UDTzNLzmr7dT4GGmHaF551t+cN5T lzlQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=vo6M8ti1rMOOBfY3vJMb/+0k6SEpCQ9PI6s3OSwMbqo=; b=myLwWh6aJeP+DjVYN4sq7TcBhk3JdmIIUhfALIjt074tCjuvlJJbfiJ/Oiig3TlwkS eYuZj48f/lWtLsilN1cawq4BXqLKR3B8vnAz2nMAOzkxnbRo/A0yS7jIMoJEmBuX0Zau 3WTTjmmpURrLkWSvWbrgYJTqlIuS0EDwoZTis9tbn27ALPBRO+uNH6sU4/oZ+/Xo1oAv U2WeFQVF7Ay24ex8SxIRyPP13VgTzQyu72g1sqBsGswpK6PoiMYdCVTSZe2kipSI4PDG pB/rNC/h+KDsGJ/Ld0lnSw4sIMjNZboKVtgFD4KOjlkFhBwaLpItT9UVB6fIWZC8vZz8 f8Bw== X-Gm-Message-State: AOAM530azSDrtwiIdE0WC7afy+hrCnfawEjSw0fx2ofAQFA1fXmwcMe1 5lp5gsNKAAoKwhW0b0G3y/n6/mcNV+Y= X-Google-Smtp-Source: ABdhPJy+gw7ze0Z+dhPafDTlhdVjCr3hfvBUEn8zZ1GjHehqZkJr/fEhCqhrp7ObUZZ5JK2T16QWWg== X-Received: by 2002:a1c:5447:: with SMTP id p7mr2953578wmi.116.1607522432498; Wed, 09 Dec 2020 06:00:32 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id d3sm3626935wrr.2.2020.12.09.06.00.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Dec 2020 06:00:32 -0800 (PST) Message-Id: <40ac041d0efef7a7baec56354265176235137444.1607522429.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 09 Dec 2020 14:00:15 +0000 Subject: [PATCH v4 01/15] init-db: set the_repository->hash_algo early on Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Jeff King , Ramsay Jones , Jonathan Nieder , Johannes Schindelin , Jonathan Tan , Josh Steadmon , Emily Shaffer , Patrick Steinhardt , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Felipe Contreras , Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys 
From: Han-Wen Nienhuys

The reftable backend needs to know the hash algorithm for writing the
initialization hash table.

The initial reftable contains a symref HEAD => "main" (or "master"), which is
agnostic to the size of the hash value, but this is an exceptional
circumstance, and the reftable library does not cater to this exception. It
insists that all tables in the stack have a consistent format ID for the hash
algorithm.

Call repo_set_hash_algo() directly after calling validate_hash_algorithm()
(which reads $GIT_DEFAULT_HASH).

Helped-by: Junio C Hamano
Signed-off-by: Han-Wen Nienhuys
---
 builtin/init-db.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/builtin/init-db.c b/builtin/init-db.c
index 01bc648d416..dcb7015db48 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -437,6 +437,27 @@ int init_db(const char *git_dir, const char *real_git_dir,
 
 	validate_hash_algorithm(&repo_fmt, hash);
 
+	/*
+	 * At this point, the_repository we have in-core does not look
+	 * anything like one that we would see initialized in an already
+	 * working repository after calling setup_git_directory().
+	 *
+	 * Calling repository.c::initialize_the_repository() may have
+	 * prepared the .index .objects and .parsed_objects members, but
+	 * other members like .gitdir, .commondir, etc. have not been
+	 * initialized.
+	 *
+	 * Many API functions assume they are working with the_repository
+	 * that has sensibly been initialized, but because we haven't
+	 * really read from an existing repository, we need to hand-craft
+	 * the necessary members of the structure to get out of this
+	 * chicken-and-egg situation.
+	 *
+	 * For now, we update the hash algorithm member to what the
+	 * validate_hash_algorithm() call decided for us.
+	 */
+	repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+
 	reinit = create_default_files(template_dir, original_git_dir,
 				      initial_branch, &repo_fmt);
 	if (reinit && initial_branch)

From patchwork Wed Dec 9 14:00:16 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Han-Wen Nienhuys
X-Patchwork-Id: 11961559
Date: Wed, 09 Dec 2020 14:00:16 +0000
Subject: [PATCH v4 02/15] reftable: add LICENSE
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
 Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
 Patrick Steinhardt, Ævar Arnfjörð Bjarmason, Felipe Contreras,
 Han-Wen Nienhuys, Han-Wen Nienhuys
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Han-Wen Nienhuys

From: Han-Wen Nienhuys

TODO: relicense?

Signed-off-by: Han-Wen Nienhuys
---
 reftable/LICENSE | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 reftable/LICENSE

diff --git a/reftable/LICENSE b/reftable/LICENSE
new file mode 100644
index 00000000000..402e0f9356b
--- /dev/null
+++ b/reftable/LICENSE
@@ -0,0 +1,31 @@
+BSD License
+
+Copyright (c) 2020, Google LLC
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+* Redistributions of source code must retain the above copyright notice,
+this list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+* Neither the name of Google LLC nor the names of its contributors may
+be used to endorse or promote products derived from this software
+without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
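
A note on the error convention introduced by the next patch in this series
(03/15): reftable calls return 0 on success and a negative code on failure,
and a higher layer may decide that some codes are not fatal. The following is
a minimal sketch of that pattern, not code from the series. Every demo_
identifier is hypothetical and only mirrors a few values from
reftable-error.h so that the snippet compiles on its own.

```c
#include <stdio.h>

/* Hypothetical mirror of a few codes from reftable-error.h (patch 03/15).
 * The demo_ names are illustrative and are not part of the series. */
enum demo_error {
	DEMO_IO_ERROR = -2,
	DEMO_NOT_EXIST_ERROR = -4,
	DEMO_EMPTY_TABLE_ERROR = -8,
};

/* Same shape as reftable_error_str(): known codes map to string literals,
 * anything else is formatted into a static buffer (so not thread-safe). */
static const char *demo_error_str(int err)
{
	static char buf[64];

	switch (err) {
	case DEMO_IO_ERROR:
		return "I/O error";
	case DEMO_NOT_EXIST_ERROR:
		return "file does not exist";
	case DEMO_EMPTY_TABLE_ERROR:
		return "wrote empty table";
	default:
		snprintf(buf, sizeof(buf), "unknown error code %d", err);
		return buf;
	}
}

/* Hypothetical lower-level step: 0 on success, negative code otherwise. */
static int demo_write_table(int nrecords)
{
	return nrecords > 0 ? 0 : DEMO_EMPTY_TABLE_ERROR;
}

/* Hypothetical higher-level caller: an empty table is not a failure here,
 * it only means there is nothing to commit, so the code is absorbed. */
static int demo_commit(int nrecords)
{
	int err = demo_write_table(nrecords);

	if (err == DEMO_EMPTY_TABLE_ERROR)
		return 0;
	return err;
}
```

In this sketch demo_commit(0) returns 0 because the caller absorbs the
empty-table code, while demo_error_str() falls back to a formatted message for
codes it does not know. The fallback string lives in a static buffer, so it is
only valid until the next call.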
From patchwork Wed Dec 9 14:00:17 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Han-Wen Nienhuys
X-Patchwork-Id: 11961555
Message-Id: <798f680b800b6ef23c220fef88f82b7064b796ff.1607522429.git.gitgitgadget@gmail.com>
Date: Wed, 09 Dec 2020 14:00:17 +0000
Subject: [PATCH v4 03/15] reftable: add error related functionality
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
 Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
 Patrick Steinhardt, Ævar Arnfjörð Bjarmason, Felipe Contreras,
 Han-Wen Nienhuys, Han-Wen Nienhuys
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Han-Wen Nienhuys

From: Han-Wen Nienhuys

The reftable/ directory is structured as a library, so it cannot crash on
misuse. Instead, it returns error codes.

In addition, the error code can be used to signal conditions from lower levels
of the library to be handled by higher levels of the library. For example, a
transaction might legitimately write an empty reftable file, but in that case,
we'd want to shortcut the transaction overhead.

Signed-off-by: Han-Wen Nienhuys
---
 reftable/error.c          | 41 ++++++++++++++++++++++++++
 reftable/reftable-error.h | 62 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+)
 create mode 100644 reftable/error.c
 create mode 100644 reftable/reftable-error.h

diff --git a/reftable/error.c b/reftable/error.c
new file mode 100644
index 00000000000..f6f16def921
--- /dev/null
+++ b/reftable/error.c
@@ -0,0 +1,41 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "reftable-error.h"
+
+#include <stdio.h>
+
+const char *reftable_error_str(int err)
+{
+	static char buf[250];
+	switch (err) {
+	case REFTABLE_IO_ERROR:
+		return "I/O error";
+	case REFTABLE_FORMAT_ERROR:
+		return "corrupt reftable file";
+	case REFTABLE_NOT_EXIST_ERROR:
+		return "file does not exist";
+	case REFTABLE_LOCK_ERROR:
+		return "data is outdated";
+	case REFTABLE_API_ERROR:
+		return "misuse of the reftable API";
+	case REFTABLE_ZLIB_ERROR:
+		return "zlib failure";
+	case REFTABLE_NAME_CONFLICT:
+		return "file/directory conflict";
+	case REFTABLE_EMPTY_TABLE_ERROR:
+		return "wrote empty table";
+	case REFTABLE_REFNAME_ERROR:
+		return "invalid refname";
+	case -1:
+		return "general error";
+	default:
+		snprintf(buf, sizeof(buf), "unknown error code %d", err);
+		return buf;
+	}
+}
diff --git a/reftable/reftable-error.h b/reftable/reftable-error.h
new file mode 100644
index 00000000000..6f89bedf1a5
--- /dev/null
+++ b/reftable/reftable-error.h
@@ -0,0 +1,62 @@
+/*
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#ifndef REFTABLE_ERROR_H
+#define REFTABLE_ERROR_H
+
+/*
+ * Errors in reftable calls are signaled with negative integer return values. 0
+ * means success.
+ */
+enum reftable_error {
+	/* Unexpected file system behavior */
+	REFTABLE_IO_ERROR = -2,
+
+	/* Format inconsistency on reading data */
+	REFTABLE_FORMAT_ERROR = -3,
+
+	/* File does not exist. Returned from block_source_from_file(), because
+	 * it needs special handling in stack.
+	 */
+	REFTABLE_NOT_EXIST_ERROR = -4,
+
+	/* Trying to write out-of-date data. */
+	REFTABLE_LOCK_ERROR = -5,
+
+	/* Misuse of the API:
+	 *  - on writing a record with NULL refname.
+	 *  - on writing a reftable_ref_record outside the table limits
+	 *  - on writing a ref or log record before the stack's
+	 *    next_update_index
+	 *  - on writing a log record with multiline message with
+	 *    exact_log_message unset
+	 *  - on reading a reftable_ref_record from log iterator, or vice versa.
+	 *
+	 * When a call misuses the API, the internal state of the library is
+	 * kept unchanged.
+	 */
+	REFTABLE_API_ERROR = -6,
+
+	/* Decompression error */
+	REFTABLE_ZLIB_ERROR = -7,
+
+	/* Wrote a table without blocks. */
+	REFTABLE_EMPTY_TABLE_ERROR = -8,
+
+	/* Dir/file conflict. */
+	REFTABLE_NAME_CONFLICT = -9,
+
+	/* Invalid ref name. */
+	REFTABLE_REFNAME_ERROR = -10,
+};
+
+/* convert the numeric error code to a string. The string should not be
+ * deallocated. */
+const char *reftable_error_str(int err);
+
+#endif

From patchwork Wed Dec 9 14:00:18 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Han-Wen Nienhuys
X-Patchwork-Id: 11961575
Message-Id: <2f55ff6d2406612c6fea16e0abefc41299642305.1607522429.git.gitgitgadget@gmail.com>
Date: Wed, 09 Dec 2020 14:00:18 +0000
Subject: [PATCH v4 04/15] reftable: utility functions
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder,
 Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer,
 Patrick Steinhardt, Ævar Arnfjörð Bjarmason, Felipe Contreras,
 Han-Wen Nienhuys, Han-Wen Nienhuys
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Han-Wen Nienhuys

From: Han-Wen Nienhuys

This commit provides basic utility functions for the reftable library.

Signed-off-by: Han-Wen Nienhuys
Helped-by: Johannes Schindelin
---
 Makefile                            |  24 +++++-
 contrib/buildsystems/CMakeLists.txt |  14 ++-
 reftable/basics.c                   | 128 ++++++++++++++++++++++++++++
 reftable/basics.h                   |  60 +++++++++++++
 reftable/basics_test.c              |  98 +++++++++++++++++++++
 reftable/publicbasics.c             |  58 +++++++++++++
 reftable/reftable-malloc.h          |  18 ++++
 reftable/reftable-tests.h           |  22 +++++
 reftable/system.h                   |  32 +++++++
 reftable/test_framework.c           |  23 +++++
 reftable/test_framework.h           |  53 ++++++++++++
 t/helper/test-reftable.c            |   9 ++
 t/helper/test-tool.c                |   3 +-
 t/helper/test-tool.h                |   1 +
 t/t0032-reftable-unittest.sh        |  15 ++++
 15 files changed, 552 insertions(+), 6 deletions(-)
 create mode 100644 reftable/basics.c
 create mode 100644 reftable/basics.h
 create mode 100644 reftable/basics_test.c
 create mode 100644 reftable/publicbasics.c
 create mode 100644 reftable/reftable-malloc.h
 create mode 100644 reftable/reftable-tests.h
 create mode 100644 reftable/system.h
 create mode 100644 reftable/test_framework.c
 create mode 100644 reftable/test_framework.h
 create mode 100644 t/helper/test-reftable.c
 create mode 100755 t/t0032-reftable-unittest.sh

diff --git a/Makefile b/Makefile
index 45bce31016b..9b6d84af8f6 100644
--- a/Makefile
+++ b/Makefile
@@ -731,6 +731,7 @@ TEST_BUILTINS_OBJS += test-read-cache.o
 TEST_BUILTINS_OBJS += test-read-graph.o
 TEST_BUILTINS_OBJS += test-read-midx.o
 TEST_BUILTINS_OBJS += test-ref-store.o
+TEST_BUILTINS_OBJS += test-reftable.o
 TEST_BUILTINS_OBJS += test-regex.o
 TEST_BUILTINS_OBJS += test-repository.o
 TEST_BUILTINS_OBJS += test-revision-walking.o
@@ -820,6 +821,8 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+REFTABLE_LIB = reftable/libreftable.a
+REFTABLE_TEST_LIB = reftable/libreftable_test.a
 
 GENERATED_H += command-list.h
 GENERATED_H += config-list.h
@@ -1185,7
+1188,7 @@ THIRD_PARTY_SOURCES += compat/regex/% THIRD_PARTY_SOURCES += sha1collisiondetection/% THIRD_PARTY_SOURCES += sha1dc/% -GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) +GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) EXTLIBS = GIT_USER_AGENT = git/$(GIT_VERSION) @@ -2386,10 +2389,19 @@ XDIFF_OBJS += xdiff/xpatience.o XDIFF_OBJS += xdiff/xprepare.o XDIFF_OBJS += xdiff/xutils.o +REFTABLE_OBJS += reftable/basics.o +REFTABLE_OBJS += reftable/error.o +REFTABLE_OBJS += reftable/publicbasics.o + +REFTABLE_TEST_OBJS += reftable/test_framework.o +REFTABLE_TEST_OBJS += reftable/basics_test.o + TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \ $(XDIFF_OBJS) \ $(FUZZ_OBJS) \ + $(REFTABLE_OBJS) \ + $(REFTABLE_TEST_OBJS) \ common-main.o \ git.o ifndef NO_CURL @@ -2541,6 +2553,12 @@ $(LIB_FILE): $(LIB_OBJS) $(XDIFF_LIB): $(XDIFF_OBJS) $(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^ +$(REFTABLE_LIB): $(REFTABLE_OBJS) + $(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^ + +$(REFTABLE_TEST_LIB): $(REFTABLE_TEST_OBJS) + $(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^ + export DEFAULT_EDITOR DEFAULT_PAGER Documentation/GIT-EXCLUDED-PROGRAMS: FORCE @@ -2821,7 +2839,7 @@ perf: all t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) -t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) +t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) $(REFTABLE_TEST_LIB) $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS) check-sha1:: t/helper/test-tool$X @@ -3150,7 +3168,7 @@ cocciclean: clean: profile-clean coverage-clean cocciclean $(RM) *.res $(RM) $(OBJECTS) - $(RM) $(LIB_FILE) $(XDIFF_LIB) + $(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X $(RM) $(TEST_PROGRAMS) $(RM) $(FUZZ_PROGRAMS) diff --git 
a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index df539a44fa0..f3a2fd35616 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -591,6 +591,12 @@ parse_makefile_for_sources(libxdiff_SOURCES "XDIFF_OBJS") list(TRANSFORM libxdiff_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/") add_library(xdiff STATIC ${libxdiff_SOURCES}) +#reftable +parse_makefile_for_sources(reftable_SOURCES "REFTABLE_OBJS") + +list(TRANSFORM reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/") +add_library(reftable STATIC ${reftable_SOURCES}) + if(WIN32) if(NOT MSVC)#use windres when compiling with gcc and clang add_custom_command(OUTPUT ${CMAKE_BINARY_DIR}/git.res @@ -613,7 +619,7 @@ endif() #link all required libraries to common-main add_library(common-main OBJECT ${CMAKE_SOURCE_DIR}/common-main.c) -target_link_libraries(common-main libgit xdiff ${ZLIB_LIBRARIES}) +target_link_libraries(common-main libgit xdiff reftable ${ZLIB_LIBRARIES}) if(Intl_FOUND) target_link_libraries(common-main ${Intl_LIBRARIES}) endif() @@ -848,11 +854,15 @@ if(BUILD_TESTING) add_executable(test-fake-ssh ${CMAKE_SOURCE_DIR}/t/helper/test-fake-ssh.c) target_link_libraries(test-fake-ssh common-main) +#reftable-tests +parse_makefile_for_sources(test-reftable_SOURCES "REFTABLE_TEST_OBJS") +list(TRANSFORM test-reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/") + #test-tool parse_makefile_for_sources(test-tool_SOURCES "TEST_BUILTINS_OBJS") list(TRANSFORM test-tool_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/t/helper/") -add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES}) +add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES} ${test-reftable_SOURCES}) target_link_libraries(test-tool common-main) set_target_properties(test-fake-ssh test-tool diff --git a/reftable/basics.c b/reftable/basics.c new file mode 100644 index 00000000000..abd027b9888 --- /dev/null +++ b/reftable/basics.c @@ -0,0 +1,128 @@ +/* 
+Copyright 2020 Google LLC
+
+Use of this source code is governed by a BSD-style
+license that can be found in the LICENSE file or at
+https://developers.google.com/open-source/licenses/bsd
+*/
+
+#include "basics.h"
+
+void put_be24(uint8_t *out, uint32_t i)
+{
+	out[0] = (uint8_t)((i >> 16) & 0xff);
+	out[1] = (uint8_t)((i >> 8) & 0xff);
+	out[2] = (uint8_t)(i & 0xff);
+}
+
+uint32_t get_be24(uint8_t *in)
+{
+	return (uint32_t)(in[0]) << 16 | (uint32_t)(in[1]) << 8 |
+	       (uint32_t)(in[2]);
+}
+
+void put_be16(uint8_t *out, uint16_t i)
+{
+	out[0] = (uint8_t)((i >> 8) & 0xff);
+	out[1] = (uint8_t)(i & 0xff);
+}
+
+int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args)
+{
+	size_t lo = 0;
+	size_t hi = sz;
+
+	/* Invariants:
+	 *
+	 *  (hi == sz) || f(hi) == true
+	 *  (lo == 0 && f(0) == true) || f(lo) == false
+	 */
+	while (hi - lo > 1) {
+		size_t mid = lo + (hi - lo) / 2;
+
+		if (f(mid, args))
+			hi = mid;
+		else
+			lo = mid;
+	}
+
+	if (lo)
+		return hi;
+
+	return f(0, args) ?
0 : 1; +} + +void free_names(char **a) +{ + char **p; + if (a == NULL) { + return; + } + for (p = a; *p; p++) { + reftable_free(*p); + } + reftable_free(a); +} + +int names_length(char **names) +{ + char **p = names; + for (; *p; p++) { + /* empty */ + } + return p - names; +} + +void parse_names(char *buf, int size, char ***namesp) +{ + char **names = NULL; + size_t names_cap = 0; + size_t names_len = 0; + + char *p = buf; + char *end = buf + size; + while (p < end) { + char *next = strchr(p, '\n'); + if (next && next < end) { + *next = 0; + } else { + next = end; + } + if (p < next) { + if (names_len == names_cap) { + names_cap = 2 * names_cap + 1; + names = reftable_realloc( + names, names_cap * sizeof(*names)); + } + names[names_len++] = xstrdup(p); + } + p = next + 1; + } + + names = reftable_realloc(names, (names_len + 1) * sizeof(*names)); + names[names_len] = NULL; + *namesp = names; +} + +int names_equal(char **a, char **b) +{ + int i = 0; + for (; a[i] && b[i]; i++) { + if (strcmp(a[i], b[i])) { + return 0; + } + } + + return a[i] == b[i]; +} + +int common_prefix_size(struct strbuf *a, struct strbuf *b) +{ + int p = 0; + for (; p < a->len && p < b->len; p++) { + if (a->buf[p] != b->buf[p]) + break; + } + + return p; +} diff --git a/reftable/basics.h b/reftable/basics.h new file mode 100644 index 00000000000..096b36862b9 --- /dev/null +++ b/reftable/basics.h @@ -0,0 +1,60 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef BASICS_H +#define BASICS_H + +/* + * miscellaneous utilities that are not provided by Git. + */ + +#include "system.h" + +/* Bigendian en/decoding of integers */ + +void put_be24(uint8_t *out, uint32_t i); +uint32_t get_be24(uint8_t *in); +void put_be16(uint8_t *out, uint16_t i); + +/* + * find smallest index i in [0, sz) at which f(i) is true, assuming + * that f is ascending. 
Return sz if f(i) is false for all indices. + * + * Contrary to bsearch(3), this returns something useful if the argument is not + * found. + */ +int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args); + +/* + * Frees a NULL terminated array of malloced strings. The array itself is also + * freed. + */ +void free_names(char **a); + +/* parse a newline separated list of names. `size` is the length of the buffer, + * without terminating '\0'. Empty names are discarded. */ +void parse_names(char *buf, int size, char ***namesp); + +/* compares two NULL-terminated arrays of strings. */ +int names_equal(char **a, char **b); + +/* returns the array size of a NULL-terminated array of strings. */ +int names_length(char **names); + +/* Allocation routines; they invoke the functions set through + * reftable_set_alloc() */ +void *reftable_malloc(size_t sz); +void *reftable_realloc(void *p, size_t sz); +void reftable_free(void *p); +void *reftable_calloc(size_t sz); + +/* Find the longest shared prefix size of `a` and `b` */ +struct strbuf; +int common_prefix_size(struct strbuf *a, struct strbuf *b); + +#endif diff --git a/reftable/basics_test.c b/reftable/basics_test.c new file mode 100644 index 00000000000..6d52f0f9d5a --- /dev/null +++ b/reftable/basics_test.c @@ -0,0 +1,98 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" + +#include "basics.h" +#include "test_framework.h" +#include "reftable-tests.h" + +struct binsearch_args { + int key; + int *arr; +}; + +static int binsearch_func(size_t i, void *void_args) +{ + struct binsearch_args *args = (struct binsearch_args *)void_args; + + return args->key < args->arr[i]; +} + +static void test_binsearch(void) +{ + int arr[] = { 2, 4, 6, 8, 10 }; + size_t sz = ARRAY_SIZE(arr); + struct binsearch_args args = { + .arr = arr, + }; + + int i = 0; + 
for (i = 1; i < 11; i++) { + int res; + args.key = i; + res = binsearch(sz, &binsearch_func, &args); + + if (res < sz) { + EXPECT(args.key < arr[res]); + if (res > 0) { + EXPECT(args.key >= arr[res - 1]); + } + } else { + EXPECT(args.key == 10 || args.key == 11); + } + } +} + +static void test_names_length(void) +{ + char *a[] = { "a", "b", NULL }; + EXPECT(names_length(a) == 2); +} + +static void test_parse_names_normal(void) +{ + char in[] = "a\nb\n"; + char **out = NULL; + parse_names(in, strlen(in), &out); + EXPECT(!strcmp(out[0], "a")); + EXPECT(!strcmp(out[1], "b")); + EXPECT(out[2] == NULL); + free_names(out); +} + +static void test_parse_names_drop_empty(void) +{ + char in[] = "a\n\n"; + char **out = NULL; + parse_names(in, strlen(in), &out); + EXPECT(!strcmp(out[0], "a")); + EXPECT(out[1] == NULL); + free_names(out); +} + +static void test_common_prefix(void) +{ + struct strbuf s1 = STRBUF_INIT; + struct strbuf s2 = STRBUF_INIT; + strbuf_addstr(&s1, "abcdef"); + strbuf_addstr(&s2, "abc"); + EXPECT(common_prefix_size(&s1, &s2) == 3); + strbuf_release(&s1); + strbuf_release(&s2); +} + +int basics_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_common_prefix); + RUN_TEST(test_parse_names_normal); + RUN_TEST(test_parse_names_drop_empty); + RUN_TEST(test_binsearch); + RUN_TEST(test_names_length); + return 0; +} diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c new file mode 100644 index 00000000000..25639f61af6 --- /dev/null +++ b/reftable/publicbasics.c @@ -0,0 +1,58 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "reftable-malloc.h" + +#include "basics.h" +#include "system.h" + +static void *(*reftable_malloc_ptr)(size_t sz) = &malloc; +static void *(*reftable_realloc_ptr)(void *, size_t) = &realloc; +static void (*reftable_free_ptr)(void *) = &free; + +void 
*reftable_malloc(size_t sz) +{ + return (*reftable_malloc_ptr)(sz); +} + +void *reftable_realloc(void *p, size_t sz) +{ + return (*reftable_realloc_ptr)(p, sz); +} + +void reftable_free(void *p) +{ + reftable_free_ptr(p); +} + +void *reftable_calloc(size_t sz) +{ + void *p = reftable_malloc(sz); + memset(p, 0, sz); + return p; +} + +void reftable_set_alloc(void *(*malloc)(size_t), + void *(*realloc)(void *, size_t), void (*free)(void *)) +{ + reftable_malloc_ptr = malloc; + reftable_realloc_ptr = realloc; + reftable_free_ptr = free; +} + +int hash_size(uint32_t id) +{ + switch (id) { + case 0: + case SHA1_ID: + return SHA1_SIZE; + case SHA256_ID: + return SHA256_SIZE; + } + abort(); +} diff --git a/reftable/reftable-malloc.h b/reftable/reftable-malloc.h new file mode 100644 index 00000000000..5f2185f1f34 --- /dev/null +++ b/reftable/reftable-malloc.h @@ -0,0 +1,18 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_H +#define REFTABLE_H + +#include + +/* Overrides the functions to use for memory management. 
*/ +void reftable_set_alloc(void *(*malloc)(size_t), + void *(*realloc)(void *, size_t), void (*free)(void *)); + +#endif diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h new file mode 100644 index 00000000000..5e7698ae654 --- /dev/null +++ b/reftable/reftable-tests.h @@ -0,0 +1,22 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_TESTS_H +#define REFTABLE_TESTS_H + +int basics_test_main(int argc, const char **argv); +int block_test_main(int argc, const char **argv); +int merged_test_main(int argc, const char **argv); +int record_test_main(int argc, const char **argv); +int refname_test_main(int argc, const char **argv); +int reftable_test_main(int argc, const char **argv); +int stack_test_main(int argc, const char **argv); +int tree_test_main(int argc, const char **argv); +int reftable_dump_main(int argc, char *const *argv); + +#endif diff --git a/reftable/system.h b/reftable/system.h new file mode 100644 index 00000000000..07277ca0627 --- /dev/null +++ b/reftable/system.h @@ -0,0 +1,32 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef SYSTEM_H +#define SYSTEM_H + +#include "git-compat-util.h" +#include "strbuf.h" + +#include + +struct strbuf; +/* In git, this is declared in dir.h */ +int remove_dir_recursively(struct strbuf *path, int flags); + +#define SHA1_ID 0x73686131 +#define SHA256_ID 0x73323536 +#define SHA1_SIZE 20 +#define SHA256_SIZE 32 + +/* This is uncompress2, which is only available in zlib as of 2017. 
+ */ +int uncompress_return_consumed(Bytef *dest, uLongf *destLen, + const Bytef *source, uLong *sourceLen); +int hash_size(uint32_t id); + +#endif diff --git a/reftable/test_framework.c b/reftable/test_framework.c new file mode 100644 index 00000000000..a5ff4e2a2d2 --- /dev/null +++ b/reftable/test_framework.c @@ -0,0 +1,23 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" +#include "test_framework.h" + +#include "basics.h" + +void set_test_hash(uint8_t *p, int i) +{ + memset(p, (uint8_t)i, hash_size(SHA1_ID)); +} + +int strbuf_add_void(void *b, const void *data, size_t sz) +{ + strbuf_add((struct strbuf *)b, data, sz); + return sz; +} diff --git a/reftable/test_framework.h b/reftable/test_framework.h new file mode 100644 index 00000000000..5fdc9519a5a --- /dev/null +++ b/reftable/test_framework.h @@ -0,0 +1,53 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef TEST_FRAMEWORK_H +#define TEST_FRAMEWORK_H + +#include "system.h" +#include "reftable-error.h" + +#define EXPECT_ERR(c) \ + if (c != 0) { \ + fflush(stderr); \ + fflush(stdout); \ + fprintf(stderr, "%s: %d: error == %d (%s), want 0\n", \ + __FILE__, __LINE__, c, reftable_error_str(c)); \ + abort(); \ + } + +#define EXPECT_STREQ(a, b) \ + if (strcmp(a, b)) { \ + fflush(stderr); \ + fflush(stdout); \ + fprintf(stderr, "%s:%d: %s (%s) != %s (%s)\n", __FILE__, \ + __LINE__, #a, a, #b, b); \ + abort(); \ + } + +#define EXPECT(c) \ + if (!(c)) { \ + fflush(stderr); \ + fflush(stdout); \ + fprintf(stderr, "%s: %d: failed assertion %s\n", __FILE__, \ + __LINE__, #c); \ + abort(); \ + } + +#define RUN_TEST(f) \ + fprintf(stderr, "running %s\n", #f); \ + fflush(stderr); \ + f(); + +void 
set_test_hash(uint8_t *p, int i); + +/* Like strbuf_add, but suitable for passing to reftable_new_writer + */ +int strbuf_add_void(void *b, const void *data, size_t sz); + +#endif diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c new file mode 100644 index 00000000000..3b58e423e7b --- /dev/null +++ b/t/helper/test-reftable.c @@ -0,0 +1,9 @@ +#include "reftable/reftable-tests.h" +#include "test-tool.h" + +int cmd__reftable(int argc, const char **argv) +{ + basics_test_main(argc, argv); + + return 0; +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index 9d6d14d9293..0208a0a41cf 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -48,13 +48,14 @@ static struct test_cmd cmds[] = { { "path-utils", cmd__path_utils }, { "pkt-line", cmd__pkt_line }, { "prio-queue", cmd__prio_queue }, - { "proc-receive", cmd__proc_receive}, + { "proc-receive", cmd__proc_receive }, { "progress", cmd__progress }, { "reach", cmd__reach }, { "read-cache", cmd__read_cache }, { "read-graph", cmd__read_graph }, { "read-midx", cmd__read_midx }, { "ref-store", cmd__ref_store }, + { "reftable", cmd__reftable }, { "regex", cmd__regex }, { "repository", cmd__repository }, { "revision-walking", cmd__revision_walking }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index a6470ff62c4..1de39ce5b58 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -44,6 +44,7 @@ int cmd__read_cache(int argc, const char **argv); int cmd__read_graph(int argc, const char **argv); int cmd__read_midx(int argc, const char **argv); int cmd__ref_store(int argc, const char **argv); +int cmd__reftable(int argc, const char **argv); int cmd__regex(int argc, const char **argv); int cmd__repository(int argc, const char **argv); int cmd__revision_walking(int argc, const char **argv); diff --git a/t/t0032-reftable-unittest.sh b/t/t0032-reftable-unittest.sh new file mode 100755 index 00000000000..0ed14971a58 --- /dev/null +++ b/t/t0032-reftable-unittest.sh @@ -0,0 +1,15 @@ 
+#!/bin/sh +# +# Copyright (c) 2020 Google LLC +# + +test_description='reftable unittests' + +. ./test-lib.sh + +test_expect_success 'unittests' ' + TMPDIR=$(pwd) && export TMPDIR && + test-tool reftable +' + +test_done From patchwork Wed Dec 9 14:00:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 11961571 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 74D01C433FE for ; Wed, 9 Dec 2020 14:03:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3890E23B42 for ; Wed, 9 Dec 2020 14:03:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732786AbgLIOCM (ORCPT ); Wed, 9 Dec 2020 09:02:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60404 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732779AbgLIOB5 (ORCPT ); Wed, 9 Dec 2020 09:01:57 -0500 Received: from mail-wm1-x32e.google.com (mail-wm1-x32e.google.com [IPv6:2a00:1450:4864:20::32e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2518AC0617B0 for ; Wed, 9 Dec 2020 06:00:38 -0800 (PST) Received: by mail-wm1-x32e.google.com with SMTP id e25so1789201wme.0 for ; Wed, 09 Dec 2020 06:00:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc 
:content-transfer-encoding:mime-version:to:cc; bh=JyxmAVIRZ2EULz8ridZs410zSRPcviD//ZA3AWZLvQM=; b=TgRNWtx43bIYmA0/FsPeVA0Zj9Xj+3qAE3V340g3sw3jA3F3NBV+5F2d5WZ9y3tZ4A BeLt6CpJcImIveSZA7xMkSzZNlYGLUhNOuQVY7yiYrzoPYhE1r0df0vsChfLdUqvscyW Tt+GQU6vEFTQ8tsSWDCf3iWpp6Z7uaY2NmcLFWwgR5MKroZHdcRgAv0Vin1BsXfvZk4y seNtB3URQtgGHklJ1e9ZsUNR4+0G/+2+Q1YTkVQuTtvIkXr5rL7/eB72Tsp4LLSnI3Sc qFC8qrOaYbEhS1MDP6W3zV75ETmkzJpraRvybAARWz8wL56CnNrAPudLGUJNsqjMviiG 7gmA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=JyxmAVIRZ2EULz8ridZs410zSRPcviD//ZA3AWZLvQM=; b=RWp1IYUnwAiWN5IPAGQHF0ON88qsKCfL2K4bsMZAvLVv8PkHzZ11w9QUR4e2IK/Wq1 pnE1rIKbVz77adrYkcdFnQ+laveixL5y73sQptbhAZ/a62eLQIX9iE+VWhuNiTKZPVQl JkiAvJcwN3ogoJJ6oqTRNVEK1ozh3qUnFr6GyJGMtB76OqjCLotxZMJ9b22mMKjCERoW QenvsL46s0+57jhgF9Ns+Pg3R8XypknSndLORBf9y/B0WYfpaKstl3zP+Y7BvYA4Moqo IM9/0Wb6b2OBZth9VFHnvGkObdzoHpkydjHcuR5LVkIS2LJYf5w151ROQdR/TSl953+1 MDDA== X-Gm-Message-State: AOAM530ghRl/5AcnNk7VSNxgboYOdgT5eWlN+IOGlzRI+BEWW0A7XcGz /k3WPHIBlC4ie9fBpE8Ms4Y4pkvHK3g= X-Google-Smtp-Source: ABdhPJyS5dF26bBrY2CNY/GjRUbgMRSFvNMTRHEiKwaVPC5RNpTsI+RG3sqjze7N37cI8lgHr+B/DA== X-Received: by 2002:a1c:87:: with SMTP id 129mr2925045wma.183.1607522436606; Wed, 09 Dec 2020 06:00:36 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y6sm3637034wmg.39.2020.12.09.06.00.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Dec 2020 06:00:35 -0800 (PST) Message-Id: In-Reply-To: References: Date: Wed, 09 Dec 2020 14:00:19 +0000 Subject: [PATCH v4 05/15] reftable: add blocksource, an abstraction for random access reads Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Jeff King , Ramsay Jones , Jonathan Nieder , Johannes Schindelin , Jonathan Tan , Josh Steadmon , Emily Shaffer , Patrick Steinhardt , 
=?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Felipe Contreras , Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The reftable format is usually used with files for storage. However, we abstract away this using the blocksource data structure. This has two advantages: * log blocks are zlib compressed, and handling them is simplified if we can discard byte segments from within the block layer. * for unittests, it is useful to read and write in-memory. The blocksource allows us to abstract the data away from on-disk files. Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + reftable/blocksource.c | 148 ++++++++++++++++++++++++++++++++ reftable/blocksource.h | 22 +++++ reftable/reftable-blocksource.h | 49 +++++++++++ 4 files changed, 220 insertions(+) create mode 100644 reftable/blocksource.c create mode 100644 reftable/blocksource.h create mode 100644 reftable/reftable-blocksource.h diff --git a/Makefile b/Makefile index 9b6d84af8f6..f623611bde2 100644 --- a/Makefile +++ b/Makefile @@ -2391,6 +2391,7 @@ XDIFF_OBJS += xdiff/xutils.o REFTABLE_OBJS += reftable/basics.o REFTABLE_OBJS += reftable/error.o +REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_TEST_OBJS += reftable/test_framework.o diff --git a/reftable/blocksource.c b/reftable/blocksource.c new file mode 100644 index 00000000000..25d4d95b52b --- /dev/null +++ b/reftable/blocksource.c @@ -0,0 +1,148 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" + +#include "basics.h" +#include "blocksource.h" +#include "reftable-blocksource.h" +#include "reftable-error.h" + +static void strbuf_return_block(void *b, struct reftable_block *dest) +{ + memset(dest->data, 0xff, dest->len); + reftable_free(dest->data); +} + 
+static void strbuf_close(void *b) +{ +} + +static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off, + uint32_t size) +{ + struct strbuf *b = (struct strbuf *)v; + assert(off + size <= b->len); + dest->data = reftable_calloc(size); + memcpy(dest->data, b->buf + off, size); + dest->len = size; + return size; +} + +static uint64_t strbuf_size(void *b) +{ + return ((struct strbuf *)b)->len; +} + +static struct reftable_block_source_vtable strbuf_vtable = { + .size = &strbuf_size, + .read_block = &strbuf_read_block, + .return_block = &strbuf_return_block, + .close = &strbuf_close, +}; + +void block_source_from_strbuf(struct reftable_block_source *bs, + struct strbuf *buf) +{ + assert(bs->ops == NULL); + bs->ops = &strbuf_vtable; + bs->arg = buf; +} + +static void malloc_return_block(void *b, struct reftable_block *dest) +{ + memset(dest->data, 0xff, dest->len); + reftable_free(dest->data); +} + +static struct reftable_block_source_vtable malloc_vtable = { + .return_block = &malloc_return_block, +}; + +static struct reftable_block_source malloc_block_source_instance = { + .ops = &malloc_vtable, +}; + +struct reftable_block_source malloc_block_source(void) +{ + return malloc_block_source_instance; +} + +struct file_block_source { + int fd; + uint64_t size; +}; + +static uint64_t file_size(void *b) +{ + return ((struct file_block_source *)b)->size; +} + +static void file_return_block(void *b, struct reftable_block *dest) +{ + memset(dest->data, 0xff, dest->len); + reftable_free(dest->data); +} + +static void file_close(void *b) +{ + int fd = ((struct file_block_source *)b)->fd; + if (fd > 0) { + close(fd); + ((struct file_block_source *)b)->fd = 0; + } + + reftable_free(b); +} + +static int file_read_block(void *v, struct reftable_block *dest, uint64_t off, + uint32_t size) +{ + struct file_block_source *b = (struct file_block_source *)v; + assert(off + size <= b->size); + dest->data = reftable_malloc(size); + if (pread(b->fd, dest->data, size, 
off) != size) + return -1; + dest->len = size; + return size; +} + +static struct reftable_block_source_vtable file_vtable = { + .size = &file_size, + .read_block = &file_read_block, + .return_block = &file_return_block, + .close = &file_close, +}; + +int reftable_block_source_from_file(struct reftable_block_source *bs, + const char *name) +{ + struct stat st = { 0 }; + int err = 0; + int fd = open(name, O_RDONLY); + struct file_block_source *p = NULL; + if (fd < 0) { + if (errno == ENOENT) { + return REFTABLE_NOT_EXIST_ERROR; + } + return -1; + } + + err = fstat(fd, &st); + if (err < 0) + return -1; + + p = reftable_calloc(sizeof(struct file_block_source)); + p->size = st.st_size; + p->fd = fd; + + assert(bs->ops == NULL); + bs->ops = &file_vtable; + bs->arg = p; + return 0; +} diff --git a/reftable/blocksource.h b/reftable/blocksource.h new file mode 100644 index 00000000000..072e2727ad2 --- /dev/null +++ b/reftable/blocksource.h @@ -0,0 +1,22 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef BLOCKSOURCE_H +#define BLOCKSOURCE_H + +#include "system.h" + +struct reftable_block_source; + +/* Create an in-memory block source for reading reftables */ +void block_source_from_strbuf(struct reftable_block_source *bs, + struct strbuf *buf); + +struct reftable_block_source malloc_block_source(void); + +#endif diff --git a/reftable/reftable-blocksource.h b/reftable/reftable-blocksource.h new file mode 100644 index 00000000000..5aa3990a573 --- /dev/null +++ b/reftable/reftable-blocksource.h @@ -0,0 +1,49 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_BLOCKSOURCE_H +#define REFTABLE_BLOCKSOURCE_H + +#include + +/* block_source is a generic 
wrapper for a seekable readable file. + */ +struct reftable_block_source { + struct reftable_block_source_vtable *ops; + void *arg; +}; + +/* a contiguous segment of bytes. It keeps track of its generating block_source + * so it can return itself into the pool. */ +struct reftable_block { + uint8_t *data; + int len; + struct reftable_block_source source; +}; + +/* block_source_vtable are the operations that make up block_source */ +struct reftable_block_source_vtable { + /* returns the size of a block source */ + uint64_t (*size)(void *source); + + /* reads a segment from the block source. It is an error to read + beyond the end of the block */ + int (*read_block)(void *source, struct reftable_block *dest, + uint64_t off, uint32_t size); + /* mark the block as read; may return the data back to malloc */ + void (*return_block)(void *source, struct reftable_block *blockp); + + /* release all resources associated with the block source */ + void (*close)(void *source); +}; + +/* opens a file on the file system as a block_source */ +int reftable_block_source_from_file(struct reftable_block_source *block_src, + const char *name); + +#endif From patchwork Wed Dec 9 14:00:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 11961581 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 139A5C433FE for ; Wed, 9 Dec 2020 14:03:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org 
[23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CBD8E23B42 for ; Wed, 9 Dec 2020 14:03:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727314AbgLIODJ (ORCPT ); Wed, 9 Dec 2020 09:03:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60402 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732778AbgLIOB5 (ORCPT ); Wed, 9 Dec 2020 09:01:57 -0500 Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EFFDCC06138C for ; Wed, 9 Dec 2020 06:00:39 -0800 (PST) Received: by mail-wm1-x333.google.com with SMTP id e25so1789321wme.0 for ; Wed, 09 Dec 2020 06:00:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=nwKYYf7e4hcHilGYG4VCxyBlytI4JdqjBOBgLb98Slg=; b=qDwpw6+OM4DrjucyVbg7j1OLmBGTuhzdtIforrUaxc3WgpzzhX62BTtKCdIRc4KNys LM+XjfOFWI0l4fejKWxAeqHnUce9MZBHzvumHdKYTq3vO8a0KirAy57p8PKPDrG6Py+l 9Y97giNNz3taI3El9bq57A5D26oFppDQMpadDz3IERAjhvT+4A+rg/UdHhvSGX3yZ/kW QFhLovzgavHT+okS97L3c/9I6+rEv+V6Ps+78zvchutc7dFfmu8qzvkmrOWyyrImDcHR 9PTlww4tYeaAzz3M050U7eMY9d4upsyu8zBgzioDNkKyA8WtdHZyLZ44WXkDtwhXN5Xx 1ewQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=nwKYYf7e4hcHilGYG4VCxyBlytI4JdqjBOBgLb98Slg=; b=HPsuA10LqW9uXmJdomVCPvxhTrIxvhatdVZcblCbR7KoeGG0JFhmoE7UC7V+mPYzvq 2eVWm0V0Lj8E2VXzpMOZz5v7rXmWkmZ93r5yaXFVGoQvMAixFhIvJBQ0iiVJL0uFWVLd j9h0XV38omoLI7sjGsUgaLZkGuransV5/SlcbPdZD/YGcaEKMHEj+0z8kYUBIVP7HNGn bseS5YhbQmZBde/u68J/cPqB7ICSw1ON5Y2lFvuReW8hzsbZTBlE8DLPivwORgSTDCfn wWc9Kn/PFsC0eU3dnYgjlG+UD7yIOOgFtOz8rgfzUEn0EH7IIFPIiKnA5Jb3yK8Y1CLK 6j7A== X-Gm-Message-State: 
AOAM531Wdro3Eyguxu2UL8/uqPFuJgG94+ur+kGZ2UwBKUmBxap62ahH OC9t7Gu8zDuO5ZWMDdDn+vrA8Ua5Tck= X-Google-Smtp-Source: ABdhPJykcu2HtK9GcsKRoj/4daYkAq5EpY45XJupC0VHGd+8aQ9JDck2PdmJ72Z+r6EBFV+pnTMhcw== X-Received: by 2002:a05:600c:258:: with SMTP id 24mr3032010wmj.16.1607522437771; Wed, 09 Dec 2020 06:00:37 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id t184sm3885290wmt.13.2020.12.09.06.00.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Dec 2020 06:00:37 -0800 (PST) Message-Id: In-Reply-To: References: Date: Wed, 09 Dec 2020 14:00:20 +0000 Subject: [PATCH v4 06/15] reftable: (de)serialization for the polymorphic record type. Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Jeff King , Ramsay Jones , Jonathan Nieder , Johannes Schindelin , Jonathan Tan , Josh Steadmon , Emily Shaffer , Patrick Steinhardt , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Felipe Contreras , Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The reftable format is structured as a sequence of blocks, and each block contains a sequence of prefix-compressed key-value records. There are 4 types of records, and they have similarities in how they must be handled. This is achieved by introducing a polymorphic 'record' type that encapsulates ref, log, index and object records. 
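To illustrate what "prefix-compressed" means for the record keys, here is a minimal standalone sketch. It is not the code from this patch: shared_prefix() and rebuild_key() are hypothetical helpers that work on plain NUL-terminated strings, whereas the patch itself uses struct strbuf and varint-encoded lengths. The idea is that each key is stored as the number of bytes it shares with the previous key plus the remaining suffix.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): number of leading bytes
 * that `a` and `b` have in common.
 */
static size_t shared_prefix(const char *a, const char *b)
{
	size_t n = 0;

	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

/*
 * Hypothetical helper (not part of the patch): rebuild a key from the
 * previous key, the shared prefix length and the stored suffix.
 * Returns 0 on success, or -1 if prefix_len exceeds the previous key
 * or the result (including the trailing NUL) does not fit in `out`.
 */
static int rebuild_key(char *out, size_t out_sz, const char *prev,
		       size_t prefix_len, const char *suffix)
{
	size_t suffix_len = strlen(suffix);

	if (prefix_len > strlen(prev) ||
	    prefix_len + suffix_len + 1 > out_sz)
		return -1;

	memcpy(out, prev, prefix_len);
	/* Copy the suffix together with its terminating NUL. */
	memcpy(out + prefix_len, suffix, suffix_len + 1);
	return 0;
}
```

Under these assumptions, a key "refs/heads/maint" that follows "refs/heads/main" would be stored as prefix length 15 plus the suffix "t". A record with prefix length 0 does not depend on its predecessor, which is the condition the patch uses to mark a restart point.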
Signed-off-by: Han-Wen Nienhuys --- Makefile | 2 + reftable/constants.h | 21 + reftable/record.c | 1157 ++++++++++++++++++++++++++++++++++++ reftable/record.h | 139 +++++ reftable/record_test.c | 398 +++++++++++++ reftable/reftable-record.h | 100 ++++ t/helper/test-reftable.c | 2 +- 7 files changed, 1818 insertions(+), 1 deletion(-) create mode 100644 reftable/constants.h create mode 100644 reftable/record.c create mode 100644 reftable/record.h create mode 100644 reftable/record_test.c create mode 100644 reftable/reftable-record.h diff --git a/Makefile b/Makefile index f623611bde2..ce3052169b4 100644 --- a/Makefile +++ b/Makefile @@ -2393,7 +2393,9 @@ REFTABLE_OBJS += reftable/basics.o REFTABLE_OBJS += reftable/error.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o +REFTABLE_OBJS += reftable/record.o +REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o REFTABLE_TEST_OBJS += reftable/basics_test.o diff --git a/reftable/constants.h b/reftable/constants.h new file mode 100644 index 00000000000..5eee72c4c11 --- /dev/null +++ b/reftable/constants.h @@ -0,0 +1,21 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef CONSTANTS_H +#define CONSTANTS_H + +#define BLOCK_TYPE_LOG 'g' +#define BLOCK_TYPE_INDEX 'i' +#define BLOCK_TYPE_REF 'r' +#define BLOCK_TYPE_OBJ 'o' +#define BLOCK_TYPE_ANY 0 + +#define MAX_RESTARTS ((1 << 16) - 1) +#define DEFAULT_BLOCK_SIZE 4096 + +#endif diff --git a/reftable/record.c b/reftable/record.c new file mode 100644 index 00000000000..5381e9aaa89 --- /dev/null +++ b/reftable/record.c @@ -0,0 +1,1157 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +/* record.c - 
methods for different types of records. */ + +#include "record.h" + +#include "system.h" +#include "constants.h" +#include "reftable-error.h" +#include "basics.h" + +int get_var_int(uint64_t *dest, struct string_view *in) +{ + int ptr = 0; + uint64_t val; + + if (in->len == 0) + return -1; + val = in->buf[ptr] & 0x7f; + + while (in->buf[ptr] & 0x80) { + ptr++; + if (ptr >= in->len) { + return -1; + } + val = (val + 1) << 7 | (uint64_t)(in->buf[ptr] & 0x7f); + } + + *dest = val; + return ptr + 1; +} + +int put_var_int(struct string_view *dest, uint64_t val) +{ + uint8_t buf[10] = { 0 }; + int i = 9; + int n = 0; + buf[i] = (uint8_t)(val & 0x7f); + i--; + while (1) { + val >>= 7; + if (!val) { + break; + } + val--; + buf[i] = 0x80 | (uint8_t)(val & 0x7f); + i--; + } + + n = sizeof(buf) - i - 1; + if (dest->len < n) + return -1; + memcpy(dest->buf, &buf[i + 1], n); + return n; +} + +int reftable_is_block_type(uint8_t typ) +{ + switch (typ) { + case BLOCK_TYPE_REF: + case BLOCK_TYPE_LOG: + case BLOCK_TYPE_OBJ: + case BLOCK_TYPE_INDEX: + return 1; + } + return 0; +} + +uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec) +{ + switch (rec->value_type) { + case REFTABLE_REF_VAL1: + return rec->value.val1; + case REFTABLE_REF_VAL2: + return rec->value.val2.value; + default: + return NULL; + } +} + +uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec) +{ + switch (rec->value_type) { + case REFTABLE_REF_VAL2: + return rec->value.val2.target_value; + default: + return NULL; + } +} + +static int decode_string(struct strbuf *dest, struct string_view in) +{ + int start_len = in.len; + uint64_t tsize = 0; + int n = get_var_int(&tsize, &in); + if (n <= 0) + return -1; + string_view_consume(&in, n); + if (in.len < tsize) + return -1; + + strbuf_reset(dest); + strbuf_add(dest, in.buf, tsize); + string_view_consume(&in, tsize); + + return start_len - in.len; +} + +static int encode_string(char *str, struct string_view s) +{ + struct string_view start = s; + 
int l = strlen(str); + int n = put_var_int(&s, l); + if (n < 0) + return -1; + string_view_consume(&s, n); + if (s.len < l) + return -1; + memcpy(s.buf, str, l); + string_view_consume(&s, l); + + return start.len - s.len; +} + +int reftable_encode_key(int *restart, struct string_view dest, + struct strbuf prev_key, struct strbuf key, + uint8_t extra) +{ + struct string_view start = dest; + int prefix_len = common_prefix_size(&prev_key, &key); + uint64_t suffix_len = key.len - prefix_len; + int n = put_var_int(&dest, (uint64_t)prefix_len); + if (n < 0) + return -1; + string_view_consume(&dest, n); + + *restart = (prefix_len == 0); + + n = put_var_int(&dest, suffix_len << 3 | (uint64_t)extra); + if (n < 0) + return -1; + string_view_consume(&dest, n); + + if (dest.len < suffix_len) + return -1; + memcpy(dest.buf, key.buf + prefix_len, suffix_len); + string_view_consume(&dest, suffix_len); + + return start.len - dest.len; +} + +int reftable_decode_key(struct strbuf *key, uint8_t *extra, + struct strbuf last_key, struct string_view in) +{ + int start_len = in.len; + uint64_t prefix_len = 0; + uint64_t suffix_len = 0; + int n = get_var_int(&prefix_len, &in); + if (n < 0) + return -1; + string_view_consume(&in, n); + + if (prefix_len > last_key.len) + return -1; + + n = get_var_int(&suffix_len, &in); + if (n <= 0) + return -1; + string_view_consume(&in, n); + + *extra = (uint8_t)(suffix_len & 0x7); + suffix_len >>= 3; + + if (in.len < suffix_len) + return -1; + + strbuf_reset(key); + strbuf_add(key, last_key.buf, prefix_len); + strbuf_add(key, in.buf, suffix_len); + string_view_consume(&in, suffix_len); + + return start_len - in.len; +} + +static void reftable_ref_record_key(const void *r, struct strbuf *dest) +{ + const struct reftable_ref_record *rec = + (const struct reftable_ref_record *)r; + strbuf_reset(dest); + strbuf_addstr(dest, rec->refname); +} + +static void reftable_ref_record_copy_from(void *rec, const void *src_rec, + int hash_size) +{ + struct 
reftable_ref_record *ref = (struct reftable_ref_record *)rec; + struct reftable_ref_record *src = (struct reftable_ref_record *)src_rec; + assert(hash_size > 0); + + /* This is simple and correct, but we could probably reuse the hash + * fields. */ + reftable_ref_record_release(ref); + if (src->refname != NULL) { + ref->refname = xstrdup(src->refname); + } + ref->update_index = src->update_index; + ref->value_type = src->value_type; + switch (src->value_type) { + case REFTABLE_REF_DELETION: + break; + case REFTABLE_REF_VAL1: + ref->value.val1 = reftable_malloc(hash_size); + memcpy(ref->value.val1, src->value.val1, hash_size); + break; + case REFTABLE_REF_VAL2: + ref->value.val2.value = reftable_malloc(hash_size); + memcpy(ref->value.val2.value, src->value.val2.value, hash_size); + ref->value.val2.target_value = reftable_malloc(hash_size); + memcpy(ref->value.val2.target_value, + src->value.val2.target_value, hash_size); + break; + case REFTABLE_REF_SYMREF: + ref->value.symref = xstrdup(src->value.symref); + break; + } +} + +static char hexdigit(int c) +{ + if (c <= 9) + return '0' + c; + return 'a' + (c - 10); +} + +static void hex_format(char *dest, uint8_t *src, int hash_size) +{ + assert(hash_size > 0); + if (src != NULL) { + int i = 0; + for (i = 0; i < hash_size; i++) { + dest[2 * i] = hexdigit(src[i] >> 4); + dest[2 * i + 1] = hexdigit(src[i] & 0xf); + } + dest[2 * hash_size] = 0; + } +} + +void reftable_ref_record_print(struct reftable_ref_record *ref, + uint32_t hash_id) +{ + char hex[2 * SHA256_SIZE + 1] = { 0 }; /* BUG */ + printf("ref{%s(%" PRIu64 ") ", ref->refname, ref->update_index); + switch (ref->value_type) { + case REFTABLE_REF_SYMREF: + printf("=> %s", ref->value.symref); + break; + case REFTABLE_REF_VAL2: + hex_format(hex, ref->value.val2.value, hash_size(hash_id)); + printf("val 2 %s", hex); + hex_format(hex, ref->value.val2.target_value, + hash_size(hash_id)); + printf("(T %s)", hex); + break; + case REFTABLE_REF_VAL1: + hex_format(hex, 
ref->value.val1, hash_size(hash_id)); + printf("val 1 %s", hex); + break; + case REFTABLE_REF_DELETION: + printf("delete"); + break; + } + printf("}\n"); +} + +static void reftable_ref_record_release_void(void *rec) +{ + reftable_ref_record_release((struct reftable_ref_record *)rec); +} + +void reftable_ref_record_release(struct reftable_ref_record *ref) +{ + switch (ref->value_type) { + case REFTABLE_REF_SYMREF: + reftable_free(ref->value.symref); + break; + case REFTABLE_REF_VAL2: + reftable_free(ref->value.val2.target_value); + reftable_free(ref->value.val2.value); + break; + case REFTABLE_REF_VAL1: + reftable_free(ref->value.val1); + break; + case REFTABLE_REF_DELETION: + break; + default: + abort(); + } + + reftable_free(ref->refname); + memset(ref, 0, sizeof(struct reftable_ref_record)); +} + +static uint8_t reftable_ref_record_val_type(const void *rec) +{ + const struct reftable_ref_record *r = + (const struct reftable_ref_record *)rec; + return r->value_type; +} + +static int reftable_ref_record_encode(const void *rec, struct string_view s, + int hash_size) +{ + const struct reftable_ref_record *r = + (const struct reftable_ref_record *)rec; + struct string_view start = s; + int n = put_var_int(&s, r->update_index); + assert(hash_size > 0); + if (n < 0) + return -1; + string_view_consume(&s, n); + + switch (r->value_type) { + case REFTABLE_REF_SYMREF: + n = encode_string(r->value.symref, s); + if (n < 0) { + return -1; + } + string_view_consume(&s, n); + break; + case REFTABLE_REF_VAL2: + if (s.len < 2 * hash_size) { + return -1; + } + memcpy(s.buf, r->value.val2.value, hash_size); + string_view_consume(&s, hash_size); + memcpy(s.buf, r->value.val2.target_value, hash_size); + string_view_consume(&s, hash_size); + break; + case REFTABLE_REF_VAL1: + if (s.len < hash_size) { + return -1; + } + memcpy(s.buf, r->value.val1, hash_size); + string_view_consume(&s, hash_size); + break; + case REFTABLE_REF_DELETION: + break; + default: + abort(); + } + + return 
start.len - s.len; +} + +static int reftable_ref_record_decode(void *rec, struct strbuf key, + uint8_t val_type, struct string_view in, + int hash_size) +{ + struct reftable_ref_record *r = (struct reftable_ref_record *)rec; + struct string_view start = in; + uint64_t update_index = 0; + int n = get_var_int(&update_index, &in); + if (n < 0) + return n; + string_view_consume(&in, n); + + reftable_ref_record_release(r); + + assert(hash_size > 0); + + r->refname = reftable_realloc(r->refname, key.len + 1); + memcpy(r->refname, key.buf, key.len); + r->update_index = update_index; + r->refname[key.len] = 0; + r->value_type = val_type; + switch (val_type) { + case REFTABLE_REF_VAL1: + if (in.len < hash_size) { + return -1; + } + + r->value.val1 = reftable_malloc(hash_size); + memcpy(r->value.val1, in.buf, hash_size); + string_view_consume(&in, hash_size); + break; + + case REFTABLE_REF_VAL2: + if (in.len < 2 * hash_size) { + return -1; + } + + r->value.val2.value = reftable_malloc(hash_size); + memcpy(r->value.val2.value, in.buf, hash_size); + string_view_consume(&in, hash_size); + + r->value.val2.target_value = reftable_malloc(hash_size); + memcpy(r->value.val2.target_value, in.buf, hash_size); + string_view_consume(&in, hash_size); + break; + + case REFTABLE_REF_SYMREF: { + struct strbuf dest = STRBUF_INIT; + int n = decode_string(&dest, in); + if (n < 0) { + return -1; + } + string_view_consume(&in, n); + r->value.symref = dest.buf; + } break; + + case REFTABLE_REF_DELETION: + break; + default: + abort(); + break; + } + + return start.len - in.len; +} + +static int reftable_ref_record_is_deletion_void(const void *p) +{ + return reftable_ref_record_is_deletion( + (const struct reftable_ref_record *)p); +} + +static struct reftable_record_vtable reftable_ref_record_vtable = { + .key = &reftable_ref_record_key, + .type = BLOCK_TYPE_REF, + .copy_from = &reftable_ref_record_copy_from, + .val_type = &reftable_ref_record_val_type, + .encode = &reftable_ref_record_encode, + 
.decode = &reftable_ref_record_decode, + .release = &reftable_ref_record_release_void, + .is_deletion = &reftable_ref_record_is_deletion_void, +}; + +static void reftable_obj_record_key(const void *r, struct strbuf *dest) +{ + const struct reftable_obj_record *rec = + (const struct reftable_obj_record *)r; + strbuf_reset(dest); + strbuf_add(dest, rec->hash_prefix, rec->hash_prefix_len); +} + +static void reftable_obj_record_release(void *rec) +{ + struct reftable_obj_record *obj = (struct reftable_obj_record *)rec; + FREE_AND_NULL(obj->hash_prefix); + FREE_AND_NULL(obj->offsets); + memset(obj, 0, sizeof(struct reftable_obj_record)); +} + +static void reftable_obj_record_copy_from(void *rec, const void *src_rec, + int hash_size) +{ + struct reftable_obj_record *obj = (struct reftable_obj_record *)rec; + const struct reftable_obj_record *src = + (const struct reftable_obj_record *)src_rec; + int olen; + + reftable_obj_record_release(obj); + *obj = *src; + obj->hash_prefix = reftable_malloc(obj->hash_prefix_len); + memcpy(obj->hash_prefix, src->hash_prefix, obj->hash_prefix_len); + + olen = obj->offset_len * sizeof(uint64_t); + obj->offsets = reftable_malloc(olen); + memcpy(obj->offsets, src->offsets, olen); +} + +static uint8_t reftable_obj_record_val_type(const void *rec) +{ + struct reftable_obj_record *r = (struct reftable_obj_record *)rec; + if (r->offset_len > 0 && r->offset_len < 8) + return r->offset_len; + return 0; +} + +static int reftable_obj_record_encode(const void *rec, struct string_view s, + int hash_size) +{ + struct reftable_obj_record *r = (struct reftable_obj_record *)rec; + struct string_view start = s; + int i = 0; + int n = 0; + uint64_t last = 0; + if (r->offset_len == 0 || r->offset_len >= 8) { + n = put_var_int(&s, r->offset_len); + if (n < 0) { + return -1; + } + string_view_consume(&s, n); + } + if (r->offset_len == 0) + return start.len - s.len; + n = put_var_int(&s, r->offsets[0]); + if (n < 0) + return -1; + string_view_consume(&s, n); 
+ + last = r->offsets[0]; + for (i = 1; i < r->offset_len; i++) { + int n = put_var_int(&s, r->offsets[i] - last); + if (n < 0) { + return -1; + } + string_view_consume(&s, n); + last = r->offsets[i]; + } + return start.len - s.len; +} + +static int reftable_obj_record_decode(void *rec, struct strbuf key, + uint8_t val_type, struct string_view in, + int hash_size) +{ + struct string_view start = in; + struct reftable_obj_record *r = (struct reftable_obj_record *)rec; + uint64_t count = val_type; + int n = 0; + uint64_t last; + int j; + r->hash_prefix = reftable_malloc(key.len); + memcpy(r->hash_prefix, key.buf, key.len); + r->hash_prefix_len = key.len; + + if (val_type == 0) { + n = get_var_int(&count, &in); + if (n < 0) { + return n; + } + + string_view_consume(&in, n); + } + + r->offsets = NULL; + r->offset_len = 0; + if (count == 0) + return start.len - in.len; + + r->offsets = reftable_malloc(count * sizeof(uint64_t)); + r->offset_len = count; + + n = get_var_int(&r->offsets[0], &in); + if (n < 0) + return n; + string_view_consume(&in, n); + + last = r->offsets[0]; + j = 1; + while (j < count) { + uint64_t delta = 0; + int n = get_var_int(&delta, &in); + if (n < 0) { + return n; + } + string_view_consume(&in, n); + + last = r->offsets[j] = (delta + last); + j++; + } + return start.len - in.len; +} + +static int not_a_deletion(const void *p) +{ + return 0; +} + +static struct reftable_record_vtable reftable_obj_record_vtable = { + .key = &reftable_obj_record_key, + .type = BLOCK_TYPE_OBJ, + .copy_from = &reftable_obj_record_copy_from, + .val_type = &reftable_obj_record_val_type, + .encode = &reftable_obj_record_encode, + .decode = &reftable_obj_record_decode, + .release = &reftable_obj_record_release, + .is_deletion = not_a_deletion, +}; + +void reftable_log_record_print(struct reftable_log_record *log, + uint32_t hash_id) +{ + char hex[2 * SHA256_SIZE + 1] = { 0 }; + + printf("log{%s(%" PRIu64 ") %s <%s> %" PRIu64 " %04d\n", log->refname, + log->update_index, 
log->name, log->email, log->time, + log->tz_offset); + hex_format(hex, log->old_hash, hash_size(hash_id)); + printf("%s => ", hex); + hex_format(hex, log->new_hash, hash_size(hash_id)); + printf("%s\n\n%s\n}\n", hex, log->message); +} + +static void reftable_log_record_key(const void *r, struct strbuf *dest) +{ + const struct reftable_log_record *rec = + (const struct reftable_log_record *)r; + int len = strlen(rec->refname); + uint8_t i64[8]; + uint64_t ts = 0; + strbuf_reset(dest); + strbuf_add(dest, (uint8_t *)rec->refname, len + 1); + + ts = (~ts) - rec->update_index; + put_be64(&i64[0], ts); + strbuf_add(dest, i64, sizeof(i64)); +} + +static void reftable_log_record_copy_from(void *rec, const void *src_rec, + int hash_size) +{ + struct reftable_log_record *dst = (struct reftable_log_record *)rec; + const struct reftable_log_record *src = + (const struct reftable_log_record *)src_rec; + + reftable_log_record_release(dst); + *dst = *src; + if (dst->refname != NULL) { + dst->refname = xstrdup(dst->refname); + } + if (dst->email != NULL) { + dst->email = xstrdup(dst->email); + } + if (dst->name != NULL) { + dst->name = xstrdup(dst->name); + } + if (dst->message != NULL) { + dst->message = xstrdup(dst->message); + } + + if (dst->new_hash != NULL) { + dst->new_hash = reftable_malloc(hash_size); + memcpy(dst->new_hash, src->new_hash, hash_size); + } + if (dst->old_hash != NULL) { + dst->old_hash = reftable_malloc(hash_size); + memcpy(dst->old_hash, src->old_hash, hash_size); + } +} + +static void reftable_log_record_release_void(void *rec) +{ + struct reftable_log_record *r = (struct reftable_log_record *)rec; + reftable_log_record_release(r); +} + +void reftable_log_record_release(struct reftable_log_record *r) +{ + reftable_free(r->refname); + reftable_free(r->new_hash); + reftable_free(r->old_hash); + reftable_free(r->name); + reftable_free(r->email); + reftable_free(r->message); + memset(r, 0, sizeof(struct reftable_log_record)); +} + +static uint8_t 
reftable_log_record_val_type(const void *rec) +{ + const struct reftable_log_record *log = + (const struct reftable_log_record *)rec; + + return reftable_log_record_is_deletion(log) ? 0 : 1; +} + +static uint8_t zero[SHA256_SIZE] = { 0 }; + +static int reftable_log_record_encode(const void *rec, struct string_view s, + int hash_size) +{ + struct reftable_log_record *r = (struct reftable_log_record *)rec; + struct string_view start = s; + int n = 0; + uint8_t *oldh = r->old_hash; + uint8_t *newh = r->new_hash; + if (reftable_log_record_is_deletion(r)) + return 0; + + if (oldh == NULL) { + oldh = zero; + } + if (newh == NULL) { + newh = zero; + } + + if (s.len < 2 * hash_size) + return -1; + + memcpy(s.buf, oldh, hash_size); + memcpy(s.buf + hash_size, newh, hash_size); + string_view_consume(&s, 2 * hash_size); + + n = encode_string(r->name ? r->name : "", s); + if (n < 0) + return -1; + string_view_consume(&s, n); + + n = encode_string(r->email ? r->email : "", s); + if (n < 0) + return -1; + string_view_consume(&s, n); + + n = put_var_int(&s, r->time); + if (n < 0) + return -1; + string_view_consume(&s, n); + + if (s.len < 2) + return -1; + + put_be16(s.buf, r->tz_offset); + string_view_consume(&s, 2); + + n = encode_string(r->message ? 
r->message : "", s); + if (n < 0) + return -1; + string_view_consume(&s, n); + + return start.len - s.len; +} + +static int reftable_log_record_decode(void *rec, struct strbuf key, + uint8_t val_type, struct string_view in, + int hash_size) +{ + struct string_view start = in; + struct reftable_log_record *r = (struct reftable_log_record *)rec; + uint64_t max = 0; + uint64_t ts = 0; + struct strbuf dest = STRBUF_INIT; + int n; + + if (key.len <= 9 || key.buf[key.len - 9] != 0) + return REFTABLE_FORMAT_ERROR; + + r->refname = reftable_realloc(r->refname, key.len - 8); + memcpy(r->refname, key.buf, key.len - 8); + ts = get_be64(key.buf + key.len - 8); + + r->update_index = (~max) - ts; + + if (val_type == 0) { + FREE_AND_NULL(r->old_hash); + FREE_AND_NULL(r->new_hash); + FREE_AND_NULL(r->message); + FREE_AND_NULL(r->email); + FREE_AND_NULL(r->name); + return 0; + } + + if (in.len < 2 * hash_size) + return REFTABLE_FORMAT_ERROR; + + r->old_hash = reftable_realloc(r->old_hash, hash_size); + r->new_hash = reftable_realloc(r->new_hash, hash_size); + + memcpy(r->old_hash, in.buf, hash_size); + memcpy(r->new_hash, in.buf + hash_size, hash_size); + + string_view_consume(&in, 2 * hash_size); + + n = decode_string(&dest, in); + if (n < 0) + goto done; + string_view_consume(&in, n); + + r->name = reftable_realloc(r->name, dest.len + 1); + memcpy(r->name, dest.buf, dest.len); + r->name[dest.len] = 0; + + strbuf_reset(&dest); + n = decode_string(&dest, in); + if (n < 0) + goto done; + string_view_consume(&in, n); + + r->email = reftable_realloc(r->email, dest.len + 1); + memcpy(r->email, dest.buf, dest.len); + r->email[dest.len] = 0; + + ts = 0; + n = get_var_int(&ts, &in); + if (n < 0) + goto done; + string_view_consume(&in, n); + r->time = ts; + if (in.len < 2) + goto done; + + r->tz_offset = get_be16(in.buf); + string_view_consume(&in, 2); + + strbuf_reset(&dest); + n = decode_string(&dest, in); + if (n < 0) + goto done; + string_view_consume(&in, n); + + r->message = 
reftable_realloc(r->message, dest.len + 1); + memcpy(r->message, dest.buf, dest.len); + r->message[dest.len] = 0; + + strbuf_release(&dest); + return start.len - in.len; + +done: + strbuf_release(&dest); + return REFTABLE_FORMAT_ERROR; +} + +static int null_streq(char *a, char *b) +{ + char *empty = ""; + if (a == NULL) + a = empty; + + if (b == NULL) + b = empty; + + return 0 == strcmp(a, b); +} + +static int zero_hash_eq(uint8_t *a, uint8_t *b, int sz) +{ + if (a == NULL) + a = zero; + + if (b == NULL) + b = zero; + + return !memcmp(a, b, sz); +} + +int reftable_log_record_equal(struct reftable_log_record *a, + struct reftable_log_record *b, int hash_size) +{ + return null_streq(a->name, b->name) && null_streq(a->email, b->email) && + null_streq(a->message, b->message) && + zero_hash_eq(a->old_hash, b->old_hash, hash_size) && + zero_hash_eq(a->new_hash, b->new_hash, hash_size) && + a->time == b->time && a->tz_offset == b->tz_offset && + a->update_index == b->update_index; +} + +static int reftable_log_record_is_deletion_void(const void *p) +{ + return reftable_log_record_is_deletion( + (const struct reftable_log_record *)p); +} + +static struct reftable_record_vtable reftable_log_record_vtable = { + .key = &reftable_log_record_key, + .type = BLOCK_TYPE_LOG, + .copy_from = &reftable_log_record_copy_from, + .val_type = &reftable_log_record_val_type, + .encode = &reftable_log_record_encode, + .decode = &reftable_log_record_decode, + .release = &reftable_log_record_release_void, + .is_deletion = &reftable_log_record_is_deletion_void, +}; + +struct reftable_record reftable_new_record(uint8_t typ) +{ + struct reftable_record rec = { NULL }; + switch (typ) { + case BLOCK_TYPE_REF: { + struct reftable_ref_record *r = + reftable_calloc(sizeof(struct reftable_ref_record)); + reftable_record_from_ref(&rec, r); + return rec; + } + + case BLOCK_TYPE_OBJ: { + struct reftable_obj_record *r = + reftable_calloc(sizeof(struct reftable_obj_record)); + reftable_record_from_obj(&rec, 
r); + return rec; + } + case BLOCK_TYPE_LOG: { + struct reftable_log_record *r = + reftable_calloc(sizeof(struct reftable_log_record)); + reftable_record_from_log(&rec, r); + return rec; + } + case BLOCK_TYPE_INDEX: { + struct reftable_index_record empty = { .last_key = + STRBUF_INIT }; + struct reftable_index_record *r = + reftable_calloc(sizeof(struct reftable_index_record)); + *r = empty; + reftable_record_from_index(&rec, r); + return rec; + } + } + abort(); + return rec; +} + +/* clear out the record, yielding the reftable_record data that was + * encapsulated. */ +static void *reftable_record_yield(struct reftable_record *rec) +{ + void *p = rec->data; + rec->data = NULL; + return p; +} + +void reftable_record_destroy(struct reftable_record *rec) +{ + reftable_record_release(rec); + reftable_free(reftable_record_yield(rec)); +} + +static void reftable_index_record_key(const void *r, struct strbuf *dest) +{ + struct reftable_index_record *rec = (struct reftable_index_record *)r; + strbuf_reset(dest); + strbuf_addbuf(dest, &rec->last_key); +} + +static void reftable_index_record_copy_from(void *rec, const void *src_rec, + int hash_size) +{ + struct reftable_index_record *dst = (struct reftable_index_record *)rec; + struct reftable_index_record *src = + (struct reftable_index_record *)src_rec; + + strbuf_reset(&dst->last_key); + strbuf_addbuf(&dst->last_key, &src->last_key); + dst->offset = src->offset; +} + +static void reftable_index_record_release(void *rec) +{ + struct reftable_index_record *idx = (struct reftable_index_record *)rec; + strbuf_release(&idx->last_key); +} + +static uint8_t reftable_index_record_val_type(const void *rec) +{ + return 0; +} + +static int reftable_index_record_encode(const void *rec, struct string_view out, + int hash_size) +{ + const struct reftable_index_record *r = + (const struct reftable_index_record *)rec; + struct string_view start = out; + + int n = put_var_int(&out, r->offset); + if (n < 0) + return n; + + 
string_view_consume(&out, n); + + return start.len - out.len; +} + +static int reftable_index_record_decode(void *rec, struct strbuf key, + uint8_t val_type, struct string_view in, + int hash_size) +{ + struct string_view start = in; + struct reftable_index_record *r = (struct reftable_index_record *)rec; + int n = 0; + + strbuf_reset(&r->last_key); + strbuf_addbuf(&r->last_key, &key); + + n = get_var_int(&r->offset, &in); + if (n < 0) + return n; + + string_view_consume(&in, n); + return start.len - in.len; +} + +static struct reftable_record_vtable reftable_index_record_vtable = { + .key = &reftable_index_record_key, + .type = BLOCK_TYPE_INDEX, + .copy_from = &reftable_index_record_copy_from, + .val_type = &reftable_index_record_val_type, + .encode = &reftable_index_record_encode, + .decode = &reftable_index_record_decode, + .release = &reftable_index_record_release, + .is_deletion = &not_a_deletion, +}; + +void reftable_record_key(struct reftable_record *rec, struct strbuf *dest) +{ + rec->ops->key(rec->data, dest); +} + +uint8_t reftable_record_type(struct reftable_record *rec) +{ + return rec->ops->type; +} + +int reftable_record_encode(struct reftable_record *rec, struct string_view dest, + int hash_size) +{ + return rec->ops->encode(rec->data, dest, hash_size); +} + +void reftable_record_copy_from(struct reftable_record *rec, + struct reftable_record *src, int hash_size) +{ + assert(src->ops->type == rec->ops->type); + + rec->ops->copy_from(rec->data, src->data, hash_size); +} + +uint8_t reftable_record_val_type(struct reftable_record *rec) +{ + return rec->ops->val_type(rec->data); +} + +int reftable_record_decode(struct reftable_record *rec, struct strbuf key, + uint8_t extra, struct string_view src, int hash_size) +{ + return rec->ops->decode(rec->data, key, extra, src, hash_size); +} + +void reftable_record_release(struct reftable_record *rec) +{ + rec->ops->release(rec->data); +} + +int reftable_record_is_deletion(struct reftable_record *rec) +{ + return 
rec->ops->is_deletion(rec->data); +} + +void reftable_record_from_ref(struct reftable_record *rec, + struct reftable_ref_record *ref_rec) +{ + assert(rec->ops == NULL); + rec->data = ref_rec; + rec->ops = &reftable_ref_record_vtable; +} + +void reftable_record_from_obj(struct reftable_record *rec, + struct reftable_obj_record *obj_rec) +{ + assert(rec->ops == NULL); + rec->data = obj_rec; + rec->ops = &reftable_obj_record_vtable; +} + +void reftable_record_from_index(struct reftable_record *rec, + struct reftable_index_record *index_rec) +{ + assert(rec->ops == NULL); + rec->data = index_rec; + rec->ops = &reftable_index_record_vtable; +} + +void reftable_record_from_log(struct reftable_record *rec, + struct reftable_log_record *log_rec) +{ + assert(rec->ops == NULL); + rec->data = log_rec; + rec->ops = &reftable_log_record_vtable; +} + +struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *rec) +{ + assert(reftable_record_type(rec) == BLOCK_TYPE_REF); + return (struct reftable_ref_record *)rec->data; +} + +struct reftable_log_record *reftable_record_as_log(struct reftable_record *rec) +{ + assert(reftable_record_type(rec) == BLOCK_TYPE_LOG); + return (struct reftable_log_record *)rec->data; +} + +static int hash_equal(uint8_t *a, uint8_t *b, int hash_size) +{ + if (a != NULL && b != NULL) + return !memcmp(a, b, hash_size); + + return a == b; +} + +int reftable_ref_record_equal(struct reftable_ref_record *a, + struct reftable_ref_record *b, int hash_size) +{ + assert(hash_size > 0); + if (!(0 == strcmp(a->refname, b->refname) && + a->update_index == b->update_index && + a->value_type == b->value_type)) + return 0; + + switch (a->value_type) { + case REFTABLE_REF_SYMREF: + return !strcmp(a->value.symref, b->value.symref); + case REFTABLE_REF_VAL2: + return hash_equal(a->value.val2.value, b->value.val2.value, + hash_size) && + hash_equal(a->value.val2.target_value, + b->value.val2.target_value, hash_size); + case REFTABLE_REF_VAL1: + return 
hash_equal(a->value.val1, b->value.val1, hash_size); + case REFTABLE_REF_DELETION: + return 1; + default: + abort(); + } +} + +int reftable_ref_record_compare_name(const void *a, const void *b) +{ + return strcmp(((struct reftable_ref_record *)a)->refname, + ((struct reftable_ref_record *)b)->refname); +} + +int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref) +{ + return ref->value_type == REFTABLE_REF_DELETION; +} + +int reftable_log_record_compare_key(const void *a, const void *b) +{ + struct reftable_log_record *la = (struct reftable_log_record *)a; + struct reftable_log_record *lb = (struct reftable_log_record *)b; + + int cmp = strcmp(la->refname, lb->refname); + if (cmp) + return cmp; + if (la->update_index > lb->update_index) + return -1; + return (la->update_index < lb->update_index) ? 1 : 0; +} + +int reftable_log_record_is_deletion(const struct reftable_log_record *log) +{ + return (log->new_hash == NULL && log->old_hash == NULL && + log->name == NULL && log->email == NULL && + log->message == NULL && log->time == 0 && log->tz_offset == 0 && + log->message == NULL); +} + +void string_view_consume(struct string_view *s, int n) +{ + s->buf += n; + s->len -= n; +} diff --git a/reftable/record.h b/reftable/record.h new file mode 100644 index 00000000000..498e8c50bf4 --- /dev/null +++ b/reftable/record.h @@ -0,0 +1,139 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef RECORD_H +#define RECORD_H + +#include "system.h" + +#include + +#include "reftable-record.h" + +/* + * A substring of existing string data. This structure takes no responsibility + * for the lifetime of the data it points to. + */ +struct string_view { + uint8_t *buf; + size_t len; +}; + +/* Advance `s.buf` by `n`, and decrease length. 
*/ +void string_view_consume(struct string_view *s, int n); + +/* utilities for de/encoding varints */ + +int get_var_int(uint64_t *dest, struct string_view *in); +int put_var_int(struct string_view *dest, uint64_t val); + +/* Methods for records. */ +struct reftable_record_vtable { + /* encode the key of the record to a uint8_t strbuf. */ + void (*key)(const void *rec, struct strbuf *dest); + + /* The record type ('r' for ref). */ + uint8_t type; + + void (*copy_from)(void *dest, const void *src, int hash_size); + + /* a value of [0..7], indicating record subvariants (e.g. ref vs. symref + * vs. ref deletion) */ + uint8_t (*val_type)(const void *rec); + + /* encodes rec into dest, returning how much space was used. */ + int (*encode)(const void *rec, struct string_view dest, int hash_size); + + /* decode data from `src` into the record. */ + int (*decode)(void *rec, struct strbuf key, uint8_t extra, + struct string_view src, int hash_size); + + /* deallocate and null the record. */ + void (*release)(void *rec); + + /* is this a tombstone? */ + int (*is_deletion)(const void *rec); +}; + +/* record is a generic wrapper for different types of records. */ +struct reftable_record { + void *data; + struct reftable_record_vtable *ops; +}; + +/* returns true for recognized block types. Blocks start with the block type. */ +int reftable_is_block_type(uint8_t typ); + +/* creates a malloced record of the given type. Dispose with + * reftable_record_destroy */ +struct reftable_record reftable_new_record(uint8_t typ); + +/* Encode `key` into `dest`. Sets `is_restart` to indicate a restart. Returns + * number of bytes written. */ +int reftable_encode_key(int *is_restart, struct string_view dest, + struct strbuf prev_key, struct strbuf key, + uint8_t extra); + +/* Decode into `key` and `extra` from `in` */ +int reftable_decode_key(struct strbuf *key, uint8_t *extra, + struct strbuf last_key, struct string_view in); + +/* reftable_index_record is used internally to speed up lookups. 
*/ +struct reftable_index_record { + uint64_t offset; /* Offset of block */ + struct strbuf last_key; /* Last key of the block. */ +}; + +/* reftable_obj_record stores an object ID => ref mapping. */ +struct reftable_obj_record { + uint8_t *hash_prefix; /* leading bytes of the object ID */ + int hash_prefix_len; /* number of leading bytes. Constant + * across a single table. */ + uint64_t *offsets; /* a vector of file offsets. */ + int offset_len; +}; + +/* see struct record_vtable */ + +void reftable_record_key(struct reftable_record *rec, struct strbuf *dest); +uint8_t reftable_record_type(struct reftable_record *rec); +void reftable_record_copy_from(struct reftable_record *rec, + struct reftable_record *src, int hash_size); +uint8_t reftable_record_val_type(struct reftable_record *rec); +int reftable_record_encode(struct reftable_record *rec, struct string_view dest, + int hash_size); +int reftable_record_decode(struct reftable_record *rec, struct strbuf key, + uint8_t extra, struct string_view src, + int hash_size); +int reftable_record_is_deletion(struct reftable_record *rec); + +/* zeroes out the embedded record */ +void reftable_record_release(struct reftable_record *rec); + +/* clear and deallocate embedded record, and zero `rec`. */ +void reftable_record_destroy(struct reftable_record *rec); + +/* initialize generic records from concrete records. The generic record should + * be zeroed out. 
*/ +void reftable_record_from_obj(struct reftable_record *rec, + struct reftable_obj_record *objrec); +void reftable_record_from_index(struct reftable_record *rec, + struct reftable_index_record *idxrec); +void reftable_record_from_ref(struct reftable_record *rec, + struct reftable_ref_record *refrec); +void reftable_record_from_log(struct reftable_record *rec, + struct reftable_log_record *logrec); +struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *ref); +struct reftable_log_record *reftable_record_as_log(struct reftable_record *ref); + +/* for qsort. */ +int reftable_ref_record_compare_name(const void *a, const void *b); + +/* for qsort. */ +int reftable_log_record_compare_key(const void *a, const void *b); + +#endif diff --git a/reftable/record_test.c b/reftable/record_test.c new file mode 100644 index 00000000000..1bfeafe7f44 --- /dev/null +++ b/reftable/record_test.c @@ -0,0 +1,398 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "record.h" + +#include "system.h" +#include "basics.h" +#include "constants.h" +#include "test_framework.h" +#include "reftable-tests.h" + +static void test_copy(struct reftable_record *rec) +{ + struct reftable_record copy = + reftable_new_record(reftable_record_type(rec)); + reftable_record_copy_from(&copy, rec, SHA1_SIZE); + /* do it twice to catch memory leaks */ + reftable_record_copy_from(&copy, rec, SHA1_SIZE); + switch (reftable_record_type(&copy)) { + case BLOCK_TYPE_REF: + EXPECT(reftable_ref_record_equal(reftable_record_as_ref(&copy), + reftable_record_as_ref(rec), + SHA1_SIZE)); + break; + case BLOCK_TYPE_LOG: + EXPECT(reftable_log_record_equal(reftable_record_as_log(&copy), + reftable_record_as_log(rec), + SHA1_SIZE)); + break; + } + reftable_record_destroy(&copy); +} + +static void test_varint_roundtrip(void) +{ + uint64_t inputs[] = { 0, + 1, + 27, + 127, + 
128, + 257, + 4096, + ((uint64_t)1 << 63), + ((uint64_t)1 << 63) + ((uint64_t)1 << 63) - 1 }; + int i = 0; + for (i = 0; i < ARRAY_SIZE(inputs); i++) { + uint8_t dest[10]; + + struct string_view out = { + .buf = dest, + .len = sizeof(dest), + }; + uint64_t in = inputs[i]; + int n = put_var_int(&out, in); + uint64_t got = 0; + + EXPECT(n > 0); + out.len = n; + n = get_var_int(&got, &out); + EXPECT(n > 0); + + EXPECT(got == in); + } +} + +static void test_common_prefix(void) +{ + struct { + const char *a, *b; + int want; + } cases[] = { + { "abc", "ab", 2 }, + { "", "abc", 0 }, + { "abc", "abd", 2 }, + { "abc", "pqr", 0 }, + }; + + int i = 0; + for (i = 0; i < ARRAY_SIZE(cases); i++) { + struct strbuf a = STRBUF_INIT; + struct strbuf b = STRBUF_INIT; + strbuf_addstr(&a, cases[i].a); + strbuf_addstr(&b, cases[i].b); + EXPECT(common_prefix_size(&a, &b) == cases[i].want); + + strbuf_release(&a); + strbuf_release(&b); + } +} + +static void set_hash(uint8_t *h, int j) +{ + int i = 0; + for (i = 0; i < hash_size(SHA1_ID); i++) { + h[i] = (j >> i) & 0xff; + } +} + +static void test_reftable_ref_record_roundtrip(void) +{ + int i = 0; + + for (i = REFTABLE_REF_DELETION; i < REFTABLE_NR_REF_VALUETYPES; i++) { + struct reftable_ref_record in = { NULL }; + struct reftable_ref_record out = { NULL }; + struct reftable_record rec_out = { NULL }; + struct strbuf key = STRBUF_INIT; + struct reftable_record rec = { NULL }; + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + + int n, m; + + in.value_type = i; + switch (i) { + case REFTABLE_REF_DELETION: + break; + case REFTABLE_REF_VAL1: + in.value.val1 = reftable_malloc(SHA1_SIZE); + set_hash(in.value.val1, 1); + break; + case REFTABLE_REF_VAL2: + in.value.val2.value = reftable_malloc(SHA1_SIZE); + set_hash(in.value.val2.value, 1); + in.value.val2.target_value = reftable_malloc(SHA1_SIZE); + set_hash(in.value.val2.target_value, 2); + break; + case REFTABLE_REF_SYMREF: + 
in.value.symref = xstrdup("target"); + break; + } + in.refname = xstrdup("refs/heads/master"); + + reftable_record_from_ref(&rec, &in); + test_copy(&rec); + + EXPECT(reftable_record_val_type(&rec) == i); + + reftable_record_key(&rec, &key); + n = reftable_record_encode(&rec, dest, SHA1_SIZE); + EXPECT(n > 0); + + /* decode into a non-zero reftable_record to test for leaks. */ + + reftable_record_from_ref(&rec_out, &out); + m = reftable_record_decode(&rec_out, key, i, dest, SHA1_SIZE); + EXPECT(n == m); + + EXPECT(reftable_ref_record_equal(&in, &out, SHA1_SIZE)); + reftable_record_release(&rec_out); + + strbuf_release(&key); + reftable_ref_record_release(&in); + } +} + +static void test_reftable_log_record_equal(void) +{ + struct reftable_log_record in[2] = { + { + .refname = xstrdup("refs/heads/master"), + .update_index = 42, + }, + { + .refname = xstrdup("refs/heads/master"), + .update_index = 22, + } + }; + + EXPECT(!reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE)); + in[1].update_index = in[0].update_index; + EXPECT(reftable_log_record_equal(&in[0], &in[1], SHA1_SIZE)); + reftable_log_record_release(&in[0]); + reftable_log_record_release(&in[1]); +} + +static void test_reftable_log_record_roundtrip(void) +{ + struct reftable_log_record in[2] = { + { + .refname = xstrdup("refs/heads/master"), + .old_hash = reftable_malloc(SHA1_SIZE), + .new_hash = reftable_malloc(SHA1_SIZE), + .name = xstrdup("han-wen"), + .email = xstrdup("hanwen@google.com"), + .message = xstrdup("test"), + .update_index = 42, + .time = 1577123507, + .tz_offset = 100, + }, + { + .refname = xstrdup("refs/heads/master"), + .update_index = 22, + } + }; + set_test_hash(in[0].new_hash, 1); + set_test_hash(in[0].old_hash, 2); + for (int i = 0; i < ARRAY_SIZE(in); i++) { + struct reftable_record rec = { NULL }; + struct strbuf key = STRBUF_INIT; + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + /* populate out, to check for leaks. 
*/ + struct reftable_log_record out = { + .refname = xstrdup("old name"), + .new_hash = reftable_calloc(SHA1_SIZE), + .old_hash = reftable_calloc(SHA1_SIZE), + .name = xstrdup("old name"), + .email = xstrdup("old@email"), + .message = xstrdup("old message"), + }; + struct reftable_record rec_out = { NULL }; + int n, m, valtype; + + reftable_record_from_log(&rec, &in[i]); + + test_copy(&rec); + + reftable_record_key(&rec, &key); + + n = reftable_record_encode(&rec, dest, SHA1_SIZE); + EXPECT(n >= 0); + reftable_record_from_log(&rec_out, &out); + valtype = reftable_record_val_type(&rec); + m = reftable_record_decode(&rec_out, key, valtype, dest, + SHA1_SIZE); + EXPECT(n == m); + + EXPECT(reftable_log_record_equal(&in[i], &out, SHA1_SIZE)); + reftable_log_record_release(&in[i]); + strbuf_release(&key); + reftable_record_release(&rec_out); + } +} + +static void test_u24_roundtrip(void) +{ + uint32_t in = 0x112233; + uint8_t dest[3]; + uint32_t out; + put_be24(dest, in); + out = get_be24(dest); + EXPECT(in == out); +} + +static void test_key_roundtrip(void) +{ + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + struct strbuf last_key = STRBUF_INIT; + struct strbuf key = STRBUF_INIT; + struct strbuf roundtrip = STRBUF_INIT; + int restart; + uint8_t extra; + int n, m; + uint8_t rt_extra; + + strbuf_addstr(&last_key, "refs/heads/master"); + strbuf_addstr(&key, "refs/tags/bla"); + extra = 6; + n = reftable_encode_key(&restart, dest, last_key, key, extra); + EXPECT(!restart); + EXPECT(n > 0); + + m = reftable_decode_key(&roundtrip, &rt_extra, last_key, dest); + EXPECT(n == m); + EXPECT(0 == strbuf_cmp(&key, &roundtrip)); + EXPECT(rt_extra == extra); + + strbuf_release(&last_key); + strbuf_release(&key); + strbuf_release(&roundtrip); +} + +static void test_reftable_obj_record_roundtrip(void) +{ + uint8_t testHash1[SHA1_SIZE] = { 1, 2, 3, 4, 0 }; + uint64_t till9[] = { 1, 2, 3, 4, 500, 600, 700, 800, 9000 }; + struct 
reftable_obj_record recs[3] = { { + .hash_prefix = testHash1, + .hash_prefix_len = 5, + .offsets = till9, + .offset_len = 3, + }, + { + .hash_prefix = testHash1, + .hash_prefix_len = 5, + .offsets = till9, + .offset_len = 9, + }, + { + .hash_prefix = testHash1, + .hash_prefix_len = 5, + } }; + int i = 0; + for (i = 0; i < ARRAY_SIZE(recs); i++) { + struct reftable_obj_record in = recs[i]; + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + struct reftable_record rec = { NULL }; + struct strbuf key = STRBUF_INIT; + struct reftable_obj_record out = { NULL }; + struct reftable_record rec_out = { NULL }; + int n, m; + uint8_t extra; + + reftable_record_from_obj(&rec, &in); + test_copy(&rec); + reftable_record_key(&rec, &key); + n = reftable_record_encode(&rec, dest, SHA1_SIZE); + EXPECT(n > 0); + extra = reftable_record_val_type(&rec); + reftable_record_from_obj(&rec_out, &out); + m = reftable_record_decode(&rec_out, key, extra, dest, + SHA1_SIZE); + EXPECT(n == m); + + EXPECT(in.hash_prefix_len == out.hash_prefix_len); + EXPECT(in.offset_len == out.offset_len); + + EXPECT(!memcmp(in.hash_prefix, out.hash_prefix, + in.hash_prefix_len)); + EXPECT(0 == memcmp(in.offsets, out.offsets, + sizeof(uint64_t) * in.offset_len)); + strbuf_release(&key); + reftable_record_release(&rec_out); + } +} + +static void test_reftable_index_record_roundtrip(void) +{ + struct reftable_index_record in = { + .offset = 42, + .last_key = STRBUF_INIT, + }; + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + struct strbuf key = STRBUF_INIT; + struct reftable_record rec = { NULL }; + struct reftable_index_record out = { .last_key = STRBUF_INIT }; + struct reftable_record out_rec = { NULL }; + int n, m; + uint8_t extra; + + strbuf_addstr(&in.last_key, "refs/heads/master"); + reftable_record_from_index(&rec, &in); + reftable_record_key(&rec, &key); + test_copy(&rec); + + EXPECT(0 == 
strbuf_cmp(&key, &in.last_key)); + n = reftable_record_encode(&rec, dest, SHA1_SIZE); + EXPECT(n > 0); + + extra = reftable_record_val_type(&rec); + reftable_record_from_index(&out_rec, &out); + m = reftable_record_decode(&out_rec, key, extra, dest, SHA1_SIZE); + EXPECT(m == n); + + EXPECT(in.offset == out.offset); + + reftable_record_release(&out_rec); + strbuf_release(&key); + strbuf_release(&in.last_key); +} + +int record_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_reftable_log_record_equal); + RUN_TEST(test_reftable_log_record_roundtrip); + RUN_TEST(test_reftable_ref_record_roundtrip); + RUN_TEST(test_varint_roundtrip); + RUN_TEST(test_key_roundtrip); + RUN_TEST(test_common_prefix); + RUN_TEST(test_reftable_obj_record_roundtrip); + RUN_TEST(test_reftable_index_record_roundtrip); + RUN_TEST(test_u24_roundtrip); + return 0; +} diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h new file mode 100644 index 00000000000..7b1bdbf5e5f --- /dev/null +++ b/reftable/reftable-record.h @@ -0,0 +1,100 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_RECORD_H +#define REFTABLE_RECORD_H + +#include <stdint.h> + +/* + * Basic data types + * + * Reftables store the state of each ref in struct reftable_ref_record, and they + * store a sequence of reflog updates in struct reftable_log_record. + */ + +/* reftable_ref_record holds a ref database entry target_value */ +struct reftable_ref_record { + char *refname; /* Name of the ref, malloced. 
*/ + uint64_t update_index; /* Logical timestamp at which this value is + * written */ + + enum { + /* tombstone to hide deletions from earlier tables */ + REFTABLE_REF_DELETION = 0x0, + + /* a simple ref */ + REFTABLE_REF_VAL1 = 0x1, + /* a tag, plus its peeled hash */ + REFTABLE_REF_VAL2 = 0x2, + + /* a symbolic reference */ + REFTABLE_REF_SYMREF = 0x3, +#define REFTABLE_NR_REF_VALUETYPES 4 + } value_type; + union { + uint8_t *val1; /* malloced hash. */ + struct { + uint8_t *value; /* first value, malloced hash */ + uint8_t *target_value; /* second value, malloced hash */ + } val2; + char *symref; /* referent, malloced 0-terminated string */ + } value; +}; + +/* Returns the first hash, or NULL if `rec` is not of type + * REFTABLE_REF_VAL1 or REFTABLE_REF_VAL2. */ +uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec); + +/* Returns the second hash, or NULL if `rec` is not of type + * REFTABLE_REF_VAL2. */ +uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec); + +/* returns whether 'ref' represents a deletion */ +int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref); + +/* prints a reftable_ref_record onto stdout. Useful for debugging. */ +void reftable_ref_record_print(struct reftable_ref_record *ref, + uint32_t hash_id); + +/* frees and nulls all pointer values inside `ref`. */ +void reftable_ref_record_release(struct reftable_ref_record *ref); + +/* returns whether two reftable_ref_records are the same. Useful for testing. */ +int reftable_ref_record_equal(struct reftable_ref_record *a, + struct reftable_ref_record *b, int hash_size); + +/* reftable_log_record holds a reflog entry */ +struct reftable_log_record { + char *refname; + uint64_t update_index; /* logical timestamp of a transactional update. + */ + uint8_t *new_hash; + uint8_t *old_hash; + char *name; + char *email; + uint64_t time; + int16_t tz_offset; + char *message; +}; + +/* returns whether 'ref' represents the deletion of a log record. 
*/ +int reftable_log_record_is_deletion(const struct reftable_log_record *log); + +/* frees and nulls all pointer values. */ +void reftable_log_record_release(struct reftable_log_record *log); + +/* returns whether two records are equal. Useful for testing. */ +int reftable_log_record_equal(struct reftable_log_record *a, + struct reftable_log_record *b, int hash_size); + +/* dumps a reftable_log_record on stdout, for debugging/testing. */ +void reftable_log_record_print(struct reftable_log_record *log, + uint32_t hash_id); + +#endif diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 3b58e423e7b..09d4b83ef9b 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -4,6 +4,6 @@ int cmd__reftable(int argc, const char **argv) { basics_test_main(argc, argv); - + record_test_main(argc, argv); return 0; } From patchwork Wed Dec 9 14:00:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 11961583 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 37C64C4167B for ; Wed, 9 Dec 2020 14:03:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EE5B923B31 for ; Wed, 9 Dec 2020 14:03:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727645AbgLIODL (ORCPT ); Wed, 9 Dec 2020 09:03:11 -0500 Received: from lindbergh.monkeyblade.net 
([23.128.96.19]:60400 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732780AbgLIOB5 (ORCPT ); Wed, 9 Dec 2020 09:01:57 -0500 Received: from mail-wm1-x330.google.com (mail-wm1-x330.google.com [IPv6:2a00:1450:4864:20::330]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 81918C061282 for ; Wed, 9 Dec 2020 06:00:40 -0800 (PST) Received: by mail-wm1-x330.google.com with SMTP id y23so1788301wmi.1 for ; Wed, 09 Dec 2020 06:00:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=d96r7tRSsaGcRQdLxw4gy2Qi4C1Ve2j1jxZSyLqDFAs=; b=n6kQL3aEcLOc9MTK+c/fnkPEivayohUc4cn0Eys8I+VUZMz+cjP4AFL61vJtmFT7Is RDV0mmK+TCoWc/2xNlS940UnGUGYogysa0XZPW1hwp8KnSxec8CvYXVj2ugMz1NjIKDQ 7UOSJw1PrPneWRcPy3BBX+EWS/EgOXW3VxSmQs6eayleWWeTJd60xBaoumvj7nkdSMoB GGbiUQ5r31bxXdh8ANDss8qtCfvnzIpRv4/i/7zAp3x3W+/puFqBVWsQpH3ZK1iQxFks O6FV7J1KMGzq33n8AaF2AmPR2IA7hZtDBpouMYtaggndfWzxs2/vwZSX6pgIZwoYacQR KHAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=d96r7tRSsaGcRQdLxw4gy2Qi4C1Ve2j1jxZSyLqDFAs=; b=VlZHXM1pTeCbL8PB2A8GNietW/n3kq6q3jt4nAWxQQeaPaFhML3I7TnAGwjJk/q1T+ yBNkzlc2JffaIkNmv1+8FtSltHrMJYFQXHSsPfxczRrGYxe5KRbQWakObZsRq45AA7Bx /C/7RRaPdqs8HV3v/VFfKkmdmMhTe5UU8B4u9CF/J0PhoUl7fUuEV89A2tIPU9hxDXaR pkE4SF5A5RH/PwLu+4o2JyK7AojDUVFYPJkPdUF0oZO9DLynxvPbc7Q1zd9L/pmfjPUQ 5PBTcztbzXCaz1IYRNlYbwhXAJyskZ4RyguSNAqyBq0iOBLmaXIGW+Pqb2UXejN1I2/u mxtA== X-Gm-Message-State: AOAM530Qo3DKX74xj81GiOPwU7Ee2d/4OPmSsCOMkOWw04yQyMKpLOqC UVY4wSQKM7Zi8yhfXlMNXGZRvj62GqI= X-Google-Smtp-Source: ABdhPJw9ctk5CMuIotIRI12UZOA9FWxjc4HXECJoCnmjgXj2CBEurtRDwBQI0xxwfmGYj/9zwIrryg== X-Received: by 2002:a1c:7e11:: with SMTP id z17mr2865381wmc.83.1607522438570; Wed, 09 Dec 2020 
06:00:38 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id d16sm4031415wrw.17.2020.12.09.06.00.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Dec 2020 06:00:38 -0800 (PST) Message-Id: In-Reply-To: References: Date: Wed, 09 Dec 2020 14:00:21 +0000 Subject: [PATCH v4 07/15] reftable: reading/writing blocks Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Jeff King , Ramsay Jones , Jonathan Nieder , Johannes Schindelin , Jonathan Tan , Josh Steadmon , Emily Shaffer , Patrick Steinhardt , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Felipe Contreras , Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The reftable format is structured as a sequence of blocks. Within a block, records are prefix compressed, with an index of offsets for fully expanded keys to enable binary search within blocks. This commit provides the logic to read and write these blocks. 
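The idea of prefix compression with restart points can be sketched in a few lines of C. This is only an illustration under simplifying assumptions, not the code in this patch: the `sketch_*` names, the in-memory entry array, and the restart interval of 2 are all hypothetical, whereas the real block stores varint-encoded records and 24-bit restart offsets. Every key at a restart point is stored in full, so a seek can binary search the restart keys and then scan forward linearly, rebuilding each key from the previous one.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_RESTART_INTERVAL 2 /* hypothetical; the patch uses 16 */
#define SKETCH_MAX_KEYS 16
#define SKETCH_KEY_MAX 256

struct sketch_entry {
	size_t prefix_len;  /* bytes shared with the previous key */
	const char *suffix; /* remainder of the key */
};

struct sketch_block {
	struct sketch_entry entries[SKETCH_MAX_KEYS];
	size_t nr;
	size_t restarts[SKETCH_MAX_KEYS]; /* entries stored with a full key */
	size_t restart_nr;
};

static size_t shared_prefix_len(const char *a, const char *b)
{
	size_t n = 0;
	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

/* Builds a block from `n` keys, which must already be sorted. */
static void sketch_block_build(struct sketch_block *b, const char **keys,
			       size_t n)
{
	size_t i;

	b->nr = 0;
	b->restart_nr = 0;
	for (i = 0; i < n && i < SKETCH_MAX_KEYS; i++) {
		int restart = (i % SKETCH_RESTART_INTERVAL) == 0;
		size_t p = restart ? 0 : shared_prefix_len(keys[i - 1], keys[i]);

		b->entries[i].prefix_len = p;
		b->entries[i].suffix = keys[i] + p;
		if (restart)
			b->restarts[b->restart_nr++] = i;
		b->nr++;
	}
}

/* Returns the index of the first key >= `want`, or b->nr if there is none. */
static size_t sketch_block_seek(const struct sketch_block *b, const char *want)
{
	char key[SKETCH_KEY_MAX];
	size_t lo = 0, hi = b->restart_nr;
	size_t i;

	/* Binary search for the first restart key that is > want. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (strcmp(b->entries[b->restarts[mid]].suffix, want) > 0)
			hi = mid;
		else
			lo = mid + 1;
	}

	/* Start the scan at the last restart whose key is <= want. */
	i = lo > 0 ? b->restarts[lo - 1] : 0;
	key[0] = '\0';
	for (; i < b->nr; i++) {
		size_t p = b->entries[i].prefix_len;
		size_t s = strlen(b->entries[i].suffix);

		if (p + s >= sizeof(key))
			return b->nr; /* key too long for this sketch */
		/* Keep the shared prefix of the previous key, append suffix. */
		memcpy(key + p, b->entries[i].suffix, s + 1);
		if (strcmp(key, want) >= 0)
			return i;
	}
	return b->nr;
}
```

With the keys refs/heads/a, refs/heads/b, refs/heads/main, refs/tags/v1 and refs/tags/v2, only entries 0, 2 and 4 carry a full key; entry 1 is stored as a prefix length of 11 plus the suffix "b". A larger restart interval saves more space at the cost of a longer linear scan per seek.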
Includes a code snippet copied from zlib Signed-off-by: Han-Wen Nienhuys --- Makefile | 3 + reftable/.gitattributes | 1 + reftable/block.c | 440 +++++++++++++++++++++++++++++++++++++++ reftable/block.h | 127 +++++++++++ reftable/block_test.c | 121 +++++++++++ reftable/zlib-compat.c | 92 ++++++++ t/helper/test-reftable.c | 1 + 7 files changed, 785 insertions(+) create mode 100644 reftable/.gitattributes create mode 100644 reftable/block.c create mode 100644 reftable/block.h create mode 100644 reftable/block_test.c create mode 100644 reftable/zlib-compat.c diff --git a/Makefile b/Makefile index ce3052169b4..abe0dda5777 100644 --- a/Makefile +++ b/Makefile @@ -2391,10 +2391,13 @@ XDIFF_OBJS += xdiff/xutils.o REFTABLE_OBJS += reftable/basics.o REFTABLE_OBJS += reftable/error.o +REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_OBJS += reftable/record.o +REFTABLE_OBJS += reftable/zlib-compat.o +REFTABLE_TEST_OBJS += reftable/block_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o REFTABLE_TEST_OBJS += reftable/basics_test.o diff --git a/reftable/.gitattributes b/reftable/.gitattributes new file mode 100644 index 00000000000..f44451a3795 --- /dev/null +++ b/reftable/.gitattributes @@ -0,0 +1 @@ +/zlib-compat.c whitespace=-indent-with-non-tab,-trailing-space diff --git a/reftable/block.c b/reftable/block.c new file mode 100644 index 00000000000..5b4e49fb2be --- /dev/null +++ b/reftable/block.c @@ -0,0 +1,440 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "block.h" + +#include "blocksource.h" +#include "constants.h" +#include "record.h" +#include "reftable-error.h" +#include "system.h" +#include "zlib.h" + +int header_size(int version) +{ + switch (version) { + case 1: + return 24; + case 
2: + return 28; + } + abort(); +} + +int footer_size(int version) +{ + switch (version) { + case 1: + return 68; + case 2: + return 72; + } + abort(); +} + +static int block_writer_register_restart(struct block_writer *w, int n, + int is_restart, struct strbuf *key) +{ + int rlen = w->restart_len; + if (rlen >= MAX_RESTARTS) { + is_restart = 0; + } + + if (is_restart) { + rlen++; + } + if (2 + 3 * rlen + n > w->block_size - w->next) + return -1; + if (is_restart) { + if (w->restart_len == w->restart_cap) { + w->restart_cap = w->restart_cap * 2 + 1; + w->restarts = reftable_realloc( + w->restarts, sizeof(uint32_t) * w->restart_cap); + } + + w->restarts[w->restart_len++] = w->next; + } + + w->next += n; + + strbuf_reset(&w->last_key); + strbuf_addbuf(&w->last_key, key); + w->entries++; + return 0; +} + +void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf, + uint32_t block_size, uint32_t header_off, int hash_size) +{ + bw->buf = buf; + bw->hash_size = hash_size; + bw->block_size = block_size; + bw->header_off = header_off; + bw->buf[header_off] = typ; + bw->next = header_off + 4; + bw->restart_interval = 16; + bw->entries = 0; + bw->restart_len = 0; + bw->last_key.len = 0; +} + +uint8_t block_writer_type(struct block_writer *bw) +{ + return bw->buf[bw->header_off]; +} + +/* adds the reftable_record to the block. Returns -1 if it does not fit, 0 on + success */ +int block_writer_add(struct block_writer *w, struct reftable_record *rec) +{ + struct strbuf empty = STRBUF_INIT; + struct strbuf last = + w->entries % w->restart_interval == 0 ? 
empty : w->last_key; + struct string_view out = { + .buf = w->buf + w->next, + .len = w->block_size - w->next, + }; + + struct string_view start = out; + + int is_restart = 0; + struct strbuf key = STRBUF_INIT; + int n = 0; + + reftable_record_key(rec, &key); + n = reftable_encode_key(&is_restart, out, last, key, + reftable_record_val_type(rec)); + if (n < 0) + goto done; + string_view_consume(&out, n); + + n = reftable_record_encode(rec, out, w->hash_size); + if (n < 0) + goto done; + string_view_consume(&out, n); + + if (block_writer_register_restart(w, start.len - out.len, is_restart, + &key) < 0) + goto done; + + strbuf_release(&key); + return 0; + +done: + strbuf_release(&key); + return -1; +} + +int block_writer_finish(struct block_writer *w) +{ + int i = 0; + for (i = 0; i < w->restart_len; i++) { + put_be24(w->buf + w->next, w->restarts[i]); + w->next += 3; + } + + put_be16(w->buf + w->next, w->restart_len); + w->next += 2; + put_be24(w->buf + 1 + w->header_off, w->next); + + if (block_writer_type(w) == BLOCK_TYPE_LOG) { + int block_header_skip = 4 + w->header_off; + uint8_t *compressed = NULL; + int zresult = 0; + uLongf src_len = w->next - block_header_skip; + size_t dest_cap = src_len; + + compressed = reftable_malloc(dest_cap); + while (1) { + uLongf out_dest_len = dest_cap; + + zresult = compress2(compressed, &out_dest_len, + w->buf + block_header_skip, src_len, + 9); + if (zresult == Z_BUF_ERROR) { + dest_cap *= 2; + compressed = + reftable_realloc(compressed, dest_cap); + continue; + } + + if (Z_OK != zresult) { + reftable_free(compressed); + return REFTABLE_ZLIB_ERROR; + } + + memcpy(w->buf + block_header_skip, compressed, + out_dest_len); + w->next = out_dest_len + block_header_skip; + reftable_free(compressed); + break; + } + } + return w->next; +} + +uint8_t block_reader_type(struct block_reader *r) +{ + return r->block.data[r->header_off]; +} + +int block_reader_init(struct block_reader *br, struct reftable_block *block, + uint32_t header_off, 
uint32_t table_block_size, + int hash_size) +{ + uint32_t full_block_size = table_block_size; + uint8_t typ = block->data[header_off]; + uint32_t sz = get_be24(block->data + header_off + 1); + + uint16_t restart_count = 0; + uint32_t restart_start = 0; + uint8_t *restart_bytes = NULL; + + if (!reftable_is_block_type(typ)) + return REFTABLE_FORMAT_ERROR; + + if (typ == BLOCK_TYPE_LOG) { + int block_header_skip = 4 + header_off; + uLongf dst_len = sz - block_header_skip; /* total size of dest + buffer. */ + uLongf src_len = block->len - block_header_skip; + /* Log blocks specify the *uncompressed* size in their header. + */ + uint8_t *uncompressed = reftable_malloc(sz); + + /* Copy over the block header verbatim. It's not compressed. */ + memcpy(uncompressed, block->data, block_header_skip); + + /* Uncompress */ + if (Z_OK != uncompress_return_consumed( + uncompressed + block_header_skip, &dst_len, + block->data + block_header_skip, + &src_len)) { + reftable_free(uncompressed); + return REFTABLE_ZLIB_ERROR; + } + + if (dst_len + block_header_skip != sz) + return REFTABLE_FORMAT_ERROR; + + /* We're done with the input data. */ + reftable_block_done(block); + block->data = uncompressed; + block->len = sz; + block->source = malloc_block_source(); + full_block_size = src_len + block_header_skip; + } else if (full_block_size == 0) { + full_block_size = sz; + } else if (sz < full_block_size && sz < block->len && + block->data[sz] != 0) { + /* If the block is smaller than the full block size, it is + padded (data followed by '\0') or the next block is + unaligned. */ + full_block_size = sz; + } + + restart_count = get_be16(block->data + sz - 2); + restart_start = sz - 2 - 3 * restart_count; + restart_bytes = block->data + restart_start; + + /* transfer ownership. 
*/ + br->block = *block; + block->data = NULL; + block->len = 0; + + br->hash_size = hash_size; + br->block_len = restart_start; + br->full_block_size = full_block_size; + br->header_off = header_off; + br->restart_count = restart_count; + br->restart_bytes = restart_bytes; + + return 0; +} + +static uint32_t block_reader_restart_offset(struct block_reader *br, int i) +{ + return get_be24(br->restart_bytes + 3 * i); +} + +void block_reader_start(struct block_reader *br, struct block_iter *it) +{ + it->br = br; + strbuf_reset(&it->last_key); + it->next_off = br->header_off + 4; +} + +struct restart_find_args { + int error; + struct strbuf key; + struct block_reader *r; +}; + +static int restart_key_less(size_t idx, void *args) +{ + struct restart_find_args *a = (struct restart_find_args *)args; + uint32_t off = block_reader_restart_offset(a->r, idx); + struct string_view in = { + .buf = a->r->block.data + off, + .len = a->r->block_len - off, + }; + + /* the restart key is verbatim in the block, so this could avoid the + alloc for decoding the key */ + struct strbuf rkey = STRBUF_INIT; + struct strbuf last_key = STRBUF_INIT; + uint8_t unused_extra; + int n = reftable_decode_key(&rkey, &unused_extra, last_key, in); + int result; + if (n < 0) { + a->error = 1; + return -1; + } + + result = strbuf_cmp(&a->key, &rkey); + strbuf_release(&rkey); + return result; +} + +void block_iter_copy_from(struct block_iter *dest, struct block_iter *src) +{ + dest->br = src->br; + dest->next_off = src->next_off; + strbuf_reset(&dest->last_key); + strbuf_addbuf(&dest->last_key, &src->last_key); +} + +int block_iter_next(struct block_iter *it, struct reftable_record *rec) +{ + struct string_view in = { + .buf = it->br->block.data + it->next_off, + .len = it->br->block_len - it->next_off, + }; + struct string_view start = in; + struct strbuf key = STRBUF_INIT; + uint8_t extra = 0; + int n = 0; + + if (it->next_off >= it->br->block_len) + return 1; + + n = reftable_decode_key(&key, &extra, 
it->last_key, in); + if (n < 0) + return -1; + + string_view_consume(&in, n); + n = reftable_record_decode(rec, key, extra, in, it->br->hash_size); + if (n < 0) + return -1; + string_view_consume(&in, n); + + strbuf_reset(&it->last_key); + strbuf_addbuf(&it->last_key, &key); + it->next_off += start.len - in.len; + strbuf_release(&key); + return 0; +} + +int block_reader_first_key(struct block_reader *br, struct strbuf *key) +{ + struct strbuf empty = STRBUF_INIT; + int off = br->header_off + 4; + struct string_view in = { + .buf = br->block.data + off, + .len = br->block_len - off, + }; + + uint8_t extra = 0; + int n = reftable_decode_key(key, &extra, empty, in); + if (n < 0) + return n; + + return 0; +} + +int block_iter_seek(struct block_iter *it, struct strbuf *want) +{ + return block_reader_seek(it->br, it, want); +} + +void block_iter_close(struct block_iter *it) +{ + strbuf_release(&it->last_key); +} + +int block_reader_seek(struct block_reader *br, struct block_iter *it, + struct strbuf *want) +{ + struct restart_find_args args = { + .key = *want, + .r = br, + }; + struct reftable_record rec = reftable_new_record(block_reader_type(br)); + struct strbuf key = STRBUF_INIT; + int err = 0; + struct block_iter next = { + .last_key = STRBUF_INIT, + }; + + int i = binsearch(br->restart_count, &restart_key_less, &args); + if (args.error) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + it->br = br; + if (i > 0) { + i--; + it->next_off = block_reader_restart_offset(br, i); + } else { + it->next_off = br->header_off + 4; + } + + /* We're looking for the last entry less/equal than the wanted key, so + we have to go one entry too far and then back up. 
+ */ + while (1) { + block_iter_copy_from(&next, it); + err = block_iter_next(&next, &rec); + if (err < 0) + goto done; + + reftable_record_key(&rec, &key); + if (err > 0 || strbuf_cmp(&key, want) >= 0) { + err = 0; + goto done; + } + + block_iter_copy_from(it, &next); + } + +done: + strbuf_release(&key); + strbuf_release(&next.last_key); + reftable_record_destroy(&rec); + + return err; +} + +void block_writer_release(struct block_writer *bw) +{ + FREE_AND_NULL(bw->restarts); + strbuf_release(&bw->last_key); + /* the block is not owned. */ +} + +void reftable_block_done(struct reftable_block *blockp) +{ + struct reftable_block_source source = blockp->source; + if (blockp != NULL && source.ops != NULL) + source.ops->return_block(source.arg, blockp); + blockp->data = NULL; + blockp->len = 0; + blockp->source.ops = NULL; + blockp->source.arg = NULL; +} diff --git a/reftable/block.h b/reftable/block.h new file mode 100644 index 00000000000..e207706a644 --- /dev/null +++ b/reftable/block.h @@ -0,0 +1,127 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef BLOCK_H +#define BLOCK_H + +#include "basics.h" +#include "record.h" +#include "reftable-blocksource.h" + +/* + * Writes reftable blocks. The block_writer is reused across blocks to minimize + * allocation overhead. + */ +struct block_writer { + uint8_t *buf; + uint32_t block_size; + + /* Offset of the global header. Nonzero in the first block only. */ + uint32_t header_off; + + /* How often to restart keys. */ + int restart_interval; + int hash_size; + + /* Offset of next uint8_t to write. */ + uint32_t next; + uint32_t *restarts; + uint32_t restart_len; + uint32_t restart_cap; + + struct strbuf last_key; + int entries; +}; + +/* + * initializes the blockwriter to write `typ` entries, using `buf` as temporary + * storage. 
`buf` is not owned by the block_writer. */ +void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf, + uint32_t block_size, uint32_t header_off, int hash_size); + +/* returns the block type (e.g. 'r' for ref records). */ +uint8_t block_writer_type(struct block_writer *bw); + +/* appends the record; returns -1 if it doesn't fit. */ +int block_writer_add(struct block_writer *w, struct reftable_record *rec); + +/* appends the key restarts, and compresses the block if necessary. */ +int block_writer_finish(struct block_writer *w); + +/* clears out internally allocated block_writer members. */ +void block_writer_release(struct block_writer *bw); + +/* Read a block. */ +struct block_reader { + /* offset of the block header; nonzero for the first block in a + * reftable. */ + uint32_t header_off; + + /* the memory block */ + struct reftable_block block; + int hash_size; + + /* size of the data, excluding restart data. */ + uint32_t block_len; + uint8_t *restart_bytes; + uint16_t restart_count; + + /* size of the data in the file. For log blocks, this is the compressed + * size. */ + uint32_t full_block_size; +}; + +/* Iterate over entries in a block */ +struct block_iter { + /* offset within the block of the next entry to read. */ + uint32_t next_off; + struct block_reader *br; + + /* key for last entry we read. */ + struct strbuf last_key; +}; + +/* initializes a block reader. */ +int block_reader_init(struct block_reader *br, struct reftable_block *bl, + uint32_t header_off, uint32_t table_block_size, + int hash_size); + +/* Position `it` at start of the block */ +void block_reader_start(struct block_reader *br, struct block_iter *it); + +/* Position `it` to the `want` key in the block */ +int block_reader_seek(struct block_reader *br, struct block_iter *it, + struct strbuf *want); + +/* Returns the block type (e.g. 
'r' for refs) */ +uint8_t block_reader_type(struct block_reader *r); + +/* Decodes the first key in the block */ +int block_reader_first_key(struct block_reader *br, struct strbuf *key); + +void block_iter_copy_from(struct block_iter *dest, struct block_iter *src); + +/* return < 0 for error, 0 for OK, > 0 for EOF. */ +int block_iter_next(struct block_iter *it, struct reftable_record *rec); + +/* Seek to `want` within the block pointed to by `it` */ +int block_iter_seek(struct block_iter *it, struct strbuf *want); + +/* deallocate memory for `it`. The block reader and its block are left intact. */ +void block_iter_close(struct block_iter *it); + +/* size of file header, depending on format version */ +int header_size(int version); + +/* size of file footer, depending on format version */ +int footer_size(int version); + +/* returns a block to its source. */ +void reftable_block_done(struct reftable_block *ret); + +#endif diff --git a/reftable/block_test.c b/reftable/block_test.c new file mode 100644 index 00000000000..75fe198e677 --- /dev/null +++ b/reftable/block_test.c @@ -0,0 +1,121 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "block.h" + +#include "system.h" + +#include "blocksource.h" +#include "basics.h" +#include "constants.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-tests.h" + +static void test_block_read_write(void) +{ + const int header_off = 21; /* random */ + char *names[30]; + const int N = ARRAY_SIZE(names); + const int block_size = 1024; + struct reftable_block block = { NULL }; + struct block_writer bw = { + .last_key = STRBUF_INIT, + }; + struct reftable_ref_record ref = { NULL }; + struct reftable_record rec = { NULL }; + int i = 0; + int n; + struct block_reader br = { 0 }; + struct block_iter it = { .last_key = STRBUF_INIT }; + int j = 0; + struct 
strbuf want = STRBUF_INIT; + + block.data = reftable_calloc(block_size); + block.len = block_size; + block.source = malloc_block_source(); + block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size, + header_off, hash_size(SHA1_ID)); + reftable_record_from_ref(&rec, &ref); + + for (i = 0; i < N; i++) { + char name[100]; + uint8_t hash[SHA1_SIZE]; + snprintf(name, sizeof(name), "branch%02d", i); + memset(hash, i, sizeof(hash)); + + ref.refname = name; + ref.value_type = REFTABLE_REF_VAL1; + ref.value.val1 = hash; + + names[i] = xstrdup(name); + n = block_writer_add(&bw, &rec); + ref.refname = NULL; + ref.value_type = REFTABLE_REF_DELETION; + EXPECT(n == 0); + } + + n = block_writer_finish(&bw); + EXPECT(n > 0); + + block_writer_release(&bw); + + block_reader_init(&br, &block, header_off, block_size, SHA1_SIZE); + + block_reader_start(&br, &it); + + while (1) { + int r = block_iter_next(&it, &rec); + EXPECT(r >= 0); + if (r > 0) { + break; + } + EXPECT_STREQ(names[j], ref.refname); + j++; + } + + reftable_record_release(&rec); + block_iter_close(&it); + + for (i = 0; i < N; i++) { + struct block_iter it = { .last_key = STRBUF_INIT }; + strbuf_reset(&want); + strbuf_addstr(&want, names[i]); + + n = block_reader_seek(&br, &it, &want); + EXPECT(n == 0); + + n = block_iter_next(&it, &rec); + EXPECT(n == 0); + + EXPECT_STREQ(names[i], ref.refname); + + want.len--; + n = block_reader_seek(&br, &it, &want); + EXPECT(n == 0); + + n = block_iter_next(&it, &rec); + EXPECT(n == 0); + EXPECT_STREQ(names[10 * (i / 10)], ref.refname); + + block_iter_close(&it); + } + + reftable_record_release(&rec); + reftable_block_done(&br.block); + strbuf_release(&want); + for (i = 0; i < N; i++) { + reftable_free(names[i]); + } +} + +int block_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_block_read_write); + return 0; +} diff --git a/reftable/zlib-compat.c b/reftable/zlib-compat.c new file mode 100644 index 00000000000..3e0b0f24f1c --- /dev/null +++ b/reftable/zlib-compat.c 
@@ -0,0 +1,92 @@ +/* taken from zlib's uncompr.c + + commit cacf7f1d4e3d44d871b605da3b647f07d718623f + Author: Mark Adler + Date: Sun Jan 15 09:18:46 2017 -0800 + + zlib 1.2.11 + +*/ + +/* + * Copyright (C) 1995-2003, 2010, 2014, 2016 Jean-loup Gailly, Mark Adler + * For conditions of distribution and use, see copyright notice in zlib.h + */ + +#include "system.h" + +/* clang-format off */ + +/* =========================================================================== + Decompresses the source buffer into the destination buffer. *sourceLen is + the byte length of the source buffer. Upon entry, *destLen is the total size + of the destination buffer, which must be large enough to hold the entire + uncompressed data. (The size of the uncompressed data must have been saved + previously by the compressor and transmitted to the decompressor by some + mechanism outside the scope of this compression library.) Upon exit, + *destLen is the size of the decompressed data and *sourceLen is the number + of source bytes consumed. Upon return, source + *sourceLen points to the + first unused input byte. + + uncompress returns Z_OK if success, Z_MEM_ERROR if there was not enough + memory, Z_BUF_ERROR if there was not enough room in the output buffer, or + Z_DATA_ERROR if the input data was corrupted, including if the input data is + an incomplete zlib stream. 
+*/ +int ZEXPORT uncompress_return_consumed ( + Bytef *dest, + uLongf *destLen, + const Bytef *source, + uLong *sourceLen) { + z_stream stream; + int err; + const uInt max = (uInt)-1; + uLong len, left; + Byte buf[1]; /* for detection of incomplete stream when *destLen == 0 */ + + len = *sourceLen; + if (*destLen) { + left = *destLen; + *destLen = 0; + } + else { + left = 1; + dest = buf; + } + + stream.next_in = (z_const Bytef *)source; + stream.avail_in = 0; + stream.zalloc = (alloc_func)0; + stream.zfree = (free_func)0; + stream.opaque = (voidpf)0; + + err = inflateInit(&stream); + if (err != Z_OK) return err; + + stream.next_out = dest; + stream.avail_out = 0; + + do { + if (stream.avail_out == 0) { + stream.avail_out = left > (uLong)max ? max : (uInt)left; + left -= stream.avail_out; + } + if (stream.avail_in == 0) { + stream.avail_in = len > (uLong)max ? max : (uInt)len; + len -= stream.avail_in; + } + err = inflate(&stream, Z_NO_FLUSH); + } while (err == Z_OK); + + *sourceLen -= len + stream.avail_in; + if (dest != buf) + *destLen = stream.total_out; + else if (stream.total_out && err == Z_BUF_ERROR) + left = 1; + + inflateEnd(&stream); + return err == Z_STREAM_END ? Z_OK : + err == Z_NEED_DICT ? Z_DATA_ERROR : + err == Z_BUF_ERROR && left + stream.avail_out ? 
Z_DATA_ERROR : + err; +} diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 09d4b83ef9b..c9deeaf08c7 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -4,6 +4,7 @@ int cmd__reftable(int argc, const char **argv) { basics_test_main(argc, argv); + block_test_main(argc, argv); record_test_main(argc, argv); return 0; } From patchwork Wed Dec 9 14:00:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 11961579 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5EC5C4361B for ; Wed, 9 Dec 2020 14:03:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B2AC823B31 for ; Wed, 9 Dec 2020 14:03:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727906AbgLIOC7 (ORCPT ); Wed, 9 Dec 2020 09:02:59 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60406 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732781AbgLIOB5 (ORCPT ); Wed, 9 Dec 2020 09:01:57 -0500 Received: from mail-wr1-x433.google.com (mail-wr1-x433.google.com [IPv6:2a00:1450:4864:20::433]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E3EB2C061285 for ; Wed, 9 Dec 2020 06:00:40 -0800 (PST) Received: by mail-wr1-x433.google.com with SMTP id x6so1846816wro.11 for ; Wed, 09 Dec 2020 06:00:40 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=kmTaSl1V6fQm6G4+IW71nCIJW8ZmPn8MUkbzlmuR2Xs=; b=gJ082F5Tsh4XyieqCY8/DO34R5mjM6uAFYCxN3+praKZbim8uOvaCaCFP4cVT2Azc0 mHYHFj0/AHeET0UZYJqj8oDRdD/GQfiY4TG7vlJFu0+zLwDIY7dUYwcl+flfr3rv2q5A u6zCroeKw+nsb2I4+PaQE+xaNvoa4Qqp0w0UQc21zH+AnZ0N0J51WLicFhGnSHj2IrX8 dL2cPmw9C4BeElhptfVJU4HiYmfD86xtUN2Oaji87/bZvVVEA/6NvNrdCazYiE/6Xrwk x1ZJTX9CB5LGFl3L5RA0yPXZDMfJVjyu+35j1O+/cty6+ZYXAr7Znlo6EJgEkU31LwQQ 5JDQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=kmTaSl1V6fQm6G4+IW71nCIJW8ZmPn8MUkbzlmuR2Xs=; b=MbGwV40grmg6689FNQ0AbYcG7iqH/aVpZEGo321QIGKLLre+Mi/TQJXfJQEtCaREo/ xqKdSzX1p1k06V+1t9e29lfHIMHW6NEBfpVkoF88HmEULZMdyuK+KhWMdHfqTIEOTJj+ Rjmd55fqW8hvjMyn2EmtkN1yqV9fxKLH1jN+EjTxp89r5w/NFP7NSRgvpaNJ+m3X4QYQ HSFT3Jrbi23gNVLWohQ2q8fosoxOX73Hiau0QwhC+pZs4M9cgYzMg7ljgMWfWoPqToVf Iklnau1NT6MsfJzO5sRDliER7M14M91pnb4vBRNnE1qCssz4L9SHVp3cW7jX8+l3RFaT Uxhg== X-Gm-Message-State: AOAM531lsx0hkjk4vEbEtIs9RTUFszq4BR3xUAmeY8Y9MBHWnwF8LzzU 5tIUf7VMtTs0it5BB8736zLy7RNqIuI= X-Google-Smtp-Source: ABdhPJy+fFN4k4NYbAJW+51wFFpKbj0H3cNVQgf6/EiEJEQwujlSVQFr/243LCgvyagUu8fbtreo5w== X-Received: by 2002:adf:9b91:: with SMTP id d17mr2887751wrc.32.1607522439399; Wed, 09 Dec 2020 06:00:39 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h5sm4003637wrp.56.2020.12.09.06.00.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Dec 2020 06:00:38 -0800 (PST) Message-Id: <6968dbc3828f22369cc92bf56e3a7855769415fa.1607522429.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 09 Dec 2020 14:00:22 +0000 Subject: [PATCH v4 08/15] reftable: a generic binary tree implementation Fcc: Sent MIME-Version: 1.0 To: 
git@vger.kernel.org Cc: Han-Wen Nienhuys , Jeff King , Ramsay Jones , Jonathan Nieder , Johannes Schindelin , Jonathan Tan , Josh Steadmon , Emily Shaffer , Patrick Steinhardt , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Felipe Contreras , Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The reftable format includes support for an (OID => ref) map. This map can speed up visibility and reachability checks. In particular, various operations along the fetch/push path within Gerrit have been sped up by using this structure. The map is constructed with the help of a binary tree. Object IDs are hashes, so they are uniformly distributed. Hence, the tree does not attempt forced rebalancing. Signed-off-by: Han-Wen Nienhuys --- Makefile | 4 ++- reftable/tree.c | 63 ++++++++++++++++++++++++++++++++++++++++ reftable/tree.h | 34 ++++++++++++++++++++++ reftable/tree_test.c | 61 ++++++++++++++++++++++++++++++++++++++ t/helper/test-reftable.c | 1 + 5 files changed, 162 insertions(+), 1 deletion(-) create mode 100644 reftable/tree.c create mode 100644 reftable/tree.h create mode 100644 reftable/tree_test.c diff --git a/Makefile b/Makefile index abe0dda5777..734c05655b2 100644 --- a/Makefile +++ b/Makefile @@ -2395,12 +2395,14 @@ REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_OBJS += reftable/record.o +REFTABLE_OBJS += reftable/tree.o REFTABLE_OBJS += reftable/zlib-compat.o +REFTABLE_TEST_OBJS += reftable/basics_test.o REFTABLE_TEST_OBJS += reftable/block_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o -REFTABLE_TEST_OBJS += reftable/basics_test.o +REFTABLE_TEST_OBJS += reftable/tree_test.o TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) OBJECTS := $(LIB_OBJS) $(BUILTIN_OBJS) $(PROGRAM_OBJS) $(TEST_OBJS) \ diff 
--git a/reftable/tree.c b/reftable/tree.c new file mode 100644 index 00000000000..0061d14e306 --- /dev/null +++ b/reftable/tree.c @@ -0,0 +1,63 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "tree.h" + +#include "basics.h" +#include "system.h" + +struct tree_node *tree_search(void *key, struct tree_node **rootp, + int (*compare)(const void *, const void *), + int insert) +{ + int res; + if (*rootp == NULL) { + if (!insert) { + return NULL; + } else { + struct tree_node *n = + reftable_calloc(sizeof(struct tree_node)); + n->key = key; + *rootp = n; + return *rootp; + } + } + + res = compare(key, (*rootp)->key); + if (res < 0) + return tree_search(key, &(*rootp)->left, compare, insert); + else if (res > 0) + return tree_search(key, &(*rootp)->right, compare, insert); + return *rootp; +} + +void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key), + void *arg) +{ + if (t->left != NULL) { + infix_walk(t->left, action, arg); + } + action(arg, t->key); + if (t->right != NULL) { + infix_walk(t->right, action, arg); + } +} + +void tree_free(struct tree_node *t) +{ + if (t == NULL) { + return; + } + if (t->left != NULL) { + tree_free(t->left); + } + if (t->right != NULL) { + tree_free(t->right); + } + reftable_free(t); +} diff --git a/reftable/tree.h b/reftable/tree.h new file mode 100644 index 00000000000..fbdd002e23a --- /dev/null +++ b/reftable/tree.h @@ -0,0 +1,34 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef TREE_H +#define TREE_H + +/* tree_node is a generic binary search tree. */ +struct tree_node { + void *key; + struct tree_node *left, *right; +}; + +/* looks for `key` in `rootp` using `compare` as comparison function. 
If insert + * is set, insert the key if it's not found. Else, return NULL. + */ +struct tree_node *tree_search(void *key, struct tree_node **rootp, + int (*compare)(const void *, const void *), + int insert); + +/* performs an infix walk of the tree. */ +void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key), + void *arg); + +/* + * deallocates the tree nodes recursively. Keys should be deallocated separately + * by walking over the tree. */ +void tree_free(struct tree_node *t); + +#endif diff --git a/reftable/tree_test.c b/reftable/tree_test.c new file mode 100644 index 00000000000..26d1e694252 --- /dev/null +++ b/reftable/tree_test.c @@ -0,0 +1,61 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "tree.h" + +#include "basics.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-tests.h" + +static int test_compare(const void *a, const void *b) +{ + return (char *)a - (char *)b; +} + +struct curry { + void *last; +}; + +static void check_increasing(void *arg, void *key) +{ + struct curry *c = (struct curry *)arg; + if (c->last != NULL) { + assert(test_compare(c->last, key) < 0); + } + c->last = key; +} + +static void test_tree(void) +{ + struct tree_node *root = NULL; + + void *values[11] = { NULL }; + struct tree_node *nodes[11] = { NULL }; + int i = 1; + struct curry c = { NULL }; + do { + nodes[i] = tree_search(values + i, &root, &test_compare, 1); + i = (i * 7) % 11; + } while (i != 1); + + for (i = 1; i < ARRAY_SIZE(nodes); i++) { + assert(values + i == nodes[i]->key); + assert(nodes[i] == + tree_search(values + i, &root, &test_compare, 0)); + } + + infix_walk(root, check_increasing, &c); + tree_free(root); +} + +int tree_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_tree); + return 0; +} diff --git a/t/helper/test-reftable.c 
b/t/helper/test-reftable.c index c9deeaf08c7..050551fa698 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -6,5 +6,6 @@ int cmd__reftable(int argc, const char **argv) basics_test_main(argc, argv); block_test_main(argc, argv); record_test_main(argc, argv); + tree_test_main(argc, argv); return 0; } From patchwork Wed Dec 9 14:00:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 11961585 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A052C1B0E3 for ; Wed, 9 Dec 2020 14:03:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0F1C823B51 for ; Wed, 9 Dec 2020 14:03:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727009AbgLIODS (ORCPT ); Wed, 9 Dec 2020 09:03:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60408 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732782AbgLIOB5 (ORCPT ); Wed, 9 Dec 2020 09:01:57 -0500 Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com [IPv6:2a00:1450:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 62B4CC061257 for ; Wed, 9 Dec 2020 06:00:42 -0800 (PST) Received: by mail-wr1-x434.google.com with SMTP id a12so1850888wrv.8 for ; Wed, 09 Dec 2020 06:00:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=+vbYZsDbC5tGgrByAthPwq3w+wcZN9Y6/lE+9FXr3DY=; b=ocDQ7CsN1gHRC0CsAeiz3nC9TPV0sk4JLlRY6S5AKtl0HeaiOV7mVi/jH7y05LXK5r SfVTnfI0kFjbOKsCWlxflqTSbev8gVpAFs6LS7DAJSLrprnOVkiUBujr9XUtQYqh6Fhp bjj16YuawqUR2j70QryUtLOFVWI0hQRHZQlR8uulbe8IiEb+81oebTRXqxKHiT3oxSsv P2649iKLo1cqNWkbeJIWbp38TYS1GUwPq8EsKAiwJGPl+GlPe1Bp0jnXElsbDQimzJXV i7THrTwquk39ogjDlnapIwIIUe+h83vDNYeDxowRpEz0/YtD4rB4F3jI7wEh5gjitX/A 93aA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=+vbYZsDbC5tGgrByAthPwq3w+wcZN9Y6/lE+9FXr3DY=; b=n7TF8vbz4Vwoh0CJcrSN3bCdDtVSndsvFGm+q/dS9AtRrbVq6rYvzs0vDhtlJ7DoEx evJvgrTeR0i2ZCcxBYPgKAtuP65Yc1if9zrToPCxfkcv2mbVtZoleuSCPiSzQWfIxitc AQcDGUSO8S9zYFG9xnjaKHqNpMpoGMm3V1RV/3j6zQrZxjK06fEm7Z5ExdWy+7IgNAn8 t8rUvvGMjiezi2bwv4BlERugNuRWSzZKQ8PuR58hMgLwZ6p0D+ON1iU42eSf1Castxpg LqWttRExYyj0LK+e46ZZ8sl1LB5kzQ7+4s0S7ndHLi6jn/IVf/Pt6xv3urQZbUSjKCsq TeXg== X-Gm-Message-State: AOAM530NoJRpM+fv/c3Slel2HbcfKL5RhmSzka1vrbe91AhQ5P0c6/Xv nED12SXJuKaptXnXrJCk3+ljnuBH8NY= X-Google-Smtp-Source: ABdhPJzrWWzn+K4jg8f7BzlRY6+Uz+GdobEazGWrkcKtOghpoyCXYEFhIYdxywmBmldaGgR+MlZs1g== X-Received: by 2002:a05:6000:ce:: with SMTP id q14mr2806710wrx.277.1607522440469; Wed, 09 Dec 2020 06:00:40 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c4sm3665721wmf.19.2020.12.09.06.00.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Dec 2020 06:00:39 -0800 (PST) Message-Id: In-Reply-To: References: Date: Wed, 09 Dec 2020 14:00:23 +0000 Subject: [PATCH v4 09/15] reftable: write reftable files Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Jeff King , Ramsay Jones , Jonathan Nieder , Johannes Schindelin , Jonathan Tan , Josh Steadmon , 
Emily Shaffer , Patrick Steinhardt , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Felipe Contreras , Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + reftable/reftable-writer.h | 147 ++++++++ reftable/writer.c | 681 +++++++++++++++++++++++++++++++++++++ reftable/writer.h | 50 +++ 4 files changed, 879 insertions(+) create mode 100644 reftable/reftable-writer.h create mode 100644 reftable/writer.c create mode 100644 reftable/writer.h diff --git a/Makefile b/Makefile index 734c05655b2..e78908c976a 100644 --- a/Makefile +++ b/Makefile @@ -2396,6 +2396,7 @@ REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_OBJS += reftable/record.o REFTABLE_OBJS += reftable/tree.o +REFTABLE_OBJS += reftable/writer.o REFTABLE_OBJS += reftable/zlib-compat.o REFTABLE_TEST_OBJS += reftable/basics_test.o diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h new file mode 100644 index 00000000000..9d2f8d60555 --- /dev/null +++ b/reftable/reftable-writer.h @@ -0,0 +1,147 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_WRITER_H +#define REFTABLE_WRITER_H + +#include + +#include "reftable-record.h" + +/* Writing single reftables */ + +/* reftable_write_options sets options for writing a single reftable. */ +struct reftable_write_options { + /* boolean: do not pad out blocks to block size. */ + unsigned unpadded : 1; + + /* the blocksize. Should be less than 2^24. */ + uint32_t block_size; + + /* boolean: do not generate a SHA1 => ref index. */ + unsigned skip_index_objects : 1; + + /* how often to write complete keys in each block. */ + int restart_interval; + + /* 4-byte identifier ("sha1", "s256") of the hash. 
+ * Defaults to SHA1 if unset + */ + uint32_t hash_id; + + /* boolean: do not check ref names for validity or dir/file conflicts. + */ + unsigned skip_name_check : 1; + + /* boolean: copy log messages exactly. If unset, check that the message + * is a single line, and add '\n' if missing. + */ + unsigned exact_log_message : 1; +}; + +/* reftable_block_stats holds statistics for a single block type */ +struct reftable_block_stats { + /* total number of entries written */ + int entries; + /* total number of key restarts */ + int restarts; + /* total number of blocks */ + int blocks; + /* total number of index blocks */ + int index_blocks; + /* depth of the index */ + int max_index_level; + + /* offset of the first block for this type */ + uint64_t offset; + /* offset of the top level index block for this type, or 0 if not + * present */ + uint64_t index_offset; +}; + +/* stats holds overall statistics for a single reftable */ +struct reftable_stats { + /* total number of blocks written. */ + int blocks; + /* stats for ref data */ + struct reftable_block_stats ref_stats; + /* stats for the SHA1 to ref map. */ + struct reftable_block_stats obj_stats; + /* stats for index blocks */ + struct reftable_block_stats idx_stats; + /* stats for log blocks */ + struct reftable_block_stats log_stats; + + /* disambiguation length of shortened object IDs. */ + int object_id_len; +}; + +/* reftable_new_writer creates a new writer */ +struct reftable_writer * +reftable_new_writer(int (*writer_func)(void *, const void *, size_t), + void *writer_arg, struct reftable_write_options *opts); + +/* Set the range of update indices for the records we will add. When writing a + table into a stack, the min should be at least + reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned. + + For transactional updates to a stack, typically min==max, and the + update_index can be obtained by inspecting the stack. 
When converting an + existing ref database into a single reftable, this would be a range of + update-index timestamps. + */ +void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, + uint64_t max); + +/* + Add a reftable_ref_record. The record should have names that come after + already added records. + + The update_index must be within the limits set by + reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned. It is a + REFTABLE_API_ERROR to write a ref record after a log record. +*/ +int reftable_writer_add_ref(struct reftable_writer *w, + struct reftable_ref_record *ref); + +/* + Convenience function to add multiple reftable_ref_records; the function sorts + the records before adding them, reordering the records array passed in. +*/ +int reftable_writer_add_refs(struct reftable_writer *w, + struct reftable_ref_record *refs, int n); + +/* + Add a reftable_log_record. Log records are keyed by (refname, decreasing + update_index). The key for the record added must come after the already added + log records. +*/ +int reftable_writer_add_log(struct reftable_writer *w, + struct reftable_log_record *log); + +/* + Convenience function to add multiple reftable_log_records; the function sorts + the records before adding them, reordering the records array passed in. +*/ +int reftable_writer_add_logs(struct reftable_writer *w, + struct reftable_log_record *logs, int n); + +/* reftable_writer_close finalizes the reftable. The writer is retained so + * statistics can be inspected. */ +int reftable_writer_close(struct reftable_writer *w); + +/* writer_stats returns the statistics on the reftable being written. + + This struct becomes invalid when the writer is freed. 
+ */ +const struct reftable_stats *writer_stats(struct reftable_writer *w); + +/* reftable_writer_free deallocates memory for the writer */ +void reftable_writer_free(struct reftable_writer *w); + +#endif diff --git a/reftable/writer.c b/reftable/writer.c new file mode 100644 index 00000000000..00a0a11440f --- /dev/null +++ b/reftable/writer.c @@ -0,0 +1,681 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "writer.h" + +#include "system.h" + +#include "block.h" +#include "constants.h" +#include "record.h" +#include "tree.h" +#include "reftable-error.h" + +/* finishes a block, and writes it to storage */ +static int writer_flush_block(struct reftable_writer *w); + +/* deallocates memory related to the index */ +static void writer_clear_index(struct reftable_writer *w); + +/* finishes writing a 'r' (refs) or 'g' (reflogs) section */ +static int writer_finish_public_section(struct reftable_writer *w); + +static struct reftable_block_stats * +writer_reftable_block_stats(struct reftable_writer *w, uint8_t typ) +{ + switch (typ) { + case 'r': + return &w->stats.ref_stats; + case 'o': + return &w->stats.obj_stats; + case 'i': + return &w->stats.idx_stats; + case 'g': + return &w->stats.log_stats; + } + abort(); + return NULL; +} + +/* write data, queuing the padding for the next write. Returns negative for + * error. 
*/ +static int padded_write(struct reftable_writer *w, uint8_t *data, size_t len, + int padding) +{ + int n = 0; + if (w->pending_padding > 0) { + uint8_t *zeroed = reftable_calloc(w->pending_padding); + n = w->write(w->write_arg, zeroed, w->pending_padding); + reftable_free(zeroed); + if (n < 0) + return n; + + w->pending_padding = 0; + } + + w->pending_padding = padding; + n = w->write(w->write_arg, data, len); + if (n < 0) + return n; + n += padding; + return 0; +} + +static void options_set_defaults(struct reftable_write_options *opts) +{ + if (opts->restart_interval == 0) { + opts->restart_interval = 16; + } + + if (opts->hash_id == 0) { + opts->hash_id = SHA1_ID; + } + if (opts->block_size == 0) { + opts->block_size = DEFAULT_BLOCK_SIZE; + } +} + +static int writer_version(struct reftable_writer *w) +{ + return (w->opts.hash_id == 0 || w->opts.hash_id == SHA1_ID) ? 1 : 2; +} + +static int writer_write_header(struct reftable_writer *w, uint8_t *dest) +{ + memcpy((char *)dest, "REFT", 4); + + dest[4] = writer_version(w); + + put_be24(dest + 5, w->opts.block_size); + put_be64(dest + 8, w->min_update_index); + put_be64(dest + 16, w->max_update_index); + if (writer_version(w) == 2) { + put_be32(dest + 24, w->opts.hash_id); + } + return header_size(writer_version(w)); +} + +static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ) +{ + int block_start = 0; + if (w->next == 0) { + block_start = header_size(writer_version(w)); + } + + strbuf_release(&w->last_key); + block_writer_init(&w->block_writer_data, typ, w->block, + w->opts.block_size, block_start, + hash_size(w->opts.hash_id)); + w->block_writer = &w->block_writer_data; + w->block_writer->restart_interval = w->opts.restart_interval; +} + +static struct strbuf reftable_empty_strbuf = STRBUF_INIT; + +struct reftable_writer * +reftable_new_writer(int (*writer_func)(void *, const void *, size_t), + void *writer_arg, struct reftable_write_options *opts) +{ + struct reftable_writer *wp = + 
reftable_calloc(sizeof(struct reftable_writer)); + strbuf_init(&wp->block_writer_data.last_key, 0); + options_set_defaults(opts); + if (opts->block_size >= (1 << 24)) { + /* TODO - error return? */ + abort(); + } + wp->last_key = reftable_empty_strbuf; + wp->block = reftable_calloc(opts->block_size); + wp->write = writer_func; + wp->write_arg = writer_arg; + wp->opts = *opts; + writer_reinit_block_writer(wp, BLOCK_TYPE_REF); + + return wp; +} + +void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, + uint64_t max) +{ + w->min_update_index = min; + w->max_update_index = max; +} + +void reftable_writer_free(struct reftable_writer *w) +{ + reftable_free(w->block); + reftable_free(w); +} + +struct obj_index_tree_node { + struct strbuf hash; + uint64_t *offsets; + size_t offset_len; + size_t offset_cap; +}; + +#define OBJ_INDEX_TREE_NODE_INIT \ + { \ + .hash = STRBUF_INIT \ + } + +static int obj_index_tree_node_compare(const void *a, const void *b) +{ + return strbuf_cmp(&((const struct obj_index_tree_node *)a)->hash, + &((const struct obj_index_tree_node *)b)->hash); +} + +static void writer_index_hash(struct reftable_writer *w, struct strbuf *hash) +{ + uint64_t off = w->next; + + struct obj_index_tree_node want = { .hash = *hash }; + + struct tree_node *node = tree_search(&want, &w->obj_index_tree, + &obj_index_tree_node_compare, 0); + struct obj_index_tree_node *key = NULL; + if (node == NULL) { + struct obj_index_tree_node empty = OBJ_INDEX_TREE_NODE_INIT; + key = reftable_malloc(sizeof(struct obj_index_tree_node)); + *key = empty; + + strbuf_reset(&key->hash); + strbuf_addbuf(&key->hash, hash); + tree_search((void *)key, &w->obj_index_tree, + &obj_index_tree_node_compare, 1); + } else { + key = node->key; + } + + if (key->offset_len > 0 && key->offsets[key->offset_len - 1] == off) { + return; + } + + if (key->offset_len == key->offset_cap) { + key->offset_cap = 2 * key->offset_cap + 1; + key->offsets = reftable_realloc( + key->offsets, 
sizeof(uint64_t) * key->offset_cap); + } + + key->offsets[key->offset_len++] = off; +} + +static int writer_add_record(struct reftable_writer *w, + struct reftable_record *rec) +{ + int result = -1; + struct strbuf key = STRBUF_INIT; + int err = 0; + reftable_record_key(rec, &key); + if (strbuf_cmp(&w->last_key, &key) >= 0) + goto done; + + strbuf_reset(&w->last_key); + strbuf_addbuf(&w->last_key, &key); + if (w->block_writer == NULL) { + writer_reinit_block_writer(w, reftable_record_type(rec)); + } + + assert(block_writer_type(w->block_writer) == reftable_record_type(rec)); + + if (block_writer_add(w->block_writer, rec) == 0) { + result = 0; + goto done; + } + + err = writer_flush_block(w); + if (err < 0) { + result = err; + goto done; + } + + writer_reinit_block_writer(w, reftable_record_type(rec)); + err = block_writer_add(w->block_writer, rec); + if (err < 0) { + result = err; + goto done; + } + + result = 0; +done: + strbuf_release(&key); + return result; +} + +int reftable_writer_add_ref(struct reftable_writer *w, + struct reftable_ref_record *ref) +{ + struct reftable_record rec = { NULL }; + struct reftable_ref_record copy = *ref; + int err = 0; + + if (ref->refname == NULL) + return REFTABLE_API_ERROR; + if (ref->update_index < w->min_update_index || + ref->update_index > w->max_update_index) + return REFTABLE_API_ERROR; + + reftable_record_from_ref(&rec, &copy); + copy.update_index -= w->min_update_index; + + err = writer_add_record(w, &rec); + if (err < 0) + return err; + + if (!w->opts.skip_index_objects && + reftable_ref_record_val1(ref) != NULL) { + struct strbuf h = STRBUF_INIT; + strbuf_add(&h, (char *)reftable_ref_record_val1(ref), + hash_size(w->opts.hash_id)); + writer_index_hash(w, &h); + strbuf_release(&h); + } + + if (!w->opts.skip_index_objects && + reftable_ref_record_val2(ref) != NULL) { + struct strbuf h = STRBUF_INIT; + strbuf_add(&h, reftable_ref_record_val2(ref), + hash_size(w->opts.hash_id)); + writer_index_hash(w, &h); + 
strbuf_release(&h); + } + return 0; +} + +int reftable_writer_add_refs(struct reftable_writer *w, + struct reftable_ref_record *refs, int n) +{ + int err = 0; + int i = 0; + QSORT(refs, n, reftable_ref_record_compare_name); + for (i = 0; err == 0 && i < n; i++) { + err = reftable_writer_add_ref(w, &refs[i]); + } + return err; +} + +int reftable_writer_add_log(struct reftable_writer *w, + struct reftable_log_record *log) +{ + struct reftable_record rec = { NULL }; + char *input_log_message = log->message; + struct strbuf cleaned_message = STRBUF_INIT; + int err; + if (log->refname == NULL) + return REFTABLE_API_ERROR; + + if (w->block_writer != NULL && + block_writer_type(w->block_writer) == BLOCK_TYPE_REF) { + int err = writer_finish_public_section(w); + if (err < 0) + return err; + } + + if (!w->opts.exact_log_message && log->message != NULL) { + strbuf_addstr(&cleaned_message, log->message); + while (cleaned_message.len && + cleaned_message.buf[cleaned_message.len - 1] == '\n') + strbuf_setlen(&cleaned_message, + cleaned_message.len - 1); + if (strchr(cleaned_message.buf, '\n')) { + // multiple lines not allowed. + err = REFTABLE_API_ERROR; + goto done; + } + strbuf_addstr(&cleaned_message, "\n"); + log->message = cleaned_message.buf; + } + + w->next -= w->pending_padding; + w->pending_padding = 0; + + reftable_record_from_log(&rec, log); + err = writer_add_record(w, &rec); + +done: + log->message = input_log_message; + strbuf_release(&cleaned_message); + return err; +} + +int reftable_writer_add_logs(struct reftable_writer *w, + struct reftable_log_record *logs, int n) +{ + int err = 0; + int i = 0; + QSORT(logs, n, reftable_log_record_compare_key); + + for (i = 0; err == 0 && i < n; i++) { + err = reftable_writer_add_log(w, &logs[i]); + } + return err; +} + +static int writer_finish_section(struct reftable_writer *w) +{ + uint8_t typ = block_writer_type(w->block_writer); + uint64_t index_start = 0; + int max_level = 0; + int threshold = w->opts.unpadded ? 
1 : 3; + int before_blocks = w->stats.idx_stats.blocks; + int err = writer_flush_block(w); + int i = 0; + struct reftable_block_stats *bstats = NULL; + if (err < 0) + return err; + + while (w->index_len > threshold) { + struct reftable_index_record *idx = NULL; + int idx_len = 0; + + max_level++; + index_start = w->next; + writer_reinit_block_writer(w, BLOCK_TYPE_INDEX); + + idx = w->index; + idx_len = w->index_len; + + w->index = NULL; + w->index_len = 0; + w->index_cap = 0; + for (i = 0; i < idx_len; i++) { + struct reftable_record rec = { NULL }; + reftable_record_from_index(&rec, idx + i); + if (block_writer_add(w->block_writer, &rec) == 0) { + continue; + } + + err = writer_flush_block(w); + if (err < 0) + return err; + + writer_reinit_block_writer(w, BLOCK_TYPE_INDEX); + + err = block_writer_add(w->block_writer, &rec); + if (err != 0) { + /* write into fresh block should always succeed + */ + abort(); + } + } + for (i = 0; i < idx_len; i++) { + strbuf_release(&idx[i].last_key); + } + reftable_free(idx); + } + + writer_clear_index(w); + + err = writer_flush_block(w); + if (err < 0) + return err; + + bstats = writer_reftable_block_stats(w, typ); + bstats->index_blocks = w->stats.idx_stats.blocks - before_blocks; + bstats->index_offset = index_start; + bstats->max_index_level = max_level; + + /* Reinit lastKey, as the next section can start with any key. 
*/ + w->last_key.len = 0; + + return 0; +} + +struct common_prefix_arg { + struct strbuf *last; + int max; +}; + +static void update_common(void *void_arg, void *key) +{ + struct common_prefix_arg *arg = (struct common_prefix_arg *)void_arg; + struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key; + if (arg->last != NULL) { + int n = common_prefix_size(&entry->hash, arg->last); + if (n > arg->max) { + arg->max = n; + } + } + arg->last = &entry->hash; +} + +struct write_record_arg { + struct reftable_writer *w; + int err; +}; + +static void write_object_record(void *void_arg, void *key) +{ + struct write_record_arg *arg = (struct write_record_arg *)void_arg; + struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key; + struct reftable_obj_record obj_rec = { + .hash_prefix = (uint8_t *)entry->hash.buf, + .hash_prefix_len = arg->w->stats.object_id_len, + .offsets = entry->offsets, + .offset_len = entry->offset_len, + }; + struct reftable_record rec = { NULL }; + if (arg->err < 0) + goto done; + + reftable_record_from_obj(&rec, &obj_rec); + arg->err = block_writer_add(arg->w->block_writer, &rec); + if (arg->err == 0) + goto done; + + arg->err = writer_flush_block(arg->w); + if (arg->err < 0) + goto done; + + writer_reinit_block_writer(arg->w, BLOCK_TYPE_OBJ); + arg->err = block_writer_add(arg->w->block_writer, &rec); + if (arg->err == 0) + goto done; + obj_rec.offset_len = 0; + arg->err = block_writer_add(arg->w->block_writer, &rec); + + /* Should be able to write into a fresh block. 
*/ + assert(arg->err == 0); + +done:; +} + +static void object_record_free(void *void_arg, void *key) +{ + struct obj_index_tree_node *entry = (struct obj_index_tree_node *)key; + + FREE_AND_NULL(entry->offsets); + strbuf_release(&entry->hash); + reftable_free(entry); +} + +static int writer_dump_object_index(struct reftable_writer *w) +{ + struct write_record_arg closure = { .w = w }; + struct common_prefix_arg common = { NULL }; + if (w->obj_index_tree != NULL) { + infix_walk(w->obj_index_tree, &update_common, &common); + } + w->stats.object_id_len = common.max + 1; + + writer_reinit_block_writer(w, BLOCK_TYPE_OBJ); + + if (w->obj_index_tree != NULL) { + infix_walk(w->obj_index_tree, &write_object_record, &closure); + } + + if (closure.err < 0) + return closure.err; + return writer_finish_section(w); +} + +static int writer_finish_public_section(struct reftable_writer *w) +{ + uint8_t typ = 0; + int err = 0; + + if (w->block_writer == NULL) + return 0; + + typ = block_writer_type(w->block_writer); + err = writer_finish_section(w); + if (err < 0) + return err; + if (typ == BLOCK_TYPE_REF && !w->opts.skip_index_objects && + w->stats.ref_stats.index_blocks > 0) { + err = writer_dump_object_index(w); + if (err < 0) + return err; + } + + if (w->obj_index_tree != NULL) { + infix_walk(w->obj_index_tree, &object_record_free, NULL); + tree_free(w->obj_index_tree); + w->obj_index_tree = NULL; + } + + w->block_writer = NULL; + return 0; +} + +int reftable_writer_close(struct reftable_writer *w) +{ + uint8_t footer[72]; + uint8_t *p = footer; + int err = writer_finish_public_section(w); + int empty_table = w->next == 0; + if (err != 0) + goto done; + w->pending_padding = 0; + if (empty_table) { + /* Empty tables need a header anyway. 
*/ + uint8_t header[28]; + int n = writer_write_header(w, header); + err = padded_write(w, header, n, 0); + if (err < 0) + goto done; + } + + p += writer_write_header(w, footer); + put_be64(p, w->stats.ref_stats.index_offset); + p += 8; + put_be64(p, (w->stats.obj_stats.offset) << 5 | w->stats.object_id_len); + p += 8; + put_be64(p, w->stats.obj_stats.index_offset); + p += 8; + + put_be64(p, w->stats.log_stats.offset); + p += 8; + put_be64(p, w->stats.log_stats.index_offset); + p += 8; + + put_be32(p, crc32(0, footer, p - footer)); + p += 4; + + err = padded_write(w, footer, footer_size(writer_version(w)), 0); + if (err < 0) + goto done; + + if (empty_table) { + err = REFTABLE_EMPTY_TABLE_ERROR; + goto done; + } + +done: + /* free up memory. */ + block_writer_release(&w->block_writer_data); + writer_clear_index(w); + strbuf_release(&w->last_key); + return err; +} + +static void writer_clear_index(struct reftable_writer *w) +{ + int i = 0; + for (i = 0; i < w->index_len; i++) { + strbuf_release(&w->index[i].last_key); + } + + FREE_AND_NULL(w->index); + w->index_len = 0; + w->index_cap = 0; +} + +static const int debug = 0; + +static int writer_flush_nonempty_block(struct reftable_writer *w) +{ + uint8_t typ = block_writer_type(w->block_writer); + struct reftable_block_stats *bstats = + writer_reftable_block_stats(w, typ); + uint64_t block_typ_off = (bstats->blocks == 0) ? 
w->next : 0; + int raw_bytes = block_writer_finish(w->block_writer); + int padding = 0; + int err = 0; + struct reftable_index_record ir = { .last_key = STRBUF_INIT }; + if (raw_bytes < 0) + return raw_bytes; + + if (!w->opts.unpadded && typ != BLOCK_TYPE_LOG) { + padding = w->opts.block_size - raw_bytes; + } + + if (block_typ_off > 0) { + bstats->offset = block_typ_off; + } + + bstats->entries += w->block_writer->entries; + bstats->restarts += w->block_writer->restart_len; + bstats->blocks++; + w->stats.blocks++; + + if (debug) { + fprintf(stderr, "block %c off %" PRIu64 " sz %d (%d)\n", typ, + w->next, raw_bytes, + get_be24(w->block + w->block_writer->header_off + 1)); + } + + if (w->next == 0) { + writer_write_header(w, w->block); + } + + err = padded_write(w, w->block, raw_bytes, padding); + if (err < 0) + return err; + + if (w->index_cap == w->index_len) { + w->index_cap = 2 * w->index_cap + 1; + w->index = reftable_realloc( + w->index, + sizeof(struct reftable_index_record) * w->index_cap); + } + + ir.offset = w->next; + strbuf_reset(&ir.last_key); + strbuf_addbuf(&ir.last_key, &w->block_writer->last_key); + w->index[w->index_len] = ir; + + w->index_len++; + w->next += padding + raw_bytes; + w->block_writer = NULL; + return 0; +} + +static int writer_flush_block(struct reftable_writer *w) +{ + if (w->block_writer == NULL) + return 0; + if (w->block_writer->entries == 0) + return 0; + return writer_flush_nonempty_block(w); +} + +const struct reftable_stats *writer_stats(struct reftable_writer *w) +{ + return &w->stats; +} diff --git a/reftable/writer.h b/reftable/writer.h new file mode 100644 index 00000000000..4921c249d06 --- /dev/null +++ b/reftable/writer.h @@ -0,0 +1,50 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef WRITER_H +#define WRITER_H + +#include "basics.h" +#include "block.h" +#include 
"tree.h" +#include "reftable-writer.h" + +struct reftable_writer { + int (*write)(void *, const void *, size_t); + void *write_arg; + int pending_padding; + struct strbuf last_key; + + /* offset of next block to write. */ + uint64_t next; + uint64_t min_update_index, max_update_index; + struct reftable_write_options opts; + + /* memory buffer for writing */ + uint8_t *block; + + /* writer for the current section. NULL or points to + * block_writer_data */ + struct block_writer *block_writer; + + struct block_writer block_writer_data; + + /* pending index records for the current section */ + struct reftable_index_record *index; + size_t index_len; + size_t index_cap; + + /* + * tree for use with tsearch; used to populate the 'o' inverse OID + * map */ + struct tree_node *obj_index_tree; + + struct reftable_stats stats; +}; + +#endif From patchwork Wed Dec 9 14:00:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 11961577 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D658CC2BB9A for ; Wed, 9 Dec 2020 14:03:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B374C23B42 for ; Wed, 9 Dec 2020 14:03:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727569AbgLIOCz (ORCPT ); Wed, 9 Dec 2020 09:02:55 -0500 Received: from lindbergh.monkeyblade.net 
([23.128.96.19]:60410 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732783AbgLIOB5 (ORCPT ); Wed, 9 Dec 2020 09:01:57 -0500 Received: from mail-wm1-x336.google.com (mail-wm1-x336.google.com [IPv6:2a00:1450:4864:20::336]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 802AFC0611C5 for ; Wed, 9 Dec 2020 06:00:43 -0800 (PST) Received: by mail-wm1-x336.google.com with SMTP id y23so1788473wmi.1 for ; Wed, 09 Dec 2020 06:00:43 -0800 (PST) X-Received: by 2002:a1c:cc0a:: with SMTP id h10mr3032863wmb.24.1607522441326; Wed, 09 Dec 2020
06:00:41 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id v20sm3940590wra.19.2020.12.09.06.00.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Dec 2020 06:00:40 -0800 (PST) Message-Id: <3c2de7dcc65c6ce7b3aa1cab5c99b21e53f5f99e.1607522429.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 09 Dec 2020 14:00:24 +0000 Subject: [PATCH v4 10/15] reftable: read reftable files Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Jeff King , Ramsay Jones , Jonathan Nieder , Johannes Schindelin , Jonathan Tan , Josh Steadmon , Emily Shaffer , Patrick Steinhardt , Ævar Arnfjörð Bjarmason , Felipe Contreras , Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys This supports reading a single reftable file. The commit introduces an abstract iterator type that covers both use cases: reading individual refs and iterating over a segment of the ref namespace.
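As a rough illustration of the iterator idea described above, here is a minimal standalone sketch. It is not the reftable API: every name in it (demo_iterator, demo_iterator_vtable, array_iter, demo_seek, demo_has_ref, demo_count_prefix) is hypothetical, and a sorted in-memory array of ref names stands in for the on-disk blocks. It borrows only the conventions visible in the patch: a vtable with next/close callbacks, and a next() that returns 0 when it produces a record, a positive value at the end of iteration, and a negative value on error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of the vtable pattern; NOT the reftable API. */
struct demo_iterator_vtable {
	/* 0: produced a record, > 0: end of iteration, < 0: error */
	int (*next)(void *iter_arg, const char **refname);
	void (*close)(void *iter_arg);
};

struct demo_iterator {
	const struct demo_iterator_vtable *ops;
	void *iter_arg;
};

/* Backing state (an assumption): sorted array of ref names plus a cursor. */
struct array_iter {
	const char *const *names;
	size_t len;
	size_t pos;
};

static int array_iter_next(void *arg, const char **refname)
{
	struct array_iter *ai = arg;
	if (ai->pos == ai->len)
		return 1;
	*refname = ai->names[ai->pos++];
	return 0;
}

static void array_iter_close(void *arg)
{
	(void)arg; /* the array is not owned by the iterator */
}

static const struct demo_iterator_vtable array_iter_vtable = {
	.next = array_iter_next,
	.close = array_iter_close,
};

/* Position the iterator at the first name >= want (linear scan). */
static void demo_seek(struct demo_iterator *it, struct array_iter *ai,
		      const char *want)
{
	assert(it->ops == NULL);
	ai->pos = 0;
	while (ai->pos < ai->len && strcmp(ai->names[ai->pos], want) < 0)
		ai->pos++;
	it->iter_arg = ai;
	it->ops = &array_iter_vtable;
}

static int demo_next(struct demo_iterator *it, const char **refname)
{
	return it->ops->next(it->iter_arg, refname);
}

static void demo_destroy(struct demo_iterator *it)
{
	if (it->ops == NULL)
		return;
	it->ops->close(it->iter_arg);
	it->ops = NULL;
	it->iter_arg = NULL;
}

/* Use case 1: read one ref = seek, take one record, compare the name. */
int demo_has_ref(struct array_iter *ai, const char *name)
{
	struct demo_iterator it = { NULL };
	const char *got = NULL;
	int err;

	demo_seek(&it, ai, name);
	err = demo_next(&it, &got);
	demo_destroy(&it);
	return err == 0 && !strcmp(got, name);
}

/* Use case 2: walk a namespace segment = seek, then next() while in prefix. */
int demo_count_prefix(struct array_iter *ai, const char *prefix)
{
	struct demo_iterator it = { NULL };
	const char *got = NULL;
	size_t plen = strlen(prefix);
	int n = 0;

	demo_seek(&it, ai, prefix);
	while (demo_next(&it, &got) == 0 && !strncmp(got, prefix, plen))
		n++;
	demo_destroy(&it);
	return n;
}
```

Both helpers share the same seek-then-next loop and differ only in when they stop. That shared loop is the point of routing both use cases through one abstract iterator.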
Signed-off-by: Han-Wen Nienhuys --- Makefile | 3 + reftable/iter.c | 248 ++++++++++++ reftable/iter.h | 75 ++++ reftable/reader.c | 733 +++++++++++++++++++++++++++++++++++ reftable/reader.h | 75 ++++ reftable/reftable-iterator.h | 37 ++ reftable/reftable-reader.h | 89 +++++ 7 files changed, 1260 insertions(+) create mode 100644 reftable/iter.c create mode 100644 reftable/iter.h create mode 100644 reftable/reader.c create mode 100644 reftable/reader.h create mode 100644 reftable/reftable-iterator.h create mode 100644 reftable/reftable-reader.h diff --git a/Makefile b/Makefile index e78908c976a..2bafcb1b62a 100644 --- a/Makefile +++ b/Makefile @@ -2393,8 +2393,11 @@ REFTABLE_OBJS += reftable/basics.o REFTABLE_OBJS += reftable/error.o REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o +REFTABLE_OBJS += reftable/iter.o REFTABLE_OBJS += reftable/publicbasics.o +REFTABLE_OBJS += reftable/reader.o REFTABLE_OBJS += reftable/record.o +REFTABLE_OBJS += reftable/reftable.o REFTABLE_OBJS += reftable/tree.o REFTABLE_OBJS += reftable/writer.o REFTABLE_OBJS += reftable/zlib-compat.o diff --git a/reftable/iter.c b/reftable/iter.c new file mode 100644 index 00000000000..b8c091ffd57 --- /dev/null +++ b/reftable/iter.c @@ -0,0 +1,248 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "iter.h" + +#include "system.h" + +#include "block.h" +#include "constants.h" +#include "reader.h" +#include "reftable-error.h" + +int iterator_is_null(struct reftable_iterator *it) +{ + return it->ops == NULL; +} + +static int empty_iterator_next(void *arg, struct reftable_record *rec) +{ + return 1; +} + +static void empty_iterator_close(void *arg) +{ +} + +static struct reftable_iterator_vtable empty_vtable = { + .next = &empty_iterator_next, + .close = &empty_iterator_close, +}; + +void iterator_set_empty(struct 
reftable_iterator *it) +{ + assert(it->ops == NULL); + it->iter_arg = NULL; + it->ops = &empty_vtable; +} + +int iterator_next(struct reftable_iterator *it, struct reftable_record *rec) +{ + return it->ops->next(it->iter_arg, rec); +} + +void reftable_iterator_destroy(struct reftable_iterator *it) +{ + if (it->ops == NULL) { + return; + } + it->ops->close(it->iter_arg); + it->ops = NULL; + FREE_AND_NULL(it->iter_arg); +} + +int reftable_iterator_next_ref(struct reftable_iterator *it, + struct reftable_ref_record *ref) +{ + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, ref); + return iterator_next(it, &rec); +} + +int reftable_iterator_next_log(struct reftable_iterator *it, + struct reftable_log_record *log) +{ + struct reftable_record rec = { NULL }; + reftable_record_from_log(&rec, log); + return iterator_next(it, &rec); +} + +static void filtering_ref_iterator_close(void *iter_arg) +{ + struct filtering_ref_iterator *fri = + (struct filtering_ref_iterator *)iter_arg; + strbuf_release(&fri->oid); + reftable_iterator_destroy(&fri->it); +} + +static int filtering_ref_iterator_next(void *iter_arg, + struct reftable_record *rec) +{ + struct filtering_ref_iterator *fri = + (struct filtering_ref_iterator *)iter_arg; + struct reftable_ref_record *ref = + (struct reftable_ref_record *)rec->data; + int err = 0; + while (1) { + err = reftable_iterator_next_ref(&fri->it, ref); + if (err != 0) { + break; + } + + if (fri->double_check) { + struct reftable_iterator it = { NULL }; + + err = reftable_table_seek_ref(&fri->tab, &it, + ref->refname); + if (err == 0) { + err = reftable_iterator_next_ref(&it, ref); + } + + reftable_iterator_destroy(&it); + + if (err < 0) { + break; + } + + if (err > 0) { + continue; + } + } + + if (ref->value_type == REFTABLE_REF_VAL2 && + (!memcmp(fri->oid.buf, ref->value.val2.target_value, + fri->oid.len) || + !memcmp(fri->oid.buf, ref->value.val2.value, + fri->oid.len))) + return 0; + + if (ref->value_type == 
REFTABLE_REF_VAL1 && + !memcmp(fri->oid.buf, ref->value.val1, fri->oid.len)) { + return 0; + } + } + + reftable_ref_record_release(ref); + return err; +} + +static struct reftable_iterator_vtable filtering_ref_iterator_vtable = { + .next = &filtering_ref_iterator_next, + .close = &filtering_ref_iterator_close, +}; + +void iterator_from_filtering_ref_iterator(struct reftable_iterator *it, + struct filtering_ref_iterator *fri) +{ + assert(it->ops == NULL); + it->iter_arg = fri; + it->ops = &filtering_ref_iterator_vtable; +} + +static void indexed_table_ref_iter_close(void *p) +{ + struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p; + block_iter_close(&it->cur); + reftable_block_done(&it->block_reader.block); + strbuf_release(&it->oid); +} + +static int indexed_table_ref_iter_next_block(struct indexed_table_ref_iter *it) +{ + uint64_t off; + int err = 0; + if (it->offset_idx == it->offset_len) { + it->is_finished = 1; + return 1; + } + + reftable_block_done(&it->block_reader.block); + + off = it->offsets[it->offset_idx++]; + err = reader_init_block_reader(it->r, &it->block_reader, off, + BLOCK_TYPE_REF); + if (err < 0) { + return err; + } + if (err > 0) { + /* indexed block does not exist. 
*/ + return REFTABLE_FORMAT_ERROR; + } + block_reader_start(&it->block_reader, &it->cur); + return 0; +} + +static int indexed_table_ref_iter_next(void *p, struct reftable_record *rec) +{ + struct indexed_table_ref_iter *it = (struct indexed_table_ref_iter *)p; + struct reftable_ref_record *ref = + (struct reftable_ref_record *)rec->data; + + while (1) { + int err = block_iter_next(&it->cur, rec); + if (err < 0) { + return err; + } + + if (err > 0) { + err = indexed_table_ref_iter_next_block(it); + if (err < 0) { + return err; + } + + if (it->is_finished) { + return 1; + } + continue; + } + /* BUG */ + if (!memcmp(it->oid.buf, ref->value.val2.target_value, + it->oid.len) || + !memcmp(it->oid.buf, ref->value.val2.value, it->oid.len)) { + return 0; + } + } +} + +int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest, + struct reftable_reader *r, uint8_t *oid, + int oid_len, uint64_t *offsets, int offset_len) +{ + struct indexed_table_ref_iter empty = INDEXED_TABLE_REF_ITER_INIT; + struct indexed_table_ref_iter *itr = + reftable_calloc(sizeof(struct indexed_table_ref_iter)); + int err = 0; + + *itr = empty; + itr->r = r; + strbuf_add(&itr->oid, oid, oid_len); + + itr->offsets = offsets; + itr->offset_len = offset_len; + + err = indexed_table_ref_iter_next_block(itr); + if (err < 0) { + reftable_free(itr); + } else { + *dest = itr; + } + return err; +} + +static struct reftable_iterator_vtable indexed_table_ref_iter_vtable = { + .next = &indexed_table_ref_iter_next, + .close = &indexed_table_ref_iter_close, +}; + +void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it, + struct indexed_table_ref_iter *itr) +{ + assert(it->ops == NULL); + it->iter_arg = itr; + it->ops = &indexed_table_ref_iter_vtable; +} diff --git a/reftable/iter.h b/reftable/iter.h new file mode 100644 index 00000000000..2ba964c29a3 --- /dev/null +++ b/reftable/iter.h @@ -0,0 +1,75 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style 
+license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef ITER_H +#define ITER_H + +#include "system.h" +#include "block.h" +#include "record.h" + +#include "reftable-iterator.h" +#include "reftable-generic.h" + +struct reftable_iterator_vtable { + int (*next)(void *iter_arg, struct reftable_record *rec); + void (*close)(void *iter_arg); +}; + +void iterator_set_empty(struct reftable_iterator *it); +int iterator_next(struct reftable_iterator *it, struct reftable_record *rec); + +/* Returns true for a zeroed out iterator, such as the one returned from + * iterator_destroy. */ +int iterator_is_null(struct reftable_iterator *it); + +/* iterator that produces only ref records that point to `oid` */ +struct filtering_ref_iterator { + int double_check; + struct reftable_table tab; + struct strbuf oid; + struct reftable_iterator it; +}; +#define FILTERING_REF_ITERATOR_INIT \ + { \ + .oid = STRBUF_INIT \ + } + +void iterator_from_filtering_ref_iterator(struct reftable_iterator *, + struct filtering_ref_iterator *); + +/* iterator that produces only ref records that point to `oid`, + * but using the object index. + */ +struct indexed_table_ref_iter { + struct reftable_reader *r; + struct strbuf oid; + + /* mutable */ + uint64_t *offsets; + + /* Points to the next offset to read. 
*/ + int offset_idx; + int offset_len; + struct block_reader block_reader; + struct block_iter cur; + int is_finished; +}; + +#define INDEXED_TABLE_REF_ITER_INIT \ + { \ + .cur = { .last_key = STRBUF_INIT }, .oid = STRBUF_INIT, \ + } + +void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it, + struct indexed_table_ref_iter *itr); +int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest, + struct reftable_reader *r, uint8_t *oid, + int oid_len, uint64_t *offsets, int offset_len); + +#endif diff --git a/reftable/reader.c b/reftable/reader.c new file mode 100644 index 00000000000..a5b7b1d4fef --- /dev/null +++ b/reftable/reader.c @@ -0,0 +1,733 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "reader.h" + +#include "system.h" +#include "block.h" +#include "constants.h" +#include "iter.h" +#include "record.h" +#include "reftable-error.h" +#include "tree.h" + +uint64_t block_source_size(struct reftable_block_source *source) +{ + return source->ops->size(source->arg); +} + +int block_source_read_block(struct reftable_block_source *source, + struct reftable_block *dest, uint64_t off, + uint32_t size) +{ + int result = source->ops->read_block(source->arg, dest, off, size); + dest->source = *source; + return result; +} + +void block_source_close(struct reftable_block_source *source) +{ + if (source->ops == NULL) { + return; + } + + source->ops->close(source->arg); + source->ops = NULL; +} + +static struct reftable_reader_offsets * +reader_offsets_for(struct reftable_reader *r, uint8_t typ) +{ + switch (typ) { + case BLOCK_TYPE_REF: + return &r->ref_offsets; + case BLOCK_TYPE_LOG: + return &r->log_offsets; + case BLOCK_TYPE_OBJ: + return &r->obj_offsets; + } + abort(); +} + +static int reader_get_block(struct reftable_reader *r, + struct reftable_block *dest, uint64_t off, + uint32_t 
sz) +{ + if (off >= r->size) + return 0; + + if (off + sz > r->size) { + sz = r->size - off; + } + + return block_source_read_block(&r->source, dest, off, sz); +} + +uint32_t reftable_reader_hash_id(struct reftable_reader *r) +{ + return r->hash_id; +} + +const char *reader_name(struct reftable_reader *r) +{ + return r->name; +} + +static int parse_footer(struct reftable_reader *r, uint8_t *footer, + uint8_t *header) +{ + uint8_t *f = footer; + uint8_t first_block_typ; + int err = 0; + uint32_t computed_crc; + uint32_t file_crc; + + if (memcmp(f, "REFT", 4)) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + f += 4; + + if (memcmp(footer, header, header_size(r->version))) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + f++; + r->block_size = get_be24(f); + + f += 3; + r->min_update_index = get_be64(f); + f += 8; + r->max_update_index = get_be64(f); + f += 8; + + if (r->version == 1) { + r->hash_id = SHA1_ID; + } else { + r->hash_id = get_be32(f); + switch (r->hash_id) { + case SHA1_ID: + break; + case SHA256_ID: + break; + default: + err = REFTABLE_FORMAT_ERROR; + goto done; + } + f += 4; + } + + r->ref_offsets.index_offset = get_be64(f); + f += 8; + + r->obj_offsets.offset = get_be64(f); + f += 8; + + r->object_id_len = r->obj_offsets.offset & ((1 << 5) - 1); + r->obj_offsets.offset >>= 5; + + r->obj_offsets.index_offset = get_be64(f); + f += 8; + r->log_offsets.offset = get_be64(f); + f += 8; + r->log_offsets.index_offset = get_be64(f); + f += 8; + + computed_crc = crc32(0, footer, f - footer); + file_crc = get_be32(f); + f += 4; + if (computed_crc != file_crc) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + first_block_typ = header[header_size(r->version)]; + r->ref_offsets.is_present = (first_block_typ == BLOCK_TYPE_REF); + r->ref_offsets.offset = 0; + r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG || + r->log_offsets.offset > 0); + r->obj_offsets.is_present = r->obj_offsets.offset > 0; + err = 0; +done: + return err; +} + +int 
init_reader(struct reftable_reader *r, struct reftable_block_source *source, + const char *name) +{ + struct reftable_block footer = { NULL }; + struct reftable_block header = { NULL }; + int err = 0; + + memset(r, 0, sizeof(struct reftable_reader)); + + /* Need +1 to read type of first block. */ + err = block_source_read_block(source, &header, 0, header_size(2) + 1); + if (err != header_size(2) + 1) { + err = REFTABLE_IO_ERROR; + goto done; + } + + if (memcmp(header.data, "REFT", 4)) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + r->version = header.data[4]; + if (r->version != 1 && r->version != 2) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + r->size = block_source_size(source) - footer_size(r->version); + r->source = *source; + r->name = xstrdup(name); + r->hash_id = 0; + + err = block_source_read_block(source, &footer, r->size, + footer_size(r->version)); + if (err != footer_size(r->version)) { + err = REFTABLE_IO_ERROR; + goto done; + } + + err = parse_footer(r, footer.data, header.data); +done: + reftable_block_done(&footer); + reftable_block_done(&header); + return err; +} + +struct table_iter { + struct reftable_reader *r; + uint8_t typ; + uint64_t block_off; + struct block_iter bi; + int is_finished; +}; +#define TABLE_ITER_INIT \ + { \ + .bi = {.last_key = STRBUF_INIT } \ + } + +static void table_iter_copy_from(struct table_iter *dest, + struct table_iter *src) +{ + dest->r = src->r; + dest->typ = src->typ; + dest->block_off = src->block_off; + dest->is_finished = src->is_finished; + block_iter_copy_from(&dest->bi, &src->bi); +} + +static int table_iter_next_in_block(struct table_iter *ti, + struct reftable_record *rec) +{ + int res = block_iter_next(&ti->bi, rec); + if (res == 0 && reftable_record_type(rec) == BLOCK_TYPE_REF) { + ((struct reftable_ref_record *)rec->data)->update_index += + ti->r->min_update_index; + } + + return res; +} + +static void table_iter_block_done(struct table_iter *ti) +{ + if (ti->bi.br == NULL) { + return; + } + 
reftable_block_done(&ti->bi.br->block); + FREE_AND_NULL(ti->bi.br); + + ti->bi.last_key.len = 0; + ti->bi.next_off = 0; +} + +static int32_t extract_block_size(uint8_t *data, uint8_t *typ, uint64_t off, + int version) +{ + int32_t result = 0; + + if (off == 0) { + data += header_size(version); + } + + *typ = data[0]; + if (reftable_is_block_type(*typ)) { + result = get_be24(data + 1); + } + return result; +} + +int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br, + uint64_t next_off, uint8_t want_typ) +{ + int32_t guess_block_size = r->block_size ? r->block_size : + DEFAULT_BLOCK_SIZE; + struct reftable_block block = { NULL }; + uint8_t block_typ = 0; + int err = 0; + uint32_t header_off = next_off ? 0 : header_size(r->version); + int32_t block_size = 0; + + if (next_off >= r->size) + return 1; + + err = reader_get_block(r, &block, next_off, guess_block_size); + if (err < 0) + return err; + + block_size = extract_block_size(block.data, &block_typ, next_off, + r->version); + if (block_size < 0) + return block_size; + + if (want_typ != BLOCK_TYPE_ANY && block_typ != want_typ) { + reftable_block_done(&block); + return 1; + } + + if (block_size > guess_block_size) { + reftable_block_done(&block); + err = reader_get_block(r, &block, next_off, block_size); + if (err < 0) { + return err; + } + } + + return block_reader_init(br, &block, header_off, r->block_size, + hash_size(r->hash_id)); +} + +static int table_iter_next_block(struct table_iter *dest, + struct table_iter *src) +{ + uint64_t next_block_off = src->block_off + src->bi.br->full_block_size; + struct block_reader br = { 0 }; + int err = 0; + + dest->r = src->r; + dest->typ = src->typ; + dest->block_off = next_block_off; + + err = reader_init_block_reader(src->r, &br, next_block_off, src->typ); + if (err > 0) { + dest->is_finished = 1; + return 1; + } + if (err != 0) + return err; + else { + struct block_reader *brp = + reftable_malloc(sizeof(struct block_reader)); + *brp = br; + + 
dest->is_finished = 0; + block_reader_start(brp, &dest->bi); + } + return 0; +} + +static int table_iter_next(struct table_iter *ti, struct reftable_record *rec) +{ + if (reftable_record_type(rec) != ti->typ) + return REFTABLE_API_ERROR; + + while (1) { + struct table_iter next = TABLE_ITER_INIT; + int err = 0; + if (ti->is_finished) { + return 1; + } + + err = table_iter_next_in_block(ti, rec); + if (err <= 0) { + return err; + } + + err = table_iter_next_block(&next, ti); + if (err != 0) { + ti->is_finished = 1; + } + table_iter_block_done(ti); + if (err != 0) { + return err; + } + table_iter_copy_from(ti, &next); + block_iter_close(&next.bi); + } +} + +static int table_iter_next_void(void *ti, struct reftable_record *rec) +{ + return table_iter_next((struct table_iter *)ti, rec); +} + +static void table_iter_close(void *p) +{ + struct table_iter *ti = (struct table_iter *)p; + table_iter_block_done(ti); + block_iter_close(&ti->bi); +} + +static struct reftable_iterator_vtable table_iter_vtable = { + .next = &table_iter_next_void, + .close = &table_iter_close, +}; + +static void iterator_from_table_iter(struct reftable_iterator *it, + struct table_iter *ti) +{ + assert(it->ops == NULL); + it->iter_arg = ti; + it->ops = &table_iter_vtable; +} + +static int reader_table_iter_at(struct reftable_reader *r, + struct table_iter *ti, uint64_t off, + uint8_t typ) +{ + struct block_reader br = { 0 }; + struct block_reader *brp = NULL; + + int err = reader_init_block_reader(r, &br, off, typ); + if (err != 0) + return err; + + brp = reftable_malloc(sizeof(struct block_reader)); + *brp = br; + ti->r = r; + ti->typ = block_reader_type(brp); + ti->block_off = off; + block_reader_start(brp, &ti->bi); + return 0; +} + +static int reader_start(struct reftable_reader *r, struct table_iter *ti, + uint8_t typ, int index) +{ + struct reftable_reader_offsets *offs = reader_offsets_for(r, typ); + uint64_t off = offs->offset; + if (index) { + off = offs->index_offset; + if (off == 0) { 
+ return 1; + } + typ = BLOCK_TYPE_INDEX; + } + + return reader_table_iter_at(r, ti, off, typ); +} + +static int reader_seek_linear(struct reftable_reader *r, struct table_iter *ti, + struct reftable_record *want) +{ + struct reftable_record rec = + reftable_new_record(reftable_record_type(want)); + struct strbuf want_key = STRBUF_INIT; + struct strbuf got_key = STRBUF_INIT; + struct table_iter next = TABLE_ITER_INIT; + int err = -1; + + reftable_record_key(want, &want_key); + + while (1) { + err = table_iter_next_block(&next, ti); + if (err < 0) + goto done; + + if (err > 0) { + break; + } + + err = block_reader_first_key(next.bi.br, &got_key); + if (err < 0) + goto done; + + if (strbuf_cmp(&got_key, &want_key) > 0) { + table_iter_block_done(&next); + break; + } + + table_iter_block_done(ti); + table_iter_copy_from(ti, &next); + } + + err = block_iter_seek(&ti->bi, &want_key); + if (err < 0) + goto done; + err = 0; + +done: + block_iter_close(&next.bi); + reftable_record_destroy(&rec); + strbuf_release(&want_key); + strbuf_release(&got_key); + return err; +} + +static int reader_seek_indexed(struct reftable_reader *r, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + struct reftable_index_record want_index = { .last_key = STRBUF_INIT }; + struct reftable_record want_index_rec = { NULL }; + struct reftable_index_record index_result = { .last_key = STRBUF_INIT }; + struct reftable_record index_result_rec = { NULL }; + struct table_iter index_iter = TABLE_ITER_INIT; + struct table_iter next = TABLE_ITER_INIT; + int err = 0; + + reftable_record_key(rec, &want_index.last_key); + reftable_record_from_index(&want_index_rec, &want_index); + reftable_record_from_index(&index_result_rec, &index_result); + + err = reader_start(r, &index_iter, reftable_record_type(rec), 1); + if (err < 0) + goto done; + + err = reader_seek_linear(r, &index_iter, &want_index_rec); + while (1) { + err = table_iter_next(&index_iter, &index_result_rec); + 
table_iter_block_done(&index_iter); + if (err != 0) + goto done; + + err = reader_table_iter_at(r, &next, index_result.offset, 0); + if (err != 0) + goto done; + + err = block_iter_seek(&next.bi, &want_index.last_key); + if (err < 0) + goto done; + + if (next.typ == reftable_record_type(rec)) { + err = 0; + break; + } + + if (next.typ != BLOCK_TYPE_INDEX) { + err = REFTABLE_FORMAT_ERROR; + break; + } + + table_iter_copy_from(&index_iter, &next); + } + + if (err == 0) { + struct table_iter empty = TABLE_ITER_INIT; + struct table_iter *malloced = + reftable_calloc(sizeof(struct table_iter)); + *malloced = empty; + table_iter_copy_from(malloced, &next); + iterator_from_table_iter(it, malloced); + } +done: + block_iter_close(&next.bi); + table_iter_close(&index_iter); + reftable_record_release(&want_index_rec); + reftable_record_release(&index_result_rec); + return err; +} + +static int reader_seek_internal(struct reftable_reader *r, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + struct reftable_reader_offsets *offs = + reader_offsets_for(r, reftable_record_type(rec)); + uint64_t idx = offs->index_offset; + struct table_iter ti = TABLE_ITER_INIT; + int err = 0; + if (idx > 0) + return reader_seek_indexed(r, it, rec); + + err = reader_start(r, &ti, reftable_record_type(rec), 0); + if (err < 0) + return err; + err = reader_seek_linear(r, &ti, rec); + if (err < 0) + return err; + else { + struct table_iter *p = + reftable_malloc(sizeof(struct table_iter)); + *p = ti; + iterator_from_table_iter(it, p); + } + + return 0; +} + +int reader_seek(struct reftable_reader *r, struct reftable_iterator *it, + struct reftable_record *rec) +{ + uint8_t typ = reftable_record_type(rec); + + struct reftable_reader_offsets *offs = reader_offsets_for(r, typ); + if (!offs->is_present) { + iterator_set_empty(it); + return 0; + } + + return reader_seek_internal(r, it, rec); +} + +int reftable_reader_seek_ref(struct reftable_reader *r, + struct reftable_iterator *it, const 
char *name) +{ + struct reftable_ref_record ref = { + .refname = (char *)name, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, &ref); + return reader_seek(r, it, &rec); +} + +int reftable_reader_seek_log_at(struct reftable_reader *r, + struct reftable_iterator *it, const char *name, + uint64_t update_index) +{ + struct reftable_log_record log = { + .refname = (char *)name, + .update_index = update_index, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_log(&rec, &log); + return reader_seek(r, it, &rec); +} + +int reftable_reader_seek_log(struct reftable_reader *r, + struct reftable_iterator *it, const char *name) +{ + uint64_t max = ~((uint64_t)0); + return reftable_reader_seek_log_at(r, it, name, max); +} + +void reader_close(struct reftable_reader *r) +{ + block_source_close(&r->source); + FREE_AND_NULL(r->name); +} + +int reftable_new_reader(struct reftable_reader **p, + struct reftable_block_source *src, char const *name) +{ + struct reftable_reader *rd = + reftable_calloc(sizeof(struct reftable_reader)); + int err = init_reader(rd, src, name); + if (err == 0) { + *p = rd; + } else { + block_source_close(src); + reftable_free(rd); + } + return err; +} + +void reftable_reader_free(struct reftable_reader *r) +{ + reader_close(r); + reftable_free(r); +} + +static int reftable_reader_refs_for_indexed(struct reftable_reader *r, + struct reftable_iterator *it, + uint8_t *oid) +{ + struct reftable_obj_record want = { + .hash_prefix = oid, + .hash_prefix_len = r->object_id_len, + }; + struct reftable_record want_rec = { NULL }; + struct reftable_iterator oit = { NULL }; + struct reftable_obj_record got = { NULL }; + struct reftable_record got_rec = { NULL }; + int err = 0; + struct indexed_table_ref_iter *itr = NULL; + + /* Look through the reverse index. 
*/ + reftable_record_from_obj(&want_rec, &want); + err = reader_seek(r, &oit, &want_rec); + if (err != 0) + goto done; + + /* read out the reftable_obj_record */ + reftable_record_from_obj(&got_rec, &got); + err = iterator_next(&oit, &got_rec); + if (err < 0) + goto done; + + if (err > 0 || + memcmp(want.hash_prefix, got.hash_prefix, r->object_id_len)) { + /* didn't find it; return empty iterator */ + iterator_set_empty(it); + err = 0; + goto done; + } + + err = new_indexed_table_ref_iter(&itr, r, oid, hash_size(r->hash_id), + got.offsets, got.offset_len); + if (err < 0) + goto done; + got.offsets = NULL; + iterator_from_indexed_table_ref_iter(it, itr); + +done: + reftable_iterator_destroy(&oit); + reftable_record_release(&got_rec); + return err; +} + +static int reftable_reader_refs_for_unindexed(struct reftable_reader *r, + struct reftable_iterator *it, + uint8_t *oid) +{ + struct table_iter ti_empty = TABLE_ITER_INIT; + struct table_iter *ti = reftable_calloc(sizeof(struct table_iter)); + struct filtering_ref_iterator *filter = NULL; + struct filtering_ref_iterator empty = FILTERING_REF_ITERATOR_INIT; + int oid_len = hash_size(r->hash_id); + int err; + + *ti = ti_empty; + err = reader_start(r, ti, BLOCK_TYPE_REF, 0); + if (err < 0) { + reftable_free(ti); + return err; + } + + filter = reftable_malloc(sizeof(struct filtering_ref_iterator)); + *filter = empty; + + strbuf_add(&filter->oid, oid, oid_len); + reftable_table_from_reader(&filter->tab, r); + filter->double_check = 0; + iterator_from_table_iter(&filter->it, ti); + + iterator_from_filtering_ref_iterator(it, filter); + return 0; +} + +int reftable_reader_refs_for(struct reftable_reader *r, + struct reftable_iterator *it, uint8_t *oid) +{ + if (r->obj_offsets.is_present) + return reftable_reader_refs_for_indexed(r, it, oid); + return reftable_reader_refs_for_unindexed(r, it, oid); +} + +uint64_t reftable_reader_max_update_index(struct reftable_reader *r) +{ + return r->max_update_index; +} + +uint64_t 
reftable_reader_min_update_index(struct reftable_reader *r) +{ + return r->min_update_index; +} diff --git a/reftable/reader.h b/reftable/reader.h new file mode 100644 index 00000000000..6d4927e1c5d --- /dev/null +++ b/reftable/reader.h @@ -0,0 +1,75 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef READER_H +#define READER_H + +#include "block.h" +#include "record.h" +#include "reftable-iterator.h" +#include "reftable-reader.h" + +uint64_t block_source_size(struct reftable_block_source *source); + +int block_source_read_block(struct reftable_block_source *source, + struct reftable_block *dest, uint64_t off, + uint32_t size); +void block_source_close(struct reftable_block_source *source); + +/* metadata for a block type */ +struct reftable_reader_offsets { + int is_present; + uint64_t offset; + uint64_t index_offset; +}; + +/* The state for reading a reftable file. */ +struct reftable_reader { + /* for convenience, associate a name with the instance. */ + char *name; + struct reftable_block_source source; + + /* Size of the file, excluding the footer. 
*/ + uint64_t size; + + /* 'sha1' for SHA1, 's256' for SHA-256 */ + uint32_t hash_id; + + uint32_t block_size; + uint64_t min_update_index; + uint64_t max_update_index; + /* Length of the OID keys in the 'o' section */ + int object_id_len; + int version; + + struct reftable_reader_offsets ref_offsets; + struct reftable_reader_offsets obj_offsets; + struct reftable_reader_offsets log_offsets; +}; + +int init_reader(struct reftable_reader *r, struct reftable_block_source *source, + const char *name); +int reader_seek(struct reftable_reader *r, struct reftable_iterator *it, + struct reftable_record *rec); +void reader_close(struct reftable_reader *r); +const char *reader_name(struct reftable_reader *r); + +/* initialize a block reader to read from `r` */ +int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br, + uint64_t next_off, uint8_t want_typ); + +/* generic interface to reftables */ +struct reftable_table_vtable { + int (*seek_record)(void *tab, struct reftable_iterator *it, + struct reftable_record *); + uint32_t (*hash_id)(void *tab); + uint64_t (*min_update_index)(void *tab); + uint64_t (*max_update_index)(void *tab); +}; + +#endif diff --git a/reftable/reftable-iterator.h b/reftable/reftable-iterator.h new file mode 100644 index 00000000000..d8251a45ba1 --- /dev/null +++ b/reftable/reftable-iterator.h @@ -0,0 +1,37 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_ITERATOR_H +#define REFTABLE_ITERATOR_H + +#include "reftable-record.h" + +/* iterator is the generic interface for walking over data stored in a + * reftable. + */ +struct reftable_iterator { + struct reftable_iterator_vtable *ops; + void *iter_arg; +}; + +/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0: + * end of iteration. 
+ */ +int reftable_iterator_next_ref(struct reftable_iterator *it, + struct reftable_ref_record *ref); + +/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0: + * end of iteration. + */ +int reftable_iterator_next_log(struct reftable_iterator *it, + struct reftable_log_record *log); + +/* releases resources associated with an iterator. */ +void reftable_iterator_destroy(struct reftable_iterator *it); + +#endif diff --git a/reftable/reftable-reader.h b/reftable/reftable-reader.h new file mode 100644 index 00000000000..804e28543b3 --- /dev/null +++ b/reftable/reftable-reader.h @@ -0,0 +1,89 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_READER_H +#define REFTABLE_READER_H + +#include "reftable-iterator.h" +#include "reftable-blocksource.h" + +/* + * Reading single tables + * + * The following routines are for reading single files. For an application-level + * interface, skip ahead to struct reftable_merged_table and struct + * reftable_stack. + */ + +/* The reader struct is a handle to an open reftable file. */ +struct reftable_reader; + +/* reftable_new_reader opens a reftable for reading. If successful, returns 0 + * and sets *pp. The name is used for creating a stack. Typically, it is the + * basename of the file. The block source `src` is owned by the reader, and is + * closed on calling reftable_reader_free(). + */ +int reftable_new_reader(struct reftable_reader **pp, + struct reftable_block_source *src, const char *name); + +/* reftable_reader_seek_ref returns an iterator where 'name' would be inserted + in the table. To seek to the start of the table, use name = "". + + example: + + struct reftable_reader *r = NULL; + int err = reftable_new_reader(&r, &src, "filename"); + if (err < 0) { ... 
} + struct reftable_iterator it = {0}; + err = reftable_reader_seek_ref(r, &it, "refs/heads/master"); + if (err < 0) { ... } + struct reftable_ref_record ref = {0}; + while (1) { + err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + break; + } + if (err < 0) { + ..error handling.. + } + ..found.. + } + reftable_iterator_destroy(&it); + reftable_ref_record_release(&ref); + */ +int reftable_reader_seek_ref(struct reftable_reader *r, + struct reftable_iterator *it, const char *name); + +/* returns the hash ID used in this table. */ +uint32_t reftable_reader_hash_id(struct reftable_reader *r); + +/* seek to logs for the given name, older than update_index. To seek to the + start of the table, use name = "". + */ +int reftable_reader_seek_log_at(struct reftable_reader *r, + struct reftable_iterator *it, const char *name, + uint64_t update_index); + +/* seek to newest log entry for given name. */ +int reftable_reader_seek_log(struct reftable_reader *r, + struct reftable_iterator *it, const char *name); + +/* closes and deallocates a reader. */ +void reftable_reader_free(struct reftable_reader *); + +/* return an iterator for the refs pointing to `oid`. 
*/ +int reftable_reader_refs_for(struct reftable_reader *r, + struct reftable_iterator *it, uint8_t *oid); + +/* return the max_update_index for a table */ +uint64_t reftable_reader_max_update_index(struct reftable_reader *r); + +/* return the min_update_index for a table */ +uint64_t reftable_reader_min_update_index(struct reftable_reader *r); + +#endif From patchwork Wed Dec 9 14:00:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 11961565 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB2E2C19425 for ; Wed, 9 Dec 2020 14:02:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7ACB423B52 for ; Wed, 9 Dec 2020 14:02:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725982AbgLIOCT (ORCPT ); Wed, 9 Dec 2020 09:02:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60412 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732784AbgLIOB6 (ORCPT ); Wed, 9 Dec 2020 09:01:58 -0500 Received: from mail-wm1-x335.google.com (mail-wm1-x335.google.com [IPv6:2a00:1450:4864:20::335]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D7DE4C0611CA for ; Wed, 9 Dec 2020 06:00:43 -0800 (PST) Received: by mail-wm1-x335.google.com with SMTP id k10so1575104wmi.3 for ; Wed, 09 Dec 2020 06:00:43 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=HRt6qg3HoxpfIfenQCqKgOh/4F42DjruqupfohsyS+Y=; b=tJoCDMDQxzk8EWg+Wk4s4c0Hw2+RoCz4na/xxzW7BWEZOuE97ZLdkafe9oQ21ZxLSo HCR5uF7pD6iorUSUnWGhmuui4YNGEeR3+OlZ9Zpdsw+OXZtaUsaUypFg+8aPkBAO9cLw 1mYepY/uDhDLNnDmTH9belfybwNW9/9BQcoTLY+uMhlhRmcXhzGd4v/VwSOyOeDqSZmr jSlheJ5JXN9Sqn3CJUcv2uzRvAy2CYTMWmbdkMuu0I8SNRz3xBZ8DwOaIB1+V/c0Pwoe 2bx3TKcnZJ4STgVNtbTba5sMcsIRJM8QxvpM9YuMHngutIKIcDB8upKK/4k+U2ybl52n MTCQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=HRt6qg3HoxpfIfenQCqKgOh/4F42DjruqupfohsyS+Y=; b=NvVwvpORSsieWzPvQ88h7SV902OdN1Z9iK6HMIybdzsSwPyi//ckZNXR1p6yt68zmi 7NebzFBNq9KPxFbOSOG/I7otWIUDxK4QP4BlWvbtIo9aO0cnUI3ssp80ki+p48S0ShpU svrDuYJxijJMugZoJZiFgBPjT0W+PFjSnJiMMNrXWBw66mYUOc1xYRCPepA+eDrICzuI rzNBK22uJxZpW3mW8kPjZt1X1qdIXcp62Y9itv2/1XM9UFTpDDegyx9CwJXPuXdIiW0q hVNDKUp+qQ1Eem8aF8Di/NiYGr2xvJyxPJQwVqMagWKVgkTnppFg9QIOx+LFAlErGZOj YjNA== X-Gm-Message-State: AOAM530S5drVaf2nVZaLpaYz1cO9VLHlO75Y4Ot2QOqRgMCLnb23LT94 hnC4WpS4apH674nq11SQm0odBw1I2EM= X-Google-Smtp-Source: ABdhPJxTHEMzMSpfQiQamKEpiwmwW1PR5YBTxzgssS22X3Nty/ZlE+l/F1hWN4audBrtwW+DF2QOdg== X-Received: by 2002:a7b:c208:: with SMTP id x8mr2933389wmi.179.1607522442111; Wed, 09 Dec 2020 06:00:42 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id m2sm3433400wml.34.2020.12.09.06.00.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Dec 2020 06:00:41 -0800 (PST) Message-Id: <03681b820d132a270a08a52040c1ad55c12b8f4b.1607522429.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 09 Dec 2020 14:00:25 +0000 Subject: [PATCH v4 11/15] reftable: reftable file level tests Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: 
Han-Wen Nienhuys , Jeff King , Ramsay Jones , Jonathan Nieder , Johannes Schindelin , Jonathan Tan , Josh Steadmon , Emily Shaffer , Patrick Steinhardt , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Felipe Contreras , Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys With support for reading and writing files in place, we can construct files (in memory) and attempt to read them back. Because some sections of the format are optional (e.g. indices, log entries), we have to exercise this code using multiple sizes of input data. Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + reftable/reftable_test.c | 580 +++++++++++++++++++++++++++++++++++++++ t/helper/test-reftable.c | 1 + 3 files changed, 582 insertions(+) create mode 100644 reftable/reftable_test.c diff --git a/Makefile b/Makefile index 2bafcb1b62a..c913e90b643 100644 --- a/Makefile +++ b/Makefile @@ -2405,6 +2405,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o REFTABLE_TEST_OBJS += reftable/basics_test.o REFTABLE_TEST_OBJS += reftable/block_test.o REFTABLE_TEST_OBJS += reftable/record_test.o +REFTABLE_TEST_OBJS += reftable/reftable_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o REFTABLE_TEST_OBJS += reftable/tree_test.o diff --git a/reftable/reftable_test.c b/reftable/reftable_test.c new file mode 100644 index 00000000000..c58f7061833 --- /dev/null +++ b/reftable/reftable_test.c @@ -0,0 +1,580 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" + +#include "basics.h" +#include "block.h" +#include "blocksource.h" +#include "constants.h" +#include "reader.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-tests.h" +#include "reftable-stack.h" + +static const int update_index = 5; + +static void test_buffer(void) +{ + 
struct strbuf buf = STRBUF_INIT; + struct reftable_block_source source = { NULL }; + struct reftable_block out = { NULL }; + int n; + uint8_t in[] = "hello"; + strbuf_add(&buf, in, sizeof(in)); + block_source_from_strbuf(&source, &buf); + EXPECT(block_source_size(&source) == 6); + n = block_source_read_block(&source, &out, 0, sizeof(in)); + EXPECT(n == sizeof(in)); + EXPECT(!memcmp(in, out.data, n)); + reftable_block_done(&out); + + n = block_source_read_block(&source, &out, 1, 2); + EXPECT(n == 2); + EXPECT(!memcmp(out.data, "el", 2)); + + reftable_block_done(&out); + block_source_close(&source); + strbuf_release(&buf); +} + +static void write_table(char ***names, struct strbuf *buf, int N, + int block_size, uint32_t hash_id) +{ + struct reftable_write_options opts = { + .block_size = block_size, + .hash_id = hash_id, + }; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, buf, &opts); + struct reftable_ref_record ref = { NULL }; + int i = 0, n; + struct reftable_log_record log = { NULL }; + const struct reftable_stats *stats = NULL; + *names = reftable_calloc(sizeof(char *) * (N + 1)); + reftable_writer_set_limits(w, update_index, update_index); + for (i = 0; i < N; i++) { + uint8_t hash[SHA256_SIZE] = { 0 }; + char name[100]; + int n; + + set_test_hash(hash, i); + + snprintf(name, sizeof(name), "refs/heads/branch%02d", i); + + ref.refname = name; + ref.update_index = update_index; + ref.value_type = REFTABLE_REF_VAL1; + ref.value.val1 = hash; + (*names)[i] = xstrdup(name); + + n = reftable_writer_add_ref(w, &ref); + EXPECT(n == 0); + } + + for (i = 0; i < N; i++) { + uint8_t hash[SHA256_SIZE] = { 0 }; + char name[100]; + int n; + + set_test_hash(hash, i); + + snprintf(name, sizeof(name), "refs/heads/branch%02d", i); + + log.refname = name; + log.new_hash = hash; + log.update_index = update_index; + log.message = "message"; + + n = reftable_writer_add_log(w, &log); + EXPECT(n == 0); + } + + n = reftable_writer_close(w); + EXPECT(n == 0); + + 
stats = writer_stats(w); + for (i = 0; i < stats->ref_stats.blocks; i++) { + int off = i * opts.block_size; + if (off == 0) { + off = header_size((hash_id == SHA256_ID) ? 2 : 1); + } + EXPECT(buf->buf[off] == 'r'); + } + + EXPECT(stats->log_stats.blocks > 0); + reftable_writer_free(w); +} + +static void test_log_buffer_size(void) +{ + struct strbuf buf = STRBUF_INIT; + struct reftable_write_options opts = { + .block_size = 4096, + }; + int err; + struct reftable_log_record log = { + .refname = "refs/heads/master", + .name = "Han-Wen Nienhuys", + .email = "hanwen@google.com", + .tz_offset = 100, + .time = 0x5e430672, + .update_index = 0xa, + .message = "commit: 9\n", + }; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + + /* This tests buffer extension for log compression. Must use a random + hash, to ensure that the compressed part is larger than the original. + */ + uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE]; + for (int i = 0; i < SHA1_SIZE; i++) { + hash1[i] = (uint8_t)(rand() % 256); + hash2[i] = (uint8_t)(rand() % 256); + } + log.old_hash = hash1; + log.new_hash = hash2; + reftable_writer_set_limits(w, update_index, update_index); + err = reftable_writer_add_log(w, &log); + EXPECT_ERR(err); + err = reftable_writer_close(w); + EXPECT_ERR(err); + reftable_writer_free(w); + strbuf_release(&buf); +} + +static void test_log_write_read(void) +{ + int N = 2; + char **names = reftable_calloc(sizeof(char *) * (N + 1)); + int err; + struct reftable_write_options opts = { + .block_size = 256, + }; + struct reftable_ref_record ref = { NULL }; + int i = 0; + struct reftable_log_record log = { NULL }; + int n; + struct reftable_iterator it = { NULL }; + struct reftable_reader rd = { NULL }; + struct reftable_block_source source = { NULL }; + struct strbuf buf = STRBUF_INIT; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + const struct reftable_stats *stats = NULL; + reftable_writer_set_limits(w, 0, N); 
+ for (i = 0; i < N; i++) { + char name[256]; + struct reftable_ref_record ref = { NULL }; + snprintf(name, sizeof(name), "b%02d%0*d", i, 130, 7); + names[i] = xstrdup(name); + ref.refname = name; + ref.update_index = i; + + err = reftable_writer_add_ref(w, &ref); + EXPECT_ERR(err); + } + for (i = 0; i < N; i++) { + uint8_t hash1[SHA1_SIZE], hash2[SHA1_SIZE]; + struct reftable_log_record log = { NULL }; + set_test_hash(hash1, i); + set_test_hash(hash2, i + 1); + + log.refname = names[i]; + log.update_index = i; + log.old_hash = hash1; + log.new_hash = hash2; + + err = reftable_writer_add_log(w, &log); + EXPECT_ERR(err); + } + + n = reftable_writer_close(w); + EXPECT(n == 0); + + stats = writer_stats(w); + EXPECT(stats->log_stats.blocks > 0); + reftable_writer_free(w); + w = NULL; + + block_source_from_strbuf(&source, &buf); + + err = init_reader(&rd, &source, "file.log"); + EXPECT_ERR(err); + + err = reftable_reader_seek_ref(&rd, &it, names[N - 1]); + EXPECT_ERR(err); + + err = reftable_iterator_next_ref(&it, &ref); + EXPECT_ERR(err); + + /* end of iteration. */ + err = reftable_iterator_next_ref(&it, &ref); + EXPECT(0 < err); + + reftable_iterator_destroy(&it); + reftable_ref_record_release(&ref); + + err = reftable_reader_seek_log(&rd, &it, ""); + EXPECT_ERR(err); + + i = 0; + while (1) { + int err = reftable_iterator_next_log(&it, &log); + if (err > 0) { + break; + } + + EXPECT_ERR(err); + EXPECT_STREQ(names[i], log.refname); + EXPECT(i == log.update_index); + i++; + reftable_log_record_release(&log); + } + + EXPECT(i == N); + reftable_iterator_destroy(&it); + + /* cleanup. 
*/ + strbuf_release(&buf); + free_names(names); + reader_close(&rd); +} + +static void test_table_read_write_sequential(void) +{ + char **names; + struct strbuf buf = STRBUF_INIT; + int N = 50; + struct reftable_iterator it = { NULL }; + struct reftable_block_source source = { NULL }; + struct reftable_reader rd = { NULL }; + int err = 0; + int j = 0; + + write_table(&names, &buf, N, 256, SHA1_ID); + + block_source_from_strbuf(&source, &buf); + + err = init_reader(&rd, &source, "file.ref"); + EXPECT_ERR(err); + + err = reftable_reader_seek_ref(&rd, &it, ""); + EXPECT_ERR(err); + + while (1) { + struct reftable_ref_record ref = { NULL }; + int r = reftable_iterator_next_ref(&it, &ref); + EXPECT(r >= 0); + if (r > 0) { + break; + } + EXPECT(0 == strcmp(names[j], ref.refname)); + EXPECT(update_index == ref.update_index); + + j++; + reftable_ref_record_release(&ref); + } + EXPECT(j == N); + reftable_iterator_destroy(&it); + strbuf_release(&buf); + free_names(names); + + reader_close(&rd); +} + +static void test_table_write_small_table(void) +{ + char **names; + struct strbuf buf = STRBUF_INIT; + int N = 1; + write_table(&names, &buf, N, 4096, SHA1_ID); + EXPECT(buf.len < 200); + strbuf_release(&buf); + free_names(names); +} + +static void test_table_read_api(void) +{ + char **names; + struct strbuf buf = STRBUF_INIT; + int N = 50; + struct reftable_reader rd = { NULL }; + struct reftable_block_source source = { NULL }; + int err; + int i; + struct reftable_log_record log = { NULL }; + struct reftable_iterator it = { NULL }; + + write_table(&names, &buf, N, 256, SHA1_ID); + + block_source_from_strbuf(&source, &buf); + + err = init_reader(&rd, &source, "file.ref"); + EXPECT_ERR(err); + + err = reftable_reader_seek_ref(&rd, &it, names[0]); + EXPECT_ERR(err); + + err = reftable_iterator_next_log(&it, &log); + EXPECT(err == REFTABLE_API_ERROR); + + strbuf_release(&buf); + for (i = 0; i < N; i++) { + reftable_free(names[i]); + } + reftable_iterator_destroy(&it); + 
reftable_free(names); + reader_close(&rd); + strbuf_release(&buf); +} + +static void test_table_read_write_seek(int index, int hash_id) +{ + char **names; + struct strbuf buf = STRBUF_INIT; + int N = 50; + struct reftable_reader rd = { NULL }; + struct reftable_block_source source = { NULL }; + int err; + int i = 0; + + struct reftable_iterator it = { NULL }; + struct strbuf pastLast = STRBUF_INIT; + struct reftable_ref_record ref = { NULL }; + + write_table(&names, &buf, N, 256, hash_id); + + block_source_from_strbuf(&source, &buf); + + err = init_reader(&rd, &source, "file.ref"); + EXPECT_ERR(err); + EXPECT(hash_id == reftable_reader_hash_id(&rd)); + + if (!index) { + rd.ref_offsets.index_offset = 0; + } else { + EXPECT(rd.ref_offsets.index_offset > 0); + } + + for (i = 1; i < N; i++) { + int err = reftable_reader_seek_ref(&rd, &it, names[i]); + EXPECT_ERR(err); + err = reftable_iterator_next_ref(&it, &ref); + EXPECT_ERR(err); + EXPECT(0 == strcmp(names[i], ref.refname)); + EXPECT(REFTABLE_REF_VAL1 == ref.value_type); + EXPECT(i == ref.value.val1[0]); + + reftable_ref_record_release(&ref); + reftable_iterator_destroy(&it); + } + + strbuf_addstr(&pastLast, names[N - 1]); + strbuf_addstr(&pastLast, "/"); + + err = reftable_reader_seek_ref(&rd, &it, pastLast.buf); + if (err == 0) { + struct reftable_ref_record ref = { NULL }; + int err = reftable_iterator_next_ref(&it, &ref); + EXPECT(err > 0); + } else { + EXPECT(err > 0); + } + + strbuf_release(&pastLast); + reftable_iterator_destroy(&it); + + strbuf_release(&buf); + for (i = 0; i < N; i++) { + reftable_free(names[i]); + } + reftable_free(names); + reader_close(&rd); +} + +static void test_table_read_write_seek_linear(void) +{ + test_table_read_write_seek(0, SHA1_ID); +} + +static void test_table_read_write_seek_linear_sha256(void) +{ + test_table_read_write_seek(0, SHA256_ID); +} + +static void test_table_read_write_seek_index(void) +{ + test_table_read_write_seek(1, SHA1_ID); +} + +static void 
test_table_refs_for(int indexed) +{ + int N = 50; + char **want_names = reftable_calloc(sizeof(char *) * (N + 1)); + int want_names_len = 0; + uint8_t want_hash[SHA1_SIZE]; + + struct reftable_write_options opts = { + .block_size = 256, + }; + struct reftable_ref_record ref = { NULL }; + int i = 0; + int n; + int err; + struct reftable_reader rd; + struct reftable_block_source source = { NULL }; + + struct strbuf buf = STRBUF_INIT; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + + struct reftable_iterator it = { NULL }; + int j; + + set_test_hash(want_hash, 4); + + for (i = 0; i < N; i++) { + uint8_t hash[SHA1_SIZE]; + char fill[51] = { 0 }; + char name[100]; + uint8_t hash1[SHA1_SIZE]; + uint8_t hash2[SHA1_SIZE]; + struct reftable_ref_record ref = { NULL }; + + memset(hash, i, sizeof(hash)); + memset(fill, 'x', 50); + /* Put the variable part at the start */ + snprintf(name, sizeof(name), "br%02d%s", i, fill); + name[40] = 0; + ref.refname = name; + + set_test_hash(hash1, i / 4); + set_test_hash(hash2, 3 + i / 4); + ref.value_type = REFTABLE_REF_VAL2; + ref.value.val2.value = hash1; + ref.value.val2.target_value = hash2; + + /* 80 bytes / entry, so 3 entries per block. Yields 17 + * blocks. + 
*/ + n = reftable_writer_add_ref(w, &ref); + EXPECT(n == 0); + + if (!memcmp(hash1, want_hash, SHA1_SIZE) || + !memcmp(hash2, want_hash, SHA1_SIZE)) { + want_names[want_names_len++] = xstrdup(name); + } + } + + n = reftable_writer_close(w); + EXPECT(n == 0); + + reftable_writer_free(w); + w = NULL; + + block_source_from_strbuf(&source, &buf); + + err = init_reader(&rd, &source, "file.ref"); + EXPECT_ERR(err); + if (!indexed) { + rd.obj_offsets.is_present = 0; + } + + err = reftable_reader_seek_ref(&rd, &it, ""); + EXPECT_ERR(err); + reftable_iterator_destroy(&it); + + err = reftable_reader_refs_for(&rd, &it, want_hash); + EXPECT_ERR(err); + + j = 0; + while (1) { + int err = reftable_iterator_next_ref(&it, &ref); + EXPECT(err >= 0); + if (err > 0) { + break; + } + + EXPECT(j < want_names_len); + EXPECT(0 == strcmp(ref.refname, want_names[j])); + j++; + reftable_ref_record_release(&ref); + } + EXPECT(j == want_names_len); + + strbuf_release(&buf); + free_names(want_names); + reftable_iterator_destroy(&it); + reader_close(&rd); +} + +static void test_table_refs_for_no_index(void) +{ + test_table_refs_for(0); +} + +static void test_table_refs_for_obj_index(void) +{ + test_table_refs_for(1); +} + +static void test_table_empty(void) +{ + struct reftable_write_options opts = { 0 }; + struct strbuf buf = STRBUF_INIT; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + struct reftable_block_source source = { NULL }; + struct reftable_reader *rd = NULL; + struct reftable_ref_record rec = { NULL }; + struct reftable_iterator it = { NULL }; + int err; + + reftable_writer_set_limits(w, 1, 1); + + err = reftable_writer_close(w); + EXPECT(err == REFTABLE_EMPTY_TABLE_ERROR); + reftable_writer_free(w); + + EXPECT(buf.len == header_size(1) + footer_size(1)); + + block_source_from_strbuf(&source, &buf); + + err = reftable_new_reader(&rd, &source, "filename"); + EXPECT_ERR(err); + + err = reftable_reader_seek_ref(rd, &it, ""); + EXPECT_ERR(err); + + 
err = reftable_iterator_next_ref(&it, &rec); + EXPECT(err > 0); + + reftable_iterator_destroy(&it); + reftable_reader_free(rd); + strbuf_release(&buf); +} + +int reftable_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_log_write_read); + RUN_TEST(test_table_read_write_seek_linear_sha256); + RUN_TEST(test_log_buffer_size); + RUN_TEST(test_table_write_small_table); + RUN_TEST(test_buffer); + RUN_TEST(test_table_read_api); + RUN_TEST(test_table_read_write_sequential); + RUN_TEST(test_table_read_write_seek_linear); + RUN_TEST(test_table_read_write_seek_index); + RUN_TEST(test_table_refs_for_no_index); + RUN_TEST(test_table_refs_for_obj_index); + RUN_TEST(test_table_empty); + return 0; +} diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 050551fa698..fdf92586737 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -6,6 +6,7 @@ int cmd__reftable(int argc, const char **argv) basics_test_main(argc, argv); block_test_main(argc, argv); record_test_main(argc, argv); + reftable_test_main(argc, argv); tree_test_main(argc, argv); return 0; }
From patchwork Wed Dec 9 14:00:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11961569 Message-Id: <557183d3e3e252932ce2f8b7c96d7378e295e2dd.1607522429.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Han-Wen Nienhuys via GitGitGadget" Date: Wed, 09 Dec 2020 14:00:26 +0000 Subject: [PATCH v4 12/15] reftable: rest of library Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys, Jeff King, Ramsay Jones, Jonathan Nieder, Johannes Schindelin, Jonathan Tan, Josh Steadmon, Emily Shaffer, Patrick Steinhardt, Ævar Arnfjörð Bjarmason, Felipe Contreras, Han-Wen Nienhuys, Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org
From: Han-Wen Nienhuys Signed-off-by: Han-Wen Nienhuys --- Makefile | 4 + reftable/VERSION | 1 + reftable/dump.c | 212 ++++++ reftable/merged.c | 366 ++++++++++ reftable/merged.h | 35 + reftable/merged_test.c | 343 ++++++++++ reftable/pq.c | 115 ++++ reftable/pq.h | 32 + reftable/refname.c | 209 ++++++ reftable/refname.h | 29 + reftable/refname_test.c | 102 +++ reftable/reftable-generic.h | 48 ++ reftable/reftable-merged.h | 68 ++ reftable/reftable-stack.h | 120 ++++ reftable/reftable.c | 98 +++ reftable/stack.c | 1260 +++++++++++++++++++++++++++++++++++ reftable/stack.h | 41 ++ reftable/stack_test.c | 803 ++++++++++++++++++++++ t/helper/test-reftable.c | 3 + 19 files changed, 3889 insertions(+) create mode 100644 reftable/VERSION create mode 100644 reftable/dump.c create mode 100644 reftable/merged.c create mode 100644 reftable/merged.h create mode 100644 reftable/merged_test.c create mode 
100644 reftable/pq.c create mode 100644 reftable/pq.h create mode 100644 reftable/refname.c create mode 100644 reftable/refname.h create mode 100644 reftable/refname_test.c create mode 100644 reftable/reftable-generic.h create mode 100644 reftable/reftable-merged.h create mode 100644 reftable/reftable-stack.h create mode 100644 reftable/reftable.c create mode 100644 reftable/stack.c create mode 100644 reftable/stack.h create mode 100644 reftable/stack_test.c diff --git a/Makefile b/Makefile index c913e90b643..18cc18c2153 100644 --- a/Makefile +++ b/Makefile @@ -2394,10 +2394,14 @@ REFTABLE_OBJS += reftable/error.o REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/iter.o +REFTABLE_OBJS += reftable/merged.o +REFTABLE_OBJS += reftable/pq.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_OBJS += reftable/reader.o REFTABLE_OBJS += reftable/record.o +REFTABLE_OBJS += reftable/refname.o REFTABLE_OBJS += reftable/reftable.o +REFTABLE_OBJS += reftable/stack.o REFTABLE_OBJS += reftable/tree.o REFTABLE_OBJS += reftable/writer.o REFTABLE_OBJS += reftable/zlib-compat.o diff --git a/reftable/VERSION b/reftable/VERSION new file mode 100644 index 00000000000..a67c0682e1b --- /dev/null +++ b/reftable/VERSION @@ -0,0 +1 @@ +9b4a54059db9a05c270c0a0587f245bc6868d576 C: use rand() rather than cobbled together random generator. 
diff --git a/reftable/dump.c b/reftable/dump.c new file mode 100644 index 00000000000..00b444e8c9b --- /dev/null +++ b/reftable/dump.c @@ -0,0 +1,212 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include +#include +#include +#include +#include + +#include "reftable.h" +#include "reftable-tests.h" + +static uint32_t hash_id; + +static int dump_table(const char *tablename) +{ + struct reftable_block_source src = { 0 }; + int err = reftable_block_source_from_file(&src, tablename); + struct reftable_iterator it = { 0 }; + struct reftable_ref_record ref = { 0 }; + struct reftable_log_record log = { 0 }; + struct reftable_reader *r = NULL; + + if (err < 0) + return err; + + err = reftable_new_reader(&r, &src, tablename); + if (err < 0) + return err; + + err = reftable_reader_seek_ref(r, &it, ""); + if (err < 0) { + return err; + } + + while (1) { + err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + break; + } + if (err < 0) { + return err; + } + reftable_ref_record_print(&ref, hash_id); + } + reftable_iterator_destroy(&it); + reftable_ref_record_clear(&ref); + + err = reftable_reader_seek_log(r, &it, ""); + if (err < 0) { + return err; + } + while (1) { + err = reftable_iterator_next_log(&it, &log); + if (err > 0) { + break; + } + if (err < 0) { + return err; + } + reftable_log_record_print(&log, hash_id); + } + reftable_iterator_destroy(&it); + reftable_log_record_clear(&log); + + reftable_reader_free(r); + return 0; +} + +static int compact_stack(const char *stackdir) +{ + struct reftable_stack *stack = NULL; + struct reftable_write_options cfg = {}; + + int err = reftable_new_stack(&stack, stackdir, cfg); + if (err < 0) + goto done; + + err = reftable_stack_compact_all(stack, NULL); + if (err < 0) + goto done; +done: + if (stack != NULL) { + reftable_stack_destroy(stack); + } + return err; +} + 
+static int dump_stack(const char *stackdir) +{ + struct reftable_stack *stack = NULL; + struct reftable_write_options cfg = {}; + struct reftable_iterator it = { 0 }; + struct reftable_ref_record ref = { 0 }; + struct reftable_log_record log = { 0 }; + struct reftable_merged_table *merged = NULL; + + int err = reftable_new_stack(&stack, stackdir, cfg); + if (err < 0) + return err; + + merged = reftable_stack_merged_table(stack); + + err = reftable_merged_table_seek_ref(merged, &it, ""); + if (err < 0) { + return err; + } + + while (1) { + err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + break; + } + if (err < 0) { + return err; + } + reftable_ref_record_print(&ref, hash_id); + } + reftable_iterator_destroy(&it); + reftable_ref_record_clear(&ref); + + err = reftable_merged_table_seek_log(merged, &it, ""); + if (err < 0) { + return err; + } + while (1) { + err = reftable_iterator_next_log(&it, &log); + if (err > 0) { + break; + } + if (err < 0) { + return err; + } + reftable_log_record_print(&log, hash_id); + } + reftable_iterator_destroy(&it); + reftable_log_record_clear(&log); + + reftable_stack_destroy(stack); + return 0; +} + +static void print_help(void) +{ + printf("usage: dump [-cst] arg\n\n" + "options: \n" + " -c compact\n" + " -t dump table\n" + " -s dump stack\n" + " -h this help\n" + " -2 use SHA256\n" + "\n"); +} + +int reftable_dump_main(int argc, char *const *argv) +{ + int err = 0; + int opt; + int opt_dump_table = 0; + int opt_dump_stack = 0; + int opt_compact = 0; + const char *arg = NULL; + while ((opt = getopt(argc, argv, "2chts")) != -1) { + switch (opt) { + case '2': + hash_id = 0x73323536; + break; + case 't': + opt_dump_table = 1; + break; + case 's': + opt_dump_stack = 1; + break; + case 'c': + opt_compact = 1; + break; + case '?': + case 'h': + print_help(); + return 2; + break; + } + } + + if (argv[optind] == NULL) { + fprintf(stderr, "need argument\n"); + print_help(); + return 2; + } + + arg = argv[optind]; + + if 
(opt_dump_table) { + err = dump_table(arg); + } else if (opt_dump_stack) { + err = dump_stack(arg); + } else if (opt_compact) { + err = compact_stack(arg); + } + + if (err < 0) { + fprintf(stderr, "%s: %s: %s\n", argv[0], arg, + reftable_error_str(err)); + return 1; + } + return 0; +} diff --git a/reftable/merged.c b/reftable/merged.c new file mode 100644 index 00000000000..ed02625bb54 --- /dev/null +++ b/reftable/merged.c @@ -0,0 +1,366 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "merged.h" + +#include "constants.h" +#include "iter.h" +#include "pq.h" +#include "reader.h" +#include "record.h" +#include "reftable-merged.h" +#include "reftable-error.h" +#include "system.h" + +static int merged_iter_init(struct merged_iter *mi) +{ + int i = 0; + for (i = 0; i < mi->stack_len; i++) { + struct reftable_record rec = reftable_new_record(mi->typ); + int err = iterator_next(&mi->stack[i], &rec); + if (err < 0) { + return err; + } + + if (err > 0) { + reftable_iterator_destroy(&mi->stack[i]); + reftable_record_destroy(&rec); + } else { + struct pq_entry e = { + .rec = rec, + .index = i, + }; + merged_iter_pqueue_add(&mi->pq, e); + } + } + + return 0; +} + +static void merged_iter_close(void *p) +{ + struct merged_iter *mi = (struct merged_iter *)p; + int i = 0; + merged_iter_pqueue_release(&mi->pq); + for (i = 0; i < mi->stack_len; i++) { + reftable_iterator_destroy(&mi->stack[i]); + } + reftable_free(mi->stack); +} + +static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi, + size_t idx) +{ + struct reftable_record rec = reftable_new_record(mi->typ); + struct pq_entry e = { + .rec = rec, + .index = idx, + }; + int err = iterator_next(&mi->stack[idx], &rec); + if (err < 0) + return err; + + if (err > 0) { + reftable_iterator_destroy(&mi->stack[idx]); + reftable_record_destroy(&rec); + return 0; 
+ } + + merged_iter_pqueue_add(&mi->pq, e); + return 0; +} + +static int merged_iter_advance_subiter(struct merged_iter *mi, size_t idx) +{ + if (iterator_is_null(&mi->stack[idx])) + return 0; + return merged_iter_advance_nonnull_subiter(mi, idx); +} + +static int merged_iter_next_entry(struct merged_iter *mi, + struct reftable_record *rec) +{ + struct strbuf entry_key = STRBUF_INIT; + struct pq_entry entry = { 0 }; + int err = 0; + + if (merged_iter_pqueue_is_empty(mi->pq)) + return 1; + + entry = merged_iter_pqueue_remove(&mi->pq); + err = merged_iter_advance_subiter(mi, entry.index); + if (err < 0) + return err; + + /* + One can also use reftable as datacenter-local storage, where the ref + database is maintained in a globally consistent database (e.g. + CockroachDB or Spanner). In this scenario, replication delays together + with compaction may cause newer tables to contain older entries. In + such a deployment, the loop below must be changed to collect all + entries for the same key, and return only the newest one. 
+ */ + reftable_record_key(&entry.rec, &entry_key); + while (!merged_iter_pqueue_is_empty(mi->pq)) { + struct pq_entry top = merged_iter_pqueue_top(mi->pq); + struct strbuf k = STRBUF_INIT; + int err = 0, cmp = 0; + + reftable_record_key(&top.rec, &k); + + cmp = strbuf_cmp(&k, &entry_key); + strbuf_release(&k); + + if (cmp > 0) { + break; + } + + merged_iter_pqueue_remove(&mi->pq); + err = merged_iter_advance_subiter(mi, top.index); + if (err < 0) { + return err; + } + reftable_record_destroy(&top.rec); + } + + reftable_record_copy_from(rec, &entry.rec, hash_size(mi->hash_id)); + reftable_record_destroy(&entry.rec); + strbuf_release(&entry_key); + return 0; +} + +static int merged_iter_next(struct merged_iter *mi, struct reftable_record *rec) +{ + while (1) { + int err = merged_iter_next_entry(mi, rec); + if (err == 0 && mi->suppress_deletions && + reftable_record_is_deletion(rec)) { + continue; + } + + return err; + } +} + +static int merged_iter_next_void(void *p, struct reftable_record *rec) +{ + struct merged_iter *mi = (struct merged_iter *)p; + if (merged_iter_pqueue_is_empty(mi->pq)) + return 1; + + return merged_iter_next(mi, rec); +} + +static struct reftable_iterator_vtable merged_iter_vtable = { + .next = &merged_iter_next_void, + .close = &merged_iter_close, +}; + +static void iterator_from_merged_iter(struct reftable_iterator *it, + struct merged_iter *mi) +{ + assert(it->ops == NULL); + it->iter_arg = mi; + it->ops = &merged_iter_vtable; +} + +int reftable_new_merged_table(struct reftable_merged_table **dest, + struct reftable_table *stack, int n, + uint32_t hash_id) +{ + struct reftable_merged_table *m = NULL; + uint64_t last_max = 0; + uint64_t first_min = 0; + int i = 0; + for (i = 0; i < n; i++) { + uint64_t min = reftable_table_min_update_index(&stack[i]); + uint64_t max = reftable_table_max_update_index(&stack[i]); + + if (reftable_table_hash_id(&stack[i]) != hash_id) { + return REFTABLE_FORMAT_ERROR; + } + if (i == 0 || min < first_min) { + 
first_min = min; + } + if (i == 0 || max > last_max) { + last_max = max; + } + } + + m = (struct reftable_merged_table *)reftable_calloc( + sizeof(struct reftable_merged_table)); + m->stack = stack; + m->stack_len = n; + m->min = first_min; + m->max = last_max; + m->hash_id = hash_id; + *dest = m; + return 0; +} + +/* clears the list of subtables, without affecting the readers themselves. */ +void merged_table_release(struct reftable_merged_table *mt) +{ + FREE_AND_NULL(mt->stack); + mt->stack_len = 0; +} + +void reftable_merged_table_free(struct reftable_merged_table *mt) +{ + if (mt == NULL) { + return; + } + merged_table_release(mt); + reftable_free(mt); +} + +uint64_t +reftable_merged_table_max_update_index(struct reftable_merged_table *mt) +{ + return mt->max; +} + +uint64_t +reftable_merged_table_min_update_index(struct reftable_merged_table *mt) +{ + return mt->min; +} + +static int reftable_table_seek_record(struct reftable_table *tab, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + return tab->ops->seek_record(tab->table_arg, it, rec); +} + +static int merged_table_seek_record(struct reftable_merged_table *mt, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + struct reftable_iterator *iters = reftable_calloc( + sizeof(struct reftable_iterator) * mt->stack_len); + struct merged_iter merged = { + .stack = iters, + .typ = reftable_record_type(rec), + .hash_id = mt->hash_id, + .suppress_deletions = mt->suppress_deletions, + }; + int n = 0; + int err = 0; + int i = 0; + for (i = 0; i < mt->stack_len && err == 0; i++) { + int e = reftable_table_seek_record(&mt->stack[i], &iters[n], + rec); + if (e < 0) { + err = e; + } + if (e == 0) { + n++; + } + } + if (err < 0) { + int i = 0; + for (i = 0; i < n; i++) { + reftable_iterator_destroy(&iters[i]); + } + reftable_free(iters); + return err; + } + + merged.stack_len = n; + err = merged_iter_init(&merged); + if (err < 0) { + merged_iter_close(&merged); + return err; + } else { + 
struct merged_iter *p = + reftable_malloc(sizeof(struct merged_iter)); + *p = merged; + iterator_from_merged_iter(it, p); + } + return 0; +} + +int reftable_merged_table_seek_ref(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name) +{ + struct reftable_ref_record ref = { + .refname = (char *)name, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, &ref); + return merged_table_seek_record(mt, it, &rec); +} + +int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name, uint64_t update_index) +{ + struct reftable_log_record log = { + .refname = (char *)name, + .update_index = update_index, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_log(&rec, &log); + return merged_table_seek_record(mt, it, &rec); +} + +int reftable_merged_table_seek_log(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name) +{ + uint64_t max = ~((uint64_t)0); + return reftable_merged_table_seek_log_at(mt, it, name, max); +} + +uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *mt) +{ + return mt->hash_id; +} + +static int reftable_merged_table_seek_void(void *tab, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + return merged_table_seek_record((struct reftable_merged_table *)tab, it, + rec); +} + +static uint32_t reftable_merged_table_hash_id_void(void *tab) +{ + return reftable_merged_table_hash_id( + (struct reftable_merged_table *)tab); +} + +static uint64_t reftable_merged_table_min_update_index_void(void *tab) +{ + return reftable_merged_table_min_update_index( + (struct reftable_merged_table *)tab); +} + +static uint64_t reftable_merged_table_max_update_index_void(void *tab) +{ + return reftable_merged_table_max_update_index( + (struct reftable_merged_table *)tab); +} + +static struct reftable_table_vtable merged_table_vtable = { + .seek_record = reftable_merged_table_seek_void, + 
.hash_id = reftable_merged_table_hash_id_void, + .min_update_index = reftable_merged_table_min_update_index_void, + .max_update_index = reftable_merged_table_max_update_index_void, +}; + +void reftable_table_from_merged_table(struct reftable_table *tab, + struct reftable_merged_table *merged) +{ + assert(tab->ops == NULL); + tab->ops = &merged_table_vtable; + tab->table_arg = merged; +} diff --git a/reftable/merged.h b/reftable/merged.h new file mode 100644 index 00000000000..8c4d4d58d77 --- /dev/null +++ b/reftable/merged.h @@ -0,0 +1,35 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef MERGED_H +#define MERGED_H + +#include "pq.h" + +struct reftable_merged_table { + struct reftable_table *stack; + size_t stack_len; + uint32_t hash_id; + int suppress_deletions; + + uint64_t min; + uint64_t max; +}; + +struct merged_iter { + struct reftable_iterator *stack; + uint32_t hash_id; + size_t stack_len; + uint8_t typ; + int suppress_deletions; + struct merged_iter_pqueue pq; +}; + +void merged_table_release(struct reftable_merged_table *mt); + +#endif diff --git a/reftable/merged_test.c b/reftable/merged_test.c new file mode 100644 index 00000000000..09a70aa0fda --- /dev/null +++ b/reftable/merged_test.c @@ -0,0 +1,343 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "merged.h" + +#include "system.h" + +#include "basics.h" +#include "blocksource.h" +#include "constants.h" +#include "pq.h" +#include "reader.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-merged.h" +#include "reftable-tests.h" +#include "reftable-generic.h" +#include "reftable-stack.h" + +static void test_pq(void) +{ + char *names[54] = { NULL }; + int N = 
ARRAY_SIZE(names) - 1; + + struct merged_iter_pqueue pq = { NULL }; + const char *last = NULL; + + int i = 0; + for (i = 0; i < N; i++) { + char name[100]; + snprintf(name, sizeof(name), "%02d", i); + names[i] = xstrdup(name); + } + + i = 1; + do { + struct reftable_record rec = + reftable_new_record(BLOCK_TYPE_REF); + struct pq_entry e = { 0 }; + + reftable_record_as_ref(&rec)->refname = names[i]; + e.rec = rec; + merged_iter_pqueue_add(&pq, e); + merged_iter_pqueue_check(pq); + i = (i * 7) % N; + } while (i != 1); + + while (!merged_iter_pqueue_is_empty(pq)) { + struct pq_entry e = merged_iter_pqueue_remove(&pq); + struct reftable_ref_record *ref = + reftable_record_as_ref(&e.rec); + + merged_iter_pqueue_check(pq); + + if (last != NULL) { + assert(strcmp(last, ref->refname) < 0); + } + last = ref->refname; + ref->refname = NULL; + reftable_free(ref); + } + + for (i = 0; i < N; i++) { + reftable_free(names[i]); + } + + merged_iter_pqueue_release(&pq); +} + +static void write_test_table(struct strbuf *buf, + struct reftable_ref_record refs[], int n) +{ + int min = 0xffffffff; + int max = 0; + int i = 0; + int err; + + struct reftable_write_options opts = { + .block_size = 256, + }; + struct reftable_writer *w = NULL; + for (i = 0; i < n; i++) { + uint64_t ui = refs[i].update_index; + if (ui > max) { + max = ui; + } + if (ui < min) { + min = ui; + } + } + + w = reftable_new_writer(&strbuf_add_void, buf, &opts); + reftable_writer_set_limits(w, min, max); + + for (i = 0; i < n; i++) { + uint64_t before = refs[i].update_index; + int n = reftable_writer_add_ref(w, &refs[i]); + assert(n == 0); + assert(before == refs[i].update_index); + } + + err = reftable_writer_close(w); + EXPECT_ERR(err); + + reftable_writer_free(w); +} + +static struct reftable_merged_table * +merged_table_from_records(struct reftable_ref_record **refs, + struct reftable_block_source **source, + struct reftable_reader ***readers, int *sizes, + struct strbuf *buf, int n) +{ + int i = 0; + struct 
reftable_merged_table *mt = NULL; + int err; + struct reftable_table *tabs = + reftable_calloc(n * sizeof(struct reftable_table)); + *readers = reftable_calloc(n * sizeof(struct reftable_reader *)); + *source = reftable_calloc(n * sizeof(**source)); + for (i = 0; i < n; i++) { + write_test_table(&buf[i], refs[i], sizes[i]); + block_source_from_strbuf(&(*source)[i], &buf[i]); + + err = reftable_new_reader(&(*readers)[i], &(*source)[i], + "name"); + EXPECT_ERR(err); + reftable_table_from_reader(&tabs[i], (*readers)[i]); + } + + err = reftable_new_merged_table(&mt, tabs, n, SHA1_ID); + EXPECT_ERR(err); + return mt; +} + +static void readers_destroy(struct reftable_reader **readers, size_t n) +{ + int i = 0; + for (; i < n; i++) + reftable_reader_free(readers[i]); + reftable_free(readers); +} + +static void test_merged_between(void) +{ + uint8_t hash1[SHA1_SIZE] = { 1, 2, 3, 0 }; + + struct reftable_ref_record r1[] = { { + .refname = "b", + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + } }; + struct reftable_ref_record r2[] = { { + .refname = "a", + .update_index = 2, + .value_type = REFTABLE_REF_DELETION, + } }; + + struct reftable_ref_record *refs[] = { r1, r2 }; + int sizes[] = { 1, 1 }; + struct strbuf bufs[2] = { STRBUF_INIT, STRBUF_INIT }; + struct reftable_block_source *bs = NULL; + struct reftable_reader **readers = NULL; + struct reftable_merged_table *mt = + merged_table_from_records(refs, &bs, &readers, sizes, bufs, 2); + int i; + struct reftable_ref_record ref = { NULL }; + struct reftable_iterator it = { NULL }; + int err = reftable_merged_table_seek_ref(mt, &it, "a"); + EXPECT_ERR(err); + + err = reftable_iterator_next_ref(&it, &ref); + EXPECT_ERR(err); + EXPECT(ref.update_index == 2); + reftable_ref_record_release(&ref); + reftable_iterator_destroy(&it); + readers_destroy(readers, 2); + reftable_merged_table_free(mt); + for (i = 0; i < ARRAY_SIZE(bufs); i++) { + strbuf_release(&bufs[i]); + } + reftable_free(bs); +} + 
+static void test_merged(void) +{ + uint8_t hash1[SHA1_SIZE] = { 1 }; + uint8_t hash2[SHA1_SIZE] = { 2 }; + struct reftable_ref_record r1[] = { + { + .refname = "a", + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + }, + { + .refname = "b", + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + }, + { + .refname = "c", + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + } + }; + struct reftable_ref_record r2[] = { { + .refname = "a", + .update_index = 2, + .value_type = REFTABLE_REF_DELETION, + } }; + struct reftable_ref_record r3[] = { + { + .refname = "c", + .update_index = 3, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash2, + }, + { + .refname = "d", + .update_index = 3, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + }, + }; + + struct reftable_ref_record want[] = { + r2[0], + r1[1], + r3[0], + r3[1], + }; + + struct reftable_ref_record *refs[] = { r1, r2, r3 }; + int sizes[3] = { 3, 1, 2 }; + struct strbuf bufs[3] = { STRBUF_INIT, STRBUF_INIT, STRBUF_INIT }; + struct reftable_block_source *bs = NULL; + struct reftable_reader **readers = NULL; + struct reftable_merged_table *mt = + merged_table_from_records(refs, &bs, &readers, sizes, bufs, 3); + + struct reftable_iterator it = { NULL }; + int err = reftable_merged_table_seek_ref(mt, &it, "a"); + struct reftable_ref_record *out = NULL; + size_t len = 0; + size_t cap = 0; + int i = 0; + + EXPECT_ERR(err); + while (len < 100) { /* cap loops/recursion. 
*/ + struct reftable_ref_record ref = { NULL }; + int err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + break; + } + if (len == cap) { + cap = 2 * cap + 1; + out = reftable_realloc( + out, sizeof(struct reftable_ref_record) * cap); + } + out[len++] = ref; + } + reftable_iterator_destroy(&it); + + assert(ARRAY_SIZE(want) == len); + for (i = 0; i < len; i++) { + assert(reftable_ref_record_equal(&want[i], &out[i], SHA1_SIZE)); + } + for (i = 0; i < len; i++) { + reftable_ref_record_release(&out[i]); + } + reftable_free(out); + + for (i = 0; i < 3; i++) { + strbuf_release(&bufs[i]); + } + readers_destroy(readers, 3); + reftable_merged_table_free(mt); + reftable_free(bs); +} + +static void test_default_write_opts(void) +{ + struct reftable_write_options opts = { 0 }; + struct strbuf buf = STRBUF_INIT; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + + struct reftable_ref_record rec = { + .refname = "master", + .update_index = 1, + }; + int err; + struct reftable_block_source source = { NULL }; + struct reftable_table *tab = reftable_calloc(sizeof(*tab) * 1); + uint32_t hash_id; + struct reftable_reader *rd = NULL; + struct reftable_merged_table *merged = NULL; + + reftable_writer_set_limits(w, 1, 1); + + err = reftable_writer_add_ref(w, &rec); + EXPECT_ERR(err); + + err = reftable_writer_close(w); + EXPECT_ERR(err); + reftable_writer_free(w); + + block_source_from_strbuf(&source, &buf); + + err = reftable_new_reader(&rd, &source, "filename"); + EXPECT_ERR(err); + + hash_id = reftable_reader_hash_id(rd); + assert(hash_id == SHA1_ID); + + reftable_table_from_reader(&tab[0], rd); + err = reftable_new_merged_table(&merged, tab, 1, SHA1_ID); + EXPECT_ERR(err); + + reftable_reader_free(rd); + reftable_merged_table_free(merged); + strbuf_release(&buf); +} + +/* XXX test refs_for(oid) */ + +int merged_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_merged_between); + RUN_TEST(test_pq); + RUN_TEST(test_merged); + 
RUN_TEST(test_default_write_opts); + return 0; +} diff --git a/reftable/pq.c b/reftable/pq.c new file mode 100644 index 00000000000..8918d158e2d --- /dev/null +++ b/reftable/pq.c @@ -0,0 +1,115 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "pq.h" + +#include "reftable-record.h" +#include "system.h" +#include "basics.h" + +static int pq_less(struct pq_entry a, struct pq_entry b) +{ + struct strbuf ak = STRBUF_INIT; + struct strbuf bk = STRBUF_INIT; + int cmp = 0; + reftable_record_key(&a.rec, &ak); + reftable_record_key(&b.rec, &bk); + + cmp = strbuf_cmp(&ak, &bk); + + strbuf_release(&ak); + strbuf_release(&bk); + + if (cmp == 0) + return a.index > b.index; + + return cmp < 0; +} + +struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq) +{ + return pq.heap[0]; +} + +int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq) +{ + return pq.len == 0; +} + +void merged_iter_pqueue_check(struct merged_iter_pqueue pq) +{ + int i = 0; + for (i = 1; i < pq.len; i++) { + int parent = (i - 1) / 2; + + assert(pq_less(pq.heap[parent], pq.heap[i])); + } +} + +struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq) +{ + int i = 0; + struct pq_entry e = pq->heap[0]; + pq->heap[0] = pq->heap[pq->len - 1]; + pq->len--; + + i = 0; + while (i < pq->len) { + int min = i; + int j = 2 * i + 1; + int k = 2 * i + 2; + if (j < pq->len && pq_less(pq->heap[j], pq->heap[i])) { + min = j; + } + if (k < pq->len && pq_less(pq->heap[k], pq->heap[min])) { + min = k; + } + + if (min == i) { + break; + } + + SWAP(pq->heap[i], pq->heap[min]); + i = min; + } + + return e; +} + +void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e) +{ + int i = 0; + if (pq->len == pq->cap) { + pq->cap = 2 * pq->cap + 1; + pq->heap = reftable_realloc(pq->heap, + pq->cap * sizeof(struct pq_entry)); + 
} + + pq->heap[pq->len++] = e; + i = pq->len - 1; + while (i > 0) { + int j = (i - 1) / 2; + if (pq_less(pq->heap[j], pq->heap[i])) { + break; + } + + SWAP(pq->heap[j], pq->heap[i]); + + i = j; + } +} + +void merged_iter_pqueue_release(struct merged_iter_pqueue *pq) +{ + int i = 0; + for (i = 0; i < pq->len; i++) { + reftable_record_destroy(&pq->heap[i].rec); + } + FREE_AND_NULL(pq->heap); + pq->len = pq->cap = 0; +} diff --git a/reftable/pq.h b/reftable/pq.h new file mode 100644 index 00000000000..385d2fb139a --- /dev/null +++ b/reftable/pq.h @@ -0,0 +1,32 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef PQ_H +#define PQ_H + +#include "record.h" + +struct pq_entry { + int index; + struct reftable_record rec; +}; + +struct merged_iter_pqueue { + struct pq_entry *heap; + size_t len; + size_t cap; +}; + +struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq); +int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq); +void merged_iter_pqueue_check(struct merged_iter_pqueue pq); +struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq); +void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e); +void merged_iter_pqueue_release(struct merged_iter_pqueue *pq); + +#endif diff --git a/reftable/refname.c b/reftable/refname.c new file mode 100644 index 00000000000..0f4eb3b292d --- /dev/null +++ b/reftable/refname.c @@ -0,0 +1,209 @@ +/* + Copyright 2020 Google LLC + + Use of this source code is governed by a BSD-style + license that can be found in the LICENSE file or at + https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" +#include "reftable-error.h" +#include "basics.h" +#include "refname.h" +#include "reftable-iterator.h" + +struct find_arg { + char **names; + const char *want; +}; + +static int find_name(size_t k, void *arg) 
+{ + struct find_arg *f_arg = (struct find_arg *)arg; + return strcmp(f_arg->names[k], f_arg->want) >= 0; +} + +static int modification_has_ref(struct modification *mod, const char *name) +{ + struct reftable_ref_record ref = { NULL }; + int err = 0; + + if (mod->add_len > 0) { + struct find_arg arg = { + .names = mod->add, + .want = name, + }; + int idx = binsearch(mod->add_len, find_name, &arg); + if (idx < mod->add_len && !strcmp(mod->add[idx], name)) { + return 0; + } + } + + if (mod->del_len > 0) { + struct find_arg arg = { + .names = mod->del, + .want = name, + }; + int idx = binsearch(mod->del_len, find_name, &arg); + if (idx < mod->del_len && !strcmp(mod->del[idx], name)) { + return 1; + } + } + + err = reftable_table_read_ref(&mod->tab, name, &ref); + reftable_ref_record_release(&ref); + return err; +} + +static void modification_release(struct modification *mod) +{ + /* don't delete the strings themselves; they're owned by ref records. + */ + FREE_AND_NULL(mod->add); + FREE_AND_NULL(mod->del); + mod->add_len = 0; + mod->del_len = 0; +} + +static int modification_has_ref_with_prefix(struct modification *mod, + const char *prefix) +{ + struct reftable_iterator it = { NULL }; + struct reftable_ref_record ref = { NULL }; + int err = 0; + + if (mod->add_len > 0) { + struct find_arg arg = { + .names = mod->add, + .want = prefix, + }; + int idx = binsearch(mod->add_len, find_name, &arg); + if (idx < mod->add_len && + !strncmp(prefix, mod->add[idx], strlen(prefix))) + goto done; + } + err = reftable_table_seek_ref(&mod->tab, &it, prefix); + if (err) + goto done; + + while (1) { + err = reftable_iterator_next_ref(&it, &ref); + if (err) + goto done; + + if (mod->del_len > 0) { + struct find_arg arg = { + .names = mod->del, + .want = ref.refname, + }; + int idx = binsearch(mod->del_len, find_name, &arg); + if (idx < mod->del_len && + !strcmp(ref.refname, mod->del[idx])) { + continue; + } + } + + if (strncmp(ref.refname, prefix, strlen(prefix))) { + err = 1; + goto 
done; + } + err = 0; + goto done; + } + +done: + reftable_ref_record_release(&ref); + reftable_iterator_destroy(&it); + return err; +} + +static int validate_refname(const char *name) +{ + while (1) { + char *next = strchr(name, '/'); + if (!*name) { + return REFTABLE_REFNAME_ERROR; + } + if (!next) { + return 0; + } + if (next - name == 0 || (next - name == 1 && *name == '.') || + (next - name == 2 && name[0] == '.' && name[1] == '.')) + return REFTABLE_REFNAME_ERROR; + name = next + 1; + } + return 0; +} + +int validate_ref_record_addition(struct reftable_table tab, + struct reftable_ref_record *recs, size_t sz) +{ + struct modification mod = { + .tab = tab, + .add = reftable_calloc(sizeof(char *) * sz), + .del = reftable_calloc(sizeof(char *) * sz), + }; + int i = 0; + int err = 0; + for (; i < sz; i++) { + if (reftable_ref_record_is_deletion(&recs[i])) { + mod.del[mod.del_len++] = recs[i].refname; + } else { + mod.add[mod.add_len++] = recs[i].refname; + } + } + + err = modification_validate(&mod); + modification_release(&mod); + return err; +} + +static void strbuf_trim_component(struct strbuf *sl) +{ + while (sl->len > 0) { + int is_slash = (sl->buf[sl->len - 1] == '/'); + strbuf_setlen(sl, sl->len - 1); + if (is_slash) + break; + } +} + +int modification_validate(struct modification *mod) +{ + struct strbuf slashed = STRBUF_INIT; + int err = 0; + int i = 0; + for (; i < mod->add_len; i++) { + err = validate_refname(mod->add[i]); + if (err) + goto done; + strbuf_reset(&slashed); + strbuf_addstr(&slashed, mod->add[i]); + strbuf_addstr(&slashed, "/"); + + err = modification_has_ref_with_prefix(mod, slashed.buf); + if (err == 0) { + err = REFTABLE_NAME_CONFLICT; + goto done; + } + if (err < 0) + goto done; + + strbuf_reset(&slashed); + strbuf_addstr(&slashed, mod->add[i]); + while (slashed.len) { + strbuf_trim_component(&slashed); + err = modification_has_ref(mod, slashed.buf); + if (err == 0) { + err = REFTABLE_NAME_CONFLICT; + goto done; + } + if (err < 0) + 
goto done; + } + } + err = 0; +done: + strbuf_release(&slashed); + return err; +} diff --git a/reftable/refname.h b/reftable/refname.h new file mode 100644 index 00000000000..a24b40fcb42 --- /dev/null +++ b/reftable/refname.h @@ -0,0 +1,29 @@ +/* + Copyright 2020 Google LLC + + Use of this source code is governed by a BSD-style + license that can be found in the LICENSE file or at + https://developers.google.com/open-source/licenses/bsd +*/ +#ifndef REFNAME_H +#define REFNAME_H + +#include "reftable-record.h" +#include "reftable-generic.h" + +struct modification { + struct reftable_table tab; + + char **add; + size_t add_len; + + char **del; + size_t del_len; +}; + +int validate_ref_record_addition(struct reftable_table tab, + struct reftable_ref_record *recs, size_t sz); + +int modification_validate(struct modification *mod); + +#endif diff --git a/reftable/refname_test.c b/reftable/refname_test.c new file mode 100644 index 00000000000..5e005d6af31 --- /dev/null +++ b/reftable/refname_test.c @@ -0,0 +1,102 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "basics.h" +#include "block.h" +#include "blocksource.h" +#include "constants.h" +#include "reader.h" +#include "record.h" +#include "refname.h" +#include "reftable-error.h" +#include "reftable-writer.h" +#include "system.h" + +#include "test_framework.h" +#include "reftable-tests.h" + +struct testcase { + char *add; + char *del; + int error_code; +}; + +static void test_conflict(void) +{ + struct reftable_write_options opts = { 0 }; + struct strbuf buf = STRBUF_INIT; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + struct reftable_ref_record rec = { + .refname = "a/b", + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "destination", /* make sure it's not a symref. 
+ */ + .update_index = 1, + }; + int err; + int i; + struct reftable_block_source source = { NULL }; + struct reftable_reader *rd = NULL; + struct reftable_table tab = { NULL }; + struct testcase cases[] = { + { "a/b/c", NULL, REFTABLE_NAME_CONFLICT }, + { "b", NULL, 0 }, + { "a", NULL, REFTABLE_NAME_CONFLICT }, + { "a", "a/b", 0 }, + + { "p/", NULL, REFTABLE_REFNAME_ERROR }, + { "p//q", NULL, REFTABLE_REFNAME_ERROR }, + { "p/./q", NULL, REFTABLE_REFNAME_ERROR }, + { "p/../q", NULL, REFTABLE_REFNAME_ERROR }, + + { "a/b/c", "a/b", 0 }, + { NULL, "a//b", 0 }, + }; + reftable_writer_set_limits(w, 1, 1); + + err = reftable_writer_add_ref(w, &rec); + EXPECT_ERR(err); + + err = reftable_writer_close(w); + EXPECT_ERR(err); + reftable_writer_free(w); + + block_source_from_strbuf(&source, &buf); + err = reftable_new_reader(&rd, &source, "filename"); + EXPECT_ERR(err); + + reftable_table_from_reader(&tab, rd); + + for (i = 0; i < ARRAY_SIZE(cases); i++) { + struct modification mod = { + .tab = tab, + }; + + if (cases[i].add != NULL) { + mod.add = &cases[i].add; + mod.add_len = 1; + } + if (cases[i].del != NULL) { + mod.del = &cases[i].del; + mod.del_len = 1; + } + + err = modification_validate(&mod); + EXPECT(err == cases[i].error_code); + } + + reftable_reader_free(rd); + strbuf_release(&buf); +} + +int refname_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_conflict); + return 0; +} diff --git a/reftable/reftable-generic.h b/reftable/reftable-generic.h new file mode 100644 index 00000000000..77eca3b4eb0 --- /dev/null +++ b/reftable/reftable-generic.h @@ -0,0 +1,48 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_GENERIC_H +#define REFTABLE_GENERIC_H + +#include "reftable-iterator.h" +#include "reftable-reader.h" +#include "reftable-merged.h" + +/* + * Provides a unified API for reading tables, 
either merged tables, or single + * readers. */ +struct reftable_table { + struct reftable_table_vtable *ops; + void *table_arg; +}; + +int reftable_table_seek_ref(struct reftable_table *tab, + struct reftable_iterator *it, const char *name); + +void reftable_table_from_reader(struct reftable_table *tab, + struct reftable_reader *reader); + +/* returns the hash ID from a generic reftable_table */ +uint32_t reftable_table_hash_id(struct reftable_table *tab); + +/* create a generic table from reftable_merged_table */ +void reftable_table_from_merged_table(struct reftable_table *tab, + struct reftable_merged_table *table); + +/* returns the max update_index covered by this table. */ +uint64_t reftable_table_max_update_index(struct reftable_table *tab); + +/* returns the min update_index covered by this table. */ +uint64_t reftable_table_min_update_index(struct reftable_table *tab); + +/* convenience function to read a single ref. Returns < 0 for error, 0 + for success, and 1 if ref not found. */ +int reftable_table_read_ref(struct reftable_table *tab, const char *name, + struct reftable_ref_record *ref); + +#endif diff --git a/reftable/reftable-merged.h b/reftable/reftable-merged.h new file mode 100644 index 00000000000..0e8ebe5a995 --- /dev/null +++ b/reftable/reftable-merged.h @@ -0,0 +1,68 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_MERGED_H +#define REFTABLE_MERGED_H + +#include "reftable-iterator.h" + +/* + * Merged tables + * + * A ref database kept in a sequence of table files. The merged_table presents a + * unified view to reading (seeking, iterating) a sequence of immutable tables. + * + * The merged tables are on purpose kept disconnected from their actual storage + * (e.g. files on disk), because it is useful to merge tables that aren't files. 
For + * example, the per-workspace and global ref namespace can be implemented as a + * merged table of two stacks of file-backed reftables. + */ + +/* A merged table implements seeking/iterating over a stack of tables. */ +struct reftable_merged_table; + +/* A generic reftable; see below. */ +struct reftable_table; + +/* reftable_new_merged_table creates a new merged table. It takes ownership of + the stack array. +*/ +int reftable_new_merged_table(struct reftable_merged_table **dest, + struct reftable_table *stack, int n, + uint32_t hash_id); + +/* returns an iterator positioned just before 'name' */ +int reftable_merged_table_seek_ref(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name); + +/* returns an iterator for log entry, at given update_index */ +int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name, uint64_t update_index); + +/* like reftable_merged_table_seek_log_at but look for the newest entry. */ +int reftable_merged_table_seek_log(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name); + +/* returns the max update_index covered by this merged table. */ +uint64_t +reftable_merged_table_max_update_index(struct reftable_merged_table *mt); + +/* returns the min update_index covered by this merged table. */ +uint64_t +reftable_merged_table_min_update_index(struct reftable_merged_table *mt); + +/* releases memory for the merged_table */ +void reftable_merged_table_free(struct reftable_merged_table *m); + +/* return the hash ID of the merged table. 
*/ +uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m); + +#endif diff --git a/reftable/reftable-stack.h b/reftable/reftable-stack.h new file mode 100644 index 00000000000..b7060f111e8 --- /dev/null +++ b/reftable/reftable-stack.h @@ -0,0 +1,120 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_STACK_H +#define REFTABLE_STACK_H + +#include "reftable-writer.h" + +/* + * The stack presents an interface to a mutable sequence of reftables. + + * A stack can be mutated by pushing a table to the top of the stack. + + * The reftable_stack automatically compacts files on disk to ensure good + * amortized performance. + * + * For windows and other platforms that cannot have open files as rename + * destinations, concurrent access from multiple processes needs the rand() + * random seed to be randomized. + */ +struct reftable_stack; + +/* open a new reftable stack. The tables along with the table list will be + * stored in 'dir'. Typically, this should be .git/reftables. + */ +int reftable_new_stack(struct reftable_stack **dest, const char *dir, + struct reftable_write_options config); + +/* returns the update_index at which a next table should be written. */ +uint64_t reftable_stack_next_update_index(struct reftable_stack *st); + +/* holds a transaction to add tables at the top of a stack. */ +struct reftable_addition; + +/* + * returns a new transaction to add reftables to the given stack. As a side + * effect, the ref database is locked. + */ +int reftable_stack_new_addition(struct reftable_addition **dest, + struct reftable_stack *st); + +/* Adds a reftable to transaction. */ +int reftable_addition_add(struct reftable_addition *add, + int (*write_table)(struct reftable_writer *wr, + void *arg), + void *arg); + +/* Commits the transaction, releasing the lock. 
*/ +int reftable_addition_commit(struct reftable_addition *add); + +/* Release all non-committed data from the transaction, and deallocate the + * transaction. Releases the lock if held. */ +void reftable_addition_destroy(struct reftable_addition *add); + +/* add a new table to the stack. The write_table function must call + * reftable_writer_set_limits, add refs and return an error value. */ +int reftable_stack_add(struct reftable_stack *st, + int (*write_table)(struct reftable_writer *wr, + void *write_arg), + void *write_arg); + +/* returns the merged_table for seeking. This table is valid until the + * next write or reload, and should not be closed or deleted. + */ +struct reftable_merged_table * +reftable_stack_merged_table(struct reftable_stack *st); + +/* frees all resources associated with the stack. */ +void reftable_stack_destroy(struct reftable_stack *st); + +/* Reloads the stack if necessary. This is very cheap to run if the stack was up + * to date */ +int reftable_stack_reload(struct reftable_stack *st); + +/* Policy for expiring reflog entries. */ +struct reftable_log_expiry_config { + /* Drop entries older than this timestamp */ + uint64_t time; + + /* Drop older entries */ + uint64_t min_update_index; +}; + +/* compacts all reftables into a giant table. Expire reflog entries if config is + * non-NULL */ +int reftable_stack_compact_all(struct reftable_stack *st, + struct reftable_log_expiry_config *config); + +/* heuristically compact unbalanced table stack. */ +int reftable_stack_auto_compact(struct reftable_stack *st); + +/* convenience function to read a single ref. Returns < 0 for error, 0 for + * success, and 1 if ref not found. */ +int reftable_stack_read_ref(struct reftable_stack *st, const char *refname, + struct reftable_ref_record *ref); + +/* convenience function to read a single log. Returns < 0 for error, 0 for + * success, and 1 if ref not found. 
*/ +int reftable_stack_read_log(struct reftable_stack *st, const char *refname, + struct reftable_log_record *log); + +/* statistics on past compactions. */ +struct reftable_compaction_stats { + uint64_t bytes; /* total number of bytes written */ + uint64_t entries_written; /* total number of entries written, including + failures. */ + int attempts; /* how often we tried to compact */ + int failures; /* failures happen on concurrent updates */ +}; + +/* return statistics for compaction up till now. */ +struct reftable_compaction_stats * +reftable_stack_compaction_stats(struct reftable_stack *st); + +#endif diff --git a/reftable/reftable.c b/reftable/reftable.c new file mode 100644 index 00000000000..dc4fd03d5b2 --- /dev/null +++ b/reftable/reftable.c @@ -0,0 +1,98 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "record.h" +#include "reader.h" +#include "reftable-iterator.h" +#include "reftable-generic.h" + +static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it, + struct reftable_record *rec) +{ + return reader_seek((struct reftable_reader *)tab, it, rec); +} + +static uint32_t reftable_reader_hash_id_void(void *tab) +{ + return reftable_reader_hash_id((struct reftable_reader *)tab); +} + +static uint64_t reftable_reader_min_update_index_void(void *tab) +{ + return reftable_reader_min_update_index((struct reftable_reader *)tab); +} + +static uint64_t reftable_reader_max_update_index_void(void *tab) +{ + return reftable_reader_max_update_index((struct reftable_reader *)tab); +} + +static struct reftable_table_vtable reader_vtable = { + .seek_record = reftable_reader_seek_void, + .hash_id = reftable_reader_hash_id_void, + .min_update_index = reftable_reader_min_update_index_void, + .max_update_index = reftable_reader_max_update_index_void, +}; + +int reftable_table_seek_ref(struct 
reftable_table *tab, + struct reftable_iterator *it, const char *name) +{ + struct reftable_ref_record ref = { + .refname = (char *)name, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, &ref); + return tab->ops->seek_record(tab->table_arg, it, &rec); +} + +void reftable_table_from_reader(struct reftable_table *tab, + struct reftable_reader *reader) +{ + assert(tab->ops == NULL); + tab->ops = &reader_vtable; + tab->table_arg = reader; +} + +int reftable_table_read_ref(struct reftable_table *tab, const char *name, + struct reftable_ref_record *ref) +{ + struct reftable_iterator it = { NULL }; + int err = reftable_table_seek_ref(tab, &it, name); + if (err) + goto done; + + err = reftable_iterator_next_ref(&it, ref); + if (err) + goto done; + + if (strcmp(ref->refname, name) || + reftable_ref_record_is_deletion(ref)) { + reftable_ref_record_release(ref); + err = 1; + goto done; + } + +done: + reftable_iterator_destroy(&it); + return err; +} + +uint64_t reftable_table_max_update_index(struct reftable_table *tab) +{ + return tab->ops->max_update_index(tab->table_arg); +} + +uint64_t reftable_table_min_update_index(struct reftable_table *tab) +{ + return tab->ops->min_update_index(tab->table_arg); +} + +uint32_t reftable_table_hash_id(struct reftable_table *tab) +{ + return tab->ops->hash_id(tab->table_arg); +} diff --git a/reftable/stack.c b/reftable/stack.c new file mode 100644 index 00000000000..10608089347 --- /dev/null +++ b/reftable/stack.c @@ -0,0 +1,1260 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "stack.h" + +#include "system.h" +#include "merged.h" +#include "reader.h" +#include "refname.h" +#include "reftable-error.h" +#include "reftable-record.h" +#include "writer.h" + +static int stack_try_add(struct reftable_stack *st, + int (*write_table)(struct reftable_writer 
*wr, + void *arg), + void *arg); +static int stack_write_compact(struct reftable_stack *st, + struct reftable_writer *wr, int first, int last, + struct reftable_log_expiry_config *config); +static int stack_check_addition(struct reftable_stack *st, + const char *new_tab_name); +static void reftable_addition_close(struct reftable_addition *add); +static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st, + int reuse_open); + +static int reftable_fd_write(void *arg, const void *data, size_t sz) +{ + int *fdp = (int *)arg; + return write(*fdp, data, sz); +} + +int reftable_new_stack(struct reftable_stack **dest, const char *dir, + struct reftable_write_options config) +{ + struct reftable_stack *p = + reftable_calloc(sizeof(struct reftable_stack)); + struct strbuf list_file_name = STRBUF_INIT; + int err = 0; + + if (config.hash_id == 0) { + config.hash_id = SHA1_ID; + } + + *dest = NULL; + + strbuf_reset(&list_file_name); + strbuf_addstr(&list_file_name, dir); + strbuf_addstr(&list_file_name, "/tables.list"); + + p->list_file = strbuf_detach(&list_file_name, NULL); + p->reftable_dir = xstrdup(dir); + p->config = config; + + err = reftable_stack_reload_maybe_reuse(p, 1); + if (err < 0) { + reftable_stack_destroy(p); + } else { + *dest = p; + } + return err; +} + +static int fd_read_lines(int fd, char ***namesp) +{ + off_t size = lseek(fd, 0, SEEK_END); + char *buf = NULL; + int err = 0; + if (size < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + err = lseek(fd, 0, SEEK_SET); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + buf = reftable_malloc(size + 1); + if (read(fd, buf, size) != size) { + err = REFTABLE_IO_ERROR; + goto done; + } + buf[size] = 0; + + parse_names(buf, size, namesp); + +done: + reftable_free(buf); + return err; +} + +int read_lines(const char *filename, char ***namesp) +{ + int fd = open(filename, O_RDONLY, 0644); + int err = 0; + if (fd < 0) { + if (errno == ENOENT) { + *namesp = reftable_calloc(sizeof(char *)); + 
return 0; + } + + return REFTABLE_IO_ERROR; + } + err = fd_read_lines(fd, namesp); + close(fd); + return err; +} + +struct reftable_merged_table * +reftable_stack_merged_table(struct reftable_stack *st) +{ + return st->merged; +} + +/* Close and free the stack */ +void reftable_stack_destroy(struct reftable_stack *st) +{ + if (st->merged != NULL) { + reftable_merged_table_free(st->merged); + st->merged = NULL; + } + + if (st->readers != NULL) { + int i = 0; + for (i = 0; i < st->readers_len; i++) { + reftable_reader_free(st->readers[i]); + } + st->readers_len = 0; + FREE_AND_NULL(st->readers); + } + FREE_AND_NULL(st->list_file); + FREE_AND_NULL(st->reftable_dir); + reftable_free(st); +} + +static struct reftable_reader **stack_copy_readers(struct reftable_stack *st, + int cur_len) +{ + struct reftable_reader **cur = + reftable_calloc(sizeof(struct reftable_reader *) * cur_len); + int i = 0; + for (i = 0; i < cur_len; i++) { + cur[i] = st->readers[i]; + } + return cur; +} + +static int reftable_stack_reload_once(struct reftable_stack *st, char **names, + int reuse_open) +{ + int cur_len = st->merged == NULL ? 0 : st->merged->stack_len; + struct reftable_reader **cur = stack_copy_readers(st, cur_len); + int err = 0; + int names_len = names_length(names); + struct reftable_reader **new_readers = + reftable_calloc(sizeof(struct reftable_reader *) * names_len); + struct reftable_table *new_tables = + reftable_calloc(sizeof(struct reftable_table) * names_len); + int new_readers_len = 0; + struct reftable_merged_table *new_merged = NULL; + int i; + + while (*names) { + struct reftable_reader *rd = NULL; + char *name = *names++; + + /* this is linear; we assume compaction keeps the number of + tables under control so this is not quadratic. 
*/ + int j = 0; + for (j = 0; reuse_open && j < cur_len; j++) { + if (cur[j] != NULL && 0 == strcmp(cur[j]->name, name)) { + rd = cur[j]; + cur[j] = NULL; + break; + } + } + + if (rd == NULL) { + struct reftable_block_source src = { NULL }; + struct strbuf table_path = STRBUF_INIT; + strbuf_addstr(&table_path, st->reftable_dir); + strbuf_addstr(&table_path, "/"); + strbuf_addstr(&table_path, name); + + err = reftable_block_source_from_file(&src, + table_path.buf); + strbuf_release(&table_path); + + if (err < 0) + goto done; + + err = reftable_new_reader(&rd, &src, name); + if (err < 0) + goto done; + } + + new_readers[new_readers_len] = rd; + reftable_table_from_reader(&new_tables[new_readers_len], rd); + new_readers_len++; + } + + /* success! */ + err = reftable_new_merged_table(&new_merged, new_tables, + new_readers_len, st->config.hash_id); + if (err < 0) + goto done; + + new_tables = NULL; + st->readers_len = new_readers_len; + if (st->merged != NULL) { + merged_table_release(st->merged); + reftable_merged_table_free(st->merged); + } + if (st->readers != NULL) { + reftable_free(st->readers); + } + st->readers = new_readers; + new_readers = NULL; + new_readers_len = 0; + + new_merged->suppress_deletions = 1; + st->merged = new_merged; + for (i = 0; i < cur_len; i++) { + if (cur[i] != NULL) { + reader_close(cur[i]); + reftable_reader_free(cur[i]); + } + } + +done: + for (i = 0; i < new_readers_len; i++) { + reader_close(new_readers[i]); + reftable_reader_free(new_readers[i]); + } + reftable_free(new_readers); + reftable_free(new_tables); + reftable_free(cur); + return err; +} + +/* return negative if a before b. 
*/ +static int tv_cmp(struct timeval *a, struct timeval *b) +{ + time_t diff = a->tv_sec - b->tv_sec; + int udiff = a->tv_usec - b->tv_usec; + + if (diff != 0) + return diff; + + return udiff; +} + +static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st, + int reuse_open) +{ + struct timeval deadline = { 0 }; + int err = gettimeofday(&deadline, NULL); + int64_t delay = 0; + int tries = 0; + if (err < 0) + return err; + + deadline.tv_sec += 3; + while (1) { + char **names = NULL; + char **names_after = NULL; + struct timeval now = { 0 }; + int err = gettimeofday(&now, NULL); + int err2 = 0; + if (err < 0) { + return err; + } + + /* Only look at deadlines after the first few times. This + simplifies debugging in GDB */ + tries++; + if (tries > 3 && tv_cmp(&now, &deadline) >= 0) { + break; + } + + err = read_lines(st->list_file, &names); + if (err < 0) { + free_names(names); + return err; + } + err = reftable_stack_reload_once(st, names, reuse_open); + if (err == 0) { + free_names(names); + break; + } + if (err != REFTABLE_NOT_EXIST_ERROR) { + free_names(names); + return err; + } + + /* err == REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent + writer. Check if there was one by checking if the name list + changed. + */ + err2 = read_lines(st->list_file, &names_after); + if (err2 < 0) { + free_names(names); + return err2; + } + + if (names_equal(names_after, names)) { + free_names(names); + free_names(names_after); + return err; + } + free_names(names); + free_names(names_after); + + delay = delay + (delay * rand()) / RAND_MAX + 1; + sleep_millisec(delay); + } + + return 0; +} + +/* -1 = error + 0 = up to date + 1 = changed. 
*/ +static int stack_uptodate(struct reftable_stack *st) +{ + char **names = NULL; + int err = read_lines(st->list_file, &names); + int i = 0; + if (err < 0) + return err; + + for (i = 0; i < st->readers_len; i++) { + if (names[i] == NULL) { + err = 1; + goto done; + } + + if (strcmp(st->readers[i]->name, names[i])) { + err = 1; + goto done; + } + } + + if (names[st->merged->stack_len] != NULL) { + err = 1; + goto done; + } + +done: + free_names(names); + return err; +} + +int reftable_stack_reload(struct reftable_stack *st) +{ + int err = stack_uptodate(st); + if (err > 0) + return reftable_stack_reload_maybe_reuse(st, 1); + return err; +} + +int reftable_stack_add(struct reftable_stack *st, + int (*write)(struct reftable_writer *wr, void *arg), + void *arg) +{ + int err = stack_try_add(st, write, arg); + if (err < 0) { + if (err == REFTABLE_LOCK_ERROR) { + /* Ignore error return, we want to propagate + REFTABLE_LOCK_ERROR. + */ + reftable_stack_reload(st); + } + return err; + } + + if (!st->disable_auto_compact) + return reftable_stack_auto_compact(st); + + return 0; +} + +static void format_name(struct strbuf *dest, uint64_t min, uint64_t max) +{ + char buf[100]; + uint32_t rnd = (uint32_t)rand(); + snprintf(buf, sizeof(buf), "0x%012" PRIx64 "-0x%012" PRIx64 "-%08x", + min, max, rnd); + strbuf_reset(dest); + strbuf_addstr(dest, buf); +} + +struct reftable_addition { + int lock_file_fd; + struct strbuf lock_file_name; + struct reftable_stack *stack; + + char **new_tables; + int new_tables_len; + uint64_t next_update_index; +}; + +#define REFTABLE_ADDITION_INIT \ + { \ + .lock_file_name = STRBUF_INIT \ + } + +static int reftable_stack_init_addition(struct reftable_addition *add, + struct reftable_stack *st) +{ + int err = 0; + add->stack = st; + + strbuf_reset(&add->lock_file_name); + strbuf_addstr(&add->lock_file_name, st->list_file); + strbuf_addstr(&add->lock_file_name, ".lock"); + + add->lock_file_fd = open(add->lock_file_name.buf, + O_EXCL | O_CREAT | 
O_WRONLY, 0644); + if (add->lock_file_fd < 0) { + if (errno == EEXIST) { + err = REFTABLE_LOCK_ERROR; + } else { + err = REFTABLE_IO_ERROR; + } + goto done; + } + err = stack_uptodate(st); + if (err < 0) + goto done; + + if (err > 0) { + err = REFTABLE_LOCK_ERROR; + goto done; + } + + add->next_update_index = reftable_stack_next_update_index(st); +done: + if (err) { + reftable_addition_close(add); + } + return err; +} + +static void reftable_addition_close(struct reftable_addition *add) +{ + int i = 0; + struct strbuf nm = STRBUF_INIT; + for (i = 0; i < add->new_tables_len; i++) { + strbuf_reset(&nm); + strbuf_addstr(&nm, add->stack->reftable_dir); + strbuf_addstr(&nm, "/"); + strbuf_addstr(&nm, add->new_tables[i]); + unlink(nm.buf); + reftable_free(add->new_tables[i]); + add->new_tables[i] = NULL; + } + reftable_free(add->new_tables); + add->new_tables = NULL; + add->new_tables_len = 0; + + if (add->lock_file_fd > 0) { + close(add->lock_file_fd); + add->lock_file_fd = 0; + } + if (add->lock_file_name.len > 0) { + unlink(add->lock_file_name.buf); + strbuf_release(&add->lock_file_name); + } + + strbuf_release(&nm); +} + +void reftable_addition_destroy(struct reftable_addition *add) +{ + if (add == NULL) { + return; + } + reftable_addition_close(add); + reftable_free(add); +} + +int reftable_addition_commit(struct reftable_addition *add) +{ + struct strbuf table_list = STRBUF_INIT; + int i = 0; + int err = 0; + if (add->new_tables_len == 0) + goto done; + + for (i = 0; i < add->stack->merged->stack_len; i++) { + strbuf_addstr(&table_list, add->stack->readers[i]->name); + strbuf_addstr(&table_list, "\n"); + } + for (i = 0; i < add->new_tables_len; i++) { + strbuf_addstr(&table_list, add->new_tables[i]); + strbuf_addstr(&table_list, "\n"); + } + + err = write(add->lock_file_fd, table_list.buf, table_list.len); + strbuf_release(&table_list); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + err = close(add->lock_file_fd); + add->lock_file_fd = 0; + if 
(err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + err = rename(add->lock_file_name.buf, add->stack->list_file); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + /* success, no more state to clean up. */ + strbuf_release(&add->lock_file_name); + for (i = 0; i < add->new_tables_len; i++) { + reftable_free(add->new_tables[i]); + } + reftable_free(add->new_tables); + add->new_tables = NULL; + add->new_tables_len = 0; + + err = reftable_stack_reload(add->stack); +done: + reftable_addition_close(add); + return err; +} + +int reftable_stack_new_addition(struct reftable_addition **dest, + struct reftable_stack *st) +{ + int err = 0; + struct reftable_addition empty = REFTABLE_ADDITION_INIT; + *dest = reftable_calloc(sizeof(**dest)); + **dest = empty; + err = reftable_stack_init_addition(*dest, st); + if (err) { + reftable_free(*dest); + *dest = NULL; + } + return err; +} + +static int stack_try_add(struct reftable_stack *st, + int (*write_table)(struct reftable_writer *wr, + void *arg), + void *arg) +{ + struct reftable_addition add = REFTABLE_ADDITION_INIT; + int err = reftable_stack_init_addition(&add, st); + if (err < 0) + goto done; + if (err > 0) { + err = REFTABLE_LOCK_ERROR; + goto done; + } + + err = reftable_addition_add(&add, write_table, arg); + if (err < 0) + goto done; + + err = reftable_addition_commit(&add); +done: + reftable_addition_close(&add); + return err; +} + +int reftable_addition_add(struct reftable_addition *add, + int (*write_table)(struct reftable_writer *wr, + void *arg), + void *arg) +{ + struct strbuf temp_tab_file_name = STRBUF_INIT; + struct strbuf tab_file_name = STRBUF_INIT; + struct strbuf next_name = STRBUF_INIT; + struct reftable_writer *wr = NULL; + int err = 0; + int tab_fd = 0; + + strbuf_reset(&next_name); + format_name(&next_name, add->next_update_index, add->next_update_index); + + strbuf_addstr(&temp_tab_file_name, add->stack->reftable_dir); + strbuf_addstr(&temp_tab_file_name, "/"); + 
strbuf_addbuf(&temp_tab_file_name, &next_name); + strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX"); + + tab_fd = mkstemp(temp_tab_file_name.buf); + if (tab_fd < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + wr = reftable_new_writer(reftable_fd_write, &tab_fd, + &add->stack->config); + err = write_table(wr, arg); + if (err < 0) + goto done; + + err = reftable_writer_close(wr); + if (err == REFTABLE_EMPTY_TABLE_ERROR) { + err = 0; + goto done; + } + if (err < 0) + goto done; + + err = close(tab_fd); + tab_fd = 0; + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + err = stack_check_addition(add->stack, temp_tab_file_name.buf); + if (err < 0) + goto done; + + if (wr->min_update_index < add->next_update_index) { + err = REFTABLE_API_ERROR; + goto done; + } + + format_name(&next_name, wr->min_update_index, wr->max_update_index); + strbuf_addstr(&next_name, ".ref"); + + strbuf_addstr(&tab_file_name, add->stack->reftable_dir); + strbuf_addstr(&tab_file_name, "/"); + strbuf_addbuf(&tab_file_name, &next_name); + + /* + On windows, this relies on rand() picking a unique destination name. + Maybe we should do retry loop as well? 
+ */ + err = rename(temp_tab_file_name.buf, tab_file_name.buf); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + add->new_tables = reftable_realloc(add->new_tables, + sizeof(*add->new_tables) * + (add->new_tables_len + 1)); + add->new_tables[add->new_tables_len] = strbuf_detach(&next_name, NULL); + add->new_tables_len++; +done: + if (tab_fd > 0) { + close(tab_fd); + tab_fd = 0; + } + if (temp_tab_file_name.len > 0) { + unlink(temp_tab_file_name.buf); + } + + strbuf_release(&temp_tab_file_name); + strbuf_release(&tab_file_name); + strbuf_release(&next_name); + reftable_writer_free(wr); + return err; +} + +uint64_t reftable_stack_next_update_index(struct reftable_stack *st) +{ + int sz = st->merged->stack_len; + if (sz > 0) + return reftable_reader_max_update_index(st->readers[sz - 1]) + + 1; + return 1; +} + +static int stack_compact_locked(struct reftable_stack *st, int first, int last, + struct strbuf *temp_tab, + struct reftable_log_expiry_config *config) +{ + struct strbuf next_name = STRBUF_INIT; + int tab_fd = -1; + struct reftable_writer *wr = NULL; + int err = 0; + + format_name(&next_name, + reftable_reader_min_update_index(st->readers[first]), + reftable_reader_max_update_index(st->readers[last])); + + strbuf_reset(temp_tab); + strbuf_addstr(temp_tab, st->reftable_dir); + strbuf_addstr(temp_tab, "/"); + strbuf_addbuf(temp_tab, &next_name); + strbuf_addstr(temp_tab, ".temp.XXXXXX"); + + tab_fd = mkstemp(temp_tab->buf); + wr = reftable_new_writer(reftable_fd_write, &tab_fd, &st->config); + + err = stack_write_compact(st, wr, first, last, config); + if (err < 0) + goto done; + err = reftable_writer_close(wr); + if (err < 0) + goto done; + + err = close(tab_fd); + tab_fd = 0; + +done: + reftable_writer_free(wr); + if (tab_fd > 0) { + close(tab_fd); + tab_fd = 0; + } + if (err != 0 && temp_tab->len > 0) { + unlink(temp_tab->buf); + strbuf_release(temp_tab); + } + strbuf_release(&next_name); + return err; +} + +static int 
stack_write_compact(struct reftable_stack *st, + struct reftable_writer *wr, int first, int last, + struct reftable_log_expiry_config *config) +{ + int subtabs_len = last - first + 1; + struct reftable_table *subtabs = reftable_calloc( + sizeof(struct reftable_table) * (last - first + 1)); + struct reftable_merged_table *mt = NULL; + int err = 0; + struct reftable_iterator it = { NULL }; + struct reftable_ref_record ref = { NULL }; + struct reftable_log_record log = { NULL }; + + uint64_t entries = 0; + + int i = 0, j = 0; + for (i = first, j = 0; i <= last; i++) { + struct reftable_reader *t = st->readers[i]; + reftable_table_from_reader(&subtabs[j++], t); + st->stats.bytes += t->size; + } + reftable_writer_set_limits(wr, st->readers[first]->min_update_index, + st->readers[last]->max_update_index); + + err = reftable_new_merged_table(&mt, subtabs, subtabs_len, + st->config.hash_id); + if (err < 0) { + reftable_free(subtabs); + goto done; + } + + err = reftable_merged_table_seek_ref(mt, &it, ""); + if (err < 0) + goto done; + + while (1) { + err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + err = 0; + break; + } + if (err < 0) { + break; + } + + if (first == 0 && reftable_ref_record_is_deletion(&ref)) { + continue; + } + + err = reftable_writer_add_ref(wr, &ref); + if (err < 0) { + break; + } + entries++; + } + reftable_iterator_destroy(&it); + + err = reftable_merged_table_seek_log(mt, &it, ""); + if (err < 0) + goto done; + + while (1) { + err = reftable_iterator_next_log(&it, &log); + if (err > 0) { + err = 0; + break; + } + if (err < 0) { + break; + } + if (first == 0 && reftable_log_record_is_deletion(&log)) { + continue; + } + + if (config != NULL && config->time > 0 && + log.time < config->time) { + continue; + } + + if (config != NULL && config->min_update_index > 0 && + log.update_index < config->min_update_index) { + continue; + } + + err = reftable_writer_add_log(wr, &log); + if (err < 0) { + break; + } + entries++; + } + +done: + 
reftable_iterator_destroy(&it); + if (mt != NULL) { + merged_table_release(mt); + reftable_merged_table_free(mt); + } + reftable_ref_record_release(&ref); + reftable_log_record_release(&log); + st->stats.entries_written += entries; + return err; +} + +/* < 0: error. 0 == OK, > 0 attempt failed; could retry. */ +static int stack_compact_range(struct reftable_stack *st, int first, int last, + struct reftable_log_expiry_config *expiry) +{ + struct strbuf temp_tab_file_name = STRBUF_INIT; + struct strbuf new_table_name = STRBUF_INIT; + struct strbuf lock_file_name = STRBUF_INIT; + struct strbuf ref_list_contents = STRBUF_INIT; + struct strbuf new_table_path = STRBUF_INIT; + int err = 0; + int have_lock = 0; + int lock_file_fd = 0; + int compact_count = last - first + 1; + char **listp = NULL; + char **delete_on_success = + reftable_calloc(sizeof(char *) * (compact_count + 1)); + char **subtable_locks = + reftable_calloc(sizeof(char *) * (compact_count + 1)); + int i = 0; + int j = 0; + int is_empty_table = 0; + + if (first > last || (expiry == NULL && first == last)) { + err = 0; + goto done; + } + + st->stats.attempts++; + + strbuf_reset(&lock_file_name); + strbuf_addstr(&lock_file_name, st->list_file); + strbuf_addstr(&lock_file_name, ".lock"); + + lock_file_fd = + open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644); + if (lock_file_fd < 0) { + if (errno == EEXIST) { + err = 1; + } else { + err = REFTABLE_IO_ERROR; + } + goto done; + } + /* Don't want to write to the lock for now. 
*/ + close(lock_file_fd); + lock_file_fd = 0; + + have_lock = 1; + err = stack_uptodate(st); + if (err != 0) + goto done; + + for (i = first, j = 0; i <= last; i++) { + struct strbuf subtab_file_name = STRBUF_INIT; + struct strbuf subtab_lock = STRBUF_INIT; + int sublock_file_fd = -1; + + strbuf_addstr(&subtab_file_name, st->reftable_dir); + strbuf_addstr(&subtab_file_name, "/"); + strbuf_addstr(&subtab_file_name, reader_name(st->readers[i])); + + strbuf_reset(&subtab_lock); + strbuf_addbuf(&subtab_lock, &subtab_file_name); + strbuf_addstr(&subtab_lock, ".lock"); + + sublock_file_fd = open(subtab_lock.buf, + O_EXCL | O_CREAT | O_WRONLY, 0644); + if (sublock_file_fd > 0) { + close(sublock_file_fd); + } else if (sublock_file_fd < 0) { + if (errno == EEXIST) { + err = 1; + } else { + err = REFTABLE_IO_ERROR; + } + } + + subtable_locks[j] = subtab_lock.buf; + delete_on_success[j] = subtab_file_name.buf; + j++; + + if (err != 0) + goto done; + } + + err = unlink(lock_file_name.buf); + if (err < 0) + goto done; + have_lock = 0; + + err = stack_compact_locked(st, first, last, &temp_tab_file_name, + expiry); + /* Compaction + tombstones can create an empty table out of non-empty + * tables. */ + is_empty_table = (err == REFTABLE_EMPTY_TABLE_ERROR); + if (is_empty_table) { + err = 0; + } + if (err < 0) + goto done; + + lock_file_fd = + open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644); + if (lock_file_fd < 0) { + if (errno == EEXIST) { + err = 1; + } else { + err = REFTABLE_IO_ERROR; + } + goto done; + } + have_lock = 1; + + format_name(&new_table_name, st->readers[first]->min_update_index, + st->readers[last]->max_update_index); + strbuf_addstr(&new_table_name, ".ref"); + + strbuf_reset(&new_table_path); + strbuf_addstr(&new_table_path, st->reftable_dir); + strbuf_addstr(&new_table_path, "/"); + strbuf_addbuf(&new_table_path, &new_table_name); + + if (!is_empty_table) { + /* retry? 
*/ + err = rename(temp_tab_file_name.buf, new_table_path.buf); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + } + + for (i = 0; i < first; i++) { + strbuf_addstr(&ref_list_contents, st->readers[i]->name); + strbuf_addstr(&ref_list_contents, "\n"); + } + if (!is_empty_table) { + strbuf_addbuf(&ref_list_contents, &new_table_name); + strbuf_addstr(&ref_list_contents, "\n"); + } + for (i = last + 1; i < st->merged->stack_len; i++) { + strbuf_addstr(&ref_list_contents, st->readers[i]->name); + strbuf_addstr(&ref_list_contents, "\n"); + } + + err = write(lock_file_fd, ref_list_contents.buf, ref_list_contents.len); + if (err < 0) { + err = REFTABLE_IO_ERROR; + unlink(new_table_path.buf); + goto done; + } + err = close(lock_file_fd); + lock_file_fd = 0; + if (err < 0) { + err = REFTABLE_IO_ERROR; + unlink(new_table_path.buf); + goto done; + } + + err = rename(lock_file_name.buf, st->list_file); + if (err < 0) { + err = REFTABLE_IO_ERROR; + unlink(new_table_path.buf); + goto done; + } + have_lock = 0; + + /* Reload the stack before deleting. On Windows, we can only delete the + files after we have closed them. 
+ */ + err = reftable_stack_reload_maybe_reuse(st, first < last); + + listp = delete_on_success; + while (*listp) { + if (strcmp(*listp, new_table_path.buf)) { + unlink(*listp); + } + listp++; + } + +done: + free_names(delete_on_success); + + listp = subtable_locks; + while (*listp) { + unlink(*listp); + listp++; + } + free_names(subtable_locks); + if (lock_file_fd > 0) { + close(lock_file_fd); + lock_file_fd = 0; + } + if (have_lock) { + unlink(lock_file_name.buf); + } + strbuf_release(&new_table_name); + strbuf_release(&new_table_path); + strbuf_release(&ref_list_contents); + strbuf_release(&temp_tab_file_name); + strbuf_release(&lock_file_name); + return err; +} + +int reftable_stack_compact_all(struct reftable_stack *st, + struct reftable_log_expiry_config *config) +{ + return stack_compact_range(st, 0, st->merged->stack_len - 1, config); +} + +static int stack_compact_range_stats(struct reftable_stack *st, int first, + int last, + struct reftable_log_expiry_config *config) +{ + int err = stack_compact_range(st, first, last, config); + if (err > 0) { + st->stats.failures++; + } + return err; +} + +static int segment_size(struct segment *s) +{ + return s->end - s->start; +} + +int fastlog2(uint64_t sz) +{ + int l = 0; + if (sz == 0) + return 0; + for (; sz; sz /= 2) { + l++; + } + return l - 1; +} + +struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n) +{ + struct segment *segs = reftable_calloc(sizeof(struct segment) * n); + int next = 0; + struct segment cur = { 0 }; + int i = 0; + + if (n == 0) { + *seglen = 0; + return segs; + } + for (i = 0; i < n; i++) { + int log = fastlog2(sizes[i]); + if (cur.log != log && cur.bytes > 0) { + struct segment fresh = { + .start = i, + }; + + segs[next++] = cur; + cur = fresh; + } + + cur.log = log; + cur.end = i + 1; + cur.bytes += sizes[i]; + } + segs[next++] = cur; + *seglen = next; + return segs; +} + +struct segment suggest_compaction_segment(uint64_t *sizes, int n) +{ + int seglen = 0; + struct 
segment *segs = sizes_to_segments(&seglen, sizes, n); + struct segment min_seg = { + .log = 64, + }; + int i = 0; + for (i = 0; i < seglen; i++) { + if (segment_size(&segs[i]) == 1) { + continue; + } + + if (segs[i].log < min_seg.log) { + min_seg = segs[i]; + } + } + + while (min_seg.start > 0) { + int prev = min_seg.start - 1; + if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev])) { + break; + } + + min_seg.start = prev; + min_seg.bytes += sizes[prev]; + } + + reftable_free(segs); + return min_seg; +} + +static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st) +{ + uint64_t *sizes = + reftable_calloc(sizeof(uint64_t) * st->merged->stack_len); + int version = (st->config.hash_id == SHA1_ID) ? 1 : 2; + int overhead = header_size(version) - 1; + int i = 0; + for (i = 0; i < st->merged->stack_len; i++) { + sizes[i] = st->readers[i]->size - overhead; + } + return sizes; +} + +int reftable_stack_auto_compact(struct reftable_stack *st) +{ + uint64_t *sizes = stack_table_sizes_for_compaction(st); + struct segment seg = + suggest_compaction_segment(sizes, st->merged->stack_len); + reftable_free(sizes); + if (segment_size(&seg) > 0) + return stack_compact_range_stats(st, seg.start, seg.end - 1, + NULL); + + return 0; +} + +struct reftable_compaction_stats * +reftable_stack_compaction_stats(struct reftable_stack *st) +{ + return &st->stats; +} + +int reftable_stack_read_ref(struct reftable_stack *st, const char *refname, + struct reftable_ref_record *ref) +{ + struct reftable_table tab = { NULL }; + reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st)); + return reftable_table_read_ref(&tab, refname, ref); +} + +int reftable_stack_read_log(struct reftable_stack *st, const char *refname, + struct reftable_log_record *log) +{ + struct reftable_iterator it = { NULL }; + struct reftable_merged_table *mt = reftable_stack_merged_table(st); + int err = reftable_merged_table_seek_log(mt, &it, refname); + if (err) + goto done; + + err = 
reftable_iterator_next_log(&it, log); + if (err) + goto done; + + if (strcmp(log->refname, refname) || + reftable_log_record_is_deletion(log)) { + err = 1; + goto done; + } + +done: + if (err) { + reftable_log_record_release(log); + } + reftable_iterator_destroy(&it); + return err; +} + +static int stack_check_addition(struct reftable_stack *st, + const char *new_tab_name) +{ + int err = 0; + struct reftable_block_source src = { NULL }; + struct reftable_reader *rd = NULL; + struct reftable_table tab = { NULL }; + struct reftable_ref_record *refs = NULL; + struct reftable_iterator it = { NULL }; + int cap = 0; + int len = 0; + int i = 0; + + if (st->config.skip_name_check) + return 0; + + err = reftable_block_source_from_file(&src, new_tab_name); + if (err < 0) + goto done; + + err = reftable_new_reader(&rd, &src, new_tab_name); + if (err < 0) + goto done; + + err = reftable_reader_seek_ref(rd, &it, ""); + if (err > 0) { + err = 0; + goto done; + } + if (err < 0) + goto done; + + while (1) { + struct reftable_ref_record ref = { NULL }; + err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + break; + } + if (err < 0) + goto done; + + if (len >= cap) { + cap = 2 * cap + 1; + refs = reftable_realloc(refs, cap * sizeof(refs[0])); + } + + refs[len++] = ref; + } + + reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st)); + + err = validate_ref_record_addition(tab, refs, len); + +done: + for (i = 0; i < len; i++) { + reftable_ref_record_release(&refs[i]); + } + + free(refs); + reftable_iterator_destroy(&it); + reftable_reader_free(rd); + return err; +} diff --git a/reftable/stack.h b/reftable/stack.h new file mode 100644 index 00000000000..f57005846e5 --- /dev/null +++ b/reftable/stack.h @@ -0,0 +1,41 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef STACK_H +#define STACK_H + +#include 
"system.h" +#include "reftable-writer.h" +#include "reftable-stack.h" + +struct reftable_stack { + char *list_file; + char *reftable_dir; + int disable_auto_compact; + + struct reftable_write_options config; + + struct reftable_reader **readers; + size_t readers_len; + struct reftable_merged_table *merged; + struct reftable_compaction_stats stats; +}; + +int read_lines(const char *filename, char ***lines); + +struct segment { + int start, end; + int log; + uint64_t bytes; +}; + +int fastlog2(uint64_t sz); +struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n); +struct segment suggest_compaction_segment(uint64_t *sizes, int n); + +#endif diff --git a/reftable/stack_test.c b/reftable/stack_test.c new file mode 100644 index 00000000000..f88a18e51fb --- /dev/null +++ b/reftable/stack_test.c @@ -0,0 +1,803 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "stack.h" + +#include "system.h" + +#include "merged.h" +#include "basics.h" +#include "constants.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-tests.h" + +#include +#include + +static void clear_dir(const char *dirname) +{ + struct strbuf path = STRBUF_INIT; + strbuf_addstr(&path, dirname); + remove_dir_recursively(&path, 0); + strbuf_release(&path); +} + +static char *get_tmp_template(const char *prefix) +{ + const char *tmp = getenv("TMPDIR"); + static char template[1024]; + snprintf(template, sizeof(template) - 1, "%s/%s.XXXXXX", + tmp ? 
tmp : "/tmp", prefix); + return template; +} + +static void test_read_file(void) +{ + char *fn = get_tmp_template(__FUNCTION__); + int fd = mkstemp(fn); + char out[1024] = "line1\n\nline2\nline3"; + int n, err; + char **names = NULL; + char *want[] = { "line1", "line2", "line3" }; + int i = 0; + + EXPECT(fd > 0); + n = write(fd, out, strlen(out)); + EXPECT(n == strlen(out)); + err = close(fd); + EXPECT(err >= 0); + + err = read_lines(fn, &names); + EXPECT_ERR(err); + + for (i = 0; names[i] != NULL; i++) { + EXPECT(0 == strcmp(want[i], names[i])); + } + free_names(names); + remove(fn); +} + +static void test_parse_names(void) +{ + char buf[] = "line\n"; + char **names = NULL; + parse_names(buf, strlen(buf), &names); + + EXPECT(NULL != names[0]); + EXPECT(0 == strcmp(names[0], "line")); + EXPECT(NULL == names[1]); + free_names(names); +} + +static void test_names_equal(void) +{ + char *a[] = { "a", "b", "c", NULL }; + char *b[] = { "a", "b", "d", NULL }; + char *c[] = { "a", "b", NULL }; + + EXPECT(names_equal(a, a)); + EXPECT(!names_equal(a, b)); + EXPECT(!names_equal(a, c)); +} + +static int write_test_ref(struct reftable_writer *wr, void *arg) +{ + struct reftable_ref_record *ref = arg; + reftable_writer_set_limits(wr, ref->update_index, ref->update_index); + return reftable_writer_add_ref(wr, ref); +} + +struct write_log_arg { + struct reftable_log_record *log; + uint64_t update_index; +}; + +static int write_test_log(struct reftable_writer *wr, void *arg) +{ + struct write_log_arg *wla = arg; + + reftable_writer_set_limits(wr, wla->update_index, wla->update_index); + return reftable_writer_add_log(wr, wla->log); +} + +static void test_reftable_stack_add_one(void) +{ + char *dir = get_tmp_template(__FUNCTION__); + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + struct reftable_ref_record ref = { + .refname = "HEAD", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + struct 
reftable_ref_record dest = { NULL }; + + EXPECT(mkdtemp(dir)); + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT_ERR(err); + + err = reftable_stack_read_ref(st, ref.refname, &dest); + EXPECT_ERR(err); + EXPECT(0 == strcmp("master", dest.value.symref)); + + reftable_ref_record_release(&dest); + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_uptodate(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st1 = NULL; + struct reftable_stack *st2 = NULL; + char *dir = get_tmp_template(__FUNCTION__); + int err; + struct reftable_ref_record ref1 = { + .refname = "HEAD", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + struct reftable_ref_record ref2 = { + .refname = "branch2", + .update_index = 2, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + + EXPECT(mkdtemp(dir)); + + /* simulate multi-process access to the same stack + by creating two stacks for the same directory. 
+ */ + err = reftable_new_stack(&st1, dir, cfg); + EXPECT_ERR(err); + + err = reftable_new_stack(&st2, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st1, &write_test_ref, &ref1); + EXPECT_ERR(err); + + err = reftable_stack_add(st2, &write_test_ref, &ref2); + EXPECT(err == REFTABLE_LOCK_ERROR); + + err = reftable_stack_reload(st2); + EXPECT_ERR(err); + + err = reftable_stack_add(st2, &write_test_ref, &ref2); + EXPECT_ERR(err); + reftable_stack_destroy(st1); + reftable_stack_destroy(st2); + clear_dir(dir); +} + +static void test_reftable_stack_transaction_api(void) +{ + char *dir = get_tmp_template(__FUNCTION__); + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + struct reftable_addition *add = NULL; + + struct reftable_ref_record ref = { + .refname = "HEAD", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + struct reftable_ref_record dest = { NULL }; + + EXPECT(mkdtemp(dir)); + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + reftable_addition_destroy(add); + + err = reftable_stack_new_addition(&add, st); + EXPECT_ERR(err); + + err = reftable_addition_add(add, &write_test_ref, &ref); + EXPECT_ERR(err); + + err = reftable_addition_commit(add); + EXPECT_ERR(err); + + reftable_addition_destroy(add); + + err = reftable_stack_read_ref(st, ref.refname, &dest); + EXPECT_ERR(err); + EXPECT(REFTABLE_REF_SYMREF == dest.value_type); + EXPECT(0 == strcmp("master", dest.value.symref)); + + reftable_ref_record_release(&dest); + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_validate_refname(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + char *dir = get_tmp_template(__FUNCTION__); + int i; + struct reftable_ref_record ref = { + .refname = "a/b", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + char *additions[] = { "a", "a/b/c" }; + 
+ EXPECT(mkdtemp(dir)); + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT_ERR(err); + + for (i = 0; i < ARRAY_SIZE(additions); i++) { + struct reftable_ref_record ref = { + .refname = additions[i], + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT(err == REFTABLE_NAME_CONFLICT); + } + + reftable_stack_destroy(st); + clear_dir(dir); +} + +static int write_error(struct reftable_writer *wr, void *arg) +{ + return *((int *)arg); +} + +static void test_reftable_stack_update_index_check(void) +{ + char *dir = get_tmp_template(__FUNCTION__); + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + struct reftable_ref_record ref1 = { + .refname = "name1", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + struct reftable_ref_record ref2 = { + .refname = "name2", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + EXPECT(mkdtemp(dir)); + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref1); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref2); + EXPECT(err == REFTABLE_API_ERROR); + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_lock_failure(void) +{ + char *dir = get_tmp_template(__FUNCTION__); + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err, i; + EXPECT(mkdtemp(dir)); + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) { + err = reftable_stack_add(st, &write_error, &i); + EXPECT(err == i); + } + + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_add(void) +{ + int i = 0; + int err = 0; + struct 
reftable_write_options cfg = { + .exact_log_message = 1, + }; + struct reftable_stack *st = NULL; + char *dir = get_tmp_template(__FUNCTION__); + struct reftable_ref_record refs[2] = { { NULL } }; + struct reftable_log_record logs[2] = { { NULL } }; + int N = ARRAY_SIZE(refs); + + EXPECT(mkdtemp(dir)); + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + st->disable_auto_compact = 1; + + for (i = 0; i < N; i++) { + char buf[256]; + snprintf(buf, sizeof(buf), "branch%02d", i); + refs[i].refname = xstrdup(buf); + refs[i].update_index = i + 1; + refs[i].value_type = REFTABLE_REF_VAL1; + refs[i].value.val1 = reftable_malloc(SHA1_SIZE); + set_test_hash(refs[i].value.val1, i); + + logs[i].refname = xstrdup(buf); + logs[i].update_index = N + i + 1; + logs[i].new_hash = reftable_malloc(SHA1_SIZE); + logs[i].email = xstrdup("identity@invalid"); + set_test_hash(logs[i].new_hash, i); + } + + for (i = 0; i < N; i++) { + int err = reftable_stack_add(st, &write_test_ref, &refs[i]); + EXPECT_ERR(err); + } + + for (i = 0; i < N; i++) { + struct write_log_arg arg = { + .log = &logs[i], + .update_index = reftable_stack_next_update_index(st), + }; + int err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + } + + err = reftable_stack_compact_all(st, NULL); + EXPECT_ERR(err); + + for (i = 0; i < N; i++) { + struct reftable_ref_record dest = { NULL }; + + int err = reftable_stack_read_ref(st, refs[i].refname, &dest); + EXPECT_ERR(err); + EXPECT(reftable_ref_record_equal(&dest, refs + i, SHA1_SIZE)); + reftable_ref_record_release(&dest); + } + + for (i = 0; i < N; i++) { + struct reftable_log_record dest = { NULL }; + int err = reftable_stack_read_log(st, refs[i].refname, &dest); + EXPECT_ERR(err); + EXPECT(reftable_log_record_equal(&dest, logs + i, SHA1_SIZE)); + reftable_log_record_release(&dest); + } + + /* cleanup */ + reftable_stack_destroy(st); + for (i = 0; i < N; i++) { + reftable_ref_record_release(&refs[i]); + 
reftable_log_record_release(&logs[i]); + } + clear_dir(dir); +} + +static void test_reftable_stack_log_normalize(void) +{ + int err = 0; + struct reftable_write_options cfg = { + 0, + }; + struct reftable_stack *st = NULL; + char *dir = get_tmp_template(__FUNCTION__); + + uint8_t h1[SHA1_SIZE] = { 0x01 }, h2[SHA1_SIZE] = { 0x02 }; + + struct reftable_log_record input = { + .refname = "branch", + .update_index = 1, + .new_hash = h1, + .old_hash = h2, + }; + struct reftable_log_record dest = { + .update_index = 0, + }; + struct write_log_arg arg = { + .log = &input, + .update_index = 1, + }; + + EXPECT(mkdtemp(dir)); + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + input.message = "one\ntwo"; + err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT(err == REFTABLE_API_ERROR); + + input.message = "one"; + err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + + err = reftable_stack_read_log(st, input.refname, &dest); + EXPECT_ERR(err); + EXPECT(0 == strcmp(dest.message, "one\n")); + + input.message = "two\n"; + arg.update_index = 2; + err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + err = reftable_stack_read_log(st, input.refname, &dest); + EXPECT_ERR(err); + EXPECT(0 == strcmp(dest.message, "two\n")); + + /* cleanup */ + reftable_stack_destroy(st); + reftable_log_record_release(&dest); + clear_dir(dir); +} + +static void test_reftable_stack_tombstone(void) +{ + int i = 0; + char *dir = get_tmp_template(__FUNCTION__); + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + struct reftable_ref_record refs[2] = { { NULL } }; + struct reftable_log_record logs[2] = { { NULL } }; + int N = ARRAY_SIZE(refs); + struct reftable_ref_record dest = { NULL }; + struct reftable_log_record log_dest = { NULL }; + + EXPECT(mkdtemp(dir)); + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + for (i = 0; i < N; i++) { + const char *buf = "branch"; + refs[i].refname = 
xstrdup(buf); + refs[i].update_index = i + 1; + if (i % 2 == 0) { + refs[i].value_type = REFTABLE_REF_VAL1; + refs[i].value.val1 = reftable_malloc(SHA1_SIZE); + set_test_hash(refs[i].value.val1, i); + } + logs[i].refname = xstrdup(buf); + /* update_index is part of the key. */ + logs[i].update_index = 42; + if (i % 2 == 0) { + logs[i].new_hash = reftable_malloc(SHA1_SIZE); + set_test_hash(logs[i].new_hash, i); + logs[i].email = xstrdup("identity@invalid"); + } + } + for (i = 0; i < N; i++) { + int err = reftable_stack_add(st, &write_test_ref, &refs[i]); + EXPECT_ERR(err); + } + for (i = 0; i < N; i++) { + struct write_log_arg arg = { + .log = &logs[i], + .update_index = reftable_stack_next_update_index(st), + }; + int err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + } + + err = reftable_stack_read_ref(st, "branch", &dest); + EXPECT(err == 1); + reftable_ref_record_release(&dest); + + err = reftable_stack_read_log(st, "branch", &log_dest); + EXPECT(err == 1); + reftable_log_record_release(&log_dest); + + err = reftable_stack_compact_all(st, NULL); + EXPECT_ERR(err); + + err = reftable_stack_read_ref(st, "branch", &dest); + EXPECT(err == 1); + + err = reftable_stack_read_log(st, "branch", &log_dest); + EXPECT(err == 1); + reftable_ref_record_release(&dest); + reftable_log_record_release(&log_dest); + + /* cleanup */ + reftable_stack_destroy(st); + for (i = 0; i < N; i++) { + reftable_ref_record_release(&refs[i]); + reftable_log_record_release(&logs[i]); + } + clear_dir(dir); +} + +static void test_reftable_stack_hash_id(void) +{ + char *dir = get_tmp_template(__FUNCTION__); + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + + struct reftable_ref_record ref = { + .refname = "master", + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "target", + .update_index = 1, + }; + struct reftable_write_options cfg32 = { .hash_id = SHA256_ID }; + struct reftable_stack *st32 = NULL; + struct 
reftable_write_options cfg_default = { 0 }; + struct reftable_stack *st_default = NULL; + struct reftable_ref_record dest = { NULL }; + + EXPECT(mkdtemp(dir)); + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT_ERR(err); + + /* can't read it with the wrong hash ID. */ + err = reftable_new_stack(&st32, dir, cfg32); + EXPECT(err == REFTABLE_FORMAT_ERROR); + + /* check that we can read it back with default config too. */ + err = reftable_new_stack(&st_default, dir, cfg_default); + EXPECT_ERR(err); + + err = reftable_stack_read_ref(st_default, "master", &dest); + EXPECT_ERR(err); + + EXPECT(reftable_ref_record_equal(&ref, &dest, SHA1_SIZE)); + reftable_ref_record_release(&dest); + reftable_stack_destroy(st); + reftable_stack_destroy(st_default); + clear_dir(dir); +} + +static void test_log2(void) +{ + EXPECT(1 == fastlog2(3)); + EXPECT(2 == fastlog2(4)); + EXPECT(2 == fastlog2(5)); +} + +static void test_sizes_to_segments(void) +{ + uint64_t sizes[] = { 2, 3, 4, 5, 7, 9 }; + /* .................0 1 2 3 4 5 */ + + int seglen = 0; + struct segment *segs = + sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes)); + EXPECT(segs[2].log == 3); + EXPECT(segs[2].start == 5); + EXPECT(segs[2].end == 6); + + EXPECT(segs[1].log == 2); + EXPECT(segs[1].start == 2); + EXPECT(segs[1].end == 5); + reftable_free(segs); +} + +static void test_sizes_to_segments_empty(void) +{ + int seglen = 0; + struct segment *segs = sizes_to_segments(&seglen, NULL, 0); + EXPECT(seglen == 0); + reftable_free(segs); +} + +static void test_sizes_to_segments_all_equal(void) +{ + uint64_t sizes[] = { 5, 5 }; + + int seglen = 0; + struct segment *segs = + sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes)); + EXPECT(seglen == 1); + EXPECT(segs[0].start == 0); + EXPECT(segs[0].end == 2); + reftable_free(segs); +} + +static void test_suggest_compaction_segment(void) +{ + uint64_t sizes[] = { 128, 64, 17, 16, 9, 9, 9, 16, 16 }; + /* 
.................0 1 2 3 4 5 6 */ + struct segment min = + suggest_compaction_segment(sizes, ARRAY_SIZE(sizes)); + EXPECT(min.start == 2); + EXPECT(min.end == 7); +} + +static void test_suggest_compaction_segment_nothing(void) +{ + uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 }; + struct segment result = + suggest_compaction_segment(sizes, ARRAY_SIZE(sizes)); + EXPECT(result.start == result.end); +} + +static void test_reflog_expire(void) +{ + char *dir = get_tmp_template(__FUNCTION__); + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + struct reftable_log_record logs[20] = { { NULL } }; + int N = ARRAY_SIZE(logs) - 1; + int i = 0; + int err; + struct reftable_log_expiry_config expiry = { + .time = 10, + }; + struct reftable_log_record log = { NULL }; + + EXPECT(mkdtemp(dir)); + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + for (i = 1; i <= N; i++) { + char buf[256]; + snprintf(buf, sizeof(buf), "branch%02d", i); + + logs[i].refname = xstrdup(buf); + logs[i].update_index = i; + logs[i].time = i; + logs[i].new_hash = reftable_malloc(SHA1_SIZE); + logs[i].email = xstrdup("identity@invalid"); + set_test_hash(logs[i].new_hash, i); + } + + for (i = 1; i <= N; i++) { + struct write_log_arg arg = { + .log = &logs[i], + .update_index = reftable_stack_next_update_index(st), + }; + int err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + } + + err = reftable_stack_compact_all(st, NULL); + EXPECT_ERR(err); + + err = reftable_stack_compact_all(st, &expiry); + EXPECT_ERR(err); + + err = reftable_stack_read_log(st, logs[9].refname, &log); + EXPECT(err == 1); + + err = reftable_stack_read_log(st, logs[11].refname, &log); + EXPECT_ERR(err); + + expiry.min_update_index = 15; + err = reftable_stack_compact_all(st, &expiry); + EXPECT_ERR(err); + + err = reftable_stack_read_log(st, logs[14].refname, &log); + EXPECT(err == 1); + + err = reftable_stack_read_log(st, logs[16].refname, &log); + EXPECT_ERR(err); + + /* 
cleanup */ + reftable_stack_destroy(st); + for (i = 0; i <= N; i++) { + reftable_log_record_release(&logs[i]); + } + clear_dir(dir); + reftable_log_record_release(&log); +} + +static int write_nothing(struct reftable_writer *wr, void *arg) +{ + reftable_writer_set_limits(wr, 1, 1); + return 0; +} + +static void test_empty_add(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + char *dir = get_tmp_template(__FUNCTION__); + struct reftable_stack *st2 = NULL; + + EXPECT(mkdtemp(dir)); + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_nothing, NULL); + EXPECT_ERR(err); + + err = reftable_new_stack(&st2, dir, cfg); + EXPECT_ERR(err); + clear_dir(dir); + reftable_stack_destroy(st); + reftable_stack_destroy(st2); +} + +static void test_reftable_stack_auto_compaction(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + char *dir = get_tmp_template(__FUNCTION__); + int err, i; + int N = 100; + EXPECT(mkdtemp(dir)); + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + for (i = 0; i < N; i++) { + char name[100]; + struct reftable_ref_record ref = { + .refname = name, + .update_index = reftable_stack_next_update_index(st), + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + snprintf(name, sizeof(name), "branch%04d", i); + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT_ERR(err); + + EXPECT(i < 3 || st->merged->stack_len < 2 * fastlog2(i)); + } + + EXPECT(reftable_stack_compaction_stats(st)->entries_written < + (uint64_t)(N * fastlog2(N))); + + reftable_stack_destroy(st); + clear_dir(dir); +} + +int stack_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_reftable_stack_uptodate); + RUN_TEST(test_reftable_stack_transaction_api); + RUN_TEST(test_reftable_stack_hash_id); + RUN_TEST(test_sizes_to_segments_all_equal); + RUN_TEST(test_reftable_stack_auto_compaction); + 
RUN_TEST(test_reftable_stack_validate_refname); + RUN_TEST(test_reftable_stack_update_index_check); + RUN_TEST(test_reftable_stack_lock_failure); + RUN_TEST(test_reftable_stack_log_normalize); + RUN_TEST(test_reftable_stack_tombstone); + RUN_TEST(test_reftable_stack_add_one); + RUN_TEST(test_empty_add); + RUN_TEST(test_reflog_expire); + RUN_TEST(test_suggest_compaction_segment); + RUN_TEST(test_suggest_compaction_segment_nothing); + RUN_TEST(test_sizes_to_segments); + RUN_TEST(test_sizes_to_segments_empty); + RUN_TEST(test_log2); + RUN_TEST(test_parse_names); + RUN_TEST(test_read_file); + RUN_TEST(test_names_equal); + RUN_TEST(test_reftable_stack_add); + return 0; +} diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index fdf92586737..3b702f4855e 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -5,8 +5,11 @@ int cmd__reftable(int argc, const char **argv) { basics_test_main(argc, argv); block_test_main(argc, argv); + merged_test_main(argc, argv); record_test_main(argc, argv); + refname_test_main(argc, argv); reftable_test_main(argc, argv); + stack_test_main(argc, argv); tree_test_main(argc, argv); return 0; } From patchwork Wed Dec 9 14:00:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 11961567 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D2A8BC4167B for ; Wed, 9 Dec 2020 14:02:38 +0000 (UTC) Received: 
from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A2F6723B31 for ; Wed, 9 Dec 2020 14:02:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728827AbgLIOCZ (ORCPT ); Wed, 9 Dec 2020 09:02:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60380 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728855AbgLIOCI (ORCPT ); Wed, 9 Dec 2020 09:02:08 -0500 Received: from mail-wm1-x332.google.com (mail-wm1-x332.google.com [IPv6:2a00:1450:4864:20::332]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 38957C0611CD for ; Wed, 9 Dec 2020 06:00:47 -0800 (PST) Received: by mail-wm1-x332.google.com with SMTP id k10so1575290wmi.3 for ; Wed, 09 Dec 2020 06:00:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=UfjcafWVzccKipiKqVXRhTp+Sc5Krq3PJ4PikPlzqI8=; b=I/MdgBowf0n/VS7ino1dD93/YoZb6QDrji0UuM4fvel7TenYz95YqGCU7gcYUi+QtH WRkGkPrPMWdo9+CEMYSbt8jFXYoXeI0JjyL5yntAuQ1TSjBfUNZU+nX2ittm/xRWKh7N sY7eP1aMW1eNodgjOS4hcP8xxFX4nsT+22ohRMtR3LYW5WglLVKdb63RkSa4ROR7t+IC 3yRHfAUzPpV0p2Uo0j8nY+LEe7RG7Vv8noN8e4xOwcPFoeSfLr2PRuC7QBH6la/1ltfE HDiQCCfxqkM3jQbwBoEaI7+VdtJYk5ChaeWB2GVpiJSCpgmP/fwbWBybOgiMQG49Y/Qh nX2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=UfjcafWVzccKipiKqVXRhTp+Sc5Krq3PJ4PikPlzqI8=; b=RgRJd8jMfZtto6c+LCGEm5qPaYevJUkn7ARG6ikOV2quMlK+Uy39zq/QER7cWVLnGy whMShjEq/yZx9wLWqegjPJXmGL5rTiH2eC0oSIY/ZNVtJNE8HWpIvu4Nbz2s1S/xuZxT Z837LvW98o8lmd3HdGZ2jbLryh9Yy59VBo0cWGtlbBBUenlO4se6VWQyZfPvovdAG/N7 GDpd5wUgKNCMRe6HawJtdRXFaiCfgUhYQ72Z9DJSUjb5VSUegy4wDNg55yRJHJQ0BYbb 
BH0onteKdGjTvt6dXFUJgfViQO8bSBrMnoUp/Gxo4b7zBRNkAtJfotE+VljZTO/5tlHR jwBw==
X-Gm-Message-State: AOAM533FC50e+l3HUCN3CtiUiMaXAe7OhcX7leQmanqCKp34MX6YfSGF Kr+rthFrwF8esKCKeCXQdU5wL8HNsIo=
X-Google-Smtp-Source: ABdhPJwK4u36UDd1VVRRWu83DkhcXOJVynwk0kBRG6xV4xPbmg8VJIa1wCXLg3qSzixG0qNF9EzBKg==
X-Received: by 2002:a1c:9c53:: with SMTP id f80mr2882413wme.19.1607522444293; Wed, 09 Dec 2020 06:00:44 -0800 (PST)
Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id n123sm3681030wmn.7.2020.12.09.06.00.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Dec 2020 06:00:43 -0800 (PST)
Message-Id:
In-Reply-To:
References:
Date: Wed, 09 Dec 2020 14:00:27 +0000
Subject: [PATCH v4 13/15] Reftable support for git-core
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Han-Wen Nienhuys , Jeff King , Ramsay Jones , Jonathan Nieder , Johannes Schindelin , Jonathan Tan , Josh Steadmon , Emily Shaffer , Patrick Steinhardt , Ævar Arnfjörð Bjarmason , Felipe Contreras , Han-Wen Nienhuys , Han-Wen Nienhuys
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Han-Wen Nienhuys

From: Han-Wen Nienhuys

For background, see the previous commit introducing the library.

This introduces the file refs/reftable-backend.c containing a
reftable-powered ref storage backend.

It can be activated by passing --ref-storage=reftable to "git init", or
setting GIT_TEST_REFTABLE in the environment.

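The activation rule described above can be sketched roughly as follows. This is an illustration only, not code from the patch: default_ref_storage() mirrors the function the patch adds to refs.c, while pick_ref_storage() is a hypothetical helper that stands in for the way "git init" lets an explicit --ref-storage value override the default.

```c
#include <stdlib.h>

/*
 * Mirrors default_ref_storage() from this patch: if GIT_TEST_REFTABLE
 * is present in the environment (with any value), default to
 * "reftable"; otherwise default to "files".
 */
static const char *default_ref_storage(void)
{
	return getenv("GIT_TEST_REFTABLE") ? "reftable" : "files";
}

/*
 * Hypothetical helper, not part of the patch: an explicit
 * --ref-storage value, when given, wins over the default.
 */
static const char *pick_ref_storage(const char *cli_value)
{
	return cli_value ? cli_value : default_ref_storage();
}
```

Note that only the presence of the variable is checked, so under this sketch even GIT_TEST_REFTABLE=0 would select "reftable".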
Example use: see t/t0031-reftable.sh

Signed-off-by: Han-Wen Nienhuys
Signed-off-by: Johannes Schindelin
Helped-by: Johannes Schindelin
Co-authored-by: Jeff King
---
 Documentation/config/extensions.txt           |    9 +
 .../technical/repository-version.txt          |    7 +
 Makefile                                      |    4 +
 builtin/clone.c                               |    5 +-
 builtin/init-db.c                             |   55 +-
 builtin/worktree.c                            |   27 +-
 cache.h                                       |    8 +-
 config.mak.uname                              |    2 +-
 contrib/buildsystems/Generators/Vcxproj.pm    |   11 +-
 refs.c                                        |   27 +-
 refs.h                                        |    3 +
 refs/refs-internal.h                          |    1 +
 refs/reftable-backend.c                       | 1435 +++++++++++++++++
 repository.c                                  |    2 +
 repository.h                                  |    3 +
 setup.c                                       |    9 +-
 t/t0031-reftable.sh                           |  199 +++
 t/t1409-avoid-packing-refs.sh                 |    6 +
 t/t1450-fsck.sh                               |    6 +
 t/t3210-pack-refs.sh                          |    6 +
 t/test-lib.sh                                 |    5 +
 21 files changed, 1797 insertions(+), 33 deletions(-)
 create mode 100644 refs/reftable-backend.c
 create mode 100755 t/t0031-reftable.sh

diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt
index 4e23d73cdca..82c5940f143 100644
--- a/Documentation/config/extensions.txt
+++ b/Documentation/config/extensions.txt
@@ -6,3 +6,12 @@ extensions.objectFormat::
 Note that this setting should only be set by linkgit:git-init[1] or
 linkgit:git-clone[1]. Trying to change it after initialization will not
 work and will produce hard-to-diagnose issues.
++
+extensions.refStorage::
+	Specify the ref storage mechanism to use. The acceptable values are `files` and
+	`reftable`. If not specified, `files` is assumed. It is an error to specify
+	this key unless `core.repositoryFormatVersion` is 1.
++
+Note that this setting should only be set by linkgit:git-init[1] or
+linkgit:git-clone[1]. Trying to change it after initialization will not
+work and will produce hard-to-diagnose issues.
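The rule stated in the added extensions.refStorage documentation can be sketched as a small check. This is a minimal illustration under stated assumptions, not the validation code of this series: check_ref_storage_extension() is a hypothetical name, and the real checks may be structured quite differently.

```c
#include <string.h>

/*
 * Hypothetical validator (not from the patch). Returns 0 when the
 * combination is acceptable and -1 otherwise, following the rules in
 * the documentation text:
 *   - a missing key means "files" is assumed;
 *   - the key may only appear when core.repositoryFormatVersion is 1;
 *   - the only acceptable values are "files" and "reftable".
 */
static int check_ref_storage_extension(int repo_format_version,
				       const char *value)
{
	if (!value)
		return 0; /* key absent: "files" is assumed */
	if (repo_format_version != 1)
		return -1; /* extension keys need format version 1 */
	if (strcmp(value, "files") && strcmp(value, "reftable"))
		return -1; /* unknown storage format */
	return 0;
}
```

For instance, a version 0 repository with the key set to `reftable` would be rejected under this sketch, while a version 1 repository with either documented value would be accepted.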
diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
index 7844ef30ffd..72576235833 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -100,3 +100,10 @@ If set, by default "git config" reads from both "config" and
 multiple working directory mode, "config" file is shared while
 "config.worktree" is per-working directory (i.e., it's in
 GIT_COMMON_DIR/worktrees//config.worktree)
+
+==== `refStorage`
+
+Specifies the file format for the ref database. Values are `files`
+(for the traditional packed + loose ref format) and `reftable` for the
+binary reftable format. See https://github.com/google/reftable for
+more information.
diff --git a/Makefile b/Makefile
index 18cc18c2153..6b7c8a165f4 100644
--- a/Makefile
+++ b/Makefile
@@ -977,6 +977,7 @@ LIB_OBJS += reflog-walk.o
 LIB_OBJS += refs.o
 LIB_OBJS += refs/debug.o
 LIB_OBJS += refs/files-backend.o
+LIB_OBJS += refs/reftable-backend.o
 LIB_OBJS += refs/iterator.o
 LIB_OBJS += refs/packed-backend.o
 LIB_OBJS += refs/ref-cache.o
@@ -2408,8 +2409,11 @@ REFTABLE_OBJS += reftable/zlib-compat.o
 
 REFTABLE_TEST_OBJS += reftable/basics_test.o
 REFTABLE_TEST_OBJS += reftable/block_test.o
+REFTABLE_TEST_OBJS += reftable/merged_test.o
 REFTABLE_TEST_OBJS += reftable/record_test.o
+REFTABLE_TEST_OBJS += reftable/refname_test.o
 REFTABLE_TEST_OBJS += reftable/reftable_test.o
+REFTABLE_TEST_OBJS += reftable/stack_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
diff --git a/builtin/clone.c b/builtin/clone.c
index a0841923cfe..974a374e9c9 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1138,7 +1138,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	}
 
 	init_db(git_dir, real_git_dir, option_template, GIT_HASH_UNKNOWN, NULL,
-		INIT_DB_QUIET);
+		default_ref_storage(), INIT_DB_QUIET);
 
 	if (real_git_dir)
 		git_dir = real_git_dir;
@@ -1273,7 +1273,8 @@ int
cmd_clone(int argc, const char **argv, const char *prefix) * Now that we know what algorithm the remote side is using, * let's set ours to the same thing. */ - initialize_repository_version(hash_algo, 1); + initialize_repository_version(hash_algo, 1, + default_ref_storage()); repo_set_hash_algo(the_repository, hash_algo); mapped_refs = wanted_peer_refs(refs, &remote->fetch); diff --git a/builtin/init-db.c b/builtin/init-db.c index dcb7015db48..e10dc712b8d 100644 --- a/builtin/init-db.c +++ b/builtin/init-db.c @@ -179,12 +179,14 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree) return 1; } -void initialize_repository_version(int hash_algo, int reinit) +void initialize_repository_version(int hash_algo, int reinit, + const char *ref_storage_format) { char repo_version_string[10]; int repo_version = GIT_REPO_VERSION; - if (hash_algo != GIT_HASH_SHA1) + if (hash_algo != GIT_HASH_SHA1 || + !strcmp(ref_storage_format, "reftable")) repo_version = GIT_REPO_VERSION_READ; /* This forces creation of new config file */ @@ -237,6 +239,7 @@ static int create_default_files(const char *template_path, is_bare_repository_cfg = init_is_bare_repository; if (init_shared_repository != -1) set_shared_repository(init_shared_repository); + the_repository->ref_storage_format = xstrdup(fmt->ref_storage); /* * We would have created the above under user's umask -- under @@ -246,6 +249,24 @@ static int create_default_files(const char *template_path, adjust_shared_perm(get_git_dir()); } + /* + * Check to see if .git/HEAD exists; this must happen before + * initializing the ref db, because we want to see if there is an + * existing HEAD. + */ + path = git_path_buf(&buf, "HEAD"); + reinit = (!access(path, R_OK) || + readlink(path, junk, sizeof(junk) - 1) != -1); + + /* + * refs/heads is a file when using reftable. 
We can't reinitialize with + * a reftable because it will overwrite HEAD + */ + if (reinit && (!strcmp(fmt->ref_storage, "reftable")) == + is_directory(git_path_buf(&buf, "refs/heads"))) { + die("cannot switch ref storage format."); + } + /* * We need to create a "refs" dir in any case so that older * versions of git can tell that this is a repository. @@ -260,9 +281,6 @@ static int create_default_files(const char *template_path, * Point the HEAD symref to the initial branch with if HEAD does * not yet exist. */ - path = git_path_buf(&buf, "HEAD"); - reinit = (!access(path, R_OK) - || readlink(path, junk, sizeof(junk)-1) != -1); if (!reinit) { char *ref; @@ -279,7 +297,7 @@ static int create_default_files(const char *template_path, free(ref); } - initialize_repository_version(fmt->hash_algo, 0); + initialize_repository_version(fmt->hash_algo, 0, fmt->ref_storage); /* Check filemode trustability */ path = git_path_buf(&buf, "config"); @@ -395,7 +413,7 @@ static void validate_hash_algorithm(struct repository_format *repo_fmt, int hash int init_db(const char *git_dir, const char *real_git_dir, const char *template_dir, int hash, const char *initial_branch, - unsigned int flags) + const char *ref_storage_format, unsigned int flags) { int reinit; int exist_ok = flags & INIT_DB_EXIST_OK; @@ -434,6 +452,7 @@ int init_db(const char *git_dir, const char *real_git_dir, * is an attempt to reinitialize new repository with an old tool. 
*/ check_repository_format(&repo_fmt); + repo_fmt.ref_storage = xstrdup(ref_storage_format); validate_hash_algorithm(&repo_fmt, hash); @@ -487,6 +506,9 @@ int init_db(const char *git_dir, const char *real_git_dir, git_config_set("receive.denyNonFastforwards", "true"); } + if (!strcmp(ref_storage_format, "reftable")) + git_config_set("extensions.refStorage", ref_storage_format); + if (!(flags & INIT_DB_QUIET)) { int len = strlen(git_dir); @@ -560,6 +582,7 @@ static const char *const init_db_usage[] = { int cmd_init_db(int argc, const char **argv, const char *prefix) { const char *git_dir; + const char *ref_storage_format = default_ref_storage(); const char *real_git_dir = NULL; const char *work_tree; const char *template_dir = NULL; @@ -568,15 +591,18 @@ int cmd_init_db(int argc, const char **argv, const char *prefix) const char *initial_branch = NULL; int hash_algo = GIT_HASH_UNKNOWN; const struct option init_db_options[] = { - OPT_STRING(0, "template", &template_dir, N_("template-directory"), - N_("directory from which templates will be used")), + OPT_STRING(0, "template", &template_dir, + N_("template-directory"), + N_("directory from which templates will be used")), OPT_SET_INT(0, "bare", &is_bare_repository_cfg, - N_("create a bare repository"), 1), + N_("create a bare repository"), 1), { OPTION_CALLBACK, 0, "shared", &init_shared_repository, - N_("permissions"), - N_("specify that the git repository is to be shared amongst several users"), - PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0}, + N_("permissions"), + N_("specify that the git repository is to be shared amongst several users"), + PARSE_OPT_OPTARG | PARSE_OPT_NONEG, shared_callback, 0 }, OPT_BIT('q', "quiet", &flags, N_("be quiet"), INIT_DB_QUIET), + OPT_STRING(0, "ref-storage", &ref_storage_format, N_("backend"), + N_("the ref storage format to use")), OPT_STRING(0, "separate-git-dir", &real_git_dir, N_("gitdir"), N_("separate git dir from working tree")), OPT_STRING('b', "initial-branch", 
&initial_branch, N_("name"), @@ -717,10 +743,11 @@ int cmd_init_db(int argc, const char **argv, const char *prefix) } UNLEAK(real_git_dir); + UNLEAK(ref_storage_format); UNLEAK(git_dir); UNLEAK(work_tree); flags |= INIT_DB_EXIST_OK; return init_db(git_dir, real_git_dir, template_dir, hash_algo, - initial_branch, flags); + initial_branch, ref_storage_format, flags); } diff --git a/builtin/worktree.c b/builtin/worktree.c index 197fd24a555..a102aedfca7 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -12,6 +12,7 @@ #include "submodule.h" #include "utf8.h" #include "worktree.h" +#include "../refs/refs-internal.h" static const char * const worktree_usage[] = { N_("git worktree add [] []"), @@ -402,9 +403,29 @@ static int add_worktree(const char *path, const char *refname, * worktree. */ strbuf_reset(&sb); - strbuf_addf(&sb, "%s/HEAD", sb_repo.buf); - write_file(sb.buf, "%s", oid_to_hex(&null_oid)); - strbuf_reset(&sb); + if (get_main_ref_store(the_repository)->be == &refs_be_reftable) { + /* XXX this is cut & paste from reftable_init_db. 
*/ + strbuf_addf(&sb, "%s/HEAD", sb_repo.buf); + write_file(sb.buf, "%s", "ref: refs/heads/.invalid\n"); + strbuf_reset(&sb); + + strbuf_addf(&sb, "%s/refs", sb_repo.buf); + safe_create_dir(sb.buf, 1); + strbuf_reset(&sb); + + strbuf_addf(&sb, "%s/refs/heads", sb_repo.buf); + write_file(sb.buf, "this repository uses the reftable format"); + strbuf_reset(&sb); + + strbuf_addf(&sb, "%s/reftable", sb_repo.buf); + safe_create_dir(sb.buf, 1); + strbuf_reset(&sb); + } else { + strbuf_addf(&sb, "%s/HEAD", sb_repo.buf); + write_file(sb.buf, "%s", oid_to_hex(&null_oid)); + strbuf_reset(&sb); + } + strbuf_addf(&sb, "%s/commondir", sb_repo.buf); write_file(sb.buf, "../.."); diff --git a/cache.h b/cache.h index e986cf4ea9c..545d2b72607 100644 --- a/cache.h +++ b/cache.h @@ -627,9 +627,10 @@ int path_inside_repo(const char *prefix, const char *path); #define INIT_DB_EXIST_OK 0x0002 int init_db(const char *git_dir, const char *real_git_dir, - const char *template_dir, int hash_algo, - const char *initial_branch, unsigned int flags); -void initialize_repository_version(int hash_algo, int reinit); + const char *template_dir, int hash_algo, const char *initial_branch, + const char *ref_storage_format, unsigned int flags); +void initialize_repository_version(int hash_algo, int reinit, + const char *ref_storage_format); void sanitize_stdfds(void); int daemonize(void); @@ -1043,6 +1044,7 @@ struct repository_format { int is_bare; int hash_algo; char *work_tree; + char *ref_storage; struct string_list unknown_extensions; struct string_list v1_only_extensions; }; diff --git a/config.mak.uname b/config.mak.uname index 5b30a9154ac..4ab0191f5e6 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -701,7 +701,7 @@ vcxproj: # Make .vcxproj files and add them unset QUIET_GEN QUIET_BUILT_IN; \ perl contrib/buildsystems/generate -g Vcxproj - git add -f git.sln {*,*/lib,t/helper/*}/*.vcxproj + git add -f git.sln {*,*/lib,*/libreftable,t/helper/*}/*.vcxproj # Generate the 
LinkOrCopyBuiltins.targets and LinkOrCopyRemoteHttp.targets file (echo '' && \ diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm index d2584450ba1..1a25789d285 100644 --- a/contrib/buildsystems/Generators/Vcxproj.pm +++ b/contrib/buildsystems/Generators/Vcxproj.pm @@ -77,7 +77,7 @@ sub createProject { my $libs_release = "\n "; my $libs_debug = "\n "; if (!$static_library) { - $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}})); + $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib|reftable\/libreftable\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}})); $libs_debug = $libs_release; $libs_debug =~ s/zlib\.lib/zlibd\.lib/g; $libs_debug =~ s/libexpat\.lib/libexpatd\.lib/g; @@ -232,6 +232,7 @@ sub createProject { EOM if (!$static_library || $target =~ 'vcs-svn' || $target =~ 'xdiff') { my $uuid_libgit = $$build_structure{"LIBS_libgit_GUID"}; + my $uuid_libreftable = $$build_structure{"LIBS_reftable/libreftable_GUID"}; my $uuid_xdiff_lib = $$build_structure{"LIBS_xdiff/lib_GUID"}; print F << "EOM"; @@ -241,6 +242,14 @@ sub createProject { false EOM + if (!($name =~ /xdiff|libreftable/)) { + print F << "EOM"; + + $uuid_libreftable + false + +EOM + } if (!($name =~ 'xdiff')) { print F << "EOM"; diff --git a/refs.c b/refs.c index 392f0bbf68b..1b874db3345 100644 --- a/refs.c +++ b/refs.c @@ -19,10 +19,16 @@ #include "repository.h" #include "sigchain.h" +const char *default_ref_storage(void) +{ + const char *test = getenv("GIT_TEST_REFTABLE"); + return test ? 
"reftable" : "files"; +} + /* * List of all available backends */ -static struct ref_storage_be *refs_backends = &refs_be_files; +static struct ref_storage_be *refs_backends = &refs_be_reftable; static struct ref_storage_be *find_ref_storage_backend(const char *name) { @@ -1754,13 +1760,13 @@ static struct ref_store *lookup_ref_store_map(struct hashmap *map, * Create, record, and return a ref_store instance for the specified * gitdir. */ -static struct ref_store *ref_store_init(const char *gitdir, +static struct ref_store *ref_store_init(const char *gitdir, const char *be_name, unsigned int flags) { - const char *be_name = "files"; - struct ref_storage_be *be = find_ref_storage_backend(be_name); + struct ref_storage_be *be; struct ref_store *refs; + be = find_ref_storage_backend(be_name); if (!be) BUG("reference backend %s is unknown", be_name); @@ -1776,7 +1782,11 @@ struct ref_store *get_main_ref_store(struct repository *r) if (!r->gitdir) BUG("attempting to get main_ref_store outside of repository"); - r->refs_private = ref_store_init(r->gitdir, REF_STORE_ALL_CAPS); + r->refs_private = ref_store_init(r->gitdir, + r->ref_storage_format ? 
+ r->ref_storage_format : + default_ref_storage(), + REF_STORE_ALL_CAPS); r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private); return r->refs_private; } @@ -1832,7 +1842,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule) goto done; /* assume that add_submodule_odb() has been called */ - refs = ref_store_init(submodule_sb.buf, + refs = ref_store_init(submodule_sb.buf, default_ref_storage(), REF_STORE_READ | REF_STORE_ODB); register_ref_store_map(&submodule_ref_stores, "submodule", refs, submodule); @@ -1846,6 +1856,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule) struct ref_store *get_worktree_ref_store(const struct worktree *wt) { + const char *format = default_ref_storage(); struct ref_store *refs; const char *id; @@ -1859,9 +1870,9 @@ struct ref_store *get_worktree_ref_store(const struct worktree *wt) if (wt->id) refs = ref_store_init(git_common_path("worktrees/%s", wt->id), - REF_STORE_ALL_CAPS); + format, REF_STORE_ALL_CAPS); else - refs = ref_store_init(get_git_common_dir(), + refs = ref_store_init(get_git_common_dir(), format, REF_STORE_ALL_CAPS); if (refs) diff --git a/refs.h b/refs.h index 66955181569..7dc60472c96 100644 --- a/refs.h +++ b/refs.h @@ -11,6 +11,9 @@ struct string_list; struct string_list_item; struct worktree; +/* Returns the ref storage backend to use by default. */ +const char *default_ref_storage(void); + /* * Resolve a reference, recursively following symbolic refererences. 
* diff --git a/refs/refs-internal.h b/refs/refs-internal.h index 467f4b3c936..28166bf1f82 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -669,6 +669,7 @@ struct ref_storage_be { }; extern struct ref_storage_be refs_be_files; +extern struct ref_storage_be refs_be_reftable; extern struct ref_storage_be refs_be_packed; /* diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c new file mode 100644 index 00000000000..b7d12ca91de --- /dev/null +++ b/refs/reftable-backend.c @@ -0,0 +1,1435 @@ +#include "../cache.h" +#include "../chdir-notify.h" +#include "../config.h" +#include "../iterator.h" +#include "../lockfile.h" +#include "../refs.h" +#include "../reftable/reftable-stack.h" +#include "../reftable/reftable-record.h" +#include "../reftable/reftable-error.h" +#include "../reftable/reftable-blocksource.h" +#include "../reftable/reftable-reader.h" +#include "../reftable/reftable-iterator.h" +#include "../reftable/reftable-merged.h" +#include "../reftable/reftable-generic.h" +#include "../worktree.h" +#include "refs-internal.h" + +extern struct ref_storage_be refs_be_reftable; + +struct git_reftable_ref_store { + struct ref_store base; + unsigned int store_flags; + + int err; + char *repo_dir; + + char *reftable_dir; + char *worktree_reftable_dir; + + struct reftable_stack *main_stack; + struct reftable_stack *worktree_stack; +}; + +static struct reftable_stack *stack_for(struct git_reftable_ref_store *store, + const char *refname) +{ + if (store->worktree_stack == NULL) + return store->main_stack; + + switch (ref_type(refname)) { + case REF_TYPE_PER_WORKTREE: + case REF_TYPE_PSEUDOREF: + case REF_TYPE_OTHER_PSEUDOREF: + return store->worktree_stack; + default: + case REF_TYPE_MAIN_PSEUDOREF: + case REF_TYPE_NORMAL: + return store->main_stack; + } +} + +static int git_reftable_read_raw_ref(struct ref_store *ref_store, + const char *refname, struct object_id *oid, + struct strbuf *referent, + unsigned int *type); + +static void 
clear_reftable_log_record(struct reftable_log_record *log) +{ + log->old_hash = NULL; + log->new_hash = NULL; + log->message = NULL; + log->refname = NULL; + reftable_log_record_release(log); +} + +static void fill_reftable_log_record(struct reftable_log_record *log) +{ + const char *info = git_committer_info(0); + struct ident_split split = { NULL }; + int result = split_ident_line(&split, info, strlen(info)); + int sign = 1; + assert(0 == result); + + reftable_log_record_release(log); + log->name = + xstrndup(split.name_begin, split.name_end - split.name_begin); + log->email = + xstrndup(split.mail_begin, split.mail_end - split.mail_begin); + log->time = atol(split.date_begin); + if (*split.tz_begin == '-') { + sign = -1; + split.tz_begin++; + } + if (*split.tz_begin == '+') { + sign = 1; + split.tz_begin++; + } + + log->tz_offset = sign * atoi(split.tz_begin); +} + +static int has_suffix(struct strbuf *b, const char *suffix) +{ + size_t len = strlen(suffix); + + if (len > b->len) { + return 0; + } + + return 0 == strncmp(b->buf + b->len - len, suffix, len); +} + +/* trims the last path component of b. 
Returns -1 if it is not + * present, or 0 on success + */ +static int trim_component(struct strbuf *b) +{ + char *last; + last = strrchr(b->buf, '/'); + if (!last) + return -1; + strbuf_setlen(b, last - b->buf); + return 0; +} + +/* Returns whether `b` is a worktree path, trimming it to the gitdir + */ +static int is_worktree(struct strbuf *b) +{ + if (trim_component(b) < 0) { + return 0; + } + if (!has_suffix(b, "/worktrees")) { + return 0; + } + trim_component(b); + return 1; +} + +static struct ref_store *git_reftable_ref_store_create(const char *path, + unsigned int store_flags) +{ + struct git_reftable_ref_store *refs = xcalloc(1, sizeof(*refs)); + struct ref_store *ref_store = (struct ref_store *)refs; + struct reftable_write_options cfg = { + .block_size = 4096, + .hash_id = the_hash_algo->format_id, + }; + struct strbuf sb = STRBUF_INIT; + const char *gitdir = path; + struct strbuf wt_buf = STRBUF_INIT; + int wt = 0; + + strbuf_addstr(&wt_buf, path); + + /* this is clumsy, but the official worktree functions (eg. + * get_worktrees()) function will try to initialize a ref storage + * backend, leading to infinite recursion. 
*/ + wt = is_worktree(&wt_buf); + if (wt) { + gitdir = wt_buf.buf; + } + + base_ref_store_init(ref_store, &refs_be_reftable); + ref_store->gitdir = xstrdup(gitdir); + refs->store_flags = store_flags; + strbuf_addf(&sb, "%s/reftable", gitdir); + refs->reftable_dir = xstrdup(sb.buf); + strbuf_reset(&sb); + + refs->err = + reftable_new_stack(&refs->main_stack, refs->reftable_dir, cfg); + assert(refs->err != REFTABLE_API_ERROR); + + if (refs->err == 0 && wt) { + strbuf_addf(&sb, "%s/reftable", path); + refs->worktree_reftable_dir = xstrdup(sb.buf); + + refs->err = reftable_new_stack(&refs->worktree_stack, + refs->worktree_reftable_dir, + cfg); + assert(refs->err != REFTABLE_API_ERROR); + } + + strbuf_release(&sb); + strbuf_release(&wt_buf); + return ref_store; +} + +static int git_reftable_init_db(struct ref_store *ref_store, struct strbuf *err) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct strbuf sb = STRBUF_INIT; + + safe_create_dir(refs->reftable_dir, 1); + assert(refs->worktree_reftable_dir == NULL); + + strbuf_addf(&sb, "%s/HEAD", refs->base.gitdir); + write_file(sb.buf, "ref: refs/heads/.invalid"); + strbuf_reset(&sb); + + strbuf_addf(&sb, "%s/refs", refs->base.gitdir); + safe_create_dir(sb.buf, 1); + strbuf_reset(&sb); + + strbuf_addf(&sb, "%s/refs/heads", refs->base.gitdir); + write_file(sb.buf, "this repository uses the reftable format"); + + return 0; +} + +struct git_reftable_iterator { + struct ref_iterator base; + struct reftable_iterator iter; + struct reftable_ref_record ref; + struct object_id oid; + struct ref_store *ref_store; + + /* In case we must iterate over 2 stacks, this is non-null. 
*/ + struct reftable_merged_table *merged; + unsigned int flags; + int err; + const char *prefix; +}; + +static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator) +{ + struct git_reftable_iterator *ri = + (struct git_reftable_iterator *)ref_iterator; + while (ri->err == 0) { + ri->err = reftable_iterator_next_ref(&ri->iter, &ri->ref); + if (ri->err) { + break; + } + + if (ref_type(ri->ref.refname) == REF_TYPE_PSEUDOREF) { + /* + pseudorefs, eg. HEAD, FETCH_HEAD should not be + produced, by default. + */ + continue; + } + ri->base.refname = ri->ref.refname; + if (ri->prefix != NULL && + strncmp(ri->prefix, ri->ref.refname, strlen(ri->prefix))) { + ri->err = 1; + break; + } + if (ri->flags & DO_FOR_EACH_PER_WORKTREE_ONLY && + ref_type(ri->base.refname) != REF_TYPE_PER_WORKTREE) + continue; + + ri->base.flags = 0; + switch (ri->ref.value_type) { + case REFTABLE_REF_VAL1: + hashcpy(ri->oid.hash, ri->ref.value.val1); + break; + case REFTABLE_REF_VAL2: + hashcpy(ri->oid.hash, ri->ref.value.val2.value); + break; + case REFTABLE_REF_SYMREF: { + int out_flags = 0; + const char *resolved = refs_resolve_ref_unsafe( + ri->ref_store, ri->ref.refname, + RESOLVE_REF_READING, &ri->oid, &out_flags); + ri->base.flags = out_flags; + if (resolved == NULL && + !(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) && + (ri->base.flags & REF_ISBROKEN)) { + continue; + } + break; + } + default: + abort(); + } + + ri->base.oid = &ri->oid; + if (!(ri->flags & DO_FOR_EACH_INCLUDE_BROKEN) && + !ref_resolves_to_object(ri->base.refname, ri->base.oid, + ri->base.flags)) { + continue; + } + + break; + } + + if (ri->err > 0) { + return ITER_DONE; + } + if (ri->err < 0) { + return ITER_ERROR; + } + + return ITER_OK; +} + +static int reftable_ref_iterator_peel(struct ref_iterator *ref_iterator, + struct object_id *peeled) +{ + struct git_reftable_iterator *ri = + (struct git_reftable_iterator *)ref_iterator; + if (ri->ref.value_type == REFTABLE_REF_VAL2) { + hashcpy(peeled->hash, 
ri->ref.value.val2.target_value); + return 0; + } + + return -1; +} + +static int reftable_ref_iterator_abort(struct ref_iterator *ref_iterator) +{ + struct git_reftable_iterator *ri = + (struct git_reftable_iterator *)ref_iterator; + reftable_ref_record_release(&ri->ref); + reftable_iterator_destroy(&ri->iter); + if (ri->merged) { + reftable_merged_table_free(ri->merged); + } + return 0; +} + +static struct ref_iterator_vtable reftable_ref_iterator_vtable = { + reftable_ref_iterator_advance, reftable_ref_iterator_peel, + reftable_ref_iterator_abort +}; + +static struct ref_iterator * +git_reftable_ref_iterator_begin(struct ref_store *ref_store, const char *prefix, + unsigned int flags) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct git_reftable_iterator *ri = xcalloc(1, sizeof(*ri)); + + if (refs->err < 0) { + ri->err = refs->err; + } else if (refs->worktree_stack == NULL) { + struct reftable_merged_table *mt = + reftable_stack_merged_table(refs->main_stack); + ri->err = reftable_merged_table_seek_ref(mt, &ri->iter, prefix); + } else { + struct reftable_merged_table *mt1 = + reftable_stack_merged_table(refs->main_stack); + struct reftable_merged_table *mt2 = + reftable_stack_merged_table(refs->worktree_stack); + struct reftable_table *tabs = + xcalloc(2, sizeof(struct reftable_table)); + reftable_table_from_merged_table(&tabs[0], mt1); + reftable_table_from_merged_table(&tabs[1], mt2); + ri->err = reftable_new_merged_table(&ri->merged, tabs, 2, + the_hash_algo->format_id); + if (ri->err == 0) + ri->err = reftable_merged_table_seek_ref( + ri->merged, &ri->iter, prefix); + } + + base_ref_iterator_init(&ri->base, &reftable_ref_iterator_vtable, 1); + ri->prefix = prefix; + ri->base.oid = &ri->oid; + ri->flags = flags; + ri->ref_store = ref_store; + return &ri->base; +} + +static int fixup_symrefs(struct ref_store *ref_store, + struct ref_transaction *transaction) +{ + struct strbuf referent = STRBUF_INIT; + int i = 0; 
+ int err = 0; + + for (i = 0; i < transaction->nr; i++) { + struct ref_update *update = transaction->updates[i]; + struct object_id old_oid; + + err = git_reftable_read_raw_ref(ref_store, update->refname, + &old_oid, &referent, + /* mutate input, like + files-backend.c */ + &update->type); + if (err < 0 && errno == ENOENT && + is_null_oid(&update->old_oid)) { + err = 0; + } + if (err < 0) + goto done; + + if (!(update->type & REF_ISSYMREF)) + continue; + + if (update->flags & REF_NO_DEREF) { + /* what should happen here? See files-backend.c + * lock_ref_for_update. */ + } else { + /* + If we are updating a symref (eg. HEAD), we should also + update the branch that the symref points to. + + This is generic functionality, and would be better + done in refs.c, but the current implementation is + intertwined with the locking in files-backend.c. + */ + int new_flags = update->flags; + struct ref_update *new_update = NULL; + + /* if this is an update for HEAD, should also record a + log entry for HEAD? See files-backend.c, + split_head_update() + */ + new_update = ref_transaction_add_update( + transaction, referent.buf, new_flags, + &update->new_oid, &update->old_oid, + update->msg); + new_update->parent_update = update; + + /* files-backend sets REF_LOG_ONLY here. */ + update->flags |= REF_NO_DEREF | REF_LOG_ONLY; + update->flags &= ~REF_HAVE_OLD; + } + } + +done: + assert(err != REFTABLE_API_ERROR); + strbuf_release(&referent); + return err; +} + +static int git_reftable_transaction_prepare(struct ref_store *ref_store, + struct ref_transaction *transaction, + struct strbuf *errbuf) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_addition *add = NULL; + struct reftable_stack *stack = + transaction->nr ? 
+ stack_for(refs, transaction->updates[0]->refname) : + refs->main_stack; + int err = refs->err; + if (err < 0) { + goto done; + } + + err = reftable_stack_reload(stack); + if (err) { + goto done; + } + + err = reftable_stack_new_addition(&add, stack); + if (err) { + goto done; + } + + err = fixup_symrefs(ref_store, transaction); + if (err) { + goto done; + } + + transaction->backend_data = add; + transaction->state = REF_TRANSACTION_PREPARED; + +done: + assert(err != REFTABLE_API_ERROR); + if (err < 0) { + transaction->state = REF_TRANSACTION_CLOSED; + strbuf_addf(errbuf, "reftable: transaction prepare: %s", + reftable_error_str(err)); + } + + return err; +} + +static int git_reftable_transaction_abort(struct ref_store *ref_store, + struct ref_transaction *transaction, + struct strbuf *err) +{ + struct reftable_addition *add = + (struct reftable_addition *)transaction->backend_data; + reftable_addition_destroy(add); + transaction->backend_data = NULL; + return 0; +} + +static int reftable_check_old_oid(struct ref_store *refs, const char *refname, + struct object_id *want_oid) +{ + struct object_id out_oid; + int out_flags = 0; + const char *resolved = refs_resolve_ref_unsafe( + refs, refname, RESOLVE_REF_READING, &out_oid, &out_flags); + if (is_null_oid(want_oid) != (resolved == NULL)) { + return REFTABLE_LOCK_ERROR; + } + + if (resolved != NULL && !oideq(&out_oid, want_oid)) { + return REFTABLE_LOCK_ERROR; + } + + return 0; +} + +static int ref_update_cmp(const void *a, const void *b) +{ + return strcmp((*(struct ref_update **)a)->refname, + (*(struct ref_update **)b)->refname); +} + +static int write_transaction_table(struct reftable_writer *writer, void *arg) +{ + struct ref_transaction *transaction = (struct ref_transaction *)arg; + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)transaction->ref_store; + struct reftable_stack *stack = + stack_for(refs, transaction->updates[0]->refname); + uint64_t ts = 
reftable_stack_next_update_index(stack); + int err = 0; + int i = 0; + struct reftable_log_record *logs = + calloc(transaction->nr, sizeof(*logs)); + struct ref_update **sorted = + malloc(transaction->nr * sizeof(struct ref_update *)); + COPY_ARRAY(sorted, transaction->updates, transaction->nr); + QSORT(sorted, transaction->nr, ref_update_cmp); + reftable_writer_set_limits(writer, ts, ts); + + for (i = 0; i < transaction->nr; i++) { + struct ref_update *u = sorted[i]; + struct reftable_log_record *log = &logs[i]; + fill_reftable_log_record(log); + log->refname = (char *)u->refname; + log->old_hash = u->old_oid.hash; + log->new_hash = u->new_oid.hash; + log->update_index = ts; + log->message = u->msg; + + if (u->flags & REF_LOG_ONLY) { + continue; + } + + if (u->flags & REF_HAVE_NEW) { + struct reftable_ref_record ref = { NULL }; + struct object_id peeled; + + int peel_error = peel_object(&u->new_oid, &peeled); + ref.refname = (char *)u->refname; + ref.update_index = ts; + + if (!peel_error) { + ref.value_type = REFTABLE_REF_VAL2; + ref.value.val2.target_value = peeled.hash; + ref.value.val2.value = u->new_oid.hash; + } else if (!is_null_oid(&u->new_oid)) { + ref.value_type = REFTABLE_REF_VAL1; + ref.value.val1 = u->new_oid.hash; + } + + err = reftable_writer_add_ref(writer, &ref); + if (err < 0) { + goto done; + } + } + } + + for (i = 0; i < transaction->nr; i++) { + err = reftable_writer_add_log(writer, &logs[i]); + clear_reftable_log_record(&logs[i]); + if (err < 0) { + goto done; + } + } + +done: + assert(err != REFTABLE_API_ERROR); + free(logs); + free(sorted); + return err; +} + +static int git_reftable_transaction_finish(struct ref_store *ref_store, + struct ref_transaction *transaction, + struct strbuf *errmsg) +{ + struct reftable_addition *add = + (struct reftable_addition *)transaction->backend_data; + int err = 0; + int i; + + for (i = 0; i < transaction->nr; i++) { + struct ref_update *u = transaction->updates[i]; + if (u->flags & REF_HAVE_OLD) { + err 
= reftable_check_old_oid(transaction->ref_store, + u->refname, &u->old_oid); + if (err < 0) { + goto done; + } + } + } + if (transaction->nr) { + err = reftable_addition_add(add, &write_transaction_table, + transaction); + if (err < 0) { + goto done; + } + } + + err = reftable_addition_commit(add); + +done: + assert(err != REFTABLE_API_ERROR); + reftable_addition_destroy(add); + transaction->state = REF_TRANSACTION_CLOSED; + transaction->backend_data = NULL; + if (err) { + strbuf_addf(errmsg, "reftable: transaction failure: %s", + reftable_error_str(err)); + return -1; + } + return err; +} + +static int +git_reftable_transaction_initial_commit(struct ref_store *ref_store, + struct ref_transaction *transaction, + struct strbuf *errmsg) +{ + int err = git_reftable_transaction_prepare(ref_store, transaction, + errmsg); + if (err) + return err; + + return git_reftable_transaction_finish(ref_store, transaction, errmsg); +} + +struct write_delete_refs_arg { + struct reftable_stack *stack; + struct string_list *refnames; + const char *logmsg; + unsigned int flags; +}; + +static int write_delete_refs_table(struct reftable_writer *writer, void *argv) +{ + struct write_delete_refs_arg *arg = + (struct write_delete_refs_arg *)argv; + uint64_t ts = reftable_stack_next_update_index(arg->stack); + int err = 0; + int i = 0; + + reftable_writer_set_limits(writer, ts, ts); + for (i = 0; i < arg->refnames->nr; i++) { + struct reftable_ref_record ref = { + .refname = (char *)arg->refnames->items[i].string, + .value_type = REFTABLE_REF_DELETION, + .update_index = ts, + }; + err = reftable_writer_add_ref(writer, &ref); + if (err < 0) { + return err; + } + } + + for (i = 0; i < arg->refnames->nr; i++) { + struct reftable_log_record log = { + .update_index = ts, + }; + struct reftable_ref_record current = { NULL }; + fill_reftable_log_record(&log); + log.message = xstrdup(arg->logmsg); + log.new_hash = NULL; + log.old_hash = NULL; + log.update_index = ts; + log.refname = (char 
*)arg->refnames->items[i].string; + + if (reftable_stack_read_ref(arg->stack, log.refname, + &current) == 0) { + log.old_hash = reftable_ref_record_val1(&current); + } + err = reftable_writer_add_log(writer, &log); + log.old_hash = NULL; + reftable_ref_record_release(&current); + + clear_reftable_log_record(&log); + if (err < 0) { + return err; + } + } + return 0; +} + +static int git_reftable_delete_refs(struct ref_store *ref_store, + const char *msg, + struct string_list *refnames, + unsigned int flags) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = + stack_for(refs, refnames->items[0].string); + struct write_delete_refs_arg arg = { + .stack = stack, + .refnames = refnames, + .logmsg = msg, + .flags = flags, + }; + int err = refs->err; + if (err < 0) { + goto done; + } + + string_list_sort(refnames); + err = reftable_stack_reload(stack); + if (err) { + goto done; + } + err = reftable_stack_add(stack, &write_delete_refs_table, &arg); +done: + assert(err != REFTABLE_API_ERROR); + return err; +} + +static int git_reftable_pack_refs(struct ref_store *ref_store, + unsigned int flags) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + int err = refs->err; + if (err < 0) { + return err; + } + err = reftable_stack_compact_all(refs->main_stack, NULL); + if (err == 0 && refs->worktree_stack != NULL) + err = reftable_stack_compact_all(refs->worktree_stack, NULL); + return err; +} + +struct write_create_symref_arg { + struct git_reftable_ref_store *refs; + struct reftable_stack *stack; + const char *refname; + const char *target; + const char *logmsg; +}; + +static int write_create_symref_table(struct reftable_writer *writer, void *arg) +{ + struct write_create_symref_arg *create = + (struct write_create_symref_arg *)arg; + uint64_t ts = reftable_stack_next_update_index(create->stack); + int err = 0; + + struct reftable_ref_record ref = { + .refname = (char *)create->refname, + 
.value_type = REFTABLE_REF_SYMREF, + .value.symref = (char *)create->target, + .update_index = ts, + }; + reftable_writer_set_limits(writer, ts, ts); + err = reftable_writer_add_ref(writer, &ref); + if (err == 0) { + struct reftable_log_record log = { NULL }; + struct object_id new_oid; + struct object_id old_oid; + + fill_reftable_log_record(&log); + log.refname = (char *)create->refname; + log.message = (char *)create->logmsg; + log.update_index = ts; + if (refs_resolve_ref_unsafe( + (struct ref_store *)create->refs, create->refname, + RESOLVE_REF_READING, &old_oid, NULL) != NULL) { + log.old_hash = old_oid.hash; + } + + if (refs_resolve_ref_unsafe((struct ref_store *)create->refs, + create->target, RESOLVE_REF_READING, + &new_oid, NULL) != NULL) { + log.new_hash = new_oid.hash; + } + + if (log.old_hash != NULL || log.new_hash != NULL) { + err = reftable_writer_add_log(writer, &log); + } + log.refname = NULL; + log.message = NULL; + log.old_hash = NULL; + log.new_hash = NULL; + clear_reftable_log_record(&log); + } + return err; +} + +static int git_reftable_create_symref(struct ref_store *ref_store, + const char *refname, const char *target, + const char *logmsg) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, refname); + struct write_create_symref_arg arg = { .refs = refs, + .stack = stack, + .refname = refname, + .target = target, + .logmsg = logmsg }; + int err = refs->err; + if (err < 0) { + goto done; + } + err = reftable_stack_reload(stack); + if (err) { + goto done; + } + err = reftable_stack_add(stack, &write_create_symref_table, &arg); +done: + assert(err != REFTABLE_API_ERROR); + return err; +} + +struct write_rename_arg { + struct reftable_stack *stack; + const char *oldname; + const char *newname; + const char *logmsg; +}; + +static int write_rename_table(struct reftable_writer *writer, void *argv) +{ + struct write_rename_arg *arg = (struct write_rename_arg 
*)argv; + uint64_t ts = reftable_stack_next_update_index(arg->stack); + struct reftable_ref_record ref = { NULL }; + int err = reftable_stack_read_ref(arg->stack, arg->oldname, &ref); + + if (err) { + goto done; + } + + /* XXX do ref renames overwrite the target? */ + if (reftable_stack_read_ref(arg->stack, arg->newname, &ref) == 0) { + goto done; + } + + reftable_writer_set_limits(writer, ts, ts); + + { + struct reftable_ref_record todo[2] = { + { + .refname = (char *)arg->oldname, + .update_index = ts, + .value_type = REFTABLE_REF_DELETION, + }, + ref, + }; + todo[1].update_index = ts; + free(todo[1].refname); + todo[1].refname = strdup(arg->newname); + + err = reftable_writer_add_refs(writer, todo, 2); + if (err < 0) { + goto done; + } + } + + if (reftable_ref_record_val1(&ref)) { + uint8_t *val1 = reftable_ref_record_val1(&ref); + struct reftable_log_record todo[2] = { { NULL } }; + fill_reftable_log_record(&todo[0]); + fill_reftable_log_record(&todo[1]); + + todo[0].refname = (char *)arg->oldname; + todo[0].update_index = ts; + todo[0].message = (char *)arg->logmsg; + todo[0].old_hash = val1; + todo[0].new_hash = NULL; + + todo[1].refname = (char *)arg->newname; + todo[1].update_index = ts; + todo[1].old_hash = NULL; + todo[1].new_hash = val1; + todo[1].message = (char *)arg->logmsg; + + err = reftable_writer_add_logs(writer, todo, 2); + + clear_reftable_log_record(&todo[0]); + clear_reftable_log_record(&todo[1]); + + if (err < 0) { + goto done; + } + + } else { + /* XXX symrefs? 
*/ + } + +done: + assert(err != REFTABLE_API_ERROR); + reftable_ref_record_release(&ref); + return err; +} + +static int git_reftable_rename_ref(struct ref_store *ref_store, + const char *oldrefname, + const char *newrefname, const char *logmsg) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, newrefname); + struct write_rename_arg arg = { + .stack = stack, + .oldname = oldrefname, + .newname = newrefname, + .logmsg = logmsg, + }; + int err = refs->err; + if (err < 0) { + goto done; + } + err = reftable_stack_reload(stack); + if (err) { + goto done; + } + + err = reftable_stack_add(stack, &write_rename_table, &arg); +done: + assert(err != REFTABLE_API_ERROR); + return err; +} + +static int git_reftable_copy_ref(struct ref_store *ref_store, + const char *oldrefname, const char *newrefname, + const char *logmsg) +{ + BUG("reftable reference store does not support copying references"); +} + +struct git_reftable_reflog_ref_iterator { + struct ref_iterator base; + struct reftable_iterator iter; + struct reftable_log_record log; + struct object_id oid; + + /* Used when iterating over worktree & main */ + struct reftable_merged_table *merged; + char *last_name; +}; + +static int +git_reftable_reflog_ref_iterator_advance(struct ref_iterator *ref_iterator) +{ + struct git_reftable_reflog_ref_iterator *ri = + (struct git_reftable_reflog_ref_iterator *)ref_iterator; + + while (1) { + int err = reftable_iterator_next_log(&ri->iter, &ri->log); + if (err > 0) { + return ITER_DONE; + } + if (err < 0) { + return ITER_ERROR; + } + + ri->base.refname = ri->log.refname; + if (ri->last_name != NULL && + !strcmp(ri->log.refname, ri->last_name)) { + /* we want the refnames that we have reflogs for, so we + * skip if we've already produced this name. This could + * be faster by seeking directly to + * reflog@update_index==0. 
+ */ + continue; + } + + free(ri->last_name); + ri->last_name = xstrdup(ri->log.refname); + hashcpy(ri->oid.hash, ri->log.new_hash); + return ITER_OK; + } +} + +static int +git_reftable_reflog_ref_iterator_peel(struct ref_iterator *ref_iterator, + struct object_id *peeled) +{ + BUG("not supported."); + return -1; +} + +static int +git_reftable_reflog_ref_iterator_abort(struct ref_iterator *ref_iterator) +{ + struct git_reftable_reflog_ref_iterator *ri = + (struct git_reftable_reflog_ref_iterator *)ref_iterator; + reftable_log_record_release(&ri->log); + reftable_iterator_destroy(&ri->iter); + if (ri->merged) + reftable_merged_table_free(ri->merged); + return 0; +} + +static struct ref_iterator_vtable git_reftable_reflog_ref_iterator_vtable = { + git_reftable_reflog_ref_iterator_advance, + git_reftable_reflog_ref_iterator_peel, + git_reftable_reflog_ref_iterator_abort +}; + +static struct ref_iterator * +git_reftable_reflog_iterator_begin(struct ref_store *ref_store) +{ + struct git_reftable_reflog_ref_iterator *ri = xcalloc(sizeof(*ri), 1); + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + + if (refs->worktree_stack == NULL) { + struct reftable_stack *stack = refs->main_stack; + struct reftable_merged_table *mt = + reftable_stack_merged_table(stack); + int err = reftable_merged_table_seek_log(mt, &ri->iter, ""); + if (err < 0) { + free(ri); + /* XXX is this allowed? 
*/ + return NULL; + } + } else { + struct reftable_merged_table *mt1 = + reftable_stack_merged_table(refs->main_stack); + struct reftable_merged_table *mt2 = + reftable_stack_merged_table(refs->worktree_stack); + struct reftable_table *tabs = + xcalloc(2, sizeof(struct reftable_table)); + int err = 0; + reftable_table_from_merged_table(&tabs[0], mt1); + reftable_table_from_merged_table(&tabs[1], mt2); + err = reftable_new_merged_table(&ri->merged, tabs, 2, + the_hash_algo->format_id); + if (err < 0) { + free(tabs); + /* XXX see above */ + return NULL; + } + err = reftable_merged_table_seek_log(ri->merged, &ri->iter, ""); + if (err < 0) { + return NULL; + } + } + base_ref_iterator_init(&ri->base, + &git_reftable_reflog_ref_iterator_vtable, 1); + ri->base.oid = &ri->oid; + + return (struct ref_iterator *)ri; +} + +static int git_reftable_for_each_reflog_ent_newest_first( + struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn, + void *cb_data) +{ + struct reftable_iterator it = { NULL }; + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, refname); + struct reftable_merged_table *mt = NULL; + int err = 0; + struct reftable_log_record log = { NULL }; + + if (refs->err < 0) { + return refs->err; + } + + mt = reftable_stack_merged_table(stack); + err = reftable_merged_table_seek_log(mt, &it, refname); + while (err == 0) { + struct object_id old_oid; + struct object_id new_oid; + const char *full_committer = ""; + + err = reftable_iterator_next_log(&it, &log); + if (err > 0) { + err = 0; + break; + } + if (err < 0) { + break; + } + + if (strcmp(log.refname, refname)) { + break; + } + + hashcpy(old_oid.hash, log.old_hash); + hashcpy(new_oid.hash, log.new_hash); + + full_committer = fmt_ident(log.name, log.email, + WANT_COMMITTER_IDENT, + /*date*/ NULL, IDENT_NO_DATE); + err = fn(&old_oid, &new_oid, full_committer, log.time, + log.tz_offset, log.message, cb_data); + if 
(err) + break; + } + + reftable_log_record_release(&log); + reftable_iterator_destroy(&it); + return err; +} + +static int git_reftable_for_each_reflog_ent_oldest_first( + struct ref_store *ref_store, const char *refname, each_reflog_ent_fn fn, + void *cb_data) +{ + struct reftable_iterator it = { NULL }; + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, refname); + struct reftable_merged_table *mt = NULL; + struct reftable_log_record *logs = NULL; + int cap = 0; + int len = 0; + int err = 0; + int i = 0; + + if (refs->err < 0) { + return refs->err; + } + mt = reftable_stack_merged_table(stack); + err = reftable_merged_table_seek_log(mt, &it, refname); + + while (err == 0) { + struct reftable_log_record log = { NULL }; + err = reftable_iterator_next_log(&it, &log); + if (err > 0) { + err = 0; + break; + } + if (err < 0) { + break; + } + + if (strcmp(log.refname, refname)) { + break; + } + + if (len == cap) { + cap = 2 * cap + 1; + logs = realloc(logs, cap * sizeof(*logs)); + } + + logs[len++] = log; + } + + for (i = len; i--;) { + struct reftable_log_record *log = &logs[i]; + struct object_id old_oid; + struct object_id new_oid; + const char *full_committer = ""; + + hashcpy(old_oid.hash, log->old_hash); + hashcpy(new_oid.hash, log->new_hash); + + full_committer = fmt_ident(log->name, log->email, + WANT_COMMITTER_IDENT, NULL, + IDENT_NO_DATE); + err = fn(&old_oid, &new_oid, full_committer, log->time, + log->tz_offset, log->message, cb_data); + if (err) { + break; + } + } + + for (i = 0; i < len; i++) { + reftable_log_record_release(&logs[i]); + } + free(logs); + + reftable_iterator_destroy(&it); + return err; +} + +static int git_reftable_reflog_exists(struct ref_store *ref_store, + const char *refname) +{ + struct reftable_iterator it = { NULL }; + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, 
refname); + struct reftable_merged_table *mt = reftable_stack_merged_table(stack); + struct reftable_log_record log = { NULL }; + int err = refs->err; + + if (err < 0) { + goto done; + } + err = reftable_merged_table_seek_log(mt, &it, refname); + if (err) { + goto done; + } + err = reftable_iterator_next_log(&it, &log); + if (err) { + goto done; + } + + if (strcmp(log.refname, refname)) { + err = 1; + } + +done: + reftable_iterator_destroy(&it); + reftable_log_record_release(&log); + return !err; +} + +static int git_reftable_create_reflog(struct ref_store *ref_store, + const char *refname, int force_create, + struct strbuf *err) +{ + return 0; +} + +static int git_reftable_delete_reflog(struct ref_store *ref_store, + const char *refname) +{ + return 0; +} + +struct reflog_expiry_arg { + struct git_reftable_ref_store *refs; + struct reftable_stack *stack; + struct reftable_log_record *tombstones; + int len; + int cap; +}; + +static void clear_log_tombstones(struct reflog_expiry_arg *arg) +{ + int i = 0; + for (; i < arg->len; i++) { + reftable_log_record_release(&arg->tombstones[i]); + } + + FREE_AND_NULL(arg->tombstones); +} + +static void add_log_tombstone(struct reflog_expiry_arg *arg, + const char *refname, uint64_t ts) +{ + struct reftable_log_record tombstone = { + .refname = xstrdup(refname), + .update_index = ts, + }; + if (arg->len == arg->cap) { + arg->cap = 2 * arg->cap + 1; + arg->tombstones = + realloc(arg->tombstones, arg->cap * sizeof(tombstone)); + } + arg->tombstones[arg->len++] = tombstone; +} + +static int write_reflog_expiry_table(struct reftable_writer *writer, void *argv) +{ + struct reflog_expiry_arg *arg = (struct reflog_expiry_arg *)argv; + uint64_t ts = reftable_stack_next_update_index(arg->stack); + int i = 0; + reftable_writer_set_limits(writer, ts, ts); + for (i = 0; i < arg->len; i++) { + int err = reftable_writer_add_log(writer, &arg->tombstones[i]); + if (err) { + return err; + } + } + return 0; +} + +static int 
+git_reftable_reflog_expire(struct ref_store *ref_store, const char *refname, + const struct object_id *oid, unsigned int flags, + reflog_expiry_prepare_fn prepare_fn, + reflog_expiry_should_prune_fn should_prune_fn, + reflog_expiry_cleanup_fn cleanup_fn, + void *policy_cb_data) +{ + /* + For log expiry, we write tombstones in place of the expired entries. + This means that the entries are still retrievable by delving into the + stack, and expiring entries paradoxically takes extra memory. + + This memory is only reclaimed when some operation issues a + git_reftable_pack_refs(), which will compact the entire stack and get + rid of deletion entries. + + It would be better if the refs backend supported an API that sets a + criterion for all refs, passing the criterion to pack_refs(). + */ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, refname); + struct reftable_merged_table *mt = NULL; + struct reflog_expiry_arg arg = { + .stack = stack, + .refs = refs, + }; + struct reftable_log_record log = { NULL }; + struct reftable_iterator it = { NULL }; + int err = 0; + if (refs->err < 0) { + return refs->err; + } + err = reftable_stack_reload(stack); + if (err) { + goto done; + } + + mt = reftable_stack_merged_table(stack); + err = reftable_merged_table_seek_log(mt, &it, refname); + if (err < 0) { + goto done; + } + + while (1) { + struct object_id ooid; + struct object_id noid; + + err = reftable_iterator_next_log(&it, &log); + if (err < 0) { + goto done; + } + + if (err > 0 || strcmp(log.refname, refname)) { + break; + } + hashcpy(ooid.hash, log.old_hash); + hashcpy(noid.hash, log.new_hash); + + if (should_prune_fn(&ooid, &noid, log.email, + (timestamp_t)log.time, log.tz_offset, + log.message, policy_cb_data)) { + add_log_tombstone(&arg, refname, log.update_index); + } + } + err = reftable_stack_add(stack, &write_reflog_expiry_table, &arg); + +done: + assert(err != 
REFTABLE_API_ERROR); + reftable_log_record_release(&log); + reftable_iterator_destroy(&it); + clear_log_tombstones(&arg); + return err; +} + +static int reftable_error_to_errno(int err) +{ + switch (err) { + case REFTABLE_IO_ERROR: + return EIO; + case REFTABLE_FORMAT_ERROR: + return EFAULT; + case REFTABLE_NOT_EXIST_ERROR: + return ENOENT; + case REFTABLE_LOCK_ERROR: + return EBUSY; + case REFTABLE_API_ERROR: + return EINVAL; + case REFTABLE_ZLIB_ERROR: + return EDOM; + default: + return ERANGE; + } +} + +static int git_reftable_read_raw_ref(struct ref_store *ref_store, + const char *refname, struct object_id *oid, + struct strbuf *referent, + unsigned int *type) +{ + struct git_reftable_ref_store *refs = + (struct git_reftable_ref_store *)ref_store; + struct reftable_stack *stack = stack_for(refs, refname); + + struct reftable_ref_record ref = { NULL }; + int err = 0; + if (refs->err < 0) { + return refs->err; + } + + /* This is usually not needed, but Git doesn't signal to ref backend if + a subprocess updated the ref DB. So we always check. 
+ */ + err = reftable_stack_reload(stack); + if (err) { + goto done; + } + + err = reftable_stack_read_ref(stack, refname, &ref); + if (err > 0) { + errno = ENOENT; + err = -1; + goto done; + } + if (err < 0) { + errno = reftable_error_to_errno(err); + err = -1; + goto done; + } + + if (ref.value_type == REFTABLE_REF_SYMREF) { + strbuf_reset(referent); + strbuf_addstr(referent, ref.value.symref); + *type |= REF_ISSYMREF; + } else if (reftable_ref_record_val1(&ref) != NULL) { + hashcpy(oid->hash, reftable_ref_record_val1(&ref)); + } else { + *type |= REF_ISBROKEN; + errno = EINVAL; + err = -1; + } +done: + assert(err != REFTABLE_API_ERROR); + reftable_ref_record_release(&ref); + return err; +} + +struct ref_storage_be refs_be_reftable = { + &refs_be_files, + "reftable", + git_reftable_ref_store_create, + git_reftable_init_db, + git_reftable_transaction_prepare, + git_reftable_transaction_finish, + git_reftable_transaction_abort, + git_reftable_transaction_initial_commit, + + git_reftable_pack_refs, + git_reftable_create_symref, + git_reftable_delete_refs, + git_reftable_rename_ref, + git_reftable_copy_ref, + + git_reftable_ref_iterator_begin, + git_reftable_read_raw_ref, + + git_reftable_reflog_iterator_begin, + git_reftable_for_each_reflog_ent_oldest_first, + git_reftable_for_each_reflog_ent_newest_first, + git_reftable_reflog_exists, + git_reftable_create_reflog, + git_reftable_delete_reflog, + git_reftable_reflog_expire, +}; diff --git a/repository.c b/repository.c index a4174ddb062..ff0988dac84 100644 --- a/repository.c +++ b/repository.c @@ -174,6 +174,8 @@ int repo_init(struct repository *repo, if (worktree) repo_set_worktree(repo, worktree); + repo->ref_storage_format = xstrdup_or_null(format.ref_storage); + clear_repository_format(&format); return 0; diff --git a/repository.h b/repository.h index b385ca3c94b..8019a7d0a1f 100644 --- a/repository.h +++ b/repository.h @@ -78,6 +78,9 @@ struct repository { */ struct ref_store *refs_private; + /* The format to 
use for the ref database. */ + char *ref_storage_format; + /* * Contains path to often used file names. */ diff --git a/setup.c b/setup.c index c04cd25a30d..c6b57efd031 100644 --- a/setup.c +++ b/setup.c @@ -500,6 +500,9 @@ static enum extension_result handle_extension(const char *var, return error("invalid value for 'extensions.objectformat'"); data->hash_algo = format; return EXTENSION_OK; + } else if (!strcmp(ext, "refstorage")) { + data->ref_storage = xstrdup(value); + return EXTENSION_OK; } return EXTENSION_UNKNOWN; } @@ -651,6 +654,7 @@ void clear_repository_format(struct repository_format *format) string_list_clear(&format->v1_only_extensions, 0); free(format->work_tree); free(format->partial_clone); + free(format->ref_storage); init_repository_format(format); } @@ -1308,8 +1312,11 @@ const char *setup_git_directory_gently(int *nongit_ok) gitdir = DEFAULT_GIT_DIR_ENVIRONMENT; setup_git_env(gitdir); } - if (startup_info->have_repository) + if (startup_info->have_repository) { repo_set_hash_algo(the_repository, repo_fmt.hash_algo); + the_repository->ref_storage_format = + xstrdup_or_null(repo_fmt.ref_storage); + } } strbuf_release(&dir); diff --git a/t/t0031-reftable.sh b/t/t0031-reftable.sh new file mode 100755 index 00000000000..58c7d5d4bcd --- /dev/null +++ b/t/t0031-reftable.sh @@ -0,0 +1,199 @@ +#!/bin/sh +# +# Copyright (c) 2020 Google LLC +# + +test_description='reftable basics' + +. 
./test-lib.sh + +INVALID_SHA1=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + +initialize () { + rm -rf .git && + git init --ref-storage=reftable && + mv .git/hooks .git/hooks-disabled +} + +test_expect_success 'SHA256 support, env' ' + rm -rf .git && + GIT_DEFAULT_HASH=sha256 && export GIT_DEFAULT_HASH && + git init --ref-storage=reftable && + mv .git/hooks .git/hooks-disabled && + test_commit file +' + +test_expect_success 'SHA256 support, option' ' + rm -rf .git && + git init --ref-storage=reftable --object-format=sha256 && + mv .git/hooks .git/hooks-disabled && + test_commit file +' + +test_expect_success 'delete ref' ' + initialize && + test_commit file && + SHA=$(git show-ref -s --verify HEAD) && + test_write_lines "$SHA refs/heads/master" "$SHA refs/tags/file" >expect && + git show-ref > actual && + ! git update-ref -d refs/tags/file $INVALID_SHA1 && + test_cmp expect actual && + git update-ref -d refs/tags/file $SHA && + test_write_lines "$SHA refs/heads/master" >expect && + git show-ref > actual && + test_cmp expect actual +' + + +test_expect_success 'clone calls transaction_initial_commit' ' + test_commit message1 file1 && + git clone . 
cloned && + (test -f cloned/file1 || echo "Fixme.") +' + +test_expect_success 'basic operation of reftable storage: commit, show-ref' ' + initialize && + test_commit file && + test_write_lines refs/heads/master refs/tags/file >expect && + git show-ref && + git show-ref | cut -f2 -d" " > actual && + test_cmp actual expect +' + +test_expect_success 'reflog, repack' ' + initialize && + for count in $(test_seq 1 10) + do + test_commit "number $count" file.t $count number-$count || + return 1 + done && + git pack-refs && + ls -1 .git/reftable >table-files && + test_line_count = 2 table-files && + git reflog refs/heads/master >output && + test_line_count = 10 output && + grep "commit (initial): number 1" output && + grep "commit: number 10" output && + git gc && + git reflog refs/heads/master >output && + test_line_count = 0 output +' + +test_expect_success 'branch switch in reflog output' ' + initialize && + test_commit file1 && + git checkout -b branch1 && + test_commit file2 && + git checkout -b branch2 && + git switch - && + git rev-parse --symbolic-full-name HEAD > actual && + echo refs/heads/branch1 > expect && + test_cmp actual expect +' + + +# This matches show-ref's output +print_ref() { + echo "$(git rev-parse "$1") $1" +} + +test_expect_success 'peeled tags are stored' ' + initialize && + test_commit file && + git tag -m "annotated tag" test_tag HEAD && + { + print_ref "refs/heads/master" && + print_ref "refs/tags/file" && + print_ref "refs/tags/test_tag" && + print_ref "refs/tags/test_tag^{}" + } >expect && + git show-ref -d >actual && + test_cmp expect actual +' + +test_expect_success 'show-ref works on fresh repo' ' + initialize && + rm -rf .git && + git init --ref-storage=reftable && + >expect && + ! git show-ref > actual && + test_cmp expect actual +' + +test_expect_success 'checkout unborn branch' ' + initialize && + git checkout -b master +' + + +test_expect_success 'dir/file conflict' ' + initialize && + test_commit file && + ! 
git branch master/forbidden +' + + +test_expect_success 'do not clobber existing repo' ' + rm -rf .git && + git init --ref-storage=files && + cat .git/HEAD > expect && + test_commit file && + (git init --ref-storage=reftable || true) && + cat .git/HEAD > actual && + test_cmp expect actual +' + +# cherry-pick uses a pseudo ref. +test_expect_success 'pseudo refs' ' + initialize && + test_commit message1 file1 && + test_commit message2 file2 && + git branch source && + git checkout HEAD^ && + test_commit message3 file3 && + git cherry-pick source && + test -f file2 +' + +# cherry-pick uses a pseudo ref. +test_expect_success 'rebase' ' + initialize && + test_commit message1 file1 && + test_commit message2 file2 && + git branch source && + git checkout HEAD^ && + test_commit message3 file3 && + git rebase source && + test -f file2 +' + +test_expect_success 'worktrees' ' + git init --ref-storage=reftable start && + (cd start && test_commit file1 && git checkout -b branch1 && + git checkout -b branch2 && + git worktree add ../wt + ) && + cd wt && + git checkout branch1 && + git branch +' + +test_expect_success 'worktrees 2' ' + initialize && + test_commit file1 && + mkdir existing_empty && + git worktree add --detach existing_empty master +' + +test_expect_success 'FETCH_HEAD' ' + initialize && + test_commit one && + (git init sub && cd sub && test_commit two) && + git --git-dir sub/.git rev-parse HEAD >expect && + git fetch sub && + git checkout FETCH_HEAD && + git rev-parse HEAD >actual && + test_cmp expect actual +' + +test_done diff --git a/t/t1409-avoid-packing-refs.sh b/t/t1409-avoid-packing-refs.sh index be12fb63506..c6f78325563 100755 --- a/t/t1409-avoid-packing-refs.sh +++ b/t/t1409-avoid-packing-refs.sh @@ -4,6 +4,12 @@ test_description='avoid rewriting packed-refs unnecessarily' . 
./test-lib.sh +if test_have_prereq REFTABLE +then + skip_all='skipping pack-refs tests; incompatible with reftable' + test_done +fi + # Add an identifying mark to the packed-refs file header line. This # shouldn't upset readers, and it should be omitted if the file is # ever rewritten. diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index b17f5c21fbc..cc5d01571a4 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -8,6 +8,12 @@ test_description='git fsck random collection of tests . ./test-lib.sh +if test_have_prereq REFTABLE +then + skip_all='skipping tests; incompatible with reftable' + test_done +fi + test_expect_success setup ' git config gc.auto 0 && git config i18n.commitencoding ISO-8859-1 && diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh index f41b2afb996..edaef2c175a 100755 --- a/t/t3210-pack-refs.sh +++ b/t/t3210-pack-refs.sh @@ -11,6 +11,12 @@ semantic is still the same. ' . ./test-lib.sh +if test_have_prereq REFTABLE +then + skip_all='skipping pack-refs tests; incompatible with reftable' + test_done +fi + test_expect_success 'enable reflogs' ' git config core.logallrefupdates true ' diff --git a/t/test-lib.sh b/t/test-lib.sh index a863ccee7e9..7b638e0b8c8 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1520,6 +1520,11 @@ parisc* | hppa*) ;; esac +if test -n "$GIT_TEST_REFTABLE" +then + test_set_prereq REFTABLE +fi + ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1 test -z "$NO_PERL" && test_set_prereq PERL test -z "$NO_PTHREADS" && test_set_prereq PTHREADS From patchwork Wed Dec 9 14:00:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?SZEDER_G=C3=A1bor?= X-Patchwork-Id: 11961563 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C30EC433FE for ; Wed, 9 Dec 2020 14:02:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4815623B31 for ; Wed, 9 Dec 2020 14:02:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732793AbgLIOCN (ORCPT ); Wed, 9 Dec 2020 09:02:13 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60414 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732785AbgLIOB6 (ORCPT ); Wed, 9 Dec 2020 09:01:58 -0500 Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D9DD0C0611CC for ; Wed, 9 Dec 2020 06:00:46 -0800 (PST) Received: by mail-wm1-x333.google.com with SMTP id w206so953426wma.0 for ; Wed, 09 Dec 2020 06:00:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=+6PYEuFV+jKAR8TJMsA1Qhghy1Wlzfn68uworvrq0zQ=; b=gm8GvRzNVMGrCTkp3ZIPmi3ugqhksdPJI68JgSIpvI0qOt3T78qimkLXwWE7nTwSFQ hzup3J/bGN+oto4lBIQ9QECmi5O+zZZhrS8FwP3mmHG6DoCMjZgHnLeTnRhw2i7rOwlW owDrW3O5kGFQvKfAP/YLMiGPfq2gbzqzq85sHuaarpyM1zHra8itrf2aoajMwH/ePn3B nBrHrASJWtelsbHT5+7734R9dqPXUhGD4H2eK7EuI3cbL+nLi0SLrIIGWVcRjwIWTg5X VeTXNv8p5EtE0G2JPevGOmL10iQpXkOXQcVYgiE61+LFVWkmPRFl17pb0q2u3N1ukrYF +Wjw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date 
:subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=+6PYEuFV+jKAR8TJMsA1Qhghy1Wlzfn68uworvrq0zQ=; b=P9qia8cwYatCK7Sn/WDKxSsAcNba0cWBS9BK5t6hy40/SvHXHXyXnT2I1IZDHuz2Mk xF3s5DHOHSNV21GSa5/foVDYmNX8Qlp5LMVS+R11MnGw58IuwJG84uiXD9oEASjAGuqh LlrYw/1h85JgQ9q9AxBJUPPisXF5Y/bA47aTYV2DYv+PFvL7jIDQuscVMy8GdcHDfnX8 S37FWf2c2v7BGek06ExbT57UsvnlgV3K6Lx5lp19iTfq2pprAKz/bIYvCn1LbqJRrTGE uyNbQte1/ZyMZ1jgqKquw+G6fN9RFCDCHgKB+ZJU6bEaUKn901bqKvLwgBlZA1ZxC8cg qZ3Q== X-Gm-Message-State: AOAM531CezCzcfxzoC9JSSEi8SN01AuI4yIKjxyVo2khNaGt3aeb1gQI V0ZJOxV++iSk+B//h6Peu+ZQLfxEuSk= X-Google-Smtp-Source: ABdhPJww5p1O8gHww2LlDyJkCeGqmzY+DXXV8lSEUKheAbWcjtQU4wvEEEDUBGk3hxwdv2QLNn2EPw== X-Received: by 2002:a7b:c24b:: with SMTP id b11mr2974455wmj.168.1607522445073; Wed, 09 Dec 2020 06:00:45 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id x17sm3445074wro.40.2020.12.09.06.00.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Dec 2020 06:00:44 -0800 (PST) Message-Id: <9df5bc69f971dc2b51d519a2db9a2b0f22f9d87b.1607522430.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 09 Dec 2020 14:00:28 +0000 Subject: [PATCH v4 14/15] git-prompt: prepare for reftable refs backend MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Jeff King , Ramsay Jones , Jonathan Nieder , Johannes Schindelin , Jonathan Tan , Josh Steadmon , Emily Shaffer , Patrick Steinhardt , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Felipe Contreras , Han-Wen Nienhuys , =?utf-8?q?SZEDER_G=C3=A1bor?= Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: =?utf-8?q?SZEDER_G=C3=A1bor?= From: =?UTF-8?q?SZEDER=20G=C3=A1bor?= In our git-prompt script we strive to use Bash builtins wherever possible, because fork()-ing subshells for command substitutions and fork()+exec()-ing Git commands are expensive on some platforms. We even read and parse '.git/HEAD' using Bash builtins to get the name of the current branch [1]. 
However, the upcoming reftable refs backend won't use '.git/HEAD' at all, but will write an invalid refname as placeholder for backwards compatibility instead, which will break our git-prompt script. Update the git-prompt script to recognize the placeholder '.git/HEAD' written by the reftable backend (its content is specified in the reftable specs), and then fall back to use 'git symbolic-ref' to get the name of the current branch. [1] 3a43c4b5bd (bash prompt: use bash builtins to find out current branch, 2011-03-31) Signed-off-by: SZEDER Gábor --- contrib/completion/git-prompt.sh | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh index 4640a1535d1..2e5a5d80271 100644 --- a/contrib/completion/git-prompt.sh +++ b/contrib/completion/git-prompt.sh @@ -478,10 +478,15 @@ __git_ps1 () if ! __git_eread "$g/HEAD" head; then return $exit fi - # is it a symbolic ref? b="${head#ref: }" if [ "$head" = "$b" ]; then detached=yes + elif [ "$b" = "refs/heads/.invalid" ]; then + # Reftable + b="$(git symbolic-ref HEAD 2>/dev/null)" || + detached=yes + fi + if [ "$detached" = yes ]; then b="$( case "${GIT_PS1_DESCRIBE_STYLE-}" in (contains) From patchwork Wed Dec 9 14:00:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 11961573 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A85E8C19425 for ; Wed, 9 Dec 2020 
14:03:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7FC5623B42 for ; Wed, 9 Dec 2020 14:03:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727589AbgLIOCi (ORCPT ); Wed, 9 Dec 2020 09:02:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60506 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728849AbgLIOCd (ORCPT ); Wed, 9 Dec 2020 09:02:33 -0500 Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2629CC0611CE for ; Wed, 9 Dec 2020 06:00:48 -0800 (PST) Received: by mail-wr1-x444.google.com with SMTP id r3so1870629wrt.2 for ; Wed, 09 Dec 2020 06:00:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=QeG/Uiv06sgOLRIt9htHLskysMZMELPOfsIsuyKEoEU=; b=GsSgBWmIjt/eV8E81HLGEF9k8eM/BPVMzHTOIpI1ZGQs0V/rHjRxoQ4ap5x4+v6cgs rz5PkEPzni4PaS+vF23gZ+2bQ+n/ezYMsJhoPhBwrYBLe4kQ0JBhQm0S9JXlTw2x9yxm opLLSU85kRO4wGFQC+QRyMpM3dUuEqy/QGEQdqBFn8HJnB0gXYJkD2KB7qXJ9Svm2xmn YXi9QPXOKNBNIicYMSYwflVAkRTdCqRm/bCrc0lzlckEYJDSxrdW54nqn06Trplwdfdl q4RsaHHKVGNUvznV7hKqmPQWEzxgZIIacQOI52e11BSgd1VLcWkUppPnKmkPsmOyIbql 8feg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=QeG/Uiv06sgOLRIt9htHLskysMZMELPOfsIsuyKEoEU=; b=C6p1kEYSEurPMDcgrxBkslEDOCwWhwjk7PJZQvPyfp5x6vjAJnrpjcMhT7fcus+2LD m0DSCcaqTZGp3wo7FP9zLrGdhFw+imZfd23cEscNSMnQlaAMlpNMN4gJ8d06FL09IcMh 5VQ9i1zX01U5Ds71Hn1uwxdN5i4RgmxE9BMWmZZd4HjT0dGGLpokrVSpKvqQYm1Rl1FO swv3D1vSLmBnyaRSD01fSg2+VqYkSSWtSpJVAkn1oKj6EbHBQiAGbXp2onye+ntpzONa 
iGMxHYD+y4F+b1h0O9r8cHlG/Z4NQGNqO8j6di471QGBSIA6/2QMK6RMog7MVMV7TAJx 7egw== X-Gm-Message-State: AOAM533cJuIW4WCCaQpK/c24R0WNA34o/B1F+UeIKOKBvWW9f1jK1D4B Jv0fhtMK9nPLYsM+Oc5JdtoXA+/SPW4= X-Google-Smtp-Source: ABdhPJyXMJySKYICwXiN2yiVBexLl+RYPDA5ZJazuqd9Oig8DAU8O4qjGu6hgzNhncsPsLKh1T1hWQ== X-Received: by 2002:a5d:5710:: with SMTP id a16mr2907934wrv.229.1607522446608; Wed, 09 Dec 2020 06:00:46 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id q1sm3336137wrj.8.2020.12.09.06.00.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Dec 2020 06:00:46 -0800 (PST) Message-Id: <4076ef5d20b4986cf857e70fffe1de9da3bb12e3.1607522430.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 09 Dec 2020 14:00:29 +0000 Subject: [PATCH v4 15/15] Add "test-tool dump-reftable" command. Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Jeff King , Ramsay Jones , Jonathan Nieder , Johannes Schindelin , Jonathan Tan , Josh Steadmon , Emily Shaffer , Patrick Steinhardt , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Felipe Contreras , Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys This command dumps individual tables or a stack of tables. 
Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + reftable/dump.c | 67 ++++++++++++++++++++-------------------- t/helper/test-reftable.c | 5 +++ t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + 5 files changed, 42 insertions(+), 33 deletions(-) diff --git a/Makefile b/Makefile index 6b7c8a165f4..2f9bf61fffb 100644 --- a/Makefile +++ b/Makefile @@ -2409,6 +2409,7 @@ REFTABLE_OBJS += reftable/zlib-compat.o REFTABLE_TEST_OBJS += reftable/basics_test.o REFTABLE_TEST_OBJS += reftable/block_test.o +REFTABLE_TEST_OBJS += reftable/dump.o REFTABLE_TEST_OBJS += reftable/merged_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/refname_test.o diff --git a/reftable/dump.c b/reftable/dump.c index 00b444e8c9b..7d620a3cf0f 100644 --- a/reftable/dump.c +++ b/reftable/dump.c @@ -12,18 +12,25 @@ license that can be found in the LICENSE file or at #include #include -#include "reftable.h" +#include "reftable-blocksource.h" +#include "reftable-error.h" +#include "reftable-merged.h" +#include "reftable-record.h" #include "reftable-tests.h" +#include "reftable-writer.h" +#include "reftable-iterator.h" +#include "reftable-reader.h" +#include "reftable-stack.h" static uint32_t hash_id; static int dump_table(const char *tablename) { - struct reftable_block_source src = { 0 }; + struct reftable_block_source src = { NULL }; int err = reftable_block_source_from_file(&src, tablename); - struct reftable_iterator it = { 0 }; - struct reftable_ref_record ref = { 0 }; - struct reftable_log_record log = { 0 }; + struct reftable_iterator it = { NULL }; + struct reftable_ref_record ref = { NULL }; + struct reftable_log_record log = { NULL }; struct reftable_reader *r = NULL; if (err < 0) @@ -49,7 +56,7 @@ static int dump_table(const char *tablename) reftable_ref_record_print(&ref, hash_id); } reftable_iterator_destroy(&it); - reftable_ref_record_clear(&ref); + reftable_ref_record_release(&ref); err = reftable_reader_seek_log(r, &it, ""); if (err < 0) { @@ -66,7 
+73,7 @@ static int dump_table(const char *tablename) reftable_log_record_print(&log, hash_id); } reftable_iterator_destroy(&it); - reftable_log_record_clear(&log); + reftable_log_record_release(&log); reftable_reader_free(r); return 0; @@ -75,7 +82,7 @@ static int dump_table(const char *tablename) static int compact_stack(const char *stackdir) { struct reftable_stack *stack = NULL; - struct reftable_write_options cfg = {}; + struct reftable_write_options cfg = { 0 }; int err = reftable_new_stack(&stack, stackdir, cfg); if (err < 0) @@ -94,10 +101,10 @@ static int compact_stack(const char *stackdir) static int dump_stack(const char *stackdir) { struct reftable_stack *stack = NULL; - struct reftable_write_options cfg = {}; - struct reftable_iterator it = { 0 }; - struct reftable_ref_record ref = { 0 }; - struct reftable_log_record log = { 0 }; + struct reftable_write_options cfg = { 0 }; + struct reftable_iterator it = { NULL }; + struct reftable_ref_record ref = { NULL }; + struct reftable_log_record log = { NULL }; struct reftable_merged_table *merged = NULL; int err = reftable_new_stack(&stack, stackdir, cfg); @@ -122,7 +129,7 @@ static int dump_stack(const char *stackdir) reftable_ref_record_print(&ref, hash_id); } reftable_iterator_destroy(&it); - reftable_ref_record_clear(&ref); + reftable_ref_record_release(&ref); err = reftable_merged_table_seek_log(merged, &it, ""); if (err < 0) { @@ -139,7 +146,7 @@ static int dump_stack(const char *stackdir) reftable_log_record_print(&log, hash_id); } reftable_iterator_destroy(&it); - reftable_log_record_clear(&log); + reftable_log_record_release(&log); reftable_stack_destroy(stack); return 0; @@ -160,40 +167,34 @@ static void print_help(void) int reftable_dump_main(int argc, char *const *argv) { int err = 0; - int opt; int opt_dump_table = 0; int opt_dump_stack = 0; int opt_compact = 0; - const char *arg = NULL; - while ((opt = getopt(argc, argv, "2chts")) != -1) { - switch (opt) { - case '2': - hash_id = 0x73323536; + 
const char *arg = NULL, *argv0 = argv[0]; + + for (; argc > 1; argv++, argc--) + if (*argv[1] != '-') break; - case 't': + else if (!strcmp("-2", argv[1])) + hash_id = 0x73323536; + else if (!strcmp("-t", argv[1])) opt_dump_table = 1; - break; - case 's': + else if (!strcmp("-s", argv[1])) opt_dump_stack = 1; - break; - case 'c': + else if (!strcmp("-c", argv[1])) opt_compact = 1; - break; - case '?': - case 'h': + else if (!strcmp("-?", argv[1]) || !strcmp("-h", argv[1])) { print_help(); return 2; - break; } - } - if (argv[optind] == NULL) { + if (argc != 2) { fprintf(stderr, "need argument\n"); print_help(); return 2; } - arg = argv[optind]; + arg = argv[1]; if (opt_dump_table) { err = dump_table(arg); @@ -204,7 +205,7 @@ int reftable_dump_main(int argc, char *const *argv) } if (err < 0) { - fprintf(stderr, "%s: %s: %s\n", argv[0], arg, + fprintf(stderr, "%s: %s: %s\n", argv0, arg, reftable_error_str(err)); return 1; } diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 3b702f4855e..c1ba131c3a4 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -13,3 +13,8 @@ int cmd__reftable(int argc, const char **argv) tree_test_main(argc, argv); return 0; } + +int cmd__dump_reftable(int argc, const char **argv) +{ + return reftable_dump_main(argc, (char *const *)argv); +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index 0208a0a41cf..f064440a319 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -56,6 +56,7 @@ static struct test_cmd cmds[] = { { "read-midx", cmd__read_midx }, { "ref-store", cmd__ref_store }, { "reftable", cmd__reftable }, + { "dump-reftable", cmd__dump_reftable }, { "regex", cmd__regex }, { "repository", cmd__repository }, { "revision-walking", cmd__revision_walking }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index 1de39ce5b58..226af8c6b89 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -18,6 +18,7 @@ int cmd__dump_cache_tree(int argc, const char **argv); int 
cmd__dump_fsmonitor(int argc, const char **argv); int cmd__dump_split_index(int argc, const char **argv); int cmd__dump_untracked_cache(int argc, const char **argv); +int cmd__dump_reftable(int argc, const char **argv); int cmd__example_decorate(int argc, const char **argv); int cmd__fast_rebase(int argc, const char **argv); int cmd__genrandom(int argc, const char **argv);
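
The HEAD handling described in the git-prompt patch (14/15) can be sketched as a standalone shell function. This is only an illustration under stated assumptions, not the code from git-prompt.sh: the function name `current_branch`, its git-directory argument, and the "(detached)" marker are hypothetical. The "ref: " prefix check, the "refs/heads/.invalid" placeholder, and the fallback to 'git symbolic-ref' are taken from the patch.

```shell
# Hypothetical helper, not part of git-prompt.sh: print the branch that the
# HEAD file of the git directory given as $1 points at.
current_branch () {
	local gitdir="$1" head b
	# Read the first line of HEAD with a shell builtin; no fork is needed.
	read -r head <"$gitdir/HEAD" || return 1
	# Strip the "ref: " prefix, if there is one.
	b="${head#ref: }"
	if [ "$head" = "$b" ]; then
		# No prefix was stripped: HEAD holds an object ID (detached HEAD).
		echo "(detached)"
	elif [ "$b" = "refs/heads/.invalid" ]; then
		# Placeholder written by the reftable backend: ask Git itself.
		git --git-dir="$gitdir" symbolic-ref HEAD 2>/dev/null ||
			echo "(detached)"
	else
		# Files backend: the symbolic ref is readable directly.
		echo "$b"
	fi
}
```

With the files backend the function never forks a Git process, which is the property the prompt script wants to keep. Only the reftable placeholder takes the slower path through 'git symbolic-ref'.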