From patchwork Mon Aug 22 15:12:48 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12950776
Date: Mon, 22 Aug 2022 15:12:48 +0000
Subject: [PATCH 1/7] bundle-uri: create bundle_list struct and helpers
To: git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, newren@gmail.com,
 avarab@gmail.com, mjcheetham@outlook.com, steadmon@google.com,
 Derrick Stolee
From: Derrick Stolee

It will likely be rare where a user uses a single bundle URI and expects
that URI to point to a bundle. Instead, that URI will likely be a list of
bundles provided in some format. Alternatively, the Git server could
advertise a list of bundles.

In anticipation of these two ways of advertising multiple bundles, create
a data structure that represents such a list. This will be populated
using a common API, but for now focus on what data can be represented.

Each list contains a number of remote_bundle_info structs. These contain
an 'id' that is used to uniquely identify them in the list, and also a
'uri' that contains the location of its data. Finally, there is a strbuf
containing the filename used when Git downloads the contents to disk.

The list itself stores these remote_bundle_info structs in a hashtable
using 'id' as the key. The order of the structs in the input is
considered unimportant, but future modifications to the format and these
data structures will place ordering possibilities on the set.

The list also has a few "global" properties, including the version (used
when parsing the list) and the mode. The mode is one of these two
options:

1. BUNDLE_MODE_ALL: all listed URIs are intended to be combined
   together. The client should download all of the advertised data to
   have a complete copy of the data.

2. BUNDLE_MODE_ANY: any one listed item is sufficient to have a complete
   copy of the data. The client can choose arbitrarily from these
   options. In the future, the client may use pings to find the closest
   URI among geodistributed replicas, or use some other heuristic
   information added to the format.

This API is currently unused, but will soon be expanded with parsing
logic and then be consumed by the bundle URI download logic.

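To make the shape of this data concrete, here is a minimal standalone sketch. It is not the code from this patch: it assumes a small fixed-size array in place of Git's hashmap, and the names sketch_bundle, sketch_list, sketch_list_set_uri() and sketch_downloads_required() are hypothetical. It only illustrates how 'id' acts as the key and how the two modes change what a client would need to fetch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same values as the enum added by this patch. */
enum bundle_list_mode {
	BUNDLE_MODE_NONE = 0,
	BUNDLE_MODE_ALL,
	BUNDLE_MODE_ANY
};

#define SKETCH_MAX_BUNDLES 16

/* Hypothetical stand-in for remote_bundle_info (no strbuf, no hashmap). */
struct sketch_bundle {
	char id[64];
	char uri[256];
};

/* Hypothetical stand-in for bundle_list. */
struct sketch_list {
	int version;
	enum bundle_list_mode mode;
	struct sketch_bundle items[SKETCH_MAX_BUNDLES];
	size_t nr;
};

static void sketch_list_init(struct sketch_list *list)
{
	memset(list, 0, sizeof(*list));
	/* Implied defaults, as in init_bundle_list(). */
	list->mode = BUNDLE_MODE_ALL;
	list->version = 1;
}

/*
 * Set the URI for 'id', creating the item if it does not exist yet.
 * Returns 0 on success, -1 if the strings or the list do not fit.
 */
static int sketch_list_set_uri(struct sketch_list *list,
			       const char *id, const char *uri)
{
	struct sketch_bundle *b = NULL;
	size_t i;

	assert(list && id && uri);
	if (strlen(id) >= sizeof(list->items[0].id) ||
	    strlen(uri) >= sizeof(list->items[0].uri))
		return -1;

	/* 'id' is the unique key: reuse an existing item if present. */
	for (i = 0; i < list->nr; i++)
		if (!strcmp(list->items[i].id, id))
			b = &list->items[i];

	if (!b) {
		if (list->nr == SKETCH_MAX_BUNDLES)
			return -1;
		b = &list->items[list->nr++];
		strcpy(b->id, id);
	}
	strcpy(b->uri, uri);
	return 0;
}

/* How many items must be fetched to have a complete copy of the data? */
static size_t sketch_downloads_required(const struct sketch_list *list)
{
	switch (list->mode) {
	case BUNDLE_MODE_ALL:
		return list->nr;
	case BUNDLE_MODE_ANY:
		return list->nr ? 1 : 0;
	case BUNDLE_MODE_NONE:
	default:
		return 0;
	}
}
```

With two items in the list, the sketch reports two required downloads in "all" mode and one in "any" mode. Setting the URI for an existing 'id' a second time overwrites it rather than adding an item.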
Signed-off-by: Derrick Stolee
---
 bundle-uri.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++
 bundle-uri.h | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 126 insertions(+)

diff --git a/bundle-uri.c b/bundle-uri.c
index 4a8cc74ed05..ceeef0b6641 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -4,6 +4,67 @@
 #include "object-store.h"
 #include "refs.h"
 #include "run-command.h"
+#include "hashmap.h"
+#include "pkt-line.h"
+
+static int compare_bundles(const void *hashmap_cmp_fn_data,
+			   const struct hashmap_entry *he1,
+			   const struct hashmap_entry *he2,
+			   const void *id)
+{
+	const struct remote_bundle_info *e1 =
+		container_of(he1, const struct remote_bundle_info, ent);
+	const struct remote_bundle_info *e2 =
+		container_of(he2, const struct remote_bundle_info, ent);
+
+	return strcmp(e1->id, id ? (const char *)id : e2->id);
+}
+
+void init_bundle_list(struct bundle_list *list)
+{
+	memset(list, 0, sizeof(*list));
+
+	/* Implied defaults. */
+	list->mode = BUNDLE_MODE_ALL;
+	list->version = 1;
+
+	hashmap_init(&list->bundles, compare_bundles, NULL, 0);
+}
+
+static int clear_remote_bundle_info(struct remote_bundle_info *bundle,
+				    void *data)
+{
+	free(bundle->id);
+	free(bundle->uri);
+	strbuf_release(&bundle->file);
+	return 0;
+}
+
+void clear_bundle_list(struct bundle_list *list)
+{
+	if (!list)
+		return;
+
+	for_all_bundles_in_list(list, clear_remote_bundle_info, NULL);
+	hashmap_clear_and_free(&list->bundles, struct remote_bundle_info, ent);
+}
+
+int for_all_bundles_in_list(struct bundle_list *list,
+			    bundle_iterator iter,
+			    void *data)
+{
+	struct remote_bundle_info *info;
+	struct hashmap_iter i;
+
+	hashmap_for_each_entry(&list->bundles, &i, info, ent) {
+		int result = iter(info, data);
+
+		if (result)
+			return result;
+	}
+
+	return 0;
+}
 
 static int find_temp_filename(struct strbuf *name)
 {
diff --git a/bundle-uri.h b/bundle-uri.h
index 8a152f1ef14..6692aa4b170 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -1,7 +1,72 @@
 #ifndef BUNDLE_URI_H
 #define BUNDLE_URI_H
 
+#include "hashmap.h"
+#include "strbuf.h"
+
 struct repository;
+struct string_list;
+
+/**
+ * The remote_bundle_info struct contains information for a single bundle
+ * URI. This may be initialized simply by a given URI or might have
+ * additional metadata associated with it if the bundle was advertised by
+ * a bundle list.
+ */
+struct remote_bundle_info {
+	struct hashmap_entry ent;
+
+	/**
+	 * The 'id' is a name given to the bundle for reference
+	 * by other bundle infos.
+	 */
+	char *id;
+
+	/**
+	 * The 'uri' is the location of the remote bundle so
+	 * it can be downloaded on-demand. This will be NULL
+	 * if there was no table of contents.
+	 */
+	char *uri;
+
+	/**
+	 * If the bundle has been downloaded, then 'file' is a
+	 * filename storing its contents. Otherwise, 'file' is
+	 * an empty string.
+	 */
+	struct strbuf file;
+};
+
+#define REMOTE_BUNDLE_INFO_INIT { \
+	.file = STRBUF_INIT, \
+}
+
+enum bundle_list_mode {
+	BUNDLE_MODE_NONE = 0,
+	BUNDLE_MODE_ALL,
+	BUNDLE_MODE_ANY
+};
+
+/**
+ * A bundle_list contains an unordered set of remote_bundle_info structs,
+ * as well as information about the bundle listing, such as version and
+ * mode.
+ */
+struct bundle_list {
+	int version;
+	enum bundle_list_mode mode;
+	struct hashmap bundles;
+};
+
+void init_bundle_list(struct bundle_list *list);
+void clear_bundle_list(struct bundle_list *list);
+
+typedef int (*bundle_iterator)(struct remote_bundle_info *bundle,
+			       void *data);
+
+int for_all_bundles_in_list(struct bundle_list *list,
+			    bundle_iterator iter,
+			    void *data);
 
 /**
  * Fetch data from the given 'uri' and unbundle the bundle data found

From patchwork Mon Aug 22 15:12:49 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12950781
Message-Id: <7e4e4656e530395d055abac2a59e93866c9a0de2.1661181174.git.gitgitgadget@gmail.com>
Date: Mon, 22 Aug 2022 15:12:49 +0000
Subject: [PATCH 2/7] bundle-uri: create base key-value pair parsing
To: git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, newren@gmail.com,
 avarab@gmail.com, mjcheetham@outlook.com, steadmon@google.com,
 Derrick Stolee
From: Derrick Stolee

There will be two primary ways to advertise a bundle list: as a list of
packet lines in Git's protocol v2 and as a config file served from a
bundle URI. Both of these fundamentally use a list of key-value pairs.
We will use the same set of key-value pairs across these formats.

Create a new bundle_list_update() method that is currently unused, but
will be used in the next change. It inspects each key to see if it is
understood and then applies it to the given bundle_list. Here are the
keys that we teach Git to understand:

* bundle.version: This value should be an integer. Git currently
  understands only version 1 and will ignore the list if the version is
  any other value. This version can be increased in the future if we
  need to add new keys that Git should not ignore. We can add new
  "heuristic" keys without incrementing the version.

* bundle.mode: This value should be one of "all" or "any". If this mode
  is not understood, then Git will ignore the list. This mode indicates
  whether Git needs all of the bundle list items to make a complete view
  of the content or if any single item is sufficient.

The rest of the keys use a bundle identifier "<id>" as part of the key
name. Keys using the same "<id>" describe a single bundle list item.

* bundle.<id>.uri: This stores the URI of the bundle item. This is
  currently expected to be an absolute URI, but this will be relaxed to
  allow relative URIs in the future.

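The key grammar described above can be sketched in isolation. The following is a standalone illustration, not the bundle_list_update() added by this patch: the enum key_kind and the helper classify_bundle_key() are hypothetical names, and a caller-supplied buffer stands in for Git's strbuf. It assumes, as the patch does, that the first '.' after the "bundle." prefix ends the identifier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a bundle list key. */
enum key_kind {
	KEY_FOREIGN = 0, /* does not start with "bundle." */
	KEY_GLOBAL,      /* "bundle.version", "bundle.mode", ... */
	KEY_ITEM         /* "bundle.<id>.uri", ... */
};

/*
 * Classify 'key'. For KEY_GLOBAL, '*subkey' points at the name after
 * "bundle.". For KEY_ITEM, the <id> is copied into 'id' (truncated to
 * fit 'id_size') and '*subkey' points at the part after "<id>.".
 */
static enum key_kind classify_bundle_key(const char *key,
					 char *id, size_t id_size,
					 const char **subkey)
{
	static const char prefix[] = "bundle.";
	const size_t prefix_len = sizeof(prefix) - 1;
	const char *rest, *dot;
	size_t len;

	assert(key && id && id_size > 0 && subkey);

	if (strncmp(key, prefix, prefix_len))
		return KEY_FOREIGN;

	rest = key + prefix_len;
	dot = strchr(rest, '.');
	if (!dot) {
		*subkey = rest;
		return KEY_GLOBAL;
	}

	/* Everything up to the first '.' is the bundle identifier. */
	len = (size_t)(dot - rest);
	if (len >= id_size)
		len = id_size - 1;
	memcpy(id, rest, len);
	id[len] = '\0';

	*subkey = dot + 1;
	return KEY_ITEM;
}
```

Under these assumptions, "bundle.mode" is a global key with subkey "mode", while "bundle.one.uri" is an item key with identifier "one" and subkey "uri". A key such as "core.bare" is reported as foreign.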
Signed-off-by: Derrick Stolee
---
 Documentation/config.txt        |  2 +
 Documentation/config/bundle.txt | 22 ++++++++++
 bundle-uri.c                    | 74 +++++++++++++++++++++++++++++++++
 3 files changed, 98 insertions(+)
 create mode 100644 Documentation/config/bundle.txt

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e376d547ce0..4280af6992e 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -387,6 +387,8 @@ include::config/branch.txt[]
 
 include::config/browser.txt[]
 
+include::config/bundle.txt[]
+
 include::config/checkout.txt[]
 
 include::config/clean.txt[]
diff --git a/Documentation/config/bundle.txt b/Documentation/config/bundle.txt
new file mode 100644
index 00000000000..3515bfe38d1
--- /dev/null
+++ b/Documentation/config/bundle.txt
@@ -0,0 +1,22 @@
+bundle.*::
+	The `bundle.*` keys are used when communicating a list of bundle URIs.
+	See link:technical/bundle-uri.html[the bundle URI design document] for
+	more details.
+
+bundle.version::
+	This integer value advertises the version of the bundle list format
+	used by the bundle list. Currently, the only accepted value is `1`.
+
+bundle.mode::
+	This string value should be either `all` or `any`. This value describes
+	whether all of the advertised bundles are required to unbundle a
+	complete understanding of the bundled information (`all`) or if any one
+	of the listed bundle URIs is sufficient (`any`).
+
+bundle.<id>.*::
+	The `bundle.<id>.*` keys are used to describe a single item in the
+	bundle list, grouped under `<id>` for identification purposes.
+
+bundle.<id>.uri::
+	This string value defines the URI by which Git can reach the contents
+	of this `<id>`. This URI may be a bundle file or another bundle list.
diff --git a/bundle-uri.c b/bundle-uri.c
index ceeef0b6641..ade7eccce39 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -66,6 +66,80 @@ int for_all_bundles_in_list(struct bundle_list *list,
 	return 0;
 }
 
+/**
+ * Given a key-value pair, update the state of the given bundle list.
+ * Returns 0 if the key-value pair is understood. Returns 1 if the key
+ * is not understood or the value is malformed.
+ */
+MAYBE_UNUSED
+static int bundle_list_update(const char *key, const char *value,
+			      struct bundle_list *list)
+{
+	const char *pkey, *dot;
+	struct strbuf id = STRBUF_INIT;
+	struct remote_bundle_info lookup = REMOTE_BUNDLE_INFO_INIT;
+	struct remote_bundle_info *bundle;
+
+	if (!skip_prefix(key, "bundle.", &pkey))
+		return 1;
+
+	dot = strchr(pkey, '.');
+	if (!dot) {
+		if (!strcmp(pkey, "version")) {
+			int version = atoi(value);
+			if (version != 1)
+				return 1;
+
+			list->version = version;
+			return 0;
+		}
+
+		if (!strcmp(pkey, "mode")) {
+			if (!strcmp(value, "all"))
+				list->mode = BUNDLE_MODE_ALL;
+			else if (!strcmp(value, "any"))
+				list->mode = BUNDLE_MODE_ANY;
+			else
+				return 1;
+			return 0;
+		}
+
+		/* Ignore other unknown global keys. */
+		return 0;
+	}
+
+	strbuf_add(&id, pkey, dot - pkey);
+	dot++;
+
+	/*
+	 * Check for an existing bundle with this <id>, or create one
+	 * if necessary.
+	 */
+	lookup.id = id.buf;
+	hashmap_entry_init(&lookup.ent, strhash(lookup.id));
+	if (!(bundle = hashmap_get_entry(&list->bundles, &lookup, ent, NULL))) {
+		CALLOC_ARRAY(bundle, 1);
+		bundle->id = strbuf_detach(&id, NULL);
+		strbuf_init(&bundle->file, 0);
+		hashmap_entry_init(&bundle->ent, strhash(bundle->id));
+		hashmap_add(&list->bundles, &bundle->ent);
+	}
+	strbuf_release(&id);
+
+	if (!strcmp(dot, "uri")) {
+		free(bundle->uri);
+		bundle->uri = xstrdup(value);
+		return 0;
+	}
+
+	/*
+	 * At this point, we ignore any information that we don't
+	 * understand, assuming it to be hints for a heuristic the client
+	 * does not currently understand.
+	 */
+	return 0;
+}
+
 static int find_temp_filename(struct strbuf *name)
 {
 	int fd;

From patchwork Mon Aug 22 15:12:50 2022
X-Patchwork-Submitter: Ævar Arnfjörð Bjarmason
X-Patchwork-Id: 12950777
Message-Id: <49c4f88b6fd804f0bd5c62d523b45431846f4cee.1661181174.git.gitgitgadget@gmail.com>
Date: Mon, 22 Aug 2022 15:12:50 +0000
Subject: [PATCH 3/7] bundle-uri: create "key=value" line parsing
To: git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, newren@gmail.com,
 avarab@gmail.com, mjcheetham@outlook.com, steadmon@google.com,
 Derrick Stolee, Ævar Arnfjörð Bjarmason
From: Ævar Arnfjörð Bjarmason

When advertising a bundle list over Git's protocol v2, we will use packet
lines. Each line will be of the form "key=value" representing a bundle
list.

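As a rough illustration of the "key=value" form, the sketch below splits one line at its first '='. It is a standalone approximation rather than the function added by this patch: enum split_result and split_key_value() are hypothetical names, and a fixed caller-supplied buffer replaces Git's strbuf, which adds a "key too long" case that the real code does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for the sketch. */
enum split_result {
	SPLIT_OK = 0,
	SPLIT_EMPTY_LINE,   /* the line is "" */
	SPLIT_NO_EQUALS,    /* no '=' in the line */
	SPLIT_EMPTY_PART,   /* "=value" or "key=" */
	SPLIT_KEY_TOO_LONG  /* key does not fit in the caller's buffer */
};

/*
 * Split 'line' at its first '='. On SPLIT_OK the key is copied into
 * 'key' and '*value' points into 'line' just after the '='.
 */
static enum split_result split_key_value(const char *line,
					 char *key, size_t key_size,
					 const char **value)
{
	const char *equals;
	size_t key_len;

	assert(line && key && value);

	if (!*line)
		return SPLIT_EMPTY_LINE;

	equals = strchr(line, '=');
	if (!equals)
		return SPLIT_NO_EQUALS;
	if (equals == line || !equals[1])
		return SPLIT_EMPTY_PART;

	key_len = (size_t)(equals - line);
	if (key_len >= key_size)
		return SPLIT_KEY_TOO_LONG;

	memcpy(key, line, key_len);
	key[key_len] = '\0';
	*value = equals + 1;
	return SPLIT_OK;
}
```

Because only the first '=' is significant, a value may itself contain '=' characters, which matters for URIs that carry a query string.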
Connect the API necessary for Git's transport to the key-value pair
parsing created in the previous change.

Co-authored-by: Derrick Stolee
Signed-off-by: Ævar Arnfjörð Bjarmason
Signed-off-by: Derrick Stolee
---
 bundle-uri.c | 27 ++++++++++++++++++++++++++-
 bundle-uri.h | 14 +++++++++++++-
 2 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index ade7eccce39..9a7d09349fe 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -71,7 +71,6 @@ int for_all_bundles_in_list(struct bundle_list *list,
  * Returns 0 if the key-value pair is understood. Returns 1 if the key
  * is not understood or the value is malformed.
  */
-MAYBE_UNUSED
 static int bundle_list_update(const char *key, const char *value,
 			      struct bundle_list *list)
 {
@@ -301,3 +300,29 @@ cleanup:
 	strbuf_release(&filename);
 	return result;
 }
+
+/**
+ * General API for {transport,connect}.c etc.
+ */
+int bundle_uri_parse_line(struct bundle_list *list, const char *line)
+{
+	int result;
+	const char *equals;
+	struct strbuf key = STRBUF_INIT;
+
+	if (!strlen(line))
+		return error(_("bundle-uri: got an empty line"));
+
+	equals = strchr(line, '=');
+
+	if (!equals)
+		return error(_("bundle-uri: line is not of the form 'key=value'"));
+	if (line == equals || !*(equals + 1))
+		return error(_("bundle-uri: line has empty key or value"));
+
+	strbuf_add(&key, line, equals - line);
+	result = bundle_list_update(key.buf, equals + 1, list);
+	strbuf_release(&key);
+
+	return result;
+}
diff --git a/bundle-uri.h b/bundle-uri.h
index 6692aa4b170..f725c9796f7 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -76,4 +76,16 @@ int for_all_bundles_in_list(struct bundle_list *list,
  */
 int fetch_bundle_uri(struct repository *r, const char *uri);
 
-#endif
+/**
+ * General API for {transport,connect}.c etc.
+ */
+
+/**
+ * Parse a "key=value" packet line from the bundle-uri verb.
+ *
+ * Returns 0 on success and non-zero on error.
+ */
+int bundle_uri_parse_line(struct bundle_list *list,
+			  const char *line);
+
+#endif /* BUNDLE_URI_H */

From patchwork Mon Aug 22 15:12:51 2022
X-Patchwork-Submitter: Ævar Arnfjörð Bjarmason
X-Patchwork-Id: 12950779
Message-Id: <7580e1f09aff2acdf7a7cb86bf8dc7e294cffd33.1661181174.git.gitgitgadget@gmail.com>
Date: Mon, 22 Aug 2022 15:12:51 +0000
Subject: [PATCH 4/7] bundle-uri: unit test "key=value" parsing
To: git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, newren@gmail.com,
 avarab@gmail.com, mjcheetham@outlook.com, steadmon@google.com,
 Derrick Stolee, Ævar Arnfjörð Bjarmason
From: Ævar Arnfjörð Bjarmason

Create a new 'test-tool bundle-uri' test helper. This helper will assist
in testing logic deep in the bundle URI feature.

This change introduces the 'parse-key-values' subcommand, which parses
stdin as a list of lines. These are fed into bundle_uri_parse_line() to
test how we construct a 'struct bundle_list' from that data. The list is
then output to stdout as if the key-value pairs were a Git config file.

Co-authored-by: Derrick Stolee
Signed-off-by: Ævar Arnfjörð Bjarmason
Signed-off-by: Derrick Stolee
---
 Makefile                    |  1 +
 bundle-uri.c                | 33 ++++++++++++++
 bundle-uri.h                |  3 ++
 t/helper/test-bundle-uri.c  | 63 +++++++++++++++++++++++++
 t/helper/test-tool.c        |  1 +
 t/helper/test-tool.h        |  1 +
 t/t5750-bundle-uri-parse.sh | 91 +++++++++++++++++++++++++++++++++++++
 t/test-lib-functions.sh     | 11 +++++
 8 files changed, 204 insertions(+)
 create mode 100644 t/helper/test-bundle-uri.c
 create mode 100755 t/t5750-bundle-uri-parse.sh

diff --git a/Makefile b/Makefile
index 7d5f48069ea..7dee0329c49 100644
--- a/Makefile
+++ b/Makefile
@@ -722,6 +722,7 @@ PROGRAMS += $(patsubst %.o,git-%$X,$(PROGRAM_OBJS))
 TEST_BUILTINS_OBJS += test-advise.o
 TEST_BUILTINS_OBJS += test-bitmap.o
 TEST_BUILTINS_OBJS += test-bloom.o
+TEST_BUILTINS_OBJS += test-bundle-uri.o
 TEST_BUILTINS_OBJS += test-chmtime.o
 TEST_BUILTINS_OBJS += test-config.o
 TEST_BUILTINS_OBJS += test-crontab.o
diff --git a/bundle-uri.c b/bundle-uri.c
index 9a7d09349fe..d56c5e33d5f 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -66,6 +66,39 @@ int for_all_bundles_in_list(struct bundle_list *list,
 	return 0;
 }
 
+static int summarize_bundle(struct remote_bundle_info *info, void *data)
+{
+	FILE *fp = data;
+	fprintf(fp, "[bundle \"%s\"]\n", info->id);
+	fprintf(fp, "\turi = %s\n", info->uri);
+	return 0;
+}
+
+void print_bundle_list(FILE *fp, struct bundle_list *list)
+{
+	const char *mode;
+
+	switch (list->mode) {
+	case BUNDLE_MODE_ALL:
+		mode = "all";
+		break;
+
+	case BUNDLE_MODE_ANY:
+		mode = "any";
+		break;
+
+	case BUNDLE_MODE_NONE:
+	default:
+		mode = "";
+	}
+
+	fprintf(fp, "[bundle]\n");
+	fprintf(fp, "\tversion = %d\n", list->version);
+	fprintf(fp, "\tmode = %s\n", mode);
+
+	for_all_bundles_in_list(list, summarize_bundle, fp);
+}
+
 /**
  * Given a key-value pair, update the state of the given bundle list.
  * Returns 0 if the key-value pair is understood. Returns 1 if the key
diff --git a/bundle-uri.h b/bundle-uri.h
index f725c9796f7..41a1510a4ac 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -68,6 +68,9 @@ int for_all_bundles_in_list(struct bundle_list *list,
 			    bundle_iterator iter,
 			    void *data);
 
+struct FILE;
+void print_bundle_list(FILE *fp, struct bundle_list *list);
+
 /**
  * Fetch data from the given 'uri' and unbundle the bundle data found
  * based on that information.
diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
new file mode 100644
index 00000000000..5cb0c9196fa
--- /dev/null
+++ b/t/helper/test-bundle-uri.c
@@ -0,0 +1,63 @@
+#include "test-tool.h"
+#include "parse-options.h"
+#include "bundle-uri.h"
+#include "strbuf.h"
+#include "string-list.h"
+
+static int cmd__bundle_uri_parse_key_values(int argc, const char **argv)
+{
+	const char *usage[] = {
+		"test-tool bundle-uri parse-key-values []",
+		NULL
+	};
+	struct option options[] = {
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, NULL, options, usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION |
+			     PARSE_OPT_KEEP_ARGV0);
+	if (argc == 1)
+		goto usage;
+
+	if (!strcmp(argv[1], "parse-key-values"))
+		return cmd__bundle_uri_parse_key_values(argc - 1, argv + 1);
+	error("there is no test-tool bundle-uri tool '%s'", argv[1]);
+
+usage:
+	usage_with_options(usage, options);
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 318fdbab0c3..fbe2d9d8108 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -17,6 +17,7 @@ static struct test_cmd cmds[] = {
 	{ "advise", cmd__advise_if_enabled },
 	{ "bitmap", cmd__bitmap },
 	{ "bloom", cmd__bloom },
+	{ "bundle-uri", cmd__bundle_uri },
 	{ "chmtime", cmd__chmtime },
 	{ "config", cmd__config },
 	{ "crontab", cmd__crontab },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index bb799271631..b2aa1f39a8f 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -7,6 +7,7 @@
 int cmd__advise_if_enabled(int argc, const char **argv);
 int cmd__bitmap(int argc, const char **argv);
 int cmd__bloom(int argc, const char **argv);
+int cmd__bundle_uri(int argc, const char **argv);
 int cmd__chmtime(int argc, const char **argv);
 int cmd__config(int argc, const char **argv);
 int cmd__crontab(int argc, const char **argv);
diff --git a/t/t5750-bundle-uri-parse.sh b/t/t5750-bundle-uri-parse.sh
new file mode 100755
index 00000000000..675c1f1d2f4
--- /dev/null
+++ b/t/t5750-bundle-uri-parse.sh
@@ -0,0 +1,91 @@
+#!/bin/sh
+
+test_description="Test bundle-uri bundle_uri_parse_line()"
+
+TEST_NO_CREATE_REPO=1
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_expect_success 'bundle_uri_parse_line() just URIs' '
+	cat >in <<-\EOF &&
+	bundle.one.uri=http://example.com/bundle.bdl
+	bundle.two.uri=https://example.com/bundle.bdl
+	bundle.three.uri=file:///usr/share/git/bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = http://example.com/bundle.bdl
+	[bundle "two"]
+		uri = https://example.com/bundle.bdl
+	[bundle "three"]
+		uri = file:///usr/share/git/bundle.bdl
+	EOF
+
+	test-tool bundle-uri parse-key-values <in >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp_config_output expect actual
+'
+
+test_expect_success 'bundle_uri_parse_line() parsing edge cases: empty key or value' '
+	cat >in <<-\EOF &&
+	=bogus-value
+	bogus-key=
+	EOF
+
+	cat >err.expect <<-EOF &&
+	error: bundle-uri: line has empty key or value
+	error: bad line: '\''=bogus-value'\''
+	error: bundle-uri: line has empty key or value
+	error: bad line: '\''bogus-key='\''
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	EOF
+
+	test_must_fail test-tool bundle-uri parse-key-values <in >actual 2>err &&
+	test_cmp err.expect err &&
+	test_cmp_config_output expect actual
+'
+
+test_expect_success 'bundle_uri_parse_line() parsing edge cases: empty lines' '
+	cat >in <<-\EOF &&
+	bundle.one.uri=http://example.com/bundle.bdl
+
+	bundle.two.uri=https://example.com/bundle.bdl
+
+	bundle.three.uri=file:///usr/share/git/bundle.bdl
+	EOF
+
+	cat >err.expect <<-\EOF &&
+	error: bundle-uri: got an empty line
+	error: bad line: '\'''\''
+	error: bundle-uri: got an empty line
+	error: bad line: '\'''\''
+	EOF
+
+	# We fail, but try to continue parsing regardless
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = http://example.com/bundle.bdl
+	[bundle "two"]
+		uri = https://example.com/bundle.bdl
+	[bundle "three"]
+		uri = file:///usr/share/git/bundle.bdl
+	EOF
+
+	test_must_fail test-tool bundle-uri parse-key-values <in >actual 2>err &&
+	test_cmp err.expect err &&
+	test_cmp_config_output expect actual
+'
+
+test_done
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 6da7273f1d5..3175d665add 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1956,3 +1956,14 @@ test_is_magic_mtime () {
 	rm -f .git/test-mtime-actual
 	return $ret
 }
+
+# Given two filenames, parse both using 'git config --list --file'
+# and compare the sorted output of those commands. Useful when
+# wanting to ignore whitespace differences and sorting concerns.
+test_cmp_config_output () {
+	git config --list --file="$1" >config-expect &&
+	git config --list --file="$2" >config-actual &&
+	sort config-expect >sorted-expect &&
+	sort config-actual >sorted-actual &&
+	test_cmp sorted-expect sorted-actual
+}

From patchwork Mon Aug 22 15:12:52 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12950780
Message-Id: <1d1bd9c710327b4d705cfede017771da7fb6ec52.1661181174.git.gitgitgadget@gmail.com>
Date: Mon, 22 Aug 2022 15:12:52 +0000
Subject: [PATCH 5/7] bundle-uri: parse bundle list in config format
Fcc: Sent
To: git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, newren@gmail.com, avarab@gmail.com, mjcheetham@outlook.com, steadmon@google.com, Derrick Stolee, Derrick Stolee
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick Stolee

When a bundle provider wants to operate independently from a Git remote,
they want to provide a single, consistent URI that users can use in
their 'git clone --bundle-uri' commands. At this point, the Git client
expects that URI to be a single bundle that can be unbundled and used to
bootstrap the rest of the clone from the Git server. This single bundle
cannot be re-used to assist with future incremental fetches.

To allow for the incremental fetch case, teach Git to understand a
bundle list that could be advertised at an independent bundle URI. Such
a bundle list is likely to be inspected by human readers, even if only
by the bundle provider creating the list. For this reason, we can take
our expected "key=value" pairs and instead format them using Git config
format.

Create parse_bundle_list_in_config_format() to parse a file in config
format and convert that into a 'struct bundle_list' filled with its
understanding of the contents.

Be careful to call git_config_from_file_with_options() because the
default action for git_config_from_file() is to die() on a parsing
error. The current warning isn't particularly helpful if it reaches a
user, but it will be made more verbose at a higher layer later.

Update 'test-tool bundle-uri' to take this config file format as input.
It uses a filename instead of stdin because there is no existing way to
parse a FILE pointer in the config machinery. Using
git_config_from_mem() is overly complicated and more likely to introduce
bugs than this simpler version. I would rather have a slightly confusing
test helper than complicated product code.
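
To illustrate the mapping between the "key=value" form and the config
form described above, here is a minimal standalone sketch. It is not
code from this series: split_bundle_key() is a hypothetical helper that
assumes per-bundle keys have the shape "bundle.<id>.<subkey>", which is
how Git flattens a '[bundle "<id>"]' section. List-level keys such as
"bundle.mode" are deliberately rejected by this helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Git: split "bundle.<id>.<subkey>"
 * into its <id> and <subkey> parts. Returns 0 on success, or -1 if the
 * key does not have that shape or an output buffer is too small.
 */
static int split_bundle_key(const char *key,
			    char *id, size_t id_size,
			    char *subkey, size_t subkey_size)
{
	static const char prefix[] = "bundle.";
	const char *rest, *last_dot;
	size_t id_len, subkey_len;

	assert(key && id && subkey);

	if (strncmp(key, prefix, sizeof(prefix) - 1))
		return -1;
	rest = key + sizeof(prefix) - 1;

	/* The subkey follows the last dot, so an <id> may contain dots. */
	last_dot = strrchr(rest, '.');
	if (!last_dot || last_dot == rest || !last_dot[1])
		return -1;

	id_len = (size_t)(last_dot - rest);
	subkey_len = strlen(last_dot + 1);
	if (id_len >= id_size || subkey_len >= subkey_size)
		return -1;

	memcpy(id, rest, id_len);
	id[id_len] = '\0';
	memcpy(subkey, last_dot + 1, subkey_len + 1);
	return 0;
}
```

With this sketch, "bundle.one.uri" yields the id "one" and the subkey
"uri", while "bundle.mode" is reported as not being a per-bundle key.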
Signed-off-by: Derrick Stolee
---
 bundle-uri.c                | 29 +++++++++++++++++++++
 bundle-uri.h                | 10 ++++++++
 t/helper/test-bundle-uri.c  | 45 ++++++++++++++++++++++++++-------
 t/t5750-bundle-uri-parse.sh | 50 +++++++++++++++++++++++++++++++++++++
 4 files changed, 125 insertions(+), 9 deletions(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index d56c5e33d5f..dca88ed1e89 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -6,6 +6,7 @@
 #include "run-command.h"
 #include "hashmap.h"
 #include "pkt-line.h"
+#include "config.h"
 
 static int compare_bundles(const void *hashmap_cmp_fn_data,
 			   const struct hashmap_entry *he1,
@@ -172,6 +173,34 @@ static int bundle_list_update(const char *key, const char *value,
 	return 0;
 }
 
+static int config_to_bundle_list(const char *key, const char *value, void *data)
+{
+	struct bundle_list *list = data;
+	return bundle_list_update(key, value, list);
+}
+
+int parse_bundle_list_in_config_format(const char *uri,
+				       const char *filename,
+				       struct bundle_list *list)
+{
+	int result;
+	struct config_options opts = {
+		.error_action = CONFIG_ERROR_ERROR,
+	};
+
+	list->mode = BUNDLE_MODE_NONE;
+	result = git_config_from_file_with_options(config_to_bundle_list,
+						   filename, list,
+						   &opts);
+
+	if (!result && list->mode == BUNDLE_MODE_NONE) {
+		warning(_("bundle list at '%s' has no mode"), uri);
+		result = 1;
+	}
+
+	return result;
+}
+
 static int find_temp_filename(struct strbuf *name)
 {
 	int fd;
diff --git a/bundle-uri.h b/bundle-uri.h
index 41a1510a4ac..294ac804140 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -71,6 +71,16 @@ int for_all_bundles_in_list(struct bundle_list *list,
 struct FILE;
 void print_bundle_list(FILE *fp, struct bundle_list *list);
 
+/**
+ * A bundle URI may point to a bundle list where the key=value
+ * pairs are provided in config file format. This method is
+ * exposed publicly for testing purposes.
+ */
+
+int parse_bundle_list_in_config_format(const char *uri,
+				       const char *filename,
+				       struct bundle_list *list);
+
 /**
  * Fetch data from the given 'uri' and unbundle the bundle data found
  * based on that information.
diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
index 5cb0c9196fa..23ce0eebca3 100644
--- a/t/helper/test-bundle-uri.c
+++ b/t/helper/test-bundle-uri.c
@@ -4,27 +4,52 @@
 #include "strbuf.h"
 #include "string-list.h"
 
-static int cmd__bundle_uri_parse_key_values(int argc, const char **argv)
+enum input_mode {
+	KEY_VALUE_PAIRS,
+	CONFIG_FILE,
+};
+
+static int cmd__bundle_uri_parse(int argc, const char **argv, enum input_mode mode)
 {
-	const char *usage[] = {
+	const char *key_value_usage[] = {
 		"test-tool bundle-uri parse-key-values <input>",
 		NULL
 	};
+	const char *config_usage[] = {
+		"test-tool bundle-uri parse-config <input>",
+		NULL
+	};
 	struct option options[] = {
 		OPT_END(),
 	};
+	const char **usage = key_value_usage;
 	struct strbuf sb = STRBUF_INIT;
 	struct bundle_list list;
 	int err = 0;
 
-	argc = parse_options(argc, argv, NULL, options, usage, 0);
-	if (argc)
-		goto usage;
+	if (mode == CONFIG_FILE)
+		usage = config_usage;
+
+	argc = parse_options(argc, argv, NULL, options, usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
 
 	init_bundle_list(&list);
-	while (strbuf_getline(&sb, stdin) != EOF) {
-		if (bundle_uri_parse_line(&list, sb.buf) < 0)
-			err = error("bad line: '%s'", sb.buf);
+
+	switch (mode) {
+	case KEY_VALUE_PAIRS:
+		if (argc)
+			goto usage;
+		while (strbuf_getline(&sb, stdin) != EOF) {
+			if (bundle_uri_parse_line(&list, sb.buf) < 0)
+				err = error("bad line: '%s'", sb.buf);
+		}
+		break;
+
+	case CONFIG_FILE:
+		if (argc != 1)
+			goto usage;
+		err = parse_bundle_list_in_config_format("", argv[0], &list);
+		break;
 	}
 	strbuf_release(&sb);
 
@@ -55,7 +80,9 @@ int cmd__bundle_uri(int argc, const char **argv)
 		goto usage;
 
 	if (!strcmp(argv[1], "parse-key-values"))
-		return cmd__bundle_uri_parse_key_values(argc - 1, argv + 1);
+		return cmd__bundle_uri_parse(argc - 1, argv + 1, KEY_VALUE_PAIRS);
+	if (!strcmp(argv[1], "parse-config"))
+		return cmd__bundle_uri_parse(argc - 1, argv + 1, CONFIG_FILE);
 	error("there is no test-tool bundle-uri tool '%s'", argv[1]);
 
 usage:
diff --git a/t/t5750-bundle-uri-parse.sh b/t/t5750-bundle-uri-parse.sh
index 675c1f1d2f4..dd9dc36bfd7 100755
--- a/t/t5750-bundle-uri-parse.sh
+++ b/t/t5750-bundle-uri-parse.sh
@@ -88,4 +88,54 @@ test_expect_success 'bundle_uri_parse_line() parsing edge cases: empty lines' '
 	test_cmp_config_output expect actual
 '
 
+test_expect_success 'parse config format: just URIs' '
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = http://example.com/bundle.bdl
+	[bundle "two"]
+		uri = https://example.com/bundle.bdl
+	[bundle "three"]
+		uri = file:///usr/share/git/bundle.bdl
+	EOF
+
+	test-tool bundle-uri parse-config expect >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp_config_output expect actual
+'
+
+test_expect_success 'parse config format edge cases: empty key or value' '
+	cat >in1 <<-\EOF &&
+	= bogus-value
+	EOF
+
+	cat >err1 <<-EOF &&
+	error: bad config line 1 in file in1
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = 
+	EOF
+
+	test_must_fail test-tool bundle-uri parse-config in1 >actual 2>err &&
+	test_cmp err1 err &&
+	test_cmp_config_output expect actual &&
+
+	cat >in2 <<-\EOF &&
+	bogus-key =
+	EOF
+
+	cat >err2 <<-EOF &&
+	warning: bundle list at '\'''\'' has no mode
+	EOF
+
+	test_must_fail test-tool bundle-uri parse-config in2 >actual 2>err &&
+	test_cmp err2 err &&
+	test_cmp_config_output expect actual
+'
+
 test_done

From patchwork Mon Aug 22 15:12:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12950778
Message-Id: <039e172849c6b028df1abf258666c77bd42b23fe.1661181174.git.gitgitgadget@gmail.com>
Date: Mon, 22 Aug 2022 15:12:53 +0000
Subject: [PATCH 6/7] bundle-uri: limit recursion depth for bundle lists
Fcc: Sent
To: git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, newren@gmail.com, avarab@gmail.com, mjcheetham@outlook.com, steadmon@google.com, Derrick Stolee, Derrick Stolee
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick Stolee

The next change will start allowing us to parse bundle lists that are
downloaded from a provided bundle URI. Those lists might point to other
lists, which could proceed to an arbitrary depth (and even create
cycles). Restructure fetch_bundle_uri() to have an internal version that
takes a recursion depth. Compare that to a new max_bundle_uri_depth
constant that is twice as high as we expect this depth to be for any
legitimate use of bundle list linking.

We can consider making max_bundle_uri_depth a configurable value if
there is demonstrated value in the future.
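
As a rough illustration of why a depth limit is enough to handle cycles,
consider the following standalone sketch. It is not the code in this
patch: struct node and visit() are hypothetical stand-ins for a bundle
list that links to other lists, and MAX_DEPTH simply mirrors the value
chosen for max_bundle_uri_depth.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 4 /* mirrors max_bundle_uri_depth; an assumption here */

/* Hypothetical stand-in for a bundle list that links to other lists. */
struct node {
	const struct node *children[4];
	size_t nr;
};

/*
 * Count the nodes reached by following links. Returns -1 as soon as the
 * depth limit is hit, so a cycle terminates instead of recursing forever.
 */
static int visit(const struct node *n, int depth)
{
	int total = 1;
	size_t i;

	assert(n);
	if (depth >= MAX_DEPTH)
		return -1;

	for (i = 0; i < n->nr; i++) {
		int sub = visit(n->children[i], depth + 1);
		if (sub < 0)
			return -1;
		total += sub;
	}
	return total;
}
```

No visited-set is needed in this model: a node that links to itself is
rejected after four levels, at the cost of also rejecting a legitimate
chain that happens to be deeper than the limit.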
Signed-off-by: Derrick Stolee
---
 bundle-uri.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index dca88ed1e89..c9f3df28b2f 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -334,11 +334,25 @@ static int unbundle_from_file(struct repository *r, const char *file)
 	return result;
 }
 
-int fetch_bundle_uri(struct repository *r, const char *uri)
+/**
+ * This limits the recursion on fetch_bundle_uri_internal() when following
+ * bundle lists.
+ */
+static int max_bundle_uri_depth = 4;
+
+static int fetch_bundle_uri_internal(struct repository *r,
+				     const char *uri,
+				     int depth)
 {
 	int result = 0;
 	struct strbuf filename = STRBUF_INIT;
 
+	if (depth >= max_bundle_uri_depth) {
+		warning(_("exceeded bundle URI recursion limit (%d)"),
+			max_bundle_uri_depth);
+		return -1;
+	}
+
 	if ((result = find_temp_filename(&filename)))
 		goto cleanup;
 
@@ -363,6 +377,11 @@ cleanup:
 	return result;
 }
 
+int fetch_bundle_uri(struct repository *r, const char *uri)
+{
+	return fetch_bundle_uri_internal(r, uri, 0);
+}
+
 /**
  * General API for {transport,connect}.c etc.
  */

From patchwork Mon Aug 22 15:12:54 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12950782
Message-Id: <7b45c06cc9e0294311d9f00d40eb1fa4f8f146f9.1661181174.git.gitgitgadget@gmail.com>
Date: Mon, 22 Aug 2022 15:12:54 +0000
Subject: [PATCH 7/7] bundle-uri: fetch a list of bundles
Fcc: Sent
To: git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, newren@gmail.com, avarab@gmail.com, mjcheetham@outlook.com, steadmon@google.com, Derrick Stolee, Derrick Stolee
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick Stolee

When the content at a given bundle URI is not understood as a bundle
(based on inspecting the initial content), then Git currently gives up
and ignores that content. Independent bundle providers may want to split
up the bundle content into multiple bundles, but still make them
available from a single URI.

Teach Git to attempt parsing the bundle URI content as a Git config file
providing the key=value pairs for a bundle list. Git then looks at the
mode of the list to see if ANY single bundle is sufficient or if ALL
bundles are required. The content at the selected URIs is downloaded and
inspected again, creating a recursive process.

To guard the recursion against malformed or malicious content, limit the
recursion depth to a reasonable four for now. This can be converted to a
configured value in the future if necessary. The value of four is twice
as high as expected to be useful (a bundle list is unlikely to point to
more bundle lists).

To test this scenario, create an interesting bundle topology where three
incremental bundles are built on top of a single full bundle. By using a
merge commit, the two middle bundles are "independent" in that they do
not require each other in order to unbundle themselves. They each only
need the base bundle. The bundle containing the merge commit requires
both of the middle bundles, though. This leads to some interesting
decisions when unbundling, especially when we later implement heuristics
that promote downloading bundles until the prerequisite commits are
satisfied.
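
The ordering problem in the topology above can be sketched with a small
standalone model. This is only an illustration under assumptions, not
the code in this patch: struct toy_bundle and unbundle_until_stable()
are hypothetical, and prerequisites are modeled as explicit indices into
the list rather than discovered by attempting to unbundle a file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: each bundle lists the indices of bundles it requires. */
struct toy_bundle {
	int prereq[2];
	int nr_prereq;
	int unbundled;
};

/*
 * Repeatedly sweep the list, applying any bundle whose prerequisites are
 * already applied, until a full sweep makes no progress. Returns the
 * number of bundles applied by this call.
 */
static int unbundle_until_stable(struct toy_bundle *b, size_t nr)
{
	int applied = 0, progress = 1;

	assert(b || !nr);
	while (progress) {
		size_t i;

		progress = 0;
		for (i = 0; i < nr; i++) {
			int j, ready = 1;

			if (b[i].unbundled)
				continue;
			for (j = 0; j < b[i].nr_prereq; j++)
				if (!b[b[i].prereq[j]].unbundled)
					ready = 0;
			if (ready) {
				b[i].unbundled = 1;
				applied++;
				progress = 1;
			}
		}
	}
	return applied;
}
```

Even if the list is visited in the least helpful order (merge bundle
first, base bundle last), every bundle is eventually applied because
each sweep unlocks the bundles that depend on the previous sweep's
successes. A bundle whose prerequisites never arrive is simply left
unapplied once a sweep makes no progress.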
Signed-off-by: Derrick Stolee
---
 bundle-uri.c                | 211 +++++++++++++++++++++++++++++++++---
 bundle-uri.h                |   6 +
 t/t5558-clone-bundle-uri.sh |  93 ++++++++++++++++
 3 files changed, 293 insertions(+), 17 deletions(-)

diff --git a/bundle-uri.c b/bundle-uri.c
index c9f3df28b2f..37867afca27 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -35,9 +35,10 @@ void init_bundle_list(struct bundle_list *list)
 static int clear_remote_bundle_info(struct remote_bundle_info *bundle,
 				    void *data)
 {
-	free(bundle->id);
-	free(bundle->uri);
+	FREE_AND_NULL(bundle->id);
+	FREE_AND_NULL(bundle->uri);
 	strbuf_release(&bundle->file);
+	bundle->unbundled = 0;
 	return 0;
 }
 
@@ -334,18 +335,102 @@ static int unbundle_from_file(struct repository *r, const char *file)
 	return result;
 }
 
+struct bundle_list_context {
+	struct repository *r;
+	struct bundle_list *list;
+	enum bundle_list_mode mode;
+	int count;
+	int depth;
+};
+
+/*
+ * This early definition is necessary because we use indirect recursion:
+ *
+ * While iterating through a bundle list that was downloaded as part
+ * of fetch_bundle_uri_internal(), iterator methods eventually call it
+ * again, but with depth + 1.
+ */
+static int fetch_bundle_uri_internal(struct repository *r,
+				     struct remote_bundle_info *bundle,
+				     int depth,
+				     struct bundle_list *list);
+
+static int download_bundle_to_file(struct remote_bundle_info *bundle, void *data)
+{
+	struct bundle_list_context *ctx = data;
+
+	if (ctx->mode == BUNDLE_MODE_ANY && ctx->count)
+		return 0;
+
+	ctx->count++;
+	return fetch_bundle_uri_internal(ctx->r, bundle, ctx->depth + 1, ctx->list);
+}
+
+static int download_bundle_list(struct repository *r,
+				struct bundle_list *local_list,
+				struct bundle_list *global_list,
+				int depth)
+{
+	struct bundle_list_context ctx = {
+		.r = r,
+		.list = global_list,
+		.depth = depth + 1,
+		.mode = local_list->mode,
+	};
+
+	return for_all_bundles_in_list(local_list, download_bundle_to_file, &ctx);
+}
+
+static int fetch_bundle_list_in_config_format(struct repository *r,
+					      struct bundle_list *global_list,
+					      struct remote_bundle_info *bundle,
+					      int depth)
+{
+	int result;
+	struct bundle_list list_from_bundle;
+
+	init_bundle_list(&list_from_bundle);
+
+	if ((result = parse_bundle_list_in_config_format(bundle->uri,
+							 bundle->file.buf,
+							 &list_from_bundle)))
+		goto cleanup;
+
+	if (list_from_bundle.mode == BUNDLE_MODE_NONE) {
+		warning(_("unrecognized bundle mode from URI '%s'"),
+			bundle->uri);
+		result = -1;
+		goto cleanup;
+	}
+
+	if ((result = download_bundle_list(r, &list_from_bundle,
+					   global_list, depth)))
+		goto cleanup;
+
+cleanup:
+	clear_bundle_list(&list_from_bundle);
+	return result;
+}
+
 /**
  * This limits the recursion on fetch_bundle_uri_internal() when following
  * bundle lists.
  */
 static int max_bundle_uri_depth = 4;
 
+/**
+ * Recursively download all bundles advertised at the given URI
+ * to files. If the file is a bundle, then add it to the given
+ * 'list'. Otherwise, expect a bundle list and recurse on the
+ * URIs in that list according to the list mode (ANY or ALL).
+ */
 static int fetch_bundle_uri_internal(struct repository *r,
-				     const char *uri,
-				     int depth)
+				     struct remote_bundle_info *bundle,
+				     int depth,
+				     struct bundle_list *list)
 {
 	int result = 0;
-	struct strbuf filename = STRBUF_INIT;
+	struct remote_bundle_info *bcopy;
 
 	if (depth >= max_bundle_uri_depth) {
 		warning(_("exceeded bundle URI recursion limit (%d)"),
@@ -353,33 +438,125 @@ static int fetch_bundle_uri_internal(struct repository *r,
 		return -1;
 	}
 
-	if ((result = find_temp_filename(&filename)))
+	if (!bundle->file.len &&
+	    (result = find_temp_filename(&bundle->file)))
 		goto cleanup;
 
-	if ((result = copy_uri_to_file(filename.buf, uri))) {
-		warning(_("failed to download bundle from URI '%s'"), uri);
+	if ((result = copy_uri_to_file(bundle->file.buf, bundle->uri))) {
+		warning(_("failed to download bundle from URI '%s'"), bundle->uri);
 		goto cleanup;
 	}
 
-	if ((result = !is_bundle(filename.buf, 0))) {
-		warning(_("file at URI '%s' is not a bundle"), uri);
+	if ((result = !is_bundle(bundle->file.buf, 1))) {
+		result = fetch_bundle_list_in_config_format(
+				r, list, bundle, depth);
+		if (result)
+			warning(_("file at URI '%s' is not a bundle or bundle list"),
+				bundle->uri);
 		goto cleanup;
 	}
 
-	if ((result = unbundle_from_file(r, filename.buf))) {
-		warning(_("failed to unbundle bundle from URI '%s'"), uri);
-		goto cleanup;
-	}
+	/* Copy the bundle and insert it into the global list. */
+	CALLOC_ARRAY(bcopy, 1);
+	bcopy->id = xstrdup(bundle->id);
+	strbuf_init(&bcopy->file, 0);
+	strbuf_add(&bcopy->file, bundle->file.buf, bundle->file.len);
+	hashmap_entry_init(&bcopy->ent, strhash(bcopy->id));
+	hashmap_add(&list->bundles, &bcopy->ent);
 
 cleanup:
-	unlink(filename.buf);
-	strbuf_release(&filename);
+	if (result)
+		unlink(bundle->file.buf);
 	return result;
 }
 
+struct attempt_unbundle_context {
+	struct repository *r;
+	int success_count;
+	int failure_count;
+};
+
+static int attempt_unbundle(struct remote_bundle_info *info, void *data)
+{
+	struct attempt_unbundle_context *ctx = data;
+
+	if (info->unbundled || !unbundle_from_file(ctx->r, info->file.buf)) {
+		ctx->success_count++;
+		info->unbundled = 1;
+	} else {
+		ctx->failure_count++;
+	}
+
+	return 0;
+}
+
+static int unbundle_all_bundles(struct repository *r,
+				struct bundle_list *list)
+{
+	int last_success_count = -1;
+	struct attempt_unbundle_context ctx = {
+		.r = r,
+	};
+
+	/*
+	 * Iterate through all bundles looking for ones that can
+	 * successfully unbundle. If any succeed, then perhaps another
+	 * will succeed in the next attempt.
+	 */
+	while (last_success_count < ctx.success_count) {
+		last_success_count = ctx.success_count;
+
+		ctx.success_count = 0;
+		ctx.failure_count = 0;
+		for_all_bundles_in_list(list, attempt_unbundle, &ctx);
+	}
+
+	if (ctx.success_count)
+		git_config_set_multivar_gently("log.excludedecoration",
+					       "refs/bundle/",
+					       "refs/bundle/",
+					       CONFIG_FLAGS_FIXED_VALUE |
+					       CONFIG_FLAGS_MULTI_REPLACE);
+
+	if (ctx.failure_count)
+		warning(_("failed to unbundle %d bundles"),
+			ctx.failure_count);
+
+	return 0;
+}
+
+static int unlink_bundle(struct remote_bundle_info *info, void *data)
+{
+	if (info->file.buf)
+		unlink_or_warn(info->file.buf);
+	return 0;
+}
+
 int fetch_bundle_uri(struct repository *r, const char *uri)
 {
-	return fetch_bundle_uri_internal(r, uri, 0);
+	int result;
+	struct bundle_list list;
+	struct remote_bundle_info bundle = {
+		.uri = xstrdup(uri),
+		.id = xstrdup(""),
+		.file = STRBUF_INIT,
+	};
+
+	init_bundle_list(&list);
+
+	/* If a bundle is added to this global list, then it is required. */
+	list.mode = BUNDLE_MODE_ALL;
+
+	if ((result = fetch_bundle_uri_internal(r, &bundle, 0, &list)))
+		goto cleanup;
+
+	result = unbundle_all_bundles(r, &list);
+
+cleanup:
+	for_all_bundles_in_list(&list, unlink_bundle, NULL);
+	clear_bundle_list(&list);
+	clear_remote_bundle_info(&bundle, NULL);
+	return result;
 }
 
 /**
diff --git a/bundle-uri.h b/bundle-uri.h
index 294ac804140..e9d85a6ecfb 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -35,6 +35,12 @@ struct remote_bundle_info {
 	 * an empty string.
 	 */
 	struct strbuf file;
+
+	/**
+	 * If the bundle has been unbundled successfully, then
+	 * this boolean is true.
+	 */
+	unsigned unbundled:1;
 };
 
 #define REMOTE_BUNDLE_INFO_INIT { \
diff --git a/t/t5558-clone-bundle-uri.sh b/t/t5558-clone-bundle-uri.sh
index ad666a2d28a..592790b49f0 100755
--- a/t/t5558-clone-bundle-uri.sh
+++ b/t/t5558-clone-bundle-uri.sh
@@ -41,6 +41,72 @@ test_expect_success 'clone with file:// bundle' '
 	test_cmp expect actual
 '
 
+# To get interesting tests for bundle lists, we need to construct a
+# somewhat-interesting commit history.
+#
+# ---------------- bundle-4
+#
+#       4
+#      / \
+# ----|---|------- bundle-3
+#     |   |
+#     |   3
+#     |   |
+# ----|---|------- bundle-2
+#     |   |
+#     2   |
+#     |   |
+# ----|---|------- bundle-1
+#      \ /
+#       1
+#       |
+# (previous commits)
+test_expect_success 'construct incremental bundle list' '
+	(
+		cd clone-from &&
+		git checkout -b base &&
+		test_commit 1 &&
+		git checkout -b left &&
+		test_commit 2 &&
+		git checkout -b right base &&
+		test_commit 3 &&
+		git checkout -b merge left &&
+		git merge right -m "4" &&
+
+		git bundle create bundle-1.bundle base &&
+		git bundle create bundle-2.bundle base..left &&
+		git bundle create bundle-3.bundle base..right &&
+		git bundle create bundle-4.bundle merge --not left right
+	)
+'
+
+test_expect_success 'clone bundle list (file, no heuristic)' '
+	cat >bundle-list <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+
+	[bundle "bundle-1"]
+		uri = file://$(pwd)/clone-from/bundle-1.bundle
+
+	[bundle "bundle-2"]
+		uri = file://$(pwd)/clone-from/bundle-2.bundle
+
+	[bundle "bundle-3"]
+		uri = file://$(pwd)/clone-from/bundle-3.bundle
+
+	[bundle "bundle-4"]
+		uri = file://$(pwd)/clone-from/bundle-4.bundle
+	EOF
+
+	git clone --bundle-uri="file://$(pwd)/bundle-list" . clone-list-file &&
+	for oid in $(git -C clone-from for-each-ref --format="%(objectname)")
+	do
+		git -C clone-list-file rev-parse $oid || return 1
+	done
+'
+
+
 #########################################################################
 # HTTP tests begin here
 
@@ -75,6 +141,33 @@ test_expect_success 'clone HTTP bundle' '
 	test_config -C clone-http log.excludedecoration refs/bundle/
 '
 
+test_expect_success 'clone bundle list (HTTP, no heuristic)' '
+	cp clone-from/bundle-*.bundle "$HTTPD_DOCUMENT_ROOT_PATH/" &&
+	cat >"$HTTPD_DOCUMENT_ROOT_PATH/bundle-list" <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+
+	[bundle "bundle-1"]
+		uri = $HTTPD_URL/bundle-1.bundle
+
+	[bundle "bundle-2"]
+		uri = $HTTPD_URL/bundle-2.bundle
+
+	[bundle "bundle-3"]
+		uri = $HTTPD_URL/bundle-3.bundle
+
+	[bundle "bundle-4"]
+		uri = $HTTPD_URL/bundle-4.bundle
+	EOF
+
+	git clone --bundle-uri="$HTTPD_URL/bundle-list" . clone-list-http &&
+	for oid in $(git -C clone-from for-each-ref --format="%(objectname)")
+	do
+		git -C clone-list-http rev-parse $oid || return 1
+	done
+'
+
 # Do not add tests here unless they use the HTTP server, as they will
 # not run unless the HTTP dependencies exist.