From patchwork Wed Feb 23 18:30:39 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12757300
Message-Id: <0abec796b0089b84d23cb52bb127788fdd04961c.1645641063.git.gitgitgadget@gmail.com>
Date: Wed, 23 Feb 2022 18:30:39 +0000
Subject: [PATCH 01/25] docs: document bundle URI standard
To: git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee, Derrick Stolee
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick Stolee

Introduce the idea of bundle URIs to the Git codebase through an
aspirational design document. This document includes the full design
intended to include the feature in its fully-implemented form. This will
take several steps as detailed in the Implementation Plan section.

By committing this document now, it can be used to motivate changes
necessary to reach these final goals. The design can still be altered as
new information is discovered.

Signed-off-by: Derrick Stolee
---
 Documentation/technical/bundle-uri.txt | 404 +++++++++++++++++++++++++
 1 file changed, 404 insertions(+)
 create mode 100644 Documentation/technical/bundle-uri.txt

diff --git a/Documentation/technical/bundle-uri.txt b/Documentation/technical/bundle-uri.txt
new file mode 100644
index 00000000000..5c0b9e8e3ef
--- /dev/null
+++ b/Documentation/technical/bundle-uri.txt
@@ -0,0 +1,404 @@
+Bundle URIs
+===========
+
+Bundle URIs are locations where Git can download one or more bundles in
+order to bootstrap the object database in advance of fetching the remaining
+objects from a remote.
+
+One goal is to speed up clones and fetches for users with poor network
+connectivity to the origin server. Another benefit is to allow heavy users,
+such as CI build farms, to use local resources for the majority of Git data
+and thereby reduce the load on the origin server.
+
+To enable the bundle URI feature, users can specify a bundle URI using
+command-line options or the origin server can advertise one or more URIs
+via a protocol v2 capability.
+
+Server requirements
+-------------------
+
+To provide a server-side implementation of bundle servers, no other parts
+of the Git protocol are required. This allows server maintainers to use
+static content solutions such as CDNs in order to serve the bundle files.
+
+At the current scope of the bundle URI feature, all URIs are expected to
+be HTTP(S) URLs where content is downloaded to a local file using a `GET`
+request to that URL. The server could include authentication requirements
+to those requests with the aim of triggering the configured credential
+helper for secure access.
+
+Assuming a `200 OK` response from the server, the content at the URL is
+expected to be of one of two forms:
+
+1. Bundle: A Git bundle file of version 2 or higher.
+
+2. Table of Contents: A plain-text file that is parsable using Git's
+   config file parser. This file describes one or more bundles that are
+   accessible from other URIs.
+
+Any other data provided by the server is considered erroneous.
+
+Table of Contents Format
+------------------------
+
+If the content at a bundle URI is not a bundle, then it is expected to be
+a plain-text file that is parsable using Git's config parser. This file
+can contain any list of key/value pairs, but only a fixed set will be
+considered by Git.
+
+bundle.tableOfContents.version::
+	This value provides a version number for the table of contents. If
+	a future Git change enables a feature that needs the Git client to
+	react to a new key in the table of contents file, then this version
+	will increment. The only current version number is 1, and if any
+	other value is specified then Git will fail to use this file.
+
+bundle.tableOfContents.forFetch::
+	This boolean value is a signal to the Git client that the bundle
+	server has designed its bundle organization to assist `git fetch`
+	commands in addition to `git clone` commands. If this is missing,
+	Git should not use this table of contents for `git fetch` as it
+	may lead to excess data downloads.
+
+The remaining keys include an `<id>` segment which is a server-designated
+name for each available bundle.
+
+bundle.<id>.uri::
+	This string value is the URI for downloading bundle `<id>`. If
+	the URI begins with a protocol (`http://` or `https://`) then the
+	URI is absolute. Otherwise, the URI is interpreted as relative to
+	the URI used for the table of contents. If the URI begins with `/`,
+	then that relative path is relative to the domain name used for
+	the table of contents. (This use of relative paths is intended to
+	make it easier to distribute a set of bundles across a large
+	number of servers or CDNs with different domain names.)
+
+bundle.<id>.timestamp::
+	(Optional) This value is the number of seconds since Unix epoch
+	(UTC) that this bundle was created. This is used as an approximation
+	of a point in time that the bundle matches the data available at
+	the origin server.
+
+bundle.<id>.requires::
+	(Optional) This string value represents the ID of another bundle.
+	When present, the server is indicating that this bundle contains a
+	thin packfile. If the client does not have all necessary objects
+	to unbundle this packfile, then the client can download the bundle
+	with the `requires` ID and try again. (Note: it may be beneficial
+	to allow the server to specify multiple `requires` bundles.)
+
+bundle.<id>.filter::
+	(Optional) This string value represents an object filter that
+	should also appear in the header of this bundle. The server uses
+	this value to differentiate different kinds of bundles from which
+	the client can choose those that match their object filters.
+
+Here is an example table of contents:
+
+```
+[bundle "tableOfContents"]
+	version = 1
+	forFetch = true
+
+[bundle "2022-02-09-1644442601-daily"]
+	uri = https://gitbundleserver.z13.web.core.windows.net/git/git/2022-02-09-1644442601-daily.bundle
+	timestamp = 1644442601
+	requires = 2022-02-02-1643842562
+
+[bundle "2022-02-02-1643842562"]
+	uri = https://gitbundleserver.z13.web.core.windows.net/git/git/2022-02-02-1643842562.bundle
+	timestamp = 1643842562
+
+[bundle "2022-02-09-1644442631-daily-blobless"]
+	uri = 2022-02-09-1644442631-daily-blobless.bundle
+	timestamp = 1644442631
+	requires = 2022-02-02-1643842568-blobless
+	filter = blob:none
+
+[bundle "2022-02-02-1643842568-blobless"]
+	uri = /git/git/2022-02-02-1643842568-blobless.bundle
+	timestamp = 1643842568
+	filter = blob:none
+```
+
+This example uses all of the keys in the specification. Suppose that the
+table of contents was found at the URI
+`https://gitbundleserver.z13.web.core.windows.net/git/git/`. Then the
+two blobless bundles have the following fully-expanded URIs:
+
+* `https://gitbundleserver.z13.web.core.windows.net/git/git/2022-02-09-1644442631-daily-blobless.bundle`
+* `https://gitbundleserver.z13.web.core.windows.net/git/git/2022-02-02-1643842568-blobless.bundle`
+
+Advertising Bundle URIs
+-----------------------
+
+If a user knows a bundle URI for the repository they are cloning, then they
+can specify that URI manually through a command-line option. However, a
+Git host may want to advertise bundle URIs during the clone operation,
+helping users unaware of the feature.
+
+Note: The exact details of this section are not final. This is a possible
+way that Git could auto-discover bundle URIs, but is not a committed
+direction until that feature is implemented.
+
+The only thing required for this feature is that the server can advertise
+one or more bundle URIs. One way to implement this is to create a new
+protocol v2 capability that advertises recommended features, including
+bundle URIs.
+
+The client could choose an arbitrary bundle URI as an option _or_ select
+the URI with lowest latency by some exploratory checks. It is up to the
+server operator to decide if having multiple URIs is preferable to a
+single URI that is geodistributed through server-side infrastructure.
+
+Cloning with Bundle URIs
+------------------------
+
+The primary need for bundle URIs is to speed up clones. The Git client
+will interact with bundle URIs according to the following flow:
+
+1. The user specifies a bundle URI with the `--bundle-uri` command-line
+   option _or_ the client discovers a bundle URI that was advertised by
+   the remote server.
+
+2. The client downloads the file at the bundle URI. If it is a bundle, then
+   it is unbundled with the refs being stored in `refs/bundle/*`.
+
+3. If the file is instead a table of contents, then the bundles with
+   matching `filter` settings are sorted by `timestamp` (if present),
+   and the most-recent bundle is downloaded.
+
+4. If the current bundle header mentions negative commit OIDs that are not
+   in the object database, then download the `requires` bundle and try
+   again.
+
+5. After inspecting a bundle with no negative commit OIDs (or whose OIDs
+   are all already in the object database), unbundle all of the
+   bundles in reverse order, placing references within `refs/bundle/*`.
+
+6. The client performs a fetch negotiation with the origin server, using
+   the `refs/bundle/*` references as `have`s and the server's ref
+   advertisement as `want`s. This results in a pack-file containing the
+   remaining objects requested by the clone but not in the bundles.
+
+Note that during a clone we expect that all bundles will be required. The
+client could be extended to download all bundles in parallel, though they
+need to be unbundled in the correct order.
+
+If a table of contents is used and it contains
+`bundle.tableOfContents.forFetch = true`, then the client can store a
+config value indicating to reuse this URI for later `git fetch` commands.
+In this case, the client will also want to store the maximum timestamp of
+a downloaded bundle.
+
+Fetching with Bundle URIs
+-------------------------
+
+When the client fetches new data, it can decide to fetch from bundle
+servers before fetching from the origin remote. This could be done via
+a command-line option, but it is more likely useful to use a config value
+such as the one specified during the clone.
+
+The fetch operation follows the same procedure to download bundles from a
+table of contents (although we do _not_ want to use parallel downloads
+here). We expect that the process will end because all negative commit
+OIDs in a thin bundle are already in the object database.
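As a reading aid, steps 3 to 5 of the clone flow above (pick the newest bundle with a matching `filter`, follow `requires` links while prerequisites are missing, then unbundle in reverse order) can be sketched in Python. This is only a sketch under stated assumptions: the function name `plan_bundle_downloads`, the `prerequisites_present` callback, and the dictionary layout of the table of contents are hypothetical and do not come from the patch or from Git's code.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical in-memory form of the example table of contents.
EXAMPLE_TOC: Dict[str, dict] = {
    "2022-02-09-1644442601-daily": {
        "timestamp": 1644442601,
        "requires": "2022-02-02-1643842562",
    },
    "2022-02-02-1643842562": {"timestamp": 1643842562},
    "2022-02-09-1644442631-daily-blobless": {
        "timestamp": 1644442631,
        "requires": "2022-02-02-1643842568-blobless",
        "filter": "blob:none",
    },
    "2022-02-02-1643842568-blobless": {
        "timestamp": 1643842568,
        "filter": "blob:none",
    },
}


def plan_bundle_downloads(
    toc: Dict[str, dict],
    wanted_filter: Optional[str],
    prerequisites_present: Callable[[str], bool],
) -> List[str]:
    """Return bundle IDs in unbundle order, or [] to fall back to the remote."""
    # Step 3: keep bundles whose `filter` matches and start with the newest.
    candidates = [
        bundle_id
        for bundle_id, entry in toc.items()
        if entry.get("filter") == wanted_filter
    ]
    if not candidates:
        return []
    current = max(candidates, key=lambda b: toc[b].get("timestamp", 0))

    # Step 4: follow `requires` links while prerequisites are missing.
    downloaded: List[str] = []
    seen = set()
    while True:
        if current in seen or current not in toc:
            return []  # directed cycle or dangling link: error condition
        seen.add(current)
        downloaded.append(current)
        if prerequisites_present(current):
            break
        required = toc[current].get("requires")
        if required is None:
            return []  # no more links to follow: cannot unbundle
        current = required

    # Step 5: unbundle in reverse download order (oldest bundle first).
    return list(reversed(downloaded))


def fresh_clone(bundle_id: str) -> bool:
    """Assumed stand-in for an object database check during a fresh clone:
    only bundles without a `requires` link can be unbundled directly."""
    return "requires" not in EXAMPLE_TOC[bundle_id]
```

Returning an empty list stands in for the behavior described under "Error Conditions": the client ignores the bundle URI and continues against the origin server.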
+
+A further optimization is that the client can avoid downloading any
+bundles if their timestamps are not larger than the stored timestamp.
+After fetching new bundles, this local timestamp value is updated.
+
+Choices for Bundle Server Organization
+--------------------------------------
+
+With this standard, there are many options available to the bundle server
+in how it organizes Git data into bundles.
+
+* Bundles can have whatever name the server desires. This name could refer
+  to immutable data by using a hash of the bundle contents. However, this
+  means that a new URI will be needed after every update of the content.
+  This might be acceptable if the server is advertising the URI (and the
+  server is aware of new bundles being generated) but would not be
+  ergonomic for users using the command-line option.
+
+* If the server intends to only serve full clones, then the advertised URI
+  could be a bundle file without a filter that is updated at some cadence.
+
+* If the server intends to serve clones, but wants clients to choose full
+  or blobless partial clones, then the server can use a table of contents
+  that lists two non-thin bundles and the client chooses between them only
+  by the `bundle.<id>.filter` values.
+
+* If the server intends to improve clones with parallel downloads, then it
+  can use a table of contents and split the repository into time intervals
+  of approximately similar-sized bundles. Using `bundle.<id>.timestamp`
+  and `bundle.<id>.requires` values helps the client decide the order to
+  unbundle the bundles.
+
+* If the server intends to serve fetches, then it can use a table of
+  contents to advertise a list of bundles that are updated regularly. The
+  most recent bundles could be generated on short intervals, such as hourly.
+  These small bundles could be merged together at some rate, such as 24
+  hourly bundles merging into a single daily bundle. At some point, it may
+  be beneficial to create a bundle that stores the majority of the history,
+  such as all data older than 30 days.
+
+These recommendations are intended only as suggestions. Each repository is
+different and every Git server has different needs. Hopefully the bundle
+URI feature and its table of contents are flexible enough to satisfy all
+needs. If not, then the format can be extended.
+
+Error Conditions
+----------------
+
+If the Git client discovers something unexpected while downloading
+information according to a bundle URI or the table of contents found at
+that location, then Git can ignore that data and continue as if it were not
+given a bundle URI. The remote Git server is the ultimate source of truth,
+not the bundle URI.
+
+Here are a few example error conditions:
+
+* The client fails to connect with a server at the given URI or a connection
+  is lost without any chance to recover.
+
+* The client receives a response other than `200 OK` (such as `404 Not Found`,
+  `401 Unauthorized`, or `500 Internal Server Error`).
+
+* The client receives data that is not parsable as a bundle or table of
+  contents.
+
+* The table of contents describes a directed cycle in the
+  `bundle.<id>.requires` links.
+
+* A bundle includes a filter that does not match expectations.
+
+* The client cannot unbundle the bundles because the negative commit OIDs
+  are not in the object database and there are no more
+  `bundle.<id>.requires` links to follow.
+
+There are also situations that could be seen as wasteful, but are not
+error conditions:
+
+* The downloaded bundles contain more information than is requested by
+  the clone or fetch request. A primary example is if the user requests
+  a clone with `--single-branch` but downloads bundles that store every
+  reachable commit from all `refs/heads/*` references. This might be
+  initially wasteful, but perhaps these objects will become reachable by
+  a later ref update that the client cares about.
+
+* A bundle download during a `git fetch` contains objects already in the
+  object database. This is probably unavoidable if we are using bundles
+  for fetches, since the client will almost always be slightly ahead of
+  the bundle servers after performing its "catch-up" fetch to the remote
+  server. This extra work is most wasteful when the client is fetching
+  much more frequently than the server is computing bundles, such as if
+  the client is using hourly prefetches with background maintenance, but
+  the server is computing bundles weekly. For this reason, the client
+  should not use bundle URIs for fetch unless the server has explicitly
+  recommended it through the `bundle.tableOfContents.forFetch = true`
+  value.
+
+Implementation Plan
+-------------------
+
+This design document is being submitted on its own as an aspirational
+document, with the goal of implementing all of the mentioned client
+features over the course of several patch series. Here is a potential
+outline for submitting these features for full review:
+
+1. Update the `git bundle create` command to take a `--filter` option,
+   allowing bundles to store packfiles restricted to an object filter.
+   This is necessary for using bundle URIs to benefit partial clones.
+
+2. Integrate bundle URIs into `git clone` with a `--bundle-uri` option.
+   This will include the full understanding of a table of contents, but
+   will not integrate with `git fetch` or allow the server to advertise
+   URIs.
+
+3. Integrate bundle URIs into `git fetch`, triggered by config values that
+   are set during `git clone` if the server indicates that the bundle
+   strategy works for fetches.
+
+4. Create a new "recommended features" capability in protocol v2 where the
+   server can recommend features such as bundle URIs, partial clone, and
+   sparse-checkout. These features will be extremely limited in scope and
+   blocked by opt-in config options. The design for this portion could be
+   replaced by a "bundle-uri" capability that only advertises bundle URIs
+   and no other information.
+
+Related Work: Packfile URIs
+---------------------------
+
+The Git protocol already has a capability where the Git server can list
+a set of URLs along with the packfile response when serving a client
+request. The client is then expected to download the packfiles at those
+locations in order to have a complete understanding of the response.
+
+This mechanism is used by the Gerrit server (implemented with JGit) and
+has been effective at reducing CPU load and improving user performance for
+clones.
+
+A major downside to this mechanism is that the origin server needs to know
+_exactly_ what is in those packfiles, and the packfiles need to be available
+to the user for some time after the server has responded. This coupling
+between the origin and the packfile data is difficult to manage.
+
+Further, this implementation is extremely hard to make work with fetches.
+
+Related Work: GVFS Cache Servers
+--------------------------------
+
+The GVFS Protocol [2] is a set of HTTP endpoints designed independently of
+the Git project before Git's partial clone was created. One feature of this
+protocol is the idea of a "cache server" which can be colocated with build
+machines or developer offices to transfer Git data without overloading the
+central server.
+
+The endpoint that VFS for Git is famous for is the `GET /gvfs/objects/{oid}`
+endpoint, which allows downloading an object on-demand. This is a critical
+piece of the filesystem virtualization of that product.
+
+However, a more subtle need is the `GET /gvfs/prefetch?lastPackTimestamp=`
+endpoint. Given an optional timestamp, the cache server responds with a list
+of precomputed packfiles containing the commits and trees that were introduced
+in those time intervals.
+
+The cache server computes these "prefetch" packfiles using the following
+strategy:
+
+1. Every hour, an "hourly" pack is generated with a given timestamp.
+2. Nightly, the previous 24 hourly packs are rolled up into a "daily" pack.
+3. Nightly, all prefetch packs more than 30 days old are rolled up into
+   one pack.
+
+When a user runs `gvfs clone` or `scalar clone` against a repo with cache
+servers, the client requests all prefetch packfiles, which is at most
+`24 + 30 + 1` packfiles downloading only commits and trees. The client
+then follows with a request to the origin server for the references, and
+attempts to check out that tip reference. (There is an extra endpoint that
+helps get all reachable trees from a given commit, in case that commit
+was not already in a prefetch packfile.)
+
+During a `git fetch`, a hook requests the prefetch endpoint using the
+most-recent timestamp from a previously-downloaded prefetch packfile.
+Only the packfiles with later timestamps are downloaded. Most
+users fetch hourly, so they get at most one hourly prefetch pack. Users
+whose machines have been off or otherwise have not fetched in over 30 days
+might redownload all prefetch packfiles. This is rare.
+
+It is important to note that the clients always contact the origin server
+for the refs advertisement, so the refs are frequently "ahead" of the
+prefetched pack data. The missing objects are downloaded on-demand using
+the `GET /gvfs/objects/{oid}` requests, when needed by a command such as
+`git checkout` or `git log`. Some Git optimizations disable checks that
+would cause these on-demand downloads to be too aggressive.
+
+See Also
+--------
+
+[1] https://lore.kernel.org/git/RFC-cover-00.13-0000000000-20210805T150534Z-avarab@gmail.com/
+	An earlier RFC for a bundle URI feature.
+
+[2] https://github.com/microsoft/VFSForGit/blob/master/Protocol.md
+	The GVFS Protocol
\ No newline at end of file

From patchwork Wed Feb 23 18:30:40 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12757296
Date: Wed, 23 Feb 2022 18:30:40 +0000
Subject: [PATCH 02/25] bundle: alphabetize subcommands better
To: git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee, Derrick Stolee
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick Stolee

The usage strings for the 'git bundle' subcommands are not alphabetical.
This also applies to their inspection within cmd_bundle(). Fix this
ordering before we insert a new subcommand.

This change does not reorder the cmd_bundle_*() methods to avoid moving
lines that are more likely wanted in a future 'git blame' call. It is
fine that those longer methods are not ordered alphabetically.

Signed-off-by: Derrick Stolee
---
 builtin/bundle.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/builtin/bundle.c b/builtin/bundle.c
index 5a85d7cd0fe..8187b7df739 100644
--- a/builtin/bundle.c
+++ b/builtin/bundle.c
@@ -13,9 +13,9 @@
 
 static const char * const builtin_bundle_usage[] = {
   N_("git bundle create [] "),
-  N_("git bundle verify [] "),
   N_("git bundle list-heads [...]"),
   N_("git bundle unbundle [...]"),
+  N_("git bundle verify [] "),
   NULL
 };
 
@@ -24,11 +24,6 @@ static const char * const builtin_bundle_create_usage[] = {
   NULL
 };
 
-static const char * const builtin_bundle_verify_usage[] = {
-  N_("git bundle verify [] "),
-  NULL
-};
-
 static const char * const builtin_bundle_list_heads_usage[] = {
   N_("git bundle list-heads [...]"),
   NULL
@@ -39,6 +34,11 @@ static const char * const builtin_bundle_unbundle_usage[] = {
   NULL
 };
 
+static const char * const builtin_bundle_verify_usage[] = {
+  N_("git bundle verify [] "),
+  NULL
+};
+
 static int parse_options_cmd_bundle(int argc,
 		const char **argv,
 		const char* prefix,
@@ -209,12 +209,12 @@ int cmd_bundle(int argc, const char **argv, const char *prefix)
 
 	else if (!strcmp(argv[0], "create"))
 		result = cmd_bundle_create(argc, argv, prefix);
-	else if (!strcmp(argv[0], "verify"))
-		result = cmd_bundle_verify(argc, argv, prefix);
 	else if (!strcmp(argv[0], "list-heads"))
 		result = cmd_bundle_list_heads(argc, argv, prefix);
 	else if (!strcmp(argv[0], "unbundle"))
 		result = cmd_bundle_unbundle(argc, argv, prefix);
+	else if (!strcmp(argv[0], "verify"))
+		result = cmd_bundle_verify(argc, argv, prefix);
 	else {
 		error(_("Unknown subcommand: %s"), argv[0]);
 		usage_with_options(builtin_bundle_usage, options);

From patchwork Wed Feb 23 18:30:41 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12757298
Date: Wed, 23 Feb 2022 18:30:41 +0000
Subject: [PATCH 03/25] dir: extract starts_with_dot[_dot]_slash()
To: git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee, Derrick Stolee
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick Stolee

We will want to use this logic to assist checking if paths are absolute
or relative, so extract it into a helpful place. This creates a collision
with similar methods in fsck.c, but those methods have important
differences. Prepend "fsck_" to those methods to emphasize that they are
custom to the fsck builtin.

Signed-off-by: Derrick Stolee
---
 builtin/submodule--helper.c | 10 ----------
 dir.h                       | 11 +++++++++++
 fsck.c                      | 14 +++++++-------
 3 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index c5d3fc3817f..c17dde4170f 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -70,16 +70,6 @@ static int print_default_remote(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
-static int starts_with_dot_slash(const char *str)
-{
-	return str[0] == '.' && is_dir_sep(str[1]);
-}
-
-static int starts_with_dot_dot_slash(const char *str)
-{
-	return str[0] == '.' && str[1] == '.' && is_dir_sep(str[2]);
-}
-
 /*
  * Returns 1 if it was the last chop before ':'.
  */
diff --git a/dir.h b/dir.h
index 8e02dfb505d..5e38d1ba536 100644
--- a/dir.h
+++ b/dir.h
@@ -578,4 +578,15 @@ void connect_work_tree_and_git_dir(const char *work_tree,
 void relocate_gitdir(const char *path,
 		     const char *old_git_dir,
 		     const char *new_git_dir);
+
+static inline int starts_with_dot_slash(const char *str)
+{
+	return str[0] == '.' && is_dir_sep(str[1]);
+}
+
+static inline int starts_with_dot_dot_slash(const char *str)
+{
+	return str[0] == '.' && str[1] == '.' && is_dir_sep(str[2]);
+}
+
 #endif
diff --git a/fsck.c b/fsck.c
index 3ec500d707a..32cd3bc081f 100644
--- a/fsck.c
+++ b/fsck.c
@@ -976,31 +976,31 @@ done:
 }
 
 /*
- * Like builtin/submodule--helper.c's starts_with_dot_slash, but without
+ * Like dir.h's starts_with_dot_slash, but without
  * relying on the platform-dependent is_dir_sep helper.
  *
  * This is for use in checking whether a submodule URL is interpreted as
  * relative to the current directory on any platform, since \ is a
  * directory separator on Windows but not on other platforms.
  */
-static int starts_with_dot_slash(const char *str)
+static int fsck_starts_with_dot_slash(const char *str)
 {
 	return str[0] == '.' && (str[1] == '/' || str[1] == '\\');
 }
 
 /*
- * Like starts_with_dot_slash, this is a variant of submodule--helper's
- * helper of the same name with the twist that it accepts backslash as a
+ * Like fsck_starts_with_dot_slash, this is a variant of dir.h's
+ * helper with the twist that it accepts backslash as a
  * directory separator even on non-Windows platforms.
  */
-static int starts_with_dot_dot_slash(const char *str)
+static int fsck_starts_with_dot_dot_slash(const char *str)
 {
-	return str[0] == '.' && starts_with_dot_slash(str + 1);
+	return str[0] == '.' && fsck_starts_with_dot_slash(str + 1);
 }
 
 static int submodule_url_is_relative(const char *url)
 {
-	return starts_with_dot_slash(url) || starts_with_dot_dot_slash(url);
+	return fsck_starts_with_dot_slash(url) || fsck_starts_with_dot_dot_slash(url);
 }
 
 /*

From patchwork Wed Feb 23 18:30:42 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12757299
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=nqJ7gYx5eQohJ5KAa6XRx1AGCVw3/JtF38IijvN1zdw=; b=f568jzE55MMxZ/uaOQeExq1x+iNgBTyhZgLgaFrIGa0GH5DTUI1cdBbgFfqznznUvz mgNXe8kq1Y8FbqaWSAeTJvRuYzs1LanCwiXEaUbFrIoqNRPoCGBAbRQag3+2F7T6Lmuy oNYcO5EvcG6B0W64yB9PKT8x2eZEd6/gSQWWtdjom5nwC8NTpQjU3wu0NdU66+ppQxv/ Zh8NKchFCNdAVRZIk/dGkcw+XpfI0N2rJJJyDzA/rWaCVnVeR0Hy1A7iOhJR77ojkJ5W zc5s4XI2rfiToHuYBtLt9KcBxzWIC7QDsPF1AJYEnqNtoVCLLAii+HVZHXsj5JucrDhf ay3w== X-Gm-Message-State: AOAM533R/xJwB6Fro3Hy7RKHgduYJJo/oxI9XypaFj/tK9Bl9bnddYTU Wr/+yeeFxkA6XUoCYbMBpJsojEkVNJU= X-Google-Smtp-Source: ABdhPJz6yCUlaNFHRJ1xbC/5E2CJb+DmcVHzChn+Puuxpqtf9k2dG4Hnem2xh4YR7b+csyt+X+QXUQ== X-Received: by 2002:a05:600c:284a:b0:37e:9244:abea with SMTP id r10-20020a05600c284a00b0037e9244abeamr796400wmb.2.1645641069160; Wed, 23 Feb 2022 10:31:09 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y12sm322133wrt.72.2022.02.23.10.31.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Feb 2022 10:31:08 -0800 (PST) Message-Id: In-Reply-To: References: Date: Wed, 23 Feb 2022 18:30:42 +0000 Subject: [PATCH 04/25] remote: move relative_url() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee This method was initially written in 63e95beb0 (submodule: port resolve_relative_url from shell to C, 2016-05-15). As we will need similar functionality in the bundle URI feature, extract this to be available in remote.h. The code is exactly the same. The prototype is different only in whitespace. 
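For readers who have not worked with this helper before, the following is a minimal standalone sketch of the core idea: each leading "../" in the relative URL strips one trailing component from the base URL. It is not the code being moved. The name resolve_relative_sketch() is hypothetical, only '/' is treated as a separator, and the colon handling, the up_path argument, and the "./" normalization of local paths are all left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for relative_url(); not Git's code.
 * Resolves leading "./" and "../" components of `url` against `base`,
 * treating only '/' as a separator. Returns a malloc'd string that the
 * caller must free(), or NULL if `base` has no component left to strip
 * or if allocation fails.
 */
static char *resolve_relative_sketch(const char *base, const char *url)
{
	size_t len, out_len;
	char *prefix, *out;

	assert(base && url);

	len = strlen(base);
	prefix = malloc(len + 1);
	if (!prefix)
		return NULL;
	memcpy(prefix, base, len + 1);

	/* Ignore a single trailing slash on the base URL. */
	if (len && prefix[len - 1] == '/')
		prefix[len - 1] = '\0';

	for (;;) {
		if (!strncmp(url, "../", 3)) {
			/* Each "../" drops the last component of the base. */
			char *slash = strrchr(prefix, '/');
			if (!slash) {
				free(prefix);
				return NULL;
			}
			*slash = '\0';
			url += 3;
		} else if (!strncmp(url, "./", 2)) {
			url += 2;
		} else {
			break;
		}
	}

	/* Join what is left of the base with what is left of the url. */
	out_len = strlen(prefix) + 1 + strlen(url) + 1;
	out = malloc(out_len);
	if (out)
		snprintf(out, out_len, "%s/%s", prefix, url);
	free(prefix);
	return out;
}
```

With a base of "http://a.com/b" and a url of "../c", the sketch yields "http://a.com/c". Because it chops blindly at the last slash, a url of "../../c" walks into the scheme part and yields "http://c", which is the kind of outcome the NEEDSWORK comment on relative_url() flags as something that should error out instead.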
The documentation comment only adds explicit instructions on what happens when supplying two absolute URLs. Signed-off-by: Derrick Stolee --- builtin/submodule--helper.c | 119 ------------------------------------ remote.c | 96 +++++++++++++++++++++++++++++ remote.h | 30 +++++++++ 3 files changed, 126 insertions(+), 119 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index c17dde4170f..506ebd8e6bc 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -70,125 +70,6 @@ static int print_default_remote(int argc, const char **argv, const char *prefix) return 0; } -/* - * Returns 1 if it was the last chop before ':'. - */ -static int chop_last_dir(char **remoteurl, int is_relative) -{ - char *rfind = find_last_dir_sep(*remoteurl); - if (rfind) { - *rfind = '\0'; - return 0; - } - - rfind = strrchr(*remoteurl, ':'); - if (rfind) { - *rfind = '\0'; - return 1; - } - - if (is_relative || !strcmp(".", *remoteurl)) - die(_("cannot strip one component off url '%s'"), - *remoteurl); - - free(*remoteurl); - *remoteurl = xstrdup("."); - return 0; -} - -/* - * The `url` argument is the URL that navigates to the submodule origin - * repo. When relative, this URL is relative to the superproject origin - * URL repo. The `up_path` argument, if specified, is the relative - * path that navigates from the submodule working tree to the superproject - * working tree. Returns the origin URL of the submodule. - * - * Return either an absolute URL or filesystem path (if the superproject - * origin URL is an absolute URL or filesystem path, respectively) or a - * relative file system path (if the superproject origin URL is a relative - * file system path). - * - * When the output is a relative file system path, the path is either - * relative to the submodule working tree, if up_path is specified, or to - * the superproject working tree otherwise. - * - * NEEDSWORK: This works incorrectly on the domain and protocol part. 
- * remote_url url outcome expectation - * http://a.com/b ../c http://a.com/c as is - * http://a.com/b/ ../c http://a.com/c same as previous line, but - * ignore trailing slash in url - * http://a.com/b ../../c http://c error out - * http://a.com/b ../../../c http:/c error out - * http://a.com/b ../../../../c http:c error out - * http://a.com/b ../../../../../c .:c error out - * NEEDSWORK: Given how chop_last_dir() works, this function is broken - * when a local part has a colon in its path component, too. - */ -static char *relative_url(const char *remote_url, - const char *url, - const char *up_path) -{ - int is_relative = 0; - int colonsep = 0; - char *out; - char *remoteurl = xstrdup(remote_url); - struct strbuf sb = STRBUF_INIT; - size_t len = strlen(remoteurl); - - if (is_dir_sep(remoteurl[len-1])) - remoteurl[len-1] = '\0'; - - if (!url_is_local_not_ssh(remoteurl) || is_absolute_path(remoteurl)) - is_relative = 0; - else { - is_relative = 1; - /* - * Prepend a './' to ensure all relative - * remoteurls start with './' or '../' - */ - if (!starts_with_dot_slash(remoteurl) && - !starts_with_dot_dot_slash(remoteurl)) { - strbuf_reset(&sb); - strbuf_addf(&sb, "./%s", remoteurl); - free(remoteurl); - remoteurl = strbuf_detach(&sb, NULL); - } - } - /* - * When the url starts with '../', remove that and the - * last directory in remoteurl. - */ - while (url) { - if (starts_with_dot_dot_slash(url)) { - url += 3; - colonsep |= chop_last_dir(&remoteurl, is_relative); - } else if (starts_with_dot_slash(url)) - url += 2; - else - break; - } - strbuf_reset(&sb); - strbuf_addf(&sb, "%s%s%s", remoteurl, colonsep ? 
":" : "/", url); - if (ends_with(url, "/")) - strbuf_setlen(&sb, sb.len - 1); - free(remoteurl); - - if (starts_with_dot_slash(sb.buf)) - out = xstrdup(sb.buf + 2); - else - out = xstrdup(sb.buf); - - if (!up_path || !is_relative) { - strbuf_release(&sb); - return out; - } - - strbuf_reset(&sb); - strbuf_addf(&sb, "%s%s", up_path, out); - free(out); - return strbuf_detach(&sb, NULL); -} - static char *resolve_relative_url(const char *rel_url, const char *up_path, int quiet) { char *remoteurl, *resolved_url; diff --git a/remote.c b/remote.c index c97c626eaa8..c4a56749e85 100644 --- a/remote.c +++ b/remote.c @@ -14,6 +14,7 @@ #include "strvec.h" #include "commit-reach.h" #include "advice.h" +#include "connect.h" enum map_direction { FROM_SRC, FROM_DST }; @@ -2727,3 +2728,98 @@ void remote_state_clear(struct remote_state *remote_state) hashmap_clear_and_free(&remote_state->remotes_hash, struct remote, ent); hashmap_clear_and_free(&remote_state->branches_hash, struct remote, ent); } + +/* + * Returns 1 if it was the last chop before ':'. + */ +static int chop_last_dir(char **remoteurl, int is_relative) +{ + char *rfind = find_last_dir_sep(*remoteurl); + if (rfind) { + *rfind = '\0'; + return 0; + } + + rfind = strrchr(*remoteurl, ':'); + if (rfind) { + *rfind = '\0'; + return 1; + } + + if (is_relative || !strcmp(".", *remoteurl)) + die(_("cannot strip one component off url '%s'"), + *remoteurl); + + free(*remoteurl); + *remoteurl = xstrdup("."); + return 0; +} + +/* + * NEEDSWORK: Given how chop_last_dir() works, this function is broken + * when a local part has a colon in its path component, too. 
+ */ +char *relative_url(const char *remote_url, + const char *url, + const char *up_path) +{ + int is_relative = 0; + int colonsep = 0; + char *out; + char *remoteurl = xstrdup(remote_url); + struct strbuf sb = STRBUF_INIT; + size_t len = strlen(remoteurl); + + if (is_dir_sep(remoteurl[len-1])) + remoteurl[len-1] = '\0'; + + if (!url_is_local_not_ssh(remoteurl) || is_absolute_path(remoteurl)) + is_relative = 0; + else { + is_relative = 1; + /* + * Prepend a './' to ensure all relative + * remoteurls start with './' or '../' + */ + if (!starts_with_dot_slash(remoteurl) && + !starts_with_dot_dot_slash(remoteurl)) { + strbuf_reset(&sb); + strbuf_addf(&sb, "./%s", remoteurl); + free(remoteurl); + remoteurl = strbuf_detach(&sb, NULL); + } + } + /* + * When the url starts with '../', remove that and the + * last directory in remoteurl. + */ + while (url) { + if (starts_with_dot_dot_slash(url)) { + url += 3; + colonsep |= chop_last_dir(&remoteurl, is_relative); + } else if (starts_with_dot_slash(url)) + url += 2; + else + break; + } + strbuf_reset(&sb); + strbuf_addf(&sb, "%s%s%s", remoteurl, colonsep ? ":" : "/", url); + if (ends_with(url, "/")) + strbuf_setlen(&sb, sb.len - 1); + free(remoteurl); + + if (starts_with_dot_slash(sb.buf)) + out = xstrdup(sb.buf + 2); + else + out = xstrdup(sb.buf); + + if (!up_path || !is_relative) { + strbuf_release(&sb); + return out; + } + + strbuf_reset(&sb); + strbuf_addf(&sb, "%s%s", up_path, out); + free(out); + return strbuf_detach(&sb, NULL); +} diff --git a/remote.h b/remote.h index 4a1209ae2c8..91c7f187863 100644 --- a/remote.h +++ b/remote.h @@ -409,4 +409,34 @@ int parseopt_push_cas_option(const struct option *, const char *arg, int unset); int is_empty_cas(const struct push_cas_option *); void apply_push_cas(struct push_cas_option *, struct remote *, struct ref *); +/* + * The `url` argument is the URL that navigates to the submodule origin + * repo. When relative, this URL is relative to the superproject origin + * URL repo. 
The `up_path` argument, if specified, is the relative + * path that navigates from the submodule working tree to the superproject + * working tree. Returns the origin URL of the submodule. + * + * Return either an absolute URL or filesystem path (if the superproject + * origin URL is an absolute URL or filesystem path, respectively) or a + * relative file system path (if the superproject origin URL is a relative + * file system path). + * + * When the output is a relative file system path, the path is either + * relative to the submodule working tree, if up_path is specified, or to + * the superproject working tree otherwise. + * + * NEEDSWORK: This works incorrectly on the domain and protocol part. + * remote_url url outcome expectation + * http://a.com/b ../c http://a.com/c as is + * http://a.com/b/ ../c http://a.com/c same as previous line, but + * ignore trailing slash in url + * http://a.com/b ../../c http://c error out + * http://a.com/b ../../../c http:/c error out + * http://a.com/b ../../../../c http:c error out + * http://a.com/b ../../../../../c .:c error out + */ +char *relative_url(const char *remote_url, + const char *url, + const char *up_path); + #endif From patchwork Wed Feb 23 18:30:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12757297 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5B0E3C433EF for ; Wed, 23 Feb 2022 18:31:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243878AbiBWSbn (ORCPT ); Wed, 23 Feb 2022 13:31:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37364 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243871AbiBWSbk (ORCPT ); Wed, 23 Feb 2022 
13:31:40 -0500 Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com [IPv6:2a00:1450:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0619E4B421 for ; Wed, 23 Feb 2022 10:31:12 -0800 (PST) Received: by mail-wr1-x434.google.com with SMTP id d3so25437109wrf.1 for ; Wed, 23 Feb 2022 10:31:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=2WQN9xs7IjLHHOx2h4dk6qc3hKyFx/uqIjlbjTom1KA=; b=guFhpzwJLQT4GF9jsuVmH7YcbJNxW1UcSztfUCsXdHVlUHO8HIxWe8Owq1tnXZW5o2 s0mWRdIbRyYIE7ZtOQ1p9TdJ63/b9/sgzcYJyi9/6slXbiJuVlGVif5rjU58fTIwok6f RsCrmxqVhknComgOmSK3TCYl2n3WErYtO36xP/ddDg0az1JPHyKgXDHntiBv9L9rTU4K kJnqTTvkpOF+m6v0IlXhtVT5gnieb6WvjC8Ntkuyy2p3BF848fnN0i9Ei4irNpZhPY7o cAKyndJJgLtwQ4QtO3KT5H4Rxp0mjFHF8r288Gn+864bk/foxLJV3WoIOWp/5shmR3lR LFig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=2WQN9xs7IjLHHOx2h4dk6qc3hKyFx/uqIjlbjTom1KA=; b=Esa+tO7QN3bRDujYDuuCAIddFokcFVWM4ctq1pX8eMXQee1HHE9vZLVfrRZImMK1oo 72FItazYi2xszh1OpDXLCOVhqIjsXWBSvQIBOtG8hyt9abhZ+o6lYjOguNCpDwxV58HD IFc5p0z/JWwHcNuObcLCNZq1VEu5W0br4PIDydVA8CFcpi/KGCOvmskcpuuAR++x3HJh KkIEebL+WOPxFDG7z7eifttVnVmOd4tx4I/c7fCZbMQ7VGJBxXgyzG5Ahuh4fcMlHK16 pWREuGNFgrx1IHVk9kRBuG6g5k2cainRDn66xj9KIh3mjKzJ2QQesHUZYzkzk2ICK0uV Yw1g== X-Gm-Message-State: AOAM533lSmU9CKwYR8y2ip2u7y87ml8h0d4OcmvhTc3rGkKBA8ByECE5 0/kcNx50NhMxgSWBmNxKi7GnIuAz+Og= X-Google-Smtp-Source: ABdhPJyCtVcHfVes8oel0GkEOP6FZQsCXfbMAASCwomIK/IKWTTsWlXTToQcgjQHa9Pdd+cXfgS4TA== X-Received: by 2002:a05:6000:1364:b0:1ed:b65a:da45 with SMTP id q4-20020a056000136400b001edb65ada45mr681996wrz.680.1645641070403; Wed, 23 Feb 2022 10:31:10 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 
r12sm339156wrz.50.2022.02.23.10.31.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Feb 2022 10:31:10 -0800 (PST) Message-Id: <60a8d52af64cdc3ca8b374c714cd3af1fc74f5ec.1645641063.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 23 Feb 2022 18:30:43 +0000 Subject: [PATCH 05/25] remote: allow relative_url() to return an absolute url Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee When the 'url' parameter was absolute, the previous implementation would concatenate 'remote_url' with 'url'. Instead, we want to return 'url' in this case. Signed-off-by: Derrick Stolee --- remote.c | 12 ++++++++++-- remote.h | 1 + 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/remote.c b/remote.c index c4a56749e85..ac1d98ae922 100644 --- a/remote.c +++ b/remote.c @@ -2766,10 +2766,18 @@ char *relative_url(const char *remote_url, int is_relative = 0; int colonsep = 0; char *out; - char *remoteurl = xstrdup(remote_url); + char *remoteurl; struct strbuf sb = STRBUF_INIT; - size_t len = strlen(remoteurl); + size_t len; + + if (!url_is_local_not_ssh(url) || is_absolute_path(url)) + return xstrdup(url); + + len = strlen(remote_url); + if (!len) + BUG("invalid empty remote_url"); + remoteurl = xstrdup(remote_url); if (is_dir_sep(remoteurl[len-1])) remoteurl[len-1] = '\0'; diff --git a/remote.h b/remote.h index 91c7f187863..438152ef562 100644 --- a/remote.h +++ b/remote.h @@ -434,6 +434,7 @@ void apply_push_cas(struct push_cas_option *, struct remote *, struct ref *); * http://a.com/b ../../../c http:/c error out * http://a.com/b ../../../../c http:c error out * http://a.com/b ../../../../../c .:c error out + * http://a.com/b http://d.org/e http://d.org/e as is */ char *relative_url(const char *remote_url, const char *url, From 
patchwork Wed Feb 23 18:30:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12757301 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 601DDC433FE for ; Wed, 23 Feb 2022 18:31:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243898AbiBWSbv (ORCPT ); Wed, 23 Feb 2022 13:31:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37392 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243877AbiBWSbl (ORCPT ); Wed, 23 Feb 2022 13:31:41 -0500 Received: from mail-wm1-x330.google.com (mail-wm1-x330.google.com [IPv6:2a00:1450:4864:20::330]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 352974B430 for ; Wed, 23 Feb 2022 10:31:13 -0800 (PST) Received: by mail-wm1-x330.google.com with SMTP id y5so3845615wmi.0 for ; Wed, 23 Feb 2022 10:31:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=xe4mMWm75qppTGdTaj8uVSs0xTobaeCr6mEjy+hpChU=; b=D2tdb/dmSY9JFZcducWPpFSXHqHLQB7iG58LQSuMNmnPhyaPuv1iC2TdLDROvUHutS BAM+YIZMMIUvW9oZTaiD8BMVsXyfEpfDw3CakG4nQF8QVPcZhiozIJMOrExol8euV1FP noZr4ATcfWXJA8ysrqwIcnouIhJG7eUr81VQ4IqBFnLqof5V1+dhYVDvR1dMigbCAAI4 5EiktS0LrAoIkIp4in4zyXDbZrgzFeiArBktyJ2LtYNIEzPzRkR1XMVOx/M301myonkZ WM7dW16dFj4qWMvuAySF2ih+WV9F//ZZZ7qxZCzfAl5mVQ7eb9zgyocwA0tXpfU7DrcF kY/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=xe4mMWm75qppTGdTaj8uVSs0xTobaeCr6mEjy+hpChU=; 
b=h6qJrhlX5Yl6j2U4u8aoogIUnMHPXaMSUM5wSsXP9mJd+ozSjazwhIE0RomYpuyAAk esFugLNoD2jJ4+0YZYQHDdHCkZYpesnf6UNHYx7TBWFqka55g4oEzM+OAmrel/30tpn7 Xcbfr4zaUqNZY8NXP29dpzubWiVEc4NOJr2mrKOxVe1QGp4SajLb08s4RfnGJJ0JGn8q l/QQ4BwogzdhGQkCfqnT9UR/QyBqBf3E/UsO+bvRFQUxmcGvR/i+E6gC8hfjfu1A544u AxWlcA9s2lURW9vkrSwkiIdxQwyZNitZcvFmUFTwiW+iKAeKMQaeZbk4GYqyEryVCkFc BWKQ== X-Gm-Message-State: AOAM531A1N5nX8bk4+AdKJAOEJx8UAiLfkM5B+h5Me7YSwzmLM9xtyO+ 4K1P6CGuQaghfNcZXKQ6Pl3RHL9PG10= X-Google-Smtp-Source: ABdhPJzDonKwzWh6hntzNBdwZdnJKSHJGXdZqGE71SLxh7LBzveHPffb43zIYqERQ/OgAYc3APqxGQ== X-Received: by 2002:a1c:4b09:0:b0:37c:522:c736 with SMTP id y9-20020a1c4b09000000b0037c0522c736mr757387wma.145.1645641071453; Wed, 23 Feb 2022 10:31:11 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id z16-20020a7bc7d0000000b00381004c643asm453535wmk.30.2022.02.23.10.31.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Feb 2022 10:31:11 -0800 (PST) Message-Id: <0e19607d5d96308d07ae8df65d48d7f29be0ea50.1645641063.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 23 Feb 2022 18:30:44 +0000 Subject: [PATCH 06/25] http: make http_get_file() external Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee This method will be used in an upcoming extension of git-remote-curl to download a single file over HTTP(S) by request. Signed-off-by: Derrick Stolee --- http.c | 4 ++-- http.h | 9 +++++++++ 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/http.c b/http.c index 229da4d1488..04e73149357 100644 --- a/http.c +++ b/http.c @@ -1945,8 +1945,8 @@ int http_get_strbuf(const char *url, * If a previous interrupted download is detected (i.e. a previous temporary * file is still around) the download is resumed. 
*/ -static int http_get_file(const char *url, const char *filename, - struct http_get_options *options) +int http_get_file(const char *url, const char *filename, + struct http_get_options *options) { int ret; struct strbuf tmpfile = STRBUF_INIT; diff --git a/http.h b/http.h index df1590e53a4..ba303cfb372 100644 --- a/http.h +++ b/http.h @@ -163,6 +163,15 @@ struct http_get_options { */ int http_get_strbuf(const char *url, struct strbuf *result, struct http_get_options *options); +/* + * Downloads a URL and stores the result in the given file. + * + * If a previous interrupted download is detected (i.e. a previous temporary + * file is still around) the download is resumed. + */ +int http_get_file(const char *url, const char *filename, + struct http_get_options *options); + int http_fetch_ref(const char *base, struct ref *ref); /* Helpers for fetching packs */ From patchwork Wed Feb 23 18:30:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12757302 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16427C433EF for ; Wed, 23 Feb 2022 18:31:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243900AbiBWSbx (ORCPT ); Wed, 23 Feb 2022 13:31:53 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37396 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243880AbiBWSbm (ORCPT ); Wed, 23 Feb 2022 13:31:42 -0500 Received: from mail-wr1-x436.google.com (mail-wr1-x436.google.com [IPv6:2a00:1450:4864:20::436]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 56B864B403 for ; Wed, 23 Feb 2022 10:31:14 -0800 (PST) Received: by mail-wr1-x436.google.com with SMTP id s13so12547052wrb.6 for ; Wed, 23 Feb 2022 
10:31:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=b2SPFTkKJp2RnDv9/xes81K08Q/esDVTO6037Hllzsg=; b=Z1oEIs5atEDk7bclrgmdt7MB2XI996OIxYSMDNkYxy/yOxfXO4WJVXLxJDKuzdS5jS cTzNxrZRo+sToOQAlv2buRzDyYey5x0DwvIXZFVRgza1ECVoMGbEgQS61DHrJJk2aXDy 9DEDq1CjDORKGd0PTdlXtiYNO8Z21ltnT5k3AQ3RR1iOnqel4SHHmI0zglYL6niqmDlY 8AvX5HqcJ+N3a10Gpderqt4bp6e6ERXnoT4Hifwo6vvrGrUlU/kAXlQ3t6yBB/RLcq7V Rd8gTiZdE190KvPFsh0RZrijyZufb2OPZ+BZtQCgayNEDko8v42p0QBnyE8hj6DPn1tv QKiw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=b2SPFTkKJp2RnDv9/xes81K08Q/esDVTO6037Hllzsg=; b=1tqBjWXb8R/PEDwQr63jOMiWZGj609Xqh9hzAHwiDahvzufsEOSdJwcogfUvdnmSGp DrpFxvAHbjbG04GfrQHguo6hhlsLTMtuBNMUpF0tVMxvxWgYIGueg61nPSln+r1O9NrV MCHC0lth5RB7zpXhFtyYfCF4i3E4At+Nw58TE79F1zF1NgvEShAfIhlJ2e/LR4s1d3no DbKuJ3dtz6JUYbIQRti/R4oMlvDhYJ+u/jkYwFh3oa4Nd+dIoLMWJhrjUjlUA2KSMnUi HvbI7VfmbnyRtLIFZqJjzTOErnF+Mx5PilIWZ58yDWOybDWcUQJKLYJdmYc3eym7HzPo DBSw== X-Gm-Message-State: AOAM533GHT+F+GEc8W7OFy+6RqIrwBECksUVO8f+cpdRTs3wRe/8XXkS xJN+4leSrGL+l+2dh6xmGqkMV6imgfA= X-Google-Smtp-Source: ABdhPJzOOn8G1mCZZUdcfYy2h+DwWrMvZ6utixGafoyR4625A/goEZVh0ax5JFF+kUFEOJlTPmfbLA== X-Received: by 2002:adf:d1cf:0:b0:1ea:937c:1c89 with SMTP id b15-20020adfd1cf000000b001ea937c1c89mr655188wrd.602.1645641072563; Wed, 23 Feb 2022 10:31:12 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id w4sm272599wre.102.2022.02.23.10.31.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Feb 2022 10:31:12 -0800 (PST) Message-Id: In-Reply-To: References: Date: Wed, 23 Feb 2022 18:30:45 +0000 Subject: [PATCH 07/25] remote-curl: add 'get' capability Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org 
Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee A future change will want a way to download a file over HTTP(S) using the simplest of download mechanisms. We do not want to assume that the server on the other side understands anything about the Git protocol; it could be a simple static web server. Create the new 'get' capability for the remote helpers, which advertises that the 'get' command is available. A caller can send a line containing 'get', a URL, and a local path, separated by single spaces, to download the file at that URL into the file at that path. RFC-TODO: This change requires tests directly on the remote helper. Signed-off-by: Derrick Stolee --- Documentation/gitremote-helpers.txt | 6 ++++++ remote-curl.c | 32 +++++++++++++++++++++++++++++ 2 files changed, 38 insertions(+) diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt index 6f1e269ae43..f82588601a9 100644 --- a/Documentation/gitremote-helpers.txt +++ b/Documentation/gitremote-helpers.txt @@ -168,6 +168,9 @@ Supported commands: 'list', 'import'. Can guarantee that when a clone is requested, the received pack is self contained and is connected. +'get':: + Can use the 'get' command to download a file from a given URI. + If a helper advertises 'connect', Git will use it if possible and fall back to another capability if the helper requests so when connecting (see the 'connect' command under COMMANDS). @@ -418,6 +421,9 @@ Supported if the helper has the "connect" capability. + Supported if the helper has the "stateless-connect" capability. +'get' :: + Downloads the file from the given `` to the given ``. + If a fatal error occurs, the program writes the error message to stderr and exits.
The caller should expect that a suitable error message has been printed if the child closes the connection without diff --git a/remote-curl.c b/remote-curl.c index 0dabef2dd7c..92beb98631b 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -1270,6 +1270,33 @@ static void parse_fetch(struct strbuf *buf) strbuf_reset(buf); } +static void parse_get(struct strbuf *buf) +{ + struct http_get_options opts = { 0 }; + struct strbuf url = STRBUF_INIT; + struct strbuf path = STRBUF_INIT; + const char *p, *space; + + if (!skip_prefix(buf->buf, "get ", &p)) + die(_("http transport does not support %s"), buf->buf); + + space = strchr(p, ' '); + + if (!space) + die(_("protocol error: expected ' ', missing space")); + + strbuf_add(&url, p, space - p); + strbuf_addstr(&path, space + 1); + + http_get_file(url.buf, path.buf, &opts); + + strbuf_release(&url); + strbuf_release(&path); + printf("\n"); + fflush(stdout); + strbuf_reset(buf); +} + static int push_dav(int nr_spec, const char **specs) { struct child_process child = CHILD_PROCESS_INIT; @@ -1542,9 +1569,14 @@ int cmd_main(int argc, const char **argv) printf("unsupported\n"); fflush(stdout); + } else if (skip_prefix(buf.buf, "get ", &arg)) { + parse_get(&buf); + fflush(stdout); + } else if (!strcmp(buf.buf, "capabilities")) { printf("stateless-connect\n"); printf("fetch\n"); + printf("get\n"); printf("option\n"); printf("push\n"); printf("check-connectivity\n"); From patchwork Wed Feb 23 18:30:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12757303 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9148C433F5 for ; Wed, 23 Feb 2022 18:31:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243902AbiBWSby (ORCPT 
); Wed, 23 Feb 2022 13:31:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37412 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243881AbiBWSbn (ORCPT ); Wed, 23 Feb 2022 13:31:43 -0500 Received: from mail-wr1-x435.google.com (mail-wr1-x435.google.com [IPv6:2a00:1450:4864:20::435]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3F62A45AE4 for ; Wed, 23 Feb 2022 10:31:15 -0800 (PST) Received: by mail-wr1-x435.google.com with SMTP id p9so41064621wra.12 for ; Wed, 23 Feb 2022 10:31:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Aa7ncIT6vPEUq/ovGdDxjj1cqheO8Rxlz3GuEJwC19k=; b=khvYr5OHfooyQRfCkBHHdj2Hi2F7bR2nDXi/S6KZwEpqjOFR9tSa2BTqjLFagj5PvR +63YS5+/VZYE6HXpSN+U8XJWF/aEBp6x0I9TMMbpZtXuO00Ep32/QSj6SwMBomE3HsXp WV0bgWYTORokYcDxOjP2qcz2gLkxUZoMck20+yDmCW5u3mnvMdlRnMVbSzHh9H+0Bt/N uebYeEMazEWcU/2QNEhxjyDgLm3uA4ej9qlrjOubdnOjic1UrdyoSC7yaSQGcwjzaWTV jcg+jCXd1Gt/kbzUCJfXcNk0f3MftJdDUgPun4IQMt7ctljqdDNEYISXUiPOukXxPoZr LnrA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Aa7ncIT6vPEUq/ovGdDxjj1cqheO8Rxlz3GuEJwC19k=; b=l9/4QvQFv36BOUytKEaKjAEJhIFQSuXOY6AbBa6mVbaK+XKHOf38EPjDm5tiUIRS3D EdID4y1iTDhm5s5UXN5hMhzWejbLrB4OvgGvPSvTbYRquNWd5akwlO/80s1i9Bdl0gyq LQxBExqU80Zsxgtx2chsK93f13iDYwZtnxNTvcCH4iZxkQWIfvZj32UFwmcf5g6jc6W5 mT5Mukbg1VK0C+LSz9cElyu4ey1hitzXUMZI5W/nDxkYUhh4bUMXx2mlrBJ/3Hs+IKRt M12zpGgM5N6tmcIcllW3u/2L7nrCjptiEHugiHjV2As25W+dHio4pGJhhG+zB4doFQB6 Y4kw== X-Gm-Message-State: AOAM530qRuMMSNtlSA7Pfm1iqQI/LY/LM1C8+r4yEh7onrwRl+mSGyJv MtT9uGhIxdW8pyykhuNcZUe+GVYtMow= X-Google-Smtp-Source: ABdhPJyWd1DMRIYcbUm5ZF+RJzlxuGoJvIGFcolVVqrfMTOcevdVWn6qKdT4KUT0BI3stivUpEQO/g== X-Received: by 
2002:adf:f5c3:0:b0:1ed:c1da:9684 with SMTP id k3-20020adff5c3000000b001edc1da9684mr663579wrp.245.1645641073560; Wed, 23 Feb 2022 10:31:13 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id u9-20020a05600c19c900b0037c050d73dcsm464229wmq.46.2022.02.23.10.31.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Feb 2022 10:31:13 -0800 (PST) Message-Id: In-Reply-To: References: Date: Wed, 23 Feb 2022 18:30:46 +0000 Subject: [PATCH 08/25] bundle: implement 'fetch' command for direct bundles Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee

The 'git bundle fetch <uri>' command will be used to download one or more bundles from a specified '<uri>'. The implementation being added here focuses only on downloading a file from '<uri>' and unbundling it if it is a valid bundle file. If it is not a bundle file, then we currently die(), but a later change will attempt to interpret it as a table of contents with possibly multiple bundles listed, along with other metadata for each bundle.

This is why cmd_bundle_fetch() has three carefully commented steps, including a "stack" that can currently hold only one bundle. We will later update this while loop to push onto the stack when necessary.

RFC-TODO: Add documentation to Documentation/git-bundle.txt

RFC-TODO: Add direct tests of 'git bundle fetch' when the URI is a bundle file.

RFC-TODO: Split out the docs and subcommand boilerplate into its own commit.
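
As a rough illustration of how the "stack" mentioned above might behave once pushing is supported, here is a minimal standalone sketch. It is not part of this patch: struct toy_bundle, requires_bundle and toy_fetch() are hypothetical names, and "prerequisites are present" is modeled by a simple flag on the required bundle rather than by a lookup in the object database.

```c
#include <stddef.h>

/* Hypothetical stand-in for the patch's struct remote_bundle_info. */
struct toy_bundle {
	const char *id;
	struct toy_bundle *requires_bundle; /* holds our prerequisites, or NULL */
	int applied;                        /* set once "unbundled" */
	int pushed;                         /* set once our requirement was pushed */
	struct toy_bundle *stack_next;      /* link to the entry below us on the stack */
};

/*
 * Walk the stack starting from the newest bundle.
 * Returns the number of bundles applied, or -1 if a bundle is still
 * unsatisfied after its required bundle was processed (for example,
 * because the requirements form a cycle).
 */
int toy_fetch(struct toy_bundle *newest)
{
	struct toy_bundle *stack = newest;
	int applied = 0;

	while (stack) {
		/* Assumption: valid means "no requirement, or requirement applied". */
		int valid = !stack->requires_bundle ||
			    stack->requires_bundle->applied;

		if (valid) {
			stack->applied = 1;
			applied++;
			stack = stack->stack_next; /* pop */
		} else if (stack->pushed) {
			return -1; /* still invalid after the required bundle */
		} else {
			struct toy_bundle *req = stack->requires_bundle;

			stack->pushed = 1;
			req->stack_next = stack; /* push the requirement on top */
			stack = req;
		}
	}
	return applied;
}
```

With a two-element chain (a daily bundle requiring a base bundle), the base is applied first and the daily bundle second. The 'pushed' flag is what keeps a cyclic or unsatisfiable chain from looping forever.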
Signed-off-by: Derrick Stolee
---
 builtin/bundle.c | 261 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 261 insertions(+)

diff --git a/builtin/bundle.c b/builtin/bundle.c
index 8187b7df739..0e06f1756d1 100644
--- a/builtin/bundle.c
+++ b/builtin/bundle.c
@@ -3,6 +3,10 @@
 #include "parse-options.h"
 #include "cache.h"
 #include "bundle.h"
+#include "run-command.h"
+#include "hashmap.h"
+#include "object-store.h"
+#include "refs.h"
 
 /*
  * Basic handler for bundle files to connect repositories via sneakernet.
@@ -13,6 +17,7 @@
 
 static const char * const builtin_bundle_usage[] = {
   N_("git bundle create [] "),
+  N_("git bundle fetch [] "),
   N_("git bundle list-heads [...]"),
   N_("git bundle unbundle [...]"),
   N_("git bundle verify [] "),
@@ -24,6 +29,11 @@ static const char * const builtin_bundle_create_usage[] = {
   NULL
 };
 
+static const char * const builtin_bundle_fetch_usage[] = {
+  N_("git bundle fetch [--filter=] "),
+  NULL
+};
+
 static const char * const builtin_bundle_list_heads_usage[] = {
   N_("git bundle list-heads [...]"),
   NULL
@@ -131,6 +141,255 @@ cleanup:
 	return ret;
 }
 
+/**
+ * The remote_bundle_info struct contains the necessary data for
+ * the list of bundles advertised by a table of contents. If the
+ * bundle URI instead contains a single bundle, then this struct
+ * can represent a single bundle without a 'uri' but with a
+ * tempfile storing its current location on disk.
+ */
+struct remote_bundle_info {
+	struct hashmap_entry ent;
+
+	/**
+	 * The 'id' is a name given to the bundle for reference
+	 * by other bundle infos.
+	 */
+	char *id;
+
+	/**
+	 * The 'uri' is the location of the remote bundle so
+	 * it can be downloaded on-demand. This will be NULL
+	 * if there was no table of contents.
+	 */
+	char *uri;
+
+	/**
+	 * The 'next_id' string, if non-NULL, contains the 'id'
+	 * for a bundle that contains the prerequisites for this
+	 * bundle. Used by table of contents to allow fetching
+	 * a portion of a repository incrementally.
+	 */
+	char *next_id;
+
+	/**
+	 * A table of contents can include a timestamp for the
+	 * bundle as a heuristic for describing a list of bundles
+	 * in order of recency.
+	 */
+	timestamp_t timestamp;
+
+	/**
+	 * If the bundle has been downloaded, then 'file' is a
+	 * filename storing its contents. Otherwise, 'file' is
+	 * an empty string.
+	 */
+	struct strbuf file;
+
+	/**
+	 * The 'stack_next' pointer allows this struct to form
+	 * a stack.
+	 */
+	struct remote_bundle_info *stack_next;
+};
+
+static void download_uri_to_file(const char *uri, const char *file)
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+	FILE *child_in;
+
+	strvec_pushl(&cp.args, "git-remote-https", "origin", uri, NULL);
+	cp.in = -1;
+	cp.out = -1;
+
+	if (start_command(&cp))
+		die(_("failed to start remote helper"));
+
+	child_in = fdopen(cp.in, "w");
+	if (!child_in)
+		die(_("cannot write to child process"));
+
+	fprintf(child_in, "get %s %s\n\n", uri, file);
+	fclose(child_in);
+
+	if (finish_command(&cp))
+		die(_("remote helper failed"));
+}
+
+static void find_temp_filename(struct strbuf *name)
+{
+	int fd;
+	/*
+	 * Find a temporary filename that is available. This is briefly
+	 * racy, but unlikely to collide.
+	 */
+	fd = odb_mkstemp(name, "bundles/tmp_uri_XXXXXX");
+	if (fd < 0)
+		die(_("failed to create temporary file"));
+	close(fd);
+	unlink(name->buf);
+}
+
+static void unbundle_fetched_bundle(struct remote_bundle_info *info)
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+	FILE *f;
+	struct strbuf line = STRBUF_INIT;
+	struct strbuf bundle_ref = STRBUF_INIT;
+	size_t bundle_prefix_len;
+
+	strvec_pushl(&cp.args, "bundle", "unbundle",
+		     info->file.buf, NULL);
+	cp.git_cmd = 1;
+	cp.out = -1;
+
+	if (start_command(&cp))
+		die(_("failed to start 'unbundle' process"));
+
+	strbuf_addstr(&bundle_ref, "refs/bundles/");
+	bundle_prefix_len = bundle_ref.len;
+
+	f = fdopen(cp.out, "r");
+	while (strbuf_getline(&line, f) != EOF) {
+		struct object_id oid, old_oid;
+		const char *refname, *branch_name, *end;
+		char *space;
+		int has_old;
+
+		strbuf_trim_trailing_newline(&line);
+
+		space = strchr(line.buf, ' ');
+
+		if (!space)
+			continue;
+
+		refname = space + 1;
+		*space = '\0';
+		parse_oid_hex(line.buf, &oid, &end);
+
+		if (!skip_prefix(refname, "refs/heads/", &branch_name))
+			continue;
+
+		strbuf_setlen(&bundle_ref, bundle_prefix_len);
+		strbuf_addstr(&bundle_ref, branch_name);
+
+		has_old = !read_ref(bundle_ref.buf, &old_oid);
+
+		update_ref("bundle fetch", bundle_ref.buf, &oid,
+			   has_old ? &old_oid : NULL,
+			   REF_SKIP_OID_VERIFICATION,
+			   UPDATE_REFS_MSG_ON_ERR);
+	}
+
+	if (finish_command(&cp))
+		die(_("failed to unbundle bundle from '%s'"), info->uri);
+
+	unlink_or_warn(info->file.buf);
+}
+
+static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix)
+{
+	int ret = 0;
+	int progress = isatty(2);
+	char *bundle_uri;
+	struct remote_bundle_info first_file = {
+		.file = STRBUF_INIT,
+	};
+	struct remote_bundle_info *stack = NULL;
+
+	struct option options[] = {
+		OPT_BOOL(0, "progress", &progress,
+			 N_("show progress meter")),
+		OPT_END()
+	};
+
+	argc = parse_options_cmd_bundle(argc, argv, prefix,
+			builtin_bundle_fetch_usage, options, &bundle_uri);
+
+	if (!startup_info->have_repository)
+		die(_("'fetch' requires a repository"));
+
+	/*
+	 * Step 1: determine protocol for uri, and download contents to
+	 * a temporary location.
+	 */
+	first_file.uri = bundle_uri;
+	find_temp_filename(&first_file.file);
+	download_uri_to_file(bundle_uri, first_file.file.buf);
+
+	/*
+	 * Step 2: Check if the file is a bundle (if so, add it to the
+	 * stack and move to step 3).
+	 */
+
+	if (is_bundle(first_file.file.buf, 1)) {
+		/* The simple case: only one file, no stack to worry about. */
+		stack = &first_file;
+	} else {
+		/* TODO: Expect and parse a table of contents. */
+		die(_("unexpected data at bundle URI"));
+	}
+
+	/*
+	 * Step 3: For each bundle in the stack:
+	 * 	i. If not downloaded to a temporary file, download it.
+	 * 	ii. Once downloaded, check that its prerequisites are in
+	 * 	    the object database. If not, then push its dependent
+	 * 	    bundle onto the stack. (Fail if no such bundle exists.)
+	 * 	iii. If all prerequisites are present, then unbundle the
+	 * 	     temporary file and pop the bundle from the stack.
+	 */
+	while (stack) {
+		int valid = 1;
+		int bundle_fd;
+		struct string_list_item *prereq;
+		struct bundle_header header = BUNDLE_HEADER_INIT;
+
+		if (!stack->file.len) {
+			find_temp_filename(&stack->file);
+			download_uri_to_file(stack->uri, stack->file.buf);
+			if (!is_bundle(stack->file.buf, 1))
+				die(_("file downloaded from '%s' is not a bundle"), stack->uri);
+		}
+
+		bundle_header_init(&header);
+		bundle_fd = read_bundle_header(stack->file.buf, &header);
+		if (bundle_fd < 0)
+			die(_("failed to read bundle from '%s'"), stack->uri);
+
+		for_each_string_list_item(prereq, &header.prerequisites) {
+			struct object_info info = OBJECT_INFO_INIT;
+			struct object_id *oid = prereq->util;
+
+			if (oid_object_info_extended(the_repository, oid, &info,
+						     OBJECT_INFO_QUICK)) {
+				valid = 0;
+				break;
+			}
+		}
+
+		close(bundle_fd);
+		bundle_header_release(&header);
+
+		if (valid) {
+			unbundle_fetched_bundle(stack);
+		} else if (stack->next_id) {
+			/*
+			 * Load the next bundle from the hashtable and
+			 * push it onto the stack.
+			 */
+		} else {
+			die(_("bundle from '%s' has missing prerequisites and no dependent bundle"),
+			    stack->uri);
+		}
+
+		stack = stack->stack_next;
+	}
+
+	free(bundle_uri);
+	return ret;
+}
+
 static int cmd_bundle_list_heads(int argc, const char **argv, const char *prefix) {
 	struct bundle_header header = BUNDLE_HEADER_INIT;
 	int bundle_fd = -1;
@@ -209,6 +468,8 @@ int cmd_bundle(int argc, const char **argv, const char *prefix)
 
 	else if (!strcmp(argv[0], "create"))
 		result = cmd_bundle_create(argc, argv, prefix);
+	else if (!strcmp(argv[0], "fetch"))
+		result = cmd_bundle_fetch(argc, argv, prefix);
 	else if (!strcmp(argv[0], "list-heads"))
 		result = cmd_bundle_list_heads(argc, argv, prefix);
 	else if (!strcmp(argv[0], "unbundle"))

From patchwork Wed Feb 23 18:30:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12757304 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2D7B4C433EF for ; Wed, 23 Feb 2022 18:31:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243904AbiBWSb5 (ORCPT ); Wed, 23 Feb 2022 13:31:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37430 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243884AbiBWSbq (ORCPT ); Wed, 23 Feb 2022 13:31:46 -0500 Received: from mail-wm1-x332.google.com (mail-wm1-x332.google.com [IPv6:2a00:1450:4864:20::332]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7C1AF4B403 for ; Wed, 23 Feb 2022 10:31:16 -0800 (PST) Received: by mail-wm1-x332.google.com with SMTP id w13so13812958wmi.2 for ; Wed, 23 Feb 2022 10:31:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112;
h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=dphLSFrZz7UAc1+gvJ5y8oPCJbsw/pv5ss6NDgClwYQ=; b=DOOaa7JZ4ZVWW2v/h/A449F46DWVQNGQOK3KfJnj+sNn/6w3NDzfG1ZN6BaLKmdEPE tEJPqIrKiBPJ5ijj0mG2nljVWhuUKJ+/ss2oW+430A0GZe9kbgNBf1iEhVd25+zXAdmh xrizE4tFQJRUNAhRDLD+CaheCGICuu99PH5pPaNmEBByYNfI7H/GEx3Z7rh3ATLG/cZH +CMvwPsa3A4sM5YIwAeSRwG7OUOF6VNUAX31xCHXcsOFH/WTgEZsOdhmkDivnQjcBD/H j+zhOzNxkNiEX5/biSQKseUBxIAULvzDfWowbkD506HgVNLmeATSUzz6C3VQoJzUyYXY xTxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=dphLSFrZz7UAc1+gvJ5y8oPCJbsw/pv5ss6NDgClwYQ=; b=4P+GYZaSriiVzaaeJ+B94KnT8Ecrp8erbMWKNZ/q4BfXXclxT/TrzGwOO3z6CPY0ok WL+GRC2A3vSA1AefXUwTmmDr0UqTBB9o4yup1wJcZwRymB7jgxwuYucoxHa998416ZbG bKZ5zmm/vLpjTssTxT+dd4bDcfEBfFIQWhRA8YVj6N0w8bwcnCbH54JUJbqKZSGkyFNh TOopJRxrNf0VGK6shRyZTrFQ0HAhyqYAp2GILW4tVG4nBVx0iE/NL71HWGo/7m8EJBvt ARht5LF0oQJSbQZZHulGV8MGTiGF8ncoLX3xExmi+H5l1QJAeClSvrUx6KiEfzw9tOmF rqgw== X-Gm-Message-State: AOAM531+7xBwDgw1oovp8I4C3NHxEQMeEsoGUtMn2BSgXMJzjR7FcKgN AWfceGAkggUyT7dYPP7Px3KbS77Z4PM= X-Google-Smtp-Source: ABdhPJy8lJXlH/+71vQa3QXo0xW8z6Wnu2zwpXRQzi5wVj7zwJhP3nq0uhX61mZBWaEjvnou7oycdw== X-Received: by 2002:a7b:c016:0:b0:37b:ebf6:3d13 with SMTP id c22-20020a7bc016000000b0037bebf63d13mr8794446wmb.191.1645641074734; Wed, 23 Feb 2022 10:31:14 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id g6sm274386wrq.97.2022.02.23.10.31.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Feb 2022 10:31:13 -0800 (PST) Message-Id: <7221bd9a8c7b3c81e20a0b5cd2f7b8b5e248a81f.1645641063.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 23 Feb 2022 18:30:47 +0000 Subject: [PATCH 09/25] bundle: parse table of contents during 'fetch' Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: 
gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee

In order to support a flexible bundle URI feature, we allow the server to return a "table of contents" file that is formatted according to Git config file standards. These files can describe multiple bundles, intended to assist with using bundle URIs for fetching or with partial clone.

Here is an example table of contents file:

	[bundle "tableofcontents"]
		version = 1

	[bundle "2022-02-09-1644442601-daily"]
		uri = 2022-02-09-1644442601-daily.bundle
		timestamp = 1644442601
		requires = 2022-02-02-1643842562

	[bundle "2022-02-02-1643842562"]
		uri = 2022-02-02-1643842562.bundle
		timestamp = 1643842562

	[bundle "2022-02-09-1644442631-daily-blobless"]
		uri = 2022-02-09-1644442631-daily-blobless.bundle
		timestamp = 1644442631
		requires = 2022-02-02-1643842568-blobless
		filter = blob:none

	[bundle "2022-02-02-1643842568-blobless"]
		uri = 2022-02-02-1643842568-blobless.bundle
		timestamp = 1643842568
		filter = blob:none

(End of example.)

This file contains some important fixed values, such as

* bundle.tableofcontents.version = 1

Also, different bundles are referenced by <id>, using keys with names

* bundle.<id>.uri: the URI to download this bundle. This could be an absolute URI or a URI relative to the bundle file's URI.

* bundle.<id>.timestamp: the timestamp when this file was generated.

* bundle.<id>.filter: the partial clone filter applied on this bundle.

* bundle.<id>.requires: the ID for the previous bundle.

The current change records the '.filter' value but does not yet act on it. It does use the '.requires' value in the 'while (stack)' loop.

The process is that 'git bundle fetch' parses the table of contents, picks the most recent bundle, and downloads it. That bundle's header has a ref listing and possibly a list of prerequisite commits that are missing from the bundle.
If any of those commits are missing from the local repository, then Git downloads the bundle specified by the '.requires' value and tries again. Eventually, Git should download a bundle where all missing commits actually exist in the current repository, or Git downloads a bundle with no missing commits.

Of course, the server could be advertising incorrect information, so it could advertise bundles that never satisfy the missing objects. It could also create a directed cycle in its '.requires' specifications. In each of these cases, Git will die with a "bundle '<id>' still invalid after downloading required bundle" message or a "bundle from '<uri>' has missing prerequisites and no dependent bundle" message.

RFC-TODO: add a direct test of table of contents parsing in this change.

RFC-TODO: create tests that check these erroneous cases.

Signed-off-by: Derrick Stolee
---
 builtin/bundle.c | 169 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 162 insertions(+), 7 deletions(-)

diff --git a/builtin/bundle.c b/builtin/bundle.c
index 0e06f1756d1..66f3b3c9376 100644
--- a/builtin/bundle.c
+++ b/builtin/bundle.c
@@ -7,6 +7,8 @@
 #include "hashmap.h"
 #include "object-store.h"
 #include "refs.h"
+#include "config.h"
+#include "packfile.h"
 
 /*
  * Basic handler for bundle files to connect repositories via sneakernet.
@@ -165,12 +167,21 @@ struct remote_bundle_info {
 	char *uri;
 
 	/**
-	 * The 'next_id' string, if non-NULL, contains the 'id'
+	 * The 'requires_id' string, if non-NULL, contains the 'id'
 	 * for a bundle that contains the prerequisites for this
 	 * bundle. Used by table of contents to allow fetching
 	 * a portion of a repository incrementally.
 	 */
-	char *next_id;
+	char *requires_id;
+
+	/**
+	 * The 'filter_str' string, if non-NULL, specifies the
+	 * filter capability exists in this bundle with the given
+	 * specification. Allows selecting bundles that match the
+	 * client's desired filter. If NULL, then no filter exists
+	 * on the bundle.
+	 */
+	char *filter_str;
 
 	/**
 	 * A table of contents can include a timestamp for the
@@ -191,8 +202,106 @@ struct remote_bundle_info {
 	 * a stack.
 	 */
 	struct remote_bundle_info *stack_next;
+
+	/**
+	 * 'pushed' is set when first pushing the required bundle
+	 * onto the stack. Used to error out when verifying the
+	 * prerequisites and avoiding an infinite loop.
+	 */
+	unsigned pushed:1;
 };
 
+static int remote_bundle_cmp(const void *unused_cmp_data,
+			     const struct hashmap_entry *a,
+			     const struct hashmap_entry *b,
+			     const void *key)
+{
+	const struct remote_bundle_info *ee1 =
+			container_of(a, struct remote_bundle_info, ent);
+	const struct remote_bundle_info *ee2 =
+			container_of(b, struct remote_bundle_info, ent);
+
+	return strcmp(ee1->id, ee2->id);
+}
+
+static int parse_toc_config(const char *key, const char *value, void *data)
+{
+	struct hashmap *toc = data;
+	const char *key1, *key2, *id_end;
+	struct strbuf id = STRBUF_INIT;
+	struct remote_bundle_info info_lookup;
+	struct remote_bundle_info *info;
+
+	if (!skip_prefix(key, "bundle.", &key1))
+		return -1;
+
+	if (skip_prefix(key1, "tableofcontents.", &key2)) {
+		if (!strcmp(key2, "version")) {
+			int version = git_config_int(key, value);
+
+			if (version != 1) {
+				warning(_("table of contents version %d not understood"), version);
+				return -1;
+			}
+		}
+
+		return 0;
+	}
+
+	id_end = strchr(key1, '.');
+
+	/*
+	 * If this key is of the form "bundle.<id>" with no third item,
+	 * then we do not know about it. We should ignore it. Later versions
+	 * might start caring about this data on an optional basis. Increase
+	 * the version number to add keys that must be understood.
+	 */
+	if (!id_end)
+		return 0;
+
+	strbuf_add(&id, key1, id_end - key1);
+	key2 = id_end + 1;
+
+	info_lookup.id = id.buf;
+	hashmap_entry_init(&info_lookup.ent, strhash(info_lookup.id));
+	if (!(info = hashmap_get_entry(toc, &info_lookup, ent, NULL))) {
+		CALLOC_ARRAY(info, 1);
+		info->id = strbuf_detach(&id, NULL);
+		strbuf_init(&info->file, 0);
+		hashmap_entry_init(&info->ent, strhash(info->id));
+		hashmap_add(toc, &info->ent);
+	}
+
+	if (!strcmp(key2, "uri")) {
+		if (info->uri)
+			warning(_("duplicate 'uri' value for id '%s'"), info->id);
+		else
+			info->uri = xstrdup(value);
+		return 0;
+	} else if (!strcmp(key2, "timestamp")) {
+		if (info->timestamp)
+			warning(_("duplicate 'timestamp' value for id '%s'"), info->id);
+		else
+			info->timestamp = git_config_int64(key, value);
+		return 0;
+	} else if (!strcmp(key2, "requires")) {
+		if (info->requires_id)
+			warning(_("duplicate 'requires' value for id '%s'"), info->id);
+		else
+			info->requires_id = xstrdup(value);
+		return 0;
+	} else if (!strcmp(key2, "filter")) {
+		if (info->filter_str)
+			warning(_("duplicate 'filter' value for id '%s'"), info->id);
+		else
+			info->filter_str = xstrdup(value);
+		return 0;
+	}
+
+	/* Return 0 here to ignore unknown options. */
+	return 0;
+}
+
 static void download_uri_to_file(const char *uri, const char *file)
 {
 	struct child_process cp = CHILD_PROCESS_INIT;
@@ -289,13 +398,14 @@ static void unbundle_fetched_bundle(struct remote_bundle_info *info)
 
 static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix)
 {
-	int ret = 0;
+	int ret = 0, used_hashmap = 0;
 	int progress = isatty(2);
 	char *bundle_uri;
 	struct remote_bundle_info first_file = {
 		.file = STRBUF_INIT,
 	};
 	struct remote_bundle_info *stack = NULL;
+	struct hashmap toc = { 0 };
 
 	struct option options[] = {
 		OPT_BOOL(0, "progress", &progress,
@@ -319,15 +429,31 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix)
 
 	/*
 	 * Step 2: Check if the file is a bundle (if so, add it to the
-	 * stack and move to step 3).
+	 * stack and move to step 3). Otherwise, expect it to be a table
+	 * of contents. Use the table to populate a hashtable of bundles
+	 * and push the most recent bundle to the stack.
 	 */
 
 	if (is_bundle(first_file.file.buf, 1)) {
 		/* The simple case: only one file, no stack to worry about. */
 		stack = &first_file;
 	} else {
-		/* TODO: Expect and parse a table of contents. */
-		die(_("unexpected data at bundle URI"));
+		struct hashmap_iter iter;
+		struct remote_bundle_info *info;
+		timestamp_t max_time = 0;
+
+		/* populate a hashtable with all relevant bundles. */
+		used_hashmap = 1;
+		hashmap_init(&toc, remote_bundle_cmp, NULL, 0);
+		git_config_from_file(parse_toc_config, first_file.file.buf, &toc);
+
+		/* initialize stack using timestamp heuristic. */
+		hashmap_for_each_entry(&toc, &iter, info, ent) {
+			if (info->timestamp > max_time || !stack) {
+				stack = info;
+				max_time = info->timestamp;
+			}
+		}
 	}
 
 	/*
@@ -357,6 +483,7 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix)
 		if (bundle_fd < 0)
 			die(_("failed to read bundle from '%s'"), stack->uri);
 
+		reprepare_packed_git(the_repository);
 		for_each_string_list_item(prereq, &header.prerequisites) {
 			struct object_info info = OBJECT_INFO_INIT;
 			struct object_id *oid = prereq->util;
@@ -373,11 +500,28 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix)
 
 		if (valid) {
 			unbundle_fetched_bundle(stack);
-		} else if (stack->next_id) {
+		} else if (stack->pushed) {
+			die(_("bundle '%s' still invalid after downloading required bundle"), stack->id);
+		} else if (stack->requires_id) {
 			/*
 			 * Load the next bundle from the hashtable and
 			 * push it onto the stack.
 			 */
+			struct remote_bundle_info *info;
+			struct remote_bundle_info info_lookup = { 0 };
+			info_lookup.id = stack->requires_id;
+
+			hashmap_entry_init(&info_lookup.ent, strhash(info_lookup.id));
+			if ((info = hashmap_get_entry(&toc, &info_lookup, ent, NULL))) {
+				/* Push onto the stack */
+				stack->pushed = 1;
+				info->stack_next = stack;
+				stack = info;
+				continue;
+			} else {
+				die(_("unable to find bundle '%s' required by bundle '%s'"),
+				    stack->requires_id, stack->id);
+			}
 		} else {
 			die(_("bundle from '%s' has missing prerequisites and no dependent bundle"),
 			    stack->uri);
@@ -386,6 +530,17 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix)
 		stack = stack->stack_next;
 	}
 
+	if (used_hashmap) {
+		struct hashmap_iter iter;
+		struct remote_bundle_info *info;
+		hashmap_for_each_entry(&toc, &iter, info, ent) {
+			free(info->id);
+			free(info->uri);
+			free(info->requires_id);
+			free(info->filter_str);
+		}
+		hashmap_clear_and_free(&toc, struct remote_bundle_info, ent);
+	}
 	free(bundle_uri);
 	return ret;
 }

From patchwork Wed Feb 23 18:30:48 2022 Content-Type:
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12757306 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F406C433FE for ; Wed, 23 Feb 2022 18:31:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243935AbiBWScM (ORCPT ); Wed, 23 Feb 2022 13:32:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37444 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243886AbiBWSbq (ORCPT ); Wed, 23 Feb 2022 13:31:46 -0500 Received: from mail-wm1-x332.google.com (mail-wm1-x332.google.com [IPv6:2a00:1450:4864:20::332]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 487714B416 for ; Wed, 23 Feb 2022 10:31:17 -0800 (PST) Received: by mail-wm1-x332.google.com with SMTP id c18-20020a7bc852000000b003806ce86c6dso2103982wml.5 for ; Wed, 23 Feb 2022 10:31:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Gk7hwcevbSQEOI3ogjwQ6/G/ReEKIjYaCr5lOLD/awU=; b=dwFKRG9FYKQWdJ5NSphVxPfeWyOirI2jfTjvERZJCesgx1QAzEm9fVR1czBqC7zvqk zJlfseaFrfHufTVLublsDOCet68Kl91ZfmaTl4kJTLnb9SxDdLNMTDSOvos5NkwOCtXq wmWL96QCBlgAzHXDGlCtJmDr9Qtu6ZTa39MoI9GX5n/eTA7lgMNrCaW4uAHjj5Qux6bd bpIfF1nhczIf/q3eiBAOuSPjlfl+98/vlwJ/JN5Cp979c8AuPzmNJcuSAtvexFyiF97W n7NDN6ljXVWQiDwE2cYcIktpoUzNiUBrp76wlNaWpFfncALevNVGsti49Lxv9SOEyzho AHsg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Gk7hwcevbSQEOI3ogjwQ6/G/ReEKIjYaCr5lOLD/awU=; 
b=Os8GTahfVQA078siivyvsgwqId7iF6jZvmNpkp3X3QUQ/LnSF6IiuMFcIG98OxM86Y YC43EdtUGPO7irLLRYrwmUkIwYHlZrl9Gu3lLw8LXKbhaQ6DPPuHjuR09mkSYcPfuM2Y sMbBCDNv3BzI8biTzU0aQ4JS7tdO4VakQ9HT08o6hrhTdnYavNqdq8iRyrBzjtX9U5g5 LDYivE1HhTJsLlPaPmT0Yrn3rio3hS8QzPQJsOGvRURLDtgHl1kZiHo/sc32xTc5eq/g I2Nfo/YmU6Jjw5lfvXOUjZm60XMbKtka+PaTcy6QVzzSi/l1w+doWpxfr9361/myOodX 1QyA== X-Gm-Message-State: AOAM530kM9e/FK12Y1U1odG0l9fk3aIZISQFGqmx9Zx6GU1CBtEmoJqJ C0Hie6NNQqNQ2WhYbghXqqPV70iBfjQ= X-Google-Smtp-Source: ABdhPJwcqZ2FYfcvZ+h0Nb91OF2MTityydREWKs6WquZDCxD1E5Hg3AyLKfW71FDYtaVLR+9CJppag== X-Received: by 2002:a1c:a915:0:b0:380:e3de:b78f with SMTP id s21-20020a1ca915000000b00380e3deb78fmr6134867wme.19.1645641075644; Wed, 23 Feb 2022 10:31:15 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id m31sm282236wms.4.2022.02.23.10.31.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Feb 2022 10:31:15 -0800 (PST) Message-Id: In-Reply-To: References: Date: Wed, 23 Feb 2022 18:30:48 +0000 Subject: [PATCH 10/25] bundle: add --filter option to 'fetch' Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee When a repository uses an object filter for partial clone, the 'git bundle fetch' command should try to download bundles that match that filter. Teach 'git bundle fetch' to take a '--filter' option and then only consider bundles that match that filter (or lack thereof). This allows the bundle server to advertise different sets of bundles for different filters. Add some verification to be sure that the bundle we downloaded actually uses that filter. This is especially important when no filter is requested but the downloaded bundle _does_ have a filter. RFC-TODO: add tests for the happy path. RFC-TODO: add tests for these validations. 
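
The matching rule described above can be sketched in isolation as follows. This is only an illustration under stated assumptions: filter_matches() is a hypothetical helper name that does not appear in the patch, and it takes plain strings where the real code reads the advertised value from the table of contents or from the bundle header.

```c
#include <stddef.h>
#include <strings.h> /* strcasecmp (POSIX) */

/*
 * Hypothetical helper: returns 1 when a bundle advertising
 * 'advertised' is acceptable for a client that asked for
 * 'requested'. NULL means "no filter" on either side.
 */
int filter_matches(const char *requested, const char *advertised)
{
	if (!requested)
		return advertised == NULL; /* unfiltered client: reject filtered bundles */
	/* Filtered client: the bundle must carry the same filter, ignoring case. */
	return advertised != NULL && !strcasecmp(requested, advertised);
}
```

The same predicate would apply at both points where the patch compares filters: when choosing a starting bundle from the table of contents, and when checking the header of a downloaded bundle.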
Signed-off-by: Derrick Stolee
---
 builtin/bundle.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/builtin/bundle.c b/builtin/bundle.c
index 66f3b3c9376..27da5e3737f 100644
--- a/builtin/bundle.c
+++ b/builtin/bundle.c
@@ -9,6 +9,7 @@
 #include "refs.h"
 #include "config.h"
 #include "packfile.h"
+#include "list-objects-filter-options.h"
 
 /*
  * Basic handler for bundle files to connect repositories via sneakernet.
@@ -406,10 +407,13 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix)
 	};
 	struct remote_bundle_info *stack = NULL;
 	struct hashmap toc = { 0 };
+	const char *filter = NULL;
 
 	struct option options[] = {
 		OPT_BOOL(0, "progress", &progress,
 			 N_("show progress meter")),
+		OPT_STRING(0, "filter", &filter,
+			   N_("filter-spec"), N_("only install bundles matching this filter")),
 		OPT_END()
 	};
 
@@ -449,6 +453,17 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix)
 
 		/* initialize stack using timestamp heuristic. */
 		hashmap_for_each_entry(&toc, &iter, info, ent) {
+			/* Skip if filter does not match. */
+			if (!filter && info->filter_str)
+				continue;
+			if (filter &&
+			    (!info->filter_str || strcasecmp(filter, info->filter_str)))
+				continue;
+
+			/*
+			 * Now that the filter matches, start with the
+			 * bundle with largest timestamp.
+			 */
 			if (info->timestamp > max_time || !stack) {
 				stack = info;
 				max_time = info->timestamp;
@@ -468,6 +483,7 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix)
 	while (stack) {
 		int valid = 1;
 		int bundle_fd;
+		const char *filter_str = NULL;
 		struct string_list_item *prereq;
 		struct bundle_header header = BUNDLE_HEADER_INIT;
 
@@ -483,6 +499,16 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix)
 		if (bundle_fd < 0)
 			die(_("failed to read bundle from '%s'"), stack->uri);
 
+		if (header.filter)
+			filter_str = list_objects_filter_spec(header.filter);
+
+		if (filter && (!filter_str || strcasecmp(filter, filter_str)))
+			die(_("bundle from '%s' does not match expected filter"),
+			    stack->uri);
+		if (!filter && filter_str)
+			die(_("bundle from '%s' has an unexpected filter"),
+			    stack->uri);
+
 		reprepare_packed_git(the_repository);
 		for_each_string_list_item(prereq, &header.prerequisites) {
 			struct object_info info = OBJECT_INFO_INIT;

From patchwork Wed Feb 23 18:30:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12757305 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7678AC433F5 for ; Wed, 23 Feb 2022 18:31:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243916AbiBWScK (ORCPT ); Wed, 23 Feb 2022 13:32:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37454 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243887AbiBWSbq (ORCPT ); Wed, 23 Feb 2022 13:31:46 -0500 Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3908745AE4 for ; Wed, 23 Feb 2022
10:31:18 -0800 (PST) Received: by mail-wr1-x42c.google.com with SMTP id f17so16230133wrh.7 for ; Wed, 23 Feb 2022 10:31:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=x8827e6tp2kO2qZ735l3t7QGL5QSAt3RhPUIWuYgixg=; b=cTUPHe7q2hDFBxPDkUC9wLGfqPYVAEspZF+De1PKGlU9pFGtNwTInL4IuaLMmjYzmT BM5hlGKJOZNRSVJ3r3l7/LC1gzRAyyQEe6UuahXobroPWiPVapMCs4y4vd8YLL58lhvW OBUhnYrJUg2IbStc1OnQKL+j26QK1d0Clk+7RHNJ3HrRiqTNXGSLPjEuylgsx99hrcO2 NwE01AAnrnZbxRcJOIPDjHQyqbxh3rosJW2dCbJlW30ZKgQHnVwP45ACkSIb8QGaGO4t g5ktJymCs0DotL+fsUPa4aBbEfyD/1XeSaSVd07TqQ1VOTHhRyUNg8rLkTdc0epTCim5 m4PA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=x8827e6tp2kO2qZ735l3t7QGL5QSAt3RhPUIWuYgixg=; b=tIwzTni7r4/I1JDXLUalvr9Z4trNMmtBmjfIrw00oqqvA5YdzcHAWOvHfPjDm/ihqF uogPqlFaK7ZBAkEnT8ggk88kuJfU9G3Z4tcFWHeV9saLQx99pYfCa81vn13iUtyvCKTa e+in6mWayle43Bd3cGRlow7RJbROJMbGtW00GorAFLwPZ72arz2jfdRLusrricYi9Krl GUZVsY0rDcqAW4LxMs89JsfcyqLTOI5dSWmFsJJjgdJ1Hufw03k9zHJB+hZE6yi6CDT/ r8XTl7cBS5l1a5AaV7DnL7iTcnz0ZyZJNwfhguu/y1hxFM7rZezmZq4eCs/JeIIK8O4K 3RLw== X-Gm-Message-State: AOAM532OCUUX6gig95JqdKUy8VehznGOuxCl1VCkWVuzTtU0119q1HKY V45s6xS6KynYFcx34X+vAVYv01c0m7k= X-Google-Smtp-Source: ABdhPJwmP61ubEjGQt8G0FvS8V97KyQXNmPWa5tzmIEtCkg2BE7dYJ/nBFrvwZepV7L3t2IM3BuNbA== X-Received: by 2002:a5d:52ca:0:b0:1e5:8cbc:7f2e with SMTP id r10-20020a5d52ca000000b001e58cbc7f2emr670323wrv.309.1645641076549; Wed, 23 Feb 2022 10:31:16 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id r11sm252746wmb.19.2022.02.23.10.31.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Feb 2022 10:31:16 -0800 (PST) Message-Id: In-Reply-To: References: Date: Wed, 23 Feb 2022 18:30:49 
+0000 Subject: [PATCH 11/25] bundle: allow relative URLs in table of contents Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee When hosting bundle data, it can be helpful to distribute that data across multiple CDNs. This might require a change in the base URI, all the way to the domain name. If all bundles require an absolute URI in their 'uri' value, then every push to a CDN would require altering the table of contents to match the expected domain and exact location within it. Allow the table of contents to specify a relative URI for the bundles. This allows easier distribution of bundle data. RFC-TODO: An earlier change referenced relative URLs, but it was not implemented until this change. Signed-off-by: Derrick Stolee --- builtin/bundle.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/builtin/bundle.c b/builtin/bundle.c index 27da5e3737f..ec969a62ae1 100644 --- a/builtin/bundle.c +++ b/builtin/bundle.c @@ -10,6 +10,7 @@ #include "config.h" #include "packfile.h" #include "list-objects-filter-options.h" +#include "remote.h" /* * Basic handler for bundle files to connect repositories via sneakernet. @@ -453,6 +454,8 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix) /* initialize stack using timestamp heuristic. */ hashmap_for_each_entry(&toc, &iter, info, ent) { + char *old_uri; + /* Skip if filter does not match. */ if (!filter && info->filter_str) continue; @@ -460,6 +463,10 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix) (!info->filter_str || strcasecmp(filter, info->filter_str))) continue; + old_uri = info->uri; + info->uri = relative_url(bundle_uri, info->uri, NULL); + free(old_uri); + /* * Now that the filter matches, start with the * bundle with largest timestamp. 
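
The selection rule that the hunks above sit inside (skip table-of-contents entries whose filter does not match, then start from the entry with the largest timestamp) can be sketched in isolation. This is an illustrative stand-alone sketch, not code from the series: `struct sketch_bundle`, `sketch_filter_matches()` and `sketch_pick_newest()` are hypothetical names, and the struct is a simplified stand-in for the series' `remote_bundle_info`.

```c
#include <stddef.h>
#include <stdint.h>
#include <strings.h> /* strcasecmp() (POSIX) */

/* Hypothetical, simplified stand-in for the series' remote_bundle_info. */
struct sketch_bundle {
	const char *uri;
	const char *filter_str; /* NULL when the bundle is unfiltered */
	uintmax_t timestamp;
};

/*
 * A NULL 'wanted' filter only matches unfiltered bundles; otherwise the
 * two filter specs are compared case-insensitively, as in the diff.
 */
int sketch_filter_matches(const char *wanted, const char *actual)
{
	if (!wanted)
		return actual == NULL;
	return actual && !strcasecmp(wanted, actual);
}

/* Return the matching entry with the largest timestamp, or NULL if none match. */
const struct sketch_bundle *sketch_pick_newest(const struct sketch_bundle *list,
					       size_t nr, const char *wanted)
{
	const struct sketch_bundle *best = NULL;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!sketch_filter_matches(wanted, list[i].filter_str))
			continue;
		if (!best || list[i].timestamp > best->timestamp)
			best = &list[i];
	}
	return best;
}
```

With a three-entry table where only one entry carries a `blob:none` filter, a NULL filter would pick the newest unfiltered entry, while `"BLOB:NONE"` would pick the filtered one regardless of case.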
From patchwork Wed Feb 23 18:30:50 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12757307
Date: Wed, 23 Feb 2022 18:30:50 +0000
Subject: [PATCH 12/25] bundle: make it easy to call 'git bundle fetch'
From: Derrick Stolee

From: Derrick Stolee

Future changes will integrate 'git bundle fetch' into the 'git clone' and 'git fetch' operations. Make it easy to fetch bundles via a helper method.
Signed-off-by: Derrick Stolee
---
 bundle.c | 21 +++++++++++++++++++++
 bundle.h |  9 +++++++++
 2 files changed, 30 insertions(+)

diff --git a/bundle.c b/bundle.c
index 3d97de40ef0..9e1b5300366 100644
--- a/bundle.c
+++ b/bundle.c
@@ -649,3 +649,24 @@ int unbundle(struct repository *r, struct bundle_header *header,
 		return error(_("index-pack died"));
 	return 0;
 }
+
+int fetch_bundle_uri(const char *bundle_uri,
+		     const char *filter)
+{
+	int res = 0;
+	struct strvec args = STRVEC_INIT;
+
+	strvec_pushl(&args, "bundle", "fetch", NULL);
+
+	if (filter)
+		strvec_pushf(&args, "--filter=%s", filter);
+	strvec_push(&args, bundle_uri);
+
+	if (run_command_v_opt(args.v, RUN_GIT_CMD)) {
+		warning(_("failed to download bundle from uri '%s'"), bundle_uri);
+		res = 1;
+	}
+
+	strvec_clear(&args);
+	return res;
+}
diff --git a/bundle.h b/bundle.h
index eb026153d56..bf865b19687 100644
--- a/bundle.h
+++ b/bundle.h
@@ -45,4 +45,13 @@ int unbundle(struct repository *r, struct bundle_header *header,
 int list_bundle_refs(struct bundle_header *header,
 		int argc, const char **argv);

+struct list_objects_filter_options;
+/**
+ * Fetch bundles from the given URI with the given filter.
+ *
+ * Uses 'git bundle fetch' as a subprocess.
+ */
+int fetch_bundle_uri(const char *bundle_uri,
+		     const char *filter);
+
 #endif

From patchwork Wed Feb 23 18:30:51 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12757308
Message-Id: <050725d90ef019ca2684ec0afbfd701efea7f88a.1645641063.git.gitgitgadget@gmail.com>
Date: Wed, 23 Feb 2022 18:30:51 +0000
Subject: [PATCH 13/25] clone: add --bundle-uri option
From: Derrick Stolee

From: Derrick Stolee

Cloning a remote repository is one of the most expensive operations in Git. The server can spend a lot of CPU time generating a pack-file for the client's request. The amount of data can clog the network for a long time, and the Git protocol is not resumable. For users who have poor network connections or are located far away from the origin server, this can be especially painful.

The 'git bundle fetch' command allows users to bootstrap a repository using a set of bundles. However, this would require them to use 'git init' first, followed by the 'git bundle fetch', and finally add a remote, fetch, and check out the branch they want.

Instead, integrate this workflow directly into 'git clone' with the '--bundle-uri' option. If the user is aware of a bundle server, then they can tell Git to bootstrap the new repository with these bundles before fetching the remaining objects from the origin server.

RFC-TODO: Document this option in git-clone.txt.

RFC-TODO: I added a comment about the location of this code being necessary for the later step of auto-discovering the bundle URI from the origin server. This is probably not actually a requirement, but rather a pain point around how I implemented the feature. If a --bundle-uri option is specified, but SSH is used for the clone, then the SSH connection is left open while Git downloads bundles from another server. This is sub-optimal and should be reconsidered when fully reviewed.

RFC-TODO: create tests for this option with a variety of URI types.

RFC-TODO: a simple end-to-end test is available at the end of the series.

Signed-off-by: Derrick Stolee
---
 builtin/clone.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/builtin/clone.c b/builtin/clone.c
index 9c29093b352..6df3d513dc4 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -33,6 +33,7 @@
 #include "packfile.h"
 #include "list-objects-filter-options.h"
 #include "hook.h"
+#include "bundle.h"

 /*
  * Overall FIXMEs:
@@ -74,6 +75,7 @@ static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
 static struct list_objects_filter_options filter_options;
 static struct string_list server_options = STRING_LIST_INIT_NODUP;
 static int option_remote_submodules;
+static const char *bundle_uri;

 static int recurse_submodules_cb(const struct option *opt,
 				 const char *arg, int unset)
@@ -155,6 +157,8 @@ static struct option builtin_clone_options[] = {
 		    N_("any cloned submodules will use their remote-tracking branch")),
 	OPT_BOOL(0, "sparse", &option_sparse_checkout,
 		    N_("initialize sparse-checkout file to include only files at root")),
+	OPT_STRING(0, "bundle-uri", &bundle_uri,
+		   N_("uri"), N_("A URI for downloading bundles before fetching from origin remote")),
 	OPT_END()
 };

@@ -1185,6 +1189,35 @@ int cmd_clone(int argc, const char **argv, const char *prefix)

 	refs = transport_get_remote_refs(transport, &transport_ls_refs_options);

+	/*
+	 * NOTE: The bundle URI download takes place after transport_get_remote_refs()
+	 * because a later change will introduce a check for recommended features,
+	 * which might include a recommended bundle URI.
+	 */
+
+	/*
+	 * Before fetching from the remote, download and install bundle
+	 * data from the --bundle-uri option.
+	 */
+	if (bundle_uri) {
+		const char *filter = NULL;
+
+		if (filter_options.filter_spec.nr)
+			filter = expand_list_objects_filter_spec(&filter_options);
+		/*
+		 * Set the config for fetching from this bundle URI in the
+		 * future, but do it before fetch_bundle_uri() which might
+		 * un-set it (for instance, if there is no table of contents).
+		 */
+		git_config_set("fetch.bundleuri", bundle_uri);
+		if (filter)
+			git_config_set("fetch.bundlefilter", filter);
+
+		/* fetch_bundle_uri() returns nonzero on failure. */
+		if (fetch_bundle_uri(bundle_uri, filter))
+			warning(_("failed to fetch objects from bundle URI '%s'"),
+				bundle_uri);
+	}
+
 	if (refs)
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);


From patchwork Wed Feb 23 18:30:52 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12757309
Date: Wed, 23 Feb 2022 18:30:52 +0000
Subject: [PATCH 14/25] clone: --bundle-uri cannot be combined with --depth
From: Derrick Stolee

From: Derrick Stolee

The previous change added the '--bundle-uri' option, but did not check if the --depth parameter was included. Since bundles are not compatible with shallow clones, provide an error message to the user who is attempting this combination.

I am leaving this as its own change, separate from the one that implements '--bundle-uri', because this is more of an advisory for the user.

There is nothing wrong with bootstrapping with bundles and then fetching a shallow clone. However, that is likely going to involve too much work for the client _and_ the server. The client will download all of this bundle information containing the full history of the repository only to ignore most of it. The server will get a shallow fetch request, but with a list of haves that might cause a more painful computation of that shallow pack-file.

RFC-TODO: add a test case for this error message.

Signed-off-by: Derrick Stolee
---
 builtin/clone.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/builtin/clone.c b/builtin/clone.c
index 6df3d513dc4..cfe3d96047a 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -912,6 +912,11 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		option_no_checkout = 1;
 	}

+	if (bundle_uri) {
+		if (deepen)
+			die(_("--bundle-uri is incompatible with --depth, --shallow-since, and --shallow-exclude"));
+	}
+
 	repo_name = argv[0];

 	path = get_repo_path(repo_name, &is_bundle);

From patchwork Wed Feb 23 18:30:53 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12757310
Message-Id: <7ec7ba3f328f9d6fa7f919ce18a138840d44ee49.1645641063.git.gitgitgadget@gmail.com>
Date: Wed, 23 Feb 2022 18:30:53 +0000
Subject: [PATCH 15/25] config: add git_config_get_timestamp()
From: Derrick Stolee

From: Derrick Stolee

The existing config parsing methods do not include a way to consistently parse timestamps across all platforms. Recall that "unsigned long" is 32 bits on 64-bit Windows, so git_config_get_ulong() is insufficient.

Adding a new type requires quite a bit of boilerplate to match the style of other types.

RFC-QUESTION: Would it be better to use uintmax_t, which could be cast to timestamp_t or other types more robust than "unsigned long"?
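
To illustrate the portability concern: a value above 2^32 - 1 cannot be represented where "unsigned long" is 32 bits wide, so the parse has to go through a wider type. The following is a minimal stand-alone sketch of that idea, assuming the timestamp type is based on uintmax_t. `sketch_timestamp_t` and `sketch_parse_timestamp()` are hypothetical names used only here; this is not the code added by this patch, and the real parser in config.c does more than this sketch.

```c
#include <errno.h>
#include <inttypes.h> /* strtoumax() */
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for timestamp_t, assumed to be uintmax_t-based. */
typedef uintmax_t sketch_timestamp_t;

/*
 * Parse a decimal timestamp. Returns 1 and stores the value on success,
 * 0 on empty, negative, malformed, or out-of-range input.
 */
int sketch_parse_timestamp(const char *value, sketch_timestamp_t *ret)
{
	char *end;
	uintmax_t tmp;

	/* strtoumax() would silently negate "-1", so reject any minus sign. */
	if (!value || !*value || strchr(value, '-'))
		return 0;

	errno = 0;
	tmp = strtoumax(value, &end, 10);
	if (errno == ERANGE || end == value || *end)
		return 0;

	*ret = tmp;
	return 1;
}
```

Because the intermediate value is a uintmax_t, an input such as "4294967296" parses the same way on every platform, which is the property the commit message asks for.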

Signed-off-by: Derrick Stolee
---
 config.c | 39 +++++++++++++++++++++++++++++++++++++++
 config.h | 14 ++++++++++++++
 2 files changed, 53 insertions(+)

diff --git a/config.c b/config.c
index e0c03d154c9..84021b7d504 100644
--- a/config.c
+++ b/config.c
@@ -1228,6 +1228,15 @@ int git_parse_ulong(const char *value, unsigned long *ret)
 	return 1;
 }

+int git_parse_timestamp(const char *value, timestamp_t *ret)
+{
+	uintmax_t tmp;
+	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(timestamp_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
 int git_parse_ssize_t(const char *value, ssize_t *ret)
 {
 	intmax_t tmp;
@@ -1296,6 +1305,14 @@ unsigned long git_config_ulong(const char *name, const char *value)
 	return ret;
 }

+timestamp_t git_config_timestamp(const char *name, const char *value)
+{
+	timestamp_t ret;
+	if (!git_parse_timestamp(value, &ret))
+		die_bad_number(name, value);
+	return ret;
+}
+
 ssize_t git_config_ssize_t(const char *name, const char *value)
 {
 	ssize_t ret;
@@ -2328,6 +2345,16 @@ int git_configset_get_ulong(struct config_set *cs, const char *key, unsigned lon
 		return 1;
 }

+int git_configset_get_timestamp(struct config_set *cs, const char *key, timestamp_t *dest)
+{
+	const char *value;
+	if (!git_configset_get_value(cs, key, &value)) {
+		*dest = git_config_timestamp(key, value);
+		return 0;
+	} else
+		return 1;
+}
+
 int git_configset_get_bool(struct config_set *cs, const char *key, int *dest)
 {
 	const char *value;
@@ -2471,6 +2498,13 @@ int repo_config_get_ulong(struct repository *repo,
 	return git_configset_get_ulong(repo->config, key, dest);
 }

+int repo_config_get_timestamp(struct repository *repo,
+			      const char *key, timestamp_t *dest)
+{
+	git_config_check_init(repo);
+	return git_configset_get_timestamp(repo->config, key, dest);
+}
+
 int repo_config_get_bool(struct repository *repo,
 			 const char *key, int *dest)
 {
@@ -2544,6 +2578,11 @@ int git_config_get_ulong(const char *key, unsigned long *dest)
 	return repo_config_get_ulong(the_repository, key, dest);
 }

+int git_config_get_timestamp(const char *key, timestamp_t *dest)
+{
+	return repo_config_get_timestamp(the_repository, key, dest);
+}
+
 int git_config_get_bool(const char *key, int *dest)
 {
 	return repo_config_get_bool(the_repository, key, dest);
diff --git a/config.h b/config.h
index ab0106d2875..a6e4d35da0a 100644
--- a/config.h
+++ b/config.h
@@ -206,6 +206,7 @@ int config_with_options(config_fn_t fn, void *,

 int git_parse_ssize_t(const char *, ssize_t *);
 int git_parse_ulong(const char *, unsigned long *);
+int git_parse_timestamp(const char *, timestamp_t *);

 /**
  * Same as `git_config_bool`, except that it returns -1 on error rather
@@ -226,6 +227,11 @@ int64_t git_config_int64(const char *, const char *);
  */
 unsigned long git_config_ulong(const char *, const char *);

+/**
+ * Identical to `git_config_int`, but for (unsigned) timestamps.
+ */
+timestamp_t git_config_timestamp(const char *name, const char *value);
+
 ssize_t git_config_ssize_t(const char *, const char *);

 /**
@@ -469,6 +475,7 @@ int git_configset_get_string(struct config_set *cs, const char *key, char **dest
 int git_configset_get_string_tmp(struct config_set *cs, const char *key, const char **dest);
 int git_configset_get_int(struct config_set *cs, const char *key, int *dest);
 int git_configset_get_ulong(struct config_set *cs, const char *key, unsigned long *dest);
+int git_configset_get_timestamp(struct config_set *cs, const char *key, timestamp_t *dest);
 int git_configset_get_bool(struct config_set *cs, const char *key, int *dest);
 int git_configset_get_bool_or_int(struct config_set *cs, const char *key, int *is_bool, int *dest);
 int git_configset_get_maybe_bool(struct config_set *cs, const char *key, int *dest);
@@ -489,6 +496,8 @@ int repo_config_get_int(struct repository *repo,
 			const char *key, int *dest);
 int repo_config_get_ulong(struct repository *repo,
 			  const char *key, unsigned long *dest);
+int repo_config_get_timestamp(struct repository *repo,
+			      const char *key, timestamp_t *dest);
 int repo_config_get_bool(struct repository *repo,
 			 const char *key, int *dest);
 int repo_config_get_bool_or_int(struct repository *repo,
@@ -558,6 +567,11 @@ int git_config_get_int(const char *key, int *dest);
  */
 int git_config_get_ulong(const char *key, unsigned long *dest);

+/**
+ * Similar to `git_config_get_int` but for (unsigned) timestamps.
+ */
+int git_config_get_timestamp(const char *key, timestamp_t *dest);
+
 /**
  * Finds and parses the value into a boolean value, for the configuration
  * variable `key` respecting keywords like "true" and "false". Integer

From patchwork Wed Feb 23 18:30:54 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12757313
Message-Id: <5f7f54e9205acbe072ad436e83e3e98a36f1df3b.1645641063.git.gitgitgadget@gmail.com>
Date: Wed, 23 Feb 2022 18:30:54 +0000
Subject: [PATCH 16/25] bundle: only fetch bundles if timestamp is new
From: Derrick Stolee

From: Derrick Stolee

If a bundle server is providing a table of contents with timestamps for the bundles, then we can store the most-recent timestamp and use it to test whether the bundle server has any new information.

Teach 'git bundle fetch' to store the timestamp in the config file as 'fetch.bundleTimestamp' and compare the existing value to the most-recent timestamp in the bundle server's table of contents. If the new timestamp is at most the stored timestamp, then exit early (with success). If the new timestamp is greater than the stored timestamp, then continue with the normal fetch logic of downloading the most-recent bundle until all missing objects are satisfied. Store that new timestamp in the config for next time.

RFC-TODO: Update documentation of 'git bundle fetch' to match this new behavior.

RFC-TODO: Add 'fetch.bundleTimestamp' to Documentation/config/

Signed-off-by: Derrick Stolee
---
 builtin/bundle.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/builtin/bundle.c b/builtin/bundle.c
index ec969a62ae1..cab99ee2b15 100644
--- a/builtin/bundle.c
+++ b/builtin/bundle.c
@@ -409,6 +409,9 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix)
 	struct remote_bundle_info *stack = NULL;
 	struct hashmap toc = { 0 };
 	const char *filter = NULL;
+	const char *timestamp_key = "fetch.bundletimestamp";
+	timestamp_t stored_time = 0;
+	timestamp_t max_time = 0;

 	struct option options[] = {
 		OPT_BOOL(0, "progress", &progress,
@@ -424,6 +427,8 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix)
 	if (!startup_info->have_repository)
 		die(_("'fetch' requires a repository"));

+	git_config_get_timestamp(timestamp_key, &stored_time);
+
 	/*
 	 * Step 1: determine protocol for uri, and download contents to
 	 * a temporary location.
@@ -445,7 +450,6 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix) } else { struct hashmap_iter iter; struct remote_bundle_info *info; - timestamp_t max_time = 0; /* populate a hashtable with all relevant bundles. */ used_hashmap = 1; @@ -476,6 +480,13 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix) max_time = info->timestamp; } } + + trace2_data_intmax("bundle", the_repository, "max_time", max_time); + trace2_data_intmax("bundle", the_repository, "stored_time", stored_time); + + /* Skip fetching bundles if data isn't new enough. */ + if (max_time <= stored_time) + goto cleanup; } /* @@ -563,6 +574,14 @@ static int cmd_bundle_fetch(int argc, const char **argv, const char *prefix) stack = stack->stack_next; } + if (max_time) { + struct strbuf tstr = STRBUF_INIT; + strbuf_addf(&tstr, "%"PRIuMAX"", max_time); + git_config_set_gently(timestamp_key, tstr.buf); + strbuf_release(&tstr); + } + +cleanup: if (used_hashmap) { struct hashmap_iter iter; struct remote_bundle_info *info; From patchwork Wed Feb 23 18:30:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12757312 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E25F8C433EF for ; Wed, 23 Feb 2022 18:31:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242043AbiBWScV (ORCPT ); Wed, 23 Feb 2022 13:32:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37460 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243907AbiBWScK (ORCPT ); Wed, 23 Feb 2022 13:32:10 -0500 Received: from mail-wm1-x32d.google.com (mail-wm1-x32d.google.com [IPv6:2a00:1450:4864:20::32d]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 38BD04B863 for ; Wed, 23 Feb 2022 10:31:24 -0800 (PST) Received: by mail-wm1-x32d.google.com with SMTP id n25-20020a05600c3b9900b00380f41e51e6so1925507wms.2 for ; Wed, 23 Feb 2022 10:31:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=v9RFMQ7zWfgEqIwSaWdpAS9Bc5xyQgRqjjpEu14/g1s=; b=KtGT6wbKSWJal+gXD3/9SB7ZKOMMU6ZzPlNi2JzCGa+RC2K2RVwVq+AC84Tw08oE2w p2TQ/4DKC3KzmanN8jKFInHENTz61FRwxiA7VnaR62UIN2n7eavIHhfxFNdNKKa3LRsx HKZ/poXPKP8FGypkRFMQbw3qVDExyNEneZCX45q1lan61N1ph6Vyd1vLFaPke+I0ob9L HrJqWqjbdJaDU2lNx7aG8jimNZ0v7TSFcLy+VL4u4xySG5G/y68TSKzQEFu3DONnvsVW Jp21KpNBQ+SxSv4aqjd9MI2Em7p/r6KB8VHib1i/ophbuDj1g+1IQcjDzLoEMyfKT5NH ynKA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=v9RFMQ7zWfgEqIwSaWdpAS9Bc5xyQgRqjjpEu14/g1s=; b=VoxpXwVGPZLGxyhN/F3xEOqdv1chM68wdrSuhUrFjwhAkFn5vbs8CPuY4/r00sktsn sHkRXRsWB6h4EAdapeDypAkklIwtlFaQlP8xucjD6gTgRhjefaoS2XNo40OkPlEw00Zb s0l/dlO7Ejryiuqqf7Bak84SS4359kjFe68Bu0d1ZP+Hc4fmPPIQbNcs1czAiWG9WWU9 kPat6WmlJNMeypc+ImjNrVgwjAj7WajZ/a9TZ6a1PaHwPGfZTQ9BKiUdBQKoj10J7soI 2slC9LsULUdypPq8cMmogLtPhtJY3XLc6+I0ovHiWmNPmejG2oyrQly1qii5eIAbGO1h SS1A== X-Gm-Message-State: AOAM532jlQ7/5VxJZGKmM34tvQAxlyEn3T3TSrIZ0YfGVSK+8mL2a/Rj fhY+biDprWSH5q0KWygGDy7MtIOLKBU= X-Google-Smtp-Source: ABdhPJwIfOyMEBJY0ztz7OeUsbit2UvqR/Nt7vpkyz5WTpNiLM+vXn5v83QFw+Wdd/cQ90opm9DiBQ== X-Received: by 2002:a1c:f413:0:b0:37b:d1de:5762 with SMTP id z19-20020a1cf413000000b0037bd1de5762mr804040wma.108.1645641082663; Wed, 23 Feb 2022 10:31:22 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id w4sm272974wre.102.2022.02.23.10.31.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 
bits=256/256); Wed, 23 Feb 2022 10:31:22 -0800 (PST) Message-Id: <42d351f4aae9532d2787e72ae4be2e08fe614f44.1645641063.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 23 Feb 2022 18:30:55 +0000 Subject: [PATCH 17/25] fetch: fetch bundles before fetching original data Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee If a user cloned using a bundle URI, then they might want to re-use it to download new bundles during 'git fetch' before fetching the remaining objects from the origin server. Use the 'fetch.bundleURI' config as the indicator for whether this extra step should happen. Do not fetch bundles if --dry-run is specified. RFC-TODO: add tests. RFC-TODO: update Documentation/git-fetch.txt RFC-TODO: update Documentation/config/ Signed-off-by: Derrick Stolee --- builtin/fetch.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/builtin/fetch.c b/builtin/fetch.c index 6f5e1578639..c0fece55632 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -29,6 +29,7 @@ #include "commit-graph.h" #include "shallow.h" #include "worktree.h" +#include "bundle.h" #define FORCED_UPDATES_DELAY_WARNING_IN_MS (10 * 1000) @@ -2081,6 +2082,22 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) /* FETCH_HEAD never gets updated in --dry-run mode */ if (dry_run) write_fetch_head = 0; + else { + /* + * --dry-run mode skips bundle downloads, which might + * update some refs. 
+ */ + char *bundle_uri = NULL; + git_config_get_string("fetch.bundleuri", &bundle_uri); + + if (bundle_uri) { + char *filter = NULL; + git_config_get_string("fetch.bundlefilter", &filter); + fetch_bundle_uri(bundle_uri, filter); + free(bundle_uri); + free(filter); + } + } if (all) { if (argc == 1) From patchwork Wed Feb 23 18:30:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= X-Patchwork-Id: 12757311 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CB569C433EF for ; Wed, 23 Feb 2022 18:31:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243951AbiBWScU (ORCPT ); Wed, 23 Feb 2022 13:32:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37488 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243908AbiBWScK (ORCPT ); Wed, 23 Feb 2022 13:32:10 -0500 Received: from mail-wm1-x32f.google.com (mail-wm1-x32f.google.com [IPv6:2a00:1450:4864:20::32f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 77FB34B86D for ; Wed, 23 Feb 2022 10:31:25 -0800 (PST) Received: by mail-wm1-x32f.google.com with SMTP id w13so13813147wmi.2 for ; Wed, 23 Feb 2022 10:31:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=x8+sZRlT+5AjxhkuyEq88suJEF1Kw89D8eIrUkKrH5U=; b=WkKQjfmKYp66FUkrRCSUwahp80R1IjxJwHxBCLyXM+gQqPh3Q2WLXVC9sDHby7Fq+l ZvckKtcOvT85AilyULd6+x5Y0g6k7tDwMTs6tE9T4fDU6YGyGesqxBNsbfqD2/hrNWpy NS/jK9/lFJmefc2obfK1qTHmop8q6OD/upCe71PurR2/HWAA1Q31vIQBJM6bBMCyBS9b aXFfbvyxnXviPa0fEERDt3L60atip7trvABw6ixJiZTAgk+osl9F2y2kl/DZgIFOi5bh 
adSlBlRgmSmqa2bvhP9v25+RYSBFAbRnjX6zc/viRCPBHX19J9gQPIxGkMsM4zUfxxrq abVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=x8+sZRlT+5AjxhkuyEq88suJEF1Kw89D8eIrUkKrH5U=; b=YRJbt9RGjKQEYPsyNnXp3YjiBLHZreEHZWd62XtTdiCMUU80wWarQ0tU0fI3GkQI9B 8+npU803Ai4PgumZFIdRzHbYExHoo0FO0g5Vrm32o8YIfYhJq8Y6xdCVUEgLRYPmTjGJ Fd+4EoUKk2BlRKvCSbu0IwXoWk0Lt9VY3GugKIbYYg533NvGLzf09+Z4WiTakuUc75l0 itlCVbcAP30EAHJLnRPB8LiUdL0KpN7bopnQZHZSGjHQoPHQjww9dhJi88DdcHVhe50C gbvgu3y2vTJoPuwQpOzXTILN5252FuvWTYIy+r+2nm9CPq9/YqNxXmLhQP+mXw1Iwy8v QdgQ== X-Gm-Message-State: AOAM530VLxDnN8S9mhJm1n24LMJdW6yWBI5gHJke6gw6QnJBYs+2Hufp pQIEjUgpSb0lPzCV/lLmdrkNgFIzBoc= X-Google-Smtp-Source: ABdhPJxds8X5tPkA0A4tvVaayPdxzOECqGYr3LTrsb8dwyCmVWYdSfoygFbdiXKOPd2iy/0Kr4rJyQ== X-Received: by 2002:a05:600c:215:b0:37c:729:f84d with SMTP id 21-20020a05600c021500b0037c0729f84dmr8626045wmi.131.1645641083834; Wed, 23 Feb 2022 10:31:23 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id u15sm457012wrs.18.2022.02.23.10.31.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Feb 2022 10:31:23 -0800 (PST) Message-Id: In-Reply-To: References: Date: Wed, 23 Feb 2022 18:30:56 +0000 Subject: [PATCH 18/25] connect.c: refactor sending of agent & object-format MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , =?utf-8?b?w4Z2YXIgQXJuZmrDtnI=?= =?utf-8?b?w7AgQmphcm1hc29u?= Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?= Refactor the sending of the "agent" and "object-format" capabilities into a function. 
This was added in its current form in ab67235bc4 (connect: parse v2 refs with correct hash algorithm, 2020-05-25). When we connect to a v2 server we need to know about its object-format, and it needs to know about ours. Since most things in connect.c and transport.c piggy-back on the eager fetching of remote refs via handshake(), those commands can rely on the object-format that ls-refs has just sent over. But I'm about to add a command that may or may not come after ls-refs, and we still need the server to know about our user-agent and object-format. So let's split this into a function. Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Derrick Stolee --- connect.c | 33 ++++++++++++++++++++------------- 1 file changed, 20 insertions(+), 13 deletions(-) diff --git a/connect.c b/connect.c index eaf7d6d2618..9d78d681e95 100644 --- a/connect.c +++ b/connect.c @@ -473,6 +473,24 @@ void check_stateless_delimiter(int stateless_rpc, die("%s", error); } +static void send_capabilities(int fd_out, struct packet_reader *reader) +{ + const char *hash_name; + + if (server_supports_v2("agent", 0)) + packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized()); + + if (server_feature_v2("object-format", &hash_name)) { + int hash_algo = hash_algo_by_name(hash_name); + if (hash_algo == GIT_HASH_UNKNOWN) + die(_("unknown object format '%s' specified by server"), hash_name); + reader->hash_algo = &hash_algos[hash_algo]; + packet_write_fmt(fd_out, "object-format=%s", reader->hash_algo->name); + } else { + reader->hash_algo = &hash_algos[GIT_HASH_SHA1]; + } +} + struct ref **get_remote_refs(int fd_out, struct packet_reader *reader, struct ref **list, int for_push, struct transport_ls_refs_options *transport_options, @@ -480,7 +498,6 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader, int stateless_rpc) { int i; - const char *hash_name; struct strvec *ref_prefixes = transport_options ?
&transport_options->ref_prefixes : NULL; char **unborn_head_target = transport_options ? @@ -490,18 +507,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader, if (server_supports_v2("ls-refs", 1)) packet_write_fmt(fd_out, "command=ls-refs\n"); - if (server_supports_v2("agent", 0)) - packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized()); - - if (server_feature_v2("object-format", &hash_name)) { - int hash_algo = hash_algo_by_name(hash_name); - if (hash_algo == GIT_HASH_UNKNOWN) - die(_("unknown object format '%s' specified by server"), hash_name); - reader->hash_algo = &hash_algos[hash_algo]; - packet_write_fmt(fd_out, "object-format=%s", reader->hash_algo->name); - } else { - reader->hash_algo = &hash_algos[GIT_HASH_SHA1]; - } + /* Send capabilities */ + send_capabilities(fd_out, reader); if (server_options && server_options->nr && server_supports_v2("server-option", 1)) From patchwork Wed Feb 23 18:30:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12757314 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 39ACFC433F5 for ; Wed, 23 Feb 2022 18:32:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243914AbiBWSc0 (ORCPT ); Wed, 23 Feb 2022 13:32:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38404 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243911AbiBWScK (ORCPT ); Wed, 23 Feb 2022 13:32:10 -0500 Received: from mail-wr1-x435.google.com (mail-wr1-x435.google.com [IPv6:2a00:1450:4864:20::435]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4B2D84B87E for ; Wed, 23 Feb 2022 10:31:26 -0800 (PST) Received: by mail-wr1-x435.google.com with SMTP id 
o4so4301760wrf.3 for ; Wed, 23 Feb 2022 10:31:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=jJsXhVhG82wFUXv4oii4SDL+BE/grwhSYmD/abbrqhE=; b=SIV9WYKtFHqYWyxRBHvUy5TofkHBY5LduR7wuFGS3j4Zr4JxVvrn/ZNf2EwERGaglW noqrBWDN/LXDXOjHpEunWP7LDinDY+oycfhR3BvlYZGAcQlTWulVMwexPy+HVjS8Vs6n gXsAeSIbaLR5tZI6sq4q4b7HFDtpJIj1Cfn+PWOXaGuBqKgn+SYbehyV/IHQ7FdirE/v q3WAERizD3cKZREF4Ll8Iv8leYIHTfb6gZQQX+THC1+EP1oPTf66h86ws8rZTWmj8dSp t+JczVz61DKk5j9RQLNtjG/LEvDiJ/s7e/uuKKf31d3mOaEViCC3jGp0XIF4IDncl+2o grqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=jJsXhVhG82wFUXv4oii4SDL+BE/grwhSYmD/abbrqhE=; b=nVWNJGU05sW8YjC98lak6BuWdDnsWsNJTWPNlmf+qADjqOEN9Ffd1u60FANezHS+IU iFED3hWi+A40Y3rgNGh/Yvssc68P9xK3MT83HUSwrIgh+WecTh1W4R3y1cgKQ8wg7CWD 2T0G/K6ZBYfK59g7SE4bGU9jfPaGeshRZ8PynSnXq2ANNf/fI4znBHY7VtAswgozxVs5 m0W82M90akQh62kd4tbZVr1mDRRhU9COYihRhLrUr5jAxfHvBzd9Mp+Hd+Rzi91Y12C8 L/h/uDPRjB1Y/EG0064f6uuCtkPI28EhbU9VxDgbSDuADUfQAdbGk8zeT+ZDQuQ9akmx xDJg== X-Gm-Message-State: AOAM531HDO3bNF9OFYYz80a+YfXY5NfFA6mvdfiEtSPrwyqoMKIMXijt zKYnXBEsryBJG5r4jRBOv65Ot7GJ9eA= X-Google-Smtp-Source: ABdhPJxKM2S2Z02wMWYK9NIQgFgex75H7btp3zrWeO/VjYrxKqz9Ux8apjFGv67tFuTrdlixjBMf4A== X-Received: by 2002:a5d:4890:0:b0:1ed:9d4e:f8ef with SMTP id g16-20020a5d4890000000b001ed9d4ef8efmr662691wrq.595.1645641084708; Wed, 23 Feb 2022 10:31:24 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a10sm365847wri.74.2022.02.23.10.31.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Feb 2022 10:31:24 -0800 (PST) Message-Id: <9a72854af51cae0813cf98318f5ea20a9c8bd559.1645641063.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 23 Feb 2022 
18:30:57 +0000 Subject: [PATCH 19/25] protocol-caps: implement cap_features() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee The 'features' capability sends a list of "key=value" pairs from the server. These are a set of fixed config values, all prefixed with "serve." to avoid conflicting with other config values of similar names. The initial set chosen here is: * bundleURI: Allow advertising one or more bundle servers by URI. * partialCloneFilter: Advertise one or more recommended partial clone filters. * sparseCheckout: Advertise that this repository recommends using the sparse-checkout feature in cone mode. The client may choose whether to enable these features. RFC-TODO: Create Documentation/config/serve.txt Signed-off-by: Derrick Stolee --- protocol-caps.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++ protocol-caps.h | 1 + 2 files changed, 67 insertions(+) diff --git a/protocol-caps.c b/protocol-caps.c index bbde91810ac..88b01c4133e 100644 --- a/protocol-caps.c +++ b/protocol-caps.c @@ -8,6 +8,7 @@ #include "object-store.h" #include "string-list.h" #include "strbuf.h" +#include "config.h" struct requested_info { unsigned size : 1; @@ -111,3 +112,68 @@ int cap_object_info(struct repository *r, struct packet_reader *request) return 0; } + +static void send_lines(struct repository *r, struct packet_writer *writer, + struct string_list *str_list) +{ + struct string_list_item *item; + + if (!str_list->nr) + return; + + for_each_string_list_item (item, str_list) { + packet_writer_write(writer, "%s", item->string); + } +} + +int cap_features(struct repository *r, struct packet_reader *request) +{ + struct packet_writer writer; + struct string_list feature_list = STRING_LIST_INIT_DUP; + int i = 0; + const char *keys[] = { + "bundleuri",
"partialclonefilter", + "sparsecheckout", + NULL + }; + struct strbuf serve_feature = STRBUF_INIT; + struct strbuf key_equals_value = STRBUF_INIT; + size_t len; + strbuf_add(&serve_feature, "serve.", 6); + len = serve_feature.len; + + packet_writer_init(&writer, 1); + + while (keys[i]) { + struct string_list_item *item; + const struct string_list *values = NULL; + strbuf_setlen(&serve_feature, len); + strbuf_addstr(&serve_feature, keys[i]); + + values = repo_config_get_value_multi(r, serve_feature.buf); + + if (values) { + for_each_string_list_item(item, values) { + strbuf_reset(&key_equals_value); + strbuf_addstr(&key_equals_value, keys[i]); + strbuf_addch(&key_equals_value, '='); + strbuf_addstr(&key_equals_value, item->string); + + string_list_append(&feature_list, key_equals_value.buf); + } + } + + i++; + } + strbuf_release(&serve_feature); + strbuf_release(&key_equals_value); + + send_lines(r, &writer, &feature_list); + + string_list_clear(&feature_list, 1); + + packet_flush(1); + + return 0; +} diff --git a/protocol-caps.h b/protocol-caps.h index 15c4550360c..681d2106d88 100644 --- a/protocol-caps.h +++ b/protocol-caps.h @@ -4,5 +4,6 @@ struct repository; struct packet_reader; int cap_object_info(struct repository *r, struct packet_reader *request); +int cap_features(struct repository *r, struct packet_reader *request); #endif /* PROTOCOL_CAPS_H */ From patchwork Wed Feb 23 18:30:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12757316 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 98992C433EF for ; Wed, 23 Feb 2022 18:32:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243959AbiBWSca (ORCPT ); Wed, 23 Feb 2022 13:32:30 -0500 Received: 
from lindbergh.monkeyblade.net ([23.128.96.19]:37444 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243915AbiBWScK (ORCPT ); Wed, 23 Feb 2022 13:32:10 -0500 Received: from mail-wm1-x334.google.com (mail-wm1-x334.google.com [IPv6:2a00:1450:4864:20::334]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2FE8B4BFD6 for ; Wed, 23 Feb 2022 10:31:27 -0800 (PST) Received: by mail-wm1-x334.google.com with SMTP id p184-20020a1c29c1000000b0037f76d8b484so4914468wmp.5 for ; Wed, 23 Feb 2022 10:31:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=0virNklmWNiDFPjHgrPqI8m+ooiigmpC/NpkboLQmU4=; b=C0x97i+wR9hdRVNm4dYTyMmt0txIaoXL3hW+W4CmSnlIOZFDebBh+MdZ2U3nDqcaGZ RFmEFTPz2w66tfHjXBpcpDg5LR3n32Wu2UDqP4cdJkymsGnUBzen4AyqFuPQ4CxcPCTm O64gKpcWQYtvG3EpwskiBNxe+tWi0EqoIEHF8PnQ/rjI5Avsy0oGoI0pFWGdgf9kqVBf qknzatUeSBIT68BTwmNNyoiv0KcETTf4i3rKk/WIpZyjnTDeQFMV/6/swcms5VkBs/4K 9QLPuQhBo39hvc6u9TEXt88HEP+GNe5cOnbOTP5TQ9lccwuGakFS44n+jXOKLMOFSxOV GiTA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=0virNklmWNiDFPjHgrPqI8m+ooiigmpC/NpkboLQmU4=; b=RaCF5vSGXSQ+T5kPsWCuLaxpxCTccFWVPrnDJ90umCwhCZ+p+JOTl3JwDusW79qIE0 mp0moxZUA+4bWOSjMMV2iRxFWtjYuOH7OjceJcPGEPiT2Pg0qXP+WwpHMPmtg5ULxfZv QiSj088YJoqeju+uXebZ6zVIEB8kq8NGWqfqIDGML79T9b0fG2cUDfkLHOD/oI7RxPrb xP7BDqcKojbkBkZsXoy5dW2AG2WRRSWH4PbKMzmR3P3padJWQo+haEIb2I9f7CJ6DdZp PSvPIMktwrtCWh1tgO3bpcOM/9N5LcKgPDcII9iWVggd4vTvLriQiyeJ4GWdrgOWQBEp oMbA== X-Gm-Message-State: AOAM530Lmz6R9qqeF3oBY/PlTAMUgeNlNjsrfq0El7nrNU9wQ4k4bXFb Ufp0tRhud7viSvj6L/+qOAh79QE1ZG0= X-Google-Smtp-Source: ABdhPJwibmooduDpAX49qxGS9cKEAFgLWaE3cqFGWbZuqxPZsVJxqTMqVMuuB2pUnxXDmqYPm25lYg== X-Received: by 
2002:a1c:540b:0:b0:380:edd5:9f2a with SMTP id i11-20020a1c540b000000b00380edd59f2amr3820720wmb.122.1645641085659; Wed, 23 Feb 2022 10:31:25 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id j12sm319349wrs.1.2022.02.23.10.31.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Feb 2022 10:31:25 -0800 (PST) Message-Id: <5b983cc3c104fe6ed64608320387bf82fefcdbb4.1645641063.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 23 Feb 2022 18:30:58 +0000 Subject: [PATCH 20/25] serve: understand but do not advertise 'features' capability Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee The previous change implemented cap_features() to return a set of 'key=value' pairs when this capability is run. Add the capability to our list of understood capabilities. This change does not advertise the capability. When deploying a new capability to a distributed fleet of Git servers, it is important to delay advertising the capability until all nodes understand it. A later change will advertise it when appropriate, but as a separate change to simplify this transition. 
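As a rough illustration of the "understand but do not advertise" split described above, here is a simplified, self-contained sketch. It is not the serve.c code: the struct layout, the no-argument callbacks, and the helpers find_capability(), is_advertised() and is_understood() are assumptions made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a capability table entry. */
struct capability {
	const char *name;
	int (*advertise)(void);	/* nonzero: list it in the advertisement */
	int (*command)(void);	/* handler run when a client requests it */
};

static int always_advertise(void) { return 1; }
static int never_advertise(void) { return 0; }
static int run_ls_refs(void) { return 0; }
static int run_features(void) { return 0; }

static const struct capability capabilities[] = {
	{ "ls-refs", always_advertise, run_ls_refs },
	/* Understood by this node, but hidden until the whole fleet is updated. */
	{ "features", never_advertise, run_features },
};

#define NR_CAPS (sizeof(capabilities) / sizeof(capabilities[0]))

static const struct capability *find_capability(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < NR_CAPS; i++)
		if (!strcmp(capabilities[i].name, name))
			return &capabilities[i];
	return NULL;
}

/* Returns 1 if the name would appear in the capability advertisement. */
static int is_advertised(const char *name)
{
	const struct capability *c = find_capability(name);
	return c && c->advertise();
}

/* Returns 1 if a request for the command is accepted, advertised or not. */
static int is_understood(const char *name)
{
	const struct capability *c = find_capability(name);
	return c && c->command != NULL;
}
```

In this sketch a node accepts "features" requests while keeping it out of the advertisement, so a client that learned of the capability from an already-updated node is not rejected by this one during a rolling deployment.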
Signed-off-by: Derrick Stolee --- serve.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/serve.c b/serve.c index b3fe9b5126a..a1c853dda1f 100644 --- a/serve.c +++ b/serve.c @@ -18,6 +18,12 @@ static int always_advertise(struct repository *r, return 1; } +static int never_advertise(struct repository *r, + struct strbuf *value) +{ + return 0; +} + static int agent_advertise(struct repository *r, struct strbuf *value) { @@ -136,6 +142,11 @@ static struct protocol_capability capabilities[] = { .advertise = always_advertise, .command = cap_object_info, }, + { + .name = "features", + .advertise = never_advertise, + .command = cap_features, + }, }; void protocol_v2_advertise_capabilities(void) From patchwork Wed Feb 23 18:30:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12757315 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B4FB1C433F5 for ; Wed, 23 Feb 2022 18:32:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240010AbiBWSc2 (ORCPT ); Wed, 23 Feb 2022 13:32:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37454 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243894AbiBWScK (ORCPT ); Wed, 23 Feb 2022 13:32:10 -0500 Received: from mail-wr1-x430.google.com (mail-wr1-x430.google.com [IPv6:2a00:1450:4864:20::430]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 50C2A4BFDF for ; Wed, 23 Feb 2022 10:31:28 -0800 (PST) Received: by mail-wr1-x430.google.com with SMTP id d28so13320145wra.4 for ; Wed, 23 Feb 2022 10:31:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc 
:content-transfer-encoding:mime-version:to:cc; bh=NOsus6dQJSUF86hnfVb84Zn0qDjSNhLGKvDkGIv1aAY=; b=H9H/tRmayqPZUOV2uuK35cb43VmCufXm9WnE2bQU0sfFgmar9FzlyVBDd2viW+G6bf 3ee5B7hJtK/crY16Tya9L9f2Egvf3e5qKW6342aCmKjf+COXIxxstNIjth6ceqEg7IZY 6Y/OlC5QAChCJMGvCDabNX/c4/ctWOFLaoAofnaco/N6O2TR6YYJHNtW9wUO+ayHK/18 KRXM+9hGnyCsjFdR2JCStQYt7dRJNC+22Gc7w/V/zaSZsDzia06kOWK9En6P7zifySGb LH99U5CLmEmzdn4TYf0uPA9G4m0V4uxLR1F77ioAJjb2N+YvBvob59YpkkUsySC8X6CI /hUw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=NOsus6dQJSUF86hnfVb84Zn0qDjSNhLGKvDkGIv1aAY=; b=120ZaPJOqdH4UX5nVWXHn5WbxAFAQU8nXShRM7dT0p4SKa1QdF+SDduTBKc7Rg/xgv Ru2xcb/l6ZS+SHvwK1mq+ipK3TazojuhrinkPyp+FGzk0Jz8We+z70Bn0nfGnIAqSisE MfrsD9GE7Kf/eL6uSfyOaiiHlEn1LZLaRlT34FjHHU7hk1SCHGsfD0rMtBc1lJTWBmJN 0ckgxhbe4/bJQFfcbNBh8cMxIKs/040AOLLonJ+E0DNSQtENFk9ABOkBQxitVFtKU0CP nWNejjRhlOKWJgctb8FBguh6ip+YxV7nLkyTgu/ctMMyNTl64AQIhYuATcyEyZ3ehWQQ SF/g== X-Gm-Message-State: AOAM532jYddUTlllcEWCvQmY/E4NwAMSsPkE0cy40BXFs7YFtE5HOoLT vd9C5TYSNaMsDiBk8eXHozesT8fxEFs= X-Google-Smtp-Source: ABdhPJxpiumnnzMNPnudC6rZXdCX2P1nDIhqFwmhhPshYb+UmJBEpLqxXJPp+yMlo47m4bBkj62GoA== X-Received: by 2002:a05:6000:2a5:b0:1e8:d9dc:f369 with SMTP id l5-20020a05600002a500b001e8d9dcf369mr639424wry.589.1645641086728; Wed, 23 Feb 2022 10:31:26 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id n9sm263700wrx.76.2022.02.23.10.31.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Feb 2022 10:31:26 -0800 (PST) Message-Id: In-Reply-To: References: Date: Wed, 23 Feb 2022 18:30:59 +0000 Subject: [PATCH 21/25] serve: advertise 'features' when config exists Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: 
X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee The 'features' capability allows a server to recommend some Git features at a high level. Previous changes implemented the capability so servers understand it, but it was never advertised. Now, allow it to be advertised, but only when the capability will actually _do_ something. That is, advertise if and only if a config value exists with the prefix "serve.". This avoids unnecessary round trips for an empty result. Signed-off-by: Derrick Stolee --- serve.c | 18 +++++++++++++++--- t/t5701-git-serve.sh | 9 +++++++++ 2 files changed, 24 insertions(+), 3 deletions(-) diff --git a/serve.c b/serve.c index a1c853dda1f..7dcabb68147 100644 --- a/serve.c +++ b/serve.c @@ -18,12 +18,24 @@ static int always_advertise(struct repository *r, return 1; } -static int never_advertise(struct repository *r, - struct strbuf *value) +static int key_serve_prefix(const char *key, const char *value, void *data) { + int *signal = data; + if (!strncmp(key, "serve.", 6)) { + *signal = 1; + return 1; + } return 0; } +static int has_serve_config(struct repository *r, + struct strbuf *value) +{ + int signal = 0; + repo_config(r, key_serve_prefix, &signal); + return signal; +} + static int agent_advertise(struct repository *r, struct strbuf *value) { @@ -144,7 +156,7 @@ static struct protocol_capability capabilities[] = { }, { .name = "features", - .advertise = never_advertise, + .advertise = has_serve_config, .command = cap_features, }, }; diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh index 1896f671cb3..6ef721c3f97 100755 --- a/t/t5701-git-serve.sh +++ b/t/t5701-git-serve.sh @@ -30,6 +30,15 @@ test_expect_success 'test capability advertisement' ' test_cmp expect actual ' +test_expect_success 'test capability advertisement' ' + test_when_finished git config --unset serve.bundleuri && + git config serve.bundleuri "file://$(pwd)" && + GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \ + --advertise-capabilities >out 
&& + test-tool pkt-line unpack <out >actual && + grep features actual +' + test_expect_success 'stateless-rpc flag does not list capabilities' ' # Empty request test-tool pkt-line pack >in <<-EOF && From patchwork Wed Feb 23 18:31:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12757317 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3DF90C433FE for ; Wed, 23 Feb 2022 18:32:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243968AbiBWScc (ORCPT ); Wed, 23 Feb 2022 13:32:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38422 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243920AbiBWScK (ORCPT ); Wed, 23 Feb 2022 13:32:10 -0500 Received: from mail-wr1-x42b.google.com (mail-wr1-x42b.google.com [IPv6:2a00:1450:4864:20::42b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 81D494BFE8 for ; Wed, 23 Feb 2022 10:31:29 -0800 (PST) Received: by mail-wr1-x42b.google.com with SMTP id u1so41051242wrg.11 for ; Wed, 23 Feb 2022 10:31:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=VcEgWbt3lO6ugJXNLWN/Cdal2JKkBE560E2IRHc6CQU=; b=ZwwIVaZ8joSRWsWsyMOtiYCz1stj/DsJa3EkKbD5ETJJ9egB3JtUjANBvD1neDcJkU Gmtltuu8UvURDKB9uQYBmZYYUl3rmNlToklLyo0Nl4vqwyhtxBpNE8Ii7Evsnsqmxb0E Qfgmzqxe6ihklI6DVrHBb4tyT5wLwC36Vq3Ei48rM8Tyg0hc+fH9msOVhIYIz2ZKhltJ xJ9l0/lHTpdYyAPPVKyuil3B0Xqa1fUxn0NoMHhUOcSjbrX9a3kOIRMVG2IP56wBkbU+ RKmnwqJe+0v8nD2yywHUqnGRxRYZxHhPJKK5IqdeK2SVKS8J2Y45PKORa8rjS1Cnep8S cCKg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net;
s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=VcEgWbt3lO6ugJXNLWN/Cdal2JKkBE560E2IRHc6CQU=; b=oq1lIZliutGlB7TUeT5pycv7ft6evFPrzAmKS+4xdyAwlQERZkLQkg8yWWYq7xrvEr tC51wubEe9DQqEE8Dcnj2bmGw05Y42fIM+jUH+7fDc9x2mvJtDK4I5OzOYxfRZxBliAO B2Q9QHc5oEQ8uAPVc4+hyaLJGJ3Tcg8pcTkcJLKx4+tDdkMiKKRCdH2pVuBchQq+0/un l2Ll7ukc3Nqd+R8ugVhwTIQq6YOhFMOwe3NuALrlCTPrtsTLPYzdjzLIUGAHuFaSO6qf CasUP3n+P1rFJZVIZpm5hKnorq8eUhVOOpDZqMjQ3QPYF0nFXx0Z2VEZiWjmgufp/yr4 owaA== X-Gm-Message-State: AOAM533S4wX8XYsSMWpdF67pMosvo5mgvWBIL7ccSWwJwjooReckCQ45 dThdZUrOmUFPTic8JZx1V/eklaA88dU= X-Google-Smtp-Source: ABdhPJysZAlYdjBD4DHwA+c9AuANnix3hdqmy7xvhG79NbB/VXtrdwS2xlWuToomfVLGdRxrJLxFXg== X-Received: by 2002:a05:6000:508:b0:1e4:a027:d147 with SMTP id a8-20020a056000050800b001e4a027d147mr678436wrf.315.1645641087903; Wed, 23 Feb 2022 10:31:27 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id w18sm286188wrl.62.2022.02.23.10.31.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Feb 2022 10:31:27 -0800 (PST) Message-Id: <7a8b13c308c1e54e4f51429805fc309384a93120.1645641063.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 23 Feb 2022 18:31:00 +0000 Subject: [PATCH 22/25] connect: implement get_recommended_features() MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee This method allows a client to request and parse the 'features' capability of protocol v2. The response is expected to be a list of 'key=value' lines, but this implementation does no checking of the lines, expecting a later parse of the lines to be careful of the existence of that '=' character. This change is based on an earlier patch [1] written for a similar capability. 
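The "later parse" mentioned above, which has to cope with a missing '=' character, might look roughly like the sketch below. split_feature_line() is a hypothetical helper written for this illustration; it is not part of this series or of Git's API. It splits at the first '=' so that values which themselves contain '=' (such as URLs with query strings) stay intact.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split a "key=value" line at the first '='.
 *
 * On success, returns 0, stores the key length in *key_len and points
 * *value at the text after the '='. Returns -1 if the line has no '='
 * or if the key would be empty; the outputs are left untouched then.
 */
static int split_feature_line(const char *line, size_t *key_len,
			      const char **value)
{
	const char *eq;

	assert(line && key_len && value);

	eq = strchr(line, '=');
	if (!eq || eq == line)
		return -1;

	*key_len = (size_t)(eq - line);
	*value = eq + 1;
	return 0;
}
```

A caller could skip lines for which this returns -1 instead of dying, which keeps a client tolerant of a server that sends something unexpected.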

[1] https://lore.kernel.org/git/RFC-patch-04.13-21caf01775-20210805T150534Z-avarab@gmail.com/

Co-authored-by: Ævar Arnfjörð Bjarmason
Signed-off-by: Ævar Arnfjörð Bjarmason
Signed-off-by: Derrick Stolee
---
 connect.c | 36 ++++++++++++++++++++++++++++++++++++
 remote.h  |  4 ++++
 2 files changed, 40 insertions(+)

diff --git a/connect.c b/connect.c
index 9d78d681e95..e1e6f4770dd 100644
--- a/connect.c
+++ b/connect.c
@@ -491,6 +491,42 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
 	}
 }

+int get_recommended_features(int fd_out, struct packet_reader *reader,
+			     struct string_list *list, int stateless_rpc)
+{
+	int line_nr = 1;
+
+	server_supports_v2("features", 1);
+
+	/* (Re-)send capabilities */
+	send_capabilities(fd_out, reader);
+
+	/* Send command */
+	packet_write_fmt(fd_out, "command=features\n");
+	packet_delim(fd_out);
+	packet_flush(fd_out);
+
+	/* Process response from server */
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		const char *line = reader->line;
+		line_nr++;
+
+		string_list_append(list, line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH)
+		return error(_("expected flush after features listing"));
+
+	/*
+	 * Might die(), but obscure enough that that's OK, e.g. in
+	 * serve.c, we'll call BUG() on its equivalent (the
+	 * PACKET_READ_RESPONSE_END check).
+	 */
+	check_stateless_delimiter(stateless_rpc, reader,
+				  _("expected response end packet after features listing"));
+	return 0;
+}
+
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     struct transport_ls_refs_options *transport_options,
diff --git a/remote.h b/remote.h
index 438152ef562..268e8134f5e 100644
--- a/remote.h
+++ b/remote.h
@@ -236,6 +236,10 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     const struct string_list *server_options,
 			     int stateless_rpc);

+/* Used for protocol v2 in order to retrieve recommended features */
+int get_recommended_features(int fd_out, struct packet_reader *reader,
+			     struct string_list *list, int stateless_rpc);
+
 int resolve_remote_symref(struct ref *ref, struct ref *list);

 /*

From patchwork Wed Feb 23 18:31:01 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12757318
Date: Wed, 23 Feb 2022 18:31:01 +0000
Subject: [PATCH 23/25] transport: add connections for 'features' capability
To: git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

To allow 'git clone' to check the 'features' capability, we need to
fill in some boilerplate methods that detect whether the capability
exists and then execute the get_recommended_features() method with the
proper context. This involves jumping through some vtables.

Signed-off-by: Derrick Stolee
---
 transport-helper.c   | 14 ++++++++++++++
 transport-internal.h |  9 +++++++++
 transport.c          | 38 ++++++++++++++++++++++++++++++++++++++
 transport.h          |  5 +++++
 4 files changed, 66 insertions(+)

diff --git a/transport-helper.c b/transport-helper.c
index a0297b0986c..642472e2adb 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1264,11 +1264,25 @@ static struct ref *get_refs_list_using_list(struct transport *transport,
 	return ret;
 }

+static int get_features(struct transport *transport,
+			struct string_list *list)
+{
+	get_helper(transport);
+
+	if (process_connect(transport, 0)) {
+		do_take_over(transport);
+		return transport->vtable->get_features(transport, list);
+	}
+
+	return -1;
+}
+
 static struct transport_vtable vtable = {
 	.set_option = set_helper_option,
 	.get_refs_list = get_refs_list,
 	.fetch_refs = fetch_refs,
 	.push_refs = push_refs,
+	.get_features = get_features,
 	.connect = connect_helper,
 	.disconnect = release_helper
 };
diff --git a/transport-internal.h b/transport-internal.h
index c4ca0b733ac..759c79148db 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -5,6 +5,7 @@ struct ref;
 struct transport;
 struct strvec;
 struct transport_ls_refs_options;
+struct string_list;

 struct transport_vtable {
 	/**
@@ -51,6 +52,14 @@ struct transport_vtable {
 	 * process involved generating new commits.
 	 **/
 	int (*push_refs)(struct transport *transport, struct ref *refs, int flags);
+
+	/**
+	 * get_features() requests a list of recommended features and
+	 * populates the given string_list with those 'key=value' pairs.
+	 */
+	int (*get_features)(struct transport *transport,
+			    struct string_list *list);
+
 	int (*connect)(struct transport *connection, const char *name,
 		       const char *executable, int fd[2]);

diff --git a/transport.c b/transport.c
index 2a3e3241545..99d6b719f35 100644
--- a/transport.c
+++ b/transport.c
@@ -349,6 +349,20 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	return handshake(transport, for_push, options, 1);
 }

+static int get_features(struct transport *transport,
+			struct string_list *list)
+{
+	struct git_transport_data *data = transport->data;
+	struct packet_reader reader;
+
+	packet_reader_init(&reader, data->fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	return get_recommended_features(data->fd[1], &reader, list,
+					transport->stateless_rpc);
+}
+
 static int fetch_refs_via_pack(struct transport *transport,
 			       int nr_heads, struct ref **to_fetch)
 {
@@ -890,6 +904,7 @@ static struct transport_vtable taken_over_vtable = {
 	.get_refs_list = get_refs_via_connect,
 	.fetch_refs = fetch_refs_via_pack,
 	.push_refs = git_transport_push,
+	.get_features = get_features,
 	.disconnect = disconnect_git
 };

@@ -1043,6 +1058,7 @@ static struct transport_vtable builtin_smart_vtable = {
 	.get_refs_list = get_refs_via_connect,
 	.fetch_refs = fetch_refs_via_pack,
 	.push_refs = git_transport_push,
+	.get_features = get_features,
 	.connect = connect_git,
 	.disconnect = disconnect_git
 };
@@ -1456,6 +1472,28 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 	return rc;
 }

+struct string_list *transport_remote_features(struct transport *transport)
+{
+	const struct transport_vtable *vtable = transport->vtable;
+	struct string_list *list = NULL;
+
+	if (!server_supports_v2("features", 0))
+		return NULL;
+
+	if (!vtable->get_features) {
+		warning(_("'features' not supported by this remote"));
+		return NULL;
+	}
+
+	CALLOC_ARRAY(list, 1);
+	string_list_init_dup(list);
+
+	if (vtable->get_features(transport, list))
+		warning(_("failed to get recommended features from remote"));
+
+	return list;
+}
+
 void transport_unlock_pack(struct transport *transport, unsigned int flags)
 {
 	int in_signal_handler = !!(flags & TRANSPORT_UNLOCK_PACK_IN_SIGNAL_HANDLER);
diff --git a/transport.h b/transport.h
index 3f16e50c196..bfa2dd48d85 100644
--- a/transport.h
+++ b/transport.h
@@ -272,6 +272,11 @@ struct transport_ls_refs_options {
 const struct ref *transport_get_remote_refs(struct transport *transport,
 					    struct transport_ls_refs_options *transport_options);

+/**
+ * Get recommended features from remote.
+ */
+struct string_list *transport_remote_features(struct transport *transport);
+
 /*
  * Fetch the hash algorithm used by a remote.
  *

From patchwork Wed Feb 23 18:31:02 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12757319
Message-Id: <883e17d0d21277e0ec8538da3e3b20f90b19128f.1645641063.git.gitgitgadget@gmail.com>
Date: Wed, 23 Feb 2022 18:31:02 +0000
Subject: [PATCH 24/25] clone: use server-recommended bundle URI
To: git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

After the ref advertisement initializes the connection between the
client and the remote, use the 'features' capability (if available) to
get a list of recommended features from the server.

In this change, we only update the bundle URI setting. The bundles are
downloaded immediately afterwards if the bundle URI becomes non-null.

RFC-TODO: don't overwrite a given --bundle-uri option.
RFC-TODO: implement the other capabilities.
RFC-TODO: guard this entire request behind opt-in config.
RFC-TODO: prevent using an HTTP(S) URI when in an SSH clone.
RFC-TODO: prevent using a local path for the bundle URI.
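
As a rough illustration of how the --bundle-uri, SSH, and local-path
TODO items above could fit together, here is a standalone sketch in C.
Nothing in it comes from this series: choose_bundle_uri(),
is_http_uri() and has_scheme() are hypothetical names, and the policy
is an assumption. In particular, it accepts an advertised URI only
when both it and the clone URL are HTTP(S), which is stricter than the
TODO list strictly requires.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero if 'uri' begins with 'scheme'. */
static int has_scheme(const char *uri, const char *scheme)
{
	return strncmp(uri, scheme, strlen(scheme)) == 0;
}

/* Hypothetical helper: nonzero for http:// and https:// URIs. */
static int is_http_uri(const char *uri)
{
	return has_scheme(uri, "http://") || has_scheme(uri, "https://");
}

/*
 * Hypothetical policy, not part of this series:
 *  - a URI given on the command line always wins;
 *  - an advertised URI must be HTTP(S), which also rejects
 *    local paths and file:// URIs;
 *  - an advertised HTTP(S) URI is ignored unless the clone
 *    itself runs over HTTP(S), e.g. it is ignored for SSH.
 * Returns the URI to use, or NULL for "no bundle URI".
 */
static const char *choose_bundle_uri(const char *cli_uri,
				     const char *advertised,
				     const char *clone_url)
{
	assert(clone_url != NULL);

	if (cli_uri)
		return cli_uri;
	if (!advertised || !is_http_uri(advertised))
		return NULL;
	if (!is_http_uri(clone_url))
		return NULL;
	return advertised;
}
```

Keeping the decision in one pure function would make each TODO item
easy to cover with a unit test before wiring it into cmd_clone().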

Signed-off-by: Derrick Stolee
---
 builtin/clone.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index cfe3d96047a..92b8727fc9d 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -876,6 +876,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	struct remote *remote;
 	int err = 0, complete_refs_before_fetch = 1;
 	int submodule_progress;
+	struct string_list *feature_list = NULL;

 	struct transport_ls_refs_options transport_ls_refs_options =
 		TRANSPORT_LS_REFS_OPTIONS_INIT;
@@ -1194,11 +1195,23 @@ int cmd_clone(int argc, const char **argv, const char *prefix)

 	refs = transport_get_remote_refs(transport, &transport_ls_refs_options);

-	/*
-	 * NOTE: The bundle URI download takes place after transport_get_remote_refs()
-	 * because a later change will introduce a check for recommended features,
-	 * which might include a recommended bundle URI.
-	 */
+	feature_list = transport_remote_features(transport);
+
+	if (feature_list) {
+		struct string_list_item *item;
+		for_each_string_list_item(item, feature_list) {
+			char *value;
+			char *equals = strchr(item->string, '=');
+
+			if (!equals)
+				continue;
+			*equals = '\0';
+			value = equals + 1;
+
+			if (!strcmp(item->string, "bundleuri"))
+				bundle_uri = value;
+		}
+	}

 	/*
 	 * Before fetching from the remote, download and install bundle
@@ -1218,7 +1231,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		if (filter)
 			git_config_set("fetch.bundlefilter", filter);

-		if (!fetch_bundle_uri(bundle_uri, filter))
+		if (fetch_bundle_uri(bundle_uri, filter))
 			warning(_("failed to fetch objects from bundle URI '%s'"),
 				bundle_uri);
 	}

From patchwork Wed Feb 23 18:31:03 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12757320
Date: Wed, 23 Feb 2022 18:31:03 +0000
Subject: [PATCH 25/25] t5601: basic bundle URI test
To: git@vger.kernel.org
Cc: gitster@pobox.com, me@ttaylorr.com, aevar@gmail.com, newren@gmail.com, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

This test demonstrates the bundle URI feature end to end: an HTTP
server advertises the 'features' capability with a bundle URI that
points to a bundle file on that same HTTP server. We verify that we
unbundled a bundle, which could only have happened if we successfully
downloaded that file.

RFC-TODO: Add similar end-to-end tests throughout the series,
including examples with a table of contents and partial clones.

Signed-off-by: Derrick Stolee
---
 t/t5601-clone.sh | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 83c24fc97a7..b2409a4c04c 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -769,6 +769,18 @@ test_expect_success 'reject cloning shallow repository using HTTP' '
 	git clone --no-reject-shallow $HTTPD_URL/smart/repo.git repo
 '

+test_expect_success 'auto-discover bundle URI from HTTP clone' '
+	test_when_finished rm -rf repo "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
+	git -C src bundle create "$HTTPD_DOCUMENT_ROOT_PATH/everything.bundle" --all &&
+	git clone --bare --no-local src "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo2.git" config \
+		serve.bundleuri $HTTPD_URL/everything.bundle &&
+	GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+		git -c protocol.version=2 clone \
+		$HTTPD_URL/smart/repo2.git repo &&
+	test_subcommand_inexact git bundle unbundle