From patchwork Fri May 29 22:30:13 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 11579751
Date: Fri, 29 May 2020 15:30:13 -0700
Message-Id: <4d17d560b87746acfd62ff785cc22c09600d4e65.1590789428.git.jonathantanmy@google.com>
X-Mailer: git-send-email 2.27.0.rc0.183.gde8f92d652-goog
Subject: [PATCH 1/8] http: use --stdin when getting dumb HTTP pack
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan
Sender: git-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

When Git fetches a pack using dumb HTTP, it reuses the server's name for
the packfile (which incorporates a hash), which is different from the
behavior of fetch-pack and receive-pack.

A subsequent patch will allow downloading packs over HTTP(S) as part of
a fetch. These downloads will not necessarily be from a Git repository,
and thus may not have a hash as part of its name.

Thus, teach http to pass --stdin to index-pack, so that we have no
reliance on the server's name for the packfile.
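
As a rough illustration of the approach described above (handing an already
downloaded file to a child process on its standard input instead of naming it
on the command line), here is a minimal standalone sketch. It is not Git's
code: the function name `run_with_stdin_from_file` is hypothetical, and plain
POSIX `fork`/`dup2`/`execvp` calls stand in for Git's internal `run_command()`
with `ip.in` set.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Git): run argv[0] with its standard
 * input connected to the file at `path`. Returns the child's exit status,
 * or -1 if the file could not be opened, the child could not be started,
 * or the child did not exit normally.
 */
int run_with_stdin_from_file(const char *path, char *const argv[])
{
	int fd, status;
	pid_t pid;

	assert(path != NULL && argv != NULL && argv[0] != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fd);
		return -1;
	}
	if (pid == 0) {
		/* Child: make the file its stdin, then exec the command. */
		if (fd != STDIN_FILENO) {
			if (dup2(fd, STDIN_FILENO) < 0)
				_exit(127);
			close(fd);
		}
		execvp(argv[0], argv);
		_exit(127); /* only reached if exec failed */
	}

	/* Parent: the child has its own copy of the descriptor. */
	close(fd);
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With a real pack, a caller could pass an argv such as
`{"git", "index-pack", "--stdin", NULL}`. When no pack file name is given,
index-pack derives the name from the pack content, so the name the server used
no longer matters.
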
Signed-off-by: Jonathan Tan
---
 http.c | 33 +++++++++++----------------------
 1 file changed, 11 insertions(+), 22 deletions(-)

diff --git a/http.c b/http.c
index 4882c9f5b2..130e9d6259 100644
--- a/http.c
+++ b/http.c
@@ -2276,9 +2276,9 @@ int finish_http_pack_request(struct http_pack_request *preq)
 {
 	struct packed_git **lst;
 	struct packed_git *p = preq->target;
-	char *tmp_idx;
-	size_t len;
 	struct child_process ip = CHILD_PROCESS_INIT;
+	int tmpfile_fd;
+	int ret = 0;
 
 	close_pack_index(p);
 
@@ -2290,35 +2290,24 @@ int finish_http_pack_request(struct http_pack_request *preq)
 		lst = &((*lst)->next);
 	*lst = (*lst)->next;
 
-	if (!strip_suffix(preq->tmpfile.buf, ".pack.temp", &len))
-		BUG("pack tmpfile does not end in .pack.temp?");
-	tmp_idx = xstrfmt("%.*s.idx.temp", (int)len, preq->tmpfile.buf);
+	tmpfile_fd = xopen(preq->tmpfile.buf, O_RDONLY);
 
 	argv_array_push(&ip.args, "index-pack");
-	argv_array_pushl(&ip.args, "-o", tmp_idx, NULL);
-	argv_array_push(&ip.args, preq->tmpfile.buf);
+	argv_array_push(&ip.args, "--stdin");
 	ip.git_cmd = 1;
-	ip.no_stdin = 1;
+	ip.in = tmpfile_fd;
 	ip.no_stdout = 1;
 
 	if (run_command(&ip)) {
-		unlink(preq->tmpfile.buf);
-		unlink(tmp_idx);
-		free(tmp_idx);
-		return -1;
-	}
-
-	unlink(sha1_pack_index_name(p->hash));
-
-	if (finalize_object_file(preq->tmpfile.buf, sha1_pack_name(p->hash))
-	 || finalize_object_file(tmp_idx, sha1_pack_index_name(p->hash))) {
-		free(tmp_idx);
-		return -1;
+		ret = -1;
+		goto cleanup;
 	}
 
 	install_packed_git(the_repository, p);
-	free(tmp_idx);
-	return 0;
+cleanup:
+	close(tmpfile_fd);
+	unlink(preq->tmpfile.buf);
+	return ret;
 }
 
 struct http_pack_request *new_http_pack_request(

From patchwork Fri May 29 22:30:14 2020
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 11579753
Date: Fri, 29 May 2020 15:30:14 -0700
Message-Id: <7c66ab1be0722eecf3a7be76921b4c1ef9bb50f2.1590789428.git.jonathantanmy@google.com>
Subject: [PATCH 2/8] http: improve documentation of http_pack_request
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan

struct http_pack_request and the functions that use it will be modified
in a subsequent patch. Using it is complicated (to use, call the
initialization function, then set some but not all fields in the
returned struct), so add some documentation to help future users.

Signed-off-by: Jonathan Tan
---
 http.h | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/http.h b/http.h
index faf8cbb0d1..a5b082f3ae 100644
--- a/http.h
+++ b/http.h
@@ -215,14 +215,31 @@ int http_get_info_packs(const char *base_url,
 			struct packed_git **packs_head);
 
 struct http_pack_request {
+	/*
+	 * Initialized by new_http_pack_request().
+	 */
 	char *url;
 	struct packed_git *target;
+	struct active_request_slot *slot;
+
+	/*
+	 * After calling new_http_pack_request(), point lst to the head of the
+	 * pack list that target is in. finish_http_pack_request() will remove
+	 * target from lst and call install_packed_git() on target.
+	 */
 	struct packed_git **lst;
+
+	/*
+	 * State managed by functions in http.c.
+	 */
 	FILE *packfile;
 	struct strbuf tmpfile;
-	struct active_request_slot *slot;
 };
 
+/*
+ * target must be an element in a pack list obtained from
+ * http_get_info_packs().
+ */
 struct http_pack_request *new_http_pack_request(
 	struct packed_git *target, const char *base_url);
 int finish_http_pack_request(struct http_pack_request *preq);

From patchwork Fri May 29 22:30:15 2020
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 11579757
Date: Fri, 29 May 2020 15:30:15 -0700
Message-Id: <6b3a628719e0593893e537de0220a5e0d5460232.1590789428.git.jonathantanmy@google.com>
Subject: [PATCH 3/8] http-fetch: support fetching packfiles by URL
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan

Teach http-fetch the ability to download packfiles directly, given a
URL, and to verify them.

The http_pack_request suite of functions have been modified to support a
NULL target. When target is NULL, the given URL is downloaded directly
instead of being treated as the root of a repository.

Signed-off-by: Jonathan Tan
---
 Documentation/git-http-fetch.txt |  8 +++-
 http-fetch.c                     | 64 +++++++++++++++++++++++++-------
 http.c                           | 55 ++++++++++++++++++++-------
 http.h                           | 19 ++++++++--
 t/t5550-http-fetch-dumb.sh       | 25 +++++++++++++
 5 files changed, 141 insertions(+), 30 deletions(-)

diff --git a/Documentation/git-http-fetch.txt b/Documentation/git-http-fetch.txt
index 666b042679..8357359a9b 100644
--- a/Documentation/git-http-fetch.txt
+++ b/Documentation/git-http-fetch.txt
@@ -9,7 +9,7 @@ git-http-fetch - Download from a remote Git repository via HTTP
 SYNOPSIS
 --------
 [verse]
-'git http-fetch' [-c] [-t] [-a] [-d] [-v] [-w filename] [--recover] [--stdin]
+'git http-fetch' [-c] [-t] [-a] [-d] [-v] [-w filename] [--recover] [--stdin | --packfile | ]
 
 DESCRIPTION
 -----------
@@ -40,6 +40,12 @@ commit-id::
 
 		['\t']
 
+--packfile::
+	Instead of a commit id on the command line (which is not expected in
+	this case), 'git http-fetch' fetches the packfile directly at the given
+	URL and uses index-pack to generate corresponding .idx and .keep files.
+	The output of index-pack is printed to stdout.
+
 --recover::
 	Verify that everything reachable from target is fetched. Used after
 	an earlier fetch is interrupted.
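
The `--packfile` documentation above says that the output of index-pack is
printed to stdout, and the test added by this patch expects a line that starts
with `keep` and carries at least 16 lowercase hex digits. A caller might
consume that line roughly as sketched below. This is only an illustration
under assumptions: `parse_keep_line` and `pack_file_path` are hypothetical
helpers, not part of Git, and the tab separator is taken from the `keep\t`
comment in http.h.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* True for the characters matched by [0-9a-f] in the test's grep pattern. */
static int is_lower_hex(char c)
{
	return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
}

/*
 * Hypothetical helper: parse a line of the form "keep\t<hex>\n" and copy
 * <hex> into out. Returns 0 on success, or -1 if the line is malformed,
 * has fewer than 16 hex digits, or does not fit in out.
 */
int parse_keep_line(const char *line, char *out, size_t out_size)
{
	static const char prefix[] = "keep\t";
	const size_t prefix_len = sizeof(prefix) - 1;
	size_t n = 0;

	assert(line != NULL && out != NULL);

	if (strncmp(line, prefix, prefix_len) != 0)
		return -1;
	line += prefix_len;

	while (is_lower_hex(line[n]))
		n++;
	if (n < 16 || n >= out_size)
		return -1;
	if (line[n] != '\n' && line[n] != '\0')
		return -1;

	memcpy(out, line, n);
	out[n] = '\0';
	return 0;
}

/*
 * Hypothetical helper: build "objects/pack/pack-<hash>.<ext>", the layout
 * the test checks for the .pack, .idx and .keep files. Returns 0 on
 * success, -1 if buf is too small.
 */
int pack_file_path(char *buf, size_t size, const char *hash, const char *ext)
{
	int len = snprintf(buf, size, "objects/pack/pack-%s.%s", hash, ext);

	return (len < 0 || (size_t)len >= size) ? -1 : 0;
}
```

Keeping the parsing separate from the path construction mirrors the test,
which first extracts the hash with `cut` and only then checks for the three
files.
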
diff --git a/http-fetch.c b/http-fetch.c
index a32ac118d9..a9764d6f96 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -5,7 +5,7 @@
 #include "walker.h"
 
 static const char http_fetch_usage[] = "git http-fetch "
-"[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin] commit-id url";
+"[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin | --packfile | commit-id] url";
 
 int cmd_main(int argc, const char **argv)
 {
@@ -19,6 +19,7 @@ int cmd_main(int argc, const char **argv)
 	int rc = 0;
 	int get_verbosely = 0;
 	int get_recover = 0;
+	int packfile = 0;
 
 	while (arg < argc && argv[arg][0] == '-') {
 		if (argv[arg][1] == 't') {
@@ -35,43 +36,80 @@ int cmd_main(int argc, const char **argv)
 			get_recover = 1;
 		} else if (!strcmp(argv[arg], "--stdin")) {
 			commits_on_stdin = 1;
+		} else if (!strcmp(argv[arg], "--packfile")) {
+			packfile = 1;
 		}
 		arg++;
 	}
-	if (argc != arg + 2 - commits_on_stdin)
+	if (argc != arg + 2 - (commits_on_stdin || packfile))
 		usage(http_fetch_usage);
 	if (commits_on_stdin) {
 		commits = walker_targets_stdin(&commit_id, &write_ref);
+	} else if (packfile) {
+		/* URL will be set later */
 	} else {
 		commit_id = (char **) &argv[arg++];
 		commits = 1;
 	}
 
-	if (argv[arg])
-		str_end_url_with_slash(argv[arg], &url);
+	if (packfile) {
+		url = xstrdup(argv[arg]);
+	} else {
+		if (argv[arg])
+			str_end_url_with_slash(argv[arg], &url);
+	}
 
 	setup_git_directory();
 
 	git_config(git_default_config, NULL);
 
 	http_init(NULL, url, 0);
-	walker = get_http_walker(url);
-	walker->get_verbosely = get_verbosely;
-	walker->get_recover = get_recover;
 
-	rc = walker_fetch(walker, commits, commit_id, write_ref, url);
+	if (packfile) {
+		struct http_pack_request *preq;
+		struct slot_results results;
+		int ret;
+
+		preq = new_http_pack_request(NULL, url);
+		if (preq == NULL)
+			die("couldn't create http pack request");
+		preq->slot->results = &results;
+		preq->generate_keep = 1;
+
+		if (start_active_slot(preq->slot)) {
+			run_active_slot(preq->slot);
+			if (results.curl_result != CURLE_OK) {
+				die("Unable to get pack file %s\n%s", preq->url,
+				    curl_errorstr);
+			}
+		} else {
+			die("Unable to start request");
+		}
+
+		if ((ret = finish_http_pack_request(preq)))
+			die("finish_http_pack_request gave result %d", ret);
+		release_http_pack_request(preq);
+		rc = 0;
+	} else {
+		walker = get_http_walker(url);
+		walker->get_verbosely = get_verbosely;
+		walker->get_recover = get_recover;
+
+		rc = walker_fetch(walker, commits, commit_id, write_ref, url);
 
-	if (commits_on_stdin)
-		walker_targets_free(commits, commit_id, write_ref);
+		if (commits_on_stdin)
+			walker_targets_free(commits, commit_id, write_ref);
 
-	if (walker->corrupt_object_found) {
-		fprintf(stderr,
+		if (walker->corrupt_object_found) {
+			fprintf(stderr,
 "Some loose object were found to be corrupt, but they might be just\n"
 "a false '404 Not Found' error message sent with incorrect HTTP\n"
 "status code. Suggest running 'git fsck'.\n");
+		}
+
+		walker_free(walker);
 	}
 
-	walker_free(walker);
 	http_cleanup();
 
 	free(url);
diff --git a/http.c b/http.c
index 130e9d6259..ac66215ee6 100644
--- a/http.c
+++ b/http.c
@@ -2280,15 +2280,18 @@ int finish_http_pack_request(struct http_pack_request *preq)
 	int tmpfile_fd;
 	int ret = 0;
 
-	close_pack_index(p);
+	if (p)
+		close_pack_index(p);
 
 	fclose(preq->packfile);
 	preq->packfile = NULL;
 
-	lst = preq->lst;
-	while (*lst != p)
-		lst = &((*lst)->next);
-	*lst = (*lst)->next;
+	if (p) {
+		lst = preq->lst;
+		while (*lst != p)
+			lst = &((*lst)->next);
+		*lst = (*lst)->next;
+	}
 
 	tmpfile_fd = xopen(preq->tmpfile.buf, O_RDONLY);
 
@@ -2296,14 +2299,21 @@ int finish_http_pack_request(struct http_pack_request *preq)
 	argv_array_push(&ip.args, "--stdin");
 	ip.git_cmd = 1;
 	ip.in = tmpfile_fd;
-	ip.no_stdout = 1;
+	if (preq->generate_keep) {
+		argv_array_pushf(&ip.args, "--keep=git %"PRIuMAX,
+				 (uintmax_t)getpid());
+		ip.out = 0;
+	} else {
+		ip.no_stdout = 1;
+	}
 
 	if (run_command(&ip)) {
 		ret = -1;
 		goto cleanup;
 	}
 
-	install_packed_git(the_repository, p);
+	if (p)
+		install_packed_git(the_repository, p);
 cleanup:
 	close(tmpfile_fd);
 	unlink(preq->tmpfile.buf);
@@ -2321,12 +2331,31 @@ struct http_pack_request *new_http_pack_request(
 	strbuf_init(&preq->tmpfile, 0);
 	preq->target = target;
 
-	end_url_with_slash(&buf, base_url);
-	strbuf_addf(&buf, "objects/pack/pack-%s.pack",
-		hash_to_hex(target->hash));
-	preq->url = strbuf_detach(&buf, NULL);
+	if (target) {
+		end_url_with_slash(&buf, base_url);
+		strbuf_addf(&buf, "objects/pack/pack-%s.pack",
+			hash_to_hex(target->hash));
+		preq->url = strbuf_detach(&buf, NULL);
+	} else {
+		preq->url = xstrdup(base_url);
+	}
+
+	if (target) {
+		strbuf_addf(&preq->tmpfile, "%s.temp",
+			    sha1_pack_name(target->hash));
+	} else {
+		const char *shortened_url;
+		size_t url_len = strlen(base_url);
+
+		shortened_url = url_len <= 50
+			? base_url : base_url + (url_len - 50);
+		strbuf_addf(&preq->tmpfile, "%s/pack/pack-",
+			    get_object_directory());
+		strbuf_addstr_urlencode(&preq->tmpfile,
+					shortened_url, is_rfc3986_unreserved);
+		strbuf_addstr(&preq->tmpfile, ".temp");
+	}
 
-	strbuf_addf(&preq->tmpfile, "%s.temp", sha1_pack_name(target->hash));
 	preq->packfile = fopen(preq->tmpfile.buf, "a");
 	if (!preq->packfile) {
 		error("Unable to open local file %s for pack",
@@ -2350,7 +2379,7 @@ struct http_pack_request *new_http_pack_request(
 		if (http_is_verbose)
 			fprintf(stderr,
 				"Resuming fetch of pack %s at byte %"PRIuMAX"\n",
-				hash_to_hex(target->hash),
+				target ? hash_to_hex(target->hash) : base_url,
 				(uintmax_t)prev_posn);
 		http_opt_request_remainder(preq->slot->curl, prev_posn);
 	}
diff --git a/http.h b/http.h
index a5b082f3ae..709dfa4c19 100644
--- a/http.h
+++ b/http.h
@@ -223,12 +223,21 @@ struct http_pack_request {
 	struct active_request_slot *slot;
 
 	/*
-	 * After calling new_http_pack_request(), point lst to the head of the
+	 * After calling new_http_pack_request(), if fetching a pack that
+	 * http_get_info_packs() told us about, point lst to the head of the
 	 * pack list that target is in. finish_http_pack_request() will remove
 	 * target from lst and call install_packed_git() on target.
 	 */
 	struct packed_git **lst;
 
+	/*
+	 * If this is true, finish_http_pack_request() will pass "--keep" to
+	 * index-pack, resulting in the creation of a keep file, and will not
+	 * suppress its stdout (that is, the "keep\t\n" line will be
+	 * printed to stdout).
+	 */
+	unsigned generate_keep : 1;
+
 	/*
 	 * State managed by functions in http.c.
 	 */
@@ -237,8 +246,12 @@ struct http_pack_request {
 };
 
 /*
- * target must be an element in a pack list obtained from
- * http_get_info_packs().
+ * If fetching a pack that http_get_info_packs() told us about, set target to
+ * an element in a pack list obtained from http_get_info_packs(). The actual
+ * URL fetched will be base_url followed by a suffix with the hash of the pack.
+ *
+ * Otherwise, set target to NULL. The actual URL fetched will be base_url
+ * itself.
  */
 struct http_pack_request *new_http_pack_request(
 	struct packed_git *target, const char *base_url);
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 50485300eb..53010efc49 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -199,6 +199,23 @@ test_expect_success 'fetch packed objects' '
 	git clone $HTTPD_URL/dumb/repo_pack.git
 '
 
+test_expect_success 'http-fetch --packfile' '
+	git init packfileclient &&
+	p=$(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git && ls objects/pack/pack-*.pack) &&
+	git -C packfileclient http-fetch --packfile "$HTTPD_URL"/dumb/repo_pack.git/$p >out &&
+
+	# Ensure that the expected files are generated
+	grep "^keep.[0-9a-f]\{16,\}$" out &&
+	cut -c6- out >packhash &&
+	test -e "packfileclient/.git/objects/pack/pack-$(cat packhash).pack" &&
+	test -e "packfileclient/.git/objects/pack/pack-$(cat packhash).idx" &&
+	test -e "packfileclient/.git/objects/pack/pack-$(cat packhash).keep" &&
+
+	# Ensure that it has the HEAD of repo_pack, at least
+	HASH=$(git -C "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git rev-parse HEAD) &&
+	git -C packfileclient cat-file -e "$HASH"
+'
+
 test_expect_success 'fetch notices corrupt pack' '
 	cp -R "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad1.git &&
 	(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad1.git &&
@@ -214,6 +231,14 @@ test_expect_success 'fetch notices corrupt pack' '
 	)
 '
 
+test_expect_success 'http-fetch --packfile with corrupt pack' '
+	rm -rf packfileclient &&
+	git init packfileclient &&
+	p=$(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad1.git && ls objects/pack/pack-*.pack) &&
+	test_must_fail git -C packfileclient http-fetch --packfile \
+		"$HTTPD_URL"/dumb/repo_bad1.git/$p
+'
+
 test_expect_success 'fetch notices corrupt idx' '
 	cp -R "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad2.git &&
 	(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad2.git &&

From patchwork Fri May 29 22:30:16 2020
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 11579755
Date: Fri, 29 May 2020 15:30:16 -0700
Message-Id: <7a2e9c3c5994fc155eb6a40f039cf2298957fa6c.1590789428.git.jonathantanmy@google.com>
Subject: [PATCH 4/8] Documentation: order protocol v2 sections
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan

The current C Git implementation expects Git servers to follow a
specific order of sections when transmitting protocol v2 responses, but
this is not explicit in the documentation. Make the order explicit.

Signed-off-by: Jonathan Tan
---
 Documentation/technical/protocol-v2.txt | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 3996d70891..ef7514a3ee 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -327,11 +327,11 @@ included in the client's request:
 
 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
-header.
+header. Most sections are sent only when the packfile is sent.
 
-    output = *section
-    section = (acknowledgments | shallow-info | wanted-refs | packfile)
-	      (flush-pkt | delim-pkt)
+    output = acknowledgements flush-pkt |
+	     [acknowledgments delim-pkt] [shallow-info delim-pkt]
+	     [wanted-refs delim-pkt] packfile flush-pkt
 
     acknowledgments = PKT-LINE("acknowledgments" LF)
 		      (nak | *ack)
@@ -353,9 +353,10 @@ header.
 	      *PKT-LINE(%x01-03 *%x00-ff)
 
     acknowledgments section
-	* If the client determines that it is finished with negotiations
-	  by sending a "done" line, the acknowledgments sections MUST be
-	  omitted from the server's response.
+	* If the client determines that it is finished with negotiations by
+	  sending a "done" line (thus requiring the server to send a packfile),
+	  the acknowledgments sections MUST be omitted from the server's
+	  response.
 
 	* Always begins with the section header "acknowledgments"
 
@@ -406,9 +407,6 @@ header.
which the client has not indicated was shallow as a part of its request. - * This section is only included if a packfile section is also - included in the response. - wanted-refs section * This section is only included if the client has requested a ref using a 'want-ref' line and if a packfile section is also From patchwork Fri May 29 22:30:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11579759 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 16CDC92A for ; Fri, 29 May 2020 22:30:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E8C1320810 for ; Fri, 29 May 2020 22:30:42 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="WUWw1tye" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728498AbgE2Wal (ORCPT ); Fri, 29 May 2020 18:30:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59222 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728443AbgE2Wai (ORCPT ); Fri, 29 May 2020 18:30:38 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9AFB2C03E969 for ; Fri, 29 May 2020 15:30:38 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id x10so4743957ybx.8 for ; Fri, 29 May 2020 15:30:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=n2/LooV3J7rIzyGhiU1ruNHDoO4132sCvvVFXZPdjZY=; b=WUWw1tye7exuRYesZVccQRrOaxae6I/6StqjUCbNzzI7vekcLaJw6M8Bae09x38DMD 
aEdunFlq5iNkxQM8MD8r3qOk53QLuC9vTzjAhEXLneC7BmvxcKXeKAI8E0GwKInSeLc2 Yy8Nq255qX/42Koma3Uz3F3P41rSb6vSnPy5SM652JyAAFiqQJgkyAguen8qIAIjo6Dd WSQwedoEok79KtwJ2EcoItXwfHo6Bez2tzW1yqcV0ePoVERDgJOKkTcp+viM16B8KDk5 qYAGCtpXz5oQR02ObpK+3QVuBbHVaM7FmEIPuh8mY9ko9Hx8aEScQJL2aZth62TAfvi6 QS3A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=n2/LooV3J7rIzyGhiU1ruNHDoO4132sCvvVFXZPdjZY=; b=nPccILWv6nQxpnHLB+FkK15X4vBpGRJIbyliHQpElJ2IN1nFbqgWYcyu8/NpJs0WKd Nc3G5x/+gpQI8aWDMydXWjmQWwdu4nnxloRKoFURCPylnYY0PRWaNIlpjtY4AMbQKHTj TWU80/x2P4FP/LD+kKQVj9yknuHDklC9HGPoa/7vQm8vM+csYKvMBqJljQC3HUnAKT04 OzEA+YlhEaQlsmcFDr1p3xk8ufK7fGySvX15oXs2TFCyQWlrxICbj0E4OL7N0+qwbm4D Sbm0g6m6gU+UowZNFdZqHzKbdtEsp0u8Z4X91rCb9TaC+zNycgG49xtCmyNhamwk13z8 KfZA== X-Gm-Message-State: AOAM531KIMJVK7NFGC2Hh5qRkMy8bpxzAUuUn4ve9OSCTwD9dErY8xSw FrnOpAVWyEt/3QbjpnP6l/BVsF/TQ9GczmPa/s4/7Fw29u0mTSVpTvFZvCkdRCKKfK9ood/oR3T EC3tvHW2ywjMRPLLZ1yUiokr83uZL1KJoFV77Qaq/nAvLQKVacAv6g/Ttxck/BaidCfNUUrSQl8 Ew X-Google-Smtp-Source: ABdhPJzlCJGWBcFdlD+z9YJLWLGwwbB97TzzOk7JItS37Xt9Rh+woIQKGHHIuZKUK7ll0j3TnpfYyrWh5f7KxcH5Zr6t X-Received: by 2002:a25:a229:: with SMTP id b38mr15759991ybi.400.1590791437718; Fri, 29 May 2020 15:30:37 -0700 (PDT) Date: Fri, 29 May 2020 15:30:17 -0700 In-Reply-To: Message-Id: <4eea9d927af1df11cdb0342e969b293a6e317d46.1590789428.git.jonathantanmy@google.com> Mime-Version: 1.0 References: X-Mailer: git-send-email 2.27.0.rc0.183.gde8f92d652-goog Subject: [PATCH 5/8] Documentation: add Packfile URIs design doc From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Signed-off-by: Jonathan Tan --- Documentation/technical/packfile-uri.txt | 78 ++++++++++++++++++++++++ Documentation/technical/protocol-v2.txt | 28 ++++++++- 2 files changed, 105 insertions(+), 1 deletion(-) create 
mode 100644 Documentation/technical/packfile-uri.txt diff --git a/Documentation/technical/packfile-uri.txt b/Documentation/technical/packfile-uri.txt new file mode 100644 index 0000000000..6a5a6440d5 --- /dev/null +++ b/Documentation/technical/packfile-uri.txt @@ -0,0 +1,78 @@ +Packfile URIs +============= + +This feature allows servers to serve part of their packfile response as URIs. +This allows server designs that improve scalability in bandwidth and CPU usage +(for example, by serving some data through a CDN), and (in the future) provides +some measure of resumability to clients. + +This feature is available only in protocol version 2. + +Protocol +-------- + +The server advertises `packfile-uris`. + +If the client then communicates which protocols (HTTPS, etc.) it supports with +a `packfile-uris` argument, the server MAY send a `packfile-uris` section +directly before the `packfile` section (right after `wanted-refs` if it is +sent) containing URIs of any of the given protocols. The URIs point to +packfiles that use only features that the client has declared that it supports +(e.g. ofs-delta and thin-pack). See protocol-v2.txt for the documentation of +this section. + +Clients should then understand that the returned packfile could be incomplete, +and that they need to download all the given URIs before the fetch or clone is +complete. + +Server design +------------- + +The server can be trivially made compatible with the proposed protocol by +having it advertise `packfile-uris`, tolerating the client sending +`packfile-uris`, and never sending any `packfile-uris` section. But we should +include some sort of non-trivial implementation in the Minimum Viable Product, +at least so that we can test the client. + +This is the implementation: a feature, marked experimental, that allows the +server to be configured by one or more `uploadpack.blobPackfileUri= +` entries. 
Whenever the list of objects to be sent is assembled, a blob +with the given sha1 can be replaced by the given URI. This allows, for example, +servers to delegate serving of large blobs to CDNs. + +Client design +------------- + +While fetching, the client needs to remember the list of URIs and cannot +declare that the fetch is complete until all URIs have been downloaded as +packfiles. + +The division of work (initial fetch + additional URIs) introduces convenient +points for resumption of an interrupted clone - such resumption can be done +after the Minimum Viable Product (see "Future work"). + +The client can inhibit this feature (i.e. refrain from sending the +`packfile-uris` parameter) by passing --no-packfile-uris to `git fetch`. + +Future work +----------- + +The protocol design allows some evolution of the server and client without any +need for protocol changes, so only a small-scoped design is included here to +form the MVP. For example, the following can be done: + + * On the server, a long-running process that takes in entire requests and + outputs a list of URIs and the corresponding inclusion and exclusion sets of + objects. This allows, e.g., signed URIs to be used and packfiles for common + requests to be cached. + * On the client, resumption of clone. If a clone is interrupted, information + could be recorded in the repository's config and a "clone-resume" command + can resume the clone in progress. (Resumption of subsequent fetches is more + difficult because that must deal with the user wanting to use the repository + even after the fetch was interrupted.) + +There are some possible features that will require a change in protocol: + + * Additional HTTP headers (e.g. authentication) + * Byte range support + * Different file formats referenced by URIs (e.g. 
raw object) diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt index ef7514a3ee..7e1b3a0bfe 100644 --- a/Documentation/technical/protocol-v2.txt +++ b/Documentation/technical/protocol-v2.txt @@ -325,13 +325,26 @@ included in the client's request: indicating its sideband (1, 2, or 3), and the server may send "0005\2" (a PKT-LINE of sideband 2 with no payload) as a keepalive packet. +If the 'packfile-uris' feature is advertised, the following argument +can be included in the client's request, and the server's response +may additionally contain the 'packfile-uris' section, as explained +below. + + packfile-uris + Indicates to the server that the client is willing to receive + URIs of any of the given protocols in place of objects in the + sent packfile. Before performing the connectivity check, the + client should download from all given URIs. Currently, the + protocols supported are "http" and "https". + The response of `fetch` is broken into a number of sections separated by delimiter packets (0001), with each section beginning with its section header. Most sections are sent only when the packfile is sent. output = acknowledgements flush-pkt | [acknowledgments delim-pkt] [shallow-info delim-pkt] - [wanted-refs delim-pkt] packfile flush-pkt + [wanted-refs delim-pkt] [packfile-uris delim-pkt] + packfile flush-pkt acknowledgments = PKT-LINE("acknowledgments" LF) (nak | *ack) @@ -349,6 +362,9 @@ header. Most sections are sent only when the packfile is sent. *PKT-LINE(wanted-ref LF) wanted-ref = obj-id SP refname + packfile-uris = PKT-LINE("packfile-uris" LF) *packfile-uri + packfile-uri = PKT-LINE(40*(HEXDIGIT) SP *%x20-ff LF) + packfile = PKT-LINE("packfile" LF) *PKT-LINE(%x01-03 *%x00-ff) @@ -420,6 +436,16 @@ header. Most sections are sent only when the packfile is sent. * The server MUST NOT send any refs which were not requested using 'want-ref' lines. 
+ packfile-uris section + * This section is only included if the client sent + 'packfile-uris' and the server has at least one such URI to + send. + + * Always begins with the section header "packfile-uris". + + * For each URI the server sends, it sends a hash of the pack's + contents (as output by git index-pack) followed by the URI. + packfile section * This section is only included if the client has sent 'want' lines in its request and either requested that no more From patchwork Fri May 29 22:30:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11579761 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 584BA912 for ; Fri, 29 May 2020 22:30:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3963620810 for ; Fri, 29 May 2020 22:30:44 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="IEdfEQUW" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728505AbgE2Wan (ORCPT ); Fri, 29 May 2020 18:30:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59226 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728499AbgE2Wal (ORCPT ); Fri, 29 May 2020 18:30:41 -0400 Received: from mail-qk1-x749.google.com (mail-qk1-x749.google.com [IPv6:2607:f8b0:4864:20::749]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 175A3C03E969 for ; Fri, 29 May 2020 15:30:40 -0700 (PDT) Received: by mail-qk1-x749.google.com with SMTP id y2so3121950qkf.2 for ; Fri, 29 May 2020 15:30:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; 
bh=a4bzFUaFdGp0H0bR41YlH9SDOh3PjiNdIlFPhBYrjSs=; b=IEdfEQUWvN7Osm6l2u8RGiyX8LPanCZ7ml7aC/FvT16NXvsfQHyfafxtL0g3psefp+ sy5cjNQZALpyex5CFYUkn3g+3TXEdM0ZV5Q58bp0RJjnVYidlZqJsFLRf5ecBYJ9YriM i2Dsr4cMtHWGP2o9Agq/5uOvkLIhM9sUe1jLEuUGesMhLf7ACJN28cP+i63OxZi2/+Go 9GqiRM49ATfZqLZQIWvNG9DAx3//Bu/Rwl09w3EaV3nh5ajVd26gOJ6Hqc7dkeVv/z4W EDfZbmG7S1KH9BmotFc8oIBkzNM/+TEawPPq/pQwZcmj7KnYg6sate8sq1K2YTjwoLuB a+CA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=a4bzFUaFdGp0H0bR41YlH9SDOh3PjiNdIlFPhBYrjSs=; b=OFeXmtpNIwxDSWbvFe8kenZjbZJrwhydiq2IMeAS3gb5jVHG+FE0WIy162Y9HbtpbD BhkMMYWFr2iFZ5fs17Ja3wR6PW5YLuaLZFyXRN81P5L77jadqz2v0hzQexD07w4jP8Gx kpbFvY+kvkmbeKqFdDeq91Mq1MLJthGcgVFMt/wuPGvFHUUA9FG2wha1v12XedAgZp2a W5riJ/GCvu+oECFp0P31duMWzCnh8//wLoHeGfEuVfmy9W3m/nMGdn9zBtwTPMzjk7Bs r/pl5yH87AJuirKCOkZc6hjYpEKTA77dnd2/cPRQwoJQDz9Yoe5o5v5eSOGbd6laKH5R jr8g== X-Gm-Message-State: AOAM533bVAJadfvJTchZc1gwoZ//5NL5CH7KL9q1USZxtjaGMypRdVc8 7pGbp/B6U/g1o+P4/6hYpuLS3qlgmftSsEPPLja8VMf95ID3Xx4S6aqjVkRIEtOjpKt0sXYvHvC eRR7il3vVe4wTZp7ur+KyvwHGe1bnUEGU6h2I748oO40kn2zDEz62GlSDKSe6VpytntECX2Shjg ZH X-Google-Smtp-Source: ABdhPJwVWPfFPoLWVWquJqJUJ5dWjoijzuq2K9O27Vnd3cwFQDEV3t9SrWOJeCP2Wspoaoog+XGIL3GVoZ4zPGofGTmf X-Received: by 2002:a0c:8e84:: with SMTP id x4mr10819183qvb.175.1590791439176; Fri, 29 May 2020 15:30:39 -0700 (PDT) Date: Fri, 29 May 2020 15:30:18 -0700 In-Reply-To: Message-Id: <65db1a649d9b481b0122f981eee255907b7139bd.1590789428.git.jonathantanmy@google.com> Mime-Version: 1.0 References: X-Mailer: git-send-email 2.27.0.rc0.183.gde8f92d652-goog Subject: [PATCH 6/8] upload-pack: refactor reading of pack-objects out From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Subsequent patches will change how the output of pack-objects is processed, so extract that 
processing into its own function. Currently, at most 1 character can be buffered (in the "buffered" local variable). One of those patches will require a larger buffer, so replace that "buffered" local variable with a buffer array. Signed-off-by: Jonathan Tan --- upload-pack.c | 81 ++++++++++++++++++++++++++++++--------------------- 1 file changed, 47 insertions(+), 34 deletions(-) diff --git a/upload-pack.c b/upload-pack.c index 0478bff3e7..13f6152560 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -102,15 +102,53 @@ static int write_one_shallow(const struct commit_graft *graft, void *cb_data) return 0; } +struct output_state { + char buffer[8193]; + int used; +}; + +static int relay_pack_data(int pack_objects_out, struct output_state *os) +{ + /* + * We keep the last byte to ourselves + * in case we detect broken rev-list, so that we + * can leave the stream corrupted. This is + * unfortunate -- unpack-objects would happily + * accept a valid packdata with trailing garbage, + * so appending garbage after we pass all the + * pack data is not good enough to signal + * breakage to downstream. 
+ */ + ssize_t readsz; + + readsz = xread(pack_objects_out, os->buffer + os->used, + sizeof(os->buffer) - os->used); + if (readsz < 0) { + return readsz; + } + os->used += readsz; + + if (os->used > 1) { + send_client_data(1, os->buffer, os->used - 1); + os->buffer[0] = os->buffer[os->used - 1]; + os->used = 1; + } else { + send_client_data(1, os->buffer, os->used); + os->used = 0; + } + + return readsz; +} + static void create_pack_file(const struct object_array *have_obj, const struct object_array *want_obj, struct list_objects_filter_options *filter_options) { struct child_process pack_objects = CHILD_PROCESS_INIT; - char data[8193], progress[128]; + struct output_state output_state = { { 0 } }; + char progress[128]; char abort_msg[] = "aborting due to possible repository " "corruption on the remote side."; - int buffered = -1; ssize_t sz; int i; FILE *pipe_fd; @@ -239,39 +277,15 @@ static void create_pack_file(const struct object_array *have_obj, continue; } if (0 <= pu && (pfd[pu].revents & (POLLIN|POLLHUP))) { - /* Data ready; we keep the last byte to ourselves - * in case we detect broken rev-list, so that we - * can leave the stream corrupted. This is - * unfortunate -- unpack-objects would happily - * accept a valid packdata with trailing garbage, - * so appending garbage after we pass all the - * pack data is not good enough to signal - * breakage to downstream. 
- */ - char *cp = data; - ssize_t outsz = 0; - if (0 <= buffered) { - *cp++ = buffered; - outsz++; - } - sz = xread(pack_objects.out, cp, - sizeof(data) - outsz); - if (0 < sz) - ; - else if (sz == 0) { + int result = relay_pack_data(pack_objects.out, + &output_state); + + if (result == 0) { close(pack_objects.out); pack_objects.out = -1; - } - else + } else if (result < 0) { goto fail; - sz += outsz; - if (1 < sz) { - buffered = data[sz-1] & 0xFF; - sz--; } - else - buffered = -1; - send_client_data(1, data, sz); } /* @@ -296,9 +310,8 @@ static void create_pack_file(const struct object_array *have_obj, } /* flush the data */ - if (0 <= buffered) { - data[0] = buffered; - send_client_data(1, data, 1); + if (output_state.used > 0) { + send_client_data(1, output_state.buffer, output_state.used); fprintf(stderr, "flushed.\n"); } if (use_sideband) From patchwork Fri May 29 22:30:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11579765 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 012E7912 for ; Fri, 29 May 2020 22:30:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D57FF20810 for ; Fri, 29 May 2020 22:30:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="PvWJSE8w" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728511AbgE2Waq (ORCPT ); Fri, 29 May 2020 18:30:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59234 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728443AbgE2Wan (ORCPT ); Fri, 29 May 2020 18:30:43 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) 
by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A1383C08C5C9 for ; Fri, 29 May 2020 15:30:41 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id x10so4744098ybx.8 for ; Fri, 29 May 2020 15:30:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=2NIbA7zBAR/xGgbqyHMig/cutK7VNGXILLWiI7Nd71c=; b=PvWJSE8wN+Jxs58ByLaE9cdgRfaQch6quLe6iTjtmNtyku6S86fWCLnnw8f98pT1Sh cnlPu5619cVe4r7XtU4K0xyq7gQVj5TGZrpfwt1OSeypN3A8spHwo1OO4fpttjSs8crH vQ/pmFvqi2jGMMvKulX0cikAZQgZjH/7p8bSBQkB/ffnOTGXE7TaEGfzDqnWK7l+num3 DDqr++SlytzwcvN3scR/YgF4Iee8zd9DBmNHad1d5rB/6Bu0sO8N16dXglns37sLzl3x 84kBy2ws61JmXtFpZDhpSYftDJiOpUbs+WU9oeWBT7skWVcpupp1B6KDL1eU99lY4LS1 mS1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=2NIbA7zBAR/xGgbqyHMig/cutK7VNGXILLWiI7Nd71c=; b=aFgafQgtFdG+jxRO894XfGDB+FMaU1gLk1iMYm3jaDxkxe6NXEoez2ytZ7eXr6nU6K Hk+tO4iQiG6Pb4QAGsopzlEOp+TpolEkZyaElvPCulr8inPPv3lAArZjb69VI0bcahbK kHnOvDbOgrxVGmHswMjicTllYHCClkbxqeLSbEYNMdH+eKkenGfo4UVGuPIn1AG0pn0P XA7UafujNRMzbzmrC7808WvNSVlivgYOebKlGSL9ze+9PHrXuSQ3ybXkVifWu5MMKZ0V mSXBZLkuBr1mSAu3/m6GyuYqN8yCyqPwDwKkN22nY4DV9dQqU2NSJKywg4B6kV3am91n fUmQ== X-Gm-Message-State: AOAM5329Nq36bGpmdU8XwaXQMAs67Pm5t3uTJG/+A5iXKV/Zzi9qkijY BvbyqWJZxsJfXtF/Ki8R9HDnsviH613QTGnbkPJs3eYk+Ca4Ti9i/GQxbP0Nx+E4u4KV0X/Hc+O hmgnwj7EQd6lJvhTo4yRWVqTFB1RfTKgyZ3LMIQcTClaSuqMmMpYDZaNTYOBpW2/sU+tBdpIjBe Vx X-Google-Smtp-Source: ABdhPJxqR8fxHb2ihiRsZ3QmujVHuVdhlS3AoG2vvAWIPwELf9SkTYp6r58hPHeMdi0CsSY3k7FgEWkD4EZG3VvrZzf1 X-Received: by 2002:a5b:c87:: with SMTP id i7mr16823152ybq.182.1590791440797; Fri, 29 May 2020 15:30:40 -0700 (PDT) Date: Fri, 29 May 2020 15:30:19 -0700 In-Reply-To: Message-Id: <4a34e5104a5e3aafc4efc81419fb18296d422577.1590789428.git.jonathantanmy@google.com> Mime-Version: 1.0 
References: X-Mailer: git-send-email 2.27.0.rc0.183.gde8f92d652-goog Subject: [PATCH 7/8] fetch-pack: support more than one pack lockfile From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Whenever a fetch results in a packfile being downloaded, a .keep file is generated, so that the packfile can be preserved (from, say, a running "git repack") until refs are written referring to the contents of the packfile. In a subsequent patch, a successful fetch using protocol v2 may result in more than one .keep file being generated. Therefore, teach fetch_pack() and the transport mechanism to support multiple .keep files. Implementation notes: - builtin/fetch-pack.c normally does not generate .keep files, and thus is unaffected by this or future changes. However, it has an undocumented "--lock-pack" feature, used by remote-curl.c when implementing the "fetch" remote helper command. In keeping with the remote helper protocol, only one "lock" line will ever be written; the rest will result in warnings to stderr. However, in practice, warnings will never be written because the remote-curl.c "fetch" is only used for protocol v0/v1 (which will not generate multiple .keep files). (Protocol v2 uses the "stateless-connect" command, not the "fetch" command.) - connected.c has an optimization in that connectivity checks on a ref need not be done if the target object is in a pack known to be self-contained and connected. If there are multiple packfiles, this optimization can no longer be done. 
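As a rough illustration of the connected.c note above, the sketch below shows the shape of the check: the shortcut applies only when exactly one lockfile was recorded and its name ends in ".keep". It is a standalone approximation under assumptions, not the code in this patch. keep_to_idx() and its plain-array arguments are hypothetical stand-ins for the string_list and strip_suffix() that git uses.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): if exactly one lockfile was
 * recorded and it ends in ".keep", write the matching ".idx" path to
 * `out` and return 1. Otherwise return 0, meaning the caller has to
 * fall back to the full connectivity check.
 */
static int keep_to_idx(const char *const *lockfiles, size_t nr,
		       char *out, size_t outsz)
{
	static const char suffix[] = ".keep";
	size_t len, base_len, suffix_len = sizeof(suffix) - 1;

	if (nr != 1)
		return 0; /* zero or several packs: no shortcut */

	len = strlen(lockfiles[0]);
	if (len < suffix_len ||
	    strcmp(lockfiles[0] + len - suffix_len, suffix))
		return 0; /* not a .keep file */

	base_len = len - suffix_len;
	if (base_len + sizeof(".idx") > outsz)
		return 0; /* caller's buffer is too small */

	/* Copy the name without ".keep", then append ".idx". */
	snprintf(out, outsz, "%.*s.idx", (int)base_len, lockfiles[0]);
	return 1;
}
```

With a single entry such as "pack-abc.keep" the helper yields "pack-abc.idx". With two entries it returns 0, which corresponds to the case where the optimization can no longer be used.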
Signed-off-by: Jonathan Tan --- builtin/fetch-pack.c | 17 +++++++++++------ connected.c | 8 +++++--- fetch-pack.c | 29 +++++++++++++++-------------- fetch-pack.h | 2 +- transport-helper.c | 5 +++-- transport.c | 12 +++++++----- transport.h | 6 +++--- 7 files changed, 45 insertions(+), 34 deletions(-) diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index 94b0c89b82..bbb5c96167 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -48,8 +48,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) struct ref **sought = NULL; int nr_sought = 0, alloc_sought = 0; int fd[2]; - char *pack_lockfile = NULL; - char **pack_lockfile_ptr = NULL; + struct string_list pack_lockfiles = STRING_LIST_INIT_DUP; + struct string_list *pack_lockfiles_ptr = NULL; struct child_process *conn; struct fetch_pack_args args; struct oid_array shallow = OID_ARRAY_INIT; @@ -134,7 +134,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) } if (!strcmp("--lock-pack", arg)) { args.lock_pack = 1; - pack_lockfile_ptr = &pack_lockfile; + pack_lockfiles_ptr = &pack_lockfiles; continue; } if (!strcmp("--check-self-contained-and-connected", arg)) { @@ -235,10 +235,15 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) } ref = fetch_pack(&args, fd, ref, sought, nr_sought, - &shallow, pack_lockfile_ptr, version); - if (pack_lockfile) { - printf("lock %s\n", pack_lockfile); + &shallow, pack_lockfiles_ptr, version); + if (pack_lockfiles.nr) { + int i; + + printf("lock %s\n", pack_lockfiles.items[0].string); fflush(stdout); + for (i = 1; i < pack_lockfiles.nr; i++) + warning(_("Lockfile created but not reported: %s"), + pack_lockfiles.items[i].string); } if (args.check_self_contained_and_connected && args.self_contained_and_connected) { diff --git a/connected.c b/connected.c index 3135b71e19..937b4bae38 100644 --- a/connected.c +++ b/connected.c @@ -43,10 +43,12 @@ int check_connected(oid_iterate_fn fn, void *cb_data, if (transport && 
transport->smart_options && transport->smart_options->self_contained_and_connected && - transport->pack_lockfile && - strip_suffix(transport->pack_lockfile, ".keep", &base_len)) { + transport->pack_lockfiles.nr == 1 && + strip_suffix(transport->pack_lockfiles.items[0].string, + ".keep", &base_len)) { struct strbuf idx_file = STRBUF_INIT; - strbuf_add(&idx_file, transport->pack_lockfile, base_len); + strbuf_add(&idx_file, transport->pack_lockfiles.items[0].string, + base_len); strbuf_addstr(&idx_file, ".idx"); new_pack = add_packed_git(idx_file.buf, idx_file.len, 1); strbuf_release(&idx_file); diff --git a/fetch-pack.c b/fetch-pack.c index d8bbf45ee2..0a9a82bc46 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -794,7 +794,7 @@ static void write_promisor_file(const char *keep_name, } static int get_pack(struct fetch_pack_args *args, - int xd[2], char **pack_lockfile, + int xd[2], struct string_list *pack_lockfiles, struct ref **sought, int nr_sought) { struct async demux; @@ -838,7 +838,7 @@ static int get_pack(struct fetch_pack_args *args, } if (do_keep || args->from_promisor) { - if (pack_lockfile) + if (pack_lockfiles) cmd.out = -1; cmd_name = "index-pack"; argv_array_push(&cmd.args, cmd_name); @@ -863,7 +863,7 @@ static int get_pack(struct fetch_pack_args *args, * information below. If not, we need index-pack to do it for * us. 
*/ - if (!(do_keep && pack_lockfile) && args->from_promisor) + if (!(do_keep && pack_lockfiles) && args->from_promisor) argv_array_push(&cmd.args, "--promisor"); } else { @@ -899,8 +899,9 @@ static int get_pack(struct fetch_pack_args *args, cmd.git_cmd = 1; if (start_command(&cmd)) die(_("fetch-pack: unable to fork off %s"), cmd_name); - if (do_keep && pack_lockfile) { - *pack_lockfile = index_pack_lockfile(cmd.out); + if (do_keep && pack_lockfiles) { + string_list_append_nodup(pack_lockfiles, + index_pack_lockfile(cmd.out)); close(cmd.out); } @@ -922,8 +923,8 @@ static int get_pack(struct fetch_pack_args *args, * Now that index-pack has succeeded, write the promisor file using the * obtained .keep filename if necessary */ - if (do_keep && pack_lockfile && args->from_promisor) - write_promisor_file(*pack_lockfile, sought, nr_sought); + if (do_keep && pack_lockfiles && pack_lockfiles->nr && args->from_promisor) + write_promisor_file(pack_lockfiles->items[0].string, sought, nr_sought); return 0; } @@ -940,7 +941,7 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args, const struct ref *orig_ref, struct ref **sought, int nr_sought, struct shallow_info *si, - char **pack_lockfile) + struct string_list *pack_lockfiles) { struct repository *r = the_repository; struct ref *ref = copy_ref_list(orig_ref); @@ -1067,7 +1068,7 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args, alternate_shallow_file = setup_temporary_shallow(si->shallow); else alternate_shallow_file = NULL; - if (get_pack(args, fd, pack_lockfile, sought, nr_sought)) + if (get_pack(args, fd, pack_lockfiles, sought, nr_sought)) die(_("git fetch-pack: fetch failed.")); all_done: @@ -1464,7 +1465,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, struct ref **sought, int nr_sought, struct oid_array *shallows, struct shallow_info *si, - char **pack_lockfile) + struct string_list *pack_lockfiles) { struct repository *r = the_repository; struct ref *ref = 
copy_ref_list(orig_ref); @@ -1571,7 +1572,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, /* get the pack */ process_section_header(&reader, "packfile", 0); - if (get_pack(args, fd, pack_lockfile, sought, nr_sought)) + if (get_pack(args, fd, pack_lockfiles, sought, nr_sought)) die(_("git fetch-pack: fetch failed.")); do_check_stateless_delimiter(args, &reader); @@ -1772,7 +1773,7 @@ struct ref *fetch_pack(struct fetch_pack_args *args, const struct ref *ref, struct ref **sought, int nr_sought, struct oid_array *shallow, - char **pack_lockfile, + struct string_list *pack_lockfiles, enum protocol_version version) { struct ref *ref_cpy; @@ -1807,11 +1808,11 @@ struct ref *fetch_pack(struct fetch_pack_args *args, memset(&si, 0, sizeof(si)); ref_cpy = do_fetch_pack_v2(args, fd, ref, sought, nr_sought, &shallows_scratch, &si, - pack_lockfile); + pack_lockfiles); } else { prepare_shallow_info(&si, shallow); ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought, - &si, pack_lockfile); + &si, pack_lockfiles); } reprepare_packed_git(the_repository); diff --git a/fetch-pack.h b/fetch-pack.h index 67f684229a..85d1e39fe7 100644 --- a/fetch-pack.h +++ b/fetch-pack.h @@ -83,7 +83,7 @@ struct ref *fetch_pack(struct fetch_pack_args *args, struct ref **sought, int nr_sought, struct oid_array *shallow, - char **pack_lockfile, + struct string_list *pack_lockfiles, enum protocol_version version); /* diff --git a/transport-helper.c b/transport-helper.c index a46afcb69d..93a6f50793 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -410,10 +410,11 @@ static int fetch_with_fetch(struct transport *transport, exit(128); if (skip_prefix(buf.buf, "lock ", &name)) { - if (transport->pack_lockfile) + if (transport->pack_lockfiles.nr) warning(_("%s also locked %s"), data->name, name); else - transport->pack_lockfile = xstrdup(name); + string_list_append(&transport->pack_lockfiles, + name); } else if (data->check_connectivity && 
data->transport_options.check_self_contained_and_connected && diff --git a/transport.c b/transport.c index 7d50c502ad..6ee6771f55 100644 --- a/transport.c +++ b/transport.c @@ -378,7 +378,7 @@ static int fetch_refs_via_pack(struct transport *transport, refs = fetch_pack(&args, data->fd, refs_tmp ? refs_tmp : transport->remote_refs, to_fetch, nr_heads, &data->shallow, - &transport->pack_lockfile, data->version); + &transport->pack_lockfiles, data->version); close(data->fd[0]); close(data->fd[1]); @@ -921,6 +921,7 @@ struct transport *transport_get(struct remote *remote, const char *url) struct transport *ret = xcalloc(1, sizeof(*ret)); ret->progress = isatty(2); + string_list_init(&ret->pack_lockfiles, 1); if (!remote) BUG("No remote provided to transport_get()"); @@ -1316,10 +1317,11 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs) void transport_unlock_pack(struct transport *transport) { - if (transport->pack_lockfile) { - unlink_or_warn(transport->pack_lockfile); - FREE_AND_NULL(transport->pack_lockfile); - } + int i; + + for (i = 0; i < transport->pack_lockfiles.nr; i++) + unlink_or_warn(transport->pack_lockfiles.items[i].string); + string_list_clear(&transport->pack_lockfiles, 0); } int transport_connect(struct transport *transport, const char *name, diff --git a/transport.h b/transport.h index 4298c855be..05efa72db1 100644 --- a/transport.h +++ b/transport.h @@ -5,8 +5,7 @@ #include "run-command.h" #include "remote.h" #include "list-objects-filter-options.h" - -struct string_list; +#include "string-list.h" struct git_transport_options { unsigned thin : 1; @@ -98,7 +97,8 @@ struct transport { */ const struct string_list *server_options; - char *pack_lockfile; + struct string_list pack_lockfiles; + signed verbose : 3; /** * Transports should not set this directly, and should use this From patchwork Fri May 29 22:30:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 
Jonathan Tan X-Patchwork-Id: 11579763 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 64C3E912 for ; Fri, 29 May 2020 22:30:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3F9E620897 for ; Fri, 29 May 2020 22:30:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="K+3Po0ED" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728513AbgE2Waq (ORCPT ); Fri, 29 May 2020 18:30:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59238 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728499AbgE2Wan (ORCPT ); Fri, 29 May 2020 18:30:43 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 55AEFC03E969 for ; Fri, 29 May 2020 15:30:43 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id r18so4729331ybg.10 for ; Fri, 29 May 2020 15:30:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=cUKHyxmV7H1d49kXyReIwwb/hcHEYYYeMJPZ+erbffA=; b=K+3Po0EDGYFg08SSewSEkljUrPDszpOsqhySC9UhDahfZSDnXnah3HW3wNOj8x41Mb L/U4D8bqHgqrjkCw1W+pkSZ2cPrpqLDeUx4zKWt3fyk3qagZko6zzEc+nKqIkw/y31/E YOzs5eywnS7g/HG18NPOS7Ednl+ujfumiOqMi/5ORpO5aeuNDw09f3aEa8YYOdiULoXQ msvPFP6i2YNg61VIsCLNEiOvxzpIBm/Vj0Vt92gZ3GP47YWo7t9kfPvBtUIm8h29w/qK 6jowps0GjLnc6uzTFbpMOb/3Apt7mhUg/qpJRS8ovCZskSYAvxievjTbEZ8lLjyC8+XF ngWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; 
bh=cUKHyxmV7H1d49kXyReIwwb/hcHEYYYeMJPZ+erbffA=; b=hULSzBcdV1bDd4jTakZvugb/+VXMhxd47/ZtCZDs39YfdekyphcsrDw0V7O+UR95L5 5FmNWbs8xRJGmfSIjP/HWEXtNlYDVm0pm5Ti+xrbIwR1e4tEpBoKh8sNMUcN4V1PxTWr Yt54y8JFjnE08Lw4adziYfkeKAUUTlR/Po6AEO7Or6S5R5TPYdesN2dVaU9s60T382pt dW+gpLy6lx3CoJosAdMhb+ca1c5QGVjMspkp+OrRwLT2thXPN6ZtxistuYovgUoI/rCH LewLL4WutfXc7lXlaBTZPcsu4Q6kcpWI5vs5pK6ZTIpynvecGYlmZzQK3jhPJ3Kz/Acq U/tg== X-Gm-Message-State: AOAM530lcf/0GDtTGRM4ycZ9uzlB9crygANbi1AKoJHBiX7iQhVuYLXl aOiFp8Mi9lNbTzVNgByiJW7dLV34QS9SNAd8B1bJ1URH6zQf/jnCye0b7e8CmD4eRWtzQ2aUlvC 1ZaO9CXW1B8GCjwwROcWq5AjLFciVoIZz/M+g1oIKC7lh5gOtBw0U9tDECvltovvjDnkY/Sry0l PN X-Google-Smtp-Source: ABdhPJw5Rqdq83Ha++oHInk7v4aGO4e+CpHbNmQNBEXuxmQe6Zt0Lo2MZtEy5UCBwcDejJxNZi3u3pvEyRK02BGOt1k9 X-Received: by 2002:a25:ac8:: with SMTP id 191mr16767243ybk.390.1590791442431; Fri, 29 May 2020 15:30:42 -0700 (PDT) Date: Fri, 29 May 2020 15:30:20 -0700 In-Reply-To: Message-Id: <2cfee363873736d9ff73cd38d96f3533bb49c904.1590789428.git.jonathantanmy@google.com> Mime-Version: 1.0 References: X-Mailer: git-send-email 2.27.0.rc0.183.gde8f92d652-goog Subject: [PATCH 8/8] upload-pack: send part of packfile response as uri From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Teach upload-pack to send part of its packfile response as URIs. An administrator may configure a repository with one or more "uploadpack.blobpackfileuri" lines, each line containing an OID, a pack hash, and a URI. A client may configure fetch.uriprotocols to be a comma-separated list of protocols that it is willing to use to fetch additional packfiles - this list will be sent to the server. Whenever an object with one of those OIDs would appear in the packfile transmitted by upload-pack, the server may exclude that object, and instead send the URI. The client will then download the packs referred to by those URIs before performing the connectivity check. 
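To make the protocol matching concrete: an object is excluded only when its configured URI starts with one of the client's protocols followed by ':'. The sketch below applies that rule to a comma-separated list such as the one in fetch.uriprotocols. It is a standalone approximation under assumptions, not the code in this patch. uri_matches_protocols() is a hypothetical name, and pack-objects itself takes the protocols through its --uri-protocol option rather than as one string.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): return 1 if `uri` begins with
 * one of the protocols in the comma-separated list `protocols`,
 * immediately followed by ':'. Empty list entries are ignored.
 */
static int uri_matches_protocols(const char *uri, const char *protocols)
{
	const char *p = protocols;

	while (*p) {
		/* Length of the current entry, up to the next comma. */
		size_t n = strcspn(p, ",");

		/* "http" must not match "https://...": require ':' next. */
		if (n && !strncmp(uri, p, n) && uri[n] == ':')
			return 1;
		p += n;
		if (*p == ',')
			p++;
	}
	return 0;
}
```

Requiring the ':' right after the protocol name is what keeps a client that offers only "http" from being handed an "https" URI.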
Signed-off-by: Jonathan Tan --- builtin/pack-objects.c | 76 ++++++++++++++++++++++++++++ fetch-pack.c | 110 +++++++++++++++++++++++++++++++++++++++-- t/t5702-protocol-v2.sh | 88 +++++++++++++++++++++++++++++++++ upload-pack.c | 80 +++++++++++++++++++++++++++--- 4 files changed, 344 insertions(+), 10 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index c5b433a23f..7016b28485 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -117,6 +117,8 @@ static unsigned long window_memory_limit = 0; static struct list_objects_filter_options filter_options; +static struct string_list uri_protocols = STRING_LIST_INIT_NODUP; + enum missing_action { MA_ERROR = 0, /* fail if any missing objects are encountered */ MA_ALLOW_ANY, /* silently allow ALL missing objects */ @@ -125,6 +127,15 @@ enum missing_action { static enum missing_action arg_missing_action; static show_object_fn fn_show_object; +struct configured_exclusion { + struct oidmap_entry e; + char *pack_hash_hex; + char *uri; +}; +static struct oidmap configured_exclusions; + +static struct oidset excluded_by_config; + /* * stats */ @@ -969,6 +980,25 @@ static void write_reused_pack(struct hashfile *f) unuse_pack(&w_curs); } +static void write_excluded_by_configs(void) +{ + struct oidset_iter iter; + const struct object_id *oid; + + oidset_iter_init(&excluded_by_config, &iter); + while ((oid = oidset_iter_next(&iter))) { + struct configured_exclusion *ex = + oidmap_get(&configured_exclusions, oid); + + if (!ex) + BUG("configured exclusion wasn't configured"); + write_in_full(1, ex->pack_hash_hex, strlen(ex->pack_hash_hex)); + write_in_full(1, " ", 1); + write_in_full(1, ex->uri, strlen(ex->uri)); + write_in_full(1, "\n", 1); + } +} + static const char no_split_warning[] = N_( "disabling bitmap writing, packs are split due to pack.packSizeLimit" ); @@ -1266,6 +1296,25 @@ static int want_object_in_pack(const struct object_id *oid, } } + if (uri_protocols.nr) { + struct 
configured_exclusion *ex = + oidmap_get(&configured_exclusions, oid); + int i; + const char *p; + + if (ex) { + for (i = 0; i < uri_protocols.nr; i++) { + if (skip_prefix(ex->uri, + uri_protocols.items[i].string, + &p) && + *p == ':') { + oidset_insert(&excluded_by_config, oid); + return 0; + } + } + } + } + return 1; } @@ -2864,6 +2913,29 @@ static int git_pack_config(const char *k, const char *v, void *cb) pack_idx_opts.version); return 0; } + if (!strcmp(k, "uploadpack.blobpackfileuri")) { + struct configured_exclusion *ex = xmalloc(sizeof(*ex)); + const char *oid_end, *pack_end; + /* + * Stores the pack hash. This is not a true object ID, but is + * of the same form. + */ + struct object_id pack_hash; + + if (parse_oid_hex(v, &ex->e.oid, &oid_end) || + *oid_end != ' ' || + parse_oid_hex(oid_end + 1, &pack_hash, &pack_end) || + *pack_end != ' ') + die(_("value of uploadpack.blobpackfileuri must be " + "of the form '<object-hash> <pack-hash> <uri>' (got '%s')"), v); + if (oidmap_get(&configured_exclusions, &ex->e.oid)) + die(_("object already configured in another " + "uploadpack.blobpackfileuri (got '%s')"), v); + ex->pack_hash_hex = xcalloc(1, pack_end - oid_end); + memcpy(ex->pack_hash_hex, oid_end + 1, pack_end - oid_end - 1); + ex->uri = xstrdup(pack_end + 1); + oidmap_put(&configured_exclusions, ex); + } return git_default_config(k, v, cb); } @@ -3462,6 +3534,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) N_("do not pack objects in promisor packfiles")), OPT_BOOL(0, "delta-islands", &use_delta_islands, N_("respect islands during delta compression")), + OPT_STRING_LIST(0, "uri-protocol", &uri_protocols, + N_("protocol"), + N_("exclude any configured uploadpack.blobpackfileuri with this protocol")), OPT_END(), }; @@ -3650,6 +3725,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) } trace2_region_enter("pack-objects", "write-pack-file", the_repository); + write_excluded_by_configs(); write_pack_file(); trace2_region_leave("pack-objects", 
"write-pack-file", the_repository); diff --git a/fetch-pack.c b/fetch-pack.c index 0a9a82bc46..9668c0b66e 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -38,6 +38,7 @@ static int server_supports_filtering; static struct shallow_lock shallow_lock; static const char *alternate_shallow_file; static struct strbuf fsck_msg_types = STRBUF_INIT; +static struct string_list uri_protocols = STRING_LIST_INIT_DUP; /* Remember to update object flag allocation in object.h */ #define COMPLETE (1U << 0) @@ -795,6 +796,7 @@ static void write_promisor_file(const char *keep_name, static int get_pack(struct fetch_pack_args *args, int xd[2], struct string_list *pack_lockfiles, + int only_packfile, struct ref **sought, int nr_sought) { struct async demux; @@ -855,8 +857,15 @@ static int get_pack(struct fetch_pack_args *args, "--keep=fetch-pack %"PRIuMAX " on %s", (uintmax_t)getpid(), hostname); } - if (args->check_self_contained_and_connected) + if (only_packfile && args->check_self_contained_and_connected) argv_array_push(&cmd.args, "--check-self-contained-and-connected"); + else + /* + * We cannot perform any connectivity checks because + * not all packs have been downloaded; let the caller + * have this responsibility. 
+ */ + args->check_self_contained_and_connected = 0; /* * If we're obtaining the filename of a lockfile, we'll use * that filename to write a .promisor file with more @@ -1068,7 +1077,7 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args, alternate_shallow_file = setup_temporary_shallow(si->shallow); else alternate_shallow_file = NULL; - if (get_pack(args, fd, pack_lockfiles, sought, nr_sought)) + if (get_pack(args, fd, pack_lockfiles, 1, sought, nr_sought)) die(_("git fetch-pack: fetch failed.")); all_done: @@ -1222,6 +1231,26 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out, warning("filtering not recognized by server, ignoring"); } + if (server_supports_feature("fetch", "packfile-uris", 0)) { + int i; + struct strbuf to_send = STRBUF_INIT; + + for (i = 0; i < uri_protocols.nr; i++) { + const char *s = uri_protocols.items[i].string; + + if (!strcmp(s, "https") || !strcmp(s, "http")) { + if (to_send.len) + strbuf_addch(&to_send, ','); + strbuf_addstr(&to_send, s); + } + } + if (to_send.len) { + packet_buf_write(&req_buf, "packfile-uris %s", + to_send.buf); + strbuf_release(&to_send); + } + } + /* add wants */ add_wants(args->no_dependents, wants, &req_buf); @@ -1444,6 +1473,21 @@ static void receive_wanted_refs(struct packet_reader *reader, die(_("error processing wanted refs: %d"), reader->status); } +static void receive_packfile_uris(struct packet_reader *reader, + struct string_list *uris) +{ + process_section_header(reader, "packfile-uris", 0); + while (packet_reader_read(reader) == PACKET_READ_NORMAL) { + if (reader->pktlen < the_hash_algo->hexsz || + reader->line[the_hash_algo->hexsz] != ' ') + die("expected '<hash> <uri>', got: %s\n", reader->line); + + string_list_append(uris, reader->line); + } + if (reader->status != PACKET_READ_DELIM) + die("expected DELIM"); +} + enum fetch_state { FETCH_CHECK_LOCAL = 0, FETCH_SEND_REQUEST, @@ -1477,6 +1521,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, struct 
fetch_negotiator negotiator_alloc; struct fetch_negotiator *negotiator; int seen_ack = 0; + struct string_list packfile_uris = STRING_LIST_INIT_DUP; + int i; if (args->no_dependents) { negotiator = NULL; @@ -1570,9 +1616,12 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, if (process_section_header(&reader, "wanted-refs", 1)) receive_wanted_refs(&reader, sought, nr_sought); - /* get the pack */ + /* get the pack(s) */ + if (process_section_header(&reader, "packfile-uris", 1)) + receive_packfile_uris(&reader, &packfile_uris); process_section_header(&reader, "packfile", 0); - if (get_pack(args, fd, pack_lockfiles, sought, nr_sought)) + if (get_pack(args, fd, pack_lockfiles, + !packfile_uris.nr, sought, nr_sought)) die(_("git fetch-pack: fetch failed.")); do_check_stateless_delimiter(args, &reader); @@ -1583,8 +1632,53 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, } } + for (i = 0; i < packfile_uris.nr; i++) { + struct child_process cmd = CHILD_PROCESS_INIT; + char packname[GIT_MAX_HEXSZ + 1]; + const char *uri = packfile_uris.items[i].string + + the_hash_algo->hexsz + 1; + + argv_array_push(&cmd.args, "http-fetch"); + argv_array_push(&cmd.args, "--packfile"); + argv_array_push(&cmd.args, uri); + cmd.git_cmd = 1; + cmd.no_stdin = 1; + cmd.out = -1; + if (start_command(&cmd)) + die("fetch-pack: unable to spawn http-fetch"); + + if (read_in_full(cmd.out, packname, 5) < 0 || + memcmp(packname, "keep\t", 5)) + die("fetch-pack: expected keep then TAB at start of http-fetch output"); + + if (read_in_full(cmd.out, packname, + the_hash_algo->hexsz + 1) < 0 || + packname[the_hash_algo->hexsz] != '\n') + die("fetch-pack: expected hash then LF at end of http-fetch output"); + + packname[the_hash_algo->hexsz] = '\0'; + + close(cmd.out); + + if (finish_command(&cmd)) + die("fetch-pack: unable to finish http-fetch"); + + if (memcmp(packfile_uris.items[i].string, packname, + the_hash_algo->hexsz)) + die("fetch-pack: pack downloaded from %s 
does not match expected hash %.*s", + uri, (int) the_hash_algo->hexsz, + packfile_uris.items[i].string); + + string_list_append_nodup(pack_lockfiles, + xstrfmt("%s/pack/pack-%s.keep", + get_object_directory(), + packname)); + } + string_list_clear(&packfile_uris, 0); + if (negotiator) negotiator->release(negotiator); + oidset_clear(&common); return ref; } @@ -1621,6 +1715,14 @@ static void fetch_pack_config(void) git_config_get_bool("repack.usedeltabaseoffset", &prefer_ofs_delta); git_config_get_bool("fetch.fsckobjects", &fetch_fsck_objects); git_config_get_bool("transfer.fsckobjects", &transfer_fsck_objects); + if (!uri_protocols.nr) { + char *str; + + if (!git_config_get_string("fetch.uriprotocols", &str) && str) { + string_list_split(&uri_protocols, str, ',', -1); + free(str); + } + } git_config(fetch_pack_config_cb, NULL); } diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh index 8da65e60de..3ea9345e6f 100755 --- a/t/t5702-protocol-v2.sh +++ b/t/t5702-protocol-v2.sh @@ -795,6 +795,94 @@ test_expect_success 'when server does not send "ready", expect FLUSH' ' test_i18ngrep "expected no other sections to be sent after no .ready." 
err ' +configure_exclusion () { + git -C "$1" hash-object "$2" >objh && + git -C "$1" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh && + git -C "$1" config --add \ + "uploadpack.blobpackfileuri" \ + "$(cat objh) $(cat packh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" && + cat objh +} + +test_expect_success 'part of packfile response provided as URI' ' + P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && + rm -rf "$P" http_child log && + + git init "$P" && + git -C "$P" config "uploadpack.allowsidebandall" "true" && + + echo my-blob >"$P/my-blob" && + git -C "$P" add my-blob && + echo other-blob >"$P/other-blob" && + git -C "$P" add other-blob && + git -C "$P" commit -m x && + + configure_exclusion "$P" my-blob >h && + configure_exclusion "$P" other-blob >h2 && + + GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \ + git -c protocol.version=2 \ + -c fetch.uriprotocols=http,https \ + clone "$HTTPD_URL/smart/http_parent" http_child && + + # Ensure that my-blob and other-blob are in separate packfiles. + for idx in http_child/.git/objects/pack/*.idx + do + git verify-pack --verbose $idx >out && + { + grep "^[0-9a-f]\{16,\} " out || : + } >out.objectlist && + if test_line_count = 1 out.objectlist + then + if grep $(cat h) out + then + >hfound + fi && + if grep $(cat h2) out + then + >h2found + fi + fi + done && + test -f hfound && + test -f h2found && + + # Ensure that there are exactly 6 files (3 .pack and 3 .idx). 
+ ls http_child/.git/objects/pack/* >filelist && + test_line_count = 6 filelist +' + +test_expect_success 'fetching with valid packfile URI but invalid hash fails' ' + P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && + rm -rf "$P" http_child log && + + git init "$P" && + git -C "$P" config "uploadpack.allowsidebandall" "true" && + + echo my-blob >"$P/my-blob" && + git -C "$P" add my-blob && + echo other-blob >"$P/other-blob" && + git -C "$P" add other-blob && + git -C "$P" commit -m x && + + configure_exclusion "$P" my-blob >h && + # Configure a URL for other-blob. Just reuse the hash of the object as + # the hash of the packfile, since the hash does not matter for this + # test as long as it is not the hash of the pack, and it is of the + # expected length. + git -C "$P" hash-object other-blob >objh && + git -C "$P" pack-objects "$HTTPD_DOCUMENT_ROOT_PATH/mypack" <objh >packh && + git -C "$P" config --add \ + "uploadpack.blobpackfileuri" \ + "$(cat objh) $(cat objh) $HTTPD_URL/dumb/mypack-$(cat packh).pack" && + + GIT_TEST_SIDEBAND_ALL=1 \ + test_must_fail git -c protocol.version=2 \ + -c fetch.uriprotocols=http,https \ + clone "$HTTPD_URL/smart/http_parent" http_child 2>err && + test_i18ngrep "pack downloaded from.*does not match expected hash" err +' + # DO NOT add non-httpd-specific tests here, because the last part of this # test script is only executed when httpd is available and enabled. 
diff --git a/upload-pack.c b/upload-pack.c index 13f6152560..d928d1c0a8 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -105,9 +105,12 @@ static int write_one_shallow(const struct commit_graft *graft, void *cb_data) struct output_state { char buffer[8193]; int used; + unsigned packfile_uris_started : 1; + unsigned packfile_started : 1; }; -static int relay_pack_data(int pack_objects_out, struct output_state *os) +static int relay_pack_data(int pack_objects_out, struct output_state *os, + int write_packfile_line) { /* * We keep the last byte to ourselves @@ -128,6 +131,37 @@ static int relay_pack_data(int pack_objects_out, struct output_state *os) } os->used += readsz; + while (!os->packfile_started) { + char *p; + if (os->used >= 4 && !memcmp(os->buffer, "PACK", 4)) { + os->packfile_started = 1; + if (write_packfile_line) { + if (os->packfile_uris_started) + packet_delim(1); + packet_write_fmt(1, "\1packfile\n"); + } + break; + } + if ((p = memchr(os->buffer, '\n', os->used))) { + if (!os->packfile_uris_started) { + os->packfile_uris_started = 1; + if (!write_packfile_line) + BUG("packfile_uris requires sideband-all"); + packet_write_fmt(1, "\1packfile-uris\n"); + } + *p = '\0'; + packet_write_fmt(1, "\1%s\n", os->buffer); + + os->used -= p - os->buffer + 1; + memmove(os->buffer, p + 1, os->used); + } else { + /* + * Incomplete line. 
+ */ + return readsz; + } + } + if (os->used > 1) { send_client_data(1, os->buffer, os->used - 1); os->buffer[0] = os->buffer[os->used - 1]; @@ -142,7 +176,8 @@ static int relay_pack_data(int pack_objects_out, struct output_state *os) static void create_pack_file(const struct object_array *have_obj, const struct object_array *want_obj, - struct list_objects_filter_options *filter_options) + struct list_objects_filter_options *filter_options, + const struct string_list *uri_protocols) { struct child_process pack_objects = CHILD_PROCESS_INIT; struct output_state output_state = { { 0 } }; @@ -192,6 +227,11 @@ static void create_pack_file(const struct object_array *have_obj, spec); } } + if (uri_protocols) { + for (i = 0; i < uri_protocols->nr; i++) + argv_array_pushf(&pack_objects.args, "--uri-protocol=%s", + uri_protocols->items[i].string); + } pack_objects.in = -1; pack_objects.out = -1; @@ -278,7 +318,8 @@ static void create_pack_file(const struct object_array *have_obj, } if (0 <= pu && (pfd[pu].revents & (POLLIN|POLLHUP))) { int result = relay_pack_data(pack_objects.out, - &output_state); + &output_state, + !!uri_protocols); if (result == 0) { close(pack_objects.out); @@ -1137,7 +1178,7 @@ void upload_pack(struct upload_pack_options *options) if (want_obj.nr) { struct object_array have_obj = OBJECT_ARRAY_INIT; get_common_commits(&reader, &have_obj, &want_obj); - create_pack_file(&have_obj, &want_obj, &filter_options); + create_pack_file(&have_obj, &want_obj, &filter_options, 0); } list_objects_filter_release(&filter_options); @@ -1154,6 +1195,7 @@ struct upload_pack_data { timestamp_t deepen_since; int deepen_rev_list; int deepen_relative; + struct string_list uri_protocols; struct list_objects_filter_options filter_options; @@ -1175,6 +1217,7 @@ static void upload_pack_data_init(struct upload_pack_data *data) struct oid_array haves = OID_ARRAY_INIT; struct object_array shallows = OBJECT_ARRAY_INIT; struct string_list deepen_not = STRING_LIST_INIT_DUP; + struct 
string_list uri_protocols = STRING_LIST_INIT_DUP; memset(data, 0, sizeof(*data)); data->wants = wants; @@ -1182,6 +1225,7 @@ static void upload_pack_data_init(struct upload_pack_data *data) data->haves = haves; data->shallows = shallows; data->deepen_not = deepen_not; + data->uri_protocols = uri_protocols; packet_writer_init(&data->writer, 1); } @@ -1342,10 +1386,18 @@ static void process_args(struct packet_reader *request, continue; } + if (skip_prefix(arg, "packfile-uris ", &p)) { + string_list_split(&data->uri_protocols, p, ',', -1); + continue; + } + /* ignore unknown lines maybe? */ die("unexpected line: '%s'", arg); } + if (data->uri_protocols.nr && !data->writer.use_sideband) + string_list_clear(&data->uri_protocols, 0); + if (request->status != PACKET_READ_FLUSH) die(_("expected flush after fetch arguments")); } @@ -1537,8 +1589,15 @@ int upload_pack_v2(struct repository *r, struct argv_array *keys, send_wanted_ref_info(&data); send_shallow_info(&data, &want_obj); - packet_writer_write(&data.writer, "packfile\n"); - create_pack_file(&have_obj, &want_obj, &data.filter_options); + if (data.uri_protocols.nr) { + create_pack_file(&have_obj, &want_obj, + &data.filter_options, + &data.uri_protocols); + } else { + packet_writer_write(&data.writer, "packfile\n"); + create_pack_file(&have_obj, &want_obj, + &data.filter_options, NULL); + } state = FETCH_DONE; break; case FETCH_DONE: @@ -1559,6 +1618,7 @@ int upload_pack_advertise(struct repository *r, int allow_filter_value; int allow_ref_in_want; int allow_sideband_all_value; + char *str = NULL; strbuf_addstr(value, "shallow"); @@ -1580,6 +1640,14 @@ int upload_pack_advertise(struct repository *r, &allow_sideband_all_value) && allow_sideband_all_value)) strbuf_addstr(value, " sideband-all"); + + if (!repo_config_get_string(the_repository, + "uploadpack.blobpackfileuri", + &str) && + str) { + strbuf_addstr(value, " packfile-uris"); + free(str); + } } return 1;