From patchwork Sat Feb 23 23:38:55 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 10827655
Date: Sat, 23 Feb 2019 15:38:55 -0800
Message-Id: <5ed3a4442e38165d0c45d53de5c8620b58a01015.1550963965.git.jonathantanmy@google.com>
X-Mailer: git-send-email 2.19.0.271.gfe8321ec05.dirty
Subject: [WIP 1/7] http: use --stdin and --keep when downloading pack
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, gitster@pobox.com, peff@peff.net, christian.couder@gmail.com, avarab@gmail.com

When Git fetches a pack using dumb HTTP, it does at least 2 things
differently from when it fetches using fetch-pack or receive-pack: (1)
it reuses the server's
name for the packfile (which incorporates a hash), and (2) it does not
create a .keep file to avoid race conditions with repack.

A subsequent patch will allow downloading packs over HTTP(S) as part of
a fetch. These downloads will not necessarily be from a Git repository,
and thus may not have a hash as part of their names. Also, generating a
.keep file will be necessary to avoid race conditions with repack (until
the fetch has successfully written the new refs).

Thus, teach http to pass --stdin and --keep to index-pack, the former so
that we have no reliance on the server's name for the packfile, and the
latter so that we have the .keep file.

Signed-off-by: Jonathan Tan
---
 http-push.c   |  7 ++++++-
 http-walker.c |  5 ++++-
 http.c        | 42 ++++++++++++++++++++----------------------
 http.h        |  2 +-
 4 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/http-push.c b/http-push.c
index b22c7caea0..409b266b0c 100644
--- a/http-push.c
+++ b/http-push.c
@@ -586,11 +586,16 @@ static void finish_request(struct transfer_request *request)
 			fprintf(stderr, "Unable to get pack file %s\n%s",
 				request->url, curl_errorstr);
 		} else {
+			char *lockfile;
+
 			preq = (struct http_pack_request *)request->userData;

 			if (preq) {
-				if (finish_http_pack_request(preq) == 0)
+				if (finish_http_pack_request(preq,
+							     &lockfile) == 0) {
+					unlink(lockfile);
 					fail = 0;
+				}
 				release_http_pack_request(preq);
 			}
 		}
diff --git a/http-walker.c b/http-walker.c
index 8ae5d76c6a..804dc82304 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -425,6 +425,7 @@ static int http_fetch_pack(struct walker *walker, struct alt_base *repo, unsigne
 	int ret;
 	struct slot_results results;
 	struct http_pack_request *preq;
+	char *lockfile;

 	if (fetch_indices(walker, repo))
 		return -1;
@@ -457,7 +458,9 @@ static int http_fetch_pack(struct walker *walker, struct alt_base *repo, unsigne
 		goto abort;
 	}

-	ret = finish_http_pack_request(preq);
+	ret = finish_http_pack_request(preq, &lockfile);
+	if (!ret)
+		unlink(lockfile);
 	release_http_pack_request(preq);
 	if (ret)
 		return ret;
diff --git a/http.c b/http.c
index a32ad36ddf..5f8e602cd2 100644
--- a/http.c
+++ b/http.c
@@ -2200,13 +2200,13 @@ void release_http_pack_request(struct http_pack_request *preq)
 	free(preq);
 }

-int finish_http_pack_request(struct http_pack_request *preq)
+int finish_http_pack_request(struct http_pack_request *preq, char **lockfile)
 {
 	struct packed_git **lst;
 	struct packed_git *p = preq->target;
-	char *tmp_idx;
-	size_t len;
 	struct child_process ip = CHILD_PROCESS_INIT;
+	int tmpfile_fd;
+	int ret = 0;

 	close_pack_index(p);

@@ -2218,35 +2218,33 @@ int finish_http_pack_request(struct http_pack_request *preq)
 		lst = &((*lst)->next);
 	*lst = (*lst)->next;

-	if (!strip_suffix(preq->tmpfile.buf, ".pack.temp", &len))
-		BUG("pack tmpfile does not end in .pack.temp?");
-	tmp_idx = xstrfmt("%.*s.idx.temp", (int)len, preq->tmpfile.buf);
+	tmpfile_fd = xopen(preq->tmpfile.buf, O_RDONLY);

 	argv_array_push(&ip.args, "index-pack");
-	argv_array_pushl(&ip.args, "-o", tmp_idx, NULL);
-	argv_array_push(&ip.args, preq->tmpfile.buf);
+	argv_array_push(&ip.args, "--stdin");
+	argv_array_pushf(&ip.args, "--keep=git %"PRIuMAX, (uintmax_t)getpid());
 	ip.git_cmd = 1;
-	ip.no_stdin = 1;
-	ip.no_stdout = 1;
+	ip.in = tmpfile_fd;
+	ip.out = -1;

-	if (run_command(&ip)) {
-		unlink(preq->tmpfile.buf);
-		unlink(tmp_idx);
-		free(tmp_idx);
-		return -1;
+	if (start_command(&ip)) {
+		ret = -1;
+		goto cleanup;
 	}

-	unlink(sha1_pack_index_name(p->sha1));
+	*lockfile = index_pack_lockfile(ip.out);
+	close(ip.out);

-	if (finalize_object_file(preq->tmpfile.buf, sha1_pack_name(p->sha1))
-	 || finalize_object_file(tmp_idx, sha1_pack_index_name(p->sha1))) {
-		free(tmp_idx);
-		return -1;
+	if (finish_command(&ip)) {
+		ret = -1;
+		goto cleanup;
 	}

 	install_packed_git(the_repository, p);
-	free(tmp_idx);
-	return 0;
+cleanup:
+	close(tmpfile_fd);
+	unlink(preq->tmpfile.buf);
+	return ret;
 }

 struct http_pack_request *new_http_pack_request(
diff --git a/http.h b/http.h
index 4eb4e808e5..20d1c85d0b 100644
--- a/http.h
+++ b/http.h
@@ -212,7 +212,7 @@ struct http_pack_request {

 extern struct http_pack_request *new_http_pack_request(
 	struct packed_git *target, const char *base_url);
-extern int finish_http_pack_request(struct http_pack_request *preq);
+int finish_http_pack_request(struct http_pack_request *preq, char **lockfile);
 extern void release_http_pack_request(struct http_pack_request *preq);

 /* Helpers for fetching object */

From patchwork Sat Feb 23 23:38:56 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 10827657
Date: Sat, 23 Feb 2019 15:38:56 -0800
Message-Id: <3304894f269fae63b6cb9d3f2c81906236556e86.1550963965.git.jonathantanmy@google.com>
X-Mailer: git-send-email 2.19.0.271.gfe8321ec05.dirty
Subject: [WIP 2/7] http: improve documentation of http_pack_request
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, gitster@pobox.com, peff@peff.net, christian.couder@gmail.com, avarab@gmail.com

struct http_pack_request and the functions that use it will be modified
in a subsequent patch. Using it is complicated (to use, call the
initialization function, then set some but not all fields in the
returned struct), so add some documentation to help future users.

Signed-off-by: Jonathan Tan
---
 http.h | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/http.h b/http.h
index 20d1c85d0b..1aa556257e 100644
--- a/http.h
+++ b/http.h
@@ -202,14 +202,31 @@ extern int http_get_info_packs(const char *base_url,
 	struct packed_git **packs_head);

 struct http_pack_request {
+	/*
+	 * Initialized by new_http_pack_request().
+	 */
 	char *url;
 	struct packed_git *target;
+	struct active_request_slot *slot;
+
+	/*
+	 * After calling new_http_pack_request(), point lst to the head of the
+	 * pack list that target is in. finish_http_pack_request() will remove
+	 * target from lst and call install_packed_git() on target.
+	 */
 	struct packed_git **lst;
+
+	/*
+	 * State managed by functions in http.c.
+	 */
 	FILE *packfile;
 	struct strbuf tmpfile;
-	struct active_request_slot *slot;
 };

+/*
+ * target must be an element in a pack list obtained from
+ * http_get_info_packs().
+ */
 extern struct http_pack_request *new_http_pack_request(
 	struct packed_git *target, const char *base_url);
 int finish_http_pack_request(struct http_pack_request *preq, char **lockfile);

From patchwork Sat Feb 23 23:38:57 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 10827665
Date: Sat, 23 Feb 2019 15:38:57 -0800
Message-Id: <0d4b4678963fa08b7d73572fa4ae50d08f9e90ce.1550963965.git.jonathantanmy@google.com>
X-Mailer: git-send-email 2.19.0.271.gfe8321ec05.dirty
Subject: [WIP 3/7] http-fetch: support fetching packfiles by URL
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, gitster@pobox.com, peff@peff.net, christian.couder@gmail.com, avarab@gmail.com

Teach http-fetch the ability to download packfiles directly, given a
URL, and to verify them.

The http_pack_request suite of functions has been modified to support a
NULL target. When target is NULL, the given URL is downloaded directly
instead of being treated as the root of a repository.

Signed-off-by: Jonathan Tan
---
 Documentation/git-http-fetch.txt |  7 +++-
 http-fetch.c                     | 65 +++++++++++++++++++++++++-------
 http.c                           | 41 ++++++++++++++------
 http.h                           | 11 ++++--
 t/t5550-http-fetch-dumb.sh       | 18 +++++++++
 5 files changed, 113 insertions(+), 29 deletions(-)

diff --git a/Documentation/git-http-fetch.txt b/Documentation/git-http-fetch.txt
index 666b042679..e667544bb1 100644
--- a/Documentation/git-http-fetch.txt
+++ b/Documentation/git-http-fetch.txt
@@ -9,7 +9,7 @@ git-http-fetch - Download from a remote Git repository via HTTP
 SYNOPSIS
 --------
 [verse]
-'git http-fetch' [-c] [-t] [-a] [-d] [-v] [-w filename] [--recover] [--stdin]
+'git http-fetch' [-c] [-t] [-a] [-d] [-v] [-w filename] [--recover] [--stdin | --packfile | ]

 DESCRIPTION
 -----------
@@ -40,6 +40,11 @@ commit-id::

 		['\t']

+--packfile::
+	Instead of a commit id on the command line (which is not expected in
+	this case), 'git http-fetch' fetches the packfile directly at the given
+	URL and generates the corresponding .idx file.
+
 --recover::
 	Verify that everything reachable from target is fetched. Used after
 	an earlier fetch is interrupted.
diff --git a/http-fetch.c b/http-fetch.c
index a32ac118d9..d283ce83a5 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -5,7 +5,7 @@
 #include "walker.h"

 static const char http_fetch_usage[] = "git http-fetch "
-"[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin] commit-id url";
+"[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin | --packfile | commit-id] url";

 int cmd_main(int argc, const char **argv)
 {
@@ -19,6 +19,7 @@ int cmd_main(int argc, const char **argv)
 	int rc = 0;
 	int get_verbosely = 0;
 	int get_recover = 0;
+	int packfile = 0;

 	while (arg < argc && argv[arg][0] == '-') {
 		if (argv[arg][1] == 't') {
@@ -35,43 +36,81 @@ int cmd_main(int argc, const char **argv)
 			get_recover = 1;
 		} else if (!strcmp(argv[arg], "--stdin")) {
 			commits_on_stdin = 1;
+		} else if (!strcmp(argv[arg], "--packfile")) {
+			packfile = 1;
 		}
 		arg++;
 	}
-	if (argc != arg + 2 - commits_on_stdin)
+	if (argc != arg + 2 - (commits_on_stdin || packfile))
 		usage(http_fetch_usage);
 	if (commits_on_stdin) {
 		commits = walker_targets_stdin(&commit_id, &write_ref);
+	} else if (packfile) {
+		/* URL will be set later */
 	} else {
 		commit_id = (char **) &argv[arg++];
 		commits = 1;
 	}

-	if (argv[arg])
-		str_end_url_with_slash(argv[arg], &url);
+	if (packfile) {
+		url = xstrdup(argv[arg]);
+	} else {
+		if (argv[arg])
+			str_end_url_with_slash(argv[arg], &url);
+	}

 	setup_git_directory();

 	git_config(git_default_config, NULL);

 	http_init(NULL, url, 0);
-	walker = get_http_walker(url);
-	walker->get_verbosely = get_verbosely;
-	walker->get_recover = get_recover;

-	rc = walker_fetch(walker, commits, commit_id, write_ref, url);
+	if (packfile) {
+		struct http_pack_request *preq;
+		struct slot_results results;
+		int ret;
+		char *lockfile;
+
+		preq = new_http_pack_request(NULL, url);
+		if (preq == NULL)
+			die("couldn't create http pack request");
+		preq->slot->results = &results;
+
+		if (start_active_slot(preq->slot)) {
+			run_active_slot(preq->slot);
+			if (results.curl_result != CURLE_OK) {
+				die("Unable to get pack file %s\n%s", preq->url,
+				    curl_errorstr);
+			}
+		} else {
+			die("Unable to start request");
+		}
+
+		if ((ret = finish_http_pack_request(preq, &lockfile)))
+			die("finish_http_pack_request gave result %d", ret);
+		unlink(lockfile);
+		release_http_pack_request(preq);
+		rc = 0;
+	} else {
+		walker = get_http_walker(url);
+		walker->get_verbosely = get_verbosely;
+		walker->get_recover = get_recover;
+
+		rc = walker_fetch(walker, commits, commit_id, write_ref, url);

-	if (commits_on_stdin)
-		walker_targets_free(commits, commit_id, write_ref);
+		if (commits_on_stdin)
+			walker_targets_free(commits, commit_id, write_ref);

-	if (walker->corrupt_object_found) {
-		fprintf(stderr,
+		if (walker->corrupt_object_found) {
+			fprintf(stderr,
 "Some loose object were found to be corrupt, but they might be just\n"
 "a false '404 Not Found' error message sent with incorrect HTTP\n"
 "status code. Suggest running 'git fsck'.\n");
+		}
+
+		walker_free(walker);
 	}

-	walker_free(walker);
 	http_cleanup();

 	free(url);
diff --git a/http.c b/http.c
index 5f8e602cd2..73c3e6295b 100644
--- a/http.c
+++ b/http.c
@@ -2208,15 +2208,18 @@ int finish_http_pack_request(struct http_pack_request *preq, char **lockfile)
 	int tmpfile_fd;
 	int ret = 0;

-	close_pack_index(p);
+	if (p)
+		close_pack_index(p);

 	fclose(preq->packfile);
 	preq->packfile = NULL;

-	lst = preq->lst;
-	while (*lst != p)
-		lst = &((*lst)->next);
-	*lst = (*lst)->next;
+	if (p) {
+		lst = preq->lst;
+		while (*lst != p)
+			lst = &((*lst)->next);
+		*lst = (*lst)->next;
+	}

 	tmpfile_fd = xopen(preq->tmpfile.buf, O_RDONLY);

@@ -2240,7 +2243,8 @@ int finish_http_pack_request(struct http_pack_request *preq, char **lockfile)
 		goto cleanup;
 	}

-	install_packed_git(the_repository, p);
+	if (p)
+		install_packed_git(the_repository, p);
 cleanup:
 	close(tmpfile_fd);
 	unlink(preq->tmpfile.buf);
@@ -2258,12 +2262,24 @@ struct http_pack_request *new_http_pack_request(
 	strbuf_init(&preq->tmpfile, 0);
 	preq->target = target;

-	end_url_with_slash(&buf, base_url);
-	strbuf_addf(&buf, "objects/pack/pack-%s.pack",
-		sha1_to_hex(target->sha1));
-	preq->url = strbuf_detach(&buf, NULL);
+	if (target) {
+		end_url_with_slash(&buf, base_url);
+		strbuf_addf(&buf, "objects/pack/pack-%s.pack",
+			sha1_to_hex(target->sha1));
+		preq->url = strbuf_detach(&buf, NULL);
+	} else {
+		preq->url = xstrdup(base_url);
+	}
+
+	if (target) {
+		strbuf_addf(&preq->tmpfile, "%s.temp",
+			    sha1_pack_name(target->sha1));
+	} else {
+		strbuf_addf(&preq->tmpfile, "%s/pack/pack-", get_object_directory());
+		strbuf_addstr_urlencode(&preq->tmpfile, base_url, 1);
+		strbuf_addstr(&preq->tmpfile, ".temp");
+	}

-	strbuf_addf(&preq->tmpfile, "%s.temp", sha1_pack_name(target->sha1));
 	preq->packfile = fopen(preq->tmpfile.buf, "a");
 	if (!preq->packfile) {
 		error("Unable to open local file %s for pack",
@@ -2287,7 +2303,8 @@ struct http_pack_request *new_http_pack_request(
 		if (http_is_verbose)
 			fprintf(stderr,
 				"Resuming fetch of pack %s at byte %"PRIuMAX"\n",
-				sha1_to_hex(target->sha1), (uintmax_t)prev_posn);
+				target ? sha1_to_hex(target->sha1) : base_url,
+				(uintmax_t)prev_posn);
 		http_opt_request_remainder(preq->slot->curl, prev_posn);
 	}

diff --git a/http.h b/http.h
index 1aa556257e..a33f2aa4b9 100644
--- a/http.h
+++ b/http.h
@@ -210,7 +210,8 @@ struct http_pack_request {
 	struct active_request_slot *slot;

 	/*
-	 * After calling new_http_pack_request(), point lst to the head of the
+	 * After calling new_http_pack_request(), if fetching a pack that
+	 * http_get_info_packs() told us about, point lst to the head of the
 	 * pack list that target is in. finish_http_pack_request() will remove
 	 * target from lst and call install_packed_git() on target.
 	 */
@@ -224,8 +225,12 @@ struct http_pack_request {
 };

 /*
- * target must be an element in a pack list obtained from
- * http_get_info_packs().
+ * If fetching a pack that http_get_info_packs() told us about, set target to
+ * an element in a pack list obtained from http_get_info_packs(). The actual
+ * URL fetched will be base_url followed by a suffix with the hash of the pack.
+ *
+ * Otherwise, set target to NULL. The actual URL fetched will be base_url
+ * itself.
  */
 extern struct http_pack_request *new_http_pack_request(
 	struct packed_git *target, const char *base_url);
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 6d7d88ccc9..be81d33d68 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -199,6 +199,16 @@ test_expect_success 'fetch packed objects' '
 	git clone $HTTPD_URL/dumb/repo_pack.git
 '

+test_expect_success 'http-fetch --packfile' '
+	git init packfileclient &&
+	p=$(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git && ls objects/pack/pack-*.pack) &&
+	git -C packfileclient http-fetch --packfile "$HTTPD_URL"/dumb/repo_pack.git/$p &&
+
+	# Ensure that it has the HEAD of repo_pack, at least
+	HASH=$(git -C "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git rev-parse HEAD) &&
+	git -C packfileclient cat-file -e "$HASH"
+'
+
 test_expect_success 'fetch notices corrupt pack' '
 	cp -R "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad1.git &&
 	(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad1.git &&
@@ -214,6 +224,14 @@ test_expect_success 'fetch notices corrupt pack' '
 	)
 '

+test_expect_success 'http-fetch --packfile with corrupt pack' '
+	rm -rf packfileclient &&
+	git init packfileclient &&
+	p=$(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad1.git && ls objects/pack/pack-*.pack) &&
+	test_must_fail git -C packfileclient http-fetch --packfile \
+		"$HTTPD_URL"/dumb/repo_bad1.git/$p
+'
+
 test_expect_success 'fetch notices corrupt idx' '
 	cp -R "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad2.git &&
 	(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad2.git &&

From patchwork Sat Feb 23 23:38:58 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 10827659
Date: Sat, 23 Feb 2019 15:38:58 -0800
X-Mailer: git-send-email 2.19.0.271.gfe8321ec05.dirty
Subject: [WIP 4/7] Documentation: order protocol v2 sections
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, gitster@pobox.com, peff@peff.net, christian.couder@gmail.com, avarab@gmail.com

The current C Git implementation expects Git servers to follow a
specific order of sections when transmitting protocol v2 responses, but
this is not explicit in the documentation. Make the order explicit.

Signed-off-by: Jonathan Tan
---
 Documentation/technical/protocol-v2.txt | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index ead85ce35c..36239ec7e9 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -325,11 +325,11 @@ included in the client's request:

 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
-header.
+header. Most sections are sent only when the packfile is sent.

-    output = *section
-    section = (acknowledgments | shallow-info | wanted-refs | packfile)
-	      (flush-pkt | delim-pkt)
+    output = acknowledgments flush-pkt |
+	     [acknowledgments delim-pkt] [shallow-info delim-pkt]
+	     [wanted-refs delim-pkt] packfile flush-pkt

     acknowledgments = PKT-LINE("acknowledgments" LF)
 		      (nak | *ack)
@@ -351,9 +351,10 @@ header.
 		*PKT-LINE(%x01-03 *%x00-ff)

     acknowledgments section
-	* If the client determines that it is finished with negotiations
-	  by sending a "done" line, the acknowledgments sections MUST be
-	  omitted from the server's response.
+	* If the client determines that it is finished with negotiations by
+	  sending a "done" line (thus requiring the server to send a packfile),
+	  the acknowledgments sections MUST be omitted from the server's
+	  response.

 	* Always begins with the section header "acknowledgments"

@@ -404,9 +405,6 @@ header.
 	  which the client has not indicated was shallow as a part of
 	  its request.

-	* This section is only included if a packfile section is also
-	  included in the response.
-
     wanted-refs section
 	* This section is only included if the client has requested a
 	  ref using a 'want-ref' line and if a packfile section is also

From patchwork Sat Feb 23 23:38:59 2019
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 10827661
Date: Sat, 23 Feb 2019 15:38:59 -0800
Message-Id: <84fa3f27a8061ec3e53a935374fbf701f40b6584.1550963965.git.jonathantanmy@google.com>
Subject: [WIP 5/7] Documentation: add Packfile URIs design doc
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, gitster@pobox.com, peff@peff.net, christian.couder@gmail.com, avarab@gmail.com

Signed-off-by: Jonathan Tan
---
 Documentation/technical/packfile-uri.txt | 79 ++++++++++++++++++++++++
 Documentation/technical/protocol-v2.txt  |  6 +-
 2 files changed, 84 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/technical/packfile-uri.txt

diff --git a/Documentation/technical/packfile-uri.txt b/Documentation/technical/packfile-uri.txt
new file mode 100644
index 0000000000..97047dd1d2
--- /dev/null
+++ b/Documentation/technical/packfile-uri.txt
@@ -0,0 +1,79 @@
+Packfile URIs
+=============
+
+This feature allows servers to serve part of their packfile response as URIs.
+This allows server designs that improve scalability in bandwidth and CPU usage
+(for example, by serving some data through a CDN), and (in the future) provides
+some measure of resumability to clients.
+
+This feature is available only in protocol version 2.
+
+Protocol
+--------
+
+The server advertises `packfile-uris`.
+
+If the client then communicates which protocols (HTTPS, etc.) it supports with
+a `packfile-uris` argument, the server MAY send a `packfile-uris` section
+directly before the `packfile` section (right after `wanted-refs` if it is
+sent) containing URIs of any of the given protocols. The URIs point to
+packfiles that use only features that the client has declared that it supports
+(e.g. ofs-delta and thin-pack). See protocol-v2.txt for the documentation of
+this section.
+
+Clients should then understand that the returned packfile could be incomplete,
+and that they need to download all the given URIs before the fetch or clone is
+complete.
+
+Server design
+-------------
+
+The server can be trivially made compatible with the proposed protocol by
+having it advertise `packfile-uris`, tolerating the client sending
+`packfile-uris`, and never sending any `packfile-uris` section. But we should
+include some sort of non-trivial implementation in the Minimum Viable Product,
+at least so that we can test the client.
+
+This is the implementation: a feature, marked experimental, that allows the
+server to be configured by one or more `uploadpack.blobPackfileUri=<sha1>
+<uri>` entries. Whenever the list of objects to be sent is assembled, a blob
+with the given sha1 can be replaced by the given URI. This allows, for example,
+servers to delegate serving of large blobs to CDNs.
+
+Client design
+-------------
+
+While fetching, the client needs to remember the list of URIs and cannot
+declare that the fetch is complete until all URIs have been downloaded as
+packfiles.
+
+The division of work (initial fetch + additional URIs) introduces convenient
+points for resumption of an interrupted clone - such resumption can be done
+after the Minimum Viable Product (see "Future work").
+
+The client can inhibit this feature (i.e. refrain from sending the
+`packfile-uris` parameter) by passing --no-packfile-uris to `git fetch`.
+
+Future work
+-----------
+
+The protocol design allows some evolution of the server and client without any
+need for protocol changes, so only a small-scoped design is included here to
+form the MVP. For example, the following can be done:
+
+ * On the server, a long-running process that takes in entire requests and
+   outputs a list of URIs and the corresponding inclusion and exclusion sets of
+   objects. This allows, e.g., signed URIs to be used and packfiles for common
+   requests to be cached.
+ * On the client, resumption of clone. If a clone is interrupted, information
+   could be recorded in the repository's config and a "clone-resume" command
+   can resume the clone in progress. (Resumption of subsequent fetches is more
+   difficult because that must deal with the user wanting to use the repository
+   even after the fetch was interrupted.)
+
+There are some possible features that will require a change in protocol:
+
+ * Additional HTTP headers (e.g. authentication)
+ * Byte range support
+ * Different file formats referenced by URIs (e.g. raw object)
+
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 36239ec7e9..edb85c059b 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -329,7 +329,8 @@ header. Most sections are sent only when the packfile is sent.

     output = acknowledgments flush-pkt |
 	     [acknowledgments delim-pkt] [shallow-info delim-pkt]
-	     [wanted-refs delim-pkt] packfile flush-pkt
+	     [wanted-refs delim-pkt] [packfile-uris delim-pkt]
+	     packfile flush-pkt

     acknowledgments = PKT-LINE("acknowledgments" LF)
 		      (nak | *ack)
@@ -347,6 +348,9 @@ header. Most sections are sent only when the packfile is sent.
 		  *PKT-LINE(wanted-ref LF)
     wanted-ref = obj-id SP refname

+    packfile-uris = PKT-LINE("packfile-uris" LF) *packfile-uri
+    packfile-uri = PKT-LINE("uri" SP *%x20-ff LF)
+
     packfile = PKT-LINE("packfile" LF)
 	       *PKT-LINE(%x01-03 *%x00-ff)

From patchwork Sat Feb 23 23:39:00 2019
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 10827663
Date: Sat, 23 Feb 2019 15:39:00 -0800
Message-Id: <2d00fd79a7c861d0bda61782630843ec4054f248.1550963965.git.jonathantanmy@google.com>
Subject: [WIP 6/7] upload-pack: refactor reading of pack-objects out
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, gitster@pobox.com, peff@peff.net, christian.couder@gmail.com, avarab@gmail.com

Subsequent patches will change how the output of pack-objects is
processed, so extract that processing into its own function.

Currently, at most 1 character can be buffered (in the "buffered" local
variable). One of those patches will require a larger buffer, so replace
that "buffered" local variable with a buffer array.

Signed-off-by: Jonathan Tan
---
 upload-pack.c | 81 ++++++++++++++++++++++++++++++---------------------
 1 file changed, 47 insertions(+), 34 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index d098ef5982..987d2e139b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -102,14 +102,52 @@ static int write_one_shallow(const struct commit_graft *graft, void *cb_data)
 	return 0;
 }

+struct output_state {
+	char buffer[8193];
+	int used;
+};
+
+static int relay_pack_data(int pack_objects_out, struct output_state *os)
+{
+	/*
+	 * We keep the last byte to ourselves
+	 * in case we detect broken rev-list, so that we
+	 * can leave the stream corrupted. This is
+	 * unfortunate -- unpack-objects would happily
+	 * accept a valid packdata with trailing garbage,
+	 * so appending garbage after we pass all the
+	 * pack data is not good enough to signal
+	 * breakage to downstream.
+	 */
+	ssize_t readsz;
+
+	readsz = xread(pack_objects_out, os->buffer + os->used,
+		       sizeof(os->buffer) - os->used);
+	if (readsz < 0) {
+		return readsz;
+	}
+	os->used += readsz;
+
+	if (os->used > 1) {
+		send_client_data(1, os->buffer, os->used - 1);
+		os->buffer[0] = os->buffer[os->used - 1];
+		os->used = 1;
+	} else {
+		send_client_data(1, os->buffer, os->used);
+		os->used = 0;
+	}
+
+	return readsz;
+}
+
 static void create_pack_file(const struct object_array *have_obj,
 			     const struct object_array *want_obj)
 {
 	struct child_process pack_objects = CHILD_PROCESS_INIT;
-	char data[8193], progress[128];
+	struct output_state output_state = {0};
+	char progress[128];
 	char abort_msg[] = "aborting due to possible repository "
 		"corruption on the remote side.";
-	int buffered = -1;
 	ssize_t sz;
 	int i;
 	FILE *pipe_fd;
@@ -239,39 +277,15 @@ static void create_pack_file(const struct object_array *have_obj,
 			continue;
 		}
 		if (0 <= pu && (pfd[pu].revents & (POLLIN|POLLHUP))) {
-			/* Data ready; we keep the last byte to ourselves
-			 * in case we detect broken rev-list, so that we
-			 * can leave the stream corrupted. This is
-			 * unfortunate -- unpack-objects would happily
-			 * accept a valid packdata with trailing garbage,
-			 * so appending garbage after we pass all the
-			 * pack data is not good enough to signal
-			 * breakage to downstream.
-			 */
-			char *cp = data;
-			ssize_t outsz = 0;
-			if (0 <= buffered) {
-				*cp++ = buffered;
-				outsz++;
-			}
-			sz = xread(pack_objects.out, cp,
-				  sizeof(data) - outsz);
-			if (0 < sz)
-				;
-			else if (sz == 0) {
+			int result = relay_pack_data(pack_objects.out,
+						     &output_state);
+
+			if (result == 0) {
 				close(pack_objects.out);
 				pack_objects.out = -1;
-			}
-			else
+			} else if (result < 0) {
 				goto fail;
-			sz += outsz;
-			if (1 < sz) {
-				buffered = data[sz-1] & 0xFF;
-				sz--;
 			}
-			else
-				buffered = -1;
-			send_client_data(1, data, sz);
 		}

 		/*
@@ -296,9 +310,8 @@ static void create_pack_file(const struct object_array *have_obj,
 	}

 	/* flush the data */
-	if (0 <= buffered) {
-		data[0] = buffered;
-		send_client_data(1, data, 1);
+	if (output_state.used > 0) {
+		send_client_data(1, output_state.buffer, output_state.used);
 		fprintf(stderr, "flushed.\n");
 	}
 	if (use_sideband)

From patchwork Sat Feb 23 23:39:01 2019
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 10827667
Date: Sat, 23 Feb 2019 15:39:01 -0800
Subject: [WIP 7/7] upload-pack: send part of packfile response as uri
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, gitster@pobox.com, peff@peff.net, christian.couder@gmail.com, avarab@gmail.com

Teach upload-pack to send part of its packfile response as URIs.

An administrator may configure a repository with one or more
"uploadpack.blobpackfileuri" lines, each line containing an OID and a
URI. A client may configure fetch.uriprotocols to be a comma-separated
list of protocols that it is willing to use to fetch additional
packfiles - this list will be sent to the server. Whenever an object
with one of those OIDs would appear in the packfile transmitted by
upload-pack, the server may exclude that object, and instead send the
URI. The client will then download the packs referred to by those URIs
before performing the connectivity check.
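
As a rough illustration of the matching rule described above, the sketch below decides whether a configured URI may be offered to a client, given the comma-separated protocol list that client sent. It is a standalone approximation rather than code from this series: `uri_allowed` is a hypothetical name, and the rule it applies (the URI must start with one of the listed protocols, immediately followed by ':') is an assumption based on the check this patch adds to pack-objects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * `protocols` is a comma-separated list such as "http,https".
 * Return 1 if `uri` begins with one of the listed protocols followed
 * directly by ':', and 0 otherwise.
 */
static int uri_allowed(const char *uri, const char *protocols)
{
	const char *p = protocols;

	assert(uri != NULL && protocols != NULL);
	while (*p) {
		size_t len = strcspn(p, ","); /* length of this list entry */

		/*
		 * The ':' test keeps "http" from matching an "https:" URI.
		 * uri[len] is safe to read here: strncmp() only succeeds if
		 * uri has at least len matching, non-NUL characters.
		 */
		if (len && !strncmp(uri, p, len) && uri[len] == ':')
			return 1;
		p += len;
		if (*p == ',')
			p++;
	}
	return 0;
}
```

With this rule, a server configured only with `https:` URIs sends nothing extra to a client that listed only `http`, and falls back to including the object in the packfile.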
Signed-off-by: Jonathan Tan
---
 builtin/pack-objects.c | 63 ++++++++++++++++++++++++++++++++++
 fetch-pack.c           | 58 +++++++++++++++++++++++++++++++
 t/t5702-protocol-v2.sh | 54 +++++++++++++++++++++++++++++
 upload-pack.c          | 78 ++++++++++++++++++++++++++++++++++++++----
 4 files changed, 247 insertions(+), 6 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index a9fac7c128..73309821e2 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -110,6 +110,8 @@ static unsigned long window_memory_limit = 0;

 static struct list_objects_filter_options filter_options;

+static struct string_list uri_protocols = STRING_LIST_INIT_NODUP;
+
 enum missing_action {
 	MA_ERROR = 0,      /* fail if any missing objects are encountered */
 	MA_ALLOW_ANY,      /* silently allow ALL missing objects */
@@ -118,6 +120,14 @@ enum missing_action {
 static enum missing_action arg_missing_action;
 static show_object_fn fn_show_object;

+struct configured_exclusion {
+	struct oidmap_entry e;
+	char *uri;
+};
+static struct oidmap configured_exclusions;
+
+static struct oidset excluded_by_config;
+
 /*
  * stats
  */
@@ -832,6 +842,23 @@ static off_t write_reused_pack(struct hashfile *f)
 	return reuse_packfile_offset - sizeof(struct pack_header);
 }

+static void write_excluded_by_configs(void)
+{
+	struct oidset_iter iter;
+	const struct object_id *oid;
+
+	oidset_iter_init(&excluded_by_config, &iter);
+	while ((oid = oidset_iter_next(&iter))) {
+		struct configured_exclusion *ex =
+			oidmap_get(&configured_exclusions, oid);
+
+		if (!ex)
+			BUG("configured exclusion wasn't configured");
+		write_in_full(1, ex->uri, strlen(ex->uri));
+		write_in_full(1, "\n", 1);
+	}
+}
+
 static const char no_split_warning[] = N_(
 "disabling bitmap writing, packs are split due to pack.packSizeLimit"
 );
@@ -1125,6 +1152,25 @@ static int want_object_in_pack(const struct object_id *oid,
 		}
 	}

+	if (uri_protocols.nr) {
+		struct configured_exclusion *ex =
+			oidmap_get(&configured_exclusions, oid);
+		int i;
+		const char *p;
+
+		if (ex) {
+			for (i = 0; i < uri_protocols.nr; i++) {
+				if (skip_prefix(ex->uri,
+						uri_protocols.items[i].string,
+						&p) &&
+				    *p == ':') {
+					oidset_insert(&excluded_by_config, oid);
+					return 0;
+				}
+			}
+		}
+	}
+
 	return 1;
 }

@@ -2726,6 +2772,19 @@ static int git_pack_config(const char *k, const char *v, void *cb)
 			    pack_idx_opts.version);
 		return 0;
 	}
+	if (!strcmp(k, "uploadpack.blobpackfileuri")) {
+		struct configured_exclusion *ex = xmalloc(sizeof(*ex));
+		const char *end;
+
+		if (parse_oid_hex(v, &ex->e.oid, &end) || *end != ' ')
+			die(_("value of uploadpack.blobpackfileuri must be "
+			      "of the form '<sha1> <uri>' (got '%s')"), v);
+		if (oidmap_get(&configured_exclusions, &ex->e.oid))
+			die(_("object already configured in another "
+			      "uploadpack.blobpackfileuri (got '%s')"), v);
+		ex->uri = xstrdup(end + 1);
+		oidmap_put(&configured_exclusions, ex);
+	}
 	return git_default_config(k, v, cb);
 }

@@ -3318,6 +3377,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			 N_("do not pack objects in promisor packfiles")),
 		OPT_BOOL(0, "delta-islands", &use_delta_islands,
 			 N_("respect islands during delta compression")),
+		OPT_STRING_LIST(0, "uri-protocol", &uri_protocols,
+				N_("protocol"),
+				N_("exclude any configured uploadpack.blobpackfileuri with this protocol")),
 		OPT_END(),
 	};

@@ -3492,6 +3554,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		return 0;
 	if (nr_result)
 		prepare_pack(window, depth);
+	write_excluded_by_configs();
 	write_pack_file();
 	if (progress)
 		fprintf_ln(stderr,
diff --git a/fetch-pack.c b/fetch-pack.c
index 812be15d7e..edbe4b3ec3 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -38,6 +38,7 @@ static struct lock_file shallow_lock;
 static const char *alternate_shallow_file;
 static char *negotiation_algorithm;
 static struct strbuf fsck_msg_types = STRBUF_INIT;
+static struct string_list uri_protocols = STRING_LIST_INIT_DUP;

 /* Remember to update object flag allocation in object.h */
 #define COMPLETE	(1U << 0)
@@ -1147,6 +1148,26 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
 		warning("filtering not recognized by server, ignoring");
 	}

+	if (server_supports_feature("fetch", "packfile-uris", 0)) {
+		int i;
+		struct strbuf to_send = STRBUF_INIT;
+
+		for (i = 0; i < uri_protocols.nr; i++) {
+			const char *s = uri_protocols.items[i].string;
+
+			if (!strcmp(s, "https") || !strcmp(s, "http")) {
+				if (to_send.len)
+					strbuf_addch(&to_send, ',');
+				strbuf_addstr(&to_send, s);
+			}
+		}
+		if (to_send.len) {
+			packet_buf_write(&req_buf, "packfile-uris %s",
+					 to_send.buf);
+			strbuf_release(&to_send);
+		}
+	}
+
 	/* add wants */
 	add_wants(args->no_dependents, wants, &req_buf);

@@ -1322,6 +1343,32 @@ static void receive_wanted_refs(struct packet_reader *reader,
 		die(_("error processing wanted refs: %d"), reader->status);
 }

+static void receive_packfile_uris(struct packet_reader *reader)
+{
+	process_section_header(reader, "packfile-uris", 0);
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		const char *p;
+		struct child_process cmd = CHILD_PROCESS_INIT;
+
+
+		if (!skip_prefix(reader->line, "uri ", &p))
+			die("expected 'uri <uri>', got: %s\n", reader->line);
+
+		argv_array_push(&cmd.args, "http-fetch");
+		argv_array_push(&cmd.args, "--packfile");
+		argv_array_push(&cmd.args, p);
+		cmd.git_cmd = 1;
+		cmd.no_stdin = 1;
+		cmd.no_stdout = 1;
+		if (start_command(&cmd))
+			die("fetch-pack: unable to spawn");
+		if (finish_command(&cmd))
+			die("fetch-pack: unable to finish");
+	}
+	if (reader->status != PACKET_READ_DELIM)
+		die("expected DELIM");
+}
+
 enum fetch_state {
 	FETCH_CHECK_LOCAL = 0,
 	FETCH_SEND_REQUEST,
@@ -1414,6 +1461,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 				receive_wanted_refs(&reader, sought, nr_sought);

 			/* get the pack */
+			if (process_section_header(&reader, "packfile-uris", 1)) {
+				receive_packfile_uris(&reader);
+			}
 			process_section_header(&reader, "packfile", 0);
 			if (get_pack(args, fd, pack_lockfile))
 				die(_("git fetch-pack: fetch failed."));
@@ -1464,6 +1514,14 @@ static void fetch_pack_config(void)
 	git_config_get_bool("transfer.fsckobjects", &transfer_fsck_objects);
 	git_config_get_string("fetch.negotiationalgorithm",
 			      &negotiation_algorithm);
+	if (!uri_protocols.nr) {
+		char *str;
+
+		if (!git_config_get_string("fetch.uriprotocols", &str) && str) {
+			string_list_split(&uri_protocols, str, ',', -1);
+			free(str);
+		}
+	}

 	git_config(fetch_pack_config_cb, NULL);
 }
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index db4ae09f2f..6dbe2e9584 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -656,6 +656,60 @@ test_expect_success 'when server does not send "ready", expect FLUSH' '
 	test_i18ngrep "expected no other sections to be sent after no .ready." err
 '

+test_expect_success 'part of packfile response provided as URI' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child log &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo my-blob >"$P/my-blob" &&
+	git -C "$P" add my-blob &&
+	echo other-blob >"$P/other-blob" &&
+	git -C "$P" add other-blob &&
+	git -C "$P" commit -m x &&
+
+	# Create a packfile for my-blob and configure it for exclusion.
+	git -C "$P" hash-object my-blob >h &&
+	git -C "$P" pack-objects --stdout <h \
+		>"$HTTPD_DOCUMENT_ROOT_PATH/one.pack" &&
+	git -C "$P" config \
+		"uploadpack.blobpackfileuri" \
+		"$(cat h) $HTTPD_URL/dumb/one.pack" &&
+
+	# Do the same for other-blob.
+	git -C "$P" hash-object other-blob >h2 &&
+	git -C "$P" pack-objects --stdout <h2 \
+		>"$HTTPD_DOCUMENT_ROOT_PATH/two.pack" &&
+	git -C "$P" config --add \
+		"uploadpack.blobpackfileuri" \
+		"$(cat h2) $HTTPD_URL/dumb/two.pack" &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
+		git -c protocol.version=2 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	# Ensure that my-blob and other-blob are in separate packfiles.
+	for idx in http_child/.git/objects/pack/*.idx
+	do
+		git verify-pack --verbose $idx >out &&
+		if test "$(grep "^[0-9a-f]\{40\} " out | wc -l)" = 1
+		then
+			if grep $(cat h) out
+			then
+				>hfound
+			fi &&
+			if grep $(cat h2) out
+			then
+				>h2found
+			fi
+		fi
+	done &&
+	test -f hfound &&
+	test -f h2found
+'
+
 stop_httpd

 test_done
diff --git a/upload-pack.c b/upload-pack.c
index 987d2e139b..2365b707bc 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -105,9 +105,12 @@ static int write_one_shallow(const struct commit_graft *graft, void *cb_data)
 struct output_state {
 	char buffer[8193];
 	int used;
+	unsigned packfile_uris_started : 1;
+	unsigned packfile_started : 1;
 };

-static int relay_pack_data(int pack_objects_out, struct output_state *os)
+static int relay_pack_data(int pack_objects_out, struct output_state *os,
+			   int write_packfile_line)
 {
 	/*
 	 * We keep the last byte to ourselves
@@ -128,6 +131,37 @@ static int relay_pack_data(int pack_objects_out, struct output_state *os)
 	}
 	os->used += readsz;

+	while (!os->packfile_started) {
+		char *p;
+		if (os->used >= 4 && !memcmp(os->buffer, "PACK", 4)) {
+			os->packfile_started = 1;
+			if (write_packfile_line) {
+				if (os->packfile_uris_started)
+					packet_delim(1);
+				packet_write_fmt(1, "\1packfile\n");
+			}
+			break;
+		}
+		if ((p = memchr(os->buffer, '\n', os->used))) {
+			if (!os->packfile_uris_started) {
+				os->packfile_uris_started = 1;
+				if (!write_packfile_line)
+					BUG("packfile_uris requires sideband-all");
+				packet_write_fmt(1, "\1packfile-uris\n");
+			}
+			*p = '\0';
+			packet_write_fmt(1, "\1uri %s\n", os->buffer);
+
+			os->used -= p - os->buffer + 1;
+			memmove(os->buffer, p + 1, os->used);
+		} else {
+			/*
+			 * Incomplete line.
+			 */
+			return readsz;
+		}
+	}
+
 	if (os->used > 1) {
 		send_client_data(1, os->buffer, os->used - 1);
 		os->buffer[0] = os->buffer[os->used - 1];
@@ -141,7 +175,8 @@ static int relay_pack_data(int pack_objects_out, struct output_state *os)
 }

 static void create_pack_file(const struct object_array *have_obj,
-			     const struct object_array *want_obj)
+			     const struct object_array *want_obj,
+			     const struct string_list *uri_protocols)
 {
 	struct child_process pack_objects = CHILD_PROCESS_INIT;
 	struct output_state output_state = {0};
@@ -192,6 +227,11 @@ static void create_pack_file(const struct object_array *have_obj,
 					 expanded_filter_spec.buf);
 		}
 	}
+	if (uri_protocols) {
+		for (i = 0; i < uri_protocols->nr; i++)
+			argv_array_pushf(&pack_objects.args, "--uri-protocol=%s",
+					 uri_protocols->items[i].string);
+	}

 	pack_objects.in = -1;
 	pack_objects.out = -1;
@@ -278,7 +318,8 @@ static void create_pack_file(const struct object_array *have_obj,
 		}
 		if (0 <= pu && (pfd[pu].revents & (POLLIN|POLLHUP))) {
 			int result = relay_pack_data(pack_objects.out,
-						     &output_state);
+						     &output_state,
+						     !!uri_protocols);

 			if (result == 0) {
 				close(pack_objects.out);
@@ -1123,7 +1164,7 @@ void upload_pack(struct upload_pack_options *options)
 	if (want_obj.nr) {
 		struct object_array have_obj = OBJECT_ARRAY_INIT;
 		get_common_commits(&reader, &have_obj, &want_obj);
-		create_pack_file(&have_obj, &want_obj);
+		create_pack_file(&have_obj, &want_obj, 0);
 	}
 }

@@ -1138,6 +1179,7 @@ struct upload_pack_data {
 	timestamp_t deepen_since;
 	int deepen_rev_list;
 	int deepen_relative;
+	struct string_list uri_protocols;

 	struct packet_writer writer;

@@ -1157,6 +1199,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 	struct oid_array haves = OID_ARRAY_INIT;
 	struct object_array shallows = OBJECT_ARRAY_INIT;
 	struct string_list deepen_not = STRING_LIST_INIT_DUP;
+	struct string_list uri_protocols = STRING_LIST_INIT_DUP;

 	memset(data, 0, sizeof(*data));
 	data->wants = wants;
@@ -1164,6 +1207,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 	data->haves = haves;
 	data->shallows = shallows;
 	data->deepen_not = deepen_not;
+	data->uri_protocols = uri_protocols;
 	packet_writer_init(&data->writer, 1);
 }

@@ -1322,9 +1366,17 @@ static void process_args(struct packet_reader *request,
 			continue;
 		}

+		if (skip_prefix(arg, "packfile-uris ", &p)) {
+			string_list_split(&data->uri_protocols, p, ',', -1);
+			continue;
+		}
+
 		/* ignore unknown lines maybe? */
 		die("unexpected line: '%s'", arg);
 	}
+
+	if (data->uri_protocols.nr && !data->writer.use_sideband)
+		string_list_clear(&data->uri_protocols, 0);
 }

 static int process_haves(struct oid_array *haves, struct oid_array *common,
@@ -1514,8 +1566,13 @@ int upload_pack_v2(struct repository *r, struct argv_array *keys,
 			send_wanted_ref_info(&data);
 			send_shallow_info(&data, &want_obj);

-			packet_writer_write(&data.writer, "packfile\n");
-			create_pack_file(&have_obj, &want_obj);
+			if (data.uri_protocols.nr) {
+				create_pack_file(&have_obj, &want_obj,
+						 &data.uri_protocols);
+			} else {
+				packet_write_fmt(1, "packfile\n");
+				create_pack_file(&have_obj, &want_obj, NULL);
+			}
 			state = FETCH_DONE;
 			break;
 		case FETCH_DONE:
@@ -1536,6 +1593,7 @@ int upload_pack_advertise(struct repository *r,
 		int allow_filter_value;
 		int allow_ref_in_want;
 		int allow_sideband_all_value;
+		char *str = NULL;

 		strbuf_addstr(value, "shallow");

@@ -1557,6 +1615,14 @@ int upload_pack_advertise(struct repository *r,
 					   &allow_sideband_all_value) &&
 		     allow_sideband_all_value))
 			strbuf_addstr(value, " sideband-all");
+
+		if (!repo_config_get_string(the_repository,
+					    "uploadpack.blobpackfileuri",
+					    &str) &&
+		    str) {
+			strbuf_addstr(value, " packfile-uris");
+			free(str);
+		}
 	}

 	return 1;
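
The stream that relay_pack_data() consumes in this patch is, in effect, zero or more newline-terminated URI lines followed by the pack itself, which starts with the "PACK" signature. The sketch below isolates that split on an in-memory buffer. It is a simplified, hypothetical helper (`count_uri_lines` is not part of this series) that only counts lines and locates the pack, assuming, as the patch does, that no URI line begins with "PACK".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Scan `len` bytes of pack-objects output from the start of `buf`.
 * On success, return the number of complete URI lines that precede
 * the pack and store the offset of "PACK" in *pack_off. Return -1 if
 * more input is needed before the split can be decided (an incomplete
 * line, or too few bytes to recognize the signature).
 */
static int count_uri_lines(const char *buf, size_t len, size_t *pack_off)
{
	size_t pos = 0;
	int lines = 0;

	assert(buf != NULL && pack_off != NULL);
	for (;;) {
		const char *nl;

		/* Check for the signature first, as the patch does. */
		if (len - pos >= 4 && !memcmp(buf + pos, "PACK", 4)) {
			*pack_off = pos;
			return lines;
		}
		nl = memchr(buf + pos, '\n', len - pos);
		if (!nl)
			return -1; /* incomplete line: wait for more data */
		pos = (size_t)(nl - buf) + 1;
		lines++;
	}
}
```

Returning -1 for a partial line corresponds to the early return in the patch, where the caller keeps the bytes buffered and tries again after the next read.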