From patchwork Tue Oct 13 19:17:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean McAllister X-Patchwork-Id: 11836063 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-20.4 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E42F8C433E7 for ; Tue, 13 Oct 2020 19:17:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6DDCD20BED for ; Tue, 13 Oct 2020 19:17:44 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="g2N+OcbK" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725955AbgJMTRn (ORCPT ); Tue, 13 Oct 2020 15:17:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52274 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725919AbgJMTRn (ORCPT ); Tue, 13 Oct 2020 15:17:43 -0400 Received: from mail-qt1-x84a.google.com (mail-qt1-x84a.google.com [IPv6:2607:f8b0:4864:20::84a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E7BFCC0613D0 for ; Tue, 13 Oct 2020 12:17:42 -0700 (PDT) Received: by mail-qt1-x84a.google.com with SMTP id n8so575517qtf.10 for ; Tue, 13 Oct 2020 12:17:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:message-id:mime-version:subject:from:to:cc; bh=jFIi5d8yHMioX6n/Y9ckXIX4A9pL0FgLjpnWEhbf7nY=; b=g2N+OcbKqVHnZP7lfMykYK55USdGjg+xku6DR88UvFGMwA9XeBX7lgnCbljYnnw5Ou rVWEBE6npO7kmczZ8M6nq9oiVOS/5xFvL3gteQRU7TPzPW5pn0ltO7r3cvhDTQIRpbhn GbK4ktRF0G/RKsklSsEZntJdQRJPgqd6NsG9wuJvWlVgPQPsMfqrF6t9RfrULGAEqCpz VfdwkvaX1Ygo15Rg5niLlGUHVDQZQ5ihdrG2mmyqOYUeGsZnZQiTy3qos7ybv99GQEQ9 YTdlSAw9i2WtOmefHk/t52Y0+yt7oulIgyliYEhFRIHWHjH9I7MtcTnTattNyg23YOOu esvA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:message-id:mime-version:subject:from :to:cc; bh=jFIi5d8yHMioX6n/Y9ckXIX4A9pL0FgLjpnWEhbf7nY=; b=PpLWwSxhHjx2pItKBAN40NP+uvnGyKfL/AjpgkvNNxoeX8EQpSopAb+PUfwaSXJeUs s01q8CRkl2dA9TcDjLtVetkJkwXTouM6YlPLDprbb0V8+88+GI9oBZ5UuI5TAu3BmhEY p8uSoWaDfWel/x5T8Ysn88tabakvuw4Lnm8GgT1bKQ/R1sz31graa/YfyKkbSY8eRKJB dz9o8afcjiYz50N7tnp+622s2vMswPIwjR8fvuVmJPqPJ7/xjQoWvsebap6EL3ZssHwI lLupGsr22ynah68TjJGaTwgfJMB9ydOMZmKiLMQlbseQSWT0rC2XhPOlD9mUKzdLaNG8 ZBXg== X-Gm-Message-State: AOAM533JONGu0ZE1CMdgaKINjAFjDvyZgw/y38BnyWW67hWn8wyiNIJs qhQHB06qLiMM2VyMa45QSy58O7HOKOrir+AdfokPFSgRZlfKjO3AzD+6dayj0Eqgw5ZS3CGLizZ jlkdjIxQuO8iFB2TokXoJZvlY8IaRcz2ve62ZSYEHMT4RsXiMy8UJe7cZ22Le3xE= X-Google-Smtp-Source: ABdhPJybYrqLmzRXTCBKy/lpKn2/k6IOJ9SRkIc5VyTvY51gcsBq1yck4/nJBlX0J6mIfc073t4Y0MZPyIjDbg== Sender: "smcallis via sendgmr" X-Received: from smcallis.bld.corp.google.com ([2620:15c:183:200:a6ae:11ff:fe11:fc5d]) (user=smcallis job=sendgmr) by 2002:a05:6214:180d:: with SMTP id o13mr1458278qvw.34.1602616661782; Tue, 13 Oct 2020 12:17:41 -0700 (PDT) Date: Tue, 13 Oct 2020 13:17:27 -0600 Message-Id: <20201013191729.2524700-1-smcallis@google.com> Mime-Version: 1.0 X-Mailer: git-send-email 2.28.0.1011.ga647a8990f-goog Subject: [PATCH v2 1/3] remote-curl: add testing for intelligent retry for HTTP From: Sean McAllister To: git@vger.kernel.org, gitster@pobox.com, peff@peff.net, masayasuzuki@google.com, jrnieder@gmail.com Cc: Sean McAllister Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org HTTP servers can sometimes throw errors, but that doesn't mean we should give up. If we have an error condition that we can reasonably retry on, then we should. This change is tricky because it requires a new CGI script to test as we need to be able to instruct the server to throw an error(s) before succeeding. We do this by encoding an error code and optional value for Retry-After into the URL, followed by the real endpoint: /error_ntime/dc724af1//429/10/smart/server This is a bit hacky, but really the best we can do since HTTP is fundamentally stateless. The URL is uniquefied by a nonce and we leave a breadcrumb on disk so all accesses after the first redirect to the appropriate endpoint. Signed-off-by: Sean McAllister --- t/lib-httpd.sh | 8 ++++ t/lib-httpd/apache.conf | 9 +++++ t/lib-httpd/error-ntime.sh | 80 ++++++++++++++++++++++++++++++++++++++ t/t5601-clone.sh | 9 +++++ 4 files changed, 106 insertions(+) create mode 100755 t/lib-httpd/error-ntime.sh diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh index d2edfa4c50..edc10b6f4b 100644 --- a/t/lib-httpd.sh +++ b/t/lib-httpd.sh @@ -132,6 +132,7 @@ prepare_httpd() { install_script incomplete-length-upload-pack-v2-http.sh install_script incomplete-body-upload-pack-v2-http.sh install_script broken-smart-http.sh + install_script error-ntime.sh install_script error-smart-http.sh install_script error.sh install_script apply-one-time-perl.sh @@ -308,3 +309,10 @@ check_access_log() { test_cmp "$1" access.log.stripped fi } + +# generate a process unique one-up value +global_counter_for_nonce=0 +gen_nonce () { + global_counter_for_nonce=$((global_counter_for_nonce + 1)) && + echo "$global_counter_for_nonce" +} diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf index afa91e38b0..77c495e164 100644 --- a/t/lib-httpd/apache.conf +++ b/t/lib-httpd/apache.conf @@ -117,6 +117,12 @@ Alias /auth/dumb/ www/auth/dumb/ SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} SetEnv GIT_HTTP_EXPORT_ALL + +# This may be suffixed with any path for redirection, so it should come before +# any of the other aliases, particularly the /smart_*[^/]*/(.*) alias as that can +# match a lot of URLs +ScriptAlias /error_ntime/ error-ntime.sh/ + ScriptAlias /smart/incomplete_length/git-upload-pack incomplete-length-upload-pack-v2-http.sh/ ScriptAlias /smart/incomplete_body/git-upload-pack incomplete-body-upload-pack-v2-http.sh/ ScriptAliasMatch /error_git_upload_pack/(.*)/git-upload-pack error.sh/ @@ -137,6 +143,9 @@ ScriptAliasMatch /one_time_perl/(.*) apply-one-time-perl.sh/$1 Options ExecCGI + + Options ExecCGI + Options ExecCGI diff --git a/t/lib-httpd/error-ntime.sh b/t/lib-httpd/error-ntime.sh new file mode 100755 index 0000000000..64dc878746 --- /dev/null +++ b/t/lib-httpd/error-ntime.sh @@ -0,0 +1,80 @@ +#!/bin/sh + +# Script to simulate a transient error code with Retry-After header set. +# +# PATH_INFO must be of the form ///// +# (eg: /dc724af1/3/429/10/some/url) +# +# The value uniquely identifies the URL, since we're simulating +# a stateful operation using a stateless protocol, we need a way to "namespace" +# URLs so that they don't step on each other. +# +# The first times this endpoint is called, it will return the given +# , and if the is non-negative, it will set the +# Retry-After head to that value. +# +# Subsequent calls will return a 302 redirect to . +# +# Supported error codes are 429,502,503, and 504 + +print_status() { + if [ "$1" -eq "302" ]; then + printf "Status: 302 Found\n" + elif [ "$1" -eq "429" ]; then + printf "Status: 429 Too Many Requests\n" + elif [ "$1" -eq "502" ]; then + printf "Status: 502 Bad Gateway\n" + elif [ "$1" -eq "503" ]; then + printf "Status: 503 Service Unavailable\n" + elif [ "$1" -eq "504" ]; then + printf "Status: 504 Gateway Timeout\n" + else + printf "Status: 500 Internal Server Error\n" + fi + printf "Content-Type: text/plain\n" +} + +# set $@ to components of PATH_INFO +IFS='/' +set -f +set -- $PATH_INFO +set +f + +# pull out first four path components +shift +nonce=$1 +times=$2 +code=$3 +retry=$4 +shift 4 + +# glue the rest back together as redirect path +path="" +while [ "$#" -gt "0" ]; do + path="${path}/$1" + shift +done + +# leave a cookie for this request/retry count +state_file="request_${REMOTE_ADDR}_${nonce}_${times}_${code}_${retry}" + +if [ ! -f "$state_file" ]; then + echo 0 > "$state_file" +fi + +read -r cnt < "$state_file" +if [ "$cnt" -lt "$times" ]; then + echo $((cnt+1)) > "$state_file" + + # return error + print_status "$code" + if [ "$retry" -ge "0" ]; then + printf "Retry-After: %s\n" "$retry" + fi +else + # redirect + print_status 302 + printf "Location: %s?%s\n" "$path" "${QUERY_STRING}" +fi + +echo diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index 7df3c5373a..72aaed5a93 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -756,6 +756,15 @@ test_expect_success 'partial clone using HTTP' ' partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/smart/server" ' +test_expect_success 'partial clone using HTTP with redirect' ' + _NONCE=`gen_nonce` && export _NONCE && + curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" && + curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" && + curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" && + partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" +' + + # DO NOT add non-httpd-specific tests here, because the last part of this # test script is only executed when httpd is available and enabled. From patchwork Tue Oct 13 19:17:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean McAllister X-Patchwork-Id: 11836067 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-20.4 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AEB23C43457 for ; Tue, 13 Oct 2020 19:17:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 51CEB20BED for ; Tue, 13 Oct 2020 19:17:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Rsig7VPH" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726035AbgJMTRp (ORCPT ); Tue, 13 Oct 2020 15:17:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52278 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725960AbgJMTRo (ORCPT ); Tue, 13 Oct 2020 15:17:44 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 27DA6C0613D0 for ; Tue, 13 Oct 2020 12:17:44 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id v4so879789ybp.3 for ; Tue, 13 Oct 2020 12:17:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=5juivkZ5jlo3u09VQrufzxjUkil38qmJ5lRnAGAw3Nc=; b=Rsig7VPHkKp94qgTL1N14WCh4UKptQ6MBxXubysA5kDK76WknXzJvNTQR9EMPcmLC6 5Kp3BqT3eyj1lI2Nde8G3mZ8SlomXhD2MBecvB3NAlvML9rYXf/4y2P6B+n73pjT/0CW JyNwOhBxkWve7MFs+1wsXs4jFCJ3/hRmIZHZVldESie5etqKlyNjZqCKxaGef8TVxk6O 0UfFRr94nDOhveD5UFA0v8SOwjutW39RksxnIW65d+7LRS0uK4/LMmvtszDXV+6ErAWD 1OiiKuKWI/+NcUlPOLYVCYQSqJAL9mPpTX2TwK4VoROoUySydTeJhVP8KaZYwwgrUsi8 6J+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=5juivkZ5jlo3u09VQrufzxjUkil38qmJ5lRnAGAw3Nc=; b=aupFRC8QIbuuuuxzLRVqPCkkIab9lQ1J/VqRt8uNQKEkznWetrYlI1uQOv1/fieorx mJfBN3K0Pp7Oc9pbuIUw5UPzg88aQ6PlmrS2jbXp7HqjN36lvSA2U1e1BoE02jyOAJAR a9/IR73HWct6eCMXe5v/QkII5SonDbz0DZlpnUdfmDOZMaG2jarhnjr9SxlKXq7Aj6XF tQC21EmDey7dj6l5rIk36/fM8equqScTc1F9DHd+zNGVf8BPA4Z0SH2vdhZwnJg6rXjr yg7Wb41s9VpHO3s1ZPLTZcUv0Ka450aRR8IBXaP2XVnfI84+MAtxhlNIQ08v8JQIx72O rTFQ== X-Gm-Message-State: AOAM532QSekpdH4hJsUQVVs44Ln+9z761+O41PTcuki4oZcJUKgJutYX s7VogxAI8UspO5d+ntdYiHx6Ew6kDed1BFSVwzjDS/raKTizbWgTpFw3b8GBl7chgjpz/ZIndaq tvSk1x8QC/HoIKDkRR2NoL5aOezD12LhQu6zWyn7KYOES6uIlFvAsR+Yd9uYDWy4= X-Google-Smtp-Source: ABdhPJyvhZT4UzD4sBGF23HNQWHl2R4Wcuzqw9GaV5dYwxqJBeX8pLYzHjDBuL0eBQhw6ja8UfjT8lnfONrH4Q== Sender: "smcallis via sendgmr" X-Received: from smcallis.bld.corp.google.com ([2620:15c:183:200:a6ae:11ff:fe11:fc5d]) (user=smcallis job=sendgmr) by 2002:a25:c287:: with SMTP id s129mr2062346ybf.489.1602616663243; Tue, 13 Oct 2020 12:17:43 -0700 (PDT) Date: Tue, 13 Oct 2020 13:17:28 -0600 In-Reply-To: <20201013191729.2524700-1-smcallis@google.com> Message-Id: <20201013191729.2524700-2-smcallis@google.com> Mime-Version: 1.0 References: <20201013191729.2524700-1-smcallis@google.com> X-Mailer: git-send-email 2.28.0.1011.ga647a8990f-goog Subject: [PATCH v2 2/3] replace CURLOPT_FILE With CURLOPT_WRITEDATA From: Sean McAllister To: git@vger.kernel.org, gitster@pobox.com, peff@peff.net, masayasuzuki@google.com, jrnieder@gmail.com Cc: Sean McAllister Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org CURLOPT_FILE has been deprecated since 2003. Signed-off-by: Sean McAllister --- http-push.c | 6 +++--- http-walker.c | 2 +- http.c | 6 +++--- remote-curl.c | 4 ++-- 4 files changed, 9 insertions(+), 9 deletions(-) diff --git a/http-push.c b/http-push.c index 6a4a43e07f..2e6fee3305 100644 --- a/http-push.c +++ b/http-push.c @@ -894,7 +894,7 @@ static struct remote_lock *lock_remote(const char *path, long timeout) slot->results = &results; curl_setup_http(slot->curl, url, DAV_LOCK, &out_buffer, fwrite_buffer); curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers); - curl_easy_setopt(slot->curl, CURLOPT_FILE, &in_buffer); + curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer); lock = xcalloc(1, sizeof(*lock)); lock->timeout = -1; @@ -1151,7 +1151,7 @@ static void remote_ls(const char *path, int flags, curl_setup_http(slot->curl, url, DAV_PROPFIND, &out_buffer, fwrite_buffer); curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers); - curl_easy_setopt(slot->curl, CURLOPT_FILE, &in_buffer); + curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer); if (start_active_slot(slot)) { run_active_slot(slot); @@ -1225,7 +1225,7 @@ static int locking_available(void) curl_setup_http(slot->curl, repo->url, DAV_PROPFIND, &out_buffer, fwrite_buffer); curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, dav_headers); - curl_easy_setopt(slot->curl, CURLOPT_FILE, &in_buffer); + curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &in_buffer); if (start_active_slot(slot)) { run_active_slot(slot); diff --git a/http-walker.c b/http-walker.c index 4fb1235cd4..6c630711d1 100644 --- a/http-walker.c +++ b/http-walker.c @@ -384,7 +384,7 @@ static void fetch_alternates(struct walker *walker, const char *base) alt_req.walker = walker; slot->callback_data = &alt_req; - curl_easy_setopt(slot->curl, CURLOPT_FILE, &buffer); + curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &buffer); curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, fwrite_buffer); curl_easy_setopt(slot->curl, CURLOPT_URL, url.buf); diff --git a/http.c b/http.c index 8b23a546af..b3c1669388 100644 --- a/http.c +++ b/http.c @@ -1921,7 +1921,7 @@ static int http_request(const char *url, curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 1); } else { curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0); - curl_easy_setopt(slot->curl, CURLOPT_FILE, result); + curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, result); if (target == HTTP_REQUEST_FILE) { off_t posn = ftello(result); @@ -2337,7 +2337,7 @@ struct http_pack_request *new_direct_http_pack_request( } preq->slot = get_active_slot(); - curl_easy_setopt(preq->slot->curl, CURLOPT_FILE, preq->packfile); + curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEDATA, preq->packfile); curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite); curl_easy_setopt(preq->slot->curl, CURLOPT_URL, preq->url); curl_easy_setopt(preq->slot->curl, CURLOPT_HTTPHEADER, @@ -2508,7 +2508,7 @@ struct http_object_request *new_http_object_request(const char *base_url, freq->slot = get_active_slot(); - curl_easy_setopt(freq->slot->curl, CURLOPT_FILE, freq); + curl_easy_setopt(freq->slot->curl, CURLOPT_WRITEDATA, freq); curl_easy_setopt(freq->slot->curl, CURLOPT_FAILONERROR, 0); curl_easy_setopt(freq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite_sha1_file); curl_easy_setopt(freq->slot->curl, CURLOPT_ERRORBUFFER, freq->errorstr); diff --git a/remote-curl.c b/remote-curl.c index 32cc4a0c55..7f44fa30fe 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -847,7 +847,7 @@ static int probe_rpc(struct rpc_state *rpc, struct slot_results *results) curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, 4); curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers); curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, fwrite_buffer); - curl_easy_setopt(slot->curl, CURLOPT_FILE, &buf); + curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &buf); err = run_slot(slot, results); @@ -1012,7 +1012,7 @@ static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_rece rpc_in_data.slot = slot; rpc_in_data.check_pktline = stateless_connect; memset(&rpc_in_data.pktline_state, 0, sizeof(rpc_in_data.pktline_state)); - curl_easy_setopt(slot->curl, CURLOPT_FILE, &rpc_in_data); + curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, &rpc_in_data); curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0); From patchwork Tue Oct 13 19:17:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean McAllister X-Patchwork-Id: 11836065 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-20.4 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9CF3AC43467 for ; Tue, 13 Oct 2020 19:17:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3C1F720BED for ; Tue, 13 Oct 2020 19:17:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="SoEVDtOj" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726956AbgJMTRq (ORCPT ); Tue, 13 Oct 2020 15:17:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52284 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725919AbgJMTRp (ORCPT ); Tue, 13 Oct 2020 15:17:45 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 53DC6C0613D0 for ; Tue, 13 Oct 2020 12:17:45 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id y7so854676ybh.10 for ; Tue, 13 Oct 2020 12:17:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=BBfsiTtx4TLB/Uphsh1a5P1NcnNbUCe3CdTEt+GWi7o=; b=SoEVDtOjDnEJa+R7OCP6HkFZkWD2BcaL3XFQ0Jtgqa6YACG4tIcvi6hinsnh17zcy0 ZFq3lda078JE+b9KzarBcK8hWlHYsSHj2zXO1CRYwBHmNWYQKWTsVBjsBXhO0nkWC0Hh bC+eKGpWmj+v+lNhVP2LN+Jz1HXp1/6FmMDMplFeBWSabMz0yegywQA5f4ontgWhbS45 B6o3XYQ8Y8zrgc7tADfKYEtkYMGyLIWRwNDckHIjrYqOCYYfiRJpJrz5555gA8mu704d UubN/N+iEldd2kzMu3Clb0PNrNcmpKPUku6S1Wsgi7Z/ebXDniONju2Mmvcd0xozy1ow D2Wg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=BBfsiTtx4TLB/Uphsh1a5P1NcnNbUCe3CdTEt+GWi7o=; b=dcNbI3GIfX6j4W42VMlJxLBkptrHfspVH3CplWUoSAau2nfTZW+Vk2yB2HHO3J4hz9 5cy1Tc0vn4DW8QVqi90qjgTUzpNCYBQuxudEAzxi2JBJnTkkKdoW43K9KNnsQwzQblJW XSKBNshn9Dwp+AtxwjXv7rO7LLExIkSG3LZhqB3RF4eyUSq8XKhM3rMU/yt9uySesg13 jnESKaRqFyaDAH1Lm3wYsBZoXdtpcO+ebrEClDfTFrymC393E+PinRYNWm4VOku+oWOC iks6wfCqyHapZm1TKMQ0L9TaPDOa2PApezoq675SQj+qc089rRh0ACju4ng4xBNSGeHb JaXw== X-Gm-Message-State: AOAM530Y/Mel4YWbpwwDUTeoTF41NEhqC/0jBh97trKHulh/9D9k0OlT GGoRwSQfOiPineu5jwj/m0bnPKQHSdLTxselO/7KRl33Bo/lbneyl5jPhsXnmV6dcLQkHtjnf2D jbHcYqt+gZPE8lZmgeZtUBhbXeFZbNAzlOdjCWWgLxbffmDZcGLSlIR/7FDehiVs= X-Google-Smtp-Source: ABdhPJyCYko+5oG7onFXI5RXpl3cDlWW/70vlOYejHP8tlyJQMOZcvm1fwpyGYYHU2Lg9fHXuHxgdZw6qNY2JA== Sender: "smcallis via sendgmr" X-Received: from smcallis.bld.corp.google.com ([2620:15c:183:200:a6ae:11ff:fe11:fc5d]) (user=smcallis job=sendgmr) by 2002:a5b:44a:: with SMTP id s10mr2215865ybp.172.1602616664451; Tue, 13 Oct 2020 12:17:44 -0700 (PDT) Date: Tue, 13 Oct 2020 13:17:29 -0600 In-Reply-To: <20201013191729.2524700-1-smcallis@google.com> Message-Id: <20201013191729.2524700-3-smcallis@google.com> Mime-Version: 1.0 References: <20201013191729.2524700-1-smcallis@google.com> X-Mailer: git-send-email 2.28.0.1011.ga647a8990f-goog Subject: [PATCH v2 3/3] http: automatically retry some requests From: Sean McAllister To: git@vger.kernel.org, gitster@pobox.com, peff@peff.net, masayasuzuki@google.com, jrnieder@gmail.com Cc: Sean McAllister Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Some HTTP response codes indicate a server state that can support retrying the request rather than immediately erroring out. The server can also provide information about how long to wait before retries to via the Retry-After header. So check the server response and retry some reasonable number of times before erroring out to better accomodate transient errors. Exiting immediately becomes irksome when pulling large multi-repo code bases such as Android or Chromium, as often the entire fetch operation has to be restarted from the beginning due to an error in one repo. If we can reduce how often that occurs, then it's a big win. Signed-off-by: Sean McAllister --- Documentation/config/http.txt | 5 ++ http.c | 124 +++++++++++++++++++++++++++++++++- http.h | 2 +- remote-curl.c | 2 +- t/t5601-clone.sh | 6 +- 5 files changed, 133 insertions(+), 6 deletions(-) diff --git a/Documentation/config/http.txt b/Documentation/config/http.txt index 3968fbb697..5d5baef967 100644 --- a/Documentation/config/http.txt +++ b/Documentation/config/http.txt @@ -260,6 +260,11 @@ http.followRedirects:: the base for the follow-up requests, this is generally sufficient. The default is `initial`. +http.retryLimit:: + Some HTTP status codes (eg: 429, 503) can reasonably be retried if + they're encountered. This value configures the number of retry attempts + before giving up. The default retry limit is 3. + http..*:: Any of the http.* options above can be applied selectively to some URLs. For a config key to match a URL, each element of the config key is diff --git a/http.c b/http.c index b3c1669388..f0147582f9 100644 --- a/http.c +++ b/http.c @@ -92,6 +92,9 @@ static const char *http_proxy_ssl_key; static const char *http_proxy_ssl_ca_info; static struct credential proxy_cert_auth = CREDENTIAL_INIT; static int proxy_ssl_cert_password_required; +static int http_retry_limit = 3; +static int http_default_delay_sec = 2; +static int http_max_delay_sec = 60; static struct { const char *name; @@ -219,6 +222,56 @@ size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf) return nmemb; } + +/* return 1 for a retryable HTTP code, 0 otherwise */ +static int retryable_code(int code) +{ + switch(code) { + case 429: /* fallthrough */ + case 502: /* fallthrough */ + case 503: /* fallthrough */ + case 504: return 1; + default: return 0; + } +} + +/* + * Query the value of an HTTP header. + * + * If the header is found, then a newly allocate string containing the value + * is returned. + * + * If not found, returns NULL + */ +static char* http_header_value(const struct strbuf headers, const char *header) +{ + const size_t header_len = strlen(header); + const char* beg = headers.buf, *end; + const char* ptr = strcasestr(beg, header), *eol; + + while (ptr) { + /* headers should have no leading whitespace, and end with ':' */ + end = ptr + header_len; + if ((ptr != beg && ptr[-1] != '\n') || *end != ':') { + ptr = strcasestr(end, header); + continue; + } + + /* skip leading LWS */ + ptr = end + 1; + while (*ptr && isspace(*ptr) && *ptr != '\r') + ptr++; + + /* skip trailing LWS */ + eol = strchrnul(ptr, '\r'); + while (eol > ptr && isspace(eol[-1])) + eol--; + + return xstrndup(ptr, eol-ptr); + } + return NULL; +} + static void closedown_active_slot(struct active_request_slot *slot) { active_requests--; @@ -452,6 +505,11 @@ static int http_options(const char *var, const char *value, void *cb) return 0; } + if (!strcmp("http.retrylimit", var)) { + http_retry_limit = git_config_int(var, value); + return 0; + } + /* Fall back on the default ones */ return git_default_config(var, value, cb); } @@ -1668,7 +1726,7 @@ static int handle_curl_result(struct slot_results *results) } int run_one_slot(struct active_request_slot *slot, - struct slot_results *results) + struct slot_results *results, int *http_code) { slot->results = results; if (!start_active_slot(slot)) { @@ -1678,6 +1736,8 @@ int run_one_slot(struct active_request_slot *slot, } run_active_slot(slot); + if (http_code) + *http_code = results->http_code; return handle_curl_result(results); } @@ -1903,20 +1963,58 @@ static void http_opt_request_remainder(CURL *curl, off_t pos) #define HTTP_REQUEST_STRBUF 0 #define HTTP_REQUEST_FILE 1 +/* + * check for a retry-after header in the given headers string, if found, then + * honor it, otherwise do an exponential backoff up to the max on the current + * delay +*/ +static int http_retry_after(const struct strbuf headers, int cur_delay_sec) +{ + int delay_sec; + char *end; + char* value = http_header_value(headers, "retry-after"); + + if (value) { + delay_sec = strtol(value, &end, 0); + free(value); + if (*value && *end == '\0' && delay_sec >= 0) { + if (delay_sec > http_max_delay_sec) { + die(Q_("server requested retry after %d second," + " which is longer than max allowed\n", + "server requested retry after %d seconds," + " which is longer than max allowed\n", delay_sec), + delay_sec); + } + return delay_sec; + } + } + + cur_delay_sec *= 2; + return cur_delay_sec >= http_max_delay_sec ? http_max_delay_sec : cur_delay_sec; +} + static int http_request(const char *url, void *result, int target, const struct http_get_options *options) { struct active_request_slot *slot; struct slot_results results; - struct curl_slist *headers = http_copy_default_headers(); + struct curl_slist *headers; struct strbuf buf = STRBUF_INIT; + struct strbuf result_headers = STRBUF_INIT; const char *accept_language; int ret; + int retry_cnt = 0; + int retry_delay_sec = http_default_delay_sec; + int http_code; +retry: slot = get_active_slot(); curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1); + curl_easy_setopt(slot->curl, CURLOPT_HEADERDATA, &result_headers); + curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_buffer); + if (result == NULL) { curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 1); } else { @@ -1936,6 +2034,7 @@ static int http_request(const char *url, accept_language = get_accept_language(); + headers = http_copy_default_headers(); if (accept_language) headers = curl_slist_append(headers, accept_language); @@ -1961,7 +2060,26 @@ static int http_request(const char *url, curl_easy_setopt(slot->curl, CURLOPT_ENCODING, ""); curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0); - ret = run_one_slot(slot, &results); + http_code = 0; + ret = run_one_slot(slot, &results, &http_code); + + /* remove header data fields since not all slots will use them */ + curl_easy_setopt(slot->curl, CURLOPT_HEADERDATA, NULL); + curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, NULL); + + if (ret != HTTP_OK) { + if (retryable_code(http_code) && retry_cnt < http_retry_limit) { + retry_cnt++; + retry_delay_sec = http_retry_after(result_headers, retry_delay_sec); + warning(Q_("got HTTP response %d, retrying after %d second (%d/%d)\n", + "got HTTP response %d, retrying after %d seconds (%d/%d)\n", + retry_delay_sec), + http_code, retry_delay_sec, retry_cnt, http_retry_limit); + sleep(retry_delay_sec); + + goto retry; + } + } if (options && options->content_type) { struct strbuf raw = STRBUF_INIT; diff --git a/http.h b/http.h index 5de792ef3f..faf9f1060e 100644 --- a/http.h +++ b/http.h @@ -99,7 +99,7 @@ void finish_all_active_slots(void); * */ int run_one_slot(struct active_request_slot *slot, - struct slot_results *results); + struct slot_results *results, int *http_code); #ifdef USE_CURL_MULTI void fill_active_slots(void); diff --git a/remote-curl.c b/remote-curl.c index 7f44fa30fe..2657c95bcb 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -805,7 +805,7 @@ static int run_slot(struct active_request_slot *slot, if (!results) results = &results_buf; - err = run_one_slot(slot, results); + err = run_one_slot(slot, results, NULL); if (err != HTTP_OK && err != HTTP_REAUTH) { struct strbuf msg = STRBUF_INIT; diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index 72aaed5a93..3965cd265d 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -757,13 +757,17 @@ test_expect_success 'partial clone using HTTP' ' ' test_expect_success 'partial clone using HTTP with redirect' ' - _NONCE=`gen_nonce` && export _NONCE && + _NONCE=$(gen_nonce) && curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" && curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" && curl "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" && partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/error_ntime/${_NONCE}/3/502/10/smart/server" ' +test_expect_success 'partial clone with retry' ' + partial_clone "$HTTPD_DOCUMENT_ROOT_PATH/server" "$HTTPD_URL/error_ntime/$(gen_nonce)/3/429/1/smart/server" 2>err && + test_i18ngrep "got HTTP response 429" err +' # DO NOT add non-httpd-specific tests here, because the last part of this # test script is only executed when httpd is available and enabled.