From patchwork Tue Sep 10 16:29:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13798872 Received: from mail-wm1-f47.google.com (mail-wm1-f47.google.com [209.85.128.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F294813A3E8 for ; Tue, 10 Sep 2024 16:30:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725985823; cv=none; b=sPvCWdlLyxjjeU5xdd3vG2KTaZDQsXo6Z/QWEQFqW7mh94zW+n/GL4+jog3yuM3IyNFCC4FnVRvByBPGoPYFB9Bxi28SOR3ggqvQ8odGsQg46401fwmKmh8N/3MfrhvUv5rE6t+yiddTO7XkQdxgDQEclMRQd5paxTneMEAp5BA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725985823; c=relaxed/simple; bh=1aT+nypFKeLY7fDpJL2rbrVDdpgmlb2TJQV3M4VGMWc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=je3FfHXh9+LtE9WqpGtql5QcZ6wIzDxU6vUagBvpTQW4S4FcK98bfdZOFe0+MDw9I+LvSLDNOL9hwivAgVyIDR9W8JfTxQXz89puMKHNWA2nLtkZX2gUNEfw3Ed7ArIZ/t6X2b/Wn+7lmlsicFowOI8QLVCsSOf6GD86Tc/GtI0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=bb805f75; arc=none smtp.client-ip=209.85.128.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="bb805f75" Received: by mail-wm1-f47.google.com with SMTP id 5b1f17b1804b1-42bbffe38e6so44418685e9.0 for ; Tue, 10 Sep 2024 09:30:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1725985819; x=1726590619; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=uPnrOVcHCY6sFsCUjNfr/0qW1N6Jb5gzzCZUNqi5TaU=; b=bb805f758H5JRN9QOtxC0o01wMXYDYNI+1l/PAqtJr0CzQz/Uhvw9Hw1Dy8DMbc3oF 55gqgqbN/+fXvEQmUWF5aQErhwITdgWQ5cB2mgrSBdQX8Dsfljx3sTAvP4+HqmzdoQA5 PtT3v1BJT8BOJAaDA9xbrWBx2YVZZ1w7llSmaw5679ricN2RDf7cPz0a+Qe1/+LZm8vE 8tlImnmOo38c0KW4BskLdrQ5CdfpEEvOGshW0doYGvi5dj4745l5zShunwqqkbgSXpuu bIsgG96niD+0c+bZGfZ6+lKszYQYIYRNniBUFrwsLdV14QxkPPoVFH28vLTsOG3s92kG okFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725985819; x=1726590619; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=uPnrOVcHCY6sFsCUjNfr/0qW1N6Jb5gzzCZUNqi5TaU=; b=d2EHW/xUvJI3CZTl/lbSiWhb2NHs4jBHP4/wd8XdmhrwAtfSy3joWcz2M7ZuEs0qRe slDOpeW1eZUpykkjfFwNKbc7qnMzOCjQkAWh1sEGrUYu9qttu23v5gFxtHcyrWPZt2T9 n20cs4pTvxVpTSIAWi9izYMEfQbxSjCcCZCfw12ThkWCAdVvdnMS8fl9pl9DeLj4gOK+ /cfgZNhLTzTW0v+iDRLzR1RoPMP4jw76QqH/38FKQgTNqs40YsxITTsU7t87knxdoCSx OfNVMs38OYQY70NoKXvTwpOP4JulGHzAo4j9Xa7OT3EVEBf0ePRJkO/0/Zhxuaoqhilf AXHA== X-Gm-Message-State: AOJu0Yz8grfx/aVsJFuvnc1AGOqP4bFs5e8glcg/ZWpVDWjq66XhMpIw QuGdfLjq/waoti0SsTI9n0Aes2FwuTF4vTn4Ohh/wl9uXFYLjPm7rVVbzQ== X-Google-Smtp-Source: AGHT+IFl517/0avT8b0Q+JTLaRJp8Si7e6VgRO49M1a7Z8s5t6y0lFBnTS4VxyXx6tlbwbM87npywQ== X-Received: by 2002:adf:cd0b:0:b0:374:be8e:7325 with SMTP id ffacd0b85a97d-3789272c93bmr8199729f8f.51.1725985818697; Tue, 10 Sep 2024 09:30:18 -0700 (PDT) Received: from christian-Precision-5550.. (176-138-135-207.abo.bbox.fr. [176.138.135.207]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-42cc01a8ee7sm29897865e9.0.2024.09.10.09.30.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Sep 2024 09:30:17 -0700 (PDT) From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Patrick Steinhardt , Taylor Blau , Eric Sunshine , Christian Couder , Christian Couder Subject: [PATCH v2 1/4] version: refactor strbuf_sanitize() Date: Tue, 10 Sep 2024 18:29:57 +0200 Message-ID: <20240910163000.1985723-2-christian.couder@gmail.com> X-Mailer: git-send-email 2.46.0.4.g7a37e584ed In-Reply-To: <20240910163000.1985723-1-christian.couder@gmail.com> References: <20240731134014.2299361-1-christian.couder@gmail.com> <20240910163000.1985723-1-christian.couder@gmail.com> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The git_user_agent_sanitized() function performs some sanitizing to avoid special characters being sent over the line and possibly messing up with the protocol or with the parsing on the other side. Let's extract this sanitizing into a new strbuf_sanitize() function, as we will want to reuse it in a following patch, and let's put it into strbuf.{c,h}. While at it, let's also make a few small improvements: - use 'size_t' for 'i' instead of 'int', - move the declaration of 'i' inside the 'for ( ... )', - use strbuf_detach() to explicitly detach the string contained by the 'sb' strbuf. Helped-by: Eric Sunshine Signed-off-by: Christian Couder --- strbuf.c | 9 +++++++++ strbuf.h | 7 +++++++ version.c | 9 ++------- 3 files changed, 18 insertions(+), 7 deletions(-) diff --git a/strbuf.c b/strbuf.c index 3d2189a7f6..cccfdec0e3 100644 --- a/strbuf.c +++ b/strbuf.c @@ -1082,3 +1082,12 @@ void strbuf_strip_file_from_path(struct strbuf *sb) char *path_sep = find_last_dir_sep(sb->buf); strbuf_setlen(sb, path_sep ? path_sep - sb->buf + 1 : 0); } + +void strbuf_sanitize(struct strbuf *sb) +{ + strbuf_trim(sb); + for (size_t i = 0; i < sb->len; i++) { + if (sb->buf[i] <= 32 || sb->buf[i] >= 127) + sb->buf[i] = '.'; + } +} diff --git a/strbuf.h b/strbuf.h index 003f880ff7..884157873e 100644 --- a/strbuf.h +++ b/strbuf.h @@ -664,6 +664,13 @@ typedef int (*char_predicate)(char ch); void strbuf_addstr_urlencode(struct strbuf *sb, const char *name, char_predicate allow_unencoded_fn); +/* + * Trim and replace each character with ascii code below 32 or above + * 127 (included) using a dot '.' character. Useful for sending + * capabilities. + */ +void strbuf_sanitize(struct strbuf *sb); + __attribute__((format (printf,1,2))) int printf_ln(const char *fmt, ...); __attribute__((format (printf,2,3))) diff --git a/version.c b/version.c index 41b718c29e..951e6dca74 100644 --- a/version.c +++ b/version.c @@ -24,15 +24,10 @@ const char *git_user_agent_sanitized(void) if (!agent) { struct strbuf buf = STRBUF_INIT; - int i; strbuf_addstr(&buf, git_user_agent()); - strbuf_trim(&buf); - for (i = 0; i < buf.len; i++) { - if (buf.buf[i] <= 32 || buf.buf[i] >= 127) - buf.buf[i] = '.'; - } - agent = buf.buf; + strbuf_sanitize(&buf); + agent = strbuf_detach(&buf, NULL); } return agent; From patchwork Tue Sep 10 16:29:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13798874 Received: from mail-wm1-f50.google.com (mail-wm1-f50.google.com [209.85.128.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B648613B2B8 for ; Tue, 10 Sep 2024 16:30:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725985825; cv=none; b=IxCO/D8uJXySH2T8l7uFK3bqnLbJhhrDNgpTUfYdjH01E0c5+lHkf3/9mb+ImiMlhHyaXmCPQ1LdYsvhbqKcDi1kU6G/UWX/iL2lu8tHbS99PUGxAUKTbr6lnmQ0LJtJcyFVXQe6PeQy14/PHL21tRp+ZhuSW/Y6TQa3vLmW3SM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725985825; c=relaxed/simple; bh=ya0/CACRmTfh4fk0cbIlxjg7nz6daWeVS1RiID9z0Ng=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=IFV5t3dTmmJ+OLI1Bf/OxJX3duEZ3sDGpu7QtcAuJktg01QwetYssewbJE0HQ1Ub+HIwb1Y0H6zxJjzhmwmqcnQ7TXniXR/qrqNVQiean7UuYnS9cDkWz1fzW+XJk091MMw++IQX15e6HIGkDnDGVzDpoNBJwcJ49IC7/zs60AQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=MAnBbG9J; arc=none smtp.client-ip=209.85.128.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="MAnBbG9J" Received: by mail-wm1-f50.google.com with SMTP id 5b1f17b1804b1-42cb5b3c57eso25361775e9.2 for ; Tue, 10 Sep 2024 09:30:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1725985822; x=1726590622; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=NfygiRTxI6/Pzw9YYF2AKnRPiTz3aD+gNb6HSs1gjPc=; b=MAnBbG9Jwrz09bYOx3piKZ28izYjxoxKXcGSZyGgYICka48hFhiB3SjV9whI5FTGAL FYkhLTHMSps40cIKC4JpbzuAI+L+3QTPMNaVoUu5Qt7cAJTdGcBtj3eq1Af28xUqkgAE M6W9BOPk1shQRbxNySxNTN6k8lqgq5ZewgprzlL5s+YqCg9TA2VZVT4h72KBUw5mibhk 0d3yztIhveY0wpBUXLwfhrzDVwm1iezmtYCEeR3XNbRGAtvP8FRnHKqEfobwd0UKYFKR EWRhB4q1zb9KXbbpJGe1/J4A2lt0nIoCoGweY4dWQhU7yuIAbOwjjpehFN4gbZBLKBRg qsNg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725985822; x=1726590622; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=NfygiRTxI6/Pzw9YYF2AKnRPiTz3aD+gNb6HSs1gjPc=; b=tpbTg5oWlLQ3xkowEUANsNIRRYy3Mi//cn0hI3tjZld02yUwy69aokBrZO9szjCd9J s72F8DTOIqvmqFr+6SRZN8KwF2pAMpQktRVhGpjVIxEXxr2oIvg292RDJhsYTD+0fXAE wc8eU8WwVwzI2dunZ1cXhHF3X03NioTwPs8EQhZmE5mHc1Bz4u22UvBJDvHsH4eJ5zfH 3kh7sQwCKKUgHxDvtTmIxV/Ag8u49gT5BwBmfxPyRC8be9okxAVuxMVhAoHwPMvpzoxh bBfsmlXRBXwze8SBYaqiBwgpNuqDnYealva3T+5a9qMMisZBQlhXe2K3BaZm8WBQC73c aPnw== X-Gm-Message-State: AOJu0YyAAR60twSUAJag2yuC1pa2Mb8pi2uYy6xGBVBMXzCqnzLOxoDJ J+7x0bc3ZcopLoqBf3PTdtzxwJohuiFuWUxyCnzoQ28QZi5KvoHpTUTmHw== X-Google-Smtp-Source: AGHT+IFJra3Fad3Jr5+FSCwhoJCy/Dx5VJNB9NBNV7aIAMKXMv1L/KPIb9Ak2z1/bOx9jq8Ke8kv/Q== X-Received: by 2002:a05:600c:4f4e:b0:42c:b4f5:47a9 with SMTP id 5b1f17b1804b1-42cb4f54c54mr73501515e9.14.1725985821560; Tue, 10 Sep 2024 09:30:21 -0700 (PDT) Received: from christian-Precision-5550.. (176-138-135-207.abo.bbox.fr. [176.138.135.207]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-42cc01a8ee7sm29897865e9.0.2024.09.10.09.30.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Sep 2024 09:30:19 -0700 (PDT) From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Patrick Steinhardt , Taylor Blau , Eric Sunshine , Christian Couder , Christian Couder Subject: [PATCH v2 2/4] strbuf: refactor strbuf_trim_trailing_ch() Date: Tue, 10 Sep 2024 18:29:58 +0200 Message-ID: <20240910163000.1985723-3-christian.couder@gmail.com> X-Mailer: git-send-email 2.46.0.4.g7a37e584ed In-Reply-To: <20240910163000.1985723-1-christian.couder@gmail.com> References: <20240731134014.2299361-1-christian.couder@gmail.com> <20240910163000.1985723-1-christian.couder@gmail.com> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 We often have to split strings at some specified terminator character. The strbuf_split*() functions, that we can use for this purpose, return substrings that include the terminator character, so we often need to remove that character. When it is a whitespace, newline or directory separator, the terminator character can easily be removed using an existing triming function like strbuf_rtrim(), strbuf_trim_trailing_newline() or strbuf_trim_trailing_dir_sep(). There is no function to remove that character when it's not one of those characters though. Let's introduce a new strbuf_trim_trailing_ch() function that can be used to remove any trailing character, and let's refactor existing code that manually removed trailing characters using this new function. We are also going to use this new function in a following commit. Signed-off-by: Christian Couder --- strbuf.c | 7 +++++++ strbuf.h | 3 +++ trace2/tr2_cfg.c | 10 ++-------- 3 files changed, 12 insertions(+), 8 deletions(-) diff --git a/strbuf.c b/strbuf.c index cccfdec0e3..c986ec28f4 100644 --- a/strbuf.c +++ b/strbuf.c @@ -134,6 +134,13 @@ void strbuf_trim_trailing_dir_sep(struct strbuf *sb) sb->buf[sb->len] = '\0'; } +void strbuf_trim_trailing_ch(struct strbuf *sb, int c) +{ + while (sb->len > 0 && sb->buf[sb->len - 1] == c) + sb->len--; + sb->buf[sb->len] = '\0'; +} + void strbuf_trim_trailing_newline(struct strbuf *sb) { if (sb->len > 0 && sb->buf[sb->len - 1] == '\n') { diff --git a/strbuf.h b/strbuf.h index 884157873e..5e389ab065 100644 --- a/strbuf.h +++ b/strbuf.h @@ -197,6 +197,9 @@ void strbuf_trim_trailing_dir_sep(struct strbuf *sb); /* Strip trailing LF or CR/LF */ void strbuf_trim_trailing_newline(struct strbuf *sb); +/* Strip trailing character c */ +void strbuf_trim_trailing_ch(struct strbuf *sb, int c); + /** * Replace the contents of the strbuf with a reencoded form. Returns -1 * on error, 0 on success. diff --git a/trace2/tr2_cfg.c b/trace2/tr2_cfg.c index d96d908bb9..356fcd38f4 100644 --- a/trace2/tr2_cfg.c +++ b/trace2/tr2_cfg.c @@ -33,10 +33,7 @@ static int tr2_cfg_load_patterns(void) tr2_cfg_patterns = strbuf_split_buf(envvar, strlen(envvar), ',', -1); for (s = tr2_cfg_patterns; *s; s++) { - struct strbuf *buf = *s; - - if (buf->len && buf->buf[buf->len - 1] == ',') - strbuf_setlen(buf, buf->len - 1); + strbuf_trim_trailing_ch(*s, ','); strbuf_trim_trailing_newline(*s); strbuf_trim(*s); } @@ -72,10 +69,7 @@ static int tr2_load_env_vars(void) tr2_cfg_env_vars = strbuf_split_buf(varlist, strlen(varlist), ',', -1); for (s = tr2_cfg_env_vars; *s; s++) { - struct strbuf *buf = *s; - - if (buf->len && buf->buf[buf->len - 1] == ',') - strbuf_setlen(buf, buf->len - 1); + strbuf_trim_trailing_ch(*s, ','); strbuf_trim_trailing_newline(*s); strbuf_trim(*s); } From patchwork Tue Sep 10 16:29:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13798875 Received: from mail-wm1-f48.google.com (mail-wm1-f48.google.com [209.85.128.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A512C38F9C for ; Tue, 10 Sep 2024 16:30:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725985828; cv=none; b=QUuNmUNGY0ITl49nK/GwcTkVZwPvw3d+lhpvDUlqguVj4y6WVljpgQiXEKpHaG8rn6NumLQgWzbMt/DE9cSncR6y8r31RDv28bkkGnQjcnxSZHtEmUJfF+zf8denwTVWoBDJnHmDoAgMezo7pLrOAF+CwDZyfxz70vBB3koYXro= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725985828; c=relaxed/simple; bh=Ja9yno0OY0m4HCmmF+dXns0TB5JmALBNjGErubllYGs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XWpQKcysqg1ajgDP9lx+JEMEd5Fc2qQ/O1Z2HkxCQxA1b0NjoqC5QjNWXSQOc+BypWjetuR3+CvHrzen12eVQe5v9brlQitEhTqbiDxcAh6KkqR3Us0+cyLft8nIU48xNZnO+zE3pm+VV3MT2c6tySMmGoIAicmKS0SgUtC4IbI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=U11dp0Ou; arc=none smtp.client-ip=209.85.128.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="U11dp0Ou" Received: by mail-wm1-f48.google.com with SMTP id 5b1f17b1804b1-42bbffe38e6so44419335e9.0 for ; Tue, 10 Sep 2024 09:30:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1725985823; x=1726590623; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=GDvowuXzQh66ztbbksfGQJJCtfSBncvWNLO9yP0YsA8=; b=U11dp0OuJL2gK0ySOtwDtzwo86wQFCgYoNwBnLqIatrMe3OJ9FtOA0JODmzn1xJDgo 6+CLXor6Ju3rmDT3i+bt4P/MKpTGPnkC+IWDYX7TUF01sXd3ntTXYnYbTRpCwz2N9ZYr s6m0jc/iJujodcD1ZmgJMv/bCRKnQdYf8ASblHvM4wOdou+knFRZ+nZ4XRJq81P0L2AC wemKdRsKVFpEZwFSsy690UoRhl0CD/JVzZuhTw8GuclGlh+GNgevRYJ67rWeIvX460k2 3C3SPdSyztLkPZ2abgGLqEY4eJVY1yB4OXjNQnsrUVoZArIQqdSbUI4Y0W+z3bcLOW/q SeWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725985823; x=1726590623; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=GDvowuXzQh66ztbbksfGQJJCtfSBncvWNLO9yP0YsA8=; b=DUc9DD8+ZzJBM+320Y+CmEwo+qlWWLtE2dC/4/lCi4nK10mnVJvdz3tocX4WYQfUTe LhDmd9gtaQrMriXL10KaZbEp4J0rgxIobWLED6f9TytD1X3B7J5brcUYv+6l3ZFUXyIv nuwb736c2Y2dqPff9rexf8+kFJFUuHr9luKV55RkJiJNgMFyIG8fuCUFb3dFMy+i2IB+ 9fiKgB9iJzc4V+7IuRXybP6Hlm8WNw/XrPjGh5ShskrGCkZOlOv0zRWX/Mv6S1d1vK20 dWBRqy/cixcnDtTPD5wn752RvTcLLDFxMtPFV1IeHT1HSWzFoPe32slUsxBas8LjI3DI Xjeg== X-Gm-Message-State: AOJu0Yw9Mc3rysVjSvq8AP1sKnXjLAXGJsNm7qnJZCe1NxyZV5LHX0MC u1FqNY0TI7SRNNz1BEDcWh70cay1UIsJSowAKjGkrso5YlR0KHZdC3W8rQ== X-Google-Smtp-Source: AGHT+IElZCevoiv4wt55LpxCpcXh+nRHoCMnbA1pdYjxuFZKFD04ql71JYj7A69y+WYEB320TugF1Q== X-Received: by 2002:a7b:ca4e:0:b0:42c:a6da:a149 with SMTP id 5b1f17b1804b1-42cad87efa4mr95631605e9.25.1725985823280; Tue, 10 Sep 2024 09:30:23 -0700 (PDT) Received: from christian-Precision-5550.. (176-138-135-207.abo.bbox.fr. [176.138.135.207]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-42cc01a8ee7sm29897865e9.0.2024.09.10.09.30.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Sep 2024 09:30:22 -0700 (PDT) From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Patrick Steinhardt , Taylor Blau , Eric Sunshine , Christian Couder , Christian Couder Subject: [PATCH v2 3/4] Add 'promisor-remote' capability to protocol v2 Date: Tue, 10 Sep 2024 18:29:59 +0200 Message-ID: <20240910163000.1985723-4-christian.couder@gmail.com> X-Mailer: git-send-email 2.46.0.4.g7a37e584ed In-Reply-To: <20240910163000.1985723-1-christian.couder@gmail.com> References: <20240731134014.2299361-1-christian.couder@gmail.com> <20240910163000.1985723-1-christian.couder@gmail.com> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 When a server S knows that some objects from a repository are available from a promisor remote X, S might want to suggest to a client C cloning or fetching the repo from S that C should use X directly instead of S for these objects. Note that this could happen both in the case S itself doesn't have the objects and borrows them from X, and in the case S has the objects but knows that X is better connected to the world (e.g., it is in a $LARGEINTERNETCOMPANY datacenter with petabit/s backbone connections) than S. Implementation of the latter case, which would require S to omit in its response the objects available on X, is left for future improvement though. Then C might or might not, want to get the objects from X, and should let S know about this. To allow S and C to agree and let each other know about C using X or not, let's introduce a new "promisor-remote" capability in the protocol v2, as well as a few new configuration variables: - "promisor.advertise" on the server side, and: - "promisor.acceptFromServer" on the client side. By default, or if "promisor.advertise" is set to 'false', a server S will not advertise the "promisor-remote" capability. If S doesn't advertise the "promisor-remote" capability, then a client C replying to S shouldn't advertise the "promisor-remote" capability either. If "promisor.advertise" is set to 'true', S will advertise its promisor remotes with a string like: promisor-remote=[;]... where each element contains information about a single promisor remote in the form: name=[,url=] where is the urlencoded name of a promisor remote and is the urlencoded URL of the promisor remote named . For now, the URL is passed in addition to the name. In the future, it might be possible to pass other information like a filter-spec that the client should use when cloning from S, or a token that the client should use when retrieving objects from X. It might also be possible in the future for "promisor.advertise" to have other values. For example a value like "onlyName" could prevent S from advertising URLs, which could help in case C should use a different URL for X than the URL S is using. (The URL S is using might be an internal one on the server side for example.) By default or if "promisor.acceptFromServer" is set to "None", C will not accept to use the promisor remotes that might have been advertised by S. In this case, C will not advertise any "promisor-remote" capability in its reply to S. If "promisor.acceptFromServer" is set to "All" and S advertised some promisor remotes, then on the contrary, C will accept to use all the promisor remotes that S advertised and C will reply with a string like: promisor-remote=[;]... where the elements are the urlencoded names of all the promisor remotes S advertised. In a following commit, other values for "promisor.acceptFromServer" will be implemented, so that C will be able to decide the promisor remotes it accepts depending on the name and URL it received from S. So even if that name and URL information is not used much right now, it will be needed soon. Helped-by: Taylor Blau Helped-by: Patrick Steinhardt Signed-off-by: Christian Couder --- Documentation/config/promisor.txt | 17 +++ Documentation/gitprotocol-v2.txt | 54 +++++++ connect.c | 9 ++ promisor-remote.c | 198 ++++++++++++++++++++++++++ promisor-remote.h | 36 ++++- serve.c | 26 ++++ t/t5710-promisor-remote-capability.sh | 124 ++++++++++++++++ upload-pack.c | 3 + 8 files changed, 466 insertions(+), 1 deletion(-) create mode 100755 t/t5710-promisor-remote-capability.sh diff --git a/Documentation/config/promisor.txt b/Documentation/config/promisor.txt index 98c5cb2ec2..9cbfe3e59e 100644 --- a/Documentation/config/promisor.txt +++ b/Documentation/config/promisor.txt @@ -1,3 +1,20 @@ promisor.quiet:: If set to "true" assume `--quiet` when fetching additional objects for a partial clone. + +promisor.advertise:: + If set to "true", a server will use the "promisor-remote" + capability, see linkgit:gitprotocol-v2[5], to advertise the + promisor remotes it is using, if it uses some. Default is + "false", which means the "promisor-remote" capability is not + advertised. + +promisor.acceptFromServer:: + If set to "all", a client will accept all the promisor remotes + a server might advertise using the "promisor-remote" + capability. Default is "none", which means no promisor remote + advertised by a server will be accepted. By accepting a + promisor remote, the client agrees that the server might omit + objects that are lazily fetchable from this promisor remote + from its responses to "fetch" and "clone" requests from the + client. See linkgit:gitprotocol-v2[5]. diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt index 414bc625d5..65d5256baf 100644 --- a/Documentation/gitprotocol-v2.txt +++ b/Documentation/gitprotocol-v2.txt @@ -781,6 +781,60 @@ retrieving the header from a bundle at the indicated URI, and thus save themselves and the server(s) the request(s) needed to inspect the headers of that bundle or bundles. +promisor-remote= +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The server may advertise some promisor remotes it is using or knows +about to a client which may want to use them as its promisor remotes, +instead of this repository. In this case should be of the +form: + + pr-infos = pr-info | pr-infos ";" pr-info + + pr-info = "name=" pr-name | "name=" pr-name "," "url=" pr-url + +where `pr-name` is the urlencoded name of a promisor remote, and +`pr-url` the urlencoded URL of that promisor remote. + +In this case, if the client decides to use one or more promisor +remotes the server advertised, it can reply with +"promisor-remote=" where should be of the form: + + pr-names = pr-name | pr-names ";" pr-name + +where `pr-name` is the urlencoded name of a promisor remote the server +advertised and the client accepts. + +Note that, everywhere in this document, `pr-name` MUST be a valid +remote name, and the ';' and ',' characters MUST be encoded if they +appear in `pr-name` or `pr-url`. + +If the server doesn't know any promisor remote that could be good for +a client to use, or prefers a client not to use any promisor remote it +uses or knows about, it shouldn't advertise the "promisor-remote" +capability at all. + +In this case, or if the client doesn't want to use any promisor remote +the server advertised, the client shouldn't advertise the +"promisor-remote" capability at all in its reply. + +The "promisor.advertise" and "promisor.acceptFromServer" configuration +options can be used on the server and client side respectively to +control what they advertise or accept respectively. See the +documentation of these configuration options for more information. + +Note that in the future it would be nice if the "promisor-remote" +protocol capability could be used by the server, when responding to +`git fetch` or `git clone`, to advertise better-connected remotes that +the client can use as promisor remotes, instead of this repository, so +that the client can lazily fetch objects from these other +better-connected remotes. This would require the server to omit in its +response the objects available on the better-connected remotes that +the client has accepted. This hasn't been implemented yet though. So +for now this "promisor-remote" capability is useful only when the +server advertises some promisor remotes it already uses to borrow +objects from. + GIT --- Part of the linkgit:git[1] suite diff --git a/connect.c b/connect.c index cf84e631e9..1650bbd71d 100644 --- a/connect.c +++ b/connect.c @@ -20,6 +20,7 @@ #include "protocol.h" #include "alias.h" #include "bundle-uri.h" +#include "promisor-remote.h" static char *server_capabilities_v1; static struct strvec server_capabilities_v2 = STRVEC_INIT; @@ -485,6 +486,7 @@ void check_stateless_delimiter(int stateless_rpc, static void send_capabilities(int fd_out, struct packet_reader *reader) { const char *hash_name; + const char *promisor_remote_info; if (server_supports_v2("agent")) packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized()); @@ -498,6 +500,13 @@ static void send_capabilities(int fd_out, struct packet_reader *reader) } else { reader->hash_algo = &hash_algos[GIT_HASH_SHA1]; } + if (server_feature_v2("promisor-remote", &promisor_remote_info)) { + char *reply = promisor_remote_reply(promisor_remote_info); + if (reply) { + packet_write_fmt(fd_out, "promisor-remote=%s", reply); + free(reply); + } + } } int get_remote_bundle_uri(int fd_out, struct packet_reader *reader, diff --git a/promisor-remote.c b/promisor-remote.c index 317e1b127f..baacbe9d94 100644 --- a/promisor-remote.c +++ b/promisor-remote.c @@ -11,6 +11,7 @@ #include "strvec.h" #include "packfile.h" #include "environment.h" +#include "url.h" struct promisor_remote_config { struct promisor_remote *promisors; @@ -219,6 +220,18 @@ int repo_has_promisor_remote(struct repository *r) return !!repo_promisor_remote_find(r, NULL); } +int repo_has_accepted_promisor_remote(struct repository *r) +{ + struct promisor_remote *p; + + promisor_remote_init(r); + + for (p = r->promisor_remote_config->promisors; p; p = p->next) + if (p->accepted) + return 1; + return 0; +} + static int remove_fetched_oids(struct repository *repo, struct object_id **oids, int oid_nr, int to_free) @@ -290,3 +303,188 @@ void promisor_remote_get_direct(struct repository *repo, if (to_free) free(remaining_oids); } + +static int allow_unsanitized(char ch) +{ + if (ch == ',' || ch == ';' || ch == '%') + return 0; + return ch > 32 && ch < 127; +} + +static void promisor_info_vecs(struct repository *repo, + struct strvec *names, + struct strvec *urls) +{ + struct promisor_remote *r; + + promisor_remote_init(repo); + + for (r = repo->promisor_remote_config->promisors; r; r = r->next) { + char *url; + char *url_key = xstrfmt("remote.%s.url", r->name); + + strvec_push(names, r->name); + strvec_push(urls, git_config_get_string(url_key, &url) ? NULL : url); + + free(url); + free(url_key); + } +} + +char *promisor_remote_info(struct repository *repo) +{ + struct strbuf sb = STRBUF_INIT; + int advertise_promisors = 0; + struct strvec names = STRVEC_INIT; + struct strvec urls = STRVEC_INIT; + + git_config_get_bool("promisor.advertise", &advertise_promisors); + + if (!advertise_promisors) + return NULL; + + promisor_info_vecs(repo, &names, &urls); + + if (!names.nr) + return NULL; + + for (size_t i = 0; i < names.nr; i++) { + if (i) + strbuf_addch(&sb, ';'); + strbuf_addstr(&sb, "name="); + strbuf_addstr_urlencode(&sb, names.v[i], allow_unsanitized); + if (urls.v[i]) { + strbuf_addstr(&sb, ",url="); + strbuf_addstr_urlencode(&sb, urls.v[i], allow_unsanitized); + } + } + + strbuf_sanitize(&sb); + + strvec_clear(&names); + strvec_clear(&urls); + + return strbuf_detach(&sb, NULL); +} + +enum accept_promisor { + ACCEPT_NONE = 0, + ACCEPT_ALL +}; + +static int should_accept_remote(enum accept_promisor accept, + const char *remote_name UNUSED, + const char *remote_url UNUSED) +{ + if (accept == ACCEPT_ALL) + return 1; + + BUG("Unhandled 'enum accept_promisor' value '%d'", accept); +} + +static void filter_promisor_remote(struct repository *repo, + struct strvec *accepted, + const char *info) +{ + struct strbuf **remotes; + char *accept_str; + enum accept_promisor accept = ACCEPT_NONE; + + if (!git_config_get_string("promisor.acceptfromserver", &accept_str)) { + if (!accept_str || !*accept_str || !strcasecmp("None", accept_str)) + accept = ACCEPT_NONE; + else if (!strcasecmp("All", accept_str)) + accept = ACCEPT_ALL; + else + warning(_("unknown '%s' value for '%s' config option"), + accept_str, "promisor.acceptfromserver"); + } + + if (accept == ACCEPT_NONE) + return; + + /* Parse remote info received */ + + remotes = strbuf_split_str(info, ';', 0); + + for (size_t i = 0; remotes[i]; i++) { + struct strbuf **elems; + const char *remote_name = NULL; + const char *remote_url = NULL; + char *decoded_name = NULL; + char *decoded_url = NULL; + + strbuf_trim_trailing_ch(remotes[i], ';'); + elems = strbuf_split_str(remotes[i]->buf, ',', 0); + + for (size_t j = 0; elems[j]; j++) { + int res; + strbuf_trim_trailing_ch(elems[j], ','); + res = skip_prefix(elems[j]->buf, "name=", &remote_name) || + skip_prefix(elems[j]->buf, "url=", &remote_url); + if (!res) + warning(_("unknown element '%s' from remote info"), + elems[j]->buf); + } + + if (remote_name) + decoded_name = url_percent_decode(remote_name); + if (remote_url) + decoded_url = url_percent_decode(remote_url); + + if (decoded_name && should_accept_remote(accept, decoded_name, decoded_url)) + strvec_push(accepted, decoded_name); + + strbuf_list_free(elems); + free(decoded_name); + free(decoded_url); + } + + free(accept_str); + strbuf_list_free(remotes); +} + +char *promisor_remote_reply(const char *info) +{ + struct strvec accepted = STRVEC_INIT; + struct strbuf reply = STRBUF_INIT; + + filter_promisor_remote(the_repository, &accepted, info); + + if (!accepted.nr) + return NULL; + + for (size_t i = 0; i < accepted.nr; i++) { + if (i) + strbuf_addch(&reply, ';'); + strbuf_addstr_urlencode(&reply, accepted.v[i], allow_unsanitized); + } + + strvec_clear(&accepted); + + return strbuf_detach(&reply, NULL); +} + +void mark_promisor_remotes_as_accepted(struct repository *r, const char *remotes) +{ + struct strbuf **accepted_remotes = strbuf_split_str(remotes, ';', 0); + + for (size_t i = 0; accepted_remotes[i]; i++) { + struct promisor_remote *p; + char *decoded_remote; + + strbuf_trim_trailing_ch(accepted_remotes[i], ';'); + decoded_remote = url_percent_decode(accepted_remotes[i]->buf); + + p = repo_promisor_remote_find(r, decoded_remote); + if (p) + p->accepted = 1; + else + warning(_("accepted promisor remote '%s' not found"), + decoded_remote); + + free(decoded_remote); + } + + strbuf_list_free(accepted_remotes); +} diff --git a/promisor-remote.h b/promisor-remote.h index 88cb599c39..814ca248c7 100644 --- a/promisor-remote.h +++ b/promisor-remote.h @@ -9,11 +9,13 @@ struct object_id; * Promisor remote linked list * * Information in its fields come from remote.XXX config entries or - * from extensions.partialclone. + * from extensions.partialclone, except for 'accepted' which comes + * from protocol v2 capabilities exchange. */ struct promisor_remote { struct promisor_remote *next; char *partial_clone_filter; + unsigned int accepted : 1; const char name[FLEX_ARRAY]; }; @@ -32,4 +34,36 @@ void promisor_remote_get_direct(struct repository *repo, const struct object_id *oids, int oid_nr); +/* + * Prepare a "promisor-remote" advertisement by a server. + * Check the value of "promisor.advertise" and maybe the configured + * promisor remotes, if any, to prepare information to send in an + * advertisement. + * Return value is NULL if no promisor remote advertisement should be + * made. Otherwise it contains the names and urls of the advertised + * promisor remotes separated by ';' + */ +char *promisor_remote_info(struct repository *repo); + +/* + * Prepare a reply to a "promisor-remote" advertisement from a server. + * Check the value of "promisor.acceptfromserver" and maybe the + * configured promisor remotes, if any, to prepare the reply. + * Return value is NULL if no promisor remote from the server + * is accepted. Otherwise it contains the names of the accepted promisor + * remotes separated by ';'. + */ +char *promisor_remote_reply(const char *info); + +/* + * Set the 'accepted' flag for some promisor remotes. Useful when some + * promisor remotes have been accepted by the client. + */ +void mark_promisor_remotes_as_accepted(struct repository *repo, const char *remotes); + +/* + * Has any promisor remote been accepted by the client? + */ +int repo_has_accepted_promisor_remote(struct repository *r); + #endif /* PROMISOR_REMOTE_H */ diff --git a/serve.c b/serve.c index 884cd84ca8..a8935571d6 100644 --- a/serve.c +++ b/serve.c @@ -12,6 +12,7 @@ #include "upload-pack.h" #include "bundle-uri.h" #include "trace2.h" +#include "promisor-remote.h" static int advertise_sid = -1; static int advertise_object_info = -1; @@ -31,6 +32,26 @@ static int agent_advertise(struct repository *r UNUSED, return 1; } +static int promisor_remote_advertise(struct repository *r, + struct strbuf *value) +{ + if (value) { + char *info = promisor_remote_info(r); + if (!info) + return 0; + strbuf_addstr(value, info); + free(info); + } + return 1; +} + +static void promisor_remote_receive(struct repository *r, + const char *remotes) +{ + mark_promisor_remotes_as_accepted(r, remotes); +} + + static int object_format_advertise(struct repository *r, struct strbuf *value) { @@ -157,6 +178,11 @@ static struct protocol_capability capabilities[] = { .advertise = bundle_uri_advertise, .command = bundle_uri_command, }, + { + .name = "promisor-remote", + .advertise = promisor_remote_advertise, + .receive = promisor_remote_receive, + }, }; void protocol_v2_advertise_capabilities(void) diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh new file mode 100755 index 0000000000..7e44ad15ce --- /dev/null +++ b/t/t5710-promisor-remote-capability.sh @@ -0,0 +1,124 @@ +#!/bin/sh + +test_description='handling of promisor remote advertisement' + +. ./test-lib.sh + +# Setup the repository with three commits, this way HEAD is always +# available and we can hide commit 1 or 2. +test_expect_success 'setup: create "template" repository' ' + git init template && + test_commit -C template 1 && + test_commit -C template 2 && + test_commit -C template 3 && + test-tool genrandom foo 10240 >template/foo && + git -C template add foo && + git -C template commit -m foo +' + +# A bare repo will act as a server repo with unpacked objects. +test_expect_success 'setup: create bare "server" repository' ' + git clone --bare --no-local template server && + mv server/objects/pack/pack-* . && + packfile=$(ls pack-*.pack) && + git -C server unpack-objects --strict <"$packfile" +' + +check_missing_objects () { + git -C "$1" rev-list --objects --all --missing=print > all.txt && + perl -ne 'print if s/^[?]//' all.txt >missing.txt && + test_line_count = "$2" missing.txt && + test "$3" = "$(cat missing.txt)" +} + +initialize_server () { + # Repack everything first + git -C server -c repack.writebitmaps=false repack -a -d && + + # Remove promisor file in case they exist, useful when reinitializing + rm -rf server/objects/pack/*.promisor && + + # Repack without the largest object and create a promisor pack on server + git -C server -c repack.writebitmaps=false repack -a -d \ + --filter=blob:limit=5k --filter-to="$(pwd)" && + promisor_file=$(ls server/objects/pack/*.pack | sed "s/\.pack/.promisor/") && + touch "$promisor_file" && + + # Check that only one object is missing on the server + check_missing_objects server 1 "$oid" +} + +test_expect_success "setup for testing promisor remote advertisement" ' + # Create another bare repo called "server2" + git init --bare server2 && + + # Copy the largest object from server to server2 + obj="HEAD:foo" && + oid="$(git -C server rev-parse $obj)" && + oid_path="$(test_oid_to_path $oid)" && + path="server/objects/$oid_path" && + path2="server2/objects/$oid_path" && + mkdir -p $(dirname "$path2") && + cp "$path" "$path2" && + + initialize_server && + + # Configure server2 as promisor remote for server + git -C server remote add server2 "file://$(pwd)/server2" && + git -C server config remote.server2.promisor true && + + git -C server2 config uploadpack.allowFilter true && + git -C server2 config uploadpack.allowAnySHA1InWant true && + git -C server config uploadpack.allowFilter true && + git -C server config uploadpack.allowAnySHA1InWant true +' + +test_expect_success "fetch with promisor.advertise set to 'true'" ' + git -C server config promisor.advertise true && + + # Clone from server to create a client + GIT_NO_LAZY_FETCH=0 git clone -c remote.server2.promisor=true \ + -c remote.server2.fetch="+refs/heads/*:refs/remotes/server2/*" \ + -c remote.server2.url="file://$(pwd)/server2" \ + -c promisor.acceptfromserver=All \ + --no-local --filter="blob:limit=5k" server client && + test_when_finished "rm -rf client" && + + # Check that the largest object is still missing on the server + check_missing_objects server 1 "$oid" +' + +test_expect_success "fetch with promisor.advertise set to 'false'" ' + git -C server config promisor.advertise false && + + # Clone from server to create a client + GIT_NO_LAZY_FETCH=0 git clone -c remote.server2.promisor=true \ + -c remote.server2.fetch="+refs/heads/*:refs/remotes/server2/*" \ + -c remote.server2.url="file://$(pwd)/server2" \ + -c promisor.acceptfromserver=All \ + --no-local --filter="blob:limit=5k" server client && + test_when_finished "rm -rf client" && + + # Check that the largest object is not missing on the server + check_missing_objects server 0 "" && + + # Reinitialize server so that the largest object is missing again + initialize_server +' + +test_expect_success "fetch with promisor.acceptfromserver set to 'None'" ' + git -C server config promisor.advertise true && + + # Clone from server to create a client + GIT_NO_LAZY_FETCH=0 git clone -c remote.server2.promisor=true \ + -c remote.server2.fetch="+refs/heads/*:refs/remotes/server2/*" \ + -c remote.server2.url="file://$(pwd)/server2" \ + -c promisor.acceptfromserver=None \ + --no-local --filter="blob:limit=5k" server client && + test_when_finished "rm -rf client" && + + # Check that the largest object is not missing on the server + check_missing_objects server 0 "" +' + +test_done diff --git a/upload-pack.c b/upload-pack.c index 0052c6a4dc..0cff76c845 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -31,6 +31,7 @@ #include "write-or-die.h" #include "json-writer.h" #include "strmap.h" +#include "promisor-remote.h" /* Remember to update object flag allocation in object.h */ #define THEY_HAVE (1u << 11) @@ -317,6 +318,8 @@ static void create_pack_file(struct upload_pack_data *pack_data, strvec_push(&pack_objects.args, "--delta-base-offset"); if (pack_data->use_include_tag) strvec_push(&pack_objects.args, "--include-tag"); + if (repo_has_accepted_promisor_remote(the_repository)) + strvec_push(&pack_objects.args, "--missing=allow-promisor"); if (pack_data->filter_options.choice) { const char *spec = expand_list_objects_filter_spec(&pack_data->filter_options); From patchwork Tue Sep 10 16:30:00 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13798876 Received: from mail-wm1-f42.google.com (mail-wm1-f42.google.com [209.85.128.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BD93E1A2576 for ; Tue, 10 Sep 2024 16:30:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725985830; cv=none; b=SiBW7/OYLDVnL/TT0zD5hDhB+R6l5PWEohncfJmogmxi0Q4Ey9GHYBI+z9uhKRWjakSNdUXtuaSGQ3NIt7vV0Mlcd7Ih+MoKyIIUZIxX7jGK+Ci6Ql+kddb8q8IuUl2jDlomhJk6u8W6Tb9caC+iVrW6DZYE5Rr5aeCMQEUAI/4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725985830; c=relaxed/simple; bh=j0mLt02jD8fWrq09lJ7n2yvN8x079uNgRzeKXfUNT1g=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=nmQZYSDmPYhwwsaIJwLOHEeN42kzr8ngOZIeHaSINAuYjn5zQMQdtL8whWzlShw1Y2gwpOClH74fgN9iOaTvt176AWbg7ae3cUecp4wyOvrJx1+spRLqj8o2GpbwrDeGSpo1SIk/hY9JvYCjvR7LlVOC5OdUMJDbHH5lNCFXpuI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=OyV1DQEt; arc=none smtp.client-ip=209.85.128.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="OyV1DQEt" Received: by mail-wm1-f42.google.com with SMTP id 5b1f17b1804b1-42cbe624c59so9900945e9.3 for ; Tue, 10 Sep 2024 09:30:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1725985826; x=1726590626; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=fofkcqyB+IkL+epoxSrto4IA9ljK97B/4ILztsK04ag=; b=OyV1DQEtFFNdj3l7X56eREXJLP2Elh71S3VsO9muPUGkIEEjMq39+07z2DAzZEAWJk UpQzMm8fOb9pjG1NVoSmCI3HHY73osRVdOAzUgbdGSagfFrnnSZErK4JODMbiIUPa4o1 2iJBHDCC5ywfvrgUQpeWCr9CiXLq/qXDnWHiCsriiQ839N7Jw4Kxv064Kc67dy6rdo2h ML/17UUBgqO2cfLx7Zv/ccVXyGjSUZOJQI35ByniDfuJ3bre3BnrWaWIfX8+zTJlOCTQ eLXmUEGR6X8utKII2lSbh71PZwFDHGVAMb3b6p4rNYemAVH6v3WyfOgR2QZxfJatA1H4 op+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725985826; x=1726590626; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=fofkcqyB+IkL+epoxSrto4IA9ljK97B/4ILztsK04ag=; b=KUix2Bc2O7mG9E6ek5DdwAxXvRwBCRCAk70+sSOX6ES30ljkp09MH5juaQ+m5iW4TX n9lE58TDqGIV39X4BbRcaMrZhGfNPyr2/hxcnW2r9caP6XOnDIgoELA0CiPyiI1Rm/+d 7RBG292KBrFy/ABqIkg33299ft+Fn+8kW26pLWptNmkCXtbjOZqWIuxL+353mXShPD/M H2AhmHAJHpFXfD2mjBp+x0sL3OVHLCUahG8KAv0sRYSS0s70CTOZn9vtgWYrALRJqEsc oVtIv7AOfWPAolWPzlBfvmziB/w13zh6U3gGa59HYoLEm0MnZYsoaPUbMOIMOWrQKF83 TQEA== X-Gm-Message-State: AOJu0Yyujv50dubsyosEOD2IOgj2dq0MepwwafIbFPHnqJNNpAVWoaV1 O0IX8xe+Yz2yEL2FE7it80OM4w/OcrqSjONH3UInZiwZdxk63nbJZqGdkA== X-Google-Smtp-Source: AGHT+IG9rNi/YFxu4I/XsnzVaxTxqG9aZSJMt4I/BxB8LsQ4d+Gv47BfoPF/Fxz2a9pIOHdWCaoQfw== X-Received: by 2002:a05:600c:3541:b0:42c:c37b:4d53 with SMTP id 5b1f17b1804b1-42cc37b501fmr17241935e9.0.1725985825133; Tue, 10 Sep 2024 09:30:25 -0700 (PDT) Received: from christian-Precision-5550.. (176-138-135-207.abo.bbox.fr. [176.138.135.207]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-42cc01a8ee7sm29897865e9.0.2024.09.10.09.30.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Sep 2024 09:30:23 -0700 (PDT) From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Patrick Steinhardt , Taylor Blau , Eric Sunshine , Christian Couder , Christian Couder Subject: [PATCH v2 4/4] promisor-remote: check advertised name or URL Date: Tue, 10 Sep 2024 18:30:00 +0200 Message-ID: <20240910163000.1985723-5-christian.couder@gmail.com> X-Mailer: git-send-email 2.46.0.4.g7a37e584ed In-Reply-To: <20240910163000.1985723-1-christian.couder@gmail.com> References: <20240731134014.2299361-1-christian.couder@gmail.com> <20240910163000.1985723-1-christian.couder@gmail.com> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 A previous commit introduced a "promisor.acceptFromServer" configuration variable with only "None" or "All" as valid values. Let's introduce "KnownName" and "KnownUrl" as valid values for this configuration option to give more choice to a client about which promisor remotes it might accept among those that the server advertised. In case of "KnownName", the client will accept promisor remotes which are already configured on the client and have the same name as those advertised by the client. This could be useful in a corporate setup where servers and clients are trusted to not switch names and URLs, but where some kind of control is still useful. In case of "KnownUrl", the client will accept promisor remotes which have both the same name and the same URL configured on the client as the name and URL advertised by the server. This is the most secure option, so it should be used if possible. Signed-off-by: Christian Couder --- Documentation/config/promisor.txt | 22 ++++++--- promisor-remote.c | 54 +++++++++++++++++++-- t/t5710-promisor-remote-capability.sh | 68 +++++++++++++++++++++++++++ 3 files changed, 134 insertions(+), 10 deletions(-) diff --git a/Documentation/config/promisor.txt b/Documentation/config/promisor.txt index 9cbfe3e59e..d1364bc018 100644 --- a/Documentation/config/promisor.txt +++ b/Documentation/config/promisor.txt @@ -12,9 +12,19 @@ promisor.advertise:: promisor.acceptFromServer:: If set to "all", a client will accept all the promisor remotes a server might advertise using the "promisor-remote" - capability. Default is "none", which means no promisor remote - advertised by a server will be accepted. By accepting a - promisor remote, the client agrees that the server might omit - objects that are lazily fetchable from this promisor remote - from its responses to "fetch" and "clone" requests from the - client. See linkgit:gitprotocol-v2[5]. + capability. If set to "knownName" the client will accept + promisor remotes which are already configured on the client + and have the same name as those advertised by the client. This + is not very secure, but could be used in a corporate setup + where servers and clients are trusted to not switch name and + URLs. If set to "knownUrl", the client will accept promisor + remotes which have both the same name and the same URL + configured on the client as the name and URL advertised by the + server. This is more secure than "all" or "knownUrl", so it + should be used if possible instead of those options. Default + is "none", which means no promisor remote advertised by a + server will be accepted. By accepting a promisor remote, the + client agrees that the server might omit objects that are + lazily fetchable from this promisor remote from its responses + to "fetch" and "clone" requests from the client. See + linkgit:gitprotocol-v2[5]. diff --git a/promisor-remote.c b/promisor-remote.c index baacbe9d94..f713595eb0 100644 --- a/promisor-remote.c +++ b/promisor-remote.c @@ -367,19 +367,54 @@ char *promisor_remote_info(struct repository *repo) return strbuf_detach(&sb, NULL); } +/* + * Find first index of 'vec' where there is 'val'. 'val' is compared + * case insensively to the strings in 'vec'. If not found 'vec->nr' is + * returned. + */ +static size_t strvec_find_index(struct strvec *vec, const char *val) +{ + for (size_t i = 0; i < vec->nr; i++) + if (!strcasecmp(vec->v[i], val)) + return i; + return vec->nr; +} + enum accept_promisor { ACCEPT_NONE = 0, + ACCEPT_KNOWN_URL, + ACCEPT_KNOWN_NAME, ACCEPT_ALL }; static int should_accept_remote(enum accept_promisor accept, - const char *remote_name UNUSED, - const char *remote_url UNUSED) + const char *remote_name, const char *remote_url, + struct strvec *names, struct strvec *urls) { + size_t i; + if (accept == ACCEPT_ALL) return 1; - BUG("Unhandled 'enum accept_promisor' value '%d'", accept); + i = strvec_find_index(names, remote_name); + + if (i >= names->nr) + /* We don't know about that remote */ + return 0; + + if (accept == ACCEPT_KNOWN_NAME) + return 1; + + if (accept != ACCEPT_KNOWN_URL) + BUG("Unhandled 'enum accept_promisor' value '%d'", accept); + + if (!strcasecmp(urls->v[i], remote_url)) + return 1; + + warning(_("known remote named '%s' but with url '%s' instead of '%s'"), + remote_name, urls->v[i], remote_url); + + return 0; } static void filter_promisor_remote(struct repository *repo, @@ -389,10 +424,16 @@ static void filter_promisor_remote(struct repository *repo, struct strbuf **remotes; char *accept_str; enum accept_promisor accept = ACCEPT_NONE; + struct strvec names = STRVEC_INIT; + struct strvec urls = STRVEC_INIT; if (!git_config_get_string("promisor.acceptfromserver", &accept_str)) { if (!accept_str || !*accept_str || !strcasecmp("None", accept_str)) accept = ACCEPT_NONE; + else if (!strcasecmp("KnownUrl", accept_str)) + accept = ACCEPT_KNOWN_URL; + else if (!strcasecmp("KnownName", accept_str)) + accept = ACCEPT_KNOWN_NAME; else if (!strcasecmp("All", accept_str)) accept = ACCEPT_ALL; else @@ -403,6 +444,9 @@ static void filter_promisor_remote(struct repository *repo, if (accept == ACCEPT_NONE) return; + if (accept != ACCEPT_ALL) + promisor_info_vecs(repo, &names, &urls); + /* Parse remote info received */ remotes = strbuf_split_str(info, ';', 0); @@ -432,7 +476,7 @@ static void filter_promisor_remote(struct repository *repo, if (remote_url) decoded_url = url_percent_decode(remote_url); - if (decoded_name && should_accept_remote(accept, decoded_name, decoded_url)) + if (decoded_name && should_accept_remote(accept, decoded_name, decoded_url, &names, &urls)) strvec_push(accepted, decoded_name); strbuf_list_free(elems); @@ -441,6 +485,8 @@ static void filter_promisor_remote(struct repository *repo, } free(accept_str); + strvec_clear(&names); + strvec_clear(&urls); strbuf_list_free(remotes); } diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh index 7e44ad15ce..c2c83a5914 100755 --- a/t/t5710-promisor-remote-capability.sh +++ b/t/t5710-promisor-remote-capability.sh @@ -117,6 +117,74 @@ test_expect_success "fetch with promisor.acceptfromserver set to 'None'" ' --no-local --filter="blob:limit=5k" server client && test_when_finished "rm -rf client" && + # Check that the largest object is not missing on the server + check_missing_objects server 0 "" && + + # Reinitialize server so that the largest object is missing again + initialize_server +' + +test_expect_success "fetch with promisor.acceptfromserver set to 'KnownName'" ' + git -C server config promisor.advertise true && + + # Clone from server to create a client + GIT_NO_LAZY_FETCH=0 git clone -c remote.server2.promisor=true \ + -c remote.server2.fetch="+refs/heads/*:refs/remotes/server2/*" \ + -c remote.server2.url="file://$(pwd)/server2" \ + -c promisor.acceptfromserver=KnownName \ + --no-local --filter="blob:limit=5k" server client && + test_when_finished "rm -rf client" && + + # Check that the largest object is still missing on the server + check_missing_objects server 1 "$oid" +' + +test_expect_success "fetch with 'KnownName' and different remote names" ' + git -C server config promisor.advertise true && + + # Clone from server to create a client + GIT_NO_LAZY_FETCH=0 git clone -c remote.serverTwo.promisor=true \ + -c remote.serverTwo.fetch="+refs/heads/*:refs/remotes/server2/*" \ + -c remote.serverTwo.url="file://$(pwd)/server2" \ + -c promisor.acceptfromserver=KnownName \ + --no-local --filter="blob:limit=5k" server client && + test_when_finished "rm -rf client" && + + # Check that the largest object is not missing on the server + check_missing_objects server 0 "" && + + # Reinitialize server so that the largest object is missing again + initialize_server +' + +test_expect_success "fetch with promisor.acceptfromserver set to 'KnownUrl'" ' + git -C server config promisor.advertise true && + + # Clone from server to create a client + GIT_NO_LAZY_FETCH=0 git clone -c remote.server2.promisor=true \ + -c remote.server2.fetch="+refs/heads/*:refs/remotes/server2/*" \ + -c remote.server2.url="file://$(pwd)/server2" \ + -c promisor.acceptfromserver=KnownUrl \ + --no-local --filter="blob:limit=5k" server client && + test_when_finished "rm -rf client" && + + # Check that the largest object is still missing on the server + check_missing_objects server 1 "$oid" +' + +test_expect_success "fetch with 'KnownUrl' and different remote urls" ' + ln -s server2 serverTwo && + + git -C server config promisor.advertise true && + + # Clone from server to create a client + GIT_NO_LAZY_FETCH=0 git clone -c remote.server2.promisor=true \ + -c remote.server2.fetch="+refs/heads/*:refs/remotes/server2/*" \ + -c remote.server2.url="file://$(pwd)/serverTwo" \ + -c promisor.acceptfromserver=KnownUrl \ + --no-local --filter="blob:limit=5k" server client && + test_when_finished "rm -rf client" && + # Check that the largest object is not missing on the server check_missing_objects server 0 "" '