diff mbox series

mbox: Add support for proxied https connections

Message ID 20220118174144.2445150-1-void@manifault.com (mailing list archive)
State New, archived
Headers show
Series mbox: Add support for proxied https connections | expand

Commit Message

David Vernet Jan. 18, 2022, 5:41 p.m. UTC
The mbox subsystem uses the python requests library to fetch a thread from
lore.kernel.org, by its message ID. This is extremely convenient, but
unfortunately doesn't work for users who must redirect their http(s)
traffic through a proxy to access the internet.

This diff therefore adds support for querying 'http.proxy' from their git
config, and when set, passing that as a proxy entry to the underlying
session requests object.

Signed-off-by: David Vernet <void@manifault.com>
---
 b4/__init__.py | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Konstantin Ryabitsev Jan. 18, 2022, 6 p.m. UTC | #1
On Tue, Jan 18, 2022 at 09:41:45AM -0800, David Vernet wrote:
> The mbox subsystem uses the python requests library to fetch a thread from
> lore.kernel.org, by its message ID. This is extremely convenient, but
> unfortunately doesn't work for users who must redirect their http(s)
> traffic through a proxy to access the internet.
> 
> This diff therefore adds support for querying 'http.proxy' from their git
> config, and when set, passing that as a proxy entry to the underlying
> session requests object.

Hmm... I'm not super keen on this addition, because git has a lot more than
just http.proxy setting and supporting all of it would be a huge pain. Is
there a reason why simply setting HTTPS_PROXY environment variable is not
sufficient for your needs?

https://docs.python-requests.org/en/latest/user/advanced/#proxies

A few observations on the code below, just fyi:

> diff --git a/b4/__init__.py b/b4/__init__.py
> index 11c287e..0526a19 100644
> --- a/b4/__init__.py
> +++ b/b4/__init__.py
> @@ -138,6 +138,8 @@ DEFAULT_CONFIG = {
>  MAIN_CONFIG = None
>  # This is git-config user.*
>  USER_CONFIG = None
> +# This is git-config http.*
> +HTTP_CONFIG = None
>  
>  # Used for storing our requests session
>  REQSESSION = None

We already track REQSESSION as a global caching var, so we don't need to
introduce another global tracking var HTTP_CONFIG (it's not saving us any
calls to git-config, as we'll already only do it once per b4 invocation).

> @@ -2101,12 +2103,21 @@ def get_user_config():
>              USER_CONFIG['name'] = udata.pw_gecos
>      return USER_CONFIG
>  
> +def get_http_config():
> +    global HTTP_CONFIG
> +    if HTTP_CONFIG is None:
> +        HTTP_CONFIG = get_config_from_git(r'http\..*')
> +    return HTTP_CONFIG
>  
>  def get_requests_session():
>      global REQSESSION
>      if REQSESSION is None:
>          REQSESSION = requests.session()
>          REQSESSION.headers.update({'User-Agent': 'b4/%s' % __VERSION__})
> +        http_config = get_http_config()

Here, we can just do http_config = get_config_from_git(r'http\..*') and it
will be stored in REQSESSION.

> +        if 'proxy' in http_config:
> +            REQSESSION.proxies.update({'https': http_config['proxy']})
> +
>      return REQSESSION

-K
David Vernet Jan. 18, 2022, 6:14 p.m. UTC | #2
Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote on Tue [2022-Jan-18 13:00:06 -0500]:
> Hmm... I'm not super keen on this addition, because git has a lot more than
> just http.proxy setting and supporting all of it would be a huge pain. Is
> there a reason why simply setting HTTPS_PROXY environment variable is not
> sufficient for your needs?

Understood -- using HTTPS_PROXY works perfectly fine as well. I thought
this could be useful to allow callers to configure their git environment
and forget, but if you're worried about it being a headache to support it
then it's OK with me to drop the patch (it's arguably nothing a bash alias
couldn't solve anyways).

> We already track REQSESSION as a global caching var, so we don't need to
> introduce another global tracking var HTTP_CONFIG (it's not saving us any
> calls to git-config, as we'll already only do it once per b4 invocation).

Good point. Thanks for the review + suggestions. If the preference is to
just use HTTPS_PROXY then we can just consider this patch withdrawn rather
than my sending a v2.

- David
Konstantin Ryabitsev Jan. 18, 2022, 6:28 p.m. UTC | #3
On Tue, Jan 18, 2022 at 10:14:31AM -0800, David Vernet wrote:
> Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote on Tue [2022-Jan-18 13:00:06 -0500]:
> > Hmm... I'm not super keen on this addition, because git has a lot more than
> > just http.proxy setting and supporting all of it would be a huge pain. Is
> > there a reason why simply setting HTTPS_PROXY environment variable is not
> > sufficient for your needs?
> 
> Understood -- using HTTPS_PROXY works perfectly fine as well. I thought
> this could be useful to allow callers to configure their git environment
> and forget, but if you're worried about it being a headache to support it
> then it's OK with me to drop the patch (it's arguably nothing a bash alias
> couldn't solve anyways).

My thinking is that it's best to defer to the way python libraries expect this
to work and only look at git configuration when actually touching git or when
looking up things like gpg binary paths. If we implement looking at
http.proxy, then we'll also need to consider supporting the rest of
http.proxy* configuration parameters, and that looks daunting.

Besides, there may be situations where people have http.proxy configured
*just* for git things (e.g. they have some kind of git-aware caching proxy --
I'm told they do exist) that doesn't necessarily work for non-git requests.

> > We already track REQSESSION as a global caching var, so we don't need to
> > introduce another global tracking var HTTP_CONFIG (it's not saving us any
> > calls to git-config, as we'll already only do it once per b4 invocation).
> 
> Good point. Thanks for the review + suggestions. If the preference is to
> just use HTTPS_PROXY then we can just consider this patch withdrawn rather
> than my sending a v2.

Sounds good, but if you want to sent a manpage update to mention HTTP_PROXY
and HTTPS_PROXY env vars with the link to the python-requests docs, I'll be
happy to accept that. If you do, don't bother with the rendered manpage
source, just edit the .rst and I will regenerate the manpage on my end.

Best regards,
-K
David Vernet Jan. 18, 2022, 7:42 p.m. UTC | #4
Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote on Tue [2022-Jan-18 13:28:37 -0500]:
> My thinking is that it's best to defer to the way python libraries expect this
> to work and only look at git configuration when actually touching git or when
> looking up things like gpg binary paths. If we implement looking at
> http.proxy, then we'll also need to consider supporting the rest of
> http.proxy* configuration parameters, and that looks daunting.

Ack, that's sensible and I agree with you that the environment variable is
the better approach here.

> Sounds good, but if you want to sent a manpage update to mention HTTP_PROXY
> and HTTPS_PROXY env vars with the link to the python-requests docs, I'll be
> happy to accept that. If you do, don't bother with the rendered manpage
> source, just edit the .rst and I will regenerate the manpage on my end.

Sure thing. I'll add a small 'PROXYING REQUESTS' section directly after the
'CONFIGURATION' section and send it as a separate patch with a link to this
discussion (let me know if you'd prefer it somewhere else). Apologies for
the arguably noisy first patch, and thanks again for the suggestions.
diff mbox series

Patch

diff --git a/b4/__init__.py b/b4/__init__.py
index 11c287e..0526a19 100644
--- a/b4/__init__.py
+++ b/b4/__init__.py
@@ -138,6 +138,8 @@  DEFAULT_CONFIG = {
 MAIN_CONFIG = None
 # This is git-config user.*
 USER_CONFIG = None
+# This is git-config http.*
+HTTP_CONFIG = None
 
 # Used for storing our requests session
 REQSESSION = None
@@ -2101,12 +2103,21 @@  def get_user_config():
             USER_CONFIG['name'] = udata.pw_gecos
     return USER_CONFIG
 
+def get_http_config():
+    global HTTP_CONFIG
+    if HTTP_CONFIG is None:
+        HTTP_CONFIG = get_config_from_git(r'http\..*')
+    return HTTP_CONFIG
 
 def get_requests_session():
     global REQSESSION
     if REQSESSION is None:
         REQSESSION = requests.session()
         REQSESSION.headers.update({'User-Agent': 'b4/%s' % __VERSION__})
+        http_config = get_http_config()
+        if 'proxy' in http_config:
+            REQSESSION.proxies.update({'https': http_config['proxy']})
+
     return REQSESSION