Message ID | 20220225031135.4136158-1-robh@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Series | Remove URL redirect project lookup |
On Thu, Feb 24, 2022 at 9:11 PM Rob Herring <robh@kernel.org> wrote:
>
> Now that lore indexes all messages, there's no need to lookup the project
> for the message-id. If the project is not specified, then 'all' is used.
>
> The primary benefit of this change is that cached accesses can now work
> offline instead of splatting with a network error.
>
> Signed-off-by: Rob Herring <robh@kernel.org>
> ---
>
> My usecase is twofold. First I want to speed up opening a thread by
> having it fetched in the background and cached. Second, I want to be
> able to work offline by fetching a list of threads (my PW queue) in
> advance and using the offline copy. With a sufficiently long cache
> timeout, the cache works perfectly for this use. Though maybe a 'use the
> cache if there's a network failure' mode is needed instead of always
> timing out the cache.
>
> I also have this working using b4 to fetch my queue to an mbox and
> then using the 'use local mbox' option. This mostly works except for
> the handling of 'From ' in message bodies which is problematic for mbox
> format. The cache manages to avoid this problem.
>
> Rob
>
>  b4/__init__.py | 23 +++++++----------------
>  1 file changed, 7 insertions(+), 16 deletions(-)

Ping!

>
> diff --git a/b4/__init__.py b/b4/__init__.py
> index 0d506bbaa649..ec1a6da44144 100644
> --- a/b4/__init__.py
> +++ b/b4/__init__.py
> @@ -2235,6 +2235,9 @@ def get_pi_thread_by_url(t_mbx_url, nocache=False):
>          logger.critical('Grabbing thread from %s', t_mbx_url.split('://')[1])
>          session = get_requests_session()
>          resp = session.get(t_mbx_url)
> +        if resp.status_code == 404:
> +            logger.critical('That message-id is not known.')
> +            return None
>          if resp.status_code != 200:
>              logger.critical('Server returned an error: %s', resp.status_code)
>              return None
> @@ -2263,22 +2266,10 @@ def get_pi_thread_by_url(t_mbx_url, nocache=False):
>  def get_pi_thread_by_msgid(msgid, useproject=None, nocache=False, onlymsgids: Optional[set] = None):
>      qmsgid = urllib.parse.quote_plus(msgid)
>      config = get_main_config()
> -    # Grab the head from lore, to see where we are redirected
> -    midmask = config['midmask'] % qmsgid
> -    loc = urllib.parse.urlparse(midmask)
> -    if useproject:
> -        projurl = '%s://%s/%s' % (loc.scheme, loc.netloc, useproject)
> -    else:
> -        logger.info('Looking up %s', midmask)
> -        session = get_requests_session()
> -        resp = session.head(midmask)
> -        if resp.status_code < 300 or resp.status_code > 400:
> -            logger.critical('That message-id is not known.')
> -            return None
> -        # Pop msgid from the end of the redirect
> -        chunks = resp.headers['Location'].rstrip('/').split('/')
> -        projurl = '/'.join(chunks[:-1])
> -        resp.close()
> +    loc = urllib.parse.urlparse(config['midmask'])
> +    if not useproject:
> +        useproject = 'all'
> +    projurl = '%s://%s/%s' % (loc.scheme, loc.netloc, useproject)
>      t_mbx_url = '%s/%s/t.mbox.gz' % (projurl, qmsgid)
>      logger.debug('t_mbx_url=%s', t_mbx_url)
>
> --
> 2.32.0
>

On Mon, Mar 21, 2022 at 11:41:20AM -0500, Rob Herring wrote:
> > My usecase is twofold. First I want to speed up opening a thread by
> > having it fetched in the background and cached. Second, I want to be
> > able to work offline by fetching a list of threads (my PW queue) in
> > advance and using the offline copy. With a sufficiently long cache
> > timeout, the cache works perfectly for this use. Though maybe a 'use the
> > cache if there's a network failure' mode is needed instead of always
> > timing out the cache.
> >
> > I also have this working using b4 to fetch my queue to an mbox and
> > then using the 'use local mbox' option. This mostly works except for
> > the handling of 'From ' in message bodies which is problematic for mbox
> > format. The cache manages to avoid this problem.
> >
> > Rob
> >
> >  b4/__init__.py | 23 +++++++----------------
> >  1 file changed, 7 insertions(+), 16 deletions(-)
>
> Ping!

Sorry, I've been a bit verklempt about things over the past few weeks. I'll
try to get to outstanding patches in very short order.

Best regards,
-K

On Thu, Feb 24, 2022 at 09:11:35PM -0600, Rob Herring wrote:
> Now that lore indexes all messages, there's no need to lookup the project
> for the message-id. If the project is not specified, then 'all' is used.

Rob:

Sorry for the long delay, but I did finally get around to it. I didn't quite
use your patch directly, because the goal is to also support non-lore
public-inbox installations, and they may not provide the unified index in
/all/, so we needed to keep the old lookup option available.

The change is in master as commit bfe5df6694c8115fa8402943b125c6e47c8eec08.

Thanks,
-K

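The trade-off described in the reply above (use /all/ where a unified index exists, otherwise fall back to the redirect lookup) can be sketched roughly as follows. This is not the code from the referenced commit: `resolve_project_url`, the `has_unified_index` flag, the injected `head` callable and the `inbox.example.org` host are all hypothetical names chosen for illustration. Only the redirect-parsing logic mirrors the lines removed by the original patch.

```python
import urllib.parse
from typing import Callable, Optional, Tuple


def resolve_project_url(midmask: str, msgid: str, has_unified_index: bool,
                        head: Callable[[str], Tuple[int, Optional[str]]],
                        useproject: Optional[str] = None) -> Optional[str]:
    """Return the project base URL for msgid, or None if it is unknown.

    head(url) is a hypothetical stand-in for an HTTP HEAD request; it
    returns (status_code, Location header or None).
    """
    qmsgid = urllib.parse.quote_plus(msgid)
    loc = urllib.parse.urlparse(midmask)
    if useproject is None and has_unified_index:
        # Server is assumed to expose a unified index: no lookup needed.
        useproject = 'all'
    if useproject:
        return '%s://%s/%s' % (loc.scheme, loc.netloc, useproject)
    # No unified index: ask the server where the message-id redirects to.
    status, location = head(midmask % qmsgid)
    if status < 300 or status > 400 or not location:
        return None
    # Pop the msgid from the end of the redirect target.
    chunks = location.rstrip('/').split('/')
    return '/'.join(chunks[:-1])


def fake_head(url: str) -> Tuple[int, Optional[str]]:
    # Pretend the server redirects /r/<msgid> to /some-list/<msgid>/.
    return 302, 'https://inbox.example.org/some-list/%s/' % url.rsplit('/', 1)[1]


print(resolve_project_url('https://inbox.example.org/r/%s',
                          'abc@example.org', False, fake_head))
```

Passing the HEAD request in as a callable keeps the sketch runnable offline. With `has_unified_index=True` the callable is never invoked, which is the property that lets cached threads be opened without network access.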
diff --git a/b4/__init__.py b/b4/__init__.py
index 0d506bbaa649..ec1a6da44144 100644
--- a/b4/__init__.py
+++ b/b4/__init__.py
@@ -2235,6 +2235,9 @@ def get_pi_thread_by_url(t_mbx_url, nocache=False):
         logger.critical('Grabbing thread from %s', t_mbx_url.split('://')[1])
         session = get_requests_session()
         resp = session.get(t_mbx_url)
+        if resp.status_code == 404:
+            logger.critical('That message-id is not known.')
+            return None
         if resp.status_code != 200:
             logger.critical('Server returned an error: %s', resp.status_code)
             return None
@@ -2263,22 +2266,10 @@ def get_pi_thread_by_url(t_mbx_url, nocache=False):
 def get_pi_thread_by_msgid(msgid, useproject=None, nocache=False, onlymsgids: Optional[set] = None):
     qmsgid = urllib.parse.quote_plus(msgid)
     config = get_main_config()
-    # Grab the head from lore, to see where we are redirected
-    midmask = config['midmask'] % qmsgid
-    loc = urllib.parse.urlparse(midmask)
-    if useproject:
-        projurl = '%s://%s/%s' % (loc.scheme, loc.netloc, useproject)
-    else:
-        logger.info('Looking up %s', midmask)
-        session = get_requests_session()
-        resp = session.head(midmask)
-        if resp.status_code < 300 or resp.status_code > 400:
-            logger.critical('That message-id is not known.')
-            return None
-        # Pop msgid from the end of the redirect
-        chunks = resp.headers['Location'].rstrip('/').split('/')
-        projurl = '/'.join(chunks[:-1])
-        resp.close()
+    loc = urllib.parse.urlparse(config['midmask'])
+    if not useproject:
+        useproject = 'all'
+    projurl = '%s://%s/%s' % (loc.scheme, loc.netloc, useproject)
     t_mbx_url = '%s/%s/t.mbox.gz' % (projurl, qmsgid)
     logger.debug('t_mbx_url=%s', t_mbx_url)

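To see what the new code path in the diff above produces, the URL construction can be pulled out into a standalone function. This is a minimal sketch: `build_thread_url` is a hypothetical helper name, and the `midmask` value passed in the example is an assumed configuration value, not something taken from the patch.

```python
import urllib.parse
from typing import Optional


def build_thread_url(midmask: str, msgid: str,
                     useproject: Optional[str] = None) -> str:
    """Build the t.mbox.gz URL for msgid, defaulting the project to 'all'."""
    qmsgid = urllib.parse.quote_plus(msgid)
    # Only the scheme and host of midmask are used; its path is ignored.
    loc = urllib.parse.urlparse(midmask)
    if not useproject:
        useproject = 'all'
    projurl = '%s://%s/%s' % (loc.scheme, loc.netloc, useproject)
    return '%s/%s/t.mbox.gz' % (projurl, qmsgid)


# Assumed midmask value, for illustration only.
url = build_thread_url('https://lore.kernel.org/r/%s',
                       '20220225031135.4136158-1-robh@kernel.org')
print(url)
# https://lore.kernel.org/all/20220225031135.4136158-1-robh%40kernel.org/t.mbox.gz
```

Because the URL is now computed purely from the configuration and the message-id, it is stable across runs and needs no network round trip, so it can serve directly as a cache key.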
Now that lore indexes all messages, there's no need to lookup the project
for the message-id. If the project is not specified, then 'all' is used.

The primary benefit of this change is that cached accesses can now work
offline instead of splatting with a network error.

Signed-off-by: Rob Herring <robh@kernel.org>
---

My usecase is twofold. First I want to speed up opening a thread by
having it fetched in the background and cached. Second, I want to be
able to work offline by fetching a list of threads (my PW queue) in
advance and using the offline copy. With a sufficiently long cache
timeout, the cache works perfectly for this use. Though maybe a 'use the
cache if there's a network failure' mode is needed instead of always
timing out the cache.

I also have this working using b4 to fetch my queue to an mbox and
then using the 'use local mbox' option. This mostly works except for
the handling of 'From ' in message bodies which is problematic for mbox
format. The cache manages to avoid this problem.

Rob

 b4/__init__.py | 23 +++++++----------------
 1 file changed, 7 insertions(+), 16 deletions(-)
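The 'From ' problem mentioned in the note above comes from mbox using lines that start with 'From ' as message separators. The sketch below illustrates the ambiguity with a deliberately naive splitter, then shows mboxrd-style quoting as one common workaround. Neither function is taken from b4 or its cache code; `naive_split` and `escape_from_lines` are hypothetical names, and real mbox readers are usually somewhat stricter about what counts as a separator.

```python
import re


def naive_split(mbox_text: str) -> list:
    """Split on every line starting with 'From ', as a naive reader would."""
    parts = re.split(r'(?m)^From ', mbox_text)
    return [p for p in parts if p.strip()]


def escape_from_lines(body: str) -> str:
    """mboxrd-style quoting: prefix 'From ' and '>From ' lines with '>'."""
    return re.sub(r'(?m)^(>*From )', r'>\1', body)


envelope = 'From sender@example.org Thu Feb 24 21:11:35 2022\nSubject: test\n\n'
body = 'Hello\nFrom my point of view, this is fine.\n'

# Unescaped, the single message is wrongly split in two.
print(len(naive_split(envelope + body)))
# Escaped, the body line no longer looks like a separator.
print(len(naive_split(envelope + escape_from_lines(body))))
```

Storing each message separately, which is what a per-message cache amounts to, sidesteps the issue because no in-band separator is needed.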