diff mbox series

[2/3] tests/functional/asset: Verify downloaded size

Message ID 20250312051739.938441-3-npiggin@gmail.com (mailing list archive)
State New
Headers show
Series tests/functional/asset: improve partial-download handling | expand

Commit Message

Nicholas Piggin March 12, 2025, 5:17 a.m. UTC
If the server provides a Content-Length header, use that to verify the
size of the downloaded file. This catches cases where the connection
terminates early, and gives the opportunity to retry. Without this, the
checksum will likely mismatch and fail without retry.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 tests/functional/qemu_test/asset.py | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Thomas Huth March 12, 2025, 6:49 a.m. UTC | #1
On 12/03/2025 06.17, Nicholas Piggin wrote:
> If the server provides a Content-Length header, use that to verify the
> size of the downloaded file. This catches cases where the connection
> terminates early, and gives the opportunity to retry. Without this, the
> checksum will likely mismatch and fail without retry.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   tests/functional/qemu_test/asset.py | 16 ++++++++++++++++
>   1 file changed, 16 insertions(+)
> 
> diff --git a/tests/functional/qemu_test/asset.py b/tests/functional/qemu_test/asset.py
> index 6a1c92ffbef..d34e8f5e2ad 100644
> --- a/tests/functional/qemu_test/asset.py
> +++ b/tests/functional/qemu_test/asset.py
> @@ -124,6 +124,22 @@ def fetch(self):
>                   with tmp_cache_file.open("xb") as dst:
>                       with urllib.request.urlopen(self.url) as resp:
>                           copyfileobj(resp, dst)
> +                        length_hdr = resp.getheader("Content-Length")
> +
> +                # Verify downloaded file size against length metadata, if
> +                # available. dst must be out of scope before testing st_size
> +                # because # copyfileobj returns before all buffers are

remove the "#" after "because" ?

> +                # flushed to filesystem.

As far as I understood Python, that's the main reason for using "with" in 
cases like this here, so I'd maybe even rather scratch the last sentence
completely.

> +                if length_hdr:
> +                    length = int(length_hdr)
> +                    if tmp_cache_file.stat().st_size != length:
> +                        print("st_size %ld", tmp_cache_file.stat().st_size)

Debug left-over?

> +                        self.log.error("Unable to download %s: "
> +                                       "connection closed before "
> +                                       "transfer complete",
> +                                       self.url)

Maybe it would be good to include st_size and length in the log message instead?

> +                        tmp_cache_file.unlink()
> +                        continue
>                   break
>               except FileExistsError:
>                   self.log.debug("%s already exists, "

  Thomas
diff mbox series

Patch

diff --git a/tests/functional/qemu_test/asset.py b/tests/functional/qemu_test/asset.py
index 6a1c92ffbef..d34e8f5e2ad 100644
--- a/tests/functional/qemu_test/asset.py
+++ b/tests/functional/qemu_test/asset.py
@@ -124,6 +124,22 @@  def fetch(self):
                 with tmp_cache_file.open("xb") as dst:
                     with urllib.request.urlopen(self.url) as resp:
                         copyfileobj(resp, dst)
+                        length_hdr = resp.getheader("Content-Length")
+
+                # Verify downloaded file size against length metadata, if
+                # available. dst must be out of scope before testing st_size
+                # because # copyfileobj returns before all buffers are
+                # flushed to filesystem.
+                if length_hdr:
+                    length = int(length_hdr)
+                    if tmp_cache_file.stat().st_size != length:
+                        print("st_size %ld", tmp_cache_file.stat().st_size)
+                        self.log.error("Unable to download %s: "
+                                       "connection closed before "
+                                       "transfer complete",
+                                       self.url)
+                        tmp_cache_file.unlink()
+                        continue
                 break
             except FileExistsError:
                 self.log.debug("%s already exists, "