
[1/1] tests/acceptance: change armbian archive to a faster host

Message ID 20210526205601.263444-2-willianr@redhat.com (mailing list archive)
State New, archived
Series tests/acceptance: change armbian archive to a faster host

Commit Message

Willian Rampazzo May 26, 2021, 8:56 p.m. UTC
The current host for the image
Armbian_20.08.1_Orangepipc_bionic_current_5.8.5.img.xz
(archive.armbian.com) has been extremely slow for the last couple of
weeks. When the job runs the test
tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_bionic_20_08
for the first time and the image is not yet in the GitLab cache, the
job times out while the image is being downloaded.

This changes the host to a faster one, so new users with an empty cache
are not impacted.

Signed-off-by: Willian Rampazzo <willianr@redhat.com>
---
 tests/acceptance/boot_linux_console.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Cleber Rosa May 26, 2021, 11:41 p.m. UTC | #1
On Wed, May 26, 2021 at 05:56:01PM -0300, Willian Rampazzo wrote:
> The current host for the image
> Armbian_20.08.1_Orangepipc_bionic_current_5.8.5.img.xz
> (archive.armbian.com) is extremely slow in the last couple of weeks,
> making the job running the test
> tests/system/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_bionic_20_08
> for the first time when the image is not yet on GitLab cache, time out
> while the image is being downloaded.
> 
> This changes the host to one faster, so new users with an empty cache
> are not impacted.
> 
> Signed-off-by: Willian Rampazzo <willianr@redhat.com>
> ---
>  tests/acceptance/boot_linux_console.py | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py
> index 276a53f146..51c23b822c 100644
> --- a/tests/acceptance/boot_linux_console.py
> +++ b/tests/acceptance/boot_linux_console.py
> @@ -804,7 +804,8 @@ def test_arm_orangepi_bionic_20_08(self):
>          # to 1036 MiB, but the underlying filesystem is 1552 MiB...
>          # As we expand it to 2 GiB we are safe.
>  
> -        image_url = ('https://archive.armbian.com/orangepipc/archive/'
> +        image_url = ('https://armbian.systemonachip.net/'
> +                     'archive/orangepipc/archive/'

Hi Willian,

I was so annoyed by my pipeline failures that I came up with:

   https://gitlab.com/cleber.gnu/qemu/-/commit/917b3e376e682e9c35c6f7f597ffca110c719e13

to prove that it was a GitLab <-> archive.armbian.com issue.  But I
wonder:

 1. how susceptible to the same situation is this other mirror?
 2. how trustworthy is this mirror, say, stability wise? Maybe
    people in the armbian community would have some info?

Depending on the feedback we get about it, this can be a very valid
hotfix/workaround indeed.  But the core issues we need to look into
are:

 a. applying a timeout when fetching assets.  If the asset fails to be
    fetched within the timeout, the test simply gets canceled.

 b. evaluate the use of the multiple "locations" support that the
    avocado.utils.asset library has (and improve it if necessary).
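
A minimal sketch of point (a), not Avocado's implementation: it assumes
a plain urllib download, and `fetch_with_deadline`, `FetchTimeout`, and
the `opener`/`clock` parameters are hypothetical names used only for
illustration.  The socket timeout bounds each blocking call rather than
the whole transfer, so the elapsed time is also checked between chunks.

```python
import os
import time
import urllib.request


class FetchTimeout(Exception):
    """Raised when an asset is not fetched within the allowed time."""


def fetch_with_deadline(url, dest, deadline_s, opener=urllib.request.urlopen,
                        clock=time.monotonic, chunk_size=64 * 1024):
    """Download url to dest, giving up once deadline_s seconds have elapsed.

    Assumption: a single slow read can still overrun the deadline a bit,
    because the check only happens between chunks.
    """
    start = clock()
    partial = dest + ".part"  # never leave a truncated file under the final name
    try:
        with opener(url, timeout=deadline_s) as response, \
                open(partial, "wb") as out:
            while True:
                if clock() - start > deadline_s:
                    raise FetchTimeout(
                        f"{url}: not fetched within {deadline_s}s")
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
    except BaseException:
        # Both files are closed here; drop the partial download and re-raise.
        if os.path.exists(partial):
            os.remove(partial)
        raise
    os.replace(partial, dest)
    return dest
```

A caller could catch `FetchTimeout` and cancel the test instead of
letting the job hang on the download.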

Anyway, thanks for looking into this, and let's wait a bit for
feedback.

- Cleber.
Willian Rampazzo May 27, 2021, 1:45 p.m. UTC | #2
On Wed, May 26, 2021 at 8:41 PM Cleber Rosa <crosa@redhat.com> wrote:
>
> On Wed, May 26, 2021 at 05:56:01PM -0300, Willian Rampazzo wrote:
> > The current host for the image
> > Armbian_20.08.1_Orangepipc_bionic_current_5.8.5.img.xz
> > (archive.armbian.com) is extremely slow in the last couple of weeks,
> > making the job running the test
> > tests/system/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_bionic_20_08
> > for the first time when the image is not yet on GitLab cache, time out
> > while the image is being downloaded.
> >
> > This changes the host to one faster, so new users with an empty cache
> > are not impacted.
> >
> > Signed-off-by: Willian Rampazzo <willianr@redhat.com>
> > ---
> >  tests/acceptance/boot_linux_console.py | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py
> > index 276a53f146..51c23b822c 100644
> > --- a/tests/acceptance/boot_linux_console.py
> > +++ b/tests/acceptance/boot_linux_console.py
> > @@ -804,7 +804,8 @@ def test_arm_orangepi_bionic_20_08(self):
> >          # to 1036 MiB, but the underlying filesystem is 1552 MiB...
> >          # As we expand it to 2 GiB we are safe.
> >
> > -        image_url = ('https://archive.armbian.com/orangepipc/archive/'
> > +        image_url = ('https://armbian.systemonachip.net/'
> > +                     'archive/orangepipc/archive/'
>
> Hi Willian,
>
> I was pretty annoyed by my pipeline failures, that I came up with:
>
>    https://gitlab.com/cleber.gnu/qemu/-/commit/917b3e376e682e9c35c6f7f597ffca110c719e13
>
> To prove that it was a GitLab <-> archive.armbian.com issue.

When I tried both links, the slow one and this new one, on my machine,
I could see that the slow link is also slow locally. Not as slow as on
GitLab, but 10 times slower than the new link. I was thinking about
opening an issue on GitLab. In the worst case, they will say it is not
their fault, but a problem on the other end.

> But I wonder:
>
>  1. how susceptible to the same situation is this other mirror?

Unfortunately, tests that depend on external artifacts will always run
into this kind of situation. Unless GitLab is doing traffic shaping, we
will never know how susceptible an external server is to any kind of
instability.

>  2. how trustworthy is this mirror, say, stability wise? Maybe
>     people in the armbian community would have some info?

This new link is the same one that the "Archived versions" entry on
https://www.armbian.com/orange-pi-pc/ points to, so I consider it an
official Armbian mirror. That's why I did not think much about
changing it.

Now, stability wise, we never know :) I don't think we have this
answer for any of the links related to external artifacts QEMU
acceptance tests use.

>
> Depending on the feedback we get about, this can be a very valid
> hotfix/workaround indeed.  But the core issues we need to look into
> are:
>
>  a. applying a timeout when fetching assets.  If the asset fails to be
>     fetched within the timeout, the test simply gets canceled.

But this is failing during the download, before the test starts, in
the pre-phase. At that point the test suite has not been created and
Avocado doesn't have an asset <=> test mapping yet.

>
>  b. evaluate the use of the multiple "locations" support that the
>     avocado.utils.asset library has (and improve it if necessary).
>

This may be an option, combined with a per-location timeout: if the
download from one location times out, try another.
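
A rough sketch of that fallback, assuming nothing about how
avocado.utils.asset handles its "locations" internally:
`fetch_first_available`, `AssetUnavailable`, and `_download` are
hypothetical names, and the two URLs are simply the old host and the
new mirror from this patch.

```python
import urllib.request

IMAGE = 'Armbian_20.08.1_Orangepipc_bionic_current_5.8.5.img.xz'
LOCATIONS = [
    'https://archive.armbian.com/orangepipc/archive/' + IMAGE,
    'https://armbian.systemonachip.net/archive/orangepipc/archive/' + IMAGE,
]


class AssetUnavailable(Exception):
    """Raised when every location failed; keeps the per-location errors."""

    def __init__(self, errors):
        super().__init__("no location served the asset: "
                         + ", ".join(url for url, _ in errors))
        self.errors = errors


def _download(url, timeout_s):
    # Reads the whole body into memory for brevity; a real fetcher
    # would stream to disk.
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return response.read()


def fetch_first_available(locations, timeout_s, fetch_one=_download):
    """Try each location in order and return the first successful result."""
    errors = []
    for url in locations:
        try:
            return fetch_one(url, timeout_s)
        except OSError as exc:  # covers TimeoutError and urllib's URLError
            errors.append((url, exc))
    raise AssetUnavailable(errors)
```

Keeping the per-location errors makes it possible to report which
mirror was slow or down when every attempt fails.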

> Anyway, thanks for looking into this, and let's wait a bit for
> feedback.
>
> - Cleber.
Wainer dos Santos Moschetta May 27, 2021, 6:11 p.m. UTC | #3
Hi,

On 5/26/21 5:56 PM, Willian Rampazzo wrote:
> The current host for the image
> Armbian_20.08.1_Orangepipc_bionic_current_5.8.5.img.xz
> (archive.armbian.com) is extremely slow in the last couple of weeks,
> making the job running the test
> tests/system/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_bionic_20_08
> for the first time when the image is not yet on GitLab cache, time out
> while the image is being downloaded.
>
> This changes the host to one faster, so new users with an empty cache
> are not impacted.

Here the old host performed slightly better: download time of 0:17:36 vs 
0:19:44. Maybe it was a temporary issue with the old host or maybe 
GitLab's runner network?

Anyway,

Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>


>
> Signed-off-by: Willian Rampazzo <willianr@redhat.com>
> ---
>   tests/acceptance/boot_linux_console.py | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py
> index 276a53f146..51c23b822c 100644
> --- a/tests/acceptance/boot_linux_console.py
> +++ b/tests/acceptance/boot_linux_console.py
> @@ -804,7 +804,8 @@ def test_arm_orangepi_bionic_20_08(self):
>           # to 1036 MiB, but the underlying filesystem is 1552 MiB...
>           # As we expand it to 2 GiB we are safe.
>   
> -        image_url = ('https://archive.armbian.com/orangepipc/archive/'
> +        image_url = ('https://armbian.systemonachip.net/'
> +                     'archive/orangepipc/archive/'
>                        'Armbian_20.08.1_Orangepipc_bionic_current_5.8.5.img.xz')
>           image_hash = ('b4d6775f5673486329e45a0586bf06b6'
>                         'dbe792199fd182ac6b9c7bb6c7d3e6dd')
Cleber Rosa June 2, 2021, 2:08 p.m. UTC | #4
On Thu, May 27, 2021 at 9:45 AM Willian Rampazzo <wrampazz@redhat.com> wrote:

> On Wed, May 26, 2021 at 8:41 PM Cleber Rosa <crosa@redhat.com> wrote:
> >
> > On Wed, May 26, 2021 at 05:56:01PM -0300, Willian Rampazzo wrote:
> > > The current host for the image
> > > Armbian_20.08.1_Orangepipc_bionic_current_5.8.5.img.xz
> > > (archive.armbian.com) is extremely slow in the last couple of weeks,
> > > making the job running the test
> > > tests/system/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_bionic_20_08
> > > for the first time when the image is not yet on GitLab cache, time out
> > > while the image is being downloaded.
> > >
> > > This changes the host to one faster, so new users with an empty cache
> > > are not impacted.
> > >
> > > Signed-off-by: Willian Rampazzo <willianr@redhat.com>
> > > ---
> > >  tests/acceptance/boot_linux_console.py | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py
> > > index 276a53f146..51c23b822c 100644
> > > --- a/tests/acceptance/boot_linux_console.py
> > > +++ b/tests/acceptance/boot_linux_console.py
> > > @@ -804,7 +804,8 @@ def test_arm_orangepi_bionic_20_08(self):
> > >          # to 1036 MiB, but the underlying filesystem is 1552 MiB...
> > >          # As we expand it to 2 GiB we are safe.
> > >
> > > -        image_url = ('https://archive.armbian.com/orangepipc/archive/'
> > > +        image_url = ('https://armbian.systemonachip.net/'
> > > +                     'archive/orangepipc/archive/'
> >
> > Hi Willian,
> >
> > I was pretty annoyed by my pipeline failures, that I came up with:
> >
> >
> https://gitlab.com/cleber.gnu/qemu/-/commit/917b3e376e682e9c35c6f7f597ffca110c719e13
> >
> > To prove that it was a GitLab <-> archive.arbian.com issue.
>
> When I tried both links, the slow link, and this new link, on my
> machine, I could see the slow link is also slow locally. Not as slow
> as on GitLab, but 10 times slower than this new link. I was thinking
> about open an issue on GitLab. In the worst case, they will say it is
> not their fault, but a problem on the other end.
>
> > But I wonder:
> >
> >  1. how susceptible to the same situation is this other mirror?
>
> Unfortunately, having tests depending on external artifacts will bring
> this kind of situation. Unless GitLab is doing traffic shaping, we
> will never know how susceptible an external server is to any kind of
> instability.
>
> >  2. how trustworthy is this mirror, say, stability wise? Maybe
> >     people in the armbian community would have some info?
>
> This new link is the same link that
> https://www.armbian.com/orange-pi-pc/ "Archived versions" is pointing,
> so I consider it an official mirror from Armbian. That's why I have
> not thought much about changing it.
>
> Now, stability wise, we never know :) I don't think we have this
> answer for any of the links related to external artifacts QEMU
> acceptance tests use.
>
> >
> > Depending on the feedback we get about, this can be a very valid
> > hotfix/workaround indeed.  But the core issues we need to look into
> > are:
> >
> >  a. applying a timeout when fetching assets.  If the asset fails to be
> >     fetched within the timeout, the test simply gets canceled.
>
> But this is failing during the download before the test starts, or in
> the pre-phase. The test suite was not created and Avocado don't have a
> mapping asset <=> test yet.
>
>
Right. But my point is that if it times out, then this "best effort"
attempt would fail (but not abort the job).  Then, during the test itself,
considering `cancel_on_missing=True`, the test would also cancel when it
fails to access the asset.

A canceled test is what we want here, and not a stuck job.  That's why I
still think the timeout may be a solution.
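
To make the flow concrete, here is a small stand-alone sketch.  It does
not use Avocado: `unittest.SkipTest` stands in for a canceled test, and
`prefetch_best_effort` and `require_asset` are hypothetical helpers
mirroring the pre-phase fetch and the `cancel_on_missing=True` behavior
described above.

```python
import os
import unittest


def prefetch_best_effort(fetch, cache_path):
    """Pre-phase: try to populate the cache, never abort the job on failure."""
    try:
        fetch(cache_path)
        return True
    except OSError:  # includes TimeoutError and urllib's URLError
        return False


def require_asset(cache_path):
    """Test phase: cancel (skip) instead of hanging when the asset is absent."""
    if not os.path.exists(cache_path):
        raise unittest.SkipTest(f"missing asset: {cache_path}")
    return cache_path
```

With this split, a download that times out costs one canceled test
rather than a stuck job.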

Thanks,
- Cleber.
Willian Rampazzo June 2, 2021, 2:42 p.m. UTC | #5
On Wed, Jun 2, 2021 at 11:08 AM Cleber Rosa Junior <crosa@redhat.com> wrote:
>
>
>
> On Thu, May 27, 2021 at 9:45 AM Willian Rampazzo <wrampazz@redhat.com> wrote:
>>
>> On Wed, May 26, 2021 at 8:41 PM Cleber Rosa <crosa@redhat.com> wrote:
>> >
>> > On Wed, May 26, 2021 at 05:56:01PM -0300, Willian Rampazzo wrote:
>> > > The current host for the image
>> > > Armbian_20.08.1_Orangepipc_bionic_current_5.8.5.img.xz
>> > > (archive.armbian.com) is extremely slow in the last couple of weeks,
>> > > making the job running the test
>> > > tests/system/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_bionic_20_08
>> > > for the first time when the image is not yet on GitLab cache, time out
>> > > while the image is being downloaded.
>> > >
>> > > This changes the host to one faster, so new users with an empty cache
>> > > are not impacted.
>> > >
>> > > Signed-off-by: Willian Rampazzo <willianr@redhat.com>
>> > > ---
>> > >  tests/acceptance/boot_linux_console.py | 3 ++-
>> > >  1 file changed, 2 insertions(+), 1 deletion(-)
>> > >
>> > > diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py
>> > > index 276a53f146..51c23b822c 100644
>> > > --- a/tests/acceptance/boot_linux_console.py
>> > > +++ b/tests/acceptance/boot_linux_console.py
>> > > @@ -804,7 +804,8 @@ def test_arm_orangepi_bionic_20_08(self):
>> > >          # to 1036 MiB, but the underlying filesystem is 1552 MiB...
>> > >          # As we expand it to 2 GiB we are safe.
>> > >
>> > > -        image_url = ('https://archive.armbian.com/orangepipc/archive/'
>> > > +        image_url = ('https://armbian.systemonachip.net/'
>> > > +                     'archive/orangepipc/archive/'
>> >
>> > Hi Willian,
>> >
>> > I was pretty annoyed by my pipeline failures, that I came up with:
>> >
>> >    https://gitlab.com/cleber.gnu/qemu/-/commit/917b3e376e682e9c35c6f7f597ffca110c719e13
>> >
>> > To prove that it was a GitLab <-> archive.armbian.com issue.
>>
>> When I tried both links, the slow link, and this new link, on my
>> machine, I could see the slow link is also slow locally. Not as slow
>> as on GitLab, but 10 times slower than this new link. I was thinking
>> about open an issue on GitLab. In the worst case, they will say it is
>> not their fault, but a problem on the other end.
>>
>> > But I wonder:
>> >
>> >  1. how susceptible to the same situation is this other mirror?
>>
>> Unfortunately, having tests depending on external artifacts will bring
>> this kind of situation. Unless GitLab is doing traffic shaping, we
>> will never know how susceptible an external server is to any kind of
>> instability.
>>
>> >  2. how trustworthy is this mirror, say, stability wise? Maybe
>> >     people in the armbian community would have some info?
>>
>> This new link is the same link that
>> https://www.armbian.com/orange-pi-pc/ "Archived versions" is pointing,
>> so I consider it an official mirror from Armbian. That's why I have
>> not thought much about changing it.
>>
>> Now, stability wise, we never know :) I don't think we have this
>> answer for any of the links related to external artifacts QEMU
>> acceptance tests use.
>>
>> >
>> > Depending on the feedback we get about, this can be a very valid
>> > hotfix/workaround indeed.  But the core issues we need to look into
>> > are:
>> >
>> >  a. applying a timeout when fetching assets.  If the asset fails to be
>> >     fetched within the timeout, the test simply gets canceled.
>>
>> But this is failing during the download before the test starts, or in
>> the pre-phase. The test suite was not created and Avocado don't have a
>> mapping asset <=> test yet.
>>
>
> Right. But my point is that if it times out, then this "best effort" attempt would fail (but not abort the job).  Then, during the test itself, considering `cancel_on_missing=True`, the test would also cancel when it fails to access the asset.
>
> A canceled test is what we want here, and not a stuck job.  That's why I still think the timeout may be a solution.
>

Okay, got it! I opened an issue to track this feature:
https://github.com/avocado-framework/avocado/issues/4643

Anyway, for now, I think changing the URL gives us some time until we
have the problem again :)

> Thanks,
> - Cleber.
>

Patch

diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py
index 276a53f146..51c23b822c 100644
--- a/tests/acceptance/boot_linux_console.py
+++ b/tests/acceptance/boot_linux_console.py
@@ -804,7 +804,8 @@  def test_arm_orangepi_bionic_20_08(self):
         # to 1036 MiB, but the underlying filesystem is 1552 MiB...
         # As we expand it to 2 GiB we are safe.
 
-        image_url = ('https://archive.armbian.com/orangepipc/archive/'
+        image_url = ('https://armbian.systemonachip.net/'
+                     'archive/orangepipc/archive/'
                      'Armbian_20.08.1_Orangepipc_bionic_current_5.8.5.img.xz')
         image_hash = ('b4d6775f5673486329e45a0586bf06b6'
                       'dbe792199fd182ac6b9c7bb6c7d3e6dd')
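
Since only the host changes and image_hash stays the same, the file
served by the new mirror can be checked against the existing hash.  The
sketch below assumes the hash is a SHA-256 digest (it is 64 hex
characters long); `sha256_of` and `matches_expected` are hypothetical
helpers, not part of the test file.

```python
import hashlib

IMAGE_HASH = ('b4d6775f5673486329e45a0586bf06b6'
              'dbe792199fd182ac6b9c7bb6c7d3e6dd')


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=IMAGE_HASH):
    """True when the file at path has the expected SHA-256 digest."""
    return sha256_of(path) == expected.lower()
```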