diff mbox series

scripts/libvirt_pool.sh: fix regression with pool heuristics

Message ID 20250307015312.3063619-1-mcgrof@kernel.org (mailing list archive)
State New
Headers show
Series scripts/libvirt_pool.sh: fix regression with pool heuristics | expand

Commit Message

Luis Chamberlain March 7, 2025, 1:53 a.m. UTC
Heavy users of kdevops will often want to dedicate an entire
partition so that it can be used for all guests of *all users*.
This also allows us to share *all guestfs images* so we only
have to download a guestfs image once for custom images like
Debian Trixie.

For instance:

sudo virsh pool-dumpxml xfs1 | grep path
    <path>/xfs1</path>

If we go back in history we can see this used to work, let's try
with the first repo's tree's history, we *used* to get:

mcgrof@big-box /xfs1/mcgrof/debug-pools/kdevops (git::main)$
make mrproper; make defconfig-kernel-hacking > /dev/null 2>&1
grep CONFIG_KDEVOPS_STORAGE_POOL_PATH .config
CONFIG_KDEVOPS_STORAGE_POOL_PATH="/xfs1"

Now we get:

mcgrof@big-box /xfs1/mcgrof/debug-pools/kdevops (git::main)$
make mrproper; make defconfig-kernel-hacking > /dev/null 2>&1
grep CONFIG_KDEVOPS_STORAGE_POOL_PATH .config
CONFIG_KDEVOPS_STORAGE_POOL_PATH="/xfs1/mcgrof/debug-pools/kdevops/default

So now we lie, because the heuristic to see if we can use sudo for virsh
pool stuff is broken, so we try to create a new pool every single time
the user uses kdevops.  I bisected this down to commit bc731be7f131
("scripts: Adjust heuristic to see if current user can sudo"), so the
pool heuristic has been broken since January 13th.

That also means we were forcing down each new CI in place the download
of a new debian trixie image.

There's a few issues with this new heuristic:

1) CAN_SUDO=get_can_sudo
   Well that's just a string, not an execution so we never even
   ran this new routine.

2) It's a good thing we never ran it anyway because the routine
   calls exit

3) The sudo heuristic was broken as it assumed the -eq would capture
   previous routine's exit call but, $(foo) it actually just captures
   the output. So it could never have worked.

Fix the heuristics. Even though a first reaction may be to see if
we can move away from these shell heuristics with ansible, these
helpers are used by scripts we use to set default variables inside
Kconfig strings and bools so ansible can't be used unless we're
comfortable with the idea of calling ansible as part of a local
playbook to do some local work. Maybe that is not crazy idea after all.

Cc: Swarna Prabhu <s.prabhu@samsung.com>
Cc: Joel Granados <joel.granados@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---

I already applied the patch and pushe dit as its a fix I've been dying
to have for a while now, I just had no time to bisect until now. Posting to
the list for posterity.

 scripts/libvirt_pool.sh | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

Comments

Daniel Gomez March 7, 2025, 1:21 p.m. UTC | #1
On Thu, Mar 06, 2025 at 05:53:12PM +0100, Luis Chamberlain wrote:
> Heavy users of kdevops will often want to dedicate an entire
> partition so that it can be used for all guests of *all users*.
> This also allows us to share *all guestfs images* so we only
> have to download a guestfs image once for custom images like
> Debian Trixie.
> 
> For instance:
> 
> sudo virsh pool-dumpxml xfs1 | grep path
>     <path>/xfs1</path>
> 
> If we go back in history we can see this used to work, let's try
> with the first repo's tree's history, we *used* to get:
> 
> mcgrof@big-box /xfs1/mcgrof/debug-pools/kdevops (git::main)$
> make mrproper; make defconfig-kernel-hacking > /dev/null 2>&1
> grep CONFIG_KDEVOPS_STORAGE_POOL_PATH .config
> CONFIG_KDEVOPS_STORAGE_POOL_PATH="/xfs1"
> 
> Now we get:
> 
> mcgrof@big-box /xfs1/mcgrof/debug-pools/kdevops (git::main)$
> make mrproper; make defconfig-kernel-hacking > /dev/null 2>&1
> grep CONFIG_KDEVOPS_STORAGE_POOL_PATH .config
> CONFIG_KDEVOPS_STORAGE_POOL_PATH="/xfs1/mcgrof/debug-pools/kdevops/default
> 
> So now we lie, because the heuristic to see if we can use sudo for virsh
> pool stuff is broken, so we try to create a new pool every single time
> the user uses kdevops.  I bisected this down to commit bc731be7f131
> ("scripts: Adjust heuristic to see if current user can sudo"), so the
> pool heuristic has been broken since January 13th.
> 
> That also means we were forcing down each new CI in place the download
> of a new debian trixie image.
> 
> There's a few issues with this new heuristic:
> 
> 1) CAN_SUDO=get_can_sudo
>    Well that's just a string, not an execution so we never even
>    ran this new routine.
> 
> 2) It's a good thing we never ran it anyway because the routine
>    calls exit
> 
> 3) The sudo heuristic was broken as it assumed the -eq would capture
>    previous routine's exit call but, $(foo) it actually just captures
>    the output. So it could never have worked.
> 
> Fix the heuristics. Even though a first reaction may be to see if
> we can move away from these shell heuristics with ansible, these
> helpers are used by scripts we use to set default variables inside
> Kconfig strings and bools so ansible can't be used unless we're
> comfortable with the idea of calling ansible as part of a local
> playbook to do some local work. Maybe that is not crazy idea after all.
> 
> Cc: Swarna Prabhu <s.prabhu@samsung.com>
> Cc: Joel Granados <joel.granados@kernel.org>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
> 
> I already applied the patch and pushe dit as its a fix I've been dying
> to have for a while now, I just had no time to bisect until now. Posting to
> the list for posterity.
> 
>  scripts/libvirt_pool.sh | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/scripts/libvirt_pool.sh b/scripts/libvirt_pool.sh
> index ffae92817ca8..ce25fdd6e552 100644
> --- a/scripts/libvirt_pool.sh
> +++ b/scripts/libvirt_pool.sh
> @@ -13,12 +13,10 @@ get_can_sudo()
>  	#      2. With sudo it has two messages: one for paswordless sudo and
>  	#         one passwordfull sudo. But we ignore the distinction as both
>  	#         of these mean that can_sudo is "y".
> -	if [[ $(sudo -nv 2>&1 | grep 'may not' > /dev/null) -eq 0 ]]; then
> +	if sudo -nv 2>&1 | grep -q 'may not'; then
>  		echo "n"
> -		exit

We are returning "n\ny" in case we cannot sudo. It's fine because we only check
when != "y". But I think it's a bit more clean to return here.

>  	fi
>  	echo "y"
> -	exit
>  }
>  
>  get_pool_vars()
> @@ -30,7 +28,7 @@ get_pool_vars()
>  		fi
>  	fi
>  
> -	CAN_SUDO=get_can_sudo
> +	CAN_SUDO="$(get_can_sudo)"
>  
>  	if [[ "$USES_QEMU_USER_SESSION" != "y" ]]; then
>  		REQ_SUDO="sudo"
> -- 
> 2.47.2
>
Luis Chamberlain March 7, 2025, 4:59 p.m. UTC | #2
On Fri, Mar 07, 2025 at 02:21:12PM +0100, Daniel Gomez wrote:
> On Thu, Mar 06, 2025 at 05:53:12PM +0100, Luis Chamberlain wrote:
> > Heavy users of kdevops will often want to dedicate an entire
> > partition so that it can be used for all guests of *all users*.
> > This also allows us to share *all guestfs images* so we only
> > have to download a guestfs image once for custom images like
> > Debian Trixie.
> > 
> > For instance:
> > 
> > sudo virsh pool-dumpxml xfs1 | grep path
> >     <path>/xfs1</path>
> > 
> > If we go back in history we can see this used to work, let's try
> > with the first repo's tree's history, we *used* to get:
> > 
> > mcgrof@big-box /xfs1/mcgrof/debug-pools/kdevops (git::main)$
> > make mrproper; make defconfig-kernel-hacking > /dev/null 2>&1
> > grep CONFIG_KDEVOPS_STORAGE_POOL_PATH .config
> > CONFIG_KDEVOPS_STORAGE_POOL_PATH="/xfs1"
> > 
> > Now we get:
> > 
> > mcgrof@big-box /xfs1/mcgrof/debug-pools/kdevops (git::main)$
> > make mrproper; make defconfig-kernel-hacking > /dev/null 2>&1
> > grep CONFIG_KDEVOPS_STORAGE_POOL_PATH .config
> > CONFIG_KDEVOPS_STORAGE_POOL_PATH="/xfs1/mcgrof/debug-pools/kdevops/default
> > 
> > So now we lie, because the heuristic to see if we can use sudo for virsh
> > pool stuff is broken, so we try to create a new pool every single time
> > the user uses kdevops.  I bisected this down to commit bc731be7f131
> > ("scripts: Adjust heuristic to see if current user can sudo"), so the
> > pool heuristic has been broken since January 13th.
> > 
> > That also means we were forcing down each new CI in place the download
> > of a new debian trixie image.
> > 
> > There's a few issues with this new heuristic:
> > 
> > 1) CAN_SUDO=get_can_sudo
> >    Well that's just a string, not an execution so we never even
> >    ran this new routine.
> > 
> > 2) It's a good thing we never ran it anyway because the routine
> >    calls exit
> > 
> > 3) The sudo heuristic was broken as it assumed the -eq would capture
> >    previous routine's exit call but, $(foo) it actually just captures
> >    the output. So it could never have worked.
> > 
> > Fix the heuristics. Even though a first reaction may be to see if
> > we can move away from these shell heuristics with ansible, these
> > helpers are used by scripts we use to set default variables inside
> > Kconfig strings and bools so ansible can't be used unless we're
> > comfortable with the idea of calling ansible as part of a local
> > playbook to do some local work. Maybe that is not crazy idea after all.
> > 
> > Cc: Swarna Prabhu <s.prabhu@samsung.com>
> > Cc: Joel Granados <joel.granados@kernel.org>
> > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> > ---
> > 
> > I already applied the patch and pushe dit as its a fix I've been dying
> > to have for a while now, I just had no time to bisect until now. Posting to
> > the list for posterity.
> > 
> >  scripts/libvirt_pool.sh | 6 ++----
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> > 
> > diff --git a/scripts/libvirt_pool.sh b/scripts/libvirt_pool.sh
> > index ffae92817ca8..ce25fdd6e552 100644
> > --- a/scripts/libvirt_pool.sh
> > +++ b/scripts/libvirt_pool.sh
> > @@ -13,12 +13,10 @@ get_can_sudo()
> >  	#      2. With sudo it has two messages: one for paswordless sudo and
> >  	#         one passwordfull sudo. But we ignore the distinction as both
> >  	#         of these mean that can_sudo is "y".
> > -	if [[ $(sudo -nv 2>&1 | grep 'may not' > /dev/null) -eq 0 ]]; then
> > +	if sudo -nv 2>&1 | grep -q 'may not'; then
> >  		echo "n"
> > -		exit
> 
> We are returning "n\ny" in case we cannot sudo. It's fine because we only check
> when != "y". But I think it's a bit more clean to return here.

Ah yes, commited yet another fix. Thanks.

  Luis
diff mbox series

Patch

diff --git a/scripts/libvirt_pool.sh b/scripts/libvirt_pool.sh
index ffae92817ca8..ce25fdd6e552 100644
--- a/scripts/libvirt_pool.sh
+++ b/scripts/libvirt_pool.sh
@@ -13,12 +13,10 @@  get_can_sudo()
 	#      2. With sudo it has two messages: one for paswordless sudo and
 	#         one passwordfull sudo. But we ignore the distinction as both
 	#         of these mean that can_sudo is "y".
-	if [[ $(sudo -nv 2>&1 | grep 'may not' > /dev/null) -eq 0 ]]; then
+	if sudo -nv 2>&1 | grep -q 'may not'; then
 		echo "n"
-		exit
 	fi
 	echo "y"
-	exit
 }
 
 get_pool_vars()
@@ -30,7 +28,7 @@  get_pool_vars()
 		fi
 	fi
 
-	CAN_SUDO=get_can_sudo
+	CAN_SUDO="$(get_can_sudo)"
 
 	if [[ "$USES_QEMU_USER_SESSION" != "y" ]]; then
 		REQ_SUDO="sudo"