diff mbox series

[6/6] qga/commands-posix: fix use after free of local_err

Message ID 20200324153630.11882-7-vsementsov@virtuozzo.com (mailing list archive)
State New, archived
Headers show
Series Several error use-after-free | expand

Commit Message

Vladimir Sementsov-Ogievskiy March 24, 2020, 3:36 p.m. UTC
local_err is used several times in guest_suspend(). Setting non-NULL
local_err will crash, so let's zero it after freeing. Also fix possible
leak of local_err in final if().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 qga/commands-posix.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Eric Blake March 24, 2020, 8:03 p.m. UTC | #1
On 3/24/20 10:36 AM, Vladimir Sementsov-Ogievskiy wrote:
> local_err is used several times in guest_suspend(). Setting non-NULL
> local_err will crash, so let's zero it after freeing. Also fix possible
> leak of local_err in final if().
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   qga/commands-posix.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> index 93474ff770..cc69b82704 100644
> --- a/qga/commands-posix.c
> +++ b/qga/commands-posix.c
> @@ -1773,6 +1773,7 @@ static void guest_suspend(SuspendMode mode, Error **errp)
>       }
>   
>       error_free(local_err);
> +    local_err = NULL;

Let's show this with more context.

> static void guest_suspend(SuspendMode mode, Error **errp)
> {
>     Error *local_err = NULL;
>     bool mode_supported = false;
> 
>     if (systemd_supports_mode(mode, &local_err)) {

Hmm - we have an even earlier bug that needs fixing.  Note that 
systemd_supports_mode() returns a bool AND conditionally sets errp.  But 
it is inconsistent: it has the following table of actions based on the 
results of run_process_child() on "systemctl status" coupled with the 
man page on "systemctl status" return values:
-1 (unable to run systemctl) -> errp set, return false
0 (unit is active) -> errp left unchanged, return false
1 (unit not failed) -> errp left unchanged, return true
2 (unused) -> errp left unchanged, return true
3 (unit not active) -> errp left unchanged, return true
4 (no such unit) -> errp left unchanged, return false
5+ (unexpected from systemctl) -> errp left unchanged, return false

But the comments in systemd_supports_mode() claim that ANY status < 4 
(other than -1, which means we did not run systemctl) should count as 
the service existing, even though the most common status is 3.  If our 
comment is to be believed, then we should return true, not false, for 
status 0.

Now, back to _this_ function:

>         mode_supported = true;
>         systemd_suspend(mode, &local_err);

Okay - if we get here (whether from status 1-3, or with 
systemd_supports_mode fixed to support status 0-3), local_err is still 
unset prior to calling systemd_suspend(), and we are guaranteed that 
after the call, either we suspended successfully or local_err is now set.

>     }
> 
>     if (!local_err) {
>         return;
>     }

So if returned, we succeeded at systemd_suspend, and there is nothing 
further to do; but if we get past that point, we don't know if it was 
systemd_supports_mode that failed or systemd_suspend that failed, and we 
don't know if local_err is set.

> 
>     error_free(local_err);
> +    local_err = NULL;

Yet, we blindly throw away local_err, without trying to report it.  If 
that's the case, then WHY are we passing in local_err?  Wouldn't it be 
better to pass in NULL (we really don't care about the error message), 
and/or fix systemd_suspend() to return a bool just like 
systemd_supports_mode, and/or fix systemd_supports_mode to guarantee 
that it sets errp when returning false?
Vladimir Sementsov-Ogievskiy March 25, 2020, 4:28 a.m. UTC | #2
24.03.2020 23:03, Eric Blake wrote:
> On 3/24/20 10:36 AM, Vladimir Sementsov-Ogievskiy wrote:
>> local_err is used several times in guest_suspend(). Setting non-NULL
>> local_err will crash, so let's zero it after freeing. Also fix possible
>> leak of local_err in final if().
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   qga/commands-posix.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
>> index 93474ff770..cc69b82704 100644
>> --- a/qga/commands-posix.c
>> +++ b/qga/commands-posix.c
>> @@ -1773,6 +1773,7 @@ static void guest_suspend(SuspendMode mode, Error **errp)
>>       }
>>       error_free(local_err);
>> +    local_err = NULL;
> 
> Let's show this with more context.
> 
>> static void guest_suspend(SuspendMode mode, Error **errp)
>> {
>>     Error *local_err = NULL;
>>     bool mode_supported = false;
>>
>>     if (systemd_supports_mode(mode, &local_err)) {
> 
> Hmm - we have an even earlier bug that needs fixing.  Note that systemd_supports_mode() returns a bool AND conditionally sets errp.  But it is inconsistent: it has the following table of actions based on the results of run_process_child() on "systemctl status" coupled with the man page on "systemctl status" return values:
> -1 (unable to run systemctl) -> errp set, return false
> 0 (unit is active) -> errp left unchanged, return false
> 1 (unit not failed) -> errp left unchanged, return true
> 2 (unused) -> errp left unchanged, return true
> 3 (unit not active) -> errp left unchanged, return true
> 4 (no such unit) -> errp left unchanged, return false
> 5+ (unexpected from systemctl) -> errp left unchanged, return false
> 
> But the comments in systemd_supports_mode() claim that ANY status < 4 (other than -1, which means we did not run systemctl) should count as the service existing, even though the most common status is 3.  If our comment is to be believed, then we should return true, not false, for status 0.
> 
> Now, back to _this_ function:
> 
>>         mode_supported = true;
>>         systemd_suspend(mode, &local_err);
> 
> Okay - if we get here (whether from status 1-3, or with systemd_supports_mode fixed to support status 0-3), local_err is still unset prior to calling systemd_suspend(), and we are guaranteed that after the call, either we suspended successfully or local_err is now set.
> 
>>     }
>>
>>     if (!local_err) {
>>         return;
>>     }
> 
> So if returned, we succeeded at systemd_suspend, and there is nothing further to do; but if we get past that point, we don't know if it was systemd_supports_mode that failed or systemd_suspend that failed, and we don't know if local_err is set.

No, we know that is set, as we check exactly this and return if not set.

> 
>>
>>     error_free(local_err);
>> +    local_err = NULL;
> 
> Yet, we blindly throw away local_err, without trying to report it.  If that's the case, then WHY are we passing in local_err?  Wouldn't it be better to pass in NULL (we really don't care about the error message), and/or fix systemd_suspend() to return a bool just like systemd_supports_mode, and/or fix systemd_supports_mode to guarantee that it sets errp when returning false?
> 

I agree that this is a strange function and its logic is weird. But I don't know what the logic should be. My patch is still valid to just fix obvious use-after-free and possible leak. It doesn't fix the logic.
Markus Armbruster March 31, 2020, 8:13 a.m. UTC | #3
Eric Blake <eblake@redhat.com> writes:

> On 3/24/20 10:36 AM, Vladimir Sementsov-Ogievskiy wrote:
>> local_err is used several times in guest_suspend(). Setting non-NULL
>> local_err will crash, so let's zero it after freeing. Also fix possible
>> leak of local_err in final if().
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   qga/commands-posix.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
>> index 93474ff770..cc69b82704 100644
>> --- a/qga/commands-posix.c
>> +++ b/qga/commands-posix.c
>> @@ -1773,6 +1773,7 @@ static void guest_suspend(SuspendMode mode, Error **errp)
>>       }
>>         error_free(local_err);
>> +    local_err = NULL;
>
> Let's show this with more context.
>
>> static void guest_suspend(SuspendMode mode, Error **errp)
>> {
>>     Error *local_err = NULL;
>>     bool mode_supported = false;
>>
>>     if (systemd_supports_mode(mode, &local_err)) {
>
> Hmm - we have an even earlier bug that needs fixing.  Note that
> systemd_supports_mode() returns a bool AND conditionally sets errp.
> But it is inconsistent: it has the following table of actions based on
> the results of run_process_child() on "systemctl status" coupled with
> the man page on "systemctl status" return values:
> -1 (unable to run systemctl) -> errp set, return false
> 0 (unit is active) -> errp left unchanged, return false
> 1 (unit not failed) -> errp left unchanged, return true
> 2 (unused) -> errp left unchanged, return true
> 3 (unit not active) -> errp left unchanged, return true
> 4 (no such unit) -> errp left unchanged, return false
> 5+ (unexpected from systemctl) -> errp left unchanged, return false

Three coarser cases:

* systemd_supports_mode() returned false with @local_err set
* systemd_supports_mode() returned false with @local_err clear
* systemd_supports_mode() returned true with @local_err clear

GLib specificially advises against the second case with GError:

    By convention, if you return a boolean value indicating success then
    TRUE means success and FALSE means failure.  Avoid creating
    functions which have a boolean return value and a GError parameter,
    but where the boolean does something other than signal whether the
    GError is set.

    https://developer.gnome.org/glib/stable/glib-Error-Reporting.html

In my opinion, the advice applies to our Error just as much.

> But the comments in systemd_supports_mode() claim that ANY status < 4
> (other than -1, which means we did not run systemctl) should count as
> the service existing, even though the most common status is 3.  If our
> comment is to be believed, then we should return true, not false, for
> status 0.
>
> Now, back to _this_ function:
>
>>         mode_supported = true;
>>         systemd_suspend(mode, &local_err);
>
> Okay - if we get here (whether from status 1-3, or with
> systemd_supports_mode fixed to support status 0-3), local_err is still
> unset prior to calling systemd_suspend(), and we are guaranteed that
> after the call, either we suspended successfully or local_err is now
> set.
>
>>     }

The conditional code splits the third case.  Result:

* systemd_supports_mode() returned false with @local_err set
* systemd_supports_mode() returned false with @local_err clear
* systemd_supports_mode() returned true, systemd_suspend() failed,
  @local_err set
* systemd_supports_mode() returned true, systemd_suspend() succeeded,
  @local_err clear

>>
>>     if (!local_err) {
>>         return;
>>     }
>
> So if returned, we succeeded at systemd_suspend, and there is nothing
> further to do; but if we get past that point, we don't know if it was
> systemd_supports_mode that failed or systemd_suspend that failed, and
> we don't know if local_err is set.
>
>>
>>     error_free(local_err);
>> +    local_err = NULL;

We use @local_err as one bit of information.

> Yet, we blindly throw away local_err, without trying to report it.  If
> that's the case, then WHY are we passing in local_err?  Wouldn't it be
> better to pass in NULL (we really don't care about the error message),
> and/or fix systemd_suspend() to return a bool just like
> systemd_supports_mode, and/or fix systemd_supports_mode to guarantee
> that it sets errp when returning false?

You're right, these interfaces are awkward.  They're used just here, so
there's no excuse for keeping them awkward.

Let's step back and examine what we're trying to do.  Pseudo-code:

    for method in systemd, pmutils, linux_sys_state:
        if method supports mode:
            try method
            if successful:
                return success

    if no method supports mode:
        return failure "the requested suspend mode is not supported by the guest"
    // we tried at least one method
    return the last method's failure

Observations:

1. We can abstract from the methods, or we can unroll the loop.
   Unrolling seems simpler here.

2. 'Method supports mode' is used as a simple predicate.  So make it
   one: return bool, and not take an Error ** argument.

3. The error for 'try method' is ignored except for the last try.  I'm
   not sure reporting just the last one is appropriate, but let's assume
   it is.  Preferred solution: make 'try method' return true on success,
   false on failure, ignore error (by passing null) unless we actually
   need it.  Acceptable solution: keep it void, free the Error objects
   we ignore.
Markus Armbruster March 31, 2020, 11:46 a.m. UTC | #4
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:

[...]
> I agree that this is a strange function and its logic is weird. But I
> don't know what the logic should be. My patch is still valid to just
> fix obvious use-after-free and possible leak. It doesn't fix the
> logic.

I sketched improved logic elsewhere in this thread, and I can turn that
into a patch.

I can either make it replace Vladimir's patch, or make it go on top.  If
the latter, we can apply just Vladimir's patch for 5.0, and punt mine to
5.1

Got a preference?
Vladimir Sementsov-Ogievskiy March 31, 2020, 12:04 p.m. UTC | #5
31.03.2020 14:46, Markus Armbruster wrote:
> Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:
> 
> [...]
>> I agree that this is a strange function and its logic is weird. But I
>> don't know what the logic should be. My patch is still valid to just
>> fix obvious use-after-free and possible leak. It doesn't fix the
>> logic.
> 
> I sketched improved logic elsewhere in this thread, and I can turn that
> into a patch.
> 
> I can either make it replace Vladimir's patch, or make it go on top.  If
> the latter, we can apply just Vladimir's patch for 5.0, and punt mine to
> 5.1
> 
> Got a preference?
> 

I don't.
diff mbox series

Patch

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 93474ff770..cc69b82704 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -1773,6 +1773,7 @@  static void guest_suspend(SuspendMode mode, Error **errp)
     }
 
     error_free(local_err);
+    local_err = NULL;
 
     if (pmutils_supports_mode(mode, &local_err)) {
         mode_supported = true;
@@ -1784,6 +1785,7 @@  static void guest_suspend(SuspendMode mode, Error **errp)
     }
 
     error_free(local_err);
+    local_err = NULL;
 
     if (linux_sys_state_supports_mode(mode, &local_err)) {
         mode_supported = true;
@@ -1791,6 +1793,7 @@  static void guest_suspend(SuspendMode mode, Error **errp)
     }
 
     if (!mode_supported) {
+        error_free(local_err);
         error_setg(errp,
                    "the requested suspend mode is not supported by the guest");
     } else {