diff mbox series

tools/xenstored: Don't crash xenstored when Live-Update is cancelled

Message ID 20210617173857.6450-1-julien@xen.org (mailing list archive)
State New
Headers show
Series tools/xenstored: Don't crash xenstored when Live-Update is cancelled | expand

Commit Message

Julien Grall June 17, 2021, 5:38 p.m. UTC
From: Julien GralL <jgrall@amazon.com>

As Live-Update is asynchronous, it is possible to receive a request to
cancel it (either on the same connection or from a different one).

Currently, this will crash xenstored because do_lu_start() assumes
lu_status will be valid. This is not the case when Live-Update has been
cancelled. This will result to dereference a NULL pointer and
crash Xenstored.

Rework do_lu_start() to check if lu_status is NULL and return an
error in this case.

Fixes: af216a99fb ("tools/xenstore: add the basic framework for doing the live update")
Signed-off-by: Julien Grall <jgrall@amazon.com>

----

This is currently based on top of:

https://lore.kernel.org/xen-devel/20210616144324.31652-1-julien@xen.org

This can be re-ordered if necessary.
---
 tools/xenstore/xenstored_control.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

Comments

Luca Fancellu June 21, 2021, 9:36 a.m. UTC | #1
> On 17 Jun 2021, at 18:38, Julien Grall <julien@xen.org> wrote:
> 
> From: Julien GralL <jgrall@amazon.com>
> 
> As Live-Update is asynchronous, it is possible to receive a request to
> cancel it (either on the same connection or from a different one).
> 
> Currently, this will crash xenstored because do_lu_start() assumes
> lu_status will be valid. This is not the case when Live-Update has been
> cancelled. This will result to dereference a NULL pointer and
> crash Xenstored.
> 
> Rework do_lu_start() to check if lu_status is NULL and return an
> error in this case.
> 
> Fixes: af216a99fb ("tools/xenstore: add the basic framework for doing the live update")
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>

> 
> ----
> 
> This is currently based on top of:
> 
> https://lore.kernel.org/xen-devel/20210616144324.31652-1-julien@xen.org
> 
> This can be re-ordered if necessary.
> ---
> tools/xenstore/xenstored_control.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
> index a045f102a420..37a3d39f20b5 100644
> --- a/tools/xenstore/xenstored_control.c
> +++ b/tools/xenstore/xenstored_control.c
> @@ -696,7 +696,18 @@ static bool do_lu_start(struct delayed_request *req)
> 	time_t now = time(NULL);
> 	const char *ret;
> 	struct buffered_data *saved_in;
> -	struct connection *conn = lu_status->conn;
> +	struct connection *conn = req->data;
> +
> +	/*
> +	 * Cancellation may have been requested asynchronously. In this
> +	 * case, lu_status will be NULL.
> +	 */
> +	if (!lu_status) {
> +		ret = "Cancellation was requested";
> +		conn = req->data;
> +		goto out;
> +	} else
> +		assert(lu_status->conn == conn);
> 
> 	if (!lu_check_lu_allowed()) {
> 		if (now < lu_status->started_at + lu_status->timeout)
> @@ -747,7 +758,7 @@ static const char *lu_start(const void *ctx, struct connection *conn,
> 	lu_status->timeout = to;
> 	lu_status->started_at = time(NULL);
> 
> -	errno = delay_request(conn, conn->in, do_lu_start, NULL, false);
> +	errno = delay_request(conn, conn->in, do_lu_start, conn, false);
> 
> 	return NULL;
> }
> -- 
> 2.17.1
> 
>
Juergen Gross June 22, 2021, 8:46 a.m. UTC | #2
On 17.06.21 19:38, Julien Grall wrote:
> From: Julien GralL <jgrall@amazon.com>
> 
> As Live-Update is asynchronous, it is possible to receive a request to
> cancel it (either on the same connection or from a different one).
> 
> Currently, this will crash xenstored because do_lu_start() assumes
> lu_status will be valid. This is not the case when Live-Update has been
> cancelled. This will result to dereference a NULL pointer and
> crash Xenstored.

Umm, you introduced that bug in "[PATCH 03/10] tools/xenstore: Don't
assume conn->in points to the LU request".

So I think you should fold your fix into that series.


Juergen
Julien Grall June 22, 2021, 8:53 a.m. UTC | #3
Hi Juergen,

On 22/06/2021 10:46, Juergen Gross wrote:
> On 17.06.21 19:38, Julien Grall wrote:
>> From: Julien GralL <jgrall@amazon.com>
>>
>> As Live-Update is asynchronous, it is possible to receive a request to
>> cancel it (either on the same connection or from a different one).
>>
>> Currently, this will crash xenstored because do_lu_start() assumes
>> lu_status will be valid. This is not the case when Live-Update has been
>> cancelled. This will result to dereference a NULL pointer and
>> crash Xenstored.
> 
> Umm, you introduced that bug in "[PATCH 03/10] tools/xenstore: Don't
> assume conn->in points to the LU request".

No. I did reproduced this one without my series. If there are in-flight 
transaction this will crash in lu_check_lu_allowed() otherwise, it will 
crash when calling lu_dump_state().

The easiest way to reproduce is requesting live-update when there are 
long transactions and from a different connection (still in dom0) 
requesting to abort the connection.

In this case, lu_abort() will free lu_status and the destructor will set 
it to NULL. But the actual request is still in the delayed request queue 
for the other connection.

Cheers,
Juergen Gross June 22, 2021, 9:13 a.m. UTC | #4
On 22.06.21 10:53, Julien Grall wrote:
> Hi Juergen,
> 
> On 22/06/2021 10:46, Juergen Gross wrote:
>> On 17.06.21 19:38, Julien Grall wrote:
>>> From: Julien GralL <jgrall@amazon.com>
>>>
>>> As Live-Update is asynchronous, it is possible to receive a request to
>>> cancel it (either on the same connection or from a different one).
>>>
>>> Currently, this will crash xenstored because do_lu_start() assumes
>>> lu_status will be valid. This is not the case when Live-Update has been
>>> cancelled. This will result to dereference a NULL pointer and
>>> crash Xenstored.
>>
>> Umm, you introduced that bug in "[PATCH 03/10] tools/xenstore: Don't
>> assume conn->in points to the LU request".
> 
> No. I did reproduced this one without my series. If there are in-flight 

> transaction this will crash in lu_check_lu_allowed() otherwise, it will 

> crash when calling lu_dump_state().

Oh, right, I missed the indirection via delay_request().

Sorry.


Juergen
Juergen Gross June 22, 2021, 9:23 a.m. UTC | #5
On 17.06.21 19:38, Julien Grall wrote:
> From: Julien GralL <jgrall@amazon.com>
> 
> As Live-Update is asynchronous, it is possible to receive a request to
> cancel it (either on the same connection or from a different one).
> 
> Currently, this will crash xenstored because do_lu_start() assumes
> lu_status will be valid. This is not the case when Live-Update has been
> cancelled. This will result to dereference a NULL pointer and
> crash Xenstored.
> 
> Rework do_lu_start() to check if lu_status is NULL and return an
> error in this case.
> 
> Fixes: af216a99fb ("tools/xenstore: add the basic framework for doing the live update")
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> ----
> 
> This is currently based on top of:
> 
> https://lore.kernel.org/xen-devel/20210616144324.31652-1-julien@xen.org
> 
> This can be re-ordered if necessary.
> ---
>   tools/xenstore/xenstored_control.c | 15 +++++++++++++--
>   1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
> index a045f102a420..37a3d39f20b5 100644
> --- a/tools/xenstore/xenstored_control.c
> +++ b/tools/xenstore/xenstored_control.c
> @@ -696,7 +696,18 @@ static bool do_lu_start(struct delayed_request *req)
>   	time_t now = time(NULL);
>   	const char *ret;
>   	struct buffered_data *saved_in;
> -	struct connection *conn = lu_status->conn;
> +	struct connection *conn = req->data;
> +
> +	/*
> +	 * Cancellation may have been requested asynchronously. In this
> +	 * case, lu_status will be NULL.
> +	 */
> +	if (!lu_status) {
> +		ret = "Cancellation was requested";
> +		conn = req->data;

This will set conn to the same value it already has.


Other than that:

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen
Julien Grall June 24, 2021, 8:12 a.m. UTC | #6
Hi Juergen,

On 22/06/2021 11:23, Juergen Gross wrote:
> On 17.06.21 19:38, Julien Grall wrote:
>> From: Julien GralL <jgrall@amazon.com>
>>
>> As Live-Update is asynchronous, it is possible to receive a request to
>> cancel it (either on the same connection or from a different one).
>>
>> Currently, this will crash xenstored because do_lu_start() assumes
>> lu_status will be valid. This is not the case when Live-Update has been
>> cancelled. This will result to dereference a NULL pointer and
>> crash Xenstored.
>>
>> Rework do_lu_start() to check if lu_status is NULL and return an
>> error in this case.
>>
>> Fixes: af216a99fb ("tools/xenstore: add the basic framework for doing 
>> the live update")
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>
>> ----
>>
>> This is currently based on top of:
>>
>> https://lore.kernel.org/xen-devel/20210616144324.31652-1-julien@xen.org
>>
>> This can be re-ordered if necessary.
>> ---
>>   tools/xenstore/xenstored_control.c | 15 +++++++++++++--
>>   1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/xenstore/xenstored_control.c 
>> b/tools/xenstore/xenstored_control.c
>> index a045f102a420..37a3d39f20b5 100644
>> --- a/tools/xenstore/xenstored_control.c
>> +++ b/tools/xenstore/xenstored_control.c
>> @@ -696,7 +696,18 @@ static bool do_lu_start(struct delayed_request *req)
>>       time_t now = time(NULL);
>>       const char *ret;
>>       struct buffered_data *saved_in;
>> -    struct connection *conn = lu_status->conn;
>> +    struct connection *conn = req->data;
>> +
>> +    /*
>> +     * Cancellation may have been requested asynchronously. In this
>> +     * case, lu_status will be NULL.
>> +     */
>> +    if (!lu_status) {
>> +        ret = "Cancellation was requested";
>> +        conn = req->data;
> 
> This will set conn to the same value it already has.

Ah yes. I will drop this line.

Also, I took the opportunity to replace

} else
   assert(...)

with just

assert(...)

This should improve a bit the readability. Let me know if you want me to 
resend the patch for that.

> 
> 
> Other than that:
> 
> Reviewed-by: Juergen Gross <jgross@suse.com>

Thank you!

Cheers,
Juergen Gross June 24, 2021, 8:17 a.m. UTC | #7
On 24.06.21 10:12, Julien Grall wrote:
> Hi Juergen,
> 
> On 22/06/2021 11:23, Juergen Gross wrote:
>> On 17.06.21 19:38, Julien Grall wrote:
>>> From: Julien GralL <jgrall@amazon.com>
>>>
>>> As Live-Update is asynchronous, it is possible to receive a request to
>>> cancel it (either on the same connection or from a different one).
>>>
>>> Currently, this will crash xenstored because do_lu_start() assumes
>>> lu_status will be valid. This is not the case when Live-Update has been
>>> cancelled. This will result to dereference a NULL pointer and
>>> crash Xenstored.
>>>
>>> Rework do_lu_start() to check if lu_status is NULL and return an
>>> error in this case.
>>>
>>> Fixes: af216a99fb ("tools/xenstore: add the basic framework for doing 

>>> the live update")
>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>
>>> ----
>>>
>>> This is currently based on top of:
>>>
>>> https://lore.kernel.org/xen-devel/20210616144324.31652-1-julien@xen.org
>>>
>>> This can be re-ordered if necessary.
>>> ---
>>>   tools/xenstore/xenstored_control.c | 15 +++++++++++++--
>>>   1 file changed, 13 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/tools/xenstore/xenstored_control.c 
>>> b/tools/xenstore/xenstored_control.c
>>> index a045f102a420..37a3d39f20b5 100644
>>> --- a/tools/xenstore/xenstored_control.c
>>> +++ b/tools/xenstore/xenstored_control.c
>>> @@ -696,7 +696,18 @@ static bool do_lu_start(struct delayed_request 
>>> *req)
>>>       time_t now = time(NULL);
>>>       const char *ret;
>>>       struct buffered_data *saved_in;
>>> -    struct connection *conn = lu_status->conn;
>>> +    struct connection *conn = req->data;
>>> +
>>> +    /*
>>> +     * Cancellation may have been requested asynchronously. In this
>>> +     * case, lu_status will be NULL.
>>> +     */
>>> +    if (!lu_status) {
>>> +        ret = "Cancellation was 
requested";
>>> +        conn = req->data;
>>
>> This will set conn to the same value it already has.
> 
> Ah yes. I will drop this line.
> 
> Also, I took the opportunity to replace
> 
> } else
>   assert(...)
> 
> with just
> 
> assert(...)
> 
> This should improve a bit the readability. Let me know if you want me to 
> resend the patch for that.

I guess you are planning to do the commit?

If yes, there is no need for resending the patch.


Juergen
Julien Grall June 24, 2021, 8:18 a.m. UTC | #8
Hi Juergen,

On 24/06/2021 10:17, Juergen Gross wrote:
> On 24.06.21 10:12, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 22/06/2021 11:23, Juergen Gross wrote:
>>> On 17.06.21 19:38, Julien Grall wrote:
>>>> From: Julien GralL <jgrall@amazon.com>
>>>>
>>>> As Live-Update is asynchronous, it is possible to receive a request to
>>>> cancel it (either on the same connection or from a different one).
>>>>
>>>> Currently, this will crash xenstored because do_lu_start() assumes
>>>> lu_status will be valid. This is not the case when Live-Update has been
>>>> cancelled. This will result to dereference a NULL pointer and
>>>> crash Xenstored.
>>>>
>>>> Rework do_lu_start() to check if lu_status is NULL and return an
>>>> error in this case.
>>>>
>>>> Fixes: af216a99fb ("tools/xenstore: add the basic framework for doing 
> 
>>>> the live update")
>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>
>>>> ----
>>>>
>>>> This is currently based on top of:
>>>>
>>>> https://lore.kernel.org/xen-devel/20210616144324.31652-1-julien@xen.org
>>>>
>>>> This can be re-ordered if necessary.
>>>> ---
>>>>   tools/xenstore/xenstored_control.c | 15 +++++++++++++--
>>>>   1 file changed, 13 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tools/xenstore/xenstored_control.c 
>>>> b/tools/xenstore/xenstored_control.c
>>>> index a045f102a420..37a3d39f20b5 100644
>>>> --- a/tools/xenstore/xenstored_control.c
>>>> +++ b/tools/xenstore/xenstored_control.c
>>>> @@ -696,7 +696,18 @@ static bool do_lu_start(struct delayed_request 
>>>> *req)
>>>>       time_t now = time(NULL);
>>>>       const char *ret;
>>>>       struct buffered_data *saved_in;
>>>> -    struct connection *conn = lu_status->conn;
>>>> +    struct connection *conn = req->data;
>>>> +
>>>> +    /*
>>>> +     * Cancellation may have been requested asynchronously. In this
>>>> +     * case, lu_status will be NULL.
>>>> +     */
>>>> +    if (!lu_status) {
>>>> +        ret = "Cancellation was 
> requested";
>>>> +        conn = req->data;
>>>
>>> This will set conn to the same value it already has.
>>
>> Ah yes. I will drop this line.
>>
>> Also, I took the opportunity to replace
>>
>> } else
>>   assert(...)
>>
>> with just
>>
>> assert(...)
>>
>> This should improve a bit the readability. Let me know if you want me 
>> to resend the patch for that.
> 
> I guess you are planning to do the commit?

That's my plan.

> 
> If yes, there is no need for resending the patch.

Thanks!

Cheers,
Julien Grall June 24, 2021, 11:17 a.m. UTC | #9
On 24/06/2021 10:18, Julien Grall wrote:
> Hi Juergen,
> 
> On 24/06/2021 10:17, Juergen Gross wrote:
>> On 24.06.21 10:12, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> On 22/06/2021 11:23, Juergen Gross wrote:
>>>> On 17.06.21 19:38, Julien Grall wrote:
>>>>> From: Julien GralL <jgrall@amazon.com>
>>>>>
>>>>> As Live-Update is asynchronous, it is possible to receive a request to
>>>>> cancel it (either on the same connection or from a different one).
>>>>>
>>>>> Currently, this will crash xenstored because do_lu_start() assumes
>>>>> lu_status will be valid. This is not the case when Live-Update has 
>>>>> been
>>>>> cancelled. This will result to dereference a NULL pointer and
>>>>> crash Xenstored.
>>>>>
>>>>> Rework do_lu_start() to check if lu_status is NULL and return an
>>>>> error in this case.
>>>>>
>>>>> Fixes: af216a99fb ("tools/xenstore: add the basic framework for doing 
>>
>>>>> the live update")
>>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>>>
>>>>> ----
>>>>>
>>>>> This is currently based on top of:
>>>>>
>>>>> https://lore.kernel.org/xen-devel/20210616144324.31652-1-julien@xen.org 
>>>>>
>>>>>
>>>>> This can be re-ordered if necessary.
>>>>> ---
>>>>>   tools/xenstore/xenstored_control.c | 15 +++++++++++++--
>>>>>   1 file changed, 13 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/tools/xenstore/xenstored_control.c 
>>>>> b/tools/xenstore/xenstored_control.c
>>>>> index a045f102a420..37a3d39f20b5 100644
>>>>> --- a/tools/xenstore/xenstored_control.c
>>>>> +++ b/tools/xenstore/xenstored_control.c
>>>>> @@ -696,7 +696,18 @@ static bool do_lu_start(struct delayed_request 
>>>>> *req)
>>>>>       time_t now = time(NULL);
>>>>>       const char *ret;
>>>>>       struct buffered_data *saved_in;
>>>>> -    struct connection *conn = lu_status->conn;
>>>>> +    struct connection *conn = req->data;
>>>>> +
>>>>> +    /*
>>>>> +     * Cancellation may have been requested asynchronously. In this
>>>>> +     * case, lu_status will be NULL.
>>>>> +     */
>>>>> +    if (!lu_status) {
>>>>> +        ret = "Cancellation was 
>> requested";
>>>>> +        conn = req->data;
>>>>
>>>> This will set conn to the same value it already has.
>>>
>>> Ah yes. I will drop this line.
>>>
>>> Also, I took the opportunity to replace
>>>
>>> } else
>>>   assert(...)
>>>
>>> with just
>>>
>>> assert(...)
>>>
>>> This should improve a bit the readability. Let me know if you want me 
>>> to resend the patch for that.
>>
>> I guess you are planning to do the commit?
> 
> That's my plan.

Committed.

> 
>>
>> If yes, there is no need for resending the patch.
> 
> Thanks!
> 
> Cheers,
>
diff mbox series

Patch

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index a045f102a420..37a3d39f20b5 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -696,7 +696,18 @@  static bool do_lu_start(struct delayed_request *req)
 	time_t now = time(NULL);
 	const char *ret;
 	struct buffered_data *saved_in;
-	struct connection *conn = lu_status->conn;
+	struct connection *conn = req->data;
+
+	/*
+	 * Cancellation may have been requested asynchronously. In this
+	 * case, lu_status will be NULL.
+	 */
+	if (!lu_status) {
+		ret = "Cancellation was requested";
+		conn = req->data;
+		goto out;
+	} else
+		assert(lu_status->conn == conn);
 
 	if (!lu_check_lu_allowed()) {
 		if (now < lu_status->started_at + lu_status->timeout)
@@ -747,7 +758,7 @@  static const char *lu_start(const void *ctx, struct connection *conn,
 	lu_status->timeout = to;
 	lu_status->started_at = time(NULL);
 
-	errno = delay_request(conn, conn->in, do_lu_start, NULL, false);
+	errno = delay_request(conn, conn->in, do_lu_start, conn, false);
 
 	return NULL;
 }