diff mbox series

[13/20] tools/xenstore: don't allow creating too many nodes in a transaction

Message ID 20221101152842.4257-14-jgross@suse.com (mailing list archive)
State Superseded
Headers show
Series tools/xenstore: do some cleanup and fixes | expand

Commit Message

Jürgen Groß Nov. 1, 2022, 3:28 p.m. UTC
The accounting for the number of nodes of a domain in an active
transaction is not working correctly, as it allows to create arbitrary
number of nodes. The transaction will finally fail due to exceeding
the number of nodes quota, but before closing the transaction an
unprivileged guest could cause Xenstore to use a lot of memory.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_domain.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Comments

Julien Grall Nov. 6, 2022, 10 p.m. UTC | #1
Hi Juergen,

On 01/11/2022 15:28, Juergen Gross wrote:
> The accounting for the number of nodes of a domain in an active
> transaction is not working correctly, as it allows to create arbitrary
> number of nodes. The transaction will finally fail due to exceeding
> the number of nodes quota, but before closing the transaction an
> unprivileged guest could cause Xenstore to use a lot of memory.

I have already made some of comments on the security ML when this was 
initially set. The arguments still don't sound right to me.

For a first, since XSA-326, we have a cap in place for the memory a 
domain can consume. So this surely can't be a "lot of memory". Otherwise 
we are saying that our limit (there are other way to hit it) were wrong...

In addition to that, this is quite pointless to check the number of 
entry a domain currently owned because this can change before the 
transaction is committed by another "external" command. Therefore, this 
would add some randomness in the command which could make more difficult 
to diagnose.

Lastly, there are other very easy way to use memory (just create 
multiple transaction in parallel).

So based on the commit message, the change doesn't help at all to make 
better Xenstored.

Note that I don't particularly mind the code change (even though it adds 
more complexity). I just strongly dislike the current justifications.

At the moment, I can't find a justification that would make the change 
whorthwhile.

Cheers,
Jürgen Groß Nov. 7, 2022, 8:34 a.m. UTC | #2
On 06.11.22 23:00, Julien Grall wrote:
> Hi Juergen,
> 
> On 01/11/2022 15:28, Juergen Gross wrote:
>> The accounting for the number of nodes of a domain in an active
>> transaction is not working correctly, as it allows to create arbitrary
>> number of nodes. The transaction will finally fail due to exceeding
>> the number of nodes quota, but before closing the transaction an
>> unprivileged guest could cause Xenstore to use a lot of memory.
> 
> I have already made some of comments on the security ML when this was initially 
> set. The arguments still don't sound right to me.
> 
> For a first, since XSA-326, we have a cap in place for the memory a domain can 
> consume. So this surely can't be a "lot of memory". Otherwise we are saying that 
> our limit (there are other way to hit it) were wrong...

Yeah, maybe I should rework the commit message.

The reason I still want to keep the patch is that in a transaction without any
conflicts and without hitting any transaction specific limits (number of nodes
accessed), the errors returned due to a single operation should be the same as
with the same operation performed outside a transaction.

> In addition to that, this is quite pointless to check the number of entry a 
> domain currently owned because this can change before the transaction is 
> committed by another "external" command. Therefore, this would add some 
> randomness in the command which could make more difficult to diagnose.

In the scope of the transaction the tests are correct.

> Lastly, there are other very easy way to use memory (just create multiple 
> transaction in parallel).
> 
> So based on the commit message, the change doesn't help at all to make better 
> Xenstored.
> 
> Note that I don't particularly mind the code change (even though it adds more 
> complexity). I just strongly dislike the current justifications.
> 
> At the moment, I can't find a justification that would make the change whorthwhile.

See above reasoning.


Juergen
Julien Grall Nov. 7, 2022, 6:37 p.m. UTC | #3
On 07/11/2022 08:34, Juergen Gross wrote:
> On 06.11.22 23:00, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 01/11/2022 15:28, Juergen Gross wrote:
>>> The accounting for the number of nodes of a domain in an active
>>> transaction is not working correctly, as it allows to create arbitrary
>>> number of nodes. The transaction will finally fail due to exceeding
>>> the number of nodes quota, but before closing the transaction an
>>> unprivileged guest could cause Xenstore to use a lot of memory.
>>
>> I have already made some of comments on the security ML when this was 
>> initially set. The arguments still don't sound right to me.
>>
>> For a first, since XSA-326, we have a cap in place for the memory a 
>> domain can consume. So this surely can't be a "lot of memory". 
>> Otherwise we are saying that our limit (there are other way to hit it) 
>> were wrong...
> 
> Yeah, maybe I should rework the commit message.
> 
> The reason I still want to keep the patch is that in a transaction 
> without any
> conflicts and without hitting any transaction specific limits (number of 
> nodes
> accessed), the errors returned due to a single operation should be the 
> same as
> with the same operation performed outside a transaction.

This seems to be a very niche use case. So it is not clear to me why 
this matter and we want to add extra check for everyone.

Cheers,
Jürgen Groß Nov. 8, 2022, 8:09 a.m. UTC | #4
On 07.11.22 19:37, Julien Grall wrote:
> 
> 
> On 07/11/2022 08:34, Juergen Gross wrote:
>> On 06.11.22 23:00, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> On 01/11/2022 15:28, Juergen Gross wrote:
>>>> The accounting for the number of nodes of a domain in an active
>>>> transaction is not working correctly, as it allows to create arbitrary
>>>> number of nodes. The transaction will finally fail due to exceeding
>>>> the number of nodes quota, but before closing the transaction an
>>>> unprivileged guest could cause Xenstore to use a lot of memory.
>>>
>>> I have already made some of comments on the security ML when this was 
>>> initially set. The arguments still don't sound right to me.
>>>
>>> For a first, since XSA-326, we have a cap in place for the memory a domain 
>>> can consume. So this surely can't be a "lot of memory". Otherwise we are 
>>> saying that our limit (there are other way to hit it) were wrong...
>>
>> Yeah, maybe I should rework the commit message.
>>
>> The reason I still want to keep the patch is that in a transaction without any
>> conflicts and without hitting any transaction specific limits (number of nodes
>> accessed), the errors returned due to a single operation should be the same as
>> with the same operation performed outside a transaction.
> 
> This seems to be a very niche use case. So it is not clear to me why this matter 
> and we want to add extra check for everyone.

It is a matter of correctness.

Additionally, after the internal accounting rework this makes even more
sense, as the maximum values per domain seen are really correct, even when
queried while transactions are active.


Juergen
Julien Grall Dec. 1, 2022, 7:25 p.m. UTC | #5
Hi Juergen,

On 08/11/2022 08:09, Juergen Gross wrote:
> On 07.11.22 19:37, Julien Grall wrote:
>> On 07/11/2022 08:34, Juergen Gross wrote:
>>> On 06.11.22 23:00, Julien Grall wrote:
>>>> Hi Juergen,
>>>>
>>>> On 01/11/2022 15:28, Juergen Gross wrote:
>>>>> The accounting for the number of nodes of a domain in an active
>>>>> transaction is not working correctly, as it allows to create arbitrary
>>>>> number of nodes. The transaction will finally fail due to exceeding
>>>>> the number of nodes quota, but before closing the transaction an
>>>>> unprivileged guest could cause Xenstore to use a lot of memory.
>>>>
>>>> I have already made some of comments on the security ML when this 
>>>> was initially set. The arguments still don't sound right to me.
>>>>
>>>> For a first, since XSA-326, we have a cap in place for the memory a 
>>>> domain can consume. So this surely can't be a "lot of memory". 
>>>> Otherwise we are saying that our limit (there are other way to hit 
>>>> it) were wrong...
>>>
>>> Yeah, maybe I should rework the commit message.
>>>
>>> The reason I still want to keep the patch is that in a transaction 
>>> without any
>>> conflicts and without hitting any transaction specific limits (number 
>>> of nodes
>>> accessed), the errors returned due to a single operation should be 
>>> the same as
>>> with the same operation performed outside a transaction.
>>
>> This seems to be a very niche use case. So it is not clear to me why 
>> this matter and we want to add extra check for everyone.
> 
> It is a matter of correctness.

I think this is a matter of perspective. Transactions are inherently 
racy and I see no point of try to solve issues in the idealistic case 
(i.e. no conflict). The reasoning below...

> 
> Additionally, after the internal accounting rework this makes even more
> sense, as the maximum values per domain seen are really correct, even when
> queried while transactions are active.
... might be a better justification. But I will need to review the other 
patch in order to forge a more positive opinion. Can you point me to the 
patch?

Cheers,
Jürgen Groß Dec. 13, 2022, 7:55 a.m. UTC | #6
On 01.12.22 20:25, Julien Grall wrote:
> Hi Juergen,
> 
> On 08/11/2022 08:09, Juergen Gross wrote:
>> On 07.11.22 19:37, Julien Grall wrote:
>>> On 07/11/2022 08:34, Juergen Gross wrote:
>>>> On 06.11.22 23:00, Julien Grall wrote:
>>>>> Hi Juergen,
>>>>>
>>>>> On 01/11/2022 15:28, Juergen Gross wrote:
>>>>>> The accounting for the number of nodes of a domain in an active
>>>>>> transaction is not working correctly, as it allows to create arbitrary
>>>>>> number of nodes. The transaction will finally fail due to exceeding
>>>>>> the number of nodes quota, but before closing the transaction an
>>>>>> unprivileged guest could cause Xenstore to use a lot of memory.
>>>>>
>>>>> I have already made some of comments on the security ML when this was 
>>>>> initially set. The arguments still don't sound right to me.
>>>>>
>>>>> For a first, since XSA-326, we have a cap in place for the memory a domain 
>>>>> can consume. So this surely can't be a "lot of memory". Otherwise we are 
>>>>> saying that our limit (there are other way to hit it) were wrong...
>>>>
>>>> Yeah, maybe I should rework the commit message.
>>>>
>>>> The reason I still want to keep the patch is that in a transaction without any
>>>> conflicts and without hitting any transaction specific limits (number of nodes
>>>> accessed), the errors returned due to a single operation should be the same as
>>>> with the same operation performed outside a transaction.
>>>
>>> This seems to be a very niche use case. So it is not clear to me why this 
>>> matter and we want to add extra check for everyone.
>>
>> It is a matter of correctness.
> 
> I think this is a matter of perspective. Transactions are inherently racy and I 
> see no point of try to solve issues in the idealistic case (i.e. no conflict).
> The reasoning below...
> 
>>
>> Additionally, after the internal accounting rework this makes even more
>> sense, as the maximum values per domain seen are really correct, even when
>> queried while transactions are active.
> ... might be a better justification. But I will need to review the other patch 
> in order to forge a more positive opinion. Can you point me to the patch?

The main changes are in patches 2-4 of the 2nd series [1].


Juergen

[1]: https://lists.xen.org/archives/html/xen-devel/2022-11/msg00050.html
diff mbox series

Patch

diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index b737a77683..529ffb522a 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -1125,9 +1125,8 @@  int domain_nbentry_fix(unsigned int domid, int num, bool update)
 
 int domain_nbentry(struct connection *conn)
 {
-	return (domain_is_unprivileged(conn))
-		? conn->domain->nbentry
-		: 0;
+	return domain_is_unprivileged(conn)
+	       ? domain_nbentry_add(conn, conn->id, 0, true) : 0;
 }
 
 static bool domain_chk_quota(struct domain *domain, int mem)