diff mbox series

[v2,01/13] tools/xenstore: don't allow creating too many nodes in a transaction

Message ID 20230120100028.11142-2-jgross@suse.com (mailing list archive)
State Superseded
Headers show
Series tools/xenstore: rework internal accounting | expand

Commit Message

Jürgen Groß Jan. 20, 2023, 10 a.m. UTC
The accounting for the number of nodes of a domain in an active
transaction is not working correctly, as it allows to create arbitrary
number of nodes. The transaction will finally fail due to exceeding
the number of nodes quota, but before closing the transaction an
unprivileged guest could cause Xenstore to use a lot of memory.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_domain.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Comments

Julien Grall Feb. 20, 2023, 9:46 a.m. UTC | #1
Hi Juergen,

On 20/01/2023 10:00, Juergen Gross wrote:
> The accounting for the number of nodes of a domain in an active
> transaction is not working correctly, as it allows to create arbitrary
> number of nodes. The transaction will finally fail due to exceeding
> the number of nodes quota, but before closing the transaction an
> unprivileged guest could cause Xenstore to use a lot of memory.

I know I said I would delay my decision on this patch. However, I was 
still expecting the commit message to be updated based on our previous 
discussion.

Also thinking more about it, "The transaction will finally fail due to 
exceeding the number of nodes quota" may not be true for a couple of 
reasons:
   1) The transaction may removed a node afterwards.
   2) A node may have been removed outside of the transaction.

In both situation, the transaction will still be committed. This will 
now be prevented by this patch.

While I understand, they may be edge cases, this is also true for what 
you are aiming to solve. So I am still not convinced about the benefits 
of this patch.

Cheers,
Jürgen Groß Feb. 20, 2023, 11:04 a.m. UTC | #2
On 20.02.23 10:46, Julien Grall wrote:
> Hi Juergen,
> 
> On 20/01/2023 10:00, Juergen Gross wrote:
>> The accounting for the number of nodes of a domain in an active
>> transaction is not working correctly, as it allows to create arbitrary
>> number of nodes. The transaction will finally fail due to exceeding
>> the number of nodes quota, but before closing the transaction an
>> unprivileged guest could cause Xenstore to use a lot of memory.
> 
> I know I said I would delay my decision on this patch. However, I was still 
> expecting the commit message to be updated based on our previous discussion.

As said before, I was waiting on the settlement of our discussion before
doing the update.

> Also thinking more about it, "The transaction will finally fail due to exceeding 
> the number of nodes quota" may not be true for a couple of reasons:
>    1) The transaction may removed a node afterwards.

Yes, and? Just because it is a transaction, this is still a violation of
the quota. Even outside a transaction you could use the same reasoning,
but you don't (which is correct, of course).

In case you never finish the transaction, you are owner of more than
allowed nodes.

>    2) A node may have been removed outside of the transaction.

If the removed node hasn't been touched by the transaction, it will be
accounted for correctly. If it has been touched, the transaction will
fail anyway.

> In both situation, the transaction will still be committed. This will now be 
> prevented by this patch.

As said above, only in the first case.

> While I understand, they may be edge cases, this is also true for what you are 
> aiming to solve. So I am still not convinced about the benefits of this patch.


Juergen
Julien Grall Feb. 20, 2023, 12:07 p.m. UTC | #3
Hi Juergen,

On 20/02/2023 11:04, Juergen Gross wrote:
> On 20.02.23 10:46, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 20/01/2023 10:00, Juergen Gross wrote:
>>> The accounting for the number of nodes of a domain in an active
>>> transaction is not working correctly, as it allows to create arbitrary
>>> number of nodes. The transaction will finally fail due to exceeding
>>> the number of nodes quota, but before closing the transaction an
>>> unprivileged guest could cause Xenstore to use a lot of memory.
>>
>> I know I said I would delay my decision on this patch. However, I was 
>> still expecting the commit message to be updated based on our previous 
>> discussion.
> 
> As said before, I was waiting on the settlement of our discussion before
> doing the update.

This is not a very good practice to resend a patch without recording the 
disagreement because new reviewers may not be aware of the disagreement 
and this could end up to be committed by mistake...

> 
>> Also thinking more about it, "The transaction will finally fail due to 
>> exceeding the number of nodes quota" may not be true for a couple of 
>> reasons:
>>    1) The transaction may removed a node afterwards.
> 
> Yes, and? Just because it is a transaction, this is still a violation of
> the quota. Even outside a transaction you could use the same reasoning,
> but you don't (which is correct, of course).

I can understand your point. However, to me, the goal of the transaction 
is to commit everything in one go at the end. So the violations in the 
middle should not matter.

Furthermore, I would expect a transaction to work on a snapshot of the 
system. So it feels really strange to me that we are constantly checking 
the quota with the updated values rather than the initial one.

> 
> In case you never finish the transaction, you are owner of more than
> allowed nodes.
How so? The nodes would not be committed so you don't really own them.
We also have quotas to limit the number of nodes accessed in a transaction.

> 
>>    2) A node may have been removed outside of the transaction.
> 
> If the removed node hasn't been touched by the transaction, it will be
> accounted for correctly.

It depends on when the node was removed. If it is removed *after* the 
quota was hit then your transaction will fail.

>  If it has been touched, the transaction will
> fail anyway.
So the transaction will fail to commit because there is a conflict. So 
the client is expected to retry it. We should not expected the client to 
retry for error like quota.

So the overall behavior is changed.

> 
>> In both situation, the transaction will still be committed. This will 
>> now be prevented by this patch.
> 
> As said above, only in the first case.

Cheers,
Jürgen Groß Feb. 20, 2023, 1:49 p.m. UTC | #4
On 20.02.23 13:07, Julien Grall wrote:
> Hi Juergen,
> 
> On 20/02/2023 11:04, Juergen Gross wrote:
>> On 20.02.23 10:46, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> On 20/01/2023 10:00, Juergen Gross wrote:
>>>> The accounting for the number of nodes of a domain in an active
>>>> transaction is not working correctly, as it allows to create arbitrary
>>>> number of nodes. The transaction will finally fail due to exceeding
>>>> the number of nodes quota, but before closing the transaction an
>>>> unprivileged guest could cause Xenstore to use a lot of memory.
>>>
>>> I know I said I would delay my decision on this patch. However, I was still 
>>> expecting the commit message to be updated based on our previous discussion.
>>
>> As said before, I was waiting on the settlement of our discussion before
>> doing the update.
> 
> This is not a very good practice to resend a patch without recording the 
> disagreement because new reviewers may not be aware of the disagreement and this 
> could end up to be committed by mistake...

You said you wanted to look into this patch in detail after the previous
series, so I move it into this series. The wording update would strongly
depend on the outcome of the discussion IMO, so I didn't do it yet.

Not adding the patch in this series would require some additional rebase
effort, so I kept the patch as is.

>>> Also thinking more about it, "The transaction will finally fail due to 
>>> exceeding the number of nodes quota" may not be true for a couple of reasons:
>>>    1) The transaction may removed a node afterwards.
>>
>> Yes, and? Just because it is a transaction, this is still a violation of
>> the quota. Even outside a transaction you could use the same reasoning,
>> but you don't (which is correct, of course).
> 
> I can understand your point. However, to me, the goal of the transaction is to 
> commit everything in one go at the end. So the violations in the middle should 
> not matter.

Of course they should.

We wouldn't allow to write over-sized nodes, even if they could be rewritten in
the same transaction with a smaller size.

We wouldn't allow to create nodes which would violate overall memory
consumption.

We wouldn't allow nodes with more permission entries than allowed, even if they
could be reduced in the same transaction again.

I don't see why the number of nodes shouldn't be taken into account.

> Furthermore, I would expect a transaction to work on a snapshot of the system. 
> So it feels really strange to me that we are constantly checking the quota with 
> the updated values rather than the initial one.

You are aware that the code without this patch is just neglecting the nodes
created in the transaction? It just is using the current number of nodes
outside any transaction for quota check. So I could easily create a scenario
where my new code will succeed, but the current one is failing:

Assume node quota is 1000, and at start of the transaction the guest is owning
999 nodes. In the transaction the guest is deleting 10 nodes, then dom0 is
creating 5 additional nodes owned by the guest. The central node counter is now
1004 (over quota), while the in-transaction count is 994. In the transaction the
guest can now happily create a new node (#995) with my patch, but will fail to
do so without (it sees the 1004 due to the transaction count being ignored).

>> In case you never finish the transaction, you are owner of more than
>> allowed nodes.
> How so? The nodes would not be committed so you don't really own them.

But you can use them inside the transaction.

> We also have quotas to limit the number of nodes accessed in a transaction.

Yes, but you can use the excess nodes in the transaction until that quota
is reached.

>>>    2) A node may have been removed outside of the transaction.
>>
>> If the removed node hasn't been touched by the transaction, it will be
>> accounted for correctly.
> 
> It depends on when the node was removed. If it is removed *after* the quota was 
> hit then your transaction will fail.

Yes. That is why the quota check at the finalization of the transaction has
to be kept.

> 
>>  If it has been touched, the transaction will
>> fail anyway.
> So the transaction will fail to commit because there is a conflict. So the 
> client is expected to retry it. We should not expected the client to retry for 
> error like quota.

Correct. The failure I'm mentioning isn't due to quota, but due to a conflict.

> So the overall behavior is changed.

We do so for any bug-fix. And IMO my patch is a bug-fix.


Juergen
Jürgen Groß Feb. 20, 2023, 2:06 p.m. UTC | #5
On 20.02.23 14:49, Juergen Gross wrote:
> Assume node quota is 1000, and at start of the transaction the guest is owning
> 999 nodes. In the transaction the guest is deleting 10 nodes, then dom0 is
> creating 5 additional nodes owned by the guest. The central node counter is now
> 1004 (over quota), while the in-transaction count is 994. In the transaction the
> guest can now happily create a new node (#995) with my patch, but will fail to
> do so without (it sees the 1004 due to the transaction count being ignored).

It is even worse, so I'd like to suggest the following commit message:

   tools/xenstore: take transaction internal nodes into account for quota

   The accounting for the number of nodes of a domain in an active
   transaction is not working correctly, as it is checking the node quota
   only against the number of nodes outside the transaction.

   This can result in the transaction finally failing, as node quota is
   checked at the end of the transaction again.

   On the other hand even in a transaction deleting many nodes, new nodes
   might not be creatable, in case the node quota was already reached at
   the start of the transaction.


Juergen
Julien Grall Feb. 20, 2023, 2:15 p.m. UTC | #6
On 20/02/2023 13:49, Juergen Gross wrote:
> On 20.02.23 13:07, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 20/02/2023 11:04, Juergen Gross wrote:
>>> On 20.02.23 10:46, Julien Grall wrote:
>>>> Hi Juergen,
>>>>
>>>> On 20/01/2023 10:00, Juergen Gross wrote:
>>>>> The accounting for the number of nodes of a domain in an active
>>>>> transaction is not working correctly, as it allows to create arbitrary
>>>>> number of nodes. The transaction will finally fail due to exceeding
>>>>> the number of nodes quota, but before closing the transaction an
>>>>> unprivileged guest could cause Xenstore to use a lot of memory.
>>>>
>>>> I know I said I would delay my decision on this patch. However, I 
>>>> was still expecting the commit message to be updated based on our 
>>>> previous discussion.
>>>
>>> As said before, I was waiting on the settlement of our discussion before
>>> doing the update.
>>
>> This is not a very good practice to resend a patch without recording 
>> the disagreement because new reviewers may not be aware of the 
>> disagreement and this could end up to be committed by mistake...
> 
> You said you wanted to look into this patch in detail after the previous
> series, so I move it into this series. The wording update would strongly
> depend on the outcome of the discussion IMO, so I didn't do it yet.
This could have been mentioned after ---. This could allow people to 
understand the concern and then participate.

> 
>>>> Also thinking more about it, "The transaction will finally fail due 
>>>> to exceeding the number of nodes quota" may not be true for a couple 
>>>> of reasons:
>>>>    1) The transaction may removed a node afterwards.
>>>
>>> Yes, and? Just because it is a transaction, this is still a violation of
>>> the quota. Even outside a transaction you could use the same reasoning,
>>> but you don't (which is correct, of course).
>>
>> I can understand your point. However, to me, the goal of the 
>> transaction is to commit everything in one go at the end. So the 
>> violations in the middle should not matter.
> 
> Of course they should.
> 
> We wouldn't allow to write over-sized nodes, even if they could be 
> rewritten in
> the same transaction with a smaller size.

I view the two different.

> We wouldn't allow to create nodes which would violate overall memory
> consumption.
> 
> We wouldn't allow nodes with more permission entries than allowed, even 
> if they
> could be reduced in the same transaction again.
> 
> I don't see why the number of nodes shouldn't be taken into account.
> 
>> Furthermore, I would expect a transaction to work on a snapshot of the 
>> system. So it feels really strange to me that we are constantly 
>> checking the quota with the updated values rather than the initial one.
> 
> You are aware that the code without this patch is just neglecting the nodes
> created in the transaction? It just is using the current number of nodes
> outside any transaction for quota check.

Are you referring to the quota check within the transaction?

> So I could easily create a 
> scenario
> where my new code will succeed, but the current one is failing:
> Assume node quota is 1000, and at start of the transaction the guest is 
> owning
> 999 nodes. In the transaction the guest is deleting 10 nodes, then dom0 is
> creating 5 additional nodes owned by the guest. The central node counter 
> is now
> 1004 (over quota), while the in-transaction count is 994.
> In the 
> transaction the
> guest can now happily create a new node (#995) with my patch, but will 
> fail to
> do so without (it sees the 1004 due to the transaction count being 
> ignored).

This doesn't sound correct. To do you have any test I could use to check 
the behavior?

Cheers,
Jürgen Groß Feb. 20, 2023, 2:21 p.m. UTC | #7
On 20.02.23 15:15, Julien Grall wrote:
> On 20/02/2023 13:49, Juergen Gross wrote:
>> On 20.02.23 13:07, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> On 20/02/2023 11:04, Juergen Gross wrote:
>>>> On 20.02.23 10:46, Julien Grall wrote:
>>>>> Hi Juergen,
>>>>>
>>>>> On 20/01/2023 10:00, Juergen Gross wrote:
>>>>>> The accounting for the number of nodes of a domain in an active
>>>>>> transaction is not working correctly, as it allows to create arbitrary
>>>>>> number of nodes. The transaction will finally fail due to exceeding
>>>>>> the number of nodes quota, but before closing the transaction an
>>>>>> unprivileged guest could cause Xenstore to use a lot of memory.
>>>>>
>>>>> I know I said I would delay my decision on this patch. However, I was still 
>>>>> expecting the commit message to be updated based on our previous discussion.
>>>>
>>>> As said before, I was waiting on the settlement of our discussion before
>>>> doing the update.
>>>
>>> This is not a very good practice to resend a patch without recording the 
>>> disagreement because new reviewers may not be aware of the disagreement and 
>>> this could end up to be committed by mistake...
>>
>> You said you wanted to look into this patch in detail after the previous
>> series, so I move it into this series. The wording update would strongly
>> depend on the outcome of the discussion IMO, so I didn't do it yet.
> This could have been mentioned after ---. This could allow people to understand 
> the concern and then participate.

Will do so in future.

> 
>>
>>>>> Also thinking more about it, "The transaction will finally fail due to 
>>>>> exceeding the number of nodes quota" may not be true for a couple of reasons:
>>>>>    1) The transaction may removed a node afterwards.
>>>>
>>>> Yes, and? Just because it is a transaction, this is still a violation of
>>>> the quota. Even outside a transaction you could use the same reasoning,
>>>> but you don't (which is correct, of course).
>>>
>>> I can understand your point. However, to me, the goal of the transaction is 
>>> to commit everything in one go at the end. So the violations in the middle 
>>> should not matter.
>>
>> Of course they should.
>>
>> We wouldn't allow to write over-sized nodes, even if they could be rewritten in
>> the same transaction with a smaller size.
> 
> I view the two different.
> 
>> We wouldn't allow to create nodes which would violate overall memory
>> consumption.
>>
>> We wouldn't allow nodes with more permission entries than allowed, even if they
>> could be reduced in the same transaction again.
>>
>> I don't see why the number of nodes shouldn't be taken into account.
>>
>>> Furthermore, I would expect a transaction to work on a snapshot of the 
>>> system. So it feels really strange to me that we are constantly checking the 
>>> quota with the updated values rather than the initial one.
>>
>> You are aware that the code without this patch is just neglecting the nodes
>> created in the transaction? It just is using the current number of nodes
>> outside any transaction for quota check.
> 
> Are you referring to the quota check within the transaction?

I'm referring to the quota check in create_node().

> 
>> So I could easily create a scenario
>> where my new code will succeed, but the current one is failing:
>> Assume node quota is 1000, and at start of the transaction the guest is owning
>> 999 nodes. In the transaction the guest is deleting 10 nodes, then dom0 is
>> creating 5 additional nodes owned by the guest. The central node counter is now
>> 1004 (over quota), while the in-transaction count is 994.
>> In the transaction the
>> guest can now happily create a new node (#995) with my patch, but will fail to
>> do so without (it sees the 1004 due to the transaction count being ignored).
> 
> This doesn't sound correct. To do you have any test I could use to check the 
> behavior?

Try it:

- create nodes in a guest until you hit the ENOSPC return code due to too many
   nodes
- start a transaction deleting some nodes and then trying to create another
   one, which fail fail with ENOSPC.


Juergen
Julien Grall Feb. 20, 2023, 6:01 p.m. UTC | #8
On 20/02/2023 14:21, Juergen Gross wrote:
> On 20.02.23 15:15, Julien Grall wrote:
>> On 20/02/2023 13:49, Juergen Gross wrote:
>>> On 20.02.23 13:07, Julien Grall wrote:
>>>> Hi Juergen,
>>>>
>>>> On 20/02/2023 11:04, Juergen Gross wrote:
>>>>> On 20.02.23 10:46, Julien Grall wrote:
>>>>>> Hi Juergen,
>>>>>>
>>>>>> On 20/01/2023 10:00, Juergen Gross wrote:
>>>>>>> The accounting for the number of nodes of a domain in an active
>>>>>>> transaction is not working correctly, as it allows to create 
>>>>>>> arbitrary
>>>>>>> number of nodes. The transaction will finally fail due to exceeding
>>>>>>> the number of nodes quota, but before closing the transaction an
>>>>>>> unprivileged guest could cause Xenstore to use a lot of memory.
>>>>>>
>>>>>> I know I said I would delay my decision on this patch. However, I 
>>>>>> was still expecting the commit message to be updated based on our 
>>>>>> previous discussion.
>>>>>
>>>>> As said before, I was waiting on the settlement of our discussion 
>>>>> before
>>>>> doing the update.
>>>>
>>>> This is not a very good practice to resend a patch without recording 
>>>> the disagreement because new reviewers may not be aware of the 
>>>> disagreement and this could end up to be committed by mistake...
>>>
>>> You said you wanted to look into this patch in detail after the previous
>>> series, so I move it into this series. The wording update would strongly
>>> depend on the outcome of the discussion IMO, so I didn't do it yet.
>> This could have been mentioned after ---. This could allow people to 
>> understand the concern and then participate.
> 
> Will do so in future.
> 
>>
>>>
>>>>>> Also thinking more about it, "The transaction will finally fail 
>>>>>> due to exceeding the number of nodes quota" may not be true for a 
>>>>>> couple of reasons:
>>>>>>    1) The transaction may removed a node afterwards.
>>>>>
>>>>> Yes, and? Just because it is a transaction, this is still a 
>>>>> violation of
>>>>> the quota. Even outside a transaction you could use the same 
>>>>> reasoning,
>>>>> but you don't (which is correct, of course).
>>>>
>>>> I can understand your point. However, to me, the goal of the 
>>>> transaction is to commit everything in one go at the end. So the 
>>>> violations in the middle should not matter.
>>>
>>> Of course they should.
>>>
>>> We wouldn't allow to write over-sized nodes, even if they could be 
>>> rewritten in
>>> the same transaction with a smaller size.
>>
>> I view the two different.
>>
>>> We wouldn't allow to create nodes which would violate overall memory
>>> consumption.
>>>
>>> We wouldn't allow nodes with more permission entries than allowed, 
>>> even if they
>>> could be reduced in the same transaction again.
>>>
>>> I don't see why the number of nodes shouldn't be taken into account.
>>>
>>>> Furthermore, I would expect a transaction to work on a snapshot of 
>>>> the system. So it feels really strange to me that we are constantly 
>>>> checking the quota with the updated values rather than the initial one.
>>>
>>> You are aware that the code without this patch is just neglecting the 
>>> nodes
>>> created in the transaction? It just is using the current number of nodes
>>> outside any transaction for quota check.
>>
>> Are you referring to the quota check within the transaction?
> 
> I'm referring to the quota check in create_node(). >
>>
>>> So I could easily create a scenario
>>> where my new code will succeed, but the current one is failing:
>>> Assume node quota is 1000, and at start of the transaction the guest 
>>> is owning
>>> 999 nodes. In the transaction the guest is deleting 10 nodes, then 
>>> dom0 is
>>> creating 5 additional nodes owned by the guest. The central node 
>>> counter is now
>>> 1004 (over quota), while the in-transaction count is 994.
>>> In the transaction the
>>> guest can now happily create a new node (#995) with my patch, but 
>>> will fail to
>>> do so without (it sees the 1004 due to the transaction count being 
>>> ignored).
>>
>> This doesn't sound correct. To do you have any test I could use to 
>> check the behavior?
> 
> Try it:
> 
> - create nodes in a guest until you hit the ENOSPC return code due to 
> too many
>    nodes
> - start a transaction deleting some nodes and then trying to create another
>    one, which fail fail with ENOSPC.

So I have recreated an XTF test which I think match what you wrote [1].

It is indeed failing without your patch. But then there are still some 
weird behavior here.

I would expect the creation of the node would also fail if instead of 
removing the node if removed outside of the transaction.

This is not the case because we are looking at the current quota. So 
shouldn't we snapshot the global count?

Cheers,

[1]
#include <xtf.h>

const char test_title[] = "Test xenstore-transaction-limit-1";

#define BASELINE_DIR "data"
#define BASELINE_MAX_DIR BASELINE_DIR"/max"

#define MAX_NODES 2000

static bool max_out_nodes(void)
{
     unsigned int parent_id = 0, child_id = 0, nr_nodes = 0;
     xs_transaction_t tid = XBT_NULL;

     printk("Maxing out nodes\n");

     do
     {
         int rc;
         char path[256];

         rc = snprintf(path, ARRAY_SIZE(path), "%s/%u/%u",
                       BASELINE_MAX_DIR, parent_id, child_id);

         if ( rc >= (int)ARRAY_SIZE(path) )
         {
             xtf_error("Unable to create the path\n");
             return false;
         }

         rc = xenstore_mkdir(tid, path);

         /* Xenstored will return ENOSPC if we exceed a quota */
         if ( rc == ENOSPC )
         {
             /*
              * If we can't write the first child, then this likely means
              * we exceed the maximum of nodes quota. Consider it a
              * success.
              *
              * Otherwise, we may hit the maximum size of the parent.
              * Switch to a different parent.
              */
             if ( child_id == 0 )
             {
                 printk("Stopped after %u iterations\n", nr_nodes);
                 return true;
             }
             else
             {
                 printk("Parent ID %u: created %u children\n",
                        parent_id, child_id);
                 parent_id++;
                 child_id = 0;
                 continue;
             }
         }
         else if ( rc )
         {
             xtf_error("Unexpected error %d\n", rc);
             return false;
         }
         else
         {
             nr_nodes++;
             child_id++;
         }
     } while ( nr_nodes < MAX_NODES );

     xtf_error("Created %u nodes and the quota is still not reached?\n",
               nr_nodes);

     return false;
}

void test_main(void)
{
     xs_transaction_t tid;
     int rc;

     if ( !max_out_nodes() )
         return;

     tid = xenstore_transaction_start();
     if ( tid == XBT_NULL )
     {
         xtf_error("Cannot start transaction\n");
         return;
     }

     /* Remove one of the node within the transaction */
     rc = xenstore_rm(tid, "data/max/0/0");
     if ( rc )
     {
         xtf_error("Cannot remove the node data/max/0/0 (rc = %d)\n", rc);
         return;
     }

     /* Creating a new node should work because we removed one */
     rc = xenstore_write(tid, "data/foo", "bar");
     if ( rc )
     {
         xtf_error("Cannot create node data/foo (rc = %d)\n", rc);
         return;
     }

     xtf_success(NULL);
}



> 
> 
> Juergen
Jürgen Groß Feb. 21, 2023, 8:10 a.m. UTC | #9
On 20.02.23 19:01, Julien Grall wrote:
> 
> 
> On 20/02/2023 14:21, Juergen Gross wrote:
>> On 20.02.23 15:15, Julien Grall wrote:
>>> On 20/02/2023 13:49, Juergen Gross wrote:
>>>> On 20.02.23 13:07, Julien Grall wrote:
>>>>> Hi Juergen,
>>>>>
>>>>> On 20/02/2023 11:04, Juergen Gross wrote:
>>>>>> On 20.02.23 10:46, Julien Grall wrote:
>>>>>>> Hi Juergen,
>>>>>>>
>>>>>>> On 20/01/2023 10:00, Juergen Gross wrote:
>>>>>>>> The accounting for the number of nodes of a domain in an active
>>>>>>>> transaction is not working correctly, as it allows to create arbitrary
>>>>>>>> number of nodes. The transaction will finally fail due to exceeding
>>>>>>>> the number of nodes quota, but before closing the transaction an
>>>>>>>> unprivileged guest could cause Xenstore to use a lot of memory.
>>>>>>>
>>>>>>> I know I said I would delay my decision on this patch. However, I was 
>>>>>>> still expecting the commit message to be updated based on our previous 
>>>>>>> discussion.
>>>>>>
>>>>>> As said before, I was waiting on the settlement of our discussion before
>>>>>> doing the update.
>>>>>
>>>>> This is not a very good practice to resend a patch without recording the 
>>>>> disagreement because new reviewers may not be aware of the disagreement and 
>>>>> this could end up to be committed by mistake...
>>>>
>>>> You said you wanted to look into this patch in detail after the previous
>>>> series, so I move it into this series. The wording update would strongly
>>>> depend on the outcome of the discussion IMO, so I didn't do it yet.
>>> This could have been mentioned after ---. This could allow people to 
>>> understand the concern and then participate.
>>
>> Will do so in future.
>>
>>>
>>>>
>>>>>>> Also thinking more about it, "The transaction will finally fail due to 
>>>>>>> exceeding the number of nodes quota" may not be true for a couple of 
>>>>>>> reasons:
>>>>>>>    1) The transaction may removed a node afterwards.
>>>>>>
>>>>>> Yes, and? Just because it is a transaction, this is still a violation of
>>>>>> the quota. Even outside a transaction you could use the same reasoning,
>>>>>> but you don't (which is correct, of course).
>>>>>
>>>>> I can understand your point. However, to me, the goal of the transaction is 
>>>>> to commit everything in one go at the end. So the violations in the middle 
>>>>> should not matter.
>>>>
>>>> Of course they should.
>>>>
>>>> We wouldn't allow to write over-sized nodes, even if they could be rewritten in
>>>> the same transaction with a smaller size.
>>>
>>> I view the two different.
>>>
>>>> We wouldn't allow to create nodes which would violate overall memory
>>>> consumption.
>>>>
>>>> We wouldn't allow nodes with more permission entries than allowed, even if they
>>>> could be reduced in the same transaction again.
>>>>
>>>> I don't see why the number of nodes shouldn't be taken into account.
>>>>
>>>>> Furthermore, I would expect a transaction to work on a snapshot of the 
>>>>> system. So it feels really strange to me that we are constantly checking 
>>>>> the quota with the updated values rather than the initial one.
>>>>
>>>> You are aware that the code without this patch is just neglecting the nodes
>>>> created in the transaction? It just is using the current number of nodes
>>>> outside any transaction for quota check.
>>>
>>> Are you referring to the quota check within the transaction?
>>
>> I'm referring to the quota check in create_node(). >
>>>
>>>> So I could easily create a scenario
>>>> where my new code will succeed, but the current one is failing:
>>>> Assume node quota is 1000, and at start of the transaction the guest is owning
>>>> 999 nodes. In the transaction the guest is deleting 10 nodes, then dom0 is
>>>> creating 5 additional nodes owned by the guest. The central node counter is now
>>>> 1004 (over quota), while the in-transaction count is 994.
>>>> In the transaction the
>>>> guest can now happily create a new node (#995) with my patch, but will fail to
>>>> do so without (it sees the 1004 due to the transaction count being ignored).
>>>
>>> This doesn't sound correct. To do you have any test I could use to check the 
>>> behavior?
>>
>> Try it:
>>
>> - create nodes in a guest until you hit the ENOSPC return code due to too many
>>    nodes
>> - start a transaction deleting some nodes and then trying to create another
>>    one, which fail fail with ENOSPC.
> 
> So I have recreated an XTF test which I think match what you wrote [1].
> 
> It is indeed failing without your patch. But then there are still some weird 
> behavior here.
> 
> I would expect the creation of the node would also fail if instead of removing 
> the node if removed outside of the transaction.
> 
> This is not the case because we are looking at the current quota. So shouldn't 
> we snapshot the global count?

As we don't do a global snapshot of the data base for a transaction (this was
changed due to huge memory needs, bad performance, and a higher transaction
failure rate), I don't think we should snapshot the count either.

A transaction is performed atomically at the time it is finished. Therefore
seeing the current global state inside the transaction (with the transaction
private state on top like an overlay) is absolutely fine IMO.


Juergen
Julien Grall Feb. 21, 2023, 10:36 p.m. UTC | #10
Hi Juergen,

On 21/02/2023 08:10, Juergen Gross wrote:
> On 20.02.23 19:01, Julien Grall wrote:
>> So I have recreated an XTF test which I think match what you wrote [1].
>>
>> It is indeed failing without your patch. But then there are still some 
>> weird behavior here.
>>
>> I would expect the creation of the node would also fail if instead of 
>> removing the node if removed outside of the transaction.
>>
>> This is not the case because we are looking at the current quota. So 
>> shouldn't we snapshot the global count?
> 
> As we don't do a global snapshot of the data base for a transaction 
> (this was
> changed due to huge memory needs, bad performance, and a higher transaction
> failure rate), 

I am a bit surprised that the only way to do proper transaction is to 
have a global snapshot. Instead, you could have an overlay.

> I don't think we should snapshot the count either.

But that would mean that the quota will change depending on modification 
of the global database while the transaction is inflight.

I guess this is not better nor worse that the current situation. But it 
is still really confusing for a client because:
   1) The error could happen at random point
   2) You may see an inconsistent database as nodes are only cached when 
they are first accessed

> A transaction is performed atomically at the time it is finished. Therefore
> seeing the current global state inside the transaction (with the 
> transaction
> private state on top like an overlay) is absolutely fine IMO.

To me it is just showing that our concept of transaction is very broken 
in C Xenstored. I am curious to know whether OXenstored is behaving the 
same way.

Anyway, I agree this is not better nor worse than the current situation. 
So I will acked this patch:

Acked-by: Julien Grall <jgrall@amazon.com>

Cheers,
Jürgen Groß Feb. 22, 2023, 8:36 a.m. UTC | #11
On 21.02.23 23:36, Julien Grall wrote:
> Hi Juergen,
> 
> On 21/02/2023 08:10, Juergen Gross wrote:
>> On 20.02.23 19:01, Julien Grall wrote:
>>> So I have recreated an XTF test which I think match what you wrote [1].
>>>
>>> It is indeed failing without your patch. But then there are still some weird 
>>> behavior here.
>>>
>>> I would expect the creation of the node would also fail if instead of 
>>> removing the node if removed outside of the transaction.
>>>
>>> This is not the case because we are looking at the current quota. So 
>>> shouldn't we snapshot the global count?
>>
>> As we don't do a global snapshot of the data base for a transaction (this was
>> changed due to huge memory needs, bad performance, and a higher transaction
>> failure rate), 
> 
> I am a bit surprised that the only way to do proper transaction is to have a 
> global snapshot. Instead, you could have an overlay.

I didn't say that a global snapshot is the only way. And we are using an
overlay.

> 
>> I don't think we should snapshot the count either.
> 
> But that would mean that the quota will change depending on modification of the 
> global database while the transaction is inflight.

I really don't see the problem with that. But it seems our views are different
in this case.

> I guess this is not better nor worse that the current situation. But it is still 
> really confusing for a client because:
>    1) The error could happen at random point

Yes, like without a transaction.

>    2) You may see an inconsistent database as nodes are only cached when they 
> are first accessed

It isn't inconsistent at all. The same could happen if such other nodes are
added/modified/removed just a microsecond before you start the transaction.
You won't be able to tell the difference. You can only reason about nodes
being accessed in the transaction, and those are transaction-local.

>> A transaction is performed atomically at the time it is finished. Therefore
>> seeing the current global state inside the transaction (with the transaction
>> private state on top like an overlay) is absolutely fine IMO.
> 
> To me it is just showing that our concept of transaction is very broken in C 
> Xenstored. I am curious to know whether OXenstored is behaving the same way.

I don't know either.

> Anyway, I agree this is not better nor worse than the current situation. So I 
> will acked this patch:
> 
> Acked-by: Julien Grall <jgrall@amazon.com>

Thanks,


Juergen
Julien Grall Feb. 22, 2023, 8:52 a.m. UTC | #12
Hi Juergen,

On 22/02/2023 08:36, Juergen Gross wrote:
> On 21.02.23 23:36, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 21/02/2023 08:10, Juergen Gross wrote:
>>> On 20.02.23 19:01, Julien Grall wrote:
>>>> So I have recreated an XTF test which I think match what you wrote [1].
>>>>
>>>> It is indeed failing without your patch. But then there are still 
>>>> some weird behavior here.
>>>>
>>>> I would expect the creation of the node would also fail if instead 
>>>> of removing the node if removed outside of the transaction.
>>>>
>>>> This is not the case because we are looking at the current quota. So 
>>>> shouldn't we snapshot the global count?
>>>
>>> As we don't do a global snapshot of the data base for a transaction 
>>> (this was
>>> changed due to huge memory needs, bad performance, and a higher 
>>> transaction
>>> failure rate), 
>>
>> I am a bit surprised that the only way to do proper transaction is to 
>> have a global snapshot. Instead, you could have an overlay.
> 
> I didn't say that a global snapshot is the only way. And we are using an
> overlay.
> 
>>
>>> I don't think we should snapshot the count either.
>>
>> But that would mean that the quota will change depending on 
>> modification of the global database while the transaction is inflight.
> 
> I really don't see the problem with that. But it seems our views are 
> different
> in this case.

See below.

> 
>> I guess this is not better nor worse that the current situation. But 
>> it is still really confusing for a client because:
>>    1) The error could happen at random point
> 
> Yes, like without a transaction.
> 
>>    2) You may see an inconsistent database as nodes are only cached 
>> when they are first accessed
> 
> It isn't inconsistent at all. The same could happen if such other nodes are
> added/modified/removed just a microsecond before you start the transaction.
> You won't be able to tell the difference. You can only reason about nodes
> being accessed in the transaction, and those are transaction-local.

I am not talking about the case a node is added/modified/removed outside 
of a transaction. I am talking about the in-transaction case. For 
example, let say we have a transaction A that remove node 1, 2 and 
transaction B to access 1, 2 (it may do more).

Now let's imagine the following sequence with the initial state is node 
1 and 2 exists:

  - TA started
  - TA remove 1
  - TA remove 2
  - TA remove 3
  - TB started
  - TB read 1
  - TA ended
  - TB read 2

With the above, one would expect that transaction B can read 2 as 
transaction A didn't commit before B started. But this is not what's 
happening.

My point here is that a protocol could require that if 1 is present then 
2 is. So it would be valid for a client to error out because the other 
side was considered to have misbehaved. However, here this is just how 
Xenstored behave and AFAICT it is undocumented.

Cheers,
Jürgen Groß Feb. 22, 2023, 9:37 a.m. UTC | #13
On 22.02.23 09:52, Julien Grall wrote:
> Hi Juergen,
> 
> On 22/02/2023 08:36, Juergen Gross wrote:
>> On 21.02.23 23:36, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> On 21/02/2023 08:10, Juergen Gross wrote:
>>>> On 20.02.23 19:01, Julien Grall wrote:
>>>>> So I have recreated an XTF test which I think match what you wrote [1].
>>>>>
>>>>> It is indeed failing without your patch. But then there are still some 
>>>>> weird behavior here.
>>>>>
>>>>> I would expect the creation of the node would also fail if instead of 
>>>>> removing the node if removed outside of the transaction.
>>>>>
>>>>> This is not the case because we are looking at the current quota. So 
>>>>> shouldn't we snapshot the global count?
>>>>
>>>> As we don't do a global snapshot of the data base for a transaction (this was
>>>> changed due to huge memory needs, bad performance, and a higher transaction
>>>> failure rate), 
>>>
>>> I am a bit surprised that the only way to do proper transaction is to have a 
>>> global snapshot. Instead, you could have an overlay.
>>
>> I didn't say that a global snapshot is the only way. And we are using an
>> overlay.
>>
>>>
>>>> I don't think we should snapshot the count either.
>>>
>>> But that would mean that the quota will change depending on modification of 
>>> the global database while the transaction is inflight.
>>
>> I really don't see the problem with that. But it seems our views are different
>> in this case.
> 
> See below.
> 
>>
>>> I guess this is not better nor worse that the current situation. But it is 
>>> still really confusing for a client because:
>>>    1) The error could happen at random point
>>
>> Yes, like without a transaction.
>>
>>>    2) You may see an inconsistent database as nodes are only cached when they 
>>> are first accessed
>>
>> It isn't inconsistent at all. The same could happen if such other nodes are
>> added/modified/removed just a microsecond before you start the transaction.
>> You won't be able to tell the difference. You can only reason about nodes
>> being accessed in the transaction, and those are transaction-local.
> 
> I am not talking about the case a node is added/modified/removed outside of a 
> transaction. I am talking about the in-transaction case. For example, let say we 
> have a transaction A that remove node 1, 2 and transaction B to access 1, 2 (it 
> may do more).
> 
> Now let's imagine the following sequence with the initial state is node 1 and 2 
> exists:
> 
>   - TA started
>   - TA remove 1
>   - TA remove 2
>   - TA remove 3
>   - TB started
>   - TB read 1
>   - TA ended
>   - TB read 2
> 
> With the above, one would expect that transaction B can read 2 as transaction A 
> didn't commit before B started. But this is not what's happening.
> 
> My point here is that a protocol could require that if 1 is present then 2 is. 
> So it would be valid for a client to error out because the other side was 
> considered to have misbehaved. However, here this is just how Xenstored behave 
> and AFAICT it is undocumented.

Ah, okay.

Yes, I can see your point.

I was thinking for some time now to switch over to a "Copy on Write" scheme for
transactions. This will require to get rid of TDB, as otherwise it will be quite
hard to setup the needed links between nodes.

I'm quite close to succeed on the TDB removal. Switching the transaction
mechanism will need some more thoughts, though.

So this is nothing I'll be able to solve soon. ;-)


Juergen
diff mbox series

Patch

diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index 9ef41ede03..7eb9cd077b 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -1116,9 +1116,8 @@  int domain_nbentry_fix(unsigned int domid, int num, bool update)
 
 int domain_nbentry(struct connection *conn)
 {
-	return (domain_is_unprivileged(conn))
-		? conn->domain->nbentry
-		: 0;
+	return domain_is_unprivileged(conn)
+	       ? domain_nbentry_add(conn, conn->id, 0, true) : 0;
 }
 
 static bool domain_chk_quota(struct domain *domain, int mem)