ima: fix infinite loop within "ima_match_policy" function.

Message ID	20210819101529.28001-1-liqiong@nfschina.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-integrity-owner@kernel.org> sender: liqiong@nfschina.com) by ha0.nfschina.com (Postfix) with ESMTPA id 9B21AAE0DCB; Thu, 19 Aug 2021 18:16:52 +0800 (CST) From: liqiong <liqiong@nfschina.com> To: zohar@linux.ibm.com Cc: dmitry.kasatkin@gmail.com, jmorris@namei.org, serge@hallyn.com, linux-integrity@vger.kernel.org, linux-security-module@vger.kernel.org, linux-kernel@vger.kernel.org, liqiong <liqiong@nfschina.com> Subject: [PATCH] ima: fix infinite loop within "ima_match_policy" function. Date: Thu, 19 Aug 2021 18:15:29 +0800 Message-Id: <20210819101529.28001-1-liqiong@nfschina.com> Precedence: bulk
Series	ima: fix infinite loop within "ima_match_policy" function. \| expand ima: fix infinite loop within "ima_match_policy" function.

Li Qiong Aug. 19, 2021, 10:15 a.m. UTC

When "ima_match_policy" is looping while "ima_update_policy" changs
the variable "ima_rules", then "ima_match_policy" may can't exit loop,
and kernel keeps printf "rcu_sched detected stall on CPU ...".

It occurs at boot phase, systemd-services are being checked within
"ima_match_policy,at the same time, the variable "ima_rules"
is changed by a service.

Signed-off-by: liqiong <liqiong@nfschina.com>
---
 security/integrity/ima/ima_policy.c | 5 +++++
 1 file changed, 5 insertions(+)

THOBY Simon Aug. 19, 2021, 12:58 p.m. UTC | #1

Hi Liqiong,

On 8/19/21 12:15 PM, liqiong wrote:
> When "ima_match_policy" is looping while "ima_update_policy" changs
> the variable "ima_rules", then "ima_match_policy" may can't exit loop,
> and kernel keeps printf "rcu_sched detected stall on CPU ...".
> 
> It occurs at boot phase, systemd-services are being checked within
> "ima_match_policy,at the same time, the variable "ima_rules"
> is changed by a service.

First off, thanks for finding and identifying this nasty bug.

> 
> Signed-off-by: liqiong <liqiong@nfschina.com>
> ---
>  security/integrity/ima/ima_policy.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
> index fd5d46e511f1..7e71e643457c 100644
> --- a/security/integrity/ima/ima_policy.c
> +++ b/security/integrity/ima/ima_policy.c
> @@ -217,6 +217,7 @@ static LIST_HEAD(ima_default_rules);
>  static LIST_HEAD(ima_policy_rules);
>  static LIST_HEAD(ima_temp_rules);
>  static struct list_head *ima_rules = &ima_default_rules;
> +static DECLARE_RWSEM(ima_rules_sem);
>  
>  static int ima_policy __initdata;
>  
> @@ -666,6 +667,7 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode,
>  	if (template_desc && !*template_desc)
>  		*template_desc = ima_template_desc_current();
>  
> +	down_read(&ima_rules_sem);
>  	rcu_read_lock();
>  	list_for_each_entry_rcu(entry, ima_rules, list) {
>  
> @@ -702,6 +704,7 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode,
>  			break;
>  	}
>  	rcu_read_unlock();
> +	up_read(&ima_rules_sem);
>  
>  	return action;
>  }
> @@ -919,7 +922,9 @@ void ima_update_policy(void)
>  
>  	if (ima_rules != policy) {
>  		ima_policy_flag = 0;
> +		down_write(&ima_rules_sem);
>  		ima_rules = policy;
> +		up_write(&ima_rules_sem);
>  
>  		/*
>  		 * IMA architecture specific policy rules are specified
> 

Rather than introducing a new semaphore, I wonder if you couldn't have done something
like the following?

@@ -674,13 +674,15 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode,
                     const char *func_data, unsigned int *allowed_algos)
 {
        struct ima_rule_entry *entry;
+       struct list_head *ima_rules_tmp;
        int action = 0, actmask = flags | (flags << 1);

        if (template_desc && !*template_desc)
                *template_desc = ima_template_desc_current();

        rcu_read_lock();
-       list_for_each_entry_rcu(entry, ima_rules, list) {
+       ima_rules_tmp = rcu_dereference(ima_rules);
+       list_for_each_entry_rcu(entry, ima_rules_tmp, list) {

                if (!(entry->action & actmask))
                        continue;
@@ -970,7 +972,7 @@ void ima_update_policy(void)

        if (ima_rules != policy) {
                ima_policy_flag = 0;
-               ima_rules = policy;
+               rcu_assign_pointer(ima_rules, policy);

                /*
                 * IMA architecture specific policy rules are specified


Also, ima_match_policy is not the only place where we iterate over ima_rules, maybe
this change should be applied to every function that perform a call the like of
"list_for_each_entry_rcu(entry, ima_rules_tmp, list)" ?

All that being said, your change is quite small and I have no objection to it,
I was just wondering whether we could achieve the same effect without locks
with RCU.

What do you think?

Thanks,
Simon

Mimi Zohar Aug. 19, 2021, 1:47 p.m. UTC | #2

On Thu, 2021-08-19 at 12:58 +0000, THOBY Simon wrote:
> Hi Liqiong,
> 
> On 8/19/21 12:15 PM, liqiong wrote:
> > When "ima_match_policy" is looping while "ima_update_policy" changs
> > the variable "ima_rules", then "ima_match_policy" may can't exit loop,
> > and kernel keeps printf "rcu_sched detected stall on CPU ...".
> > 
> > It occurs at boot phase, systemd-services are being checked within
> > "ima_match_policy,at the same time, the variable "ima_rules"
> > is changed by a service.
> 
> First off, thanks for finding and identifying this nasty bug.

Once the initial builtin policy rules have been replaced by a custom
policy, rules may only be appended by splicing the new rules with the
existing rules.  There should never be a problem reading the rules at
that point.   Does this problem occur before the builtin policy rules
have been replaced with a custom policy?

Mimi

Mimi Zohar Aug. 19, 2021, 7:31 p.m. UTC | #3

On Thu, 2021-08-19 at 09:47 -0400, Mimi Zohar wrote:
> On Thu, 2021-08-19 at 12:58 +0000, THOBY Simon wrote:
> > Hi Liqiong,
> > 
> > On 8/19/21 12:15 PM, liqiong wrote:
> > > When "ima_match_policy" is looping while "ima_update_policy" changs
> > > the variable "ima_rules", then "ima_match_policy" may can't exit loop,
> > > and kernel keeps printf "rcu_sched detected stall on CPU ...".
> > > 
> > > It occurs at boot phase, systemd-services are being checked within
> > > "ima_match_policy,at the same time, the variable "ima_rules"
> > > is changed by a service.
> > 
> > First off, thanks for finding and identifying this nasty bug.
> 
> Once the initial builtin policy rules have been replaced by a custom
> policy, rules may only be appended by splicing the new rules with the
> existing rules.  There should never be a problem reading the rules at
> that point.   Does this problem occur before the builtin policy rules
> have been replaced with a custom policy?

Yes, the problem is limited to transitioning from the builtin policy to
the custom policy.   Adding a new lock around rcu code seems counter
productive, especially since switching the policy rules happens once,
normally during early boot before access to real root.  Please consider
Simon's suggestion or finding some other solution.

thanks,

Mimi

Li Qiong Aug. 20, 2021, 10:15 a.m. UTC | #4

Hi, Simon:

This solution is better then rwsem, a temp "ima_rules" variable should 
can fix. I also have a another idea, with a little trick, default list
can traverse to the new list, so we don't need care about the read side. 

here is the patch:

@@ -918,8 +918,21 @@ void ima_update_policy(void)
        list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu);

        if (ima_rules != policy) {
+               struct list_head *prev_rules = ima_rules;
+               struct list_head *first = ima_rules->next;
                ima_policy_flag = 0;
+
+               /*
+                * Make the previous list can traverse to new list,
+                * that is tricky, or there is a deadly loop whithin
+                * "list_for_each_entry_rcu(entry, ima_rules, list)"
+                *
+                * After update "ima_rules", restore the previous list.
+                */
+               prev_rules->next = policy->next;
                ima_rules = policy;
+               syncchronize_rcu();
+               prev_rules->next = first;


The side effect is the "ima_default_rules" will be changed a little while.
But it make sense, the process should be checked again by the new policy.

This patch has been tested, if will do, I can resubmit this patch.

How about this ?

----------
Regards,
liqiong

在 2021年08月19日 20:58, THOBY Simon 写道:
> Hi Liqiong,
>
> On 8/19/21 12:15 PM, liqiong wrote:
>> When "ima_match_policy" is looping while "ima_update_policy" changs
>> the variable "ima_rules", then "ima_match_policy" may can't exit loop,
>> and kernel keeps printf "rcu_sched detected stall on CPU ...".
>>
>> It occurs at boot phase, systemd-services are being checked within
>> "ima_match_policy,at the same time, the variable "ima_rules"
>> is changed by a service.
> First off, thanks for finding and identifying this nasty bug.
>
>> Signed-off-by: liqiong <liqiong@nfschina.com>
>> ---
>>  security/integrity/ima/ima_policy.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
>> index fd5d46e511f1..7e71e643457c 100644
>> --- a/security/integrity/ima/ima_policy.c
>> +++ b/security/integrity/ima/ima_policy.c
>> @@ -217,6 +217,7 @@ static LIST_HEAD(ima_default_rules);
>>  static LIST_HEAD(ima_policy_rules);
>>  static LIST_HEAD(ima_temp_rules);
>>  static struct list_head *ima_rules = &ima_default_rules;
>> +static DECLARE_RWSEM(ima_rules_sem);
>>  
>>  static int ima_policy __initdata;
>>  
>> @@ -666,6 +667,7 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode,
>>  	if (template_desc && !*template_desc)
>>  		*template_desc = ima_template_desc_current();
>>  
>> +	down_read(&ima_rules_sem);
>>  	rcu_read_lock();
>>  	list_for_each_entry_rcu(entry, ima_rules, list) {
>>  
>> @@ -702,6 +704,7 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode,
>>  			break;
>>  	}
>>  	rcu_read_unlock();
>> +	up_read(&ima_rules_sem);
>>  
>>  	return action;
>>  }
>> @@ -919,7 +922,9 @@ void ima_update_policy(void)
>>  
>>  	if (ima_rules != policy) {
>>  		ima_policy_flag = 0;
>> +		down_write(&ima_rules_sem);
>>  		ima_rules = policy;
>> +		up_write(&ima_rules_sem);
>>  
>>  		/*
>>  		 * IMA architecture specific policy rules are specified
>>
> Rather than introducing a new semaphore, I wonder if you couldn't have done something
> like the following?
>
> @@ -674,13 +674,15 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode,
>                      const char *func_data, unsigned int *allowed_algos)
>  {
>         struct ima_rule_entry *entry;
> +       struct list_head *ima_rules_tmp;
>         int action = 0, actmask = flags | (flags << 1);
>
>         if (template_desc && !*template_desc)
>                 *template_desc = ima_template_desc_current();
>
>         rcu_read_lock();
> -       list_for_each_entry_rcu(entry, ima_rules, list) {
> +       ima_rules_tmp = rcu_dereference(ima_rules);
> +       list_for_each_entry_rcu(entry, ima_rules_tmp, list) {
>
>                 if (!(entry->action & actmask))
>                         continue;
> @@ -970,7 +972,7 @@ void ima_update_policy(void)
>
>         if (ima_rules != policy) {
>                 ima_policy_flag = 0;
> -               ima_rules = policy;
> +               rcu_assign_pointer(ima_rules, policy);
>
>                 /*
>                  * IMA architecture specific policy rules are specified
>
>
> Also, ima_match_policy is not the only place where we iterate over ima_rules, maybe
> this change should be applied to every function that perform a call the like of
> "list_for_each_entry_rcu(entry, ima_rules_tmp, list)" ?
>
> All that being said, your change is quite small and I have no objection to it,
> I was just wondering whether we could achieve the same effect without locks
> with RCU.
>
> What do you think?
>
> Thanks,
> Simon

THOBY Simon Aug. 20, 2021, 1:23 p.m. UTC | #5

Hi Liqiong,

On 8/20/21 12:15 PM, 李力琼 wrote:
> Hi, Simon:
> 
> This solution is better then rwsem, a temp "ima_rules" variable should 
> can fix. I also have a another idea, with a little trick, default list
> can traverse to the new list, so we don't need care about the read side. 
> 
> here is the patch:
> 
> @@ -918,8 +918,21 @@ void ima_update_policy(void)
>         list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu);
> 
>         if (ima_rules != policy) {
> +               struct list_head *prev_rules = ima_rules;
> +               struct list_head *first = ima_rules->next;
>                 ima_policy_flag = 0;
> +
> +               /*
> +                * Make the previous list can traverse to new list,
> +                * that is tricky, or there is a deadly loop whithin
> +                * "list_for_each_entry_rcu(entry, ima_rules, list)"
> +                *
> +                * After update "ima_rules", restore the previous list.
> +                */

I think this could be rephrased to be a tad clearer, I am not quite sure
how I must interpret the first sentence of the comment.


> +               prev_rules->next = policy->next;
>                 ima_rules = policy;
> +               syncchronize_rcu();

I'm a bit puzzled as you seem to imply in the mail this patch was tested,
but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel.
Was that a copy/paste error? Or maybe you forgot the 'not' in "This
patch has been tested"? These errors happen, and I am myself quite an
expert in doing them :)

> +               prev_rules->next = first;
> 
> 
> The side effect is the "ima_default_rules" will be changed a little while.
> But it make sense, the process should be checked again by the new policy.
> 
> This patch has been tested, if will do, I can resubmit this patch.> 
> How about this ?


Correct me if I'm wrong, here is how I think I understand you patch.
We start with a situation like that (step 0):
ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0

Then we decide to update the policy for the first time, so
'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'.
We enter the condition.
First we copy the current value of ima_rules (&ima_default_rules)
to a temporary variable 'prev_rules'. We also create a pointer dubbed
'first' to the entry 1 in the default list (step 1):
prev_rules -------------
                       \/
ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
                                                                   /\
first --------------------------------------------------------------


Then we update prev_rules->next to point to policy->next (step 2):
List entry 1 <-> List entry 2 <-> ... -> List entry 0
 /\
first
	(notice that list entry 0 no longer points backwards to 'list entry 1',
	but I don't think there is any reverse iteration in IMA, so it should be
	safe)

prev_rules -------------
                       \/
ima_rules --> List entry 0 (head node) = ima_default_rules   
                       |
                       |
                       -------------------------------------------
                                                                 \/
policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'


We then update ima_rules to point to ima_policy_rules (step 3):
List entry 1 <-> List entry 2 <-> ... -> List entry 0
 /\
first

prev_rules -------------
                       \/
ima_rules     List entry 0 (head node) = ima_default_rules   
     |                 |
     |                 |
     |                 ------------------------------------------
     ---------------                                            |
                   \/                                           \/
policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
                                                                   /\
first --------------------------------------------------------------

Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4).

Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5):

List entry 1 <-> List entry 2 <-> ... -> List entry 0
 /\
first

prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
                                                                     /\
                                                                 first (now useless)
ima_rules        
     |
     |
     |
     ---------------
                   \/
policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'

The goal is that readers should still be able to loop
(forward, as we saw that backward looping is temporarily broken)
while in steps 0-4.

I'm not completely sure what would happen to a client that started iterating
over ima_rules right after step 2.

Wouldn't they be able to start looping through the new policy
as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules?
And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly
very shortly thereafter) completed?
And would the compiler be allowed to optimize the read to 'ima_rules' in the
list_for_each_entry() loop, thereby never reloading the new value for
'ima_rules', and thus looping forever, just what we are trying to avoid?

Overall, I'm tempted to say this is perhaps a bit too complex (at least,
my head tells me it is, but that may very well be because I'm terrible
at concurrency issues).

Honestly, in this case I think awaiting input from more experienced
kernel devs than I is the best path forward :-)

> 
> ----------
> Regards,
> liqiong
> 

Thanks,
Simon

Mimi Zohar Aug. 20, 2021, 3:48 p.m. UTC | #6

On Fri, 2021-08-20 at 13:23 +0000, THOBY Simon wrote:
> Hi Liqiong,
> 
> On 8/20/21 12:15 PM, 李力琼 wrote:
> > Hi, Simon:
> > 
> > This solution is better then rwsem, a temp "ima_rules" variable should 
> > can fix. I also have a another idea, with a little trick, default list
> > can traverse to the new list, so we don't need care about the read side. 
> > 
> > here is the patch:
> > 
> > @@ -918,8 +918,21 @@ void ima_update_policy(void)
> >         list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu);
> > 
> >         if (ima_rules != policy) {
> > +               struct list_head *prev_rules = ima_rules;
> > +               struct list_head *first = ima_rules->next;
> >                 ima_policy_flag = 0;
> > +
> > +               /*
> > +                * Make the previous list can traverse to new list,
> > +                * that is tricky, or there is a deadly loop whithin
> > +                * "list_for_each_entry_rcu(entry, ima_rules, list)"
> > +                *
> > +                * After update "ima_rules", restore the previous list.
> > +                */
> 
> I think this could be rephrased to be a tad clearer, I am not quite sure
> how I must interpret the first sentence of the comment.
> 
> 
> > +               prev_rules->next = policy->next;
> >                 ima_rules = policy;
> > +               syncchronize_rcu();
> 
> I'm a bit puzzled as you seem to imply in the mail this patch was tested,
> but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel.
> Was that a copy/paste error? Or maybe you forgot the 'not' in "This
> patch has been tested"? These errors happen, and I am myself quite an
> expert in doing them :)
> 
> > +               prev_rules->next = first;
> > 
> > 
> > The side effect is the "ima_default_rules" will be changed a little while.
> > But it make sense, the process should be checked again by the new policy.
> > 
> > This patch has been tested, if will do, I can resubmit this patch.> 
> > How about this ?
> 
> 
> Correct me if I'm wrong, here is how I think I understand you patch.
> We start with a situation like that (step 0):
> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
> 
> Then we decide to update the policy for the first time, so
> 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'.
> We enter the condition.
> First we copy the current value of ima_rules (&ima_default_rules)
> to a temporary variable 'prev_rules'. We also create a pointer dubbed
> 'first' to the entry 1 in the default list (step 1):
> prev_rules -------------
>                        \/
> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>                                                                    /\
> first --------------------------------------------------------------
> 
> 
> Then we update prev_rules->next to point to policy->next (step 2):
> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>  /\
> first
> 	(notice that list entry 0 no longer points backwards to 'list entry 1',
> 	but I don't think there is any reverse iteration in IMA, so it should be
> 	safe)
> 
> prev_rules -------------
>                        \/
> ima_rules --> List entry 0 (head node) = ima_default_rules   
>                        |
>                        |
>                        -------------------------------------------
>                                                                  \/
> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
> 
> 
> We then update ima_rules to point to ima_policy_rules (step 3):
> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>  /\
> first
> 
> prev_rules -------------
>                        \/
> ima_rules     List entry 0 (head node) = ima_default_rules   
>      |                 |
>      |                 |
>      |                 ------------------------------------------
>      ---------------                                            |
>                    \/                                           \/
> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>                                                                    /\
> first --------------------------------------------------------------
> 
> Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4).
> 
> Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5):
> 
> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>  /\
> first
> 
> prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>                                                                      /\
>                                                                  first (now useless)
> ima_rules        
>      |
>      |
>      |
>      ---------------
>                    \/
> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
> 
> The goal is that readers should still be able to loop
> (forward, as we saw that backward looping is temporarily broken)
> while in steps 0-4.
> 
> I'm not completely sure what would happen to a client that started iterating
> over ima_rules right after step 2.
> 
> Wouldn't they be able to start looping through the new policy
> as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules?
> And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly
> very shortly thereafter) completed?
> And would the compiler be allowed to optimize the read to 'ima_rules' in the
> list_for_each_entry() loop, thereby never reloading the new value for
> 'ima_rules', and thus looping forever, just what we are trying to avoid?
> 
> Overall, I'm tempted to say this is perhaps a bit too complex (at least,
> my head tells me it is, but that may very well be because I'm terrible
> at concurrency issues).
> 
> Honestly, in this case I think awaiting input from more experienced
> kernel devs than I is the best path forward :-)

I'm far from an expert on RCU locking, but __list_splice_init_rcu()
provides an example of how to make sure there aren't any readers
traversing the list, before two lists are spliced together.   In our
case, after there aren't any readers, instead of splicing two lists
together, it should be safe to point to the new list.

thanks,

Mimi

Li Qiong Aug. 20, 2021, 5:53 p.m. UTC | #7

Hi Simon,

On 2021/8/20 21:23, THOBY Simon wrote:
> Hi Liqiong,
>
> On 8/20/21 12:15 PM, 李力琼 wrote:
>> Hi, Simon:
>>
>> This solution is better then rwsem, a temp "ima_rules" variable should
>> can fix. I also have a another idea, with a little trick, default list
>> can traverse to the new list, so we don't need care about the read side.
>>
>> here is the patch:
>>
>> @@ -918,8 +918,21 @@ void ima_update_policy(void)
>>          list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu);
>>
>>          if (ima_rules != policy) {
>> +               struct list_head *prev_rules = ima_rules;
>> +               struct list_head *first = ima_rules->next;
>>                  ima_policy_flag = 0;
>> +
>> +               /*
>> +                * Make the previous list can traverse to new list,
>> +                * that is tricky, or there is a deadly loop whithin
>> +                * "list_for_each_entry_rcu(entry, ima_rules, list)"
>> +                *
>> +                * After update "ima_rules", restore the previous list.
>> +                */
> I think this could be rephrased to be a tad clearer, I am not quite sure
> how I must interpret the first sentence of the comment.
I got it,  how about this:
  /*
   * The previous list has to traverse to new list,
   * Or there may be a deadly loop within
   * "list_for_each_entry_rcu(entry, ima_rules, list)"
   *
   * That is tricky, after updated "ima_rules", restore the previous list.
   */
>
>
>> +               prev_rules->next = policy->next;
>>                  ima_rules = policy;
>> +               syncchronize_rcu();
> I'm a bit puzzled as you seem to imply in the mail this patch was tested,
> but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel.
> Was that a copy/paste error? Or maybe you forgot the 'not' in "This
> patch has been tested"? These errors happen, and I am myself quite an
> expert in doing them :)


Sorry for the mistake, I copy/paste the patch and delete/edit some lines,
have reviewed before sending, but not found. I have made a case to reproduce
the error, dumping "ima_rules" and every item address of list in the error
situaiton, I can watchthe "ima_rules" change, old list traversing to the 
new list.
And I have been doing a reboot test which found this bug. This patch 
seems to work fine.


>
>> +               prev_rules->next = first;
>>
>>
>> The side effect is the "ima_default_rules" will be changed a little while.
>> But it make sense, the process should be checked again by the new policy.
>>
>> This patch has been tested, if will do, I can resubmit this patch.>
>> How about this ?
>
> Correct me if I'm wrong, here is how I think I understand you patch.
> We start with a situation like that (step 0):
> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>
> Then we decide to update the policy for the first time, so
> 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'.
> We enter the condition.
> First we copy the current value of ima_rules (&ima_default_rules)
> to a temporary variable 'prev_rules'. We also create a pointer dubbed
> 'first' to the entry 1 in the default list (step 1):
> prev_rules -------------
>                         \/
> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>                                                                     /\
> first --------------------------------------------------------------
>
>
> Then we update prev_rules->next to point to policy->next (step 2):
> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>   /\
> first
> 	(notice that list entry 0 no longer points backwards to 'list entry 1',
> 	but I don't think there is any reverse iteration in IMA, so it should be
> 	safe)
>
> prev_rules -------------
>                         \/
> ima_rules --> List entry 0 (head node) = ima_default_rules
>                         |
>                         |
>                         -------------------------------------------
>                                                                   \/
> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>
>
> We then update ima_rules to point to ima_policy_rules (step 3):
> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>   /\
> first
>
> prev_rules -------------
>                         \/
> ima_rules     List entry 0 (head node) = ima_default_rules
>       |                 |
>       |                 |
>       |                 ------------------------------------------
>       ---------------                                            |
>                     \/                                           \/
> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>                                                                     /\
> first --------------------------------------------------------------
>
> Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4).
>
> Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5):
>
> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>   /\
> first
>
> prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>                                                                       /\
>                                                                   first (now useless)
> ima_rules
>       |
>       |
>       |
>       ---------------
>                     \/
> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>
> The goal is that readers should still be able to loop
> (forward, as we saw that backward looping is temporarily broken)
> while in steps 0-4.


Yes, It's the workflow.


> I'm not completely sure what would happen to a client that started iterating
> over ima_rules right after step 2.
>
> Wouldn't they be able to start looping through the new policy
> as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules?
> And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly
> very shortly thereafter) completed?
> And would the compiler be allowed to optimize the read to 'ima_rules' in the
> list_for_each_entry() loop, thereby never reloading the new value for
> 'ima_rules', and thus looping forever, just what we are trying to avoid?


Yes,  "ima_rules" cache not update in time, It's a risk. I am not sure 
if "WRITE_ONCE"
can do this trick. How about:
     WRITE_ONCE(prev_rules->next, policy->next);
     WRITE_ONCE(ima_rules, policy);


If can't fix the cache issue, maybe the "ima_rules_tmp" solution is the 
best way.
I will test it.


> Overall, I'm tempted to say this is perhaps a bit too complex (at least,
> my head tells me it is, but that may very well be because I'm terrible
> at concurrency issues).
>
> Honestly, in this case I think awaiting input from more experienced
> kernel devs than I is the best path forward :-)
>
>> ----------
>> Regards,
>> liqiong
>>
> Thanks,
> Simon

Li Qiong Aug. 23, 2021, 3:04 a.m. UTC | #8

Hi Mimi :

The situation is a little different,'list_splice_init_rcu'
don't change the list head. If "ima_rules" being changed,
readers may can't reload the new value in time for cpu cache
or compiler optimization. Defining "ima_rules" as a volatile 
variable can fix, but It is inefficient.

Maybe using a temporary ima_rules variable for every 
"list_for_each_entry_rcu(entry, ima_rules, list)" loop is 
a better solution to fix the "endless loop" bug. 

Regards,

liqiong

在 2021年08月20日 23:48, Mimi Zohar 写道:
> On Fri, 2021-08-20 at 13:23 +0000, THOBY Simon wrote:
>> Hi Liqiong,
>>
>> On 8/20/21 12:15 PM, 李力琼 wrote:
>>> Hi, Simon:
>>>
>>> This solution is better then rwsem, a temp "ima_rules" variable should 
>>> can fix. I also have a another idea, with a little trick, default list
>>> can traverse to the new list, so we don't need care about the read side. 
>>>
>>> here is the patch:
>>>
>>> @@ -918,8 +918,21 @@ void ima_update_policy(void)
>>>         list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu);
>>>
>>>         if (ima_rules != policy) {
>>> +               struct list_head *prev_rules = ima_rules;
>>> +               struct list_head *first = ima_rules->next;
>>>                 ima_policy_flag = 0;
>>> +
>>> +               /*
>>> +                * Make the previous list can traverse to new list,
>>> +                * that is tricky, or there is a deadly loop whithin
>>> +                * "list_for_each_entry_rcu(entry, ima_rules, list)"
>>> +                *
>>> +                * After update "ima_rules", restore the previous list.
>>> +                */
>> I think this could be rephrased to be a tad clearer, I am not quite sure
>> how I must interpret the first sentence of the comment.
>>
>>
>>> +               prev_rules->next = policy->next;
>>>                 ima_rules = policy;
>>> +               syncchronize_rcu();
>> I'm a bit puzzled as you seem to imply in the mail this patch was tested,
>> but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel.
>> Was that a copy/paste error? Or maybe you forgot the 'not' in "This
>> patch has been tested"? These errors happen, and I am myself quite an
>> expert in doing them :)
>>
>>> +               prev_rules->next = first;
>>>
>>>
>>> The side effect is the "ima_default_rules" will be changed a little while.
>>> But it make sense, the process should be checked again by the new policy.
>>>
>>> This patch has been tested, if will do, I can resubmit this patch.> 
>>> How about this ?
>> least
>>
>> Correct me if I'm wrong, here is how I think I understand you patch.
>> We start with a situation like that (step 0):
>> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>>
>> Then we decide to update the policy for the first time, so
>> 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'.
>> We enter the condition.
>> First we copy the current value of ima_rules (&ima_default_rules)
>> to a temporary variable 'prev_rules'. We also create a pointer dubbed
>> 'first' to the entry 1 in the default list (step 1):
>> prev_rules -------------
>>                        \/
>> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>>                                                                    /\
>> first --------------------------------------------------------------
>>
>>
>> Then we update prev_rules->next to point to policy->next (step 2):
>> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>>  /\
>> first
>> 	(notice that list entry 0 no longer points backwards to 'list entry 1',
>> 	but I don't think there is any reverse iteration in IMA, so it should be
>> 	safe)
>>
>> prev_rules -------------
>>                        \/
>> ima_rules --> List entry 0 (head node) = ima_default_rules   
>>                        |
>>                        |
>>                        -------------------------------------------
>>                                                                  \/
>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>>
>>
>> We then update ima_rules to point to ima_policy_rules (step 3):
>> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>>  /\
>> first
>>
>> prev_rules -------------
>>                        \/
>> ima_rules     List entry 0 (head node) = ima_default_rules   
>>      |                 |
>>      |                 |
>>      |                 ------------------------------------------
>>      ---------------                                            |
>>                    \/                                           \/
>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>>                                                   synchronize_rcu                 /\
>> first --------------------------------------------------------------
>>
>> Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4).
>>
>> Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5):
>>
>> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>>  /\
>> first
>>
>> prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>>                                                                      /\
>>                                                                  first (now useless)
>> ima_rules        
>>      |
>>      |
>>      |
>>      ---------------
>>                    \/
>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>>
>> The goal is that readers should still be able to loop
>> (forward, as we saw that backward looping is temporarily broken)
>> while in steps 0-4.
>>
>> I'm not completely sure what would happen to a client that started iterating
>> over ima_rules right after step 2.
>>
>> Wouldn't they be able to start looping through the new policy
>> as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules?
>> And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly
>> very shortly thereafter) completed?
>> And would the compiler be allowed to optimize the read to 'ima_rules' in the
>> list_for_each_entry() loop, thereby never reloading the new value for
>> 'ima_rules', and thus looping forever, just what we are trying to avoid?
>>
>> Overall, I'm tempted to say this is perhaps a bit too complex (at least,
>> my head tells me it is, but that may very well be because I'm terrible
>> at concurrency issues).
>>
>> Honestly, in this case I think awaiting input from more experienced
>> kernel devs than I is the best path forward :-)
> I'm far from an expert on RCU locking, but __list_splice_init_rcu()
> provides an example of how to make sure there aren't any readers
> traversing the list, before two lists are spliced together.   In our
> case, after there aren't any readers, instead of splicing two lists
> together, it should be safe to point to the new list.
>
> thanks,
>
> Mimi
>

THOBY Simon Aug. 23, 2021, 7:13 a.m. UTC | #9

Hi Liqiong,

On 8/20/21 7:53 PM, liqiong wrote:
> Hi Simon,
> 
> On 2021/8/20 21:23, THOBY Simon wrote:
>> Hi Liqiong,
>>
>> On 8/20/21 12:15 PM, 李力琼 wrote:
>>> Hi, Simon:
>>>
>>> This solution is better then rwsem, a temp "ima_rules" variable should
>>> can fix. I also have a another idea, with a little trick, default list
>>> can traverse to the new list, so we don't need care about the read side.
>>>
>>> here is the patch:
>>>
>>> @@ -918,8 +918,21 @@ void ima_update_policy(void)
>>>          list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu);
>>>
>>>          if (ima_rules != policy) {
>>> +               struct list_head *prev_rules = ima_rules;
>>> +               struct list_head *first = ima_rules->next;
>>>                  ima_policy_flag = 0;
>>> +
>>> +               /*
>>> +                * Make the previous list can traverse to new list,
>>> +                * that is tricky, or there is a deadly loop whithin
>>> +                * "list_for_each_entry_rcu(entry, ima_rules, list)"
>>> +                *
>>> +                * After update "ima_rules", restore the previous list.
>>> +                */
>> I think this could be rephrased to be a tad clearer, I am not quite sure
>> how I must interpret the first sentence of the comment.
> I got it,  how about this:
>  /*
>   * The previous list has to traverse to new list,
>   * Or there may be a deadly loop within

Maybe 'deadlock' would be clearer than 'deadly loop'?

>   * "list_for_each_entry_rcu(entry, ima_rules, list)"
>   *
>   * That is tricky, after updated "ima_rules", restore the previous list.

Maybe something like "This is tricky, so we restore the previous list (ima_default_rules)
once 'ima_rules' is updated" ?

>   */>>
>>
>>> +               prev_rules->next = policy->next;
>>>                  ima_rules = policy;
>>> +               syncchronize_rcu();
>> I'm a bit puzzled as you seem to imply in the mail this patch was tested,
>> but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel.
>> Was that a copy/paste error? Or maybe you forgot the 'not' in "This
>> patch has been tested"? These errors happen, and I am myself quite an
>> expert in doing them :)
> 
> 
> Sorry for the mistake, I copy/paste the patch and delete/edit some lines,
> have reviewed before sending, but not found. I have made a case to reproduce
> the error, dumping "ima_rules" and every item address of list in the error
> situaiton, I can watchthe "ima_rules" change, old list traversing to the new list.
> And I have been doing a reboot test which found this bug. This patch seems to work fine.
> 

No worries, i just wanted to make sure I understood you correctly.

> 
>>
>>> +               prev_rules->next = first;
>>>
>>>
>>> The side effect is the "ima_default_rules" will be changed a little while.
>>> But it make sense, the process should be checked again by the new policy.
>>>
>>> This patch has been tested, if will do, I can resubmit this patch.>
>>> How about this ?
>>
>> Correct me if I'm wrong, here is how I think I understand you patch.
>> We start with a situation like that (step 0):
>> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>>
>> Then we decide to update the policy for the first time, so
>> 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'.
>> We enter the condition.
>> First we copy the current value of ima_rules (&ima_default_rules)
>> to a temporary variable 'prev_rules'. We also create a pointer dubbed
>> 'first' to the entry 1 in the default list (step 1):
>> prev_rules -------------
>>                         \/
>> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>>                                                                     /\
>> first --------------------------------------------------------------
>>
>>
>> Then we update prev_rules->next to point to policy->next (step 2):
>> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>>   /\
>> first
>>     (notice that list entry 0 no longer points backwards to 'list entry 1',
>>     but I don't think there is any reverse iteration in IMA, so it should be
>>     safe)
>>
>> prev_rules -------------
>>                         \/
>> ima_rules --> List entry 0 (head node) = ima_default_rules
>>                         |
>>                         |
>>                         -------------------------------------------
>>                                                                   \/
>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>>
>>
>> We then update ima_rules to point to ima_policy_rules (step 3):
>> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>>   /\
>> first
>>
>> prev_rules -------------
>>                         \/
>> ima_rules     List entry 0 (head node) = ima_default_rules
>>       |                 |
>>       |                 |
>>       |                 ------------------------------------------
>>       ---------------                                            |
>>                     \/                                           \/
>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>>                                                                     /\
>> first --------------------------------------------------------------
>>
>> Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4).
>>
>> Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5):
>>
>> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>>   /\
>> first
>>
>> prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>>                                                                       /\
>>                                                                   first (now useless)
>> ima_rules
>>       |
>>       |
>>       |
>>       ---------------
>>                     \/
>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>>
>> The goal is that readers should still be able to loop
>> (forward, as we saw that backward looping is temporarily broken)
>> while in steps 0-4.
> 
> 
> Yes, It's the workflow.
> 
> 
>> I'm not completely sure what would happen to a client that started iterating
>> over ima_rules right after step 2.
>>
>> Wouldn't they be able to start looping through the new policy
>> as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules?
>> And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly
>> very shortly thereafter) completed?
>> And would the compiler be allowed to optimize the read to 'ima_rules' in the
>> list_for_each_entry() loop, thereby never reloading the new value for
>> 'ima_rules', and thus looping forever, just what we are trying to avoid?
> 
> 
> Yes,  "ima_rules" cache not update in time, It's a risk. I am not sure if "WRITE_ONCE"
> can do this trick. How about:
>     WRITE_ONCE(prev_rules->next, policy->next);
>     WRITE_ONCE(ima_rules, policy);

Quite frankly, I don't know. As I said earlier, this is really way above my level.
I'm fine waiting for more experienced opinions on this one.

On the aspect of maintainability, I do think this solution is perhaps too complex
when compared to other solutions like the semaphore you first proposed.
A solution of similar complexity with RCU would be ideal to prevent adding a
semaphore on a read-mostly scenario, but I'm still more confident in the semaphore
than in the solution above, because it is easy to have confidence in the semaphore,
while this patch is not at all obvious to me, and maybe the next person who will
have to edit that piece of code.

> 
> If can't fix the cache issue, maybe the "ima_rules_tmp" solution is the best way.
> I will test it.
> 
> 
>> Overall, I'm tempted to say this is perhaps a bit too complex (at least,
>> my head tells me it is, but that may very well be because I'm terrible
>> at concurrency issues).
>>
>> Honestly, in this case I think awaiting input from more experienced
>> kernel devs than I is the best path forward :-)
>>
>>> ----------
>>> Regards,
>>> liqiong

Thanks,
Simon

Li Qiong Aug. 23, 2021, 7:51 a.m. UTC | #10

Hi Simon :

Using a temporary ima_rules variable is not working for "ima_policy_next". void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos) { struct ima_rule_entry *entry = v; + struct list_head *ima_rules_tmp = rcu_dereference(ima_rules); rcu_read_lock(); entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list); rcu_read_unlock(); (*pos)++; - return (&entry->list == ima_rules) ? NULL : entry; + return (&entry->list == ima_rules_tmp) ? NULL : entry; }

It seems no way to fix "ima_rules" change within this function, it will alway
return a entry if "ima_rules" being changed.

Regrads,

liqiong

在 2021年08月23日 11:04, 李力琼 写道:
> Hi Mimi :
>
> The situation is a little different,'list_splice_init_rcu'
> don't change the list head. If "ima_rules" being changed,
> readers may can't reload the new value in time for cpu cache
> or compiler optimization. Defining "ima_rules" as a volatile 
> variable can fix, but It is inefficient.
>
> Maybe using a temporary ima_rules variable for every 
> "list_for_each_entry_rcu(entry, ima_rules, list)" loop is 
> a better solution to fix the "endless loop" bug. 
>
> Regards,
>
> liqiong
>
> 在 2021年08月20日 23:48, Mimi Zohar 写道:
>> On Fri, 2021-08-20 at 13:23 +0000, THOBY Simon wrote:
>>> Hi Liqiong,
>>>
>>> On 8/20/21 12:15 PM, 李力琼 wrote:
>>>> Hi, Simon:
>>>>
>>>> This solution is better then rwsem, a temp "ima_rules" variable should 
>>>> can fix. I also have a another idea, with a little trick, default list
>>>> can traverse to the new list, so we don't need care about the read side. 
>>>>
>>>> here is the patch:
>>>>
>>>> @@ -918,8 +918,21 @@ void ima_update_policy(void)
>>>>         list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu);
>>>>
>>>>         if (ima_rules != policy) {
>>>> +               struct list_head *prev_rules = ima_rules;
>>>> +               struct list_head *first = ima_rules->next;
>>>>                 ima_policy_flag = 0;
>>>> +
>>>> +               /*
>>>> +                * Make the previous list can traverse to new list,
>>>> +                * that is tricky, or there is a deadly loop whithin
>>>> +                * "list_for_each_entry_rcu(entry, ima_rules, list)"
>>>> +                *
>>>> +                * After update "ima_rules", restore the previous list.
>>>> +                */
>>> I think this could be rephrased to be a tad clearer, I am not quite sure
>>> how I must interpret the first sentence of the comment.
>>>
>>>
>>>> +               prev_rules->next = policy->next;
>>>>                 ima_rules = policy;
>>>> +               syncchronize_rcu();
>>> I'm a bit puzzled as you seem to imply in the mail this patch was tested,
>>> but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel.
>>> Was that a copy/paste error? Or maybe you forgot the 'not' in "This
>>> patch has been tested"? These errors happen, and I am myself quite an
>>> expert in doing them :)
>>>
>>>> +               prev_rules->next = first;
>>>>
>>>>
>>>> The side effect is the "ima_default_rules" will be changed a little while.
>>>> But it make sense, the process should be checked again by the new policy.
>>>>
>>>> This patch has been tested, if will do, I can resubmit this patch.> 
>>>> How about this ?
>>> least
>>>
>>> Correct me if I'm wrong, here is how I think I understand you patch.
>>> We start with a situation like that (step 0):
>>> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>>>
>>> Then we decide to update the policy for the first time, so
>>> 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'.
>>> We enter the condition.
>>> First we copy the current value of ima_rules (&ima_default_rules)
>>> to a temporary variable 'prev_rules'. We also create a pointer dubbed
>>> 'first' to the entry 1 in the default list (step 1):
>>> prev_rules -------------
>>>                        \/
>>> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>>>                                                                    /\
>>> first --------------------------------------------------------------
>>>
>>>
>>> Then we update prev_rules->next to point to policy->next (step 2):
>>> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>>>  /\
>>> first
>>> 	(notice that list entry 0 no longer points backwards to 'list entry 1',
>>> 	but I don't think there is any reverse iteration in IMA, so it should be
>>> 	safe)
>>>
>>> prev_rules -------------
>>>                        \/
>>> ima_rules --> List entry 0 (head node) = ima_default_rules   
>>>                        |
>>>                        |
>>>                        -------------------------------------------
>>>                                                                  \/
>>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>>>
>>>
>>> We then update ima_rules to point to ima_policy_rules (step 3):
>>> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>>>  /\
>>> first
>>>
>>> prev_rules -------------
>>>                        \/
>>> ima_rules     List entry 0 (head node) = ima_default_rules   
>>>      |                 |
>>>      |                 |
>>>      |                 ------------------------------------------
>>>      ---------------                                            |
>>>                    \/                                           \/
>>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>>>                                                   synchronize_rcu                 /\
>>> first --------------------------------------------------------------
>>>
>>> Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4).
>>>
>>> Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5):
>>>
>>> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>>>  /\
>>> first
>>>
>>> prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>>>                                                                      /\
>>>                                                                  first (now useless)
>>> ima_rules        
>>>      |
>>>      |
>>>      |
>>>      ---------------
>>>                    \/
>>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>>>
>>> The goal is that readers should still be able to loop
>>> (forward, as we saw that backward looping is temporarily broken)
>>> while in steps 0-4.
>>>
>>> I'm not completely sure what would happen to a client that started iterating
>>> over ima_rules right after step 2.
>>>
>>> Wouldn't they be able to start looping through the new policy
>>> as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules?
>>> And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly
>>> very shortly thereafter) completed?
>>> And would the compiler be allowed to optimize the read to 'ima_rules' in the
>>> list_for_each_entry() loop, thereby never reloading the new value for
>>> 'ima_rules', and thus looping forever, just what we are trying to avoid?
>>>
>>> Overall, I'm tempted to say this is perhaps a bit too complex (at least,
>>> my head tells me it is, but that may very well be because I'm terrible
>>> at concurrency issues).
>>>
>>> Honestly, in this case I think awaiting input from more experienced
>>> kernel devs than I is the best path forward :-)
>> I'm far from an expert on RCU locking, but __list_splice_init_rcu()
>> provides an example of how to make sure there aren't any readers
>> traversing the list, before two lists are spliced together.   In our
>> case, after there aren't any readers, instead of splicing two lists
>> together, it should be safe to point to the new list.
>>
>> thanks,
>>
>> Mimi
>>

Li Qiong Aug. 23, 2021, 8:06 a.m. UTC | #11

Hi Simon :

Using a temporary ima_rules variable is not working for "ima_policy_next". 

 void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos)
 {
 	struct ima_rule_entry *entry = v;
-
+	struct list_head *ima_rules_tmp = rcu_dereference(ima_rules);
 	rcu_read_lock();
 	entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list);
 	rcu_read_unlock();
 	(*pos)++;
 
-	return (&entry->list == ima_rules) ? NULL : entry;
+	return (&entry->list == ima_rules_tmp) ? NULL : entry;
 }

It seems no way to fix "ima_rules" change within this function, it will alway
return a entry if "ima_rules" being changed.

Regrads,

liqiong



在 2021年08月23日 11:04, 李力琼 写道:
> Hi Mimi :
>
> The situation is a little different,'list_splice_init_rcu'
> don't change the list head. If "ima_rules" being changed,
> readers may can't reload the new value in time for cpu cache
> or compiler optimization. Defining "ima_rules" as a volatile 
> variable can fix, but It is inefficient.
>
> Maybe using a temporary ima_rules variable for every 
> "list_for_each_entry_rcu(entry, ima_rules, list)" loop is 
> a better solution to fix the "endless loop" bug. 
>
> Regards,
>
> liqiong
>
> 在 2021年08月20日 23:48, Mimi Zohar 写道:
>> On Fri, 2021-08-20 at 13:23 +0000, THOBY Simon wrote:
>>> Hi Liqiong,
>>>
>>> On 8/20/21 12:15 PM, 李力琼 wrote:
>>>> Hi, Simon:
>>>>
>>>> This solution is better then rwsem, a temp "ima_rules" variable should 
>>>> can fix. I also have a another idea, with a little trick, default list
>>>> can traverse to the new list, so we don't need care about the read side. 
>>>>
>>>> here is the patch:
>>>>
>>>> @@ -918,8 +918,21 @@ void ima_update_policy(void)
>>>>         list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu);
>>>>
>>>>         if (ima_rules != policy) {
>>>> +               struct list_head *prev_rules = ima_rules;
>>>> +               struct list_head *first = ima_rules->next;
>>>>                 ima_policy_flag = 0;
>>>> +
>>>> +               /*
>>>> +                * Make the previous list can traverse to new list,
>>>> +                * that is tricky, or there is a deadly loop whithin
>>>> +                * "list_for_each_entry_rcu(entry, ima_rules, list)"
>>>> +                *
>>>> +                * After update "ima_rules", restore the previous list.
>>>> +                */
>>> I think this could be rephrased to be a tad clearer, I am not quite sure
>>> how I must interpret the first sentence of the comment.
>>>
>>>
>>>> +               prev_rules->next = policy->next;
>>>>                 ima_rules = policy;
>>>> +               syncchronize_rcu();
>>> I'm a bit puzzled as you seem to imply in the mail this patch was tested,
>>> but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel.
>>> Was that a copy/paste error? Or maybe you forgot the 'not' in "This
>>> patch has been tested"? These errors happen, and I am myself quite an
>>> expert in doing them :)
>>>
>>>> +               prev_rules->next = first;
>>>>
>>>>
>>>> The side effect is the "ima_default_rules" will be changed a little while.
>>>> But it make sense, the process should be checked again by the new policy.
>>>>
>>>> This patch has been tested, if will do, I can resubmit this patch.> 
>>>> How about this ?
>>> least
>>>
>>> Correct me if I'm wrong, here is how I think I understand you patch.
>>> We start with a situation like that (step 0):
>>> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>>>
>>> Then we decide to update the policy for the first time, so
>>> 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'.
>>> We enter the condition.
>>> First we copy the current value of ima_rules (&ima_default_rules)
>>> to a temporary variable 'prev_rules'. We also create a pointer dubbed
>>> 'first' to the entry 1 in the default list (step 1):
>>> prev_rules -------------
>>>                        \/
>>> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>>>                                                                    /\
>>> first --------------------------------------------------------------
>>>
>>>
>>> Then we update prev_rules->next to point to policy->next (step 2):
>>> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>>>  /\
>>> first
>>> 	(notice that list entry 0 no longer points backwards to 'list entry 1',
>>> 	but I don't think there is any reverse iteration in IMA, so it should be
>>> 	safe)
>>>
>>> prev_rules -------------
>>>                        \/
>>> ima_rules --> List entry 0 (head node) = ima_default_rules   
>>>                        |
>>>                        |
>>>                        -------------------------------------------
>>>                                                                  \/
>>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>>>
>>>
>>> We then update ima_rules to point to ima_policy_rules (step 3):
>>> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>>>  /\
>>> first
>>>
>>> prev_rules -------------
>>>                        \/
>>> ima_rules     List entry 0 (head node) = ima_default_rules   
>>>      |                 |
>>>      |                 |
>>>      |                 ------------------------------------------
>>>      ---------------                                            |
>>>                    \/                                           \/
>>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>>>                                                   synchronize_rcu                 /\
>>> first --------------------------------------------------------------
>>>
>>> Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4).
>>>
>>> Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5):
>>>
>>> List entry 1 <-> List entry 2 <-> ... -> List entry 0
>>>  /\
>>> first
>>>
>>> prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0
>>>                                                                      /\
>>>                                                                  first (now useless)
>>> ima_rules        
>>>      |
>>>      |
>>>      |
>>>      ---------------
>>>                    \/
>>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0'
>>>
>>> The goal is that readers should still be able to loop
>>> (forward, as we saw that backward looping is temporarily broken)
>>> while in steps 0-4.
>>>
>>> I'm not completely sure what would happen to a client that started iterating
>>> over ima_rules right after step 2.
>>>
>>> Wouldn't they be able to start looping through the new policy
>>> as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules?
>>> And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly
>>> very shortly thereafter) completed?
>>> And would the compiler be allowed to optimize the read to 'ima_rules' in the
>>> list_for_each_entry() loop, thereby never reloading the new value for
>>> 'ima_rules', and thus looping forever, just what we are trying to avoid?
>>>
>>> Overall, I'm tempted to say this is perhaps a bit too complex (at least,
>>> my head tells me it is, but that may very well be because I'm terrible
>>> at concurrency issues).
>>>
>>> Honestly, in this case I think awaiting input from more experienced
>>> kernel devs than I is the best path forward :-)
>> I'm far from an expert on RCU locking, but __list_splice_init_rcu()
>> provides an example of how to make sure there aren't any readers
>> traversing the list, before two lists are spliced together.   In our
>> case, after there aren't any readers, instead of splicing two lists
>> together, it should be safe to point to the new list.
>>
>> thanks,
>>
>> Mimi
>>

THOBY Simon Aug. 23, 2021, 8:14 a.m. UTC | #12

Hi Liqiong,

On 8/23/21 10:06 AM, liqiong wrote:
> Hi Simon :
> 
> Using a temporary ima_rules variable is not working for "ima_policy_next". 
> 
>  void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos)
>  {
>  	struct ima_rule_entry *entry = v;
> -
> +	struct list_head *ima_rules_tmp = rcu_dereference(ima_rules);
>  	rcu_read_lock();
>  	entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list);
>  	rcu_read_unlock();
>  	(*pos)++;
>  
> -	return (&entry->list == ima_rules) ? NULL : entry;
> +	return (&entry->list == ima_rules_tmp) ? NULL : entry;
>  }
> 
> It seems no way to fix "ima_rules" change within this function, it will alway
> return a entry if "ima_rules" being changed.

- I think rcu_dereference() should be called inside the RCU read lock
- Maybe we could cheat with:
	return (&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules) ? NULL : entry;
  as that's the only two rulesets IMA ever use?
  Admittedly, this is not as clean as previously, but it should work too.

The way I see it, the semaphore solution would not work here either,
as ima_policy_next() is called repeatedly as a seq_file
(it is set up in ima_fs.c) and we can't control the locking there:
we cannot lock across the seq_read() call (that cure could end up be
worse than the disease, deadlock-wise), so I fear we cannot protect
against a list update while a user is iterating with a lock.

So in both cases a cheat like "&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules"
maybe need to be considered.

What do you think?

> 
> Regrads,
> 
> liqiong

Thanks,
Simon

Mimi Zohar Aug. 23, 2021, 11:22 a.m. UTC | #13

Hi Liqiong,

On Mon, 2021-08-23 at 11:04 +0800, 李力琼 wrote:
> Hi Mimi :
> 
> The situation is a little different,'list_splice_init_rcu'
> don't change the list head. If "ima_rules" being changed,
> readers may can't reload the new value in time for cpu cache
> or compiler optimization. Defining "ima_rules" as a volatile 
> variable can fix, but It is inefficient.

After looking at list_splice_tail_init_rcu() some more, it is
actually making sure there aren't any readers traversing
"ima_temp_rules", not "ima_policy_rules".   There aren't any readers
traversing "ima_temp_rules". 

> 
> Maybe using a temporary ima_rules variable for every 
> "list_for_each_entry_rcu(entry, ima_rules, list)" loop is 
> a better solution to fix the "endless loop" bug. 

Agreed

thanks,

Mimi

Mimi Zohar Aug. 23, 2021, 11:57 a.m. UTC | #14

On Mon, 2021-08-23 at 08:14 +0000, THOBY Simon wrote:
> Hi Liqiong,
> 
> On 8/23/21 10:06 AM, liqiong wrote:
> > Hi Simon :
> > 
> > Using a temporary ima_rules variable is not working for "ima_policy_next". 
> > 
> >  void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos)
> >  {
> >  	struct ima_rule_entry *entry = v;
> > -
> > +	struct list_head *ima_rules_tmp = rcu_dereference(ima_rules);
> >  	rcu_read_lock();
> >  	entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list);
> >  	rcu_read_unlock();
> >  	(*pos)++;
> >  
> > -	return (&entry->list == ima_rules) ? NULL : entry;
> > +	return (&entry->list == ima_rules_tmp) ? NULL : entry;
> >  }
> > 
> > It seems no way to fix "ima_rules" change within this function, it will alway
> > return a entry if "ima_rules" being changed.
> 
> - I think rcu_dereference() should be called inside the RCU read lock
> - Maybe we could cheat with:
> 	return (&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules) ? NULL : entry;
>   as that's the only two rulesets IMA ever use?
>   Admittedly, this is not as clean as previously, but it should work too.
> 
> The way I see it, the semaphore solution would not work here either,
> as ima_policy_next() is called repeatedly as a seq_file
> (it is set up in ima_fs.c) and we can't control the locking there:
> we cannot lock across the seq_read() call (that cure could end up be
> worse than the disease, deadlock-wise), so I fear we cannot protect
> against a list update while a user is iterating with a lock.
> 
> So in both cases a cheat like "&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules"
> maybe need to be considered.
> 
> What do you think?

Is this an overall suggestion or limited to just ima_policy_next()?

thanks,

Mimi

THOBY Simon Aug. 23, 2021, 12:02 p.m. UTC | #15

Hi Mimi,

On 8/23/21 1:57 PM, Mimi Zohar wrote:
> On Mon, 2021-08-23 at 08:14 +0000, THOBY Simon wrote:
>> Hi Liqiong,
>>
>> On 8/23/21 10:06 AM, liqiong wrote:
>>> Hi Simon :
>>>
>>> Using a temporary ima_rules variable is not working for "ima_policy_next". 
>>>
>>>  void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos)
>>>  {
>>>  	struct ima_rule_entry *entry = v;
>>> -
>>> +	struct list_head *ima_rules_tmp = rcu_dereference(ima_rules);
>>>  	rcu_read_lock();
>>>  	entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list);
>>>  	rcu_read_unlock();
>>>  	(*pos)++;
>>>  
>>> -	return (&entry->list == ima_rules) ? NULL : entry;
>>> +	return (&entry->list == ima_rules_tmp) ? NULL : entry;
>>>  }
>>>
>>> It seems no way to fix "ima_rules" change within this function, it will alway
>>> return a entry if "ima_rules" being changed.
>>
>> - I think rcu_dereference() should be called inside the RCU read lock
>> - Maybe we could cheat with:
>> 	return (&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules) ? NULL : entry;
>>   as that's the only two rulesets IMA ever use?
>>   Admittedly, this is not as clean as previously, but it should work too.
>>
>> The way I see it, the semaphore solution would not work here either,
>> as ima_policy_next() is called repeatedly as a seq_file
>> (it is set up in ima_fs.c) and we can't control the locking there:
>> we cannot lock across the seq_read() call (that cure could end up be
>> worse than the disease, deadlock-wise), so I fear we cannot protect
>> against a list update while a user is iterating with a lock.
>>
>> So in both cases a cheat like "&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules"
>> maybe need to be considered.
>>
>> What do you think?
> 
> Is this an overall suggestion or limited to just ima_policy_next()?

I was thinking only of ima_policy_next(), I don't think (from what I could see in a short glance)
that other functions need such a treatment. The ima_rules_tmp dance is probably safe for the
other uses of ima_rules.

> 
> thanks,
> 
> Mimi
> 
> 

Thanks,
Simon

Mimi Zohar Aug. 23, 2021, 12:09 p.m. UTC | #16

Hi Simon,

On Mon, 2021-08-23 at 12:02 +0000, THOBY Simon wrote:
> Hi Mimi,
> 
> On 8/23/21 1:57 PM, Mimi Zohar wrote:
> > On Mon, 2021-08-23 at 08:14 +0000, THOBY Simon wrote:
> >> Hi Liqiong,
> >>
> >> On 8/23/21 10:06 AM, liqiong wrote:
> >>> Hi Simon :
> >>>
> >>> Using a temporary ima_rules variable is not working for "ima_policy_next". 
> >>>
> >>>  void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos)
> >>>  {
> >>>  	struct ima_rule_entry *entry = v;
> >>> -
> >>> +	struct list_head *ima_rules_tmp = rcu_dereference(ima_rules);
> >>>  	rcu_read_lock();
> >>>  	entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list);
> >>>  	rcu_read_unlock();
> >>>  	(*pos)++;
> >>>  
> >>> -	return (&entry->list == ima_rules) ? NULL : entry;
> >>> +	return (&entry->list == ima_rules_tmp) ? NULL : entry;
> >>>  }
> >>>
> >>> It seems no way to fix "ima_rules" change within this function, it will alway
> >>> return a entry if "ima_rules" being changed.
> >>
> >> - I think rcu_dereference() should be called inside the RCU read lock
> >> - Maybe we could cheat with:
> >> 	return (&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules) ? NULL : entry;
> >>   as that's the only two rulesets IMA ever use?
> >>   Admittedly, this is not as clean as previously, but it should work too.
> >>
> >> The way I see it, the semaphore solution would not work here either,
> >> as ima_policy_next() is called repeatedly as a seq_file
> >> (it is set up in ima_fs.c) and we can't control the locking there:
> >> we cannot lock across the seq_read() call (that cure could end up be
> >> worse than the disease, deadlock-wise), so I fear we cannot protect
> >> against a list update while a user is iterating with a lock.
> >>
> >> So in both cases a cheat like "&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules"
> >> maybe need to be considered.
> >>
> >> What do you think?
> > 
> > Is this an overall suggestion or limited to just ima_policy_next()?
> 
> I was thinking only of ima_policy_next(), I don't think (from what I could see in a short glance)
> that other functions need such a treatment. The ima_rules_tmp dance is probably safe for the
> other uses of ima_rules.

Thanks, just making sure it is limited to here.

Mimi

Li Qiong Aug. 23, 2021, 12:56 p.m. UTC | #17

Hi Simon :

在 2021年08月23日 16:14, THOBY Simon 写道:
> Hi Liqiong,
>
> On 8/23/21 10:06 AM, liqiong wrote:
>> Hi Simon :
>>
>> Using a temporary ima_rules variable is not working for "ima_policy_next". 
>>
>>  void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos)
>>  {
>>  	struct ima_rule_entry *entry = v;
>> -
>> +	struct list_head *ima_rules_tmp = rcu_dereference(ima_rules);
>>  	rcu_read_lock();
>>  	entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list);
>>  	rcu_read_unlock();
>>  	(*pos)++;
>>  
>> -	return (&entry->list == ima_rules) ? NULL : entry;
>> +	return (&entry->list == ima_rules_tmp) ? NULL : entry;
>>  }
>>
>> It seems no way to fix "ima_rules" change within this function, it will alway
>> return a entry if "ima_rules" being changed.
> - I think rcu_dereference() should be called inside the RCU read lock
> - Maybe we could cheat with:
> 	return (&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules) ? NULL : entry;
>   as that's the only two rulesets IMA ever use?
>   Admittedly, this is not as clean as previously, but it should work too.
>
> The way I see it, the semaphore solution would not work here either,
> as ima_policy_next() is called repeatedly as a seq_file
> (it is set up in ima_fs.c) and we can't control the locking there:
> we cannot lock across the seq_read() call (that cure could end up be
> worse than the disease, deadlock-wise), so I fear we cannot protect
> against a list update while a user is iterating with a lock.
>
> So in both cases a cheat like "&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules"
> maybe need to be considered.
>
> What do you think?

Yes,  semaphore solution not work here,  splicing two list is a little complex.
This solution is  simple and clear,  should  work.  I will work on that, test and 
confirm  the patch. 

"rcu_dereference() should be called inside the RCU read lock", I will correct this.

Thanks for your help.


Regrads,

liqiong

>
>
>> Regrads,
>>
>> liqiong
> Thanks,
> Simon

ima: fix infinite loop within "ima_match_policy" function.

Commit Message

Comments

Patch