Message ID | 20210819101529.28001-1-liqiong@nfschina.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ima: fix infinite loop within "ima_match_policy" function. | expand |
Hi Liqiong, On 8/19/21 12:15 PM, liqiong wrote: > When "ima_match_policy" is looping while "ima_update_policy" changs > the variable "ima_rules", then "ima_match_policy" may can't exit loop, > and kernel keeps printf "rcu_sched detected stall on CPU ...". > > It occurs at boot phase, systemd-services are being checked within > "ima_match_policy,at the same time, the variable "ima_rules" > is changed by a service. First off, thanks for finding and identifying this nasty bug. > > Signed-off-by: liqiong <liqiong@nfschina.com> > --- > security/integrity/ima/ima_policy.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c > index fd5d46e511f1..7e71e643457c 100644 > --- a/security/integrity/ima/ima_policy.c > +++ b/security/integrity/ima/ima_policy.c > @@ -217,6 +217,7 @@ static LIST_HEAD(ima_default_rules); > static LIST_HEAD(ima_policy_rules); > static LIST_HEAD(ima_temp_rules); > static struct list_head *ima_rules = &ima_default_rules; > +static DECLARE_RWSEM(ima_rules_sem); > > static int ima_policy __initdata; > > @@ -666,6 +667,7 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode, > if (template_desc && !*template_desc) > *template_desc = ima_template_desc_current(); > > + down_read(&ima_rules_sem); > rcu_read_lock(); > list_for_each_entry_rcu(entry, ima_rules, list) { > > @@ -702,6 +704,7 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode, > break; > } > rcu_read_unlock(); > + up_read(&ima_rules_sem); > > return action; > } > @@ -919,7 +922,9 @@ void ima_update_policy(void) > > if (ima_rules != policy) { > ima_policy_flag = 0; > + down_write(&ima_rules_sem); > ima_rules = policy; > + up_write(&ima_rules_sem); > > /* > * IMA architecture specific policy rules are specified > Rather than introducing a new semaphore, I wonder if you couldn't have done something like the following? @@ -674,13 +674,15 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode, const char *func_data, unsigned int *allowed_algos) { struct ima_rule_entry *entry; + struct list_head *ima_rules_tmp; int action = 0, actmask = flags | (flags << 1); if (template_desc && !*template_desc) *template_desc = ima_template_desc_current(); rcu_read_lock(); - list_for_each_entry_rcu(entry, ima_rules, list) { + ima_rules_tmp = rcu_dereference(ima_rules); + list_for_each_entry_rcu(entry, ima_rules_tmp, list) { if (!(entry->action & actmask)) continue; @@ -970,7 +972,7 @@ void ima_update_policy(void) if (ima_rules != policy) { ima_policy_flag = 0; - ima_rules = policy; + rcu_assign_pointer(ima_rules, policy); /* * IMA architecture specific policy rules are specified Also, ima_match_policy is not the only place where we iterate over ima_rules, maybe this change should be applied to every function that perform a call the like of "list_for_each_entry_rcu(entry, ima_rules_tmp, list)" ? All that being said, your change is quite small and I have no objection to it, I was just wondering whether we could achieve the same effect without locks with RCU. What do you think? Thanks, Simon
On Thu, 2021-08-19 at 12:58 +0000, THOBY Simon wrote: > Hi Liqiong, > > On 8/19/21 12:15 PM, liqiong wrote: > > When "ima_match_policy" is looping while "ima_update_policy" changs > > the variable "ima_rules", then "ima_match_policy" may can't exit loop, > > and kernel keeps printf "rcu_sched detected stall on CPU ...". > > > > It occurs at boot phase, systemd-services are being checked within > > "ima_match_policy,at the same time, the variable "ima_rules" > > is changed by a service. > > First off, thanks for finding and identifying this nasty bug. Once the initial builtin policy rules have been replaced by a custom policy, rules may only be appended by splicing the new rules with the existing rules. There should never be a problem reading the rules at that point. Does this problem occur before the builtin policy rules have been replaced with a custom policy? Mimi
On Thu, 2021-08-19 at 09:47 -0400, Mimi Zohar wrote: > On Thu, 2021-08-19 at 12:58 +0000, THOBY Simon wrote: > > Hi Liqiong, > > > > On 8/19/21 12:15 PM, liqiong wrote: > > > When "ima_match_policy" is looping while "ima_update_policy" changs > > > the variable "ima_rules", then "ima_match_policy" may can't exit loop, > > > and kernel keeps printf "rcu_sched detected stall on CPU ...". > > > > > > It occurs at boot phase, systemd-services are being checked within > > > "ima_match_policy,at the same time, the variable "ima_rules" > > > is changed by a service. > > > > First off, thanks for finding and identifying this nasty bug. > > Once the initial builtin policy rules have been replaced by a custom > policy, rules may only be appended by splicing the new rules with the > existing rules. There should never be a problem reading the rules at > that point. Does this problem occur before the builtin policy rules > have been replaced with a custom policy? Yes, the problem is limited to transitioning from the builtin policy to the custom policy. Adding a new lock around rcu code seems counter productive, especially since switching the policy rules happens once, normally during early boot before access to real root. Please consider Simon's suggestion or finding some other solution. thanks, Mimi
Hi, Simon: This solution is better then rwsem, a temp "ima_rules" variable should can fix. I also have a another idea, with a little trick, default list can traverse to the new list, so we don't need care about the read side. here is the patch: @@ -918,8 +918,21 @@ void ima_update_policy(void) list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu); if (ima_rules != policy) { + struct list_head *prev_rules = ima_rules; + struct list_head *first = ima_rules->next; ima_policy_flag = 0; + + /* + * Make the previous list can traverse to new list, + * that is tricky, or there is a deadly loop whithin + * "list_for_each_entry_rcu(entry, ima_rules, list)" + * + * After update "ima_rules", restore the previous list. + */ + prev_rules->next = policy->next; ima_rules = policy; + syncchronize_rcu(); + prev_rules->next = first; The side effect is the "ima_default_rules" will be changed a little while. But it make sense, the process should be checked again by the new policy. This patch has been tested, if will do, I can resubmit this patch. How about this ? ---------- Regards, liqiong 在 2021年08月19日 20:58, THOBY Simon 写道: > Hi Liqiong, > > On 8/19/21 12:15 PM, liqiong wrote: >> When "ima_match_policy" is looping while "ima_update_policy" changs >> the variable "ima_rules", then "ima_match_policy" may can't exit loop, >> and kernel keeps printf "rcu_sched detected stall on CPU ...". >> >> It occurs at boot phase, systemd-services are being checked within >> "ima_match_policy,at the same time, the variable "ima_rules" >> is changed by a service. > First off, thanks for finding and identifying this nasty bug. > >> Signed-off-by: liqiong <liqiong@nfschina.com> >> --- >> security/integrity/ima/ima_policy.c | 5 +++++ >> 1 file changed, 5 insertions(+) >> >> diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c >> index fd5d46e511f1..7e71e643457c 100644 >> --- a/security/integrity/ima/ima_policy.c >> +++ b/security/integrity/ima/ima_policy.c >> @@ -217,6 +217,7 @@ static LIST_HEAD(ima_default_rules); >> static LIST_HEAD(ima_policy_rules); >> static LIST_HEAD(ima_temp_rules); >> static struct list_head *ima_rules = &ima_default_rules; >> +static DECLARE_RWSEM(ima_rules_sem); >> >> static int ima_policy __initdata; >> >> @@ -666,6 +667,7 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode, >> if (template_desc && !*template_desc) >> *template_desc = ima_template_desc_current(); >> >> + down_read(&ima_rules_sem); >> rcu_read_lock(); >> list_for_each_entry_rcu(entry, ima_rules, list) { >> >> @@ -702,6 +704,7 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode, >> break; >> } >> rcu_read_unlock(); >> + up_read(&ima_rules_sem); >> >> return action; >> } >> @@ -919,7 +922,9 @@ void ima_update_policy(void) >> >> if (ima_rules != policy) { >> ima_policy_flag = 0; >> + down_write(&ima_rules_sem); >> ima_rules = policy; >> + up_write(&ima_rules_sem); >> >> /* >> * IMA architecture specific policy rules are specified >> > Rather than introducing a new semaphore, I wonder if you couldn't have done something > like the following? > > @@ -674,13 +674,15 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode, > const char *func_data, unsigned int *allowed_algos) > { > struct ima_rule_entry *entry; > + struct list_head *ima_rules_tmp; > int action = 0, actmask = flags | (flags << 1); > > if (template_desc && !*template_desc) > *template_desc = ima_template_desc_current(); > > rcu_read_lock(); > - list_for_each_entry_rcu(entry, ima_rules, list) { > + ima_rules_tmp = rcu_dereference(ima_rules); > + list_for_each_entry_rcu(entry, ima_rules_tmp, list) { > > if (!(entry->action & actmask)) > continue; > @@ -970,7 +972,7 @@ void ima_update_policy(void) > > if (ima_rules != policy) { > ima_policy_flag = 0; > - ima_rules = policy; > + rcu_assign_pointer(ima_rules, policy); > > /* > * IMA architecture specific policy rules are specified > > > Also, ima_match_policy is not the only place where we iterate over ima_rules, maybe > this change should be applied to every function that perform a call the like of > "list_for_each_entry_rcu(entry, ima_rules_tmp, list)" ? > > All that being said, your change is quite small and I have no objection to it, > I was just wondering whether we could achieve the same effect without locks > with RCU. > > What do you think? > > Thanks, > Simon
Hi Liqiong, On 8/20/21 12:15 PM, 李力琼 wrote: > Hi, Simon: > > This solution is better then rwsem, a temp "ima_rules" variable should > can fix. I also have a another idea, with a little trick, default list > can traverse to the new list, so we don't need care about the read side. > > here is the patch: > > @@ -918,8 +918,21 @@ void ima_update_policy(void) > list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu); > > if (ima_rules != policy) { > + struct list_head *prev_rules = ima_rules; > + struct list_head *first = ima_rules->next; > ima_policy_flag = 0; > + > + /* > + * Make the previous list can traverse to new list, > + * that is tricky, or there is a deadly loop whithin > + * "list_for_each_entry_rcu(entry, ima_rules, list)" > + * > + * After update "ima_rules", restore the previous list. > + */ I think this could be rephrased to be a tad clearer, I am not quite sure how I must interpret the first sentence of the comment. > + prev_rules->next = policy->next; > ima_rules = policy; > + syncchronize_rcu(); I'm a bit puzzled as you seem to imply in the mail this patch was tested, but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel. Was that a copy/paste error? Or maybe you forgot the 'not' in "This patch has been tested"? These errors happen, and I am myself quite an expert in doing them :) > + prev_rules->next = first; > > > The side effect is the "ima_default_rules" will be changed a little while. > But it make sense, the process should be checked again by the new policy. > > This patch has been tested, if will do, I can resubmit this patch.> > How about this ? Correct me if I'm wrong, here is how I think I understand you patch. We start with a situation like that (step 0): ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 Then we decide to update the policy for the first time, so 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'. We enter the condition. First we copy the current value of ima_rules (&ima_default_rules) to a temporary variable 'prev_rules'. We also create a pointer dubbed 'first' to the entry 1 in the default list (step 1): prev_rules ------------- \/ ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 /\ first -------------------------------------------------------------- Then we update prev_rules->next to point to policy->next (step 2): List entry 1 <-> List entry 2 <-> ... -> List entry 0 /\ first (notice that list entry 0 no longer points backwards to 'list entry 1', but I don't think there is any reverse iteration in IMA, so it should be safe) prev_rules ------------- \/ ima_rules --> List entry 0 (head node) = ima_default_rules | | ------------------------------------------- \/ policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' We then update ima_rules to point to ima_policy_rules (step 3): List entry 1 <-> List entry 2 <-> ... -> List entry 0 /\ first prev_rules ------------- \/ ima_rules List entry 0 (head node) = ima_default_rules | | | | | ------------------------------------------ --------------- | \/ \/ policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' /\ first -------------------------------------------------------------- Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4). Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5): List entry 1 <-> List entry 2 <-> ... -> List entry 0 /\ first prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 /\ first (now useless) ima_rules | | | --------------- \/ policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' The goal is that readers should still be able to loop (forward, as we saw that backward looping is temporarily broken) while in steps 0-4. I'm not completely sure what would happen to a client that started iterating over ima_rules right after step 2. Wouldn't they be able to start looping through the new policy as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules? And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly very shortly thereafter) completed? And would the compiler be allowed to optimize the read to 'ima_rules' in the list_for_each_entry() loop, thereby never reloading the new value for 'ima_rules', and thus looping forever, just what we are trying to avoid? Overall, I'm tempted to say this is perhaps a bit too complex (at least, my head tells me it is, but that may very well be because I'm terrible at concurrency issues). Honestly, in this case I think awaiting input from more experienced kernel devs than I is the best path forward :-) > > ---------- > Regards, > liqiong > Thanks, Simon
On Fri, 2021-08-20 at 13:23 +0000, THOBY Simon wrote: > Hi Liqiong, > > On 8/20/21 12:15 PM, 李力琼 wrote: > > Hi, Simon: > > > > This solution is better then rwsem, a temp "ima_rules" variable should > > can fix. I also have a another idea, with a little trick, default list > > can traverse to the new list, so we don't need care about the read side. > > > > here is the patch: > > > > @@ -918,8 +918,21 @@ void ima_update_policy(void) > > list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu); > > > > if (ima_rules != policy) { > > + struct list_head *prev_rules = ima_rules; > > + struct list_head *first = ima_rules->next; > > ima_policy_flag = 0; > > + > > + /* > > + * Make the previous list can traverse to new list, > > + * that is tricky, or there is a deadly loop whithin > > + * "list_for_each_entry_rcu(entry, ima_rules, list)" > > + * > > + * After update "ima_rules", restore the previous list. > > + */ > > I think this could be rephrased to be a tad clearer, I am not quite sure > how I must interpret the first sentence of the comment. > > > > + prev_rules->next = policy->next; > > ima_rules = policy; > > + syncchronize_rcu(); > > I'm a bit puzzled as you seem to imply in the mail this patch was tested, > but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel. > Was that a copy/paste error? Or maybe you forgot the 'not' in "This > patch has been tested"? These errors happen, and I am myself quite an > expert in doing them :) > > > + prev_rules->next = first; > > > > > > The side effect is the "ima_default_rules" will be changed a little while. > > But it make sense, the process should be checked again by the new policy. > > > > This patch has been tested, if will do, I can resubmit this patch.> > > How about this ? > > > Correct me if I'm wrong, here is how I think I understand you patch. > We start with a situation like that (step 0): > ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 > > Then we decide to update the policy for the first time, so > 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'. > We enter the condition. > First we copy the current value of ima_rules (&ima_default_rules) > to a temporary variable 'prev_rules'. We also create a pointer dubbed > 'first' to the entry 1 in the default list (step 1): > prev_rules ------------- > \/ > ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 > /\ > first -------------------------------------------------------------- > > > Then we update prev_rules->next to point to policy->next (step 2): > List entry 1 <-> List entry 2 <-> ... -> List entry 0 > /\ > first > (notice that list entry 0 no longer points backwards to 'list entry 1', > but I don't think there is any reverse iteration in IMA, so it should be > safe) > > prev_rules ------------- > \/ > ima_rules --> List entry 0 (head node) = ima_default_rules > | > | > ------------------------------------------- > \/ > policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' > > > We then update ima_rules to point to ima_policy_rules (step 3): > List entry 1 <-> List entry 2 <-> ... -> List entry 0 > /\ > first > > prev_rules ------------- > \/ > ima_rules List entry 0 (head node) = ima_default_rules > | | > | | > | ------------------------------------------ > --------------- | > \/ \/ > policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' > /\ > first -------------------------------------------------------------- > > Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4). > > Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5): > > List entry 1 <-> List entry 2 <-> ... -> List entry 0 > /\ > first > > prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 > /\ > first (now useless) > ima_rules > | > | > | > --------------- > \/ > policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' > > The goal is that readers should still be able to loop > (forward, as we saw that backward looping is temporarily broken) > while in steps 0-4. > > I'm not completely sure what would happen to a client that started iterating > over ima_rules right after step 2. > > Wouldn't they be able to start looping through the new policy > as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules? > And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly > very shortly thereafter) completed? > And would the compiler be allowed to optimize the read to 'ima_rules' in the > list_for_each_entry() loop, thereby never reloading the new value for > 'ima_rules', and thus looping forever, just what we are trying to avoid? > > Overall, I'm tempted to say this is perhaps a bit too complex (at least, > my head tells me it is, but that may very well be because I'm terrible > at concurrency issues). > > Honestly, in this case I think awaiting input from more experienced > kernel devs than I is the best path forward :-) I'm far from an expert on RCU locking, but __list_splice_init_rcu() provides an example of how to make sure there aren't any readers traversing the list, before two lists are spliced together. In our case, after there aren't any readers, instead of splicing two lists together, it should be safe to point to the new list. thanks, Mimi
Hi Simon, On 2021/8/20 21:23, THOBY Simon wrote: > Hi Liqiong, > > On 8/20/21 12:15 PM, 李力琼 wrote: >> Hi, Simon: >> >> This solution is better then rwsem, a temp "ima_rules" variable should >> can fix. I also have a another idea, with a little trick, default list >> can traverse to the new list, so we don't need care about the read side. >> >> here is the patch: >> >> @@ -918,8 +918,21 @@ void ima_update_policy(void) >> list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu); >> >> if (ima_rules != policy) { >> + struct list_head *prev_rules = ima_rules; >> + struct list_head *first = ima_rules->next; >> ima_policy_flag = 0; >> + >> + /* >> + * Make the previous list can traverse to new list, >> + * that is tricky, or there is a deadly loop whithin >> + * "list_for_each_entry_rcu(entry, ima_rules, list)" >> + * >> + * After update "ima_rules", restore the previous list. >> + */ > I think this could be rephrased to be a tad clearer, I am not quite sure > how I must interpret the first sentence of the comment. I got it, how about this: /* * The previous list has to traverse to new list, * Or there may be a deadly loop within * "list_for_each_entry_rcu(entry, ima_rules, list)" * * That is tricky, after updated "ima_rules", restore the previous list. */ > > >> + prev_rules->next = policy->next; >> ima_rules = policy; >> + syncchronize_rcu(); > I'm a bit puzzled as you seem to imply in the mail this patch was tested, > but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel. > Was that a copy/paste error? Or maybe you forgot the 'not' in "This > patch has been tested"? These errors happen, and I am myself quite an > expert in doing them :) Sorry for the mistake, I copy/paste the patch and delete/edit some lines, have reviewed before sending, but not found. I have made a case to reproduce the error, dumping "ima_rules" and every item address of list in the error situaiton, I can watchthe "ima_rules" change, old list traversing to the new list. And I have been doing a reboot test which found this bug. This patch seems to work fine. > >> + prev_rules->next = first; >> >> >> The side effect is the "ima_default_rules" will be changed a little while. >> But it make sense, the process should be checked again by the new policy. >> >> This patch has been tested, if will do, I can resubmit this patch.> >> How about this ? > > Correct me if I'm wrong, here is how I think I understand you patch. > We start with a situation like that (step 0): > ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 > > Then we decide to update the policy for the first time, so > 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'. > We enter the condition. > First we copy the current value of ima_rules (&ima_default_rules) > to a temporary variable 'prev_rules'. We also create a pointer dubbed > 'first' to the entry 1 in the default list (step 1): > prev_rules ------------- > \/ > ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 > /\ > first -------------------------------------------------------------- > > > Then we update prev_rules->next to point to policy->next (step 2): > List entry 1 <-> List entry 2 <-> ... -> List entry 0 > /\ > first > (notice that list entry 0 no longer points backwards to 'list entry 1', > but I don't think there is any reverse iteration in IMA, so it should be > safe) > > prev_rules ------------- > \/ > ima_rules --> List entry 0 (head node) = ima_default_rules > | > | > ------------------------------------------- > \/ > policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' > > > We then update ima_rules to point to ima_policy_rules (step 3): > List entry 1 <-> List entry 2 <-> ... -> List entry 0 > /\ > first > > prev_rules ------------- > \/ > ima_rules List entry 0 (head node) = ima_default_rules > | | > | | > | ------------------------------------------ > --------------- | > \/ \/ > policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' > /\ > first -------------------------------------------------------------- > > Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4). > > Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5): > > List entry 1 <-> List entry 2 <-> ... -> List entry 0 > /\ > first > > prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 > /\ > first (now useless) > ima_rules > | > | > | > --------------- > \/ > policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' > > The goal is that readers should still be able to loop > (forward, as we saw that backward looping is temporarily broken) > while in steps 0-4. Yes, It's the workflow. > I'm not completely sure what would happen to a client that started iterating > over ima_rules right after step 2. > > Wouldn't they be able to start looping through the new policy > as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules? > And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly > very shortly thereafter) completed? > And would the compiler be allowed to optimize the read to 'ima_rules' in the > list_for_each_entry() loop, thereby never reloading the new value for > 'ima_rules', and thus looping forever, just what we are trying to avoid? Yes, "ima_rules" cache not update in time, It's a risk. I am not sure if "WRITE_ONCE" can do this trick. How about: WRITE_ONCE(prev_rules->next, policy->next); WRITE_ONCE(ima_rules, policy); If can't fix the cache issue, maybe the "ima_rules_tmp" solution is the best way. I will test it. > Overall, I'm tempted to say this is perhaps a bit too complex (at least, > my head tells me it is, but that may very well be because I'm terrible > at concurrency issues). > > Honestly, in this case I think awaiting input from more experienced > kernel devs than I is the best path forward :-) > >> ---------- >> Regards, >> liqiong >> > Thanks, > Simon
Hi Mimi : The situation is a little different,'list_splice_init_rcu' don't change the list head. If "ima_rules" being changed, readers may can't reload the new value in time for cpu cache or compiler optimization. Defining "ima_rules" as a volatile variable can fix, but It is inefficient. Maybe using a temporary ima_rules variable for every "list_for_each_entry_rcu(entry, ima_rules, list)" loop is a better solution to fix the "endless loop" bug. Regards, liqiong 在 2021年08月20日 23:48, Mimi Zohar 写道: > On Fri, 2021-08-20 at 13:23 +0000, THOBY Simon wrote: >> Hi Liqiong, >> >> On 8/20/21 12:15 PM, 李力琼 wrote: >>> Hi, Simon: >>> >>> This solution is better then rwsem, a temp "ima_rules" variable should >>> can fix. I also have a another idea, with a little trick, default list >>> can traverse to the new list, so we don't need care about the read side. >>> >>> here is the patch: >>> >>> @@ -918,8 +918,21 @@ void ima_update_policy(void) >>> list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu); >>> >>> if (ima_rules != policy) { >>> + struct list_head *prev_rules = ima_rules; >>> + struct list_head *first = ima_rules->next; >>> ima_policy_flag = 0; >>> + >>> + /* >>> + * Make the previous list can traverse to new list, >>> + * that is tricky, or there is a deadly loop whithin >>> + * "list_for_each_entry_rcu(entry, ima_rules, list)" >>> + * >>> + * After update "ima_rules", restore the previous list. >>> + */ >> I think this could be rephrased to be a tad clearer, I am not quite sure >> how I must interpret the first sentence of the comment. >> >> >>> + prev_rules->next = policy->next; >>> ima_rules = policy; >>> + syncchronize_rcu(); >> I'm a bit puzzled as you seem to imply in the mail this patch was tested, >> but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel. >> Was that a copy/paste error? Or maybe you forgot the 'not' in "This >> patch has been tested"? These errors happen, and I am myself quite an >> expert in doing them :) >> >>> + prev_rules->next = first; >>> >>> >>> The side effect is the "ima_default_rules" will be changed a little while. >>> But it make sense, the process should be checked again by the new policy. >>> >>> This patch has been tested, if will do, I can resubmit this patch.> >>> How about this ? >> least >> >> Correct me if I'm wrong, here is how I think I understand you patch. >> We start with a situation like that (step 0): >> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >> >> Then we decide to update the policy for the first time, so >> 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'. >> We enter the condition. >> First we copy the current value of ima_rules (&ima_default_rules) >> to a temporary variable 'prev_rules'. We also create a pointer dubbed >> 'first' to the entry 1 in the default list (step 1): >> prev_rules ------------- >> \/ >> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >> /\ >> first -------------------------------------------------------------- >> >> >> Then we update prev_rules->next to point to policy->next (step 2): >> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >> /\ >> first >> (notice that list entry 0 no longer points backwards to 'list entry 1', >> but I don't think there is any reverse iteration in IMA, so it should be >> safe) >> >> prev_rules ------------- >> \/ >> ima_rules --> List entry 0 (head node) = ima_default_rules >> | >> | >> ------------------------------------------- >> \/ >> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >> >> >> We then update ima_rules to point to ima_policy_rules (step 3): >> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >> /\ >> first >> >> prev_rules ------------- >> \/ >> ima_rules List entry 0 (head node) = ima_default_rules >> | | >> | | >> | ------------------------------------------ >> --------------- | >> \/ \/ >> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >> synchronize_rcu /\ >> first -------------------------------------------------------------- >> >> Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4). >> >> Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5): >> >> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >> /\ >> first >> >> prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >> /\ >> first (now useless) >> ima_rules >> | >> | >> | >> --------------- >> \/ >> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >> >> The goal is that readers should still be able to loop >> (forward, as we saw that backward looping is temporarily broken) >> while in steps 0-4. >> >> I'm not completely sure what would happen to a client that started iterating >> over ima_rules right after step 2. >> >> Wouldn't they be able to start looping through the new policy >> as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules? >> And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly >> very shortly thereafter) completed? >> And would the compiler be allowed to optimize the read to 'ima_rules' in the >> list_for_each_entry() loop, thereby never reloading the new value for >> 'ima_rules', and thus looping forever, just what we are trying to avoid? >> >> Overall, I'm tempted to say this is perhaps a bit too complex (at least, >> my head tells me it is, but that may very well be because I'm terrible >> at concurrency issues). >> >> Honestly, in this case I think awaiting input from more experienced >> kernel devs than I is the best path forward :-) > I'm far from an expert on RCU locking, but __list_splice_init_rcu() > provides an example of how to make sure there aren't any readers > traversing the list, before two lists are spliced together. In our > case, after there aren't any readers, instead of splicing two lists > together, it should be safe to point to the new list. > > thanks, > > Mimi >
Hi Liqiong, On 8/20/21 7:53 PM, liqiong wrote: > Hi Simon, > > On 2021/8/20 21:23, THOBY Simon wrote: >> Hi Liqiong, >> >> On 8/20/21 12:15 PM, 李力琼 wrote: >>> Hi, Simon: >>> >>> This solution is better then rwsem, a temp "ima_rules" variable should >>> can fix. I also have a another idea, with a little trick, default list >>> can traverse to the new list, so we don't need care about the read side. >>> >>> here is the patch: >>> >>> @@ -918,8 +918,21 @@ void ima_update_policy(void) >>> list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu); >>> >>> if (ima_rules != policy) { >>> + struct list_head *prev_rules = ima_rules; >>> + struct list_head *first = ima_rules->next; >>> ima_policy_flag = 0; >>> + >>> + /* >>> + * Make the previous list can traverse to new list, >>> + * that is tricky, or there is a deadly loop whithin >>> + * "list_for_each_entry_rcu(entry, ima_rules, list)" >>> + * >>> + * After update "ima_rules", restore the previous list. >>> + */ >> I think this could be rephrased to be a tad clearer, I am not quite sure >> how I must interpret the first sentence of the comment. > I got it, how about this: > /* > * The previous list has to traverse to new list, > * Or there may be a deadly loop within Maybe 'deadlock' would be clearer than 'deadly loop'? > * "list_for_each_entry_rcu(entry, ima_rules, list)" > * > * That is tricky, after updated "ima_rules", restore the previous list. Maybe something like "This is tricky, so we restore the previous list (ima_default_rules) once 'ima_rules' is updated" ? > */>> >> >>> + prev_rules->next = policy->next; >>> ima_rules = policy; >>> + syncchronize_rcu(); >> I'm a bit puzzled as you seem to imply in the mail this patch was tested, >> but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel. >> Was that a copy/paste error? Or maybe you forgot the 'not' in "This >> patch has been tested"? These errors happen, and I am myself quite an >> expert in doing them :) > > > Sorry for the mistake, I copy/paste the patch and delete/edit some lines, > have reviewed before sending, but not found. I have made a case to reproduce > the error, dumping "ima_rules" and every item address of list in the error > situaiton, I can watchthe "ima_rules" change, old list traversing to the new list. > And I have been doing a reboot test which found this bug. This patch seems to work fine. > No worries, i just wanted to make sure I understood you correctly. > >> >>> + prev_rules->next = first; >>> >>> >>> The side effect is the "ima_default_rules" will be changed a little while. >>> But it make sense, the process should be checked again by the new policy. >>> >>> This patch has been tested, if will do, I can resubmit this patch.> >>> How about this ? >> >> Correct me if I'm wrong, here is how I think I understand you patch. >> We start with a situation like that (step 0): >> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >> >> Then we decide to update the policy for the first time, so >> 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'. >> We enter the condition. >> First we copy the current value of ima_rules (&ima_default_rules) >> to a temporary variable 'prev_rules'. We also create a pointer dubbed >> 'first' to the entry 1 in the default list (step 1): >> prev_rules ------------- >> \/ >> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >> /\ >> first -------------------------------------------------------------- >> >> >> Then we update prev_rules->next to point to policy->next (step 2): >> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >> /\ >> first >> (notice that list entry 0 no longer points backwards to 'list entry 1', >> but I don't think there is any reverse iteration in IMA, so it should be >> safe) >> >> prev_rules ------------- >> \/ >> ima_rules --> List entry 0 (head node) = ima_default_rules >> | >> | >> ------------------------------------------- >> \/ >> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >> >> >> We then update ima_rules to point to ima_policy_rules (step 3): >> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >> /\ >> first >> >> prev_rules ------------- >> \/ >> ima_rules List entry 0 (head node) = ima_default_rules >> | | >> | | >> | ------------------------------------------ >> --------------- | >> \/ \/ >> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >> /\ >> first -------------------------------------------------------------- >> >> Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4). >> >> Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5): >> >> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >> /\ >> first >> >> prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >> /\ >> first (now useless) >> ima_rules >> | >> | >> | >> --------------- >> \/ >> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >> >> The goal is that readers should still be able to loop >> (forward, as we saw that backward looping is temporarily broken) >> while in steps 0-4. > > > Yes, It's the workflow. > > >> I'm not completely sure what would happen to a client that started iterating >> over ima_rules right after step 2. >> >> Wouldn't they be able to start looping through the new policy >> as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules? >> And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly >> very shortly thereafter) completed? >> And would the compiler be allowed to optimize the read to 'ima_rules' in the >> list_for_each_entry() loop, thereby never reloading the new value for >> 'ima_rules', and thus looping forever, just what we are trying to avoid? > > > Yes, "ima_rules" cache not update in time, It's a risk. I am not sure if "WRITE_ONCE" > can do this trick. How about: > WRITE_ONCE(prev_rules->next, policy->next); > WRITE_ONCE(ima_rules, policy); Quite frankly, I don't know. As I said earlier, this is really way above my level. I'm fine waiting for more experienced opinions on this one. On the aspect of maintainability, I do think this solution is perhaps too complex when compared to other solutions like the semaphore you first proposed. A solution of similar complexity with RCU would be ideal to prevent adding a semaphore on a read-mostly scenario, but I'm still more confident in the semaphore than in the solution above, because it is easy to have confidence in the semaphore, while this patch is not at all obvious to me, and maybe the next person who will have to edit that piece of code. > > If can't fix the cache issue, maybe the "ima_rules_tmp" solution is the best way. > I will test it. > > >> Overall, I'm tempted to say this is perhaps a bit too complex (at least, >> my head tells me it is, but that may very well be because I'm terrible >> at concurrency issues). >> >> Honestly, in this case I think awaiting input from more experienced >> kernel devs than I is the best path forward :-) >> >>> ---------- >>> Regards, >>> liqiong Thanks, Simon
Hi Simon : Using a temporary ima_rules variable is not working for "ima_policy_next". void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos) { struct ima_rule_entry *entry = v; + struct list_head *ima_rules_tmp = rcu_dereference(ima_rules); rcu_read_lock(); entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list); rcu_read_unlock(); (*pos)++; - return (&entry->list == ima_rules) ? NULL : entry; + return (&entry->list == ima_rules_tmp) ? NULL : entry; } It seems no way to fix "ima_rules" change within this function, it will alway return a entry if "ima_rules" being changed. Regrads, liqiong 在 2021年08月23日 11:04, 李力琼 写道: > Hi Mimi : > > The situation is a little different,'list_splice_init_rcu' > don't change the list head. If "ima_rules" being changed, > readers may can't reload the new value in time for cpu cache > or compiler optimization. Defining "ima_rules" as a volatile > variable can fix, but It is inefficient. > > Maybe using a temporary ima_rules variable for every > "list_for_each_entry_rcu(entry, ima_rules, list)" loop is > a better solution to fix the "endless loop" bug. > > Regards, > > liqiong > > 在 2021年08月20日 23:48, Mimi Zohar 写道: >> On Fri, 2021-08-20 at 13:23 +0000, THOBY Simon wrote: >>> Hi Liqiong, >>> >>> On 8/20/21 12:15 PM, 李力琼 wrote: >>>> Hi, Simon: >>>> >>>> This solution is better then rwsem, a temp "ima_rules" variable should >>>> can fix. I also have a another idea, with a little trick, default list >>>> can traverse to the new list, so we don't need care about the read side. >>>> >>>> here is the patch: >>>> >>>> @@ -918,8 +918,21 @@ void ima_update_policy(void) >>>> list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu); >>>> >>>> if (ima_rules != policy) { >>>> + struct list_head *prev_rules = ima_rules; >>>> + struct list_head *first = ima_rules->next; >>>> ima_policy_flag = 0; >>>> + >>>> + /* >>>> + * Make the previous list can traverse to new list, >>>> + * that is tricky, or there is a deadly loop whithin >>>> + * "list_for_each_entry_rcu(entry, ima_rules, list)" >>>> + * >>>> + * After update "ima_rules", restore the previous list. >>>> + */ >>> I think this could be rephrased to be a tad clearer, I am not quite sure >>> how I must interpret the first sentence of the comment. >>> >>> >>>> + prev_rules->next = policy->next; >>>> ima_rules = policy; >>>> + syncchronize_rcu(); >>> I'm a bit puzzled as you seem to imply in the mail this patch was tested, >>> but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel. >>> Was that a copy/paste error? Or maybe you forgot the 'not' in "This >>> patch has been tested"? These errors happen, and I am myself quite an >>> expert in doing them :) >>> >>>> + prev_rules->next = first; >>>> >>>> >>>> The side effect is the "ima_default_rules" will be changed a little while. >>>> But it make sense, the process should be checked again by the new policy. >>>> >>>> This patch has been tested, if will do, I can resubmit this patch.> >>>> How about this ? >>> least >>> >>> Correct me if I'm wrong, here is how I think I understand you patch. >>> We start with a situation like that (step 0): >>> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >>> >>> Then we decide to update the policy for the first time, so >>> 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'. >>> We enter the condition. >>> First we copy the current value of ima_rules (&ima_default_rules) >>> to a temporary variable 'prev_rules'. We also create a pointer dubbed >>> 'first' to the entry 1 in the default list (step 1): >>> prev_rules ------------- >>> \/ >>> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >>> /\ >>> first -------------------------------------------------------------- >>> >>> >>> Then we update prev_rules->next to point to policy->next (step 2): >>> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >>> /\ >>> first >>> (notice that list entry 0 no longer points backwards to 'list entry 1', >>> but I don't think there is any reverse iteration in IMA, so it should be >>> safe) >>> >>> prev_rules ------------- >>> \/ >>> ima_rules --> List entry 0 (head node) = ima_default_rules >>> | >>> | >>> ------------------------------------------- >>> \/ >>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >>> >>> >>> We then update ima_rules to point to ima_policy_rules (step 3): >>> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >>> /\ >>> first >>> >>> prev_rules ------------- >>> \/ >>> ima_rules List entry 0 (head node) = ima_default_rules >>> | | >>> | | >>> | ------------------------------------------ >>> --------------- | >>> \/ \/ >>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >>> synchronize_rcu /\ >>> first -------------------------------------------------------------- >>> >>> Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4). >>> >>> Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5): >>> >>> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >>> /\ >>> first >>> >>> prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >>> /\ >>> first (now useless) >>> ima_rules >>> | >>> | >>> | >>> --------------- >>> \/ >>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >>> >>> The goal is that readers should still be able to loop >>> (forward, as we saw that backward looping is temporarily broken) >>> while in steps 0-4. >>> >>> I'm not completely sure what would happen to a client that started iterating >>> over ima_rules right after step 2. >>> >>> Wouldn't they be able to start looping through the new policy >>> as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules? >>> And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly >>> very shortly thereafter) completed? >>> And would the compiler be allowed to optimize the read to 'ima_rules' in the >>> list_for_each_entry() loop, thereby never reloading the new value for >>> 'ima_rules', and thus looping forever, just what we are trying to avoid? >>> >>> Overall, I'm tempted to say this is perhaps a bit too complex (at least, >>> my head tells me it is, but that may very well be because I'm terrible >>> at concurrency issues). >>> >>> Honestly, in this case I think awaiting input from more experienced >>> kernel devs than I is the best path forward :-) >> I'm far from an expert on RCU locking, but __list_splice_init_rcu() >> provides an example of how to make sure there aren't any readers >> traversing the list, before two lists are spliced together. In our >> case, after there aren't any readers, instead of splicing two lists >> together, it should be safe to point to the new list. >> >> thanks, >> >> Mimi >>
Hi Simon : Using a temporary ima_rules variable is not working for "ima_policy_next". void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos) { struct ima_rule_entry *entry = v; - + struct list_head *ima_rules_tmp = rcu_dereference(ima_rules); rcu_read_lock(); entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list); rcu_read_unlock(); (*pos)++; - return (&entry->list == ima_rules) ? NULL : entry; + return (&entry->list == ima_rules_tmp) ? NULL : entry; } It seems no way to fix "ima_rules" change within this function, it will alway return a entry if "ima_rules" being changed. Regrads, liqiong 在 2021年08月23日 11:04, 李力琼 写道: > Hi Mimi : > > The situation is a little different,'list_splice_init_rcu' > don't change the list head. If "ima_rules" being changed, > readers may can't reload the new value in time for cpu cache > or compiler optimization. Defining "ima_rules" as a volatile > variable can fix, but It is inefficient. > > Maybe using a temporary ima_rules variable for every > "list_for_each_entry_rcu(entry, ima_rules, list)" loop is > a better solution to fix the "endless loop" bug. > > Regards, > > liqiong > > 在 2021年08月20日 23:48, Mimi Zohar 写道: >> On Fri, 2021-08-20 at 13:23 +0000, THOBY Simon wrote: >>> Hi Liqiong, >>> >>> On 8/20/21 12:15 PM, 李力琼 wrote: >>>> Hi, Simon: >>>> >>>> This solution is better then rwsem, a temp "ima_rules" variable should >>>> can fix. I also have a another idea, with a little trick, default list >>>> can traverse to the new list, so we don't need care about the read side. >>>> >>>> here is the patch: >>>> >>>> @@ -918,8 +918,21 @@ void ima_update_policy(void) >>>> list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu); >>>> >>>> if (ima_rules != policy) { >>>> + struct list_head *prev_rules = ima_rules; >>>> + struct list_head *first = ima_rules->next; >>>> ima_policy_flag = 0; >>>> + >>>> + /* >>>> + * Make the previous list can traverse to new list, >>>> + * that is tricky, or there is a deadly loop whithin >>>> + * "list_for_each_entry_rcu(entry, ima_rules, list)" >>>> + * >>>> + * After update "ima_rules", restore the previous list. >>>> + */ >>> I think this could be rephrased to be a tad clearer, I am not quite sure >>> how I must interpret the first sentence of the comment. >>> >>> >>>> + prev_rules->next = policy->next; >>>> ima_rules = policy; >>>> + syncchronize_rcu(); >>> I'm a bit puzzled as you seem to imply in the mail this patch was tested, >>> but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel. >>> Was that a copy/paste error? Or maybe you forgot the 'not' in "This >>> patch has been tested"? These errors happen, and I am myself quite an >>> expert in doing them :) >>> >>>> + prev_rules->next = first; >>>> >>>> >>>> The side effect is the "ima_default_rules" will be changed a little while. >>>> But it make sense, the process should be checked again by the new policy. >>>> >>>> This patch has been tested, if will do, I can resubmit this patch.> >>>> How about this ? >>> least >>> >>> Correct me if I'm wrong, here is how I think I understand you patch. >>> We start with a situation like that (step 0): >>> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >>> >>> Then we decide to update the policy for the first time, so >>> 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'. >>> We enter the condition. >>> First we copy the current value of ima_rules (&ima_default_rules) >>> to a temporary variable 'prev_rules'. We also create a pointer dubbed >>> 'first' to the entry 1 in the default list (step 1): >>> prev_rules ------------- >>> \/ >>> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >>> /\ >>> first -------------------------------------------------------------- >>> >>> >>> Then we update prev_rules->next to point to policy->next (step 2): >>> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >>> /\ >>> first >>> (notice that list entry 0 no longer points backwards to 'list entry 1', >>> but I don't think there is any reverse iteration in IMA, so it should be >>> safe) >>> >>> prev_rules ------------- >>> \/ >>> ima_rules --> List entry 0 (head node) = ima_default_rules >>> | >>> | >>> ------------------------------------------- >>> \/ >>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >>> >>> >>> We then update ima_rules to point to ima_policy_rules (step 3): >>> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >>> /\ >>> first >>> >>> prev_rules ------------- >>> \/ >>> ima_rules List entry 0 (head node) = ima_default_rules >>> | | >>> | | >>> | ------------------------------------------ >>> --------------- | >>> \/ \/ >>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >>> synchronize_rcu /\ >>> first -------------------------------------------------------------- >>> >>> Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4). >>> >>> Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5): >>> >>> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >>> /\ >>> first >>> >>> prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >>> /\ >>> first (now useless) >>> ima_rules >>> | >>> | >>> | >>> --------------- >>> \/ >>> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >>> >>> The goal is that readers should still be able to loop >>> (forward, as we saw that backward looping is temporarily broken) >>> while in steps 0-4. >>> >>> I'm not completely sure what would happen to a client that started iterating >>> over ima_rules right after step 2. >>> >>> Wouldn't they be able to start looping through the new policy >>> as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules? >>> And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly >>> very shortly thereafter) completed? >>> And would the compiler be allowed to optimize the read to 'ima_rules' in the >>> list_for_each_entry() loop, thereby never reloading the new value for >>> 'ima_rules', and thus looping forever, just what we are trying to avoid? >>> >>> Overall, I'm tempted to say this is perhaps a bit too complex (at least, >>> my head tells me it is, but that may very well be because I'm terrible >>> at concurrency issues). >>> >>> Honestly, in this case I think awaiting input from more experienced >>> kernel devs than I is the best path forward :-) >> I'm far from an expert on RCU locking, but __list_splice_init_rcu() >> provides an example of how to make sure there aren't any readers >> traversing the list, before two lists are spliced together. In our >> case, after there aren't any readers, instead of splicing two lists >> together, it should be safe to point to the new list. >> >> thanks, >> >> Mimi >>
Hi Liqiong, On 8/23/21 10:06 AM, liqiong wrote: > Hi Simon : > > Using a temporary ima_rules variable is not working for "ima_policy_next". > > void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos) > { > struct ima_rule_entry *entry = v; > - > + struct list_head *ima_rules_tmp = rcu_dereference(ima_rules); > rcu_read_lock(); > entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list); > rcu_read_unlock(); > (*pos)++; > > - return (&entry->list == ima_rules) ? NULL : entry; > + return (&entry->list == ima_rules_tmp) ? NULL : entry; > } > > It seems no way to fix "ima_rules" change within this function, it will alway > return a entry if "ima_rules" being changed. - I think rcu_dereference() should be called inside the RCU read lock - Maybe we could cheat with: return (&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules) ? NULL : entry; as that's the only two rulesets IMA ever use? Admittedly, this is not as clean as previously, but it should work too. The way I see it, the semaphore solution would not work here either, as ima_policy_next() is called repeatedly as a seq_file (it is set up in ima_fs.c) and we can't control the locking there: we cannot lock across the seq_read() call (that cure could end up be worse than the disease, deadlock-wise), so I fear we cannot protect against a list update while a user is iterating with a lock. So in both cases a cheat like "&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules" maybe need to be considered. What do you think? > > Regrads, > > liqiong Thanks, Simon
Hi Liqiong, On Mon, 2021-08-23 at 11:04 +0800, 李力琼 wrote: > Hi Mimi : > > The situation is a little different,'list_splice_init_rcu' > don't change the list head. If "ima_rules" being changed, > readers may can't reload the new value in time for cpu cache > or compiler optimization. Defining "ima_rules" as a volatile > variable can fix, but It is inefficient. After looking at list_splice_tail_init_rcu() some more, it is actually making sure there aren't any readers traversing "ima_temp_rules", not "ima_policy_rules". There aren't any readers traversing "ima_temp_rules". > > Maybe using a temporary ima_rules variable for every > "list_for_each_entry_rcu(entry, ima_rules, list)" loop is > a better solution to fix the "endless loop" bug. Agreed thanks, Mimi
On Mon, 2021-08-23 at 08:14 +0000, THOBY Simon wrote: > Hi Liqiong, > > On 8/23/21 10:06 AM, liqiong wrote: > > Hi Simon : > > > > Using a temporary ima_rules variable is not working for "ima_policy_next". > > > > void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos) > > { > > struct ima_rule_entry *entry = v; > > - > > + struct list_head *ima_rules_tmp = rcu_dereference(ima_rules); > > rcu_read_lock(); > > entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list); > > rcu_read_unlock(); > > (*pos)++; > > > > - return (&entry->list == ima_rules) ? NULL : entry; > > + return (&entry->list == ima_rules_tmp) ? NULL : entry; > > } > > > > It seems no way to fix "ima_rules" change within this function, it will alway > > return a entry if "ima_rules" being changed. > > - I think rcu_dereference() should be called inside the RCU read lock > - Maybe we could cheat with: > return (&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules) ? NULL : entry; > as that's the only two rulesets IMA ever use? > Admittedly, this is not as clean as previously, but it should work too. > > The way I see it, the semaphore solution would not work here either, > as ima_policy_next() is called repeatedly as a seq_file > (it is set up in ima_fs.c) and we can't control the locking there: > we cannot lock across the seq_read() call (that cure could end up be > worse than the disease, deadlock-wise), so I fear we cannot protect > against a list update while a user is iterating with a lock. > > So in both cases a cheat like "&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules" > maybe need to be considered. > > What do you think? Is this an overall suggestion or limited to just ima_policy_next()? thanks, Mimi
Hi Mimi, On 8/23/21 1:57 PM, Mimi Zohar wrote: > On Mon, 2021-08-23 at 08:14 +0000, THOBY Simon wrote: >> Hi Liqiong, >> >> On 8/23/21 10:06 AM, liqiong wrote: >>> Hi Simon : >>> >>> Using a temporary ima_rules variable is not working for "ima_policy_next". >>> >>> void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos) >>> { >>> struct ima_rule_entry *entry = v; >>> - >>> + struct list_head *ima_rules_tmp = rcu_dereference(ima_rules); >>> rcu_read_lock(); >>> entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list); >>> rcu_read_unlock(); >>> (*pos)++; >>> >>> - return (&entry->list == ima_rules) ? NULL : entry; >>> + return (&entry->list == ima_rules_tmp) ? NULL : entry; >>> } >>> >>> It seems no way to fix "ima_rules" change within this function, it will alway >>> return a entry if "ima_rules" being changed. >> >> - I think rcu_dereference() should be called inside the RCU read lock >> - Maybe we could cheat with: >> return (&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules) ? NULL : entry; >> as that's the only two rulesets IMA ever use? >> Admittedly, this is not as clean as previously, but it should work too. >> >> The way I see it, the semaphore solution would not work here either, >> as ima_policy_next() is called repeatedly as a seq_file >> (it is set up in ima_fs.c) and we can't control the locking there: >> we cannot lock across the seq_read() call (that cure could end up be >> worse than the disease, deadlock-wise), so I fear we cannot protect >> against a list update while a user is iterating with a lock. >> >> So in both cases a cheat like "&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules" >> maybe need to be considered. >> >> What do you think? > > Is this an overall suggestion or limited to just ima_policy_next()? I was thinking only of ima_policy_next(), I don't think (from what I could see in a short glance) that other functions need such a treatment. The ima_rules_tmp dance is probably safe for the other uses of ima_rules. > > thanks, > > Mimi > > Thanks, Simon
Hi Simon, On Mon, 2021-08-23 at 12:02 +0000, THOBY Simon wrote: > Hi Mimi, > > On 8/23/21 1:57 PM, Mimi Zohar wrote: > > On Mon, 2021-08-23 at 08:14 +0000, THOBY Simon wrote: > >> Hi Liqiong, > >> > >> On 8/23/21 10:06 AM, liqiong wrote: > >>> Hi Simon : > >>> > >>> Using a temporary ima_rules variable is not working for "ima_policy_next". > >>> > >>> void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos) > >>> { > >>> struct ima_rule_entry *entry = v; > >>> - > >>> + struct list_head *ima_rules_tmp = rcu_dereference(ima_rules); > >>> rcu_read_lock(); > >>> entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list); > >>> rcu_read_unlock(); > >>> (*pos)++; > >>> > >>> - return (&entry->list == ima_rules) ? NULL : entry; > >>> + return (&entry->list == ima_rules_tmp) ? NULL : entry; > >>> } > >>> > >>> It seems no way to fix "ima_rules" change within this function, it will alway > >>> return a entry if "ima_rules" being changed. > >> > >> - I think rcu_dereference() should be called inside the RCU read lock > >> - Maybe we could cheat with: > >> return (&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules) ? NULL : entry; > >> as that's the only two rulesets IMA ever use? > >> Admittedly, this is not as clean as previously, but it should work too. > >> > >> The way I see it, the semaphore solution would not work here either, > >> as ima_policy_next() is called repeatedly as a seq_file > >> (it is set up in ima_fs.c) and we can't control the locking there: > >> we cannot lock across the seq_read() call (that cure could end up be > >> worse than the disease, deadlock-wise), so I fear we cannot protect > >> against a list update while a user is iterating with a lock. > >> > >> So in both cases a cheat like "&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules" > >> maybe need to be considered. > >> > >> What do you think? > > > > Is this an overall suggestion or limited to just ima_policy_next()? > > I was thinking only of ima_policy_next(), I don't think (from what I could see in a short glance) > that other functions need such a treatment. The ima_rules_tmp dance is probably safe for the > other uses of ima_rules. Thanks, just making sure it is limited to here. Mimi
Hi Simon : 在 2021年08月23日 16:14, THOBY Simon 写道: > Hi Liqiong, > > On 8/23/21 10:06 AM, liqiong wrote: >> Hi Simon : >> >> Using a temporary ima_rules variable is not working for "ima_policy_next". >> >> void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos) >> { >> struct ima_rule_entry *entry = v; >> - >> + struct list_head *ima_rules_tmp = rcu_dereference(ima_rules); >> rcu_read_lock(); >> entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list); >> rcu_read_unlock(); >> (*pos)++; >> >> - return (&entry->list == ima_rules) ? NULL : entry; >> + return (&entry->list == ima_rules_tmp) ? NULL : entry; >> } >> >> It seems no way to fix "ima_rules" change within this function, it will alway >> return a entry if "ima_rules" being changed. > - I think rcu_dereference() should be called inside the RCU read lock > - Maybe we could cheat with: > return (&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules) ? NULL : entry; > as that's the only two rulesets IMA ever use? > Admittedly, this is not as clean as previously, but it should work too. > > The way I see it, the semaphore solution would not work here either, > as ima_policy_next() is called repeatedly as a seq_file > (it is set up in ima_fs.c) and we can't control the locking there: > we cannot lock across the seq_read() call (that cure could end up be > worse than the disease, deadlock-wise), so I fear we cannot protect > against a list update while a user is iterating with a lock. > > So in both cases a cheat like "&entry->list == &ima_policy_rules || &entry->list == &ima_default_rules" > maybe need to be considered. > > What do you think? Yes, semaphore solution not work here, splicing two list is a little complex. This solution is simple and clear, should work. I will work on that, test and confirm the patch. "rcu_dereference() should be called inside the RCU read lock", I will correct this. Thanks for your help. Regrads, liqiong > > >> Regrads, >> >> liqiong > Thanks, > Simon
diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c index fd5d46e511f1..7e71e643457c 100644 --- a/security/integrity/ima/ima_policy.c +++ b/security/integrity/ima/ima_policy.c @@ -217,6 +217,7 @@ static LIST_HEAD(ima_default_rules); static LIST_HEAD(ima_policy_rules); static LIST_HEAD(ima_temp_rules); static struct list_head *ima_rules = &ima_default_rules; +static DECLARE_RWSEM(ima_rules_sem); static int ima_policy __initdata; @@ -666,6 +667,7 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode, if (template_desc && !*template_desc) *template_desc = ima_template_desc_current(); + down_read(&ima_rules_sem); rcu_read_lock(); list_for_each_entry_rcu(entry, ima_rules, list) { @@ -702,6 +704,7 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode, break; } rcu_read_unlock(); + up_read(&ima_rules_sem); return action; } @@ -919,7 +922,9 @@ void ima_update_policy(void) if (ima_rules != policy) { ima_policy_flag = 0; + down_write(&ima_rules_sem); ima_rules = policy; + up_write(&ima_rules_sem); /* * IMA architecture specific policy rules are specified
When "ima_match_policy" is looping while "ima_update_policy" changs the variable "ima_rules", then "ima_match_policy" may can't exit loop, and kernel keeps printf "rcu_sched detected stall on CPU ...". It occurs at boot phase, systemd-services are being checked within "ima_match_policy,at the same time, the variable "ima_rules" is changed by a service. Signed-off-by: liqiong <liqiong@nfschina.com> --- security/integrity/ima/ima_policy.c | 5 +++++ 1 file changed, 5 insertions(+)