
selinux: optimize MLS context to string conversion

Message ID: 20190730124852.7670-1-omosnace@redhat.com (mailing list archive)
State: Changes Requested
Series: selinux: optimize MLS context to string conversion

Commit Message

Ondrej Mosnacek July 30, 2019, 12:48 p.m. UTC
When mls_compute_context_len() or mls_sid_to_context() encounters a
context with large category ranges, it behaves suboptimally: it
traverses each positive bit of the category bitmap, calling
find_next_bit() again for every bit.

This has a large performance impact on UNIX datagram sockets with
SO_PASSSEC set, since the peer context needs to be converted to a
string for each received datagram. See [1] for more information.

This patch introduces a new helper for ebitmap traversal,
ebitmap_for_each_positive_range(), which allows efficient iteration
over ranges of positive bits instead of individual bits, and converts
the two mls_*() functions to leverage it.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1733259

Reported-by: Michal Sekletar <msekleta@redhat.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 security/selinux/ss/ebitmap.h | 46 +++++++++++++++++++++
 security/selinux/ss/mls.c     | 76 +++++++++++++----------------------
 2 files changed, 73 insertions(+), 49 deletions(-)

Comments

Paul Moore July 30, 2019, 2:46 p.m. UTC | #1
On Tue, Jul 30, 2019 at 8:48 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> When mls_compute_context_len() or mls_sid_to_context() encounters a
> context with large category ranges, it behaves suboptimally: it
> traverses each positive bit of the category bitmap, calling
> find_next_bit() again for every bit.
>
> This has a large performance impact on UNIX datagram sockets with
> SO_PASSSEC set, since the peer context needs to be converted to a
> string for each received datagram. See [1] for more information.
>
> This patch introduces a new helper for ebitmap traversal,
> ebitmap_for_each_positive_range(), which allows efficient iteration
> over ranges of positive bits instead of individual bits, and converts
> the two mls_*() functions to leverage it.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1733259
>
> Reported-by: Michal Sekletar <msekleta@redhat.com>
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>

My current opinion is that this isn't the right way to solve the
problem; a SELinux sid->secctx cache would be a better choice.  With
that in mind, and with the understanding that this patch is an
optimization and not a bug fix, I'm going to hold off on doing
anything with this patch until there is a cache option we can use for
comparison.

As Stephen mentioned in the RH BZ (linked in the description), there
are a couple of reasons why the code doesn't currently store the
string translations.  Ignoring the issue of aliasing for a moment, I
do want to stress that I agree with Stephen on the issue of memory
pressure and that to keep translated strings around indefinitely in
the kernel is going to be a problem (there are other issues beyond
just memory use).  I imagine any cache implementation would need to be
built to only store a subset of all the translations and there would
need to be a way to age-out those translations (as well as purge them
completely on events such as a policy load).
Casey Schaufler July 30, 2019, 4:35 p.m. UTC | #2
On 7/30/2019 7:46 AM, Paul Moore wrote:
> On Tue, Jul 30, 2019 at 8:48 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>> When mls_compute_context_len() or mls_sid_to_context() encounters a
>> context with large category ranges, it behaves suboptimally: it
>> traverses each positive bit of the category bitmap, calling
>> find_next_bit() again for every bit.
>>
>> This has a large performance impact on UNIX datagram sockets with
>> SO_PASSSEC set, since the peer context needs to be converted to a
>> string for each received datagram. See [1] for more information.
>>
>> This patch introduces a new helper for ebitmap traversal,
>> ebitmap_for_each_positive_range(), which allows efficient iteration
>> over ranges of positive bits instead of individual bits, and converts
>> the two mls_*() functions to leverage it.
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1733259
>>
>> Reported-by: Michal Sekletar <msekleta@redhat.com>
>> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> My current opinion is that this isn't the right way to solve the
> problem; a SELinux sid->secctx cache would be a better choice.

I implemented secctx retention a while ago (can't locate the
thread to determine exactly when) and the performance improvement
was rather impressive. A cache with proper lifecycle management
would be a tremendous win for audit by reducing not only the time
required to generate a secctx but also the frequency at which they
would need to be copied. The downside would be on small systems with
sophisticated policies, where a small cache would be subject to
thrashing. If the cached values aren't considered when doing
secctx->sid conversions, I would think aliasing issues wouldn't
apply, although I have missed subtleties in the past.

I'm rooting for the cache solution.


Stephen Smalley Sept. 16, 2019, 2:43 p.m. UTC | #3
On 7/30/19 8:48 AM, Ondrej Mosnacek wrote:
> When mls_compute_context_len() or mls_sid_to_context() encounters a
> context with large category ranges, it behaves suboptimally: it
> traverses each positive bit of the category bitmap, calling
> find_next_bit() again for every bit.
> 
> This has a large performance impact on UNIX datagram sockets with
> SO_PASSSEC set, since the peer context needs to be converted to a
> string for each received datagram. See [1] for more information.
> 
> This patch introduces a new helper for ebitmap traversal,
> ebitmap_for_each_positive_range(), which allows efficient iteration
> over ranges of positive bits instead of individual bits, and converts
> the two mls_*() functions to leverage it.
> 
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1733259
> 
> Reported-by: Michal Sekletar <msekleta@redhat.com>
> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
> ---
>   security/selinux/ss/ebitmap.h | 46 +++++++++++++++++++++
>   security/selinux/ss/mls.c     | 76 +++++++++++++----------------------
>   2 files changed, 73 insertions(+), 49 deletions(-)
> 
> diff --git a/security/selinux/ss/ebitmap.h b/security/selinux/ss/ebitmap.h
> index 6aa7cf6a2197..a415741cb206 100644
> --- a/security/selinux/ss/ebitmap.h
> +++ b/security/selinux/ss/ebitmap.h
> @@ -42,6 +42,10 @@ struct ebitmap {
>   	u32 highbit;	/* highest position in the total bitmap */
>   };
>   
> +struct ebitmap_range {
> +	unsigned int start, end;
> +};
> +
>   #define ebitmap_length(e) ((e)->highbit)
>   
>   static inline unsigned int ebitmap_start_positive(struct ebitmap *e,
> @@ -80,6 +84,43 @@ static inline unsigned int ebitmap_next_positive(struct ebitmap *e,
>   	return ebitmap_length(e);
>   }
>   
> +static inline unsigned int ebitmap_next_negative(struct ebitmap *e,
> +						 struct ebitmap_node **n,
> +						 unsigned int bit)
> +{
> +	unsigned int ofs;
> +
> +	ofs = find_next_zero_bit((*n)->maps, EBITMAP_SIZE,
> +				 bit - (*n)->startbit + 1);
> +	if (ofs < EBITMAP_SIZE)
> +		return ofs + (*n)->startbit;
> +
> +	for (*n = (*n)->next; *n; *n = (*n)->next) {
> +		ofs = find_first_zero_bit((*n)->maps, EBITMAP_SIZE);
> +		if (ofs < EBITMAP_SIZE)
> +			return ofs + (*n)->startbit;
> +	}
> +	return ebitmap_length(e);
> +}

This is likely moot given that the patch was deferred on exploring the 
cache option, but if/when this patch is revisited, you don't seem to 
account for the possibility that there could be a hole between the 
bitmaps represented by (*n) and (*n)->next, and that might be where the 
next negative/zero bit is.

> +
> +static inline void ebitmap_start_positive_range(struct ebitmap *e,
> +						struct ebitmap_node **n,
> +						struct ebitmap_range *range)
> +{
> +	range->end = range->start = ebitmap_start_positive(e, n);
> +	if (range->start < ebitmap_length(e))
> +		range->end = ebitmap_next_negative(e, n, range->start);
> +}
> +
> +static inline void ebitmap_next_positive_range(struct ebitmap *e,
> +					       struct ebitmap_node **n,
> +					       struct ebitmap_range *range)
> +{
> +	range->end = range->start = ebitmap_next_positive(e, n, range->end);
> +	if (range->start < ebitmap_length(e))
> +		range->end = ebitmap_next_negative(e, n, range->start);
> +}
> +
>   #define EBITMAP_NODE_INDEX(node, bit)	\
>   	(((bit) - (node)->startbit) / EBITMAP_UNIT_SIZE)
>   #define EBITMAP_NODE_OFFSET(node, bit)	\
> @@ -122,6 +163,11 @@ static inline void ebitmap_node_clr_bit(struct ebitmap_node *n,
>   	     bit < ebitmap_length(e);			\
>   	     bit = ebitmap_next_positive(e, &n, bit))	\
>   
> +#define ebitmap_for_each_positive_range(e, n, range)		\
> +	for (ebitmap_start_positive_range(e, &n, &range);	\
> +	     range.start < ebitmap_length(e);			\
> +	     ebitmap_next_positive_range(e, &n, &range))	\
> +
>   int ebitmap_cmp(struct ebitmap *e1, struct ebitmap *e2);
>   int ebitmap_cpy(struct ebitmap *dst, struct ebitmap *src);
>   int ebitmap_contains(struct ebitmap *e1, struct ebitmap *e2, u32 last_e2bit);
> diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
> index 5e05f5b902d7..3abd6b950c66 100644
> --- a/security/selinux/ss/mls.c
> +++ b/security/selinux/ss/mls.c
> @@ -35,10 +35,12 @@
>    */
>   int mls_compute_context_len(struct policydb *p, struct context *context)
>   {
> -	int i, l, len, head, prev;
> +	int l, len;
>   	char *nm;
>   	struct ebitmap *e;
>   	struct ebitmap_node *node;
> +	struct ebitmap_range range;
> +	unsigned int rlen;
>   
>   	if (!p->mls_enabled)
>   		return 0;
> @@ -49,24 +51,14 @@ int mls_compute_context_len(struct policydb *p, struct context *context)
>   		len += strlen(sym_name(p, SYM_LEVELS, index_sens - 1));
>   
>   		/* categories */
> -		head = -2;
> -		prev = -2;
>   		e = &context->range.level[l].cat;
> -		ebitmap_for_each_positive_bit(e, node, i) {
> -			if (i - prev > 1) {
> -				/* one or more negative bits are skipped */
> -				if (head != prev) {
> -					nm = sym_name(p, SYM_CATS, prev);
> -					len += strlen(nm) + 1;
> -				}
> -				nm = sym_name(p, SYM_CATS, i);
> +		ebitmap_for_each_positive_range(e, node, range) {
> +			rlen = range.end - range.start;
> +			if (rlen > 1) {
> +				nm = sym_name(p, SYM_CATS, range.start);
>   				len += strlen(nm) + 1;
> -				head = i;
>   			}
> -			prev = i;
> -		}
> -		if (prev != head) {
> -			nm = sym_name(p, SYM_CATS, prev);
> +			nm = sym_name(p, SYM_CATS, range.end - 1);
>   			len += strlen(nm) + 1;
>   		}
>   		if (l == 0) {
> @@ -91,9 +83,11 @@ void mls_sid_to_context(struct policydb *p,
>   			char **scontext)
>   {
>   	char *scontextp, *nm;
> -	int i, l, head, prev;
> +	int l, first;
>   	struct ebitmap *e;
>   	struct ebitmap_node *node;
> +	struct ebitmap_range range;
> +	unsigned int rlen;
>   
>   	if (!p->mls_enabled)
>   		return;
> @@ -109,43 +103,27 @@ void mls_sid_to_context(struct policydb *p,
>   		scontextp += strlen(scontextp);
>   
>   		/* categories */
> -		head = -2;
> -		prev = -2;
> +		first = 1;
>   		e = &context->range.level[l].cat;
> -		ebitmap_for_each_positive_bit(e, node, i) {
> -			if (i - prev > 1) {
> -				/* one or more negative bits are skipped */
> -				if (prev != head) {
> -					if (prev - head > 1)
> -						*scontextp++ = '.';
> -					else
> -						*scontextp++ = ',';
> -					nm = sym_name(p, SYM_CATS, prev);
> -					strcpy(scontextp, nm);
> -					scontextp += strlen(nm);
> -				}
> -				if (prev < 0)
> -					*scontextp++ = ':';
> -				else
> -					*scontextp++ = ',';
> -				nm = sym_name(p, SYM_CATS, i);
> -				strcpy(scontextp, nm);
> -				scontextp += strlen(nm);
> -				head = i;
> -			}
> -			prev = i;
> -		}
> -
> -		if (prev != head) {
> -			if (prev - head > 1)
> -				*scontextp++ = '.';
> -			else
> +		ebitmap_for_each_positive_range(e, node, range) {
> +			if (first) {
> +				first = 0;
> +				*scontextp++ = ':';
> +			} else {
>   				*scontextp++ = ',';
> -			nm = sym_name(p, SYM_CATS, prev);
> +			}
> +			nm = sym_name(p, SYM_CATS, range.start);
>   			strcpy(scontextp, nm);
>   			scontextp += strlen(nm);
> -		}
> +			rlen = range.end - range.start;
> +			if (rlen > 1) {
> +				*scontextp++ = rlen > 2 ? '.' : ',';
>   
> +				nm = sym_name(p, SYM_CATS, range.end - 1);
> +				strcpy(scontextp, nm);
> +				scontextp += strlen(nm);
> +			}
> +		}
>   		if (l == 0) {
>   			if (mls_level_eq(&context->range.level[0],
>   					 &context->range.level[1]))
>
Stephen Smalley Sept. 16, 2019, 5:30 p.m. UTC | #4
On 9/16/19 10:43 AM, Stephen Smalley wrote:
> On 7/30/19 8:48 AM, Ondrej Mosnacek wrote:
>> [...]
>> +static inline unsigned int ebitmap_next_negative(struct ebitmap *e,
>> +                         struct ebitmap_node **n,
>> +                         unsigned int bit)
>> +{
>> +    unsigned int ofs;
>> +
>> +    ofs = find_next_zero_bit((*n)->maps, EBITMAP_SIZE,
>> +                 bit - (*n)->startbit + 1);
>> +    if (ofs < EBITMAP_SIZE)
>> +        return ofs + (*n)->startbit;
>> +
>> +    for (*n = (*n)->next; *n; *n = (*n)->next) {
>> +        ofs = find_first_zero_bit((*n)->maps, EBITMAP_SIZE);
>> +        if (ofs < EBITMAP_SIZE)
>> +            return ofs + (*n)->startbit;
>> +    }
>> +    return ebitmap_length(e);
>> +}
> 
> This is likely moot given that the patch was deferred on exploring the 
> cache option, but if/when this patch is revisited, you don't seem to 
> account for the possibility that there could be a hole between the 
> bitmaps represented by (*n) and (*n)->next, and that might be where the 
> next negative/zero bit is.

You can see this bug manifest by doing the following:
runcon -l s0:c0.c383,c768 bash
id -Z

The correct output of course would be s0:c0.c383,c768, but your patched 
kernel will report s0:c0.c768.

Ondrej Mosnacek Sept. 17, 2019, 10:23 a.m. UTC | #5
On Mon, Sep 16, 2019 at 7:30 PM Stephen Smalley <sds@tycho.nsa.gov> wrote:
> On 9/16/19 10:43 AM, Stephen Smalley wrote:
> > On 7/30/19 8:48 AM, Ondrej Mosnacek wrote:
> >> [...]
> >> +static inline unsigned int ebitmap_next_negative(struct ebitmap *e,
> >> +                         struct ebitmap_node **n,
> >> +                         unsigned int bit)
> >> +{
> >> +    unsigned int ofs;
> >> +
> >> +    ofs = find_next_zero_bit((*n)->maps, EBITMAP_SIZE,
> >> +                 bit - (*n)->startbit + 1);
> >> +    if (ofs < EBITMAP_SIZE)
> >> +        return ofs + (*n)->startbit;
> >> +
> >> +    for (*n = (*n)->next; *n; *n = (*n)->next) {
> >> +        ofs = find_first_zero_bit((*n)->maps, EBITMAP_SIZE);
> >> +        if (ofs < EBITMAP_SIZE)
> >> +            return ofs + (*n)->startbit;
> >> +    }
> >> +    return ebitmap_length(e);
> >> +}
> >
> > This is likely moot given that the patch was deferred on exploring the
> > cache option, but if/when this patch is revisited, you don't seem to
> > account for the possibility that there could be a hole between the
> > bitmaps represented by (*n) and (*n)->next, and that might be where the
> > next negative/zero bit is.
>
> You can see this bug manifest by doing the following:
> runcon -l s0:c0.c383,c768 bash
> id -Z
>
> The correct output of course would be s0:c0.c383,c768, but your patched
> kernel will report s0:c0.c768.

Yes, good catch! This patch would need some focused testing (and
fixing) before being committed. A simple fuzz test over
/sys/fs/selinux/context should be enough, but I didn't get around to
writing it. I probably should have sent the patch as an RFC
initially, though...



--
Ondrej Mosnacek <omosnace at redhat dot com>
Software Engineer, Security Technologies
Red Hat, Inc.
Patch

diff --git a/security/selinux/ss/ebitmap.h b/security/selinux/ss/ebitmap.h
index 6aa7cf6a2197..a415741cb206 100644
--- a/security/selinux/ss/ebitmap.h
+++ b/security/selinux/ss/ebitmap.h
@@ -42,6 +42,10 @@  struct ebitmap {
 	u32 highbit;	/* highest position in the total bitmap */
 };
 
+struct ebitmap_range {
+	unsigned int start, end;
+};
+
 #define ebitmap_length(e) ((e)->highbit)
 
 static inline unsigned int ebitmap_start_positive(struct ebitmap *e,
@@ -80,6 +84,43 @@  static inline unsigned int ebitmap_next_positive(struct ebitmap *e,
 	return ebitmap_length(e);
 }
 
+static inline unsigned int ebitmap_next_negative(struct ebitmap *e,
+						 struct ebitmap_node **n,
+						 unsigned int bit)
+{
+	unsigned int ofs;
+
+	ofs = find_next_zero_bit((*n)->maps, EBITMAP_SIZE,
+				 bit - (*n)->startbit + 1);
+	if (ofs < EBITMAP_SIZE)
+		return ofs + (*n)->startbit;
+
+	for (*n = (*n)->next; *n; *n = (*n)->next) {
+		ofs = find_first_zero_bit((*n)->maps, EBITMAP_SIZE);
+		if (ofs < EBITMAP_SIZE)
+			return ofs + (*n)->startbit;
+	}
+	return ebitmap_length(e);
+}
+
+static inline void ebitmap_start_positive_range(struct ebitmap *e,
+						struct ebitmap_node **n,
+						struct ebitmap_range *range)
+{
+	range->end = range->start = ebitmap_start_positive(e, n);
+	if (range->start < ebitmap_length(e))
+		range->end = ebitmap_next_negative(e, n, range->start);
+}
+
+static inline void ebitmap_next_positive_range(struct ebitmap *e,
+					       struct ebitmap_node **n,
+					       struct ebitmap_range *range)
+{
+	range->end = range->start = ebitmap_next_positive(e, n, range->end);
+	if (range->start < ebitmap_length(e))
+		range->end = ebitmap_next_negative(e, n, range->start);
+}
+
 #define EBITMAP_NODE_INDEX(node, bit)	\
 	(((bit) - (node)->startbit) / EBITMAP_UNIT_SIZE)
 #define EBITMAP_NODE_OFFSET(node, bit)	\
@@ -122,6 +163,11 @@  static inline void ebitmap_node_clr_bit(struct ebitmap_node *n,
 	     bit < ebitmap_length(e);			\
 	     bit = ebitmap_next_positive(e, &n, bit))	\
 
+#define ebitmap_for_each_positive_range(e, n, range)		\
+	for (ebitmap_start_positive_range(e, &n, &range);	\
+	     range.start < ebitmap_length(e);			\
+	     ebitmap_next_positive_range(e, &n, &range))	\
+
 int ebitmap_cmp(struct ebitmap *e1, struct ebitmap *e2);
 int ebitmap_cpy(struct ebitmap *dst, struct ebitmap *src);
 int ebitmap_contains(struct ebitmap *e1, struct ebitmap *e2, u32 last_e2bit);
diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
index 5e05f5b902d7..3abd6b950c66 100644
--- a/security/selinux/ss/mls.c
+++ b/security/selinux/ss/mls.c
@@ -35,10 +35,12 @@ 
  */
 int mls_compute_context_len(struct policydb *p, struct context *context)
 {
-	int i, l, len, head, prev;
+	int l, len;
 	char *nm;
 	struct ebitmap *e;
 	struct ebitmap_node *node;
+	struct ebitmap_range range;
+	unsigned int rlen;
 
 	if (!p->mls_enabled)
 		return 0;
@@ -49,24 +51,14 @@  int mls_compute_context_len(struct policydb *p, struct context *context)
 		len += strlen(sym_name(p, SYM_LEVELS, index_sens - 1));
 
 		/* categories */
-		head = -2;
-		prev = -2;
 		e = &context->range.level[l].cat;
-		ebitmap_for_each_positive_bit(e, node, i) {
-			if (i - prev > 1) {
-				/* one or more negative bits are skipped */
-				if (head != prev) {
-					nm = sym_name(p, SYM_CATS, prev);
-					len += strlen(nm) + 1;
-				}
-				nm = sym_name(p, SYM_CATS, i);
+		ebitmap_for_each_positive_range(e, node, range) {
+			rlen = range.end - range.start;
+			if (rlen > 1) {
+				nm = sym_name(p, SYM_CATS, range.start);
 				len += strlen(nm) + 1;
-				head = i;
 			}
-			prev = i;
-		}
-		if (prev != head) {
-			nm = sym_name(p, SYM_CATS, prev);
+			nm = sym_name(p, SYM_CATS, range.end - 1);
 			len += strlen(nm) + 1;
 		}
 		if (l == 0) {
@@ -91,9 +83,11 @@  void mls_sid_to_context(struct policydb *p,
 			char **scontext)
 {
 	char *scontextp, *nm;
-	int i, l, head, prev;
+	int l, first;
 	struct ebitmap *e;
 	struct ebitmap_node *node;
+	struct ebitmap_range range;
+	unsigned int rlen;
 
 	if (!p->mls_enabled)
 		return;
@@ -109,43 +103,27 @@  void mls_sid_to_context(struct policydb *p,
 		scontextp += strlen(scontextp);
 
 		/* categories */
-		head = -2;
-		prev = -2;
+		first = 1;
 		e = &context->range.level[l].cat;
-		ebitmap_for_each_positive_bit(e, node, i) {
-			if (i - prev > 1) {
-				/* one or more negative bits are skipped */
-				if (prev != head) {
-					if (prev - head > 1)
-						*scontextp++ = '.';
-					else
-						*scontextp++ = ',';
-					nm = sym_name(p, SYM_CATS, prev);
-					strcpy(scontextp, nm);
-					scontextp += strlen(nm);
-				}
-				if (prev < 0)
-					*scontextp++ = ':';
-				else
-					*scontextp++ = ',';
-				nm = sym_name(p, SYM_CATS, i);
-				strcpy(scontextp, nm);
-				scontextp += strlen(nm);
-				head = i;
-			}
-			prev = i;
-		}
-
-		if (prev != head) {
-			if (prev - head > 1)
-				*scontextp++ = '.';
-			else
+		ebitmap_for_each_positive_range(e, node, range) {
+			if (first) {
+				first = 0;
+				*scontextp++ = ':';
+			} else {
 				*scontextp++ = ',';
-			nm = sym_name(p, SYM_CATS, prev);
+			}
+			nm = sym_name(p, SYM_CATS, range.start);
 			strcpy(scontextp, nm);
 			scontextp += strlen(nm);
-		}
+			rlen = range.end - range.start;
+			if (rlen > 1) {
+				*scontextp++ = rlen > 2 ? '.' : ',';
+				nm = sym_name(p, SYM_CATS, range.end - 1);
+				strcpy(scontextp, nm);
+				scontextp += strlen(nm);
+			}
+		}
 		if (l == 0) {
 			if (mls_level_eq(&context->range.level[0],
 					 &context->range.level[1]))