diff mbox

lib: Always NUL terminate ucs2_as_utf8

Message ID 1461141427-16361-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson April 20, 2016, 8:37 a.m. UTC
If the caller, in this case efivarfs_callback(), only provides sufficent
room for the expanded utf8 and not enough to include the terminating NUL
byte, that NUL byte is skipped. When the caller then interprets it as a
string, it may then read from past its allocated memory:

[  170.605647] WARNING: kmemcheck: Caught 8-bit read from uninitialized memory (ffff8804079ae786)
[  170.605677] 436f6e4f757400004c44322d35363062663538612d316530642d346437652d39
[  170.606037]  i i i i i i u u u u u u u u u u u u u u u u u u u u u u u u u u
[  170.606236]              ^
[  170.606243] RIP: 0010:[<ffffffff813a251f>]  [<ffffffff813a251f>] efivar_variable_is_removable+0xaf/0xf0
[  170.606346] RSP: 0018:ffff880408e73c20  EFLAGS: 00010206
[  170.606352] RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000006
[  170.606359] RDX: 0000000000000000 RSI: 0000000000000074 RDI: ffff880408e73c30
[  170.606365] RBP: ffff880408e73c80 R08: 0000000000000006 R09: 000000000000008c
[  170.606371] R10: 0000000000000006 R11: 0000000000000000 R12: ffffffff8166ed20
[  170.606378] R13: 11d293ca8be4df61 R14: ffffffff81773834 R15: ffff8804079ae780
[  170.606385] FS:  0000000000000000(0000) GS:ffff88041ca00000(0000) knlGS:0000000000000000
[  170.606392] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  170.606399] CR2: ffff880409cbe4c0 CR3: 00000004085fd000 CR4: 00000000001406f0
[  170.606405]  [<ffffffff811eb938>] efivarfs_callback+0xf8/0x275
[  170.606418]  [<ffffffff813a3368>] efivar_init+0x248/0x2e0
[  170.606440]  [<ffffffff811eb6b4>] efivarfs_fill_super+0xb4/0xf0
[  170.606452]  [<ffffffff811333e7>] mount_single+0x87/0xb0
[  170.606463]  [<ffffffff811eb5f3>] efivarfs_mount+0x13/0x20
[  170.606475]  [<ffffffff81133480>] mount_fs+0x10/0x90
[  170.606497]  [<ffffffff8114c732>] vfs_kern_mount+0x62/0x100
[  170.606508]  [<ffffffff8114ecb0>] do_mount+0x1e0/0xcd0
[  170.606519]  [<ffffffff8114fa9f>] SyS_mount+0x8f/0xd0
[  170.606530]  [<ffffffff81451d1f>] entry_SYSCALL_64_fastpath+0x17/0x93
[  170.606542]  [<ffffffffffffffff>] 0xffffffffffffffff

Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Jason Andryuk <jandryuk@gmail.com>
Cc: Matthew Garrett <mjg59@coreos.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: linux-efi@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/firmware/efi/vars.c |  2 +-
 lib/ucs2_string.c           | 15 ++++++++++-----
 2 files changed, 11 insertions(+), 6 deletions(-)

Comments

Laszlo Ersek April 20, 2016, 9:36 a.m. UTC | #1
On 04/20/16 10:37, Chris Wilson wrote:
> If the caller, in this case efivarfs_callback(), only provides sufficent
> room for the expanded utf8 and not enough to include the terminating NUL
> byte, that NUL byte is skipped.

How does that occur? In efivarfs_callback() [fs/efivarfs/super.c], we have

	len = ucs2_utf8size(entry->var.VariableName);

	/* name, plus '-', plus GUID, plus NUL*/
	name = kmalloc(len + 1 + EFI_VARIABLE_GUID_LEN + 1, GFP_KERNEL);
	if (!name)
		goto fail;

	ucs2_as_utf8(name, entry->var.VariableName, len);

Instead, I think the following might be happening (note that RIP points
into efivar_variable_is_removable(), and I guess variable_matches()
(which is static) is inlined):

efivarfs_callback()              [fs/efivarfs/super.c]
  efivar_variable_is_removable() [drivers/firmware/efi/vars.c]
    variable_matches()           [drivers/firmware/efi/vars.c]

The bug seems to be in variable_matches(), which doesn't consider the
"len" parameter early enough. Namely, consider that we have the
following input:

- var_name: "a"
- len: 1
- match_name "ab"

In the first iteration of the loop (i.e., *match == 0):
- c = 'a'
- u = 'a'
- *match gets incremented to 1.

In the second iteration of the loop (i.e., *match == 1):
- c = 'b'
- u = <indeterminate value> (that is, undefined behavior),
  because (*match == len).

This seems to be consistent with the error message "Caught 8-bit read
from uninitialized memory": namely, the array allocated for "name" in
efivarfs_callback() is indeed not pre-zeroed, and the ucs2_as_utf8()
function does not populate name[len] -- correctly, I would say.

So, I think the function that needs a fix is variable_matches().

(I don't disagree though that it could be useful to audit all
ucs2_as_utf8() calls.)

Thanks
Laszlo


> When the caller then interprets it as a
> string, it may then read from past its allocated memory:
> 
> [  170.605647] WARNING: kmemcheck: Caught 8-bit read from uninitialized memory (ffff8804079ae786)
> [  170.605677] 436f6e4f757400004c44322d35363062663538612d316530642d346437652d39
> [  170.606037]  i i i i i i u u u u u u u u u u u u u u u u u u u u u u u u u u
> [  170.606236]              ^
> [  170.606243] RIP: 0010:[<ffffffff813a251f>]  [<ffffffff813a251f>] efivar_variable_is_removable+0xaf/0xf0
> [  170.606346] RSP: 0018:ffff880408e73c20  EFLAGS: 00010206
> [  170.606352] RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000006
> [  170.606359] RDX: 0000000000000000 RSI: 0000000000000074 RDI: ffff880408e73c30
> [  170.606365] RBP: ffff880408e73c80 R08: 0000000000000006 R09: 000000000000008c
> [  170.606371] R10: 0000000000000006 R11: 0000000000000000 R12: ffffffff8166ed20
> [  170.606378] R13: 11d293ca8be4df61 R14: ffffffff81773834 R15: ffff8804079ae780
> [  170.606385] FS:  0000000000000000(0000) GS:ffff88041ca00000(0000) knlGS:0000000000000000
> [  170.606392] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  170.606399] CR2: ffff880409cbe4c0 CR3: 00000004085fd000 CR4: 00000000001406f0
> [  170.606405]  [<ffffffff811eb938>] efivarfs_callback+0xf8/0x275
> [  170.606418]  [<ffffffff813a3368>] efivar_init+0x248/0x2e0
> [  170.606440]  [<ffffffff811eb6b4>] efivarfs_fill_super+0xb4/0xf0
> [  170.606452]  [<ffffffff811333e7>] mount_single+0x87/0xb0
> [  170.606463]  [<ffffffff811eb5f3>] efivarfs_mount+0x13/0x20
> [  170.606475]  [<ffffffff81133480>] mount_fs+0x10/0x90
> [  170.606497]  [<ffffffff8114c732>] vfs_kern_mount+0x62/0x100
> [  170.606508]  [<ffffffff8114ecb0>] do_mount+0x1e0/0xcd0
> [  170.606519]  [<ffffffff8114fa9f>] SyS_mount+0x8f/0xd0
> [  170.606530]  [<ffffffff81451d1f>] entry_SYSCALL_64_fastpath+0x17/0x93
> [  170.606542]  [<ffffffffffffffff>] 0xffffffffffffffff
> 
> Cc: Matt Fleming <matt@codeblueprint.co.uk>
> Cc: Jason Andryuk <jandryuk@gmail.com>
> Cc: Matthew Garrett <mjg59@coreos.com>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Peter Jones <pjones@redhat.com>
> Cc: linux-efi@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/firmware/efi/vars.c |  2 +-
>  lib/ucs2_string.c           | 15 ++++++++++-----
>  2 files changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
> index 0ac594c0a234..8dd503bac35d 100644
> --- a/drivers/firmware/efi/vars.c
> +++ b/drivers/firmware/efi/vars.c
> @@ -166,7 +166,7 @@ validate_ascii_string(efi_char16_t *var_name, int match, u8 *buffer,
>  
>  struct variable_validate {
>  	efi_guid_t vendor;
> -	char *name;
> +	const char *name;
>  	bool (*validate)(efi_char16_t *var_name, int match, u8 *data,
>  			 unsigned long len);
>  };
> diff --git a/lib/ucs2_string.c b/lib/ucs2_string.c
> index f0b323abb4c6..fb8d03966656 100644
> --- a/lib/ucs2_string.c
> +++ b/lib/ucs2_string.c
> @@ -85,29 +85,34 @@ ucs2_as_utf8(u8 *dest, const ucs2_char_t *src, unsigned long maxlength)
>  	unsigned long j = 0;
>  	unsigned long limit = ucs2_strnlen(src, maxlength);
>  
> -	for (i = 0; maxlength && i < limit; i++) {
> +	if (maxlength == 0)
> +		return 0;
> +
> +	for (i = 0; i < limit; i++) {
>  		u16 c = src[i];
>  
>  		if (c >= 0x800) {
> -			if (maxlength < 3)
> +			if (maxlength <= 3)
>  				break;
>  			maxlength -= 3;
>  			dest[j++] = 0xe0 | (c & 0xf000) >> 12;
>  			dest[j++] = 0x80 | (c & 0x0fc0) >> 6;
>  			dest[j++] = 0x80 | (c & 0x003f);
>  		} else if (c >= 0x80) {
> -			if (maxlength < 2)
> +			if (maxlength <= 2)
>  				break;
>  			maxlength -= 2;
>  			dest[j++] = 0xc0 | (c & 0x7c0) >> 6;
>  			dest[j++] = 0x80 | (c & 0x03f);
>  		} else {
> +			if (maxlength <= 1)
> +				break;
>  			maxlength -= 1;
>  			dest[j++] = c & 0x7f;
>  		}
>  	}
> -	if (maxlength)
> -		dest[j] = '\0';
> +	dest[j] = '\0';
> +
>  	return j;
>  }
>  EXPORT_SYMBOL(ucs2_as_utf8);
>
Chris Wilson April 20, 2016, 9:41 a.m. UTC | #2
On Wed, Apr 20, 2016 at 11:36:37AM +0200, Laszlo Ersek wrote:
> On 04/20/16 10:37, Chris Wilson wrote:
> > If the caller, in this case efivarfs_callback(), only provides sufficent
> > room for the expanded utf8 and not enough to include the terminating NUL
> > byte, that NUL byte is skipped.
> 
> How does that occur? In efivarfs_callback() [fs/efivarfs/super.c], we have
> 
> 	len = ucs2_utf8size(entry->var.VariableName);
> 
> 	/* name, plus '-', plus GUID, plus NUL*/
> 	name = kmalloc(len + 1 + EFI_VARIABLE_GUID_LEN + 1, GFP_KERNEL);
> 	if (!name)
> 		goto fail;
> 
> 	ucs2_as_utf8(name, entry->var.VariableName, len);
> 
> Instead, I think the following might be happening (note that RIP points
> into efivar_variable_is_removable(), and I guess variable_matches()
> (which is static) is inlined):
> 
> efivarfs_callback()              [fs/efivarfs/super.c]
>   efivar_variable_is_removable() [drivers/firmware/efi/vars.c]
>     variable_matches()           [drivers/firmware/efi/vars.c]
> 
> The bug seems to be in variable_matches(), which doesn't consider the
> "len" parameter early enough. Namely, consider that we have the
> following input:
> 
> - var_name: "a"
> - len: 1
> - match_name "ab"
> 
> In the first iteration of the loop (i.e., *match == 0):
> - c = 'a'
> - u = 'a'
> - *match gets incremented to 1.
> 
> In the second iteration of the loop (i.e., *match == 1):
> - c = 'b'
> - u = <indeterminate value> (that is, undefined behavior),
>   because (*match == len).
> 
> This seems to be consistent with the error message "Caught 8-bit read
> from uninitialized memory": namely, the array allocated for "name" in
> efivarfs_callback() is indeed not pre-zeroed, and the ucs2_as_utf8()
> function does not populate name[len] -- correctly, I would say.

ucs2_as_utf8 reports that it returns a NUL terminated string. It didn't
in this case.
-Chris
Laszlo Ersek April 20, 2016, 12:45 p.m. UTC | #3
On 04/20/16 11:41, Chris Wilson wrote:
> On Wed, Apr 20, 2016 at 11:36:37AM +0200, Laszlo Ersek wrote:
>> On 04/20/16 10:37, Chris Wilson wrote:
>>> If the caller, in this case efivarfs_callback(), only provides sufficent
>>> room for the expanded utf8 and not enough to include the terminating NUL
>>> byte, that NUL byte is skipped.
>>
>> How does that occur? In efivarfs_callback() [fs/efivarfs/super.c], we have
>>
>> 	len = ucs2_utf8size(entry->var.VariableName);
>>
>> 	/* name, plus '-', plus GUID, plus NUL*/
>> 	name = kmalloc(len + 1 + EFI_VARIABLE_GUID_LEN + 1, GFP_KERNEL);
>> 	if (!name)
>> 		goto fail;
>>
>> 	ucs2_as_utf8(name, entry->var.VariableName, len);
>>
>> Instead, I think the following might be happening (note that RIP points
>> into efivar_variable_is_removable(), and I guess variable_matches()
>> (which is static) is inlined):
>>
>> efivarfs_callback()              [fs/efivarfs/super.c]
>>   efivar_variable_is_removable() [drivers/firmware/efi/vars.c]
>>     variable_matches()           [drivers/firmware/efi/vars.c]
>>
>> The bug seems to be in variable_matches(), which doesn't consider the
>> "len" parameter early enough. Namely, consider that we have the
>> following input:
>>
>> - var_name: "a"
>> - len: 1
>> - match_name "ab"
>>
>> In the first iteration of the loop (i.e., *match == 0):
>> - c = 'a'
>> - u = 'a'
>> - *match gets incremented to 1.
>>
>> In the second iteration of the loop (i.e., *match == 1):
>> - c = 'b'
>> - u = <indeterminate value> (that is, undefined behavior),
>>   because (*match == len).
>>
>> This seems to be consistent with the error message "Caught 8-bit read
>> from uninitialized memory": namely, the array allocated for "name" in
>> efivarfs_callback() is indeed not pre-zeroed, and the ucs2_as_utf8()
>> function does not populate name[len] -- correctly, I would say.
> 
> ucs2_as_utf8 reports that it returns a NUL terminated string.

I don't think it does. Here's the comment:

/*
 * copy at most maxlength bytes of whole utf8 characters to dest from the
 * ucs2 string src.
 *
 * The return value is the number of characters copied, not including the
 * final NUL character.
 */

It doesn't seem to promise that the output will always be NUL-terminated. And, the code explicitly considers the case when there is no room for the final NUL.

A strictly NUL-terminated output might make for a better interface, but then the comment should be updated as well. Plus, I'm unsure if it would be aligned with Peter's original goal (and the current call sites). I'm not against changing the interface contract; I'll let Peter speak up.

> It didn't
> in this case.
> -Chris
>
Laszlo Ersek April 20, 2016, 1:25 p.m. UTC | #4
On 04/20/16 10:37, Chris Wilson wrote:
> If the caller, in this case efivarfs_callback(), only provides sufficent
> room for the expanded utf8 and not enough to include the terminating NUL
> byte, that NUL byte is skipped. When the caller then interprets it as a
> string, it may then read from past its allocated memory:
> 
> [  170.605647] WARNING: kmemcheck: Caught 8-bit read from uninitialized memory (ffff8804079ae786)
> [  170.605677] 436f6e4f757400004c44322d35363062663538612d316530642d346437652d39
> [  170.606037]  i i i i i i u u u u u u u u u u u u u u u u u u u u u u u u u u
> [  170.606236]              ^
> [  170.606243] RIP: 0010:[<ffffffff813a251f>]  [<ffffffff813a251f>] efivar_variable_is_removable+0xaf/0xf0
> [  170.606346] RSP: 0018:ffff880408e73c20  EFLAGS: 00010206
> [  170.606352] RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000006
> [  170.606359] RDX: 0000000000000000 RSI: 0000000000000074 RDI: ffff880408e73c30
> [  170.606365] RBP: ffff880408e73c80 R08: 0000000000000006 R09: 000000000000008c
> [  170.606371] R10: 0000000000000006 R11: 0000000000000000 R12: ffffffff8166ed20
> [  170.606378] R13: 11d293ca8be4df61 R14: ffffffff81773834 R15: ffff8804079ae780
> [  170.606385] FS:  0000000000000000(0000) GS:ffff88041ca00000(0000) knlGS:0000000000000000
> [  170.606392] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  170.606399] CR2: ffff880409cbe4c0 CR3: 00000004085fd000 CR4: 00000000001406f0
> [  170.606405]  [<ffffffff811eb938>] efivarfs_callback+0xf8/0x275
> [  170.606418]  [<ffffffff813a3368>] efivar_init+0x248/0x2e0
> [  170.606440]  [<ffffffff811eb6b4>] efivarfs_fill_super+0xb4/0xf0
> [  170.606452]  [<ffffffff811333e7>] mount_single+0x87/0xb0
> [  170.606463]  [<ffffffff811eb5f3>] efivarfs_mount+0x13/0x20
> [  170.606475]  [<ffffffff81133480>] mount_fs+0x10/0x90
> [  170.606497]  [<ffffffff8114c732>] vfs_kern_mount+0x62/0x100
> [  170.606508]  [<ffffffff8114ecb0>] do_mount+0x1e0/0xcd0
> [  170.606519]  [<ffffffff8114fa9f>] SyS_mount+0x8f/0xd0
> [  170.606530]  [<ffffffff81451d1f>] entry_SYSCALL_64_fastpath+0x17/0x93
> [  170.606542]  [<ffffffffffffffff>] 0xffffffffffffffff
> 
> Cc: Matt Fleming <matt@codeblueprint.co.uk>
> Cc: Jason Andryuk <jandryuk@gmail.com>
> Cc: Matthew Garrett <mjg59@coreos.com>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Peter Jones <pjones@redhat.com>
> Cc: linux-efi@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/firmware/efi/vars.c |  2 +-
>  lib/ucs2_string.c           | 15 ++++++++++-----
>  2 files changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
> index 0ac594c0a234..8dd503bac35d 100644
> --- a/drivers/firmware/efi/vars.c
> +++ b/drivers/firmware/efi/vars.c
> @@ -166,7 +166,7 @@ validate_ascii_string(efi_char16_t *var_name, int match, u8 *buffer,
>  
>  struct variable_validate {
>  	efi_guid_t vendor;
> -	char *name;
> +	const char *name;
>  	bool (*validate)(efi_char16_t *var_name, int match, u8 *data,
>  			 unsigned long len);
>  };
> diff --git a/lib/ucs2_string.c b/lib/ucs2_string.c
> index f0b323abb4c6..fb8d03966656 100644
> --- a/lib/ucs2_string.c
> +++ b/lib/ucs2_string.c
> @@ -85,29 +85,34 @@ ucs2_as_utf8(u8 *dest, const ucs2_char_t *src, unsigned long maxlength)
>  	unsigned long j = 0;
>  	unsigned long limit = ucs2_strnlen(src, maxlength);
>  
> -	for (i = 0; maxlength && i < limit; i++) {
> +	if (maxlength == 0)
> +		return 0;
> +
> +	for (i = 0; i < limit; i++) {
>  		u16 c = src[i];
>  
>  		if (c >= 0x800) {
> -			if (maxlength < 3)
> +			if (maxlength <= 3)
>  				break;
>  			maxlength -= 3;
>  			dest[j++] = 0xe0 | (c & 0xf000) >> 12;
>  			dest[j++] = 0x80 | (c & 0x0fc0) >> 6;
>  			dest[j++] = 0x80 | (c & 0x003f);
>  		} else if (c >= 0x80) {
> -			if (maxlength < 2)
> +			if (maxlength <= 2)
>  				break;
>  			maxlength -= 2;
>  			dest[j++] = 0xc0 | (c & 0x7c0) >> 6;
>  			dest[j++] = 0x80 | (c & 0x03f);
>  		} else {
> +			if (maxlength <= 1)
> +				break;
>  			maxlength -= 1;
>  			dest[j++] = c & 0x7f;
>  		}
>  	}
> -	if (maxlength)
> -		dest[j] = '\0';
> +	dest[j] = '\0';
> +
>  	return j;
>  }
>  EXPORT_SYMBOL(ucs2_as_utf8);
> 

I apologize for following up in "waves". I'd like us to consider how
this patch works together with the rest of the code.

IIUC, ucs2_utf8size() returns the number of non-NUL bytes that are
required to store the transcoded string. And, in "fs/efivarfs/super.c",
function efivarfs_callback(), we have:

	len = ucs2_utf8size(entry->var.VariableName);

	/* name, plus '-', plus GUID, plus NUL*/
	name = kmalloc(len + 1 + EFI_VARIABLE_GUID_LEN + 1, GFP_KERNEL);
	if (!name)
		goto fail;

	ucs2_as_utf8(name, entry->var.VariableName, len);

So, "len" does not include the room for the terminating NUL-byte here.
When "len" is passed to ucs2_as_utf8(), with the proposed patch applied,
a NUL byte will be produced in "name", but it will be at the price of a
genuine character from the input variable name.

Because, "len" has been computed for the exact storage need, and now the
equality in one of the permissive inequalities (introduced by this
patch) will terminate the transcoding early.

Assume that VariableName is (u16[]){ 'a', 'b', '\0' }. For this input,
ucs2_strlen() returns 2, and ucs2_utf8size() also returns 2. Right?

If so, then:
- len = 2
- "maxlen" at entry to ucs2_as_utf8() is also 2
- "limit" in ucs2_as_utf8() will be initialized to 2 as well

So we run the loop body twice. The first iteration will lower
"maxlength" to 1 (last branch). In the second iteration, (maxlength<=1)
will match, and instead of transferring the character 'b', we'll set
dest[1] to '\0'. In other words, we get the NUL-terminated string "a" as
output.

I think this breaks the subsequent comparisons against known variable names.

Am I wrong?

Thanks
Laszlo
Jani Nikula April 20, 2016, 2:03 p.m. UTC | #5
On Wed, 20 Apr 2016, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> If the caller, in this case efivarfs_callback(), only provides sufficent
> room for the expanded utf8 and not enough to include the terminating NUL
> byte, that NUL byte is skipped. When the caller then interprets it as a
> string, it may then read from past its allocated memory:
>
> [  170.605647] WARNING: kmemcheck: Caught 8-bit read from uninitialized memory (ffff8804079ae786)
> [  170.605677] 436f6e4f757400004c44322d35363062663538612d316530642d346437652d39
> [  170.606037]  i i i i i i u u u u u u u u u u u u u u u u u u u u u u u u u u
> [  170.606236]              ^
> [  170.606243] RIP: 0010:[<ffffffff813a251f>]  [<ffffffff813a251f>] efivar_variable_is_removable+0xaf/0xf0
> [  170.606346] RSP: 0018:ffff880408e73c20  EFLAGS: 00010206
> [  170.606352] RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000006
> [  170.606359] RDX: 0000000000000000 RSI: 0000000000000074 RDI: ffff880408e73c30
> [  170.606365] RBP: ffff880408e73c80 R08: 0000000000000006 R09: 000000000000008c
> [  170.606371] R10: 0000000000000006 R11: 0000000000000000 R12: ffffffff8166ed20
> [  170.606378] R13: 11d293ca8be4df61 R14: ffffffff81773834 R15: ffff8804079ae780
> [  170.606385] FS:  0000000000000000(0000) GS:ffff88041ca00000(0000) knlGS:0000000000000000
> [  170.606392] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  170.606399] CR2: ffff880409cbe4c0 CR3: 00000004085fd000 CR4: 00000000001406f0
> [  170.606405]  [<ffffffff811eb938>] efivarfs_callback+0xf8/0x275
> [  170.606418]  [<ffffffff813a3368>] efivar_init+0x248/0x2e0
> [  170.606440]  [<ffffffff811eb6b4>] efivarfs_fill_super+0xb4/0xf0
> [  170.606452]  [<ffffffff811333e7>] mount_single+0x87/0xb0
> [  170.606463]  [<ffffffff811eb5f3>] efivarfs_mount+0x13/0x20
> [  170.606475]  [<ffffffff81133480>] mount_fs+0x10/0x90
> [  170.606497]  [<ffffffff8114c732>] vfs_kern_mount+0x62/0x100
> [  170.606508]  [<ffffffff8114ecb0>] do_mount+0x1e0/0xcd0
> [  170.606519]  [<ffffffff8114fa9f>] SyS_mount+0x8f/0xd0
> [  170.606530]  [<ffffffff81451d1f>] entry_SYSCALL_64_fastpath+0x17/0x93
> [  170.606542]  [<ffffffffffffffff>] 0xffffffffffffffff

So I eyeballed the code a bit, and came across efivar_validate(), which
calls variable_matches(), which definitely can access off by one beyond
the passed in var_name on that call path.

However I can't match that up with your backtrace, so may be another
bug.


BR,
Jani.
Peter Jones April 21, 2016, 3:13 p.m. UTC | #6
On Thu, Apr 21, 2016 at 01:18:27PM +0100, Matt Fleming wrote:
> ( Good Lord, I hate doing string manipulation in C )

(yep)

> 
> On Wed, 20 Apr, at 03:25:32PM, Laszlo Ersek wrote:
> > 
> > So, "len" does not include the room for the terminating NUL-byte here.
> > When "len" is passed to ucs2_as_utf8(), with the proposed patch applied,
> > a NUL byte will be produced in "name", but it will be at the price of a
> > genuine character from the input variable name.
> 
> Right, and this is a problem because we're trying to keep the names
> consistent between efivarfs and the EFI variable data. Force
> NUL-terminating the string is wrong, because if you have no room for
> the NUL the caller should check for that. Sadly none do.
> 
> On the flip-side, passing around non-NUL terminated strings is just
> begging for these kinds of issues to come up.
> 
> The fact is that the callers of ucs2_as_utf8() are passing it the
> wrong 'len' argument. We want a NUL-terminated utf8 string and we're
> passing a NUL-terminated ucs2 string. We should tell ucs2_as_utf8() it
> has enough room to copy the NUL.
> 
> Wouldn't this work (minus the return value checking)?

I agree with your analysis, and your patch looks plausible.
Laszlo Ersek April 21, 2016, 4:21 p.m. UTC | #7
On 04/21/16 14:18, Matt Fleming wrote:
> ( Good Lord, I hate doing string manipulation in C )
> 
> On Wed, 20 Apr, at 03:25:32PM, Laszlo Ersek wrote:
>>
>> So, "len" does not include the room for the terminating NUL-byte here.
>> When "len" is passed to ucs2_as_utf8(), with the proposed patch applied,
>> a NUL byte will be produced in "name", but it will be at the price of a
>> genuine character from the input variable name.
> 
> Right, and this is a problem because we're trying to keep the names
> consistent between efivarfs and the EFI variable data. Force
> NUL-terminating the string is wrong, because if you have no room for
> the NUL the caller should check for that. Sadly none do.

I don't think it is necessary to check for the return value if on input the caller can guarantee (from ucs2_utf8size()) that the output buffer will be large enough.

> On the flip-side, passing around non-NUL terminated strings is just
> begging for these kinds of issues to come up.

"Depends", I guess :) NUL-termination and (ptr, length) both work, but whatever is chosen should be used consistently.

In my opinion, the pattern the code tries to follow is (ptr, length), and the direct issue is only that variable_matches() performs an out-of-bounds access.

We can change the pattern to NUL-termination, but then it should be consistently used, and I think variable_matches() should be adapted just the same. Why pass in the length to examine if the string is NUL-terminated, guaranteed?

> The fact is that the callers of ucs2_as_utf8() are passing it the
> wrong 'len' argument. We want a NUL-terminated utf8 string

Not necessarily. For example, efivarfs_callback() is constructing "name" in several steps, and ucs2_as_utf8() is just the first step. If it produced a '\0' at name[len], it would be overwritten soon after (with a hyphen character).

Also, the efivar_variable_is_removable() function takes "len". I think the interfaces are consistent, it's just the variable_matches() function that is buggy.

Again, I don't oppose switching the pattern to NUL-termination generally, but variable_matches() will have to be updated anyway. I think that's the first step, and the interfaces can be switched over only after.

> and we're
> passing a NUL-terminated ucs2 string. We should tell ucs2_as_utf8() it
> has enough room to copy the NUL.
> 
> Wouldn't this work (minus the return value checking)?
> 
> ---
> 
> diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
> index 0ac594c0a234..13a837c70e90 100644
> --- a/drivers/firmware/efi/vars.c
> +++ b/drivers/firmware/efi/vars.c
> @@ -235,13 +235,12 @@ efivar_validate(efi_guid_t vendor, efi_char16_t *var_name, u8 *data,
>  	unsigned long utf8_size;
>  	u8 *utf8_name;
>  
> -	utf8_size = ucs2_utf8size(var_name);
> -	utf8_name = kmalloc(utf8_size + 1, GFP_KERNEL);
> +	utf8_size = ucs2_utf8size(var_name) + 1;
> +	utf8_name = kmalloc(utf8_size, GFP_KERNEL);
>  	if (!utf8_name)
>  		return false;
>  
>  	ucs2_as_utf8(utf8_name, var_name, utf8_size);
> -	utf8_name[utf8_size] = '\0';
>  
>  	for (i = 0; variable_validate[i].name[0] != '\0'; i++) {
>  		const char *name = variable_validate[i].name;
> @@ -250,7 +249,7 @@ efivar_validate(efi_guid_t vendor, efi_char16_t *var_name, u8 *data,
>  		if (efi_guidcmp(vendor, variable_validate[i].vendor))
>  			continue;
>  
> -		if (variable_matches(utf8_name, utf8_size+1, name, &match)) {
> +		if (variable_matches(utf8_name, utf8_size, name, &match)) {
>  			if (variable_validate[i].validate == NULL)
>  				break;
>  			kfree(utf8_name);

As I said, the thing I dislike about this is: why pass "utf8_size" to variable_matches() at all, if "utf8_name" is guaranteed to be NUL-terminated?

Otherwise, I think this patch does get the job done, yes. (I also checked the ucs2_as_utf8() call in efivar_create_sysfs_entry(), "drivers/firmware/efi/efivars.c", but there the buffer already has enough room for '\0'.)

... How about this instead?

> From 1684f4398b8498af135fb3e07f83614ef0423265 Mon Sep 17 00:00:00 2001
> From: Laszlo Ersek <lersek@redhat.com>
> Date: Thu, 21 Apr 2016 18:08:31 +0200
> Subject: [PATCH] efi: fix out-of-bounds read in variable_matches()
>
> The variable_matches() function can currently read "var_name[len]", for
> example when:
> - var_name[0] == 'a',
> - len == 1
> - match_name points to the NUL-terminated string "ab".
>
> This function is supposed to accept "var_name" inputs that are not
> NUL-terminated (hence the "len" parameter"). Document the function, and
> access "var_name[*match]" only if "*match" is smaller than "len".
>
> Ref: http://thread.gmane.org/gmane.comp.freedesktop.xorg.drivers.intel/86906
> Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  drivers/firmware/efi/vars.c | 37 ++++++++++++++++++++++++++-----------
>  1 file changed, 26 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
> index 0ac594c0a234..34b741940494 100644
> --- a/drivers/firmware/efi/vars.c
> +++ b/drivers/firmware/efi/vars.c
> @@ -202,29 +202,44 @@ static const struct variable_validate variable_validate[] = {
>  	{ NULL_GUID, "", NULL },
>  };
>
> +/*
> + * Check if @var_name matches the pattern given in @match_name.
> + *
> + * @var_name: an array of @len non-NUL characters.
> + * @match_name: a NUL-terminated pattern string, optionally ending in "*". A
> + *              final "*" character matches any trailing characters @var_name,
> + *              including the case when there are none left in @var_name.
> + * @match: on output, the number of non-wildcard characters in @match_name
> + *         that @var_name matches, regardless of the return value.
> + * @return: whether @var_name fully matches @match_name.
> + */
>  static bool
>  variable_matches(const char *var_name, size_t len, const char *match_name,
>  		 int *match)
>  {
>  	for (*match = 0; ; (*match)++) {
>  		char c = match_name[*match];
> -		char u = var_name[*match];
>
> -		/* Wildcard in the matching name means we've matched */
> -		if (c == '*')
> +		switch (c) {
> +		case '*':
> +			/* Wildcard in @match_name means we've matched. */
>  			return true;
>
> -		/* Case sensitive match */
> -		if (!c && *match == len)
> -			return true;
> +		case '\0':
> +			/* @match_name has ended. Has @var_name too? */
> +			return (*match == len);
>
> -		if (c != u)
> +		default:
> +			/*
> +			 * We've reached a non-wildcard char in @match_name.
> +			 * Continue only if there's an identical character in
> +			 * @var_name.
> +			 */
> +			if (*match < len && c == var_name[*match])
> +				continue;
>  			return false;
> -
> -		if (!c)
> -			return true;
> +		}
>  	}
> -	return true;
>  }
>
>  bool
> --
> 1.8.3.1
>

Thanks
Laszlo

> diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c
> index dd029d13ea61..be5a02721b41 100644
> --- a/fs/efivarfs/super.c
> +++ b/fs/efivarfs/super.c
> @@ -136,7 +136,7 @@ static int efivarfs_callback(efi_char16_t *name16, efi_guid_t vendor,
>  	if (!name)
>  		goto fail;
>  
> -	ucs2_as_utf8(name, entry->var.VariableName, len);
> +	ucs2_as_utf8(name, entry->var.VariableName, len + 1);
>  
>  	if (efivar_variable_is_removable(entry->var.VendorGuid, name, len))
>  		is_removable = true;
>
Dave Gordon April 22, 2016, 11:27 a.m. UTC | #8
On 21/04/16 16:13, Peter Jones wrote:
> On Thu, Apr 21, 2016 at 01:18:27PM +0100, Matt Fleming wrote:
>> ( Good Lord, I hate doing string manipulation in C )
>
> (yep)
>
>>
>> On Wed, 20 Apr, at 03:25:32PM, Laszlo Ersek wrote:
>>>
>>> So, "len" does not include the room for the terminating NUL-byte here.
>>> When "len" is passed to ucs2_as_utf8(), with the proposed patch applied,
>>> a NUL byte will be produced in "name", but it will be at the price of a
>>> genuine character from the input variable name.
>>
>> Right, and this is a problem because we're trying to keep the names
>> consistent between efivarfs and the EFI variable data. Force
>> NUL-terminating the string is wrong, because if you have no room for
>> the NUL the caller should check for that. Sadly none do.
>>
>> On the flip-side, passing around non-NUL terminated strings is just
>> begging for these kinds of issues to come up.
>>
>> The fact is that the callers of ucs2_as_utf8() are passing it the
>> wrong 'len' argument. We want a NUL-terminated utf8 string and we're
>> passing a NUL-terminated ucs2 string. We should tell ucs2_as_utf8() it
>> has enough room to copy the NUL.
>>
>> Wouldn't this work (minus the return value checking)?
>
> I agree with your analysis, and your patch looks plausible.

This isn't really any different from the plain old ASCII case, where a 
function might call strlen(), allocate a buffer, and strncpy() the 
non-NUL bytes into it.

It would of course then be wrong to pass that to any of the standard 
string functions that operate on NUL-terminated strings.

OTOH if the code knows that the next thing will be an append @buf[len], 
then the temporary lack of a NUL isn't an issue.

Personally I'd be inclined to keep my strings NUL-terminated at all 
times, just as a guard against forgetting that I should only use 
functions that take an explicit length.

The pattern:

	l1 = strlen(s1);
	l2 = strlen(s2);
	buf = malloc(l1+l2+1);
	strcpy(buf, s1);
	strcpy(buf+l1, s2);

is quite common, quite safe, and not likely to confuse anyone. The extra 
overwrites of the temporary NULs are a small price for those benefits.

.Dave.
Matt Fleming April 22, 2016, 6:52 p.m. UTC | #9
On Thu, 21 Apr, at 06:21:11PM, Laszlo Ersek wrote:
> 
> ... How about this instead?

Your patch looks fine to me. I've gone ahead and stuck it in the
urgent EFI queue.

Thanks everyone!
Laszlo Ersek April 25, 2016, 10:17 a.m. UTC | #10
On 04/22/16 20:52, Matt Fleming wrote:
> On Thu, 21 Apr, at 06:21:11PM, Laszlo Ersek wrote:
>>
>> ... How about this instead?
> 
> Your patch looks fine to me. I've gone ahead and stuck it in the
> urgent EFI queue.

I intended to probe for opinions first, and then (if appropriate) submit
the patch stand-alone second, but apparently on linux-efi it's Workflow
deLuxe for contributors! ;)

Thank you, Matt!
Laszlo

> Thanks everyone!
>
diff mbox

Patch

diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
index 0ac594c0a234..8dd503bac35d 100644
--- a/drivers/firmware/efi/vars.c
+++ b/drivers/firmware/efi/vars.c
@@ -166,7 +166,7 @@  validate_ascii_string(efi_char16_t *var_name, int match, u8 *buffer,
 
 struct variable_validate {
 	efi_guid_t vendor;
-	char *name;
+	const char *name;
 	bool (*validate)(efi_char16_t *var_name, int match, u8 *data,
 			 unsigned long len);
 };
diff --git a/lib/ucs2_string.c b/lib/ucs2_string.c
index f0b323abb4c6..fb8d03966656 100644
--- a/lib/ucs2_string.c
+++ b/lib/ucs2_string.c
@@ -85,29 +85,34 @@  ucs2_as_utf8(u8 *dest, const ucs2_char_t *src, unsigned long maxlength)
 	unsigned long j = 0;
 	unsigned long limit = ucs2_strnlen(src, maxlength);
 
-	for (i = 0; maxlength && i < limit; i++) {
+	if (maxlength == 0)
+		return 0;
+
+	for (i = 0; i < limit; i++) {
 		u16 c = src[i];
 
 		if (c >= 0x800) {
-			if (maxlength < 3)
+			if (maxlength <= 3)
 				break;
 			maxlength -= 3;
 			dest[j++] = 0xe0 | (c & 0xf000) >> 12;
 			dest[j++] = 0x80 | (c & 0x0fc0) >> 6;
 			dest[j++] = 0x80 | (c & 0x003f);
 		} else if (c >= 0x80) {
-			if (maxlength < 2)
+			if (maxlength <= 2)
 				break;
 			maxlength -= 2;
 			dest[j++] = 0xc0 | (c & 0x7c0) >> 6;
 			dest[j++] = 0x80 | (c & 0x03f);
 		} else {
+			if (maxlength <= 1)
+				break;
 			maxlength -= 1;
 			dest[j++] = c & 0x7f;
 		}
 	}
-	if (maxlength)
-		dest[j] = '\0';
+	dest[j] = '\0';
+
 	return j;
 }
 EXPORT_SYMBOL(ucs2_as_utf8);