diff mbox series

[ethtool,v2,07/13] ethtool: avoid null pointer dereference

Message ID 20221208011122.2343363-8-jesse.brandeburg@intel.com (mailing list archive)
State Changes Requested
Delegated to: Michal Kubecek
Headers show
Series ethtool: clean up and fix | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Jesse Brandeburg Dec. 8, 2022, 1:11 a.m. UTC
'$ scan-build make' reports:

Description: Array access (from variable 'arg') results in a null
pointer dereference
File: /git/ethtool/netlink/parser.c
Line: 782

Description: Dereference of null pointer (loaded from variable 'p')
File: /git/ethtool/netlink/parser.c
Line: 794

Both of these bugs are prevented by checking the input in
nl_parse_char_bitset(), which is called from nl_sset() via the kernel
callback, specifically for the parsing of the wake-on-lan options (-s
wol). None of the other functions in this file seem to have the issue of
deferencing data without checking for validity first. This could
"technically" allow nlctxt->argp to be NULL, and scan-build is limited
in it's ability to parse for bugs only at file scope in this case.
This particular bug should be unlikely to happen because the kernel
builds/parses the netlink structure before handing it to the
application. However in the interests of being able to run this
scan-build tool regularly, I'm still sending the initial version of this
patch as I tried several other ways to fix the bug with an earlier check
for NULL in nl_sset, but it won't prevent the scan-build error due to
the file scope problem.

CC: Michal Kubecek <mkubecek@suse.cz>
Fixes: 81a30f416ec7 ("netlink: add bitset command line parser handlers")
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
---
v2: updated commit message with more nuance after feedback from ethtool
maintainer. I'd be open to fixing this a different way but this seemed
the most straight-forward with the smallest amount of code changed.
v1: original version
---
 netlink/parser.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Michal Kubecek Dec. 8, 2022, 6:23 a.m. UTC | #1
On Wed, Dec 07, 2022 at 05:11:16PM -0800, Jesse Brandeburg wrote:
> '$ scan-build make' reports:
> 
> Description: Array access (from variable 'arg') results in a null
> pointer dereference
> File: /git/ethtool/netlink/parser.c
> Line: 782
> 
> Description: Dereference of null pointer (loaded from variable 'p')
> File: /git/ethtool/netlink/parser.c
> Line: 794
> 
> Both of these bugs are prevented by checking the input in
> nl_parse_char_bitset(), which is called from nl_sset() via the kernel
> callback, specifically for the parsing of the wake-on-lan options (-s
> wol). None of the other functions in this file seem to have the issue of
> deferencing data without checking for validity first. This could
> "technically" allow nlctxt->argp to be NULL, and scan-build is limited
> in it's ability to parse for bugs only at file scope in this case.
> This particular bug should be unlikely to happen because the kernel
> builds/parses the netlink structure before handing it to the

Again: this has nothing to do with netlink, this is command line parser,
nlctx->argp is a member of argv[] array. And as execve() (which is the
only syscall in the exec* family, the rest are wrappers) does not pass
argc, only argv[], argc is actually determined by kernel so for it to be
actually null, you would need a serious bug in kernel first.

Even if we want to be safe against buggy kernel passing garbage as
command line arguments, I still believe we should do that earlier, in
the general code, not deep in a specific helper function. Also, you only
check for null but that does not catch an invalid pointer in argv[]
which, unlike a null pointer, could do an actual harm. And I don't see
how that could be checked, I'm afraid we have to trust kernel.

Michal

> application. However in the interests of being able to run this
> scan-build tool regularly, I'm still sending the initial version of this
> patch as I tried several other ways to fix the bug with an earlier check
> for NULL in nl_sset, but it won't prevent the scan-build error due to
> the file scope problem.
> 
> CC: Michal Kubecek <mkubecek@suse.cz>
> Fixes: 81a30f416ec7 ("netlink: add bitset command line parser handlers")
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> ---
> v2: updated commit message with more nuance after feedback from ethtool
> maintainer. I'd be open to fixing this a different way but this seemed
> the most straight-forward with the smallest amount of code changed.
> v1: original version
> ---
>  netlink/parser.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/netlink/parser.c b/netlink/parser.c
> index 70f451008eb4..c573a9598a9f 100644
> --- a/netlink/parser.c
> +++ b/netlink/parser.c
> @@ -874,7 +874,7 @@ int nl_parse_bitset(struct nl_context *nlctx, uint16_t type, const void *data,
>   * optionally followed by '/' and another numeric value (mask, unless no_mask
>   * is set), or a string consisting of characters corresponding to bit indices.
>   * The @data parameter points to struct char_bitset_parser_data. Generates
> - * biset nested attribute. Fails if type is zero or if @dest is not null.
> + * bitset nested attribute. Fails if type is zero or if @dest is not null.
>   */
>  int nl_parse_char_bitset(struct nl_context *nlctx, uint16_t type,
>  			 const void *data, struct nl_msg_buff *msgbuff,
> @@ -882,7 +882,7 @@ int nl_parse_char_bitset(struct nl_context *nlctx, uint16_t type,
>  {
>  	const struct char_bitset_parser_data *parser_data = data;
>  
> -	if (!type || dest) {
> +	if (!type || dest || !*nlctx->argp) {
>  		fprintf(stderr, "ethtool (%s): internal error parsing '%s'\n",
>  			nlctx->cmd, nlctx->param);
>  		return -EFAULT;
> -- 
> 2.31.1
>
Jesse Brandeburg Dec. 9, 2022, 5:36 p.m. UTC | #2
On 12/7/2022 10:23 PM, Michal Kubecek wrote:
> On Wed, Dec 07, 2022 at 05:11:16PM -0800, Jesse Brandeburg wrote:
>> '$ scan-build make' reports:
>>
>> Description: Array access (from variable 'arg') results in a null
>> pointer dereference
>> File: /git/ethtool/netlink/parser.c
>> Line: 782
>>
>> Description: Dereference of null pointer (loaded from variable 'p')
>> File: /git/ethtool/netlink/parser.c
>> Line: 794
>>
>> Both of these bugs are prevented by checking the input in
>> nl_parse_char_bitset(), which is called from nl_sset() via the kernel
>> callback, specifically for the parsing of the wake-on-lan options (-s
>> wol). None of the other functions in this file seem to have the issue of
>> deferencing data without checking for validity first. This could
>> "technically" allow nlctxt->argp to be NULL, and scan-build is limited
>> in it's ability to parse for bugs only at file scope in this case.
>> This particular bug should be unlikely to happen because the kernel
>> builds/parses the netlink structure before handing it to the
> 
> Again: this has nothing to do with netlink, this is command line parser,
> nlctx->argp is a member of argv[] array. And as execve() (which is the
> only syscall in the exec* family, the rest are wrappers) does not pass
> argc, only argv[], argc is actually determined by kernel so for it to be
> actually null, you would need a serious bug in kernel first.

Thank you for explaining! I can drop this patch, but it's disappointing 
that one fairly cheap conditional will prevent us from being able to 
cleanly run scan-build. If you have any other suggestions please let me 
know (and see below)

I spent some time today trying to get the command line to pass a NULL 
value but I couldn't do it, and elsewhere in the code the checks for 
argc prevent the NULL value or no value from getting into the ethtool 
code parsing the commands, so in this case it's not really a false 
positive, but taken care of by other code that isn't observable to the 
scan-build virtual machine. The good news is I don't see a real issue here.

> Even if we want to be safe against buggy kernel passing garbage as
> command line arguments, I still believe we should do that earlier, in
> the general code, not deep in a specific helper function. Also, you only
> check for null but that does not catch an invalid pointer in argv[]
> which, unlike a null pointer, could do an actual harm. And I don't see
> how that could be checked, I'm afraid we have to trust kernel.

OK, let's trust the kernel, but can we still fix this issue in order to 
be able to add scan-build to a list of tools to run cleanly in automation?

some TL;DR details in case there is someone else that has a suggestion!

Here is the callchain, for reference:
This is from the command
# ethtool -s eth0 wol pumbag

#0  nl_parse_char_bitset
#1  in nl_parser at netlink/parser.c:1099
#2  in nl_sset at netlink/settings.c:1247
#3  in netlink_run_handler at netlink/netlink.c:493
#4  in main at ethtool.c:6425

and in the #0 frame above, *nlctx->argp = "pumbag"
in the callchain above, scan-build doesn't like us de-referencing argp 
because it doesn't have proof it's not null.

Further I tried putting the check in every element of the stack frame 
above and they all fail the scan-build check still, probably because the 
pointer is advanced to the "pumbag" argument later in the code.

Anyway, I'm still working on the v3 of the series.
Michal Kubecek Dec. 9, 2022, 6:06 p.m. UTC | #3
On Fri, Dec 09, 2022 at 09:36:54AM -0800, Jesse Brandeburg wrote:
> Here is the callchain, for reference:
> This is from the command
> # ethtool -s eth0 wol pumbag
> 
> #0  nl_parse_char_bitset
> #1  in nl_parser at netlink/parser.c:1099
> #2  in nl_sset at netlink/settings.c:1247
> #3  in netlink_run_handler at netlink/netlink.c:493
> #4  in main at ethtool.c:6425
> 
> and in the #0 frame above, *nlctx->argp = "pumbag"
> in the callchain above, scan-build doesn't like us de-referencing argp
> because it doesn't have proof it's not null.
> 
> Further I tried putting the check in every element of the stack frame above
> and they all fail the scan-build check still, probably because the pointer
> is advanced to the "pumbag" argument later in the code.

This should be guaranteed by min_argc beeing set to 1 in the relevant
member of sset_params[] array in netlink/setings.c. The dispatcher code
in nl_parser() checks that there are at least ->min_argc parameters left
before calling specific ->handler(). So as long as argc and argv[] are
consistent, we should be safe.

But I suppose we cannot expect the static checker to be smart enough to
see through that logic.

Michal
diff mbox series

Patch

diff --git a/netlink/parser.c b/netlink/parser.c
index 70f451008eb4..c573a9598a9f 100644
--- a/netlink/parser.c
+++ b/netlink/parser.c
@@ -874,7 +874,7 @@  int nl_parse_bitset(struct nl_context *nlctx, uint16_t type, const void *data,
  * optionally followed by '/' and another numeric value (mask, unless no_mask
  * is set), or a string consisting of characters corresponding to bit indices.
  * The @data parameter points to struct char_bitset_parser_data. Generates
- * biset nested attribute. Fails if type is zero or if @dest is not null.
+ * bitset nested attribute. Fails if type is zero or if @dest is not null.
  */
 int nl_parse_char_bitset(struct nl_context *nlctx, uint16_t type,
 			 const void *data, struct nl_msg_buff *msgbuff,
@@ -882,7 +882,7 @@  int nl_parse_char_bitset(struct nl_context *nlctx, uint16_t type,
 {
 	const struct char_bitset_parser_data *parser_data = data;
 
-	if (!type || dest) {
+	if (!type || dest || !*nlctx->argp) {
 		fprintf(stderr, "ethtool (%s): internal error parsing '%s'\n",
 			nlctx->cmd, nlctx->param);
 		return -EFAULT;