mbox series

[RFC,v1,0/4] static analysis of copy_to_user()

Message ID 20181220195931.20331-1-tycho@tycho.ws (mailing list archive)
Headers show
Series static analysis of copy_to_user() | expand

Message

Tycho Andersen Dec. 20, 2018, 7:59 p.m. UTC
Hi all,

A while ago I talked with various people about whether some static
analsys of copy_to_user() could be productive in finding infoleaks.
Unfortunately, due to the various issues outlined in the patch notes, it
doesn't seem like it is. Perhaps these checks are useful to put in just
to future proof ourselves against these sorts of issues, though.

Anyway, here's the code. Thoughts welcome!

Tycho

Tycho Andersen (4):
  expression.h: update comment to include other cast types
  move name-based analysis before linearization
  add a check for copy_to_user() address spaces
  check copy_to_user() sizes

 expression.h                           |   2 +-
 sparse.c                               | 327 ++++++++++++++++++++++---
 validation/copy_to_user.c              |  31 +++
 validation/copy_to_user_sizes.c        |  53 ++++
 validation/copy_to_user_sizes_inline.c |  29 +++
 5 files changed, 405 insertions(+), 37 deletions(-)
 create mode 100644 validation/copy_to_user.c
 create mode 100644 validation/copy_to_user_sizes.c
 create mode 100644 validation/copy_to_user_sizes_inline.c

Comments

Luc Van Oostenryck Dec. 21, 2018, 10:47 p.m. UTC | #1
On Thu, Dec 20, 2018 at 12:59:27PM -0700, Tycho Andersen wrote:
> Hi all,
> 
> A while ago I talked with various people about whether some static
> analsys of copy_to_user() could be productive in finding infoleaks.
> Unfortunately, due to the various issues outlined in the patch notes, it
> doesn't seem like it is. Perhaps these checks are useful to put in just
> to future proof ourselves against these sorts of issues, though.
> 
> Anyway, here's the code. Thoughts welcome!

Hi,

I'm taking the first patch directly but I won't be able to look
closer at the other patches until next week.

Best regards,
-- Luc
Tycho Andersen Jan. 20, 2019, 7:05 p.m. UTC | #2
Hey Luc,

On Fri, Dec 21, 2018 at 11:47:19PM +0100, Luc Van Oostenryck wrote:
> On Thu, Dec 20, 2018 at 12:59:27PM -0700, Tycho Andersen wrote:
> > Hi all,
> > 
> > A while ago I talked with various people about whether some static
> > analsys of copy_to_user() could be productive in finding infoleaks.
> > Unfortunately, due to the various issues outlined in the patch notes, it
> > doesn't seem like it is. Perhaps these checks are useful to put in just
> > to future proof ourselves against these sorts of issues, though.
> > 
> > Anyway, here's the code. Thoughts welcome!
> 
> Hi,
> 
> I'm taking the first patch directly but I won't be able to look
> closer at the other patches until next week.

Any chance you can take a peek at these?

Cheers,

Tycho
Luc Van Oostenryck Jan. 21, 2019, 9:41 p.m. UTC | #3
On Mon, Jan 21, 2019 at 08:05:10AM +1300, Tycho Andersen wrote:
> Hey Luc,
> 
> On Fri, Dec 21, 2018 at 11:47:19PM +0100, Luc Van Oostenryck wrote:
> > On Thu, Dec 20, 2018 at 12:59:27PM -0700, Tycho Andersen wrote:
> > > Hi all,
> > > 
> > > A while ago I talked with various people about whether some static
> > > analsys of copy_to_user() could be productive in finding infoleaks.
> > > Unfortunately, due to the various issues outlined in the patch notes, it
> > > doesn't seem like it is. Perhaps these checks are useful to put in just
> > > to future proof ourselves against these sorts of issues, though.
> > > 
> > > Anyway, here's the code. Thoughts welcome!
> > 
> > Hi,
> > 
> > I'm taking the first patch directly but I won't be able to look
> > closer at the other patches until next week.
> 
> Any chance you can take a peek at these?

Hi,

Sorry, I've had few available time the last weeks.
I had look at them shortly after you send them but
I haven't yet made my mind about them.

I'm quite reluctant to add complexity (the AST walking)
if it doesn't bring much benefit if any.

In, short the problems are:
1) duplication of the AST walking
2) unreliable type because of using void *
3) unreliable size because array to pointer degeneracy

There is some solutions, though:
1) what *could* be done is to add a method 'check'
   to struct symbol_op and call it, *for example*,
   just after op->expand() in expand_symbol_call()
   (and add a mechanism to set this method for the symbol
   corresponding to copy_to_user()).

   Otherwise, splitting the AST walking from sparse.c
   and making it something generic would be preferable.

   Another approach could be keep the check via OP_CALL
   but doing it just after linearization, before the
   optimization destroy the types (and add, if needed,
   some flag to force linearize_cast() keep absolutely
   all type info).

2) this one seems pretty hopeless

3) the current calls degenerate()/create_pointer()
   do indeed destroy the original type and (at first sight)
   no 'addressof' should exist anymore after evaluation.
   This is inconsistent with the existence of expand_addressof().
   By changing degenerate()/create_pointer() the original
   type should stay available.

Sorry again for the late reply,
-- Luc
Tycho Andersen Jan. 24, 2019, 3:15 a.m. UTC | #4
Hi Luc,

On Mon, Jan 21, 2019 at 10:41:28PM +0100, Luc Van Oostenryck wrote:
> On Mon, Jan 21, 2019 at 08:05:10AM +1300, Tycho Andersen wrote:
> > Hey Luc,
> > 
> > On Fri, Dec 21, 2018 at 11:47:19PM +0100, Luc Van Oostenryck wrote:
> > > On Thu, Dec 20, 2018 at 12:59:27PM -0700, Tycho Andersen wrote:
> > > > Hi all,
> > > > 
> > > > A while ago I talked with various people about whether some static
> > > > analsys of copy_to_user() could be productive in finding infoleaks.
> > > > Unfortunately, due to the various issues outlined in the patch notes, it
> > > > doesn't seem like it is. Perhaps these checks are useful to put in just
> > > > to future proof ourselves against these sorts of issues, though.
> > > > 
> > > > Anyway, here's the code. Thoughts welcome!
> > > 
> > > Hi,
> > > 
> > > I'm taking the first patch directly but I won't be able to look
> > > closer at the other patches until next week.
> > 
> > Any chance you can take a peek at these?
> 
> Hi,
> 
> Sorry, I've had few available time the last weeks.
> I had look at them shortly after you send them but
> I haven't yet made my mind about them.
> 
> I'm quite reluctant to add complexity (the AST walking)
> if it doesn't bring much benefit if any.

No problem :).

> In, short the problems are:
> 1) duplication of the AST walking
> 2) unreliable type because of using void *
> 3) unreliable size because array to pointer degeneracy
> 
> There is some solutions, though:
> 1) what *could* be done is to add a method 'check'
>    to struct symbol_op and call it, *for example*,
>    just after op->expand() in expand_symbol_call()
>    (and add a mechanism to set this method for the symbol
>    corresponding to copy_to_user()).
> 
>    Otherwise, splitting the AST walking from sparse.c
>    and making it something generic would be preferable.

Yeah, this sounds like a good option to me.

>    Another approach could be keep the check via OP_CALL
>    but doing it just after linearization, before the
>    optimization destroy the types (and add, if needed,
>    some flag to force linearize_cast() keep absolutely
>    all type info).
> 
> 2) this one seems pretty hopeless

I was hoping you might have some brilliant insight here. It seems like
these checks could catch real bugs at some point, so I'll give the
changes you've suggested a go over the next couple of weeks and see
about a v2.

> 3) the current calls degenerate()/create_pointer()
>    do indeed destroy the original type and (at first sight)
>    no 'addressof' should exist anymore after evaluation.
>    This is inconsistent with the existence of expand_addressof().
>    By changing degenerate()/create_pointer() the original
>    type should stay available.

Ah ha, thanks. I guess it's not necessary to change create_pointer()
for this, but degenerate() definitely looks important.

Thanks!

Tycho