Message ID | 20151120001043.GA28204@www.outflux.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, 19 Nov 2015 16:10:43 -0800 Kees Cook <keescook@chromium.org> wrote: > Normally, when a user can modify a file that has setuid or setgid bits, > those bits are cleared when they are not the file owner or a member of the > group. This is enforced when using write() directly but not when writing > to a shared mmap on the file. This could allow the file writer to gain > privileges by changing the binary without losing the setuid/setgid bits. > > Signed-off-by: Kees Cook <keescook@chromium.org> > Cc: stable@vger.kernel.org > --- > mm/memory.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/memory.c b/mm/memory.c > index deb679c31f2a..4c970a4e0057 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -2036,6 +2036,7 @@ static inline int wp_page_reuse(struct mm_struct *mm, > > if (!page_mkwrite) > file_update_time(vma->vm_file); > + file_remove_privs(vma->vm_file); > } > > return VM_FAULT_WRITE; file_remove_privs() is depressingly heavyweight. You'd think there was some more lightweight way of caching the fact that we've already done this. Dumb question: can we run file_remove_privs() once, when the file is opened writably, rather than for each and every write into each page? Also, the proposed patch drops the file_remove_privs() return value on the floor and we just go ahead with the modification. How come? -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 19, 2015 at 4:41 PM, Andrew Morton <akpm@linux-foundation.org> wrote: > On Thu, 19 Nov 2015 16:10:43 -0800 Kees Cook <keescook@chromium.org> wrote: > >> Normally, when a user can modify a file that has setuid or setgid bits, >> those bits are cleared when they are not the file owner or a member of the >> group. This is enforced when using write() directly but not when writing >> to a shared mmap on the file. This could allow the file writer to gain >> privileges by changing the binary without losing the setuid/setgid bits. >> >> Signed-off-by: Kees Cook <keescook@chromium.org> >> Cc: stable@vger.kernel.org >> --- >> mm/memory.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/mm/memory.c b/mm/memory.c >> index deb679c31f2a..4c970a4e0057 100644 >> --- a/mm/memory.c >> +++ b/mm/memory.c >> @@ -2036,6 +2036,7 @@ static inline int wp_page_reuse(struct mm_struct *mm, >> >> if (!page_mkwrite) >> file_update_time(vma->vm_file); >> + file_remove_privs(vma->vm_file); >> } >> >> return VM_FAULT_WRITE; > > file_remove_privs() is depressingly heavyweight. You'd think there was > some more lightweight way of caching the fact that we've already done > this. In theory, the IS_NOSEC(inode) should be fast. Perhaps track it in the vma or file struct? > Dumb question: can we run file_remove_privs() once, when the file is > opened writably, rather than for each and every write into each page? This got discussed briefly, but I can't remember why it got shot down. > Also, the proposed patch drops the file_remove_privs() return value on > the floor and we just go ahead with the modification. How come? Oh, excellent catch. If it can't drop it, it shouldn't be writable. I'm not sure what the right abort scenario is in wp_page_reuse. Maybe move this to start of wp_page_shared instead? -Kees
Hi Kees, On Thu, Nov 19, 2015 at 04:10:43PM -0800, Kees Cook wrote: > Normally, when a user can modify a file that has setuid or setgid bits, > those bits are cleared when they are not the file owner or a member of the > group. This is enforced when using write() directly but not when writing > to a shared mmap on the file. This could allow the file writer to gain > privileges by changing the binary without losing the setuid/setgid bits. > > Signed-off-by: Kees Cook <keescook@chromium.org> > Cc: stable@vger.kernel.org > --- > mm/memory.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/memory.c b/mm/memory.c > index deb679c31f2a..4c970a4e0057 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -2036,6 +2036,7 @@ static inline int wp_page_reuse(struct mm_struct *mm, > > if (!page_mkwrite) > file_update_time(vma->vm_file); > + file_remove_privs(vma->vm_file); I thought you said in one of the early mails of this thread that it didn't work. Or maybe I misunderstood. Also, don't you think we should move that into the if (!page_mkwrite) just like for the time update ? Willy -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Nov 20, 2015 at 02:00:16AM +0100, Willy Tarreau wrote: > > diff --git a/mm/memory.c b/mm/memory.c > > index deb679c31f2a..4c970a4e0057 100644 > > --- a/mm/memory.c > > +++ b/mm/memory.c > > @@ -2036,6 +2036,7 @@ static inline int wp_page_reuse(struct mm_struct *mm, > > > > if (!page_mkwrite) > > file_update_time(vma->vm_file); > > + file_remove_privs(vma->vm_file); > > I thought you said in one of the early mails of this thread that it > didn't work. Or maybe I misunderstood. OK never mind for this one I just saw the other mail where you said the test is OK now. But I'm still worried about the performance so the other point below remains : > Also, don't you think we should move that into the if (!page_mkwrite) > just like for the time update ? Thanks! Willy -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 19, 2015 at 5:00 PM, Willy Tarreau <w@1wt.eu> wrote: > Hi Kees, > > On Thu, Nov 19, 2015 at 04:10:43PM -0800, Kees Cook wrote: >> Normally, when a user can modify a file that has setuid or setgid bits, >> those bits are cleared when they are not the file owner or a member of the >> group. This is enforced when using write() directly but not when writing >> to a shared mmap on the file. This could allow the file writer to gain >> privileges by changing the binary without losing the setuid/setgid bits. >> >> Signed-off-by: Kees Cook <keescook@chromium.org> >> Cc: stable@vger.kernel.org >> --- >> mm/memory.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/mm/memory.c b/mm/memory.c >> index deb679c31f2a..4c970a4e0057 100644 >> --- a/mm/memory.c >> +++ b/mm/memory.c >> @@ -2036,6 +2036,7 @@ static inline int wp_page_reuse(struct mm_struct *mm, >> >> if (!page_mkwrite) >> file_update_time(vma->vm_file); >> + file_remove_privs(vma->vm_file); > > I thought you said in one of the early mails of this thread that it > didn't work. Or maybe I misunderstood. I had a think-o in my earlier attempts. I understood the meaning of page_mkwrite incorrectly. > Also, don't you think we should move that into the if (!page_mkwrite) > just like for the time update ? Nope, page_mkwrite indicates if there was a vmops call to page_mkwrite. In this case, it means "I will update the file time if the filesystem driver didn't take care of it like it should". For file_remove_privs, we want to always do it, since we should not depend on filesystems to do it. -Kees
On Thu, Nov 19, 2015 at 05:03:15PM -0800, Kees Cook wrote: > On Thu, Nov 19, 2015 at 5:00 PM, Willy Tarreau <w@1wt.eu> wrote: > > Hi Kees, > > > > On Thu, Nov 19, 2015 at 04:10:43PM -0800, Kees Cook wrote: > >> Normally, when a user can modify a file that has setuid or setgid bits, > >> those bits are cleared when they are not the file owner or a member of the > >> group. This is enforced when using write() directly but not when writing > >> to a shared mmap on the file. This could allow the file writer to gain > >> privileges by changing the binary without losing the setuid/setgid bits. > >> > >> Signed-off-by: Kees Cook <keescook@chromium.org> > >> Cc: stable@vger.kernel.org > >> --- > >> mm/memory.c | 1 + > >> 1 file changed, 1 insertion(+) > >> > >> diff --git a/mm/memory.c b/mm/memory.c > >> index deb679c31f2a..4c970a4e0057 100644 > >> --- a/mm/memory.c > >> +++ b/mm/memory.c > >> @@ -2036,6 +2036,7 @@ static inline int wp_page_reuse(struct mm_struct *mm, > >> > >> if (!page_mkwrite) > >> file_update_time(vma->vm_file); > >> + file_remove_privs(vma->vm_file); > > > > I thought you said in one of the early mails of this thread that it > > didn't work. Or maybe I misunderstood. > > I had a think-o in my earlier attempts. I understood the meaning of > page_mkwrite incorrectly. > > > Also, don't you think we should move that into the if (!page_mkwrite) > > just like for the time update ? > > Nope, page_mkwrite indicates if there was a vmops call to > page_mkwrite. In this case, it means "I will update the file time if > the filesystem driver didn't take care of it like it should". For > file_remove_privs, we want to always do it, since we should not depend > on filesystems to do it. Ah OK, thanks for the explanation, I didn't understand it like this at all last time I read it. Cheers, Willy -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu 19-11-15 16:10:43, Kees Cook wrote: > Normally, when a user can modify a file that has setuid or setgid bits, > those bits are cleared when they are not the file owner or a member of the > group. This is enforced when using write() directly but not when writing > to a shared mmap on the file. This could allow the file writer to gain > privileges by changing the binary without losing the setuid/setgid bits. > > Signed-off-by: Kees Cook <keescook@chromium.org> > Cc: stable@vger.kernel.org So I had another look at this and now I understand why we didn't do it from the start: To call file_remove_privs() safely, we need to hold inode->i_mutex since that operations is going to modify file mode / extended attributes and i_mutex protects those. However we cannot get i_mutex in the page fault path as that ranks above mmap_sem which we hold during the whole page fault. So calling file_remove_privs() when opening the file is probably as good as it can get. It doesn't catch the case when suid bits / IMA attrs are set while the file is already open but I don't see easy way around this. BTW: This is another example where page fault locking is constraining us and life would be simpler for filesystems we they get called without mmap_sem held... Honza
Jan Kara <jack@suse.cz> writes: > On Thu 19-11-15 16:10:43, Kees Cook wrote: >> Normally, when a user can modify a file that has setuid or setgid bits, >> those bits are cleared when they are not the file owner or a member of the >> group. This is enforced when using write() directly but not when writing >> to a shared mmap on the file. This could allow the file writer to gain >> privileges by changing the binary without losing the setuid/setgid bits. >> >> Signed-off-by: Kees Cook <keescook@chromium.org> >> Cc: stable@vger.kernel.org > > So I had another look at this and now I understand why we didn't do it from > the start: > > To call file_remove_privs() safely, we need to hold inode->i_mutex since > that operations is going to modify file mode / extended attributes and > i_mutex protects those. However we cannot get i_mutex in the page fault > path as that ranks above mmap_sem which we hold during the whole page > fault. > > So calling file_remove_privs() when opening the file is probably as good as > it can get. It doesn't catch the case when suid bits / IMA attrs are set > while the file is already open but I don't see easy way around this. Could we perhaps do this on mmap MAP_WRITE instead of open, and simply deny adding these attributes if a file is mapped for write? That would seem to be a little more compatible with what we already do, and guards against the races you mention as well. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 23, 2015 at 4:26 AM, Jan Kara <jack@suse.cz> wrote: > On Thu 19-11-15 16:10:43, Kees Cook wrote: >> Normally, when a user can modify a file that has setuid or setgid bits, >> those bits are cleared when they are not the file owner or a member of the >> group. This is enforced when using write() directly but not when writing >> to a shared mmap on the file. This could allow the file writer to gain >> privileges by changing the binary without losing the setuid/setgid bits. >> >> Signed-off-by: Kees Cook <keescook@chromium.org> >> Cc: stable@vger.kernel.org > > So I had another look at this and now I understand why we didn't do it from > the start: > > To call file_remove_privs() safely, we need to hold inode->i_mutex since > that operations is going to modify file mode / extended attributes and > i_mutex protects those. However we cannot get i_mutex in the page fault > path as that ranks above mmap_sem which we hold during the whole page > fault. Ah, I see the notation in __generic_file_write_iter about i_mutex. Should file_remove_privs() get some debug annotation to catch callers that don't hold that mutex? (That would have alerted me much earlier.) > So calling file_remove_privs() when opening the file is probably as good as > it can get. It doesn't catch the case when suid bits / IMA attrs are set > while the file is already open but I don't see easy way around this. I agree with Eric: mmap time seems like the right place. > BTW: This is another example where page fault locking is constraining us > and life would be simpler for filesystems we they get called without > mmap_sem held... > > Honza > -- > Jan Kara <jack@suse.com> > SUSE Labs, CR -Kees
diff --git a/mm/memory.c b/mm/memory.c index deb679c31f2a..4c970a4e0057 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2036,6 +2036,7 @@ static inline int wp_page_reuse(struct mm_struct *mm, if (!page_mkwrite) file_update_time(vma->vm_file); + file_remove_privs(vma->vm_file); } return VM_FAULT_WRITE;
Normally, when a user can modify a file that has setuid or setgid bits, those bits are cleared when they are not the file owner or a member of the group. This is enforced when using write() directly but not when writing to a shared mmap on the file. This could allow the file writer to gain privileges by changing the binary without losing the setuid/setgid bits. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org --- mm/memory.c | 1 + 1 file changed, 1 insertion(+)