utimensat EACCES vs. EPERM in 4.8+
diff mbox

Message ID CAKgNAkjTkBMY0wrj3wsH39YF=bHp=8mbYrXkSPzn0X4ezfso1w@mail.gmail.com
State New
Headers show

Commit Message

Michael Kerrisk (man-pages) Jan. 17, 2017, 12:04 a.m. UTC
[CC += linux-api + Dave Chinner]

On 17 January 2017 at 04:53, Miklos Szeredi <mszeredi@redhat.com> wrote:
> On Mon, Jan 16, 2017 at 4:46 PM, Jan Stancek <jstancek@redhat.com> wrote:
>> Hi,
>>
>> we seem to have a conflict between kernel and man pages.

Jan, thanks for spotting this.

>> From utimensat man page:
>>
>> EACCES times is NULL, or both tv_nsec values are UTIME_NOW, and either:
>>        *  the effective user ID of the caller does not match the owner of the
>>           file, the caller does not  have  write  access  to  the file, and the
>>           caller is not privileged (Linux: does not have either the CAP_FOWNER
>>           or the CAP_DAC_OVERRIDE capability); or,
>>        *  the file is marked immutable (see chattr(1)).
>>
>> But following 2 commits gradually replaced EACCES with EPERM.
>>
>> commit 337684a1746f93ae107e05d90977b070bb7e39d8
>> Author: Eryu Guan <guaneryu@gmail.com>
>> Date:   Tue Aug 2 19:58:28 2016 +0800
>>     fs: return EPERM on immutable inode
>
> I agree with Eryu that consistently returning EPERM for immutable is
> better than sometimes returning EACCESS and sometimes EPERM.

I'm not so sure about that. In Eryu's patch (which *really, really*
should have CCed linux-api@, and it would be kind if subsystem
maintainers reminded patch submitters about that), there was this
change:

[[
        retval = do_inode_permission(inode, mask);
]]

[1] The effects of that change are pretty wide ranging, affecting
open(2)/openat(2) (of an existing file for writing),
access(2)/faccessat(2) (W_OK), and [f]truncate(2). In addition, there
is the observed change (from another part of the patch) in
utimensat(2) (and friends). Those cases formerly gave EACCES for
immutable files, now they give EPERM.

[2] By contrast, the following always gave EPERM: fallocate(2),
setxattr(2), unlink(2), link(2) [in certain cases], chown(2),
chmod(2), and some per-filesystem cases of operations such as
truncate.

> So I think the man page should be fixed.

I agree that the inconsistency in the error return for immutable files
is unfortunate. But, consider the following:

* Although the set of calls in [1] is shorter, they (in particular,
  open(2)) are probably much more commonly used than
  the system calls in [2]. (That is, Eryu's statement "In most cases,
  EPERM is returned on immutable inode" that accompanied the
  kernel patch isn't correct.)
* For access(W_OK), we introduced a new error (EPERM) that
  previously never previously occurred. If there are applications
  that use access() and check specific error returns, they'll be
  confused. (I acknowledge there may be few such applications.)
* We changed the carefully documented behavior of utimensat(2)
  (and friends). [Read the man page!]
* EACCES is the typical error for "file not writable" (because of file
  permissions or other reasons such as immutability). It's long
  been the behavior for open(O_WRONLY/O_RDWR) on immutable
  files; now that has changed.
* Now various man pages need to document two different (kernel
  version dependent) errors for immutable files (for the syscalls in [1],
  above), and applications may need to deal with those two errors.

Summary of the above list: there's a nontrivial risk that something in
userspace got broken. (And just because we didn't hear about it yet
doesn't mean it didn't happen; sometimes these reports only arrive
many months or even years later.)

So, (1) I'm struggling to see the rationale for this change (I don't
think "consistency" is enough) and (2) if "consistency" is the
argument then (because the set of system calls in [1] are more
frequently used than those in [2]), there's a reasonable argument that
the change should have gone the other way: changing all IS_IMMUTABLE
cases to fail with EACCES.

Summary: I think there's an argument for reverting the kernel patch.

Cheers,

Michael

Comments

Carlos O'Donell Jan. 17, 2017, 4:50 a.m. UTC | #1
On 01/16/2017 07:04 PM, Michael Kerrisk (man-pages) wrote:
> [CC += linux-api + Dave Chinner]

> Summary of the above list: there's a nontrivial risk that something in
> userspace got broken. (And just because we didn't hear about it yet
> doesn't mean it didn't happen; sometimes these reports only arrive
> many months or even years later.)
> 
> So, (1) I'm struggling to see the rationale for this change (I don't
> think "consistency" is enough) and (2) if "consistency" is the
> argument then (because the set of system calls in [1] are more
> frequently used than those in [2]), there's a reasonable argument that
> the change should have gone the other way: changing all IS_IMMUTABLE
> cases to fail with EACCES.
> 
> Summary: I think there's an argument for reverting the kernel patch.

Completely agree.

Even if you go ahead with these changes, they really should go through
some kind of distro verification [1]. If I even contemplated such a change
in glibc I'd run it through 4-6 months of Fedora Rawhide builds just to
see what breaks before putting it out in a real release (and we do this
frequently for thread-related changes).
Jan Stancek Jan. 17, 2017, 7:51 a.m. UTC | #2
> >> we seem to have a conflict between kernel and man pages.
> 
> Jan, thanks for spotting this.

Credit goes to Cyril. As for LTP-20170116 (released yesterday) we went
with current documented behavior - few failures are expected on 4.8+.

Thanks for detailed analysis Michael!

> 
> >> From utimensat man page:
> >>
> >> EACCES times is NULL, or both tv_nsec values are UTIME_NOW, and either:
> >>        *  the effective user ID of the caller does not match the owner of
> >>        the
> >>           file, the caller does not  have  write  access  to  the file,
> >>           and the
> >>           caller is not privileged (Linux: does not have either the
> >>           CAP_FOWNER
> >>           or the CAP_DAC_OVERRIDE capability); or,
> >>        *  the file is marked immutable (see chattr(1)).
> >>
> >> But following 2 commits gradually replaced EACCES with EPERM.
> >>
> >> commit 337684a1746f93ae107e05d90977b070bb7e39d8
> >> Author: Eryu Guan <guaneryu@gmail.com>
> >> Date:   Tue Aug 2 19:58:28 2016 +0800
> >>     fs: return EPERM on immutable inode
> >
> > I agree with Eryu that consistently returning EPERM for immutable is
> > better than sometimes returning EACCESS and sometimes EPERM.
> 
> I'm not so sure about that. In Eryu's patch (which *really, really*
> should have CCed linux-api@, and it would be kind if subsystem
> maintainers reminded patch submitters about that), there was this
> change:
> 
> [[
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -402,23 +402,23 @@ static inline int do_inode_permission(struct
> inode *inode, int mask)
>   * inode_permission().
>   */
>  int __inode_permission(struct inode *inode, int mask)
>  {
>         int retval;
> 
>         if (unlikely(mask & MAY_WRITE)) {
>                 /*
>                  * Nobody gets write access to an immutable file.
>                  */
>                 if (IS_IMMUTABLE(inode))
> -                       return -EACCES;
> +                       return -EPERM;
> 
>                 /*
>                  * Updating mtime will likely cause i_uid and i_gid to be
>                  * written back improperly if their true value is unknown
>                  * to the vfs.
>                  */
>                 if (HAS_UNMAPPED_ID(inode))
>                         return -EACCES;
>         }
> 
>         retval = do_inode_permission(inode, mask);
> ]]
> 
> [1] The effects of that change are pretty wide ranging, affecting
> open(2)/openat(2) (of an existing file for writing),
> access(2)/faccessat(2) (W_OK), and [f]truncate(2). In addition, there
> is the observed change (from another part of the patch) in
> utimensat(2) (and friends). Those cases formerly gave EACCES for
> immutable files, now they give EPERM.
> 
> [2] By contrast, the following always gave EPERM: fallocate(2),
> setxattr(2), unlink(2), link(2) [in certain cases], chown(2),
> chmod(2), and some per-filesystem cases of operations such as
> truncate.
> 
> > So I think the man page should be fixed.
> 
> I agree that the inconsistency in the error return for immutable files
> is unfortunate. But, consider the following:
> 
> * Although the set of calls in [1] is shorter, they (in particular,
>   open(2)) are probably much more commonly used than
>   the system calls in [2]. (That is, Eryu's statement "In most cases,
>   EPERM is returned on immutable inode" that accompanied the
>   kernel patch isn't correct.)
> * For access(W_OK), we introduced a new error (EPERM) that
>   previously never previously occurred. If there are applications
>   that use access() and check specific error returns, they'll be
>   confused. (I acknowledge there may be few such applications.)
> * We changed the carefully documented behavior of utimensat(2)
>   (and friends). [Read the man page!]
> * EACCES is the typical error for "file not writable" (because of file
>   permissions or other reasons such as immutability). It's long
>   been the behavior for open(O_WRONLY/O_RDWR) on immutable
>   files; now that has changed.
> * Now various man pages need to document two different (kernel
>   version dependent) errors for immutable files (for the syscalls in [1],
>   above), and applications may need to deal with those two errors.
> 
> Summary of the above list: there's a nontrivial risk that something in
> userspace got broken. (And just because we didn't hear about it yet
> doesn't mean it didn't happen; sometimes these reports only arrive
> many months or even years later.)
> 
> So, (1) I'm struggling to see the rationale for this change (I don't
> think "consistency" is enough) and (2) if "consistency" is the
> argument then (because the set of system calls in [1] are more
> frequently used than those in [2]), there's a reasonable argument that
> the change should have gone the other way: changing all IS_IMMUTABLE
> cases to fail with EACCES.
> 
> Summary: I think there's an argument for reverting the kernel patch.
> 
> Cheers,
> 
> Michael
> 
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Cyril Hrubis Jan. 17, 2017, 7:57 a.m. UTC | #3
Hi!
> > Jan, thanks for spotting this.
> 
> Credit goes to Cyril. As for LTP-20170116 (released yesterday) we went
> with current documented behavior - few failures are expected on 4.8+.

Actually credit goes to SUSE QAM that caught the change on kernel
update :-).
Miklos Szeredi Jan. 17, 2017, 9:39 a.m. UTC | #4
On Tue, Jan 17, 2017 at 8:57 AM, Cyril Hrubis <chrubis@suse.cz> wrote:
> Hi!
>> > Jan, thanks for spotting this.
>>
>> Credit goes to Cyril. As for LTP-20170116 (released yesterday) we went
>> with current documented behavior - few failures are expected on 4.8+.
>
> Actually credit goes to SUSE QAM that caught the change on kernel
> update :-).

And while this makes for a nice discussion, it is almost completely
irrelevant in real life, since the only thing broken by this change
will be test suites (in other words immutable files are rare as hen's
teeth).

If you find that this change actually breaks something other than a
test suite, then yes, please lets revert it.  Otherwise there's not
much point (distros can revert it for themselves if they want be
paranoid).

Thanks,
Miklos
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Cyril Hrubis Jan. 17, 2017, 3:43 p.m. UTC | #5
Hi!
> > Actually credit goes to SUSE QAM that caught the change on kernel
> > update :-).
> 
> And while this makes for a nice discussion, it is almost completely
> irrelevant in real life, since the only thing broken by this change
> will be test suites (in other words immutable files are rare as hen's
> teeth).

We do have documentation and code out of sync so this is a valid bug
regardless of which one of them ends up fixed in the end...
Michael Kerrisk (man-pages) Jan. 18, 2017, 8:23 a.m. UTC | #6
Hi Miklos,

On 01/17/2017 10:39 PM, Miklos Szeredi wrote:
> On Tue, Jan 17, 2017 at 8:57 AM, Cyril Hrubis <chrubis@suse.cz> wrote:
>> Hi!
>>>> Jan, thanks for spotting this.
>>>
>>> Credit goes to Cyril. As for LTP-20170116 (released yesterday) we went
>>> with current documented behavior - few failures are expected on 4.8+.
>>
>> Actually credit goes to SUSE QAM that caught the change on kernel
>> update :-).
> 
> And while this makes for a nice discussion, it is almost completely
> irrelevant in real life, since the only thing broken by this change
> will be test suites (in other words immutable files are rare as hen's
> teeth).

While I agree that the chances of breakage may be quite small (though
I certainly won't be so bold as to assert that only test suites
will get broken), your comment misses the important metapoint:
changes like this should not just be getting waved through the gate
without some analysis along the lines of what I did. "Consistency"
and an (incorrect) assertion that "In most cases, EPERM is returned
on immutable inode" just don't cut it, when it comes to making
changes of this sort.

> If you find that this change actually breaks something other than a
> test suite, then yes, please lets revert it.  

No. That is not a strategy. I already said it, but I'll say it again:
sometimes these reports only arrive many months or even years later.
And then we can end up with fiascoes such as F_SETOWN/F_SETOWN_EX.
(Essentially: F_SETOWN got subtly changed, and by the time people
realized the breakage had occurred it was *years* late. Too late to
revert... So, then we got F_SETOWN_EX to allow us what F_SETOWN
used to do. See fcntl(2).)

> Otherwise there's not
> much point (distros can revert it for themselves if they want be
> paranoid).

Also not a strategy; distros don't want to carry such junk
patches.

All of this said, I still don't know whether reverting the patch
is warranted...

Cheers,

Michael
Cyril Hrubis Jan. 31, 2017, 12:09 p.m. UTC | #7
Hi!
> All of this said, I still don't know whether reverting the patch
> is warranted...

Ping.

What shall we do with the failing testcase?

I guess that since the discussion withered the patch will stay in kernel
as it is and the manual needs to be updated...

Patch
diff mbox

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -402,23 +402,23 @@  static inline int do_inode_permission(struct
inode *inode, int mask)
  * inode_permission().
  */
 int __inode_permission(struct inode *inode, int mask)
 {
        int retval;

        if (unlikely(mask & MAY_WRITE)) {
                /*
                 * Nobody gets write access to an immutable file.
                 */
                if (IS_IMMUTABLE(inode))
-                       return -EACCES;
+                       return -EPERM;

                /*
                 * Updating mtime will likely cause i_uid and i_gid to be
                 * written back improperly if their true value is unknown
                 * to the vfs.
                 */
                if (HAS_UNMAPPED_ID(inode))
                        return -EACCES;
        }