[1/3] hfs: stop using timespec based interfaces

Message ID CAK8P3a0WNPxDqSNJUuijY0P9UwGXPD_DCAUw9AH_OvAg-cPf1Q@mail.gmail.com (mailing list archive)
State New, archived
Commit Message

Arnd Bergmann June 19, 2018, 7:42 p.m. UTC
On Tue, Jun 19, 2018 at 7:03 PM, Viacheslav Dubeyko <slava@dubeyko.com> wrote:
> On Tue, 2018-06-19 at 18:02 +0200, Arnd Bergmann wrote:
>> The native HFS timestamps overflow in year 2040, two years after the Unix
>> y2038 overflow. However, the way that the conversion between on-disk
>> timestamps and in-kernel timestamps was implemented, 64-bit machines
>> actually ended up converting negative UTC timestamps (1902 through 1969)
>> into times between 2038 and 2106.
>>
>> Rather than making all machines faithfully represent timestamps in the
>> ancient past but break after 2040, this changes the file system to
>> always use the unsigned UTC interpretation, reading back times between
>> 1970 and 2106.
>>
>
> The trouble with HFS and HFS+ is that the specification [1] declares this:
>
> "HFS Plus stores dates in several data structures, including the volume
> header and catalog records. These dates are stored in unsigned 32-bit
> integers (UInt32) containing the number of seconds since midnight,
> January 1, 1904, GMT. This is slightly different from HFS, where the
> value represents local time. The maximum representable date is February
> 6, 2040 at 06:28:15 GMT."
>
> So, I am not sure that we are able to support later dates because such
> timestamps cannot be stored on HFS/HFS+ volumes and will be
> incompatible with Mac OS X.

We never followed that interpretation in Linux though. As I wrote,
on 64-bit machines, these two calculations (hfs and hfs+,
respectively)

#define __hfs_m_to_utime(sec)   (be32_to_cpu(sec) - 2082844800U  + sys_tz.tz_minuteswest * 60)
#define __hfsp_mt2ut(t)                (be32_to_cpu(t) - 2082844800U)

just wrap around when reading the timestamps before 1970 from
disk. On 32-bit machines they get wrapped another time when
we assign them to a signed 32-bit time_t.
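
To make the double wrap concrete, here is a minimal user-space sketch of
the HFS+ read path. It is only an illustration under stated assumptions:
the endian swap is left out, and old_mt2ut(), seen_on_64bit() and
seen_on_32bit() are made-up helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds from 1904-01-01 to 1970-01-01, the constant used in the macros. */
#define HFS_EPOCH_OFFSET 2082844800U

/* Old HFS+ read path without the endian swap: pure 32-bit unsigned math. */
uint32_t old_mt2ut(uint32_t disk_secs)
{
	return disk_secs - HFS_EPOCH_OFFSET;	/* wraps modulo 2^32 */
}

/* Value seen through a 64-bit time_t: the u32 result is zero-extended. */
int64_t seen_on_64bit(uint32_t disk_secs)
{
	return (int64_t)old_mt2ut(disk_secs);
}

/*
 * Value seen through a signed 32-bit time_t. The conversion is
 * implementation-defined in ISO C; GCC and Clang wrap modulo 2^32.
 */
int32_t seen_on_32bit(uint32_t disk_secs)
{
	return (int32_t)old_mt2ut(disk_secs);
}

/* Self-check: on-disk value 0 is 1904-01-01 in the HFS+ specification. */
void old_mt2ut_selfcheck(void)
{
	assert(seen_on_64bit(0) == 2212122496LL);	/* lands in 2040 */
	assert(seen_on_32bit(0) == -2082844800);	/* stays in 1904 */
	assert(seen_on_64bit(HFS_EPOCH_OFFSET) == 0);	/* 1970 on both */
	assert(seen_on_32bit(HFS_EPOCH_OFFSET) == 0);
}
```

The same on-disk value therefore reads back as two different dates
depending on the width of time_t.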

> Also, I am not sure that anybody will use HFS/HFS+ after 2040.

I'm trying to fix all file systems to be unambiguous regarding
inode timestamps. This means it should behave the same way
on 32-bit and 64-bit kernels, and if possible in a sane way.

Even if you don't care about running HFS in the future, you
can trivially create files with arbitrary timestamps, just try

touch -d "Jan 1 1901" 1901
touch -d "Jan 1 1905" 1905
touch -d "Jan 1 1969" 1969
touch -d "Jan 1 2038" 2038
touch -d "Jan 1 2040" 2040
touch -d "Jan 1 2106" 2106
touch -d "Jan 1 2107" 2107

on HFS and do an 'ls -l' after an unmount/remount.
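
As a rough guide to what to expect from that experiment, the sketch below
computes the Unix time for January 1 of each of those years and checks it
against the two candidate ranges. It assumes the dates are taken as UTC
(touch -d actually uses local time), and the helper names are
hypothetical, not part of any kernel or libc API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of Gregorian leap years strictly before `year` (year >= 1). */
static int64_t leaps_before(int64_t year)
{
	int64_t y = year - 1;

	return y / 4 - y / 100 + y / 400;
}

/* Seconds since the Unix epoch at 00:00:00 UTC on January 1 of `year`. */
int64_t jan1_epoch(int64_t year)
{
	int64_t days = (year - 1970) * 365
		     + leaps_before(year) - leaps_before(1970);

	return days * 86400;
}

/* Range documented for HFS+: 32-bit unsigned seconds since 1904. */
bool fits_1904_2040(int64_t t)
{
	return t >= -2082844800LL && t <= 2212122495LL;
}

/* Unsigned 32-bit seconds since 1970. */
bool fits_1970_2106(int64_t t)
{
	return t >= 0 && t <= 4294967295LL;
}

/* Expected classification of the seven test files. */
void touch_years_selfcheck(void)
{
	assert(jan1_epoch(1904) == -2082844800LL);
	assert(!fits_1904_2040(jan1_epoch(1901)) && !fits_1970_2106(jan1_epoch(1901)));
	assert( fits_1904_2040(jan1_epoch(1905)) && !fits_1970_2106(jan1_epoch(1905)));
	assert( fits_1904_2040(jan1_epoch(1969)) && !fits_1970_2106(jan1_epoch(1969)));
	assert( fits_1904_2040(jan1_epoch(2038)) &&  fits_1970_2106(jan1_epoch(2038)));
	assert( fits_1904_2040(jan1_epoch(2040)) &&  fits_1970_2106(jan1_epoch(2040)));
	assert(!fits_1904_2040(jan1_epoch(2106)) &&  fits_1970_2106(jan1_epoch(2106)));
	assert(!fits_1904_2040(jan1_epoch(2107)) && !fits_1970_2106(jan1_epoch(2107)));
}
```

Only 2038 and 2040 fit both ranges; 1901 and 2107 fit neither, so they
cannot be stored unchanged under either interpretation.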

If you think it's important that we change the current behavior
to be compatible with MacOS and represent the 1904..2040
time range rather than 1970..2106, we can definitely do that
as well, using this patch:


I can submit that separately so that it can get backported into
stable kernels if you like, with the type changes as a follow-up
on top.

    Arnd

Comments

Viacheslav Dubeyko June 20, 2018, 4:55 p.m. UTC | #1
On Tue, 2018-06-19 at 21:42 +0200, Arnd Bergmann wrote:
> On Tue, Jun 19, 2018 at 7:03 PM, Viacheslav Dubeyko <slava@dubeyko.com> wrote:
> > 
> > On Tue, 2018-06-19 at 18:02 +0200, Arnd Bergmann wrote:
> > > 
> > > The native HFS timestamps overflow in year 2040, two years after the
> > > Unix y2038 overflow. However, the way that the conversion between
> > > on-disk timestamps and in-kernel timestamps was implemented, 64-bit
> > > machines actually ended up converting negative UTC timestamps (1902
> > > through 1969) into times between 2038 and 2106.
> > > 
> > > Rather than making all machines faithfully represent timestamps in
> > > the ancient past but break after 2040, this changes the file system
> > > to always use the unsigned UTC interpretation, reading back times
> > > between 1970 and 2106.
> > > 
> > The trouble with HFS and HFS+ is that the specification [1] declares
> > this:
> > 
> > "HFS Plus stores dates in several data structures, including the
> > volume header and catalog records. These dates are stored in unsigned
> > 32-bit integers (UInt32) containing the number of seconds since
> > midnight, January 1, 1904, GMT. This is slightly different from HFS,
> > where the value represents local time. The maximum representable date
> > is February 6, 2040 at 06:28:15 GMT."
> > 
> > So, I am not sure that we are able to support later dates because
> > such timestamps cannot be stored on HFS/HFS+ volumes and will be
> > incompatible with Mac OS X.
> We never followed that interpretation in Linux though. As I wrote,
> on 64-bit machines, these two calculations (hfs and hfs+,
> respectively)
> 
> #define __hfs_m_to_utime(sec)   (be32_to_cpu(sec) - 2082844800U  + sys_tz.tz_minuteswest * 60)
> #define __hfsp_mt2ut(t)                (be32_to_cpu(t) - 2082844800U)
> 
> just wrap around when reading the timestamps before 1970 from
> disk. On 32-bit machines they get wrapped another time when
> we assign them to a signed 32-bit time_t.
> 

The whole patchset looks reasonable to me. I am simply wondering what the
correct behaviour of the HFS/HFS+ file system driver should be once the
year 2040 is reached. Maybe a good approach would be to mount in
READ-ONLY mode. What do you think?

> > 
> > Also, I am not sure that anybody will use HFS/HFS+ after 2040.
> I'm trying to fix all file systems to be unambiguous regarding
> inode timestamps. This means it should behave the same way
> on 32-bit and 64-bit kernels, and if possible in a sane way.
> 
> Even if you don't care about running HFS in the future, you
> can trivially create files with arbitrary timestamps, just try
> 
> touch -d "Jan 1 1901" 1901
> touch -d "Jan 1 1905" 1905
> touch -d "Jan 1 1969" 1969
> touch -d "Jan 1 2038" 2038
> touch -d "Jan 1 2040" 2040
> touch -d "Jan 1 2106" 2106
> touch -d "Jan 1 2107" 2107
> 
> on HFS and do an 'ls -l' after an unmount/remount.
> 
> If you think it's important that we change the current behavior
> to be compatible with MacOS and represent the 1904..2040
> time range rather than 1970..2106, we can definitely do that
> as well, using this patch:
> 
> diff --git a/fs/hfs/hfs_fs.h b/fs/hfs/hfs_fs.h
> index ff432931a5b1..2c7366342656 100644
> --- a/fs/hfs/hfs_fs.h
> +++ b/fs/hfs/hfs_fs.h
> @@ -249,7 +249,7 @@ extern void hfs_mark_mdb_dirty(struct super_block *sb);
>   * actually works until year 2106
>   */
>  #define __hfs_u_to_mtime(sec)  cpu_to_be32(sec + 2082844800U - sys_tz.tz_minuteswest * 60)
> -#define __hfs_m_to_utime(sec)  (be32_to_cpu(sec) - 2082844800U  + sys_tz.tz_minuteswest * 60)
> +#define __hfs_m_to_utime(sec)  ((time64_t)be32_to_cpu(sec) - 2082844800U  + sys_tz.tz_minuteswest * 60)
> 
>  #define HFS_I(inode)   (container_of(inode, struct hfs_inode_info, vfs_inode))
>  #define HFS_SB(sb)     ((struct hfs_sb_info *)(sb)->s_fs_info)
> diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h
> index 1a6b469f8d22..4eaee8bdfcb2 100644
> --- a/fs/hfsplus/hfsplus_fs.h
> +++ b/fs/hfsplus/hfsplus_fs.h
> @@ -534,7 +534,7 @@ int hfsplus_read_wrapper(struct super_block *sb);
> 
>  /* time macros: convert between 1904-2040 and 1970-2106 range,
>   * pre-1970 timestamps are interpreted as post-2038 times after wrap-around */
> -#define __hfsp_mt2ut(t)                (be32_to_cpu(t) - 2082844800U)
> +#define __hfsp_mt2ut(t)                ((time64_t)be32_to_cpu(t) - 2082844800U)
>  #define __hfsp_ut2mt(t)                (cpu_to_be32(t + 2082844800U))
> 
>  /* compatibility */
> 
> I can submit that separately so that it can get backported into
> stable kernels if you like, with the type changes as a follow-up
> on top.
> 

Sounds good.

Thanks,
Vyacheslav Dubeyko.
Arnd Bergmann June 20, 2018, 7:55 p.m. UTC | #2
On Wed, Jun 20, 2018 at 6:55 PM, Viacheslav Dubeyko <slava@dubeyko.com> wrote:
> On Tue, 2018-06-19 at 21:42 +0200, Arnd Bergmann wrote:
>> On Tue, Jun 19, 2018 at 7:03 PM, Viacheslav Dubeyko <slava@dubeyko.com> wrote:

>> We never followed that interpretation in Linux though. As I wrote,
>> on 64-bit machines, these two calculations (hfs and hfs+,
>> respectively)
>>
>> #define __hfs_m_to_utime(sec)   (be32_to_cpu(sec) - 2082844800U  + sys_tz.tz_minuteswest * 60)
>> #define __hfsp_mt2ut(t)                (be32_to_cpu(t) - 2082844800U)
>>
>> just wrap around when reading the timestamps before 1970 from
>> disk. On 32-bit machines they get wrapped another time when
>> we assign them to a signed 32-bit time_t.
>>
>
> The whole patchset looks reasonable to me. I am simply wondering what the
> correct behaviour of the HFS/HFS+ file system driver should be once the
> year 2040 is reached. Maybe a good approach would be to mount in
> READ-ONLY mode. What do you think?

We've discussed doing that in VFS before. This is something we
need to revisit, but I'd like to do it in common code rather than
in every file system with a particular limit.

Deepa has a patch set to introduce minimum/maximum timestamps
in the superblock for this. We definitely want to use that for limiting
the range of utimensat() arguments from user space, and the idea
we had discussed in the past was to have a way to enforce
read-only mounting of file systems that cannot write current i_mtime
values past a certain (user-defined) future date.

We actually need something like that soon, as there are some
organizations that want to support super-long service lifetimes
for Linux systems (e.g. cars, industrial machines, ...) and want
an early-fail behavior to ensure that everything that works today
can in principle keep working for the foreseeable future, while
everything that is known to break can be forced to break already.

This is clearly not a priority for HFS in particular, but there is no
reason for HFS to be different from ext3 here, which has a similar
problem (timestamps are defined to range from 1902 to 2038).
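
To illustrate the general idea only (this is not Deepa's patch set, and
struct fs_time_limits, fs_clamp_time() and fs_must_mount_readonly() are
hypothetical names invented for this sketch), per-filesystem limits could
be used both to clamp timestamps coming from user space and to decide
whether a writable mount should be refused:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-filesystem limits, in seconds since the Unix epoch. */
struct fs_time_limits {
	int64_t time_min;
	int64_t time_max;
};

/* Clamp a requested timestamp into the range the on-disk format can store. */
int64_t fs_clamp_time(const struct fs_time_limits *lim, int64_t t)
{
	assert(lim->time_min <= lim->time_max);
	if (t < lim->time_min)
		return lim->time_min;
	if (t > lim->time_max)
		return lim->time_max;
	return t;
}

/*
 * Early-fail policy: refuse a writable mount if the format cannot store
 * timestamps up to the (user-defined) date `required_until`.
 */
bool fs_must_mount_readonly(const struct fs_time_limits *lim,
			    int64_t required_until)
{
	return lim->time_max < required_until;
}

void fs_limits_selfcheck(void)
{
	/* Documented HFS+ range, 1904 through 2040, in Unix seconds. */
	const struct fs_time_limits hfs = { -2082844800LL, 2212122495LL };

	assert(fs_clamp_time(&hfs, 0) == 0);
	/* A request for 2100-01-01 is clamped to the maximum. */
	assert(fs_clamp_time(&hfs, 4102444800LL) == 2212122495LL);
	assert(fs_must_mount_readonly(&hfs, 4102444800LL));
	/* 2038-01-01 is still representable, so no read-only fallback. */
	assert(!fs_must_mount_readonly(&hfs, 2145916800LL));
}
```

Keeping the policy in one common helper is what lets every file system
with a limit behave the same way.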

>>  /* time macros: convert between 1904-2040 and 1970-2106 range,
>>   * pre-1970 timestamps are interpreted as post-2038 times after wrap-around */
>> -#define __hfsp_mt2ut(t)                (be32_to_cpu(t) - 2082844800U)
>> +#define __hfsp_mt2ut(t)                ((time64_t)be32_to_cpu(t) - 2082844800U)
>>  #define __hfsp_ut2mt(t)                (cpu_to_be32(t + 2082844800U))
>>
>>  /* compatibility */
>>
>> I can submit that separately so that it can get backported into
>> stable kernels if you like, with the type changes as a follow-up
>> on top.
>>
>
> Sounds good.

Ok, I'll send an updated version with that patch first then.

       Arnd
Arnd Bergmann June 22, 2018, 2:19 p.m. UTC | #3
On Wed, Jun 20, 2018 at 9:55 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wed, Jun 20, 2018 at 6:55 PM, Viacheslav Dubeyko <slava@dubeyko.com> wrote:
>> On Tue, 2018-06-19 at 21:42 +0200, Arnd Bergmann wrote:
>>> On Tue, Jun 19, 2018 at 7:03 PM, Viacheslav Dubeyko <slava@dubeyko.com> wrote:

>>>  /* time macros: convert between 1904-2040 and 1970-2106 range,
>>>   * pre-1970 timestamps are interpreted as post-2038 times after wrap-around */
>>> -#define __hfsp_mt2ut(t)                (be32_to_cpu(t) - 2082844800U)
>>> +#define __hfsp_mt2ut(t)                ((time64_t)be32_to_cpu(t) - 2082844800U)
>>>  #define __hfsp_ut2mt(t)                (cpu_to_be32(t + 2082844800U))
>>>
>>>  /* compatibility */
>>>
>>> I can submit that separately so that it can get backported into
>>> stable kernels if you like, with the type changes as a follow-up
>>> on top.
>>>
>>
>> Sounds good.
>
> Ok, I'll send an updated version with that patch first then.

I've now sent that patch with additional information that I got from reading
the XNU sources. Interestingly, XNU also uses the 1970-2106 time range that
I had in my original series, not the 1904-2040 time range that is documented.

      Arnd
Patch

diff --git a/fs/hfs/hfs_fs.h b/fs/hfs/hfs_fs.h
index ff432931a5b1..2c7366342656 100644
--- a/fs/hfs/hfs_fs.h
+++ b/fs/hfs/hfs_fs.h
@@ -249,7 +249,7 @@  extern void hfs_mark_mdb_dirty(struct super_block *sb);
  * actually works until year 2106
  */
 #define __hfs_u_to_mtime(sec)  cpu_to_be32(sec + 2082844800U - sys_tz.tz_minuteswest * 60)
-#define __hfs_m_to_utime(sec)  (be32_to_cpu(sec) - 2082844800U  + sys_tz.tz_minuteswest * 60)
+#define __hfs_m_to_utime(sec)  ((time64_t)be32_to_cpu(sec) - 2082844800U  + sys_tz.tz_minuteswest * 60)

 #define HFS_I(inode)   (container_of(inode, struct hfs_inode_info, vfs_inode))
 #define HFS_SB(sb)     ((struct hfs_sb_info *)(sb)->s_fs_info)
diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h
index 1a6b469f8d22..4eaee8bdfcb2 100644
--- a/fs/hfsplus/hfsplus_fs.h
+++ b/fs/hfsplus/hfsplus_fs.h
@@ -534,7 +534,7 @@  int hfsplus_read_wrapper(struct super_block *sb);

 /* time macros: convert between 1904-2040 and 1970-2106 range,
  * pre-1970 timestamps are interpreted as post-2038 times after wrap-around */
-#define __hfsp_mt2ut(t)                (be32_to_cpu(t) - 2082844800U)
+#define __hfsp_mt2ut(t)                ((time64_t)be32_to_cpu(t) - 2082844800U)
 #define __hfsp_ut2mt(t)                (cpu_to_be32(t + 2082844800U))

 /* compatibility */
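
For reference, a small user-space sketch of what the added (time64_t)
cast changes. It rests on a few assumptions: time64_t is modelled as
int64_t, the endian swap is omitted, and the tz_minuteswest parameter
stands in for sys_tz.tz_minuteswest. The function names are illustrative
only.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t time64_t;	/* user-space stand-in for the kernel type */

/* Patched HFS+ read path without the endian swap. */
time64_t new_hfsp_mt2ut(uint32_t disk_secs)
{
	/* Widening first makes the subtraction signed 64-bit: no wrap. */
	return (time64_t)disk_secs - 2082844800U;
}

/* Patched HFS read path; tz_minuteswest stands in for sys_tz.tz_minuteswest. */
time64_t new_hfs_m_to_utime(uint32_t disk_secs, int tz_minuteswest)
{
	return (time64_t)disk_secs - 2082844800U + tz_minuteswest * 60;
}

void new_mt2ut_selfcheck(void)
{
	assert(new_hfsp_mt2ut(0) == -2082844800LL);		/* 1904-01-01 */
	assert(new_hfsp_mt2ut(UINT32_MAX) == 2212122495LL);	/* 2040-02-06 */
	assert(new_hfs_m_to_utime(2082844800U, 60) == 3600);	/* local to UTC */
}
```

With the cast, the full on-disk range maps to 1904..2040 regardless of
the width of the kernel's time type, instead of wrapping into 1970..2106.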