Message ID | 5722005.e8ezZRvZ1E@wuerfel (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, May 24, 2016 at 1:11 PM, Arnd Bergmann <arnd@arndb.de> wrote: > The following changes since commit bf16200689118d19de1b8d2a3c314fc21f5dc7bb: > > Linux 4.6-rc3 (2016-04-10 17:58:30 -0700) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git tags/y2038-4.7 The more I look at this, the less I like it. There doesn't even seem to be any *point* to the preparatory patches. I'm not seeing what any of those patches actually help prepare. The two new superblock fields that it adds, for example, should likely never be touched directly by any code in the first place, so adding them only encourages people to add more "preparatory" patches to filesystems that simply don't seem sensible. It's not clear we want a seconds-based interface there, when in many ways ktime_t is much *much* preferable for internal kernel representations for the next hundred years or so. For example, preparing to replace CURRENT_TIME_SEC with current_fs_time_sec() is going to be a huge big patch replacing every single user *anyway* due to the addition of the superblock parameter. And since we'd have to change the type in the inode, that will be a flag-day anyway. So I'm not seeing real advantages to the prep-work. What does it actually *help*? I'd have seen more point to it if it had actually converted all the existing CURRENT_TIME_SEC cases, and basically said "the code is now syntactically ready to start using per-sb limits". I don't much see the point of a preparatory patch that just paves the way for a hundred other small pointless one-liner patches, when it shouldn't be a problem to just do it in one go. Just as an example: code that does dir->i_mtime = dir->i_ctime = CURRENT_TIME_SEC; could pretty mechanically be converted to dir->i_mtime = dir->i_ctime = current_fs_time(sb); and there really is only about a hundred of those. THAT would be a preparatory patch that actually adds value. IOW, do it as one single patch that gets rid of a bad interface, not as "one pointless preparatory patch that than makes it possible to make a hundred other pointless patches to use it". It's not like it's hard to compile-test the pretty mechanical conversion. There are no architecture-specific users, so I suspect that a trivial "make allmodconfig" build will catch all the cases. Why drag something like this out, in other words? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 24, 2016 at 3:23 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > Just as an example: code that does > > dir->i_mtime = dir->i_ctime = CURRENT_TIME_SEC; > > could pretty mechanically be converted to > > dir->i_mtime = dir->i_ctime = current_fs_time(sb); Actually, looking at the users, most of them don't have the superblock directly as a variable, so it might be better to just make current_fs_time() take the inode pointer instead. That would make the conversion simpler, and it can then do the inode->i_sb thing when it is converted to actually take the filesystem limits and time granularity into account. I suspect you could do 95% with a fairly simply coccinelle script. Or even just use 'sed', with something like sed 's/\([a-z]*\)->i_\([amc]\)time = CURRENT_TIME_SEC/\1->i_\2time = current_fs_time(\1)/' seems to get close. Run that over the tree, fix up the few cases it doesn't catch, remove the broken CURRENT_TIME_SEC thing, and voilá, you're pretty much done. No? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 24, 2016 at 3:44 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Tue, May 24, 2016 at 3:23 PM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> >> Just as an example: code that does >> >> dir->i_mtime = dir->i_ctime = CURRENT_TIME_SEC; >> >> could pretty mechanically be converted to >> >> dir->i_mtime = dir->i_ctime = current_fs_time(sb); If we use current_fs_time() instead of CURRENT_TIME_SEC, then it also means that all the filesystems now using these should use correct granularity or provide update_time callbacks for time. This is not true today. Eg: fat uses CURRENT_TIME_SEC everywhere but uses generic_update_time() and so the granularity is in nanoseconds for in memory timestamps even though this was not intended. > Actually, looking at the users, most of them don't have the superblock > directly as a variable, so it might be better to just make > current_fs_time() take the inode pointer instead. This is a much bigger change as current_fs_time() is used in many places for timestamps where inode is not part of the functions. And, this would mean reordering code. Eg: cifs_create_dfs_fattr(), btrfs_update_root_times() And, we are really only relying on super_block attributes in the function. > That would make the conversion simpler, and it can then do the > inode->i_sb thing when it is converted to actually take the filesystem > limits and time granularity into account. > > I suspect you could do 95% with a fairly simply coccinelle script. Or > even just use 'sed', with something like > > sed 's/\([a-z]*\)->i_\([amc]\)time = CURRENT_TIME_SEC/\1->i_\2time = > current_fs_time(\1)/' > > seems to get close. > > Run that over the tree, fix up the few cases it doesn't catch, remove > the broken CURRENT_TIME_SEC thing, and voilá, you're pretty much done. > > No? Yes, replacing CURRENT_TIME_SEC with current_fs_time_sec() is pretty straight forward. The reason this was split into many patches was because we thought it would be easier for individual maintainers to review the code. This is particularly true for filesystem ranges. For instance, CIFS has multiple ranges based on the version used: cifs modern or cifs smb. But, it should be fine as a single patch as well. -Deepa -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday, May 24, 2016 3:23:39 PM CEST Linus Torvalds wrote: > On Tue, May 24, 2016 at 1:11 PM, Arnd Bergmann <arnd@arndb.de> wrote: > > The following changes since commit bf16200689118d19de1b8d2a3c314fc21f5dc7bb: > > > > Linux 4.6-rc3 (2016-04-10 17:58:30 -0700) > > > > are available in the git repository at: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git tags/y2038-4.7 > > The more I look at this, the less I like it. > > There doesn't even seem to be any *point* to the preparatory patches. > I'm not seeing what any of those patches actually help prepare. The > two new superblock fields that it adds, for example, should likely > never be touched directly by any code in the first place, so adding > them only encourages people to add more "preparatory" patches to > filesystems that simply don't seem sensible. The minimum and maximum times in the superblock are needed to implement a proper policy for dealing with limited file systems: Today, using 'utimes' on a file to set a date in the distant future or past usually results in a silent overflow with a more or less random time. We want to replace that with a sane behavior and either fail with an error for out of range times or truncate the the earliest or latest times that are supported by the respective file system. Whatever we decide to do there however requires knowing those times, and storing them in the superblock seems much easier than adding code to each file system for checking them. The other point is being able to refuse writable mounts to file systems that are close to overflowing. If someone wants to ship a system that has longterm support beyond 2038, then any file system that cannot store later time stamps must already not be mountable so we can prevent it from behaving inconsistently later. Again, the specific policy and how it's controlled (mount option, compile-time, or sysctl to refuse the mount, or simply printing a warning) is to be decided later, but we have to know the limits at mount time in order to do it. > It's not clear we want a > seconds-based interface there, when in many ways ktime_t is much > *much* preferable for internal kernel representations for the next > hundred years or so. > > For example, preparing to replace CURRENT_TIME_SEC with > current_fs_time_sec() is going to be a huge big patch replacing every > single user *anyway* due to the addition of the superblock parameter. Adding current_fs_time_sec() instead of just using current_fs_time() is just to preserve the existing performance behavior. Reading the current seconds value is a bit faster than reading seconds+nanoseconds (it doesn't require computing the exact nanosecond value), and much faster than computing the seconds from a ktime_t. Note that most file systems don't do sub-second timestamps at all. > And since we'd have to change the type in the inode, that will be a > flag-day anyway. It's not really a flag day, after the initial per-filesystem patches, a fairly manageable patch changes the VFS data structures and the few things that cannot be done separately before: https://github.com/deepa-hub/vfs/commit/09323fb679c27694475d9c12992782dcba69c493 > So I'm not seeing real advantages to the prep-work. What does it > actually *help*? I'd have seen more point to it if it had actually > converted all the existing CURRENT_TIME_SEC cases, and basically said > "the code is now syntactically ready to start using per-sb limits". > > I don't much see the point of a preparatory patch that just paves the > way for a hundred other small pointless one-liner patches, when it > shouldn't be a problem to just do it in one go. > > Just as an example: code that does > > dir->i_mtime = dir->i_ctime = CURRENT_TIME_SEC; > > could pretty mechanically be converted to > > dir->i_mtime = dir->i_ctime = current_fs_time(sb); > > and there really is only about a hundred of those. THAT would be a > preparatory patch that actually adds value. > > IOW, do it as one single patch that gets rid of a bad interface, not > as "one pointless preparatory patch that than makes it possible to > make a hundred other pointless patches to use it". > > It's not like it's hard to compile-test the pretty mechanical > conversion. There are no architecture-specific users, so I suspect > that a trivial "make allmodconfig" build will catch all the cases. > > Why drag something like this out, in other words? The vfs_time_to_timespec/timespec_to_vfs_time accessors and the s_time_min/s_time_max patch are really the ones that make most sense doing per file system. These are still all really simple patches, but it seemed logical to keep all three together and then go through each file system one by one. The hard part here is really catching the attention of the file system maintainers, not doing the patches. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 25, 2016 at 06:03:19PM +0200, Arnd Bergmann wrote: > On Tuesday, May 24, 2016 3:23:39 PM CEST Linus Torvalds wrote: > > On Tue, May 24, 2016 at 1:11 PM, Arnd Bergmann <arnd@arndb.de> wrote: > > > The following changes since commit bf16200689118d19de1b8d2a3c314fc21f5dc7bb: > > > > > > Linux 4.6-rc3 (2016-04-10 17:58:30 -0700) > > > > > > are available in the git repository at: > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git tags/y2038-4.7 > > > > The more I look at this, the less I like it. > > > > There doesn't even seem to be any *point* to the preparatory patches. > > I'm not seeing what any of those patches actually help prepare. The > > two new superblock fields that it adds, for example, should likely > > never be touched directly by any code in the first place, so adding > > them only encourages people to add more "preparatory" patches to > > filesystems that simply don't seem sensible. .... > > It's not like it's hard to compile-test the pretty mechanical > > conversion. There are no architecture-specific users, so I suspect > > that a trivial "make allmodconfig" build will catch all the cases. > > > > Why drag something like this out, in other words? Good question, indeed. > The vfs_time_to_timespec/timespec_to_vfs_time accessors and the > s_time_min/s_time_max patch are really the ones that make most > sense doing per file system. These are still all really simple > patches, but it seemed logical to keep all three together and then > go through each file system one by one. The hard part here is > really catching the attention of the file system maintainers, > not doing the patches. I was the only filesystem person who attempted to the review your changes 3 months ago. After the amount of shit you and Deepa dragged me through as I tried to get you to restructure the patchset *exactly* like Linus us now suggesting, I walked away and haven't looked at your patches since. Is it any wonder that no other filesystem maintainer has bothered to waste their time on this since? Linus - I'd suggest these VFS timestamp patches need to go through Al's VFS tree. That way we don't get unreviewed VFS infrastructure changes going into your tree via a door that nobody was paying attention to... Cheers, Dave.
diff --git a/fs/afs/main.c b/fs/afs/main.c index 35de0c04729f..129ff432391c 100644 --- a/fs/afs/main.c +++ b/fs/afs/main.c @@ -14,6 +14,7 @@ #include <linux/init.h> #include <linux/completion.h> #include <linux/sched.h> +#include <linux/ktime.h> #include "internal.h" MODULE_DESCRIPTION("AFS Client File System"); @@ -37,7 +38,6 @@ struct workqueue_struct *afs_wq; */ static int __init afs_get_client_UUID(void) { - struct timespec ts; u64 uuidtime; u16 clockseq; int ret; @@ -48,9 +48,7 @@ static int __init afs_get_client_UUID(void) if (ret < 0) return ret; - getnstimeofday(&ts); - uuidtime = (u64) ts.tv_sec * 1000 * 1000 * 10; - uuidtime += ts.tv_nsec / 100; + uuidtime = ktime_divns(ktime_get_real(), 100); uuidtime += AFS_UUID_TO_UNIX_TIME; afs_uuid.time_low = uuidtime; afs_uuid.time_mid = uuidtime >> 32; diff --git a/include/linux/fs.h b/include/linux/fs.h index 70e61b58baaf..d9573060591e 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1362,7 +1362,14 @@ struct super_block { /* Granularity of c/m/atime in ns. Cannot be worse than a second */ - u32 s_time_gran; + u32 s_time_gran; + + /* + * Max and min values for timestamps + * according to the range supported by filesystems. + */ + time64_t s_time_min; + time64_t s_time_max; /* * The next field is for VFS *only*. No filesystems have any business @@ -1423,6 +1430,26 @@ struct super_block { extern struct timespec current_fs_time(struct super_block *sb); +static inline struct timespec current_fs_time_sec(struct super_block *sb) +{ + return (struct timespec) { get_seconds(), 0 }; +} + +/* Place holder defines to ensure safe transition to timespec64 + * in the vfs layer. + * These can be deleted after all filesystems and vfs are switched + * over to using 64 bit time. + */ +static inline struct timespec vfs_time_to_timespec(struct timespec inode_ts) +{ + return inode_ts; +} + +static inline struct timespec timespec_to_vfs_time(struct timespec ts) +{ + return ts; +} + /* * Snapshotting support. */