diff mbox

[v2] vfs: Add support to check max and min inode times

Message ID 1456991401-27643-1-git-send-email-deepa.kernel@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Deepa Dinamani March 3, 2016, 7:50 a.m. UTC
This is a preparation patch to add range checking for inode
timestamps.

Extend struct super_block to include information about the max
and min inode times each filesystem can hold. These are dependent
on the on-disk format of filesystems.

These range checks will be used to clamp timestamps to filesystem
allowed ranges.

Individual filesystems do not have the same on disk format as
the in memory inodes. Range checking and clamping times assigned
to inodes will help keep in memory and on-disk timestamps to be
in sync.

Every time a new superblock is created, make sure that the superblock
max and min timestamp fields are assigned invalid values.

Another series will initialize these fields to appropriate values for
every filesystem.

The values are currently ignored. The exact policy and behavior will be
decided in a separate patch.

max and min times are initialized to MIN_VFS_TIME and MAX_VFS_TIME
respectively so that even if one of the fields is uninitialized,
it can be detected by using the condition max_time < min_time.

The original idea for the feature comes from the discussion:
https://lkml.org/lkml/2014/5/30/669

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---

The intention is to include this as part of 4.6 so that the follow on
patches can go into 4.7.

The series and the plan have been discussed with Arnd Bergmann.

Changes from v1:
* Delete INVALID macros, use VFS_TIME macros directly.
* Add comment in alloc_super() to explain range checking.
* Reword the commit text to reflect the above.

fs/super.c         |  7 +++++++
 include/linux/fs.h | 12 +++++++++++-
 2 files changed, 18 insertions(+), 1 deletion(-)

Comments

Arnd Bergmann March 3, 2016, 2:10 p.m. UTC | #1
On Wednesday 02 March 2016 23:50:01 Deepa Dinamani wrote:
> +       /*
> +        * Assign a default empty range [MAX_VFS_TIME, MIN_VFS_TIME].
> +        * This will help VFS detect filesystems that do not populate
> +        * these fields in the superblock.
> +        */
> +       s->s_time_min = MAX_VFS_TIME;
> +       s->s_time_max = MIN_VFS_TIME;
>         s->cleancache_poolid = CLEANCACHE_NO_POOL;

I think this is still less clear than it should be, even with the
comment explaining it.

As I just replied to Dave, how about initializing both to -1 instead?

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Deepa Dinamani March 3, 2016, 3:38 p.m. UTC | #2
On Thu, Mar 3, 2016 at 7:40 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 02 March 2016 23:50:01 Deepa Dinamani wrote:
>> +       /*
>> +        * Assign a default empty range [MAX_VFS_TIME, MIN_VFS_TIME].
>> +        * This will help VFS detect filesystems that do not populate
>> +        * these fields in the superblock.
>> +        */
>> +       s->s_time_min = MAX_VFS_TIME;
>> +       s->s_time_max = MIN_VFS_TIME;
>>         s->cleancache_poolid = CLEANCACHE_NO_POOL;
>
> I think this is still less clear than it should be, even with the
> comment explaining it.
>
> As I just replied to Dave, how about initializing both to -1 instead?

This was one of my first ideas.
The reason I did not choose this is because
1. [-1, -1] is a valid range theoretically.
2. -1 would now be treated as a value.

It is true that no filesystem is using -1 as a limit.
But, I did not feel right excluding a perfectly good value.
Choosing [1,1] or any other number except probably 0 would work the same way.

This is the reason that I decided an invalid range should be
represented by an invalid tuple and
not a single value.
My premise was that a range where min_time > max_time is invalid.
My first choice here as in the first patch I shared with you privately
was [1,-1].
But, the condition min_time < max_time can easily be achieved by
changing just one of the values.
For instance, [1, 2016]. But, this doesn't necessarily mean that the
filesystem explicitly set both the values.
For this reason, the invalid range I chose is [max_time, min_time].
Of course [min_time, min_time] and [max_time, max_time] ranges are
also considered valid in this approach.

This guarantees
1. That both min and max values are assigned by each filesystem.
2. The range is universally invalid.

-Deepa
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnd Bergmann March 3, 2016, 4:03 p.m. UTC | #3
On Thursday 03 March 2016 21:08:25 Deepa Dinamani wrote:
> On Thu, Mar 3, 2016 at 7:40 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Wednesday 02 March 2016 23:50:01 Deepa Dinamani wrote:
> >> +       /*
> >> +        * Assign a default empty range [MAX_VFS_TIME, MIN_VFS_TIME].
> >> +        * This will help VFS detect filesystems that do not populate
> >> +        * these fields in the superblock.
> >> +        */
> >> +       s->s_time_min = MAX_VFS_TIME;
> >> +       s->s_time_max = MIN_VFS_TIME;
> >>         s->cleancache_poolid = CLEANCACHE_NO_POOL;
> >
> > I think this is still less clear than it should be, even with the
> > comment explaining it.
> >
> > As I just replied to Dave, how about initializing both to -1 instead?
> 
> This was one of my first ideas.
> The reason I did not choose this is because
> 1. [-1, -1] is a valid range theoretically.
> 2. -1 would now be treated as a value.
> 
> It is true that no filesystem is using -1 as a limit.
> But, I did not feel right excluding a perfectly good value.
> Choosing [1,1] or any other number except probably 0 would work the same way.
> 
> This is the reason that I decided an invalid range should be
> represented by an invalid tuple and
> not a single value.
> My premise was that a range where min_time > max_time is invalid.

Right.

> My first choice here as in the first patch I shared with you privately
> was [1,-1].
> But, the condition min_time < max_time can easily be achieved by
> changing just one of the values.
> For instance, [1, 2016]. But, this doesn't necessarily mean that the
> filesystem explicitly set both the values.

I see where you are coming from here, but I think we could also assume
some minimum level of competence from the file system developers, so
I would not have quite described it this way ;-)

> For this reason, the invalid range I chose is [max_time, min_time].
> Of course [min_time, min_time] and [max_time, max_time] ranges are
> also considered valid in this approach.
> 
> This guarantees
> 1. That both min and max values are assigned by each filesystem.
> 2. The range is universally invalid.

Ok, at least it clearly shows the intention that it cannot possibly
be valid, unlike the '0' or '-1' defaults.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Deepa Dinamani March 3, 2016, 5:33 p.m. UTC | #4
On Thu, Mar 3, 2016 at 8:03 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Thursday 03 March 2016 21:08:25 Deepa Dinamani wrote:
>> On Thu, Mar 3, 2016 at 7:40 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>> > On Wednesday 02 March 2016 23:50:01 Deepa Dinamani wrote:
>> >> +       /*
>> >> +        * Assign a default empty range [MAX_VFS_TIME, MIN_VFS_TIME].
>> >> +        * This will help VFS detect filesystems that do not populate
>> >> +        * these fields in the superblock.
>> >> +        */
>> >> +       s->s_time_min = MAX_VFS_TIME;
>> >> +       s->s_time_max = MIN_VFS_TIME;
>> >>         s->cleancache_poolid = CLEANCACHE_NO_POOL;
>> >
>> > I think this is still less clear than it should be, even with the
>> > comment explaining it.
>> >
>> > As I just replied to Dave, how about initializing both to -1 instead?
>>
>> This was one of my first ideas.
>> The reason I did not choose this is because
>> 1. [-1, -1] is a valid range theoretically.
>> 2. -1 would now be treated as a value.
>>
>> It is true that no filesystem is using -1 as a limit.
>> But, I did not feel right excluding a perfectly good value.
>> Choosing [1,1] or any other number except probably 0 would work the same way.
>>
>> This is the reason that I decided an invalid range should be
>> represented by an invalid tuple and
>> not a single value.
>> My premise was that a range where min_time > max_time is invalid.
>
> Right.
>
>> My first choice here as in the first patch I shared with you privately
>> was [1,-1].
>> But, the condition min_time < max_time can easily be achieved by
>> changing just one of the values.
>> For instance, [1, 2016]. But, this doesn't necessarily mean that the
>> filesystem explicitly set both the values.
>
> I see where you are coming from here, but I think we could also assume
> some minimum level of competence from the file system developers, so
> I would not have quite described it this way ;-)

Sorry, I did not mean to offend anybody. I was thinking it could avoid
accidental update of only one of the bounds. The illustration I gave was
meant to correlate with only setting s->s_time_max = U32_MAX
(in 2106).

- Deepa
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Elder March 17, 2016, 9:48 p.m. UTC | #5
On 03/03/2016 01:50 AM, Deepa Dinamani wrote:
> This is a preparation patch to add range checking for inode
> timestamps.
> 
> Extend struct super_block to include information about the max
> and min inode times each filesystem can hold. These are dependent
> on the on-disk format of filesystems.
> 
> These range checks will be used to clamp timestamps to filesystem
> allowed ranges.

I talked to Deepa and Arnd last week (in person!) and offered
to take a look at this discussion and provide some input.  I
don't claim expertise, but maybe I have some familiarity that
might help.

> Individual filesystems do not have the same on disk format as
> the in memory inodes. Range checking and clamping times assigned
> to inodes will help keep in memory and on-disk timestamps to be
> in sync.
> 
> Every time a new superblock is created, make sure that the superblock
> max and min timestamp fields are assigned invalid values.
> 
> Another series will initialize these fields to appropriate values for
> every filesystem.
> 
> The values are currently ignored. The exact policy and behavior will be
> decided in a separate patch.
> 
> max and min times are initialized to MIN_VFS_TIME and MAX_VFS_TIME
> respectively so that even if one of the fields is uninitialized,
> it can be detected by using the condition max_time < min_time.
> 
> The original idea for the feature comes from the discussion:
> https://lkml.org/lkml/2014/5/30/669
> 
> Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
> ---
> 
> The intention is to include this as part of 4.6 so that the follow on
> patches can go into 4.7.
> 
> The series and the plan have been discussed with Arnd Bergmann.
> 
> Changes from v1:
> * Delete INVALID macros, use VFS_TIME macros directly.
> * Add comment in alloc_super() to explain range checking.
> * Reword the commit text to reflect the above.
> 
> fs/super.c         |  7 +++++++
>  include/linux/fs.h | 12 +++++++++++-
>  2 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/super.c b/fs/super.c
> index 1182af8..37ec188 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -239,6 +239,13 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags)
>  	s->s_maxbytes = MAX_NON_LFS;
>  	s->s_op = &default_op;
>  	s->s_time_gran = 1000000000;
> +	/*
> +	 * Assign a default empty range [MAX_VFS_TIME, MIN_VFS_TIME].
> +	 * This will help VFS detect filesystems that do not populate
> +	 * these fields in the superblock.
> +	 */
> +	s->s_time_min = MAX_VFS_TIME;
> +	s->s_time_max = MIN_VFS_TIME;

Dave found the symbol name you used before confusing, and I
did too.  I know what you're doing here--assigning the max
as a min and the min as a max--but your comment doesn't do
a good job of describing it, and the result is still in my
view confusing.

Your comment says "assign these values" and asserts it allows
detection of something.  But you should say something more like:

    It is invalid for s_time_min to be greater than s_time_max.
    We use this condition to represent a file system whose
    supported on-disk time range has not been properly initialized.

As far as the particular values, Arnd's suggestion that they
(or at least s_time_max) be initialized to -1 seems reasonable,
provided time64_t is signed (it is, but there are no comments
at its definition to say it must be).

>  	s->cleancache_poolid = CLEANCACHE_NO_POOL;
>  
>  	s->s_shrink.seeks = DEFAULT_SEEKS;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 1af4727..cee8f99 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -927,6 +927,9 @@ static inline struct file *get_file(struct file *f)
>  #define MAX_LFS_FILESIZE 	((loff_t)0x7fffffffffffffffLL)
>  #endif
>  
> +#define MIN_VFS_TIME	S64_MIN
> +#define MAX_VFS_TIME	S64_MAX
> +

Dave said he thought these should have their 32-bit
counterpart values instead--at least for 32-bit tasks.

I believe his point is that we can't assume a file system
implementation will properly convert between on-disk time
format and a 64-bit time value until it is verified to do so.
By using 32-bit max/min values in this case, the imposed limit
is no different from the current 32-bit time_t limit.

However, this might be missing Deepa's assignment of
an explicitly invalid combination of s_time_{min,max}
values.  At this point, there is no enforcement in place
anyway, but I think the purpose of doing that was to know
when a file system had not recorded its own limits, which
would then trigger generic code to (probably) warn and
then handle it specially (possibly using 32-bit limits
as Dave suggested).

Looking at this patch alone, it's a little hard to envision
what exactly is going to be done with these limit values.
On the other hand, if we wish to handle this generically,
it's pretty clear we need to have each file system type
expose the on-disk time value limits to which it is
constrained.

I personally think that defining these fields is
a necessary step, and they should be 64 bits (to
represent the One True VFS time).  I believe Arnd
told me it needed to be signed, and that should
be noted in comments where defined.  Their max and
min values should therefore be S64_MAX and S64_MIN
respectively.  A particular file system type must
convert between its own time and VFS time, and
care needs to be taken to ensure none of the VFS
changes assume such conversion occurs automatically.

Defining an expressly invalid combination of these
values (with max < min) is important; a single
invalid value is not enough (because a file system
should be allowed to support all representable values).

It will be good to find out a little more about how
exactly these *might* be used, even if the policy is
not yet fully worked out.

So with all of that as background, I guess I recommend
this patch (with perhaps very minor changes) get accepted.
But I'd like to know if Dave has a different suggestion
for how to support a unified VFS notion of time while
generically handling file system specific differences
in on-disk time representation.

I hope this helps.

					-Alex

>  #define FL_POSIX	1
>  #define FL_FLOCK	2
>  #define FL_DELEG	4	/* NFSv4 delegation */
> @@ -1343,7 +1346,14 @@ struct super_block {
>  
>  	/* Granularity of c/m/atime in ns.
>  	   Cannot be worse than a second */
> -	u32		   s_time_gran;
> +	u32		s_time_gran;
> +
> +	/*
> +	 * Max and min values for timestamps
> +	 * according to the range supported by filesystems.
> +	 */
> +	time64_t	s_time_min;
> +	time64_t	s_time_max;
>  
>  	/*
>  	 * The next field is for VFS *only*. No filesystems have any business
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/super.c b/fs/super.c
index 1182af8..37ec188 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -239,6 +239,13 @@  static struct super_block *alloc_super(struct file_system_type *type, int flags)
 	s->s_maxbytes = MAX_NON_LFS;
 	s->s_op = &default_op;
 	s->s_time_gran = 1000000000;
+	/*
+	 * Assign a default empty range [MAX_VFS_TIME, MIN_VFS_TIME].
+	 * This will help VFS detect filesystems that do not populate
+	 * these fields in the superblock.
+	 */
+	s->s_time_min = MAX_VFS_TIME;
+	s->s_time_max = MIN_VFS_TIME;
 	s->cleancache_poolid = CLEANCACHE_NO_POOL;
 
 	s->s_shrink.seeks = DEFAULT_SEEKS;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1af4727..cee8f99 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -927,6 +927,9 @@  static inline struct file *get_file(struct file *f)
 #define MAX_LFS_FILESIZE 	((loff_t)0x7fffffffffffffffLL)
 #endif
 
+#define MIN_VFS_TIME	S64_MIN
+#define MAX_VFS_TIME	S64_MAX
+
 #define FL_POSIX	1
 #define FL_FLOCK	2
 #define FL_DELEG	4	/* NFSv4 delegation */
@@ -1343,7 +1346,14 @@  struct super_block {
 
 	/* Granularity of c/m/atime in ns.
 	   Cannot be worse than a second */
-	u32		   s_time_gran;
+	u32		s_time_gran;
+
+	/*
+	 * Max and min values for timestamps
+	 * according to the range supported by filesystems.
+	 */
+	time64_t	s_time_min;
+	time64_t	s_time_max;
 
 	/*
 	 * The next field is for VFS *only*. No filesystems have any business