blktrace: avoid using timespec

Message ID 20160617145849.3771756-1-arnd@arndb.de
State New

Commit Message

Arnd Bergmann June 17, 2016, 2:58 p.m. UTC
The blktrace code stores the current time in a 32-bit word in its
user interface. This is a bad idea because 32-bit seconds overflow
at some point.

We probably have until 2106 before this one overflows, as it seems
to use an 'unsigned' variable, but we should confirm that user
space treats it the same way.

Aside from this, we want to stop using 'struct timespec' here,
so I'm adding a comment about the overflow and changing the code
to use timespec64 instead, to make the loss of range more obvious.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 kernel/trace/blktrace.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
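
The two dates in the commit message follow from the width and signedness of the seconds field. A minimal sketch of that arithmetic, written as plain user-space C rather than kernel code; `year_of_epoch_seconds` and `truncate_seconds` are hypothetical helpers introduced only for illustration:

```c
#include <stdint.h>

/* Last second that fits in a signed 32-bit counter: 2038-01-19 03:14:07 UTC. */
#define SIGNED_32_LIMIT   ((int64_t)INT32_MAX)
/* Last second that fits in an unsigned 32-bit counter: 2106-02-07 06:28:15 UTC. */
#define UNSIGNED_32_LIMIT ((int64_t)UINT32_MAX)

static int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/*
 * Hypothetical helper: UTC calendar year for a non-negative count of
 * seconds since 1970-01-01. It avoids time_t so the result does not
 * depend on the platform's time_t width.
 */
int year_of_epoch_seconds(int64_t secs)
{
    int64_t days = secs / 86400;
    int year = 1970;

    for (;;) {
        int days_in_year = is_leap_year(year) ? 366 : 365;

        if (days < days_in_year)
            return year;
        days -= days_in_year;
        year++;
    }
}

/* Mirrors the patch's "(u32)now.tv_sec": only the low 32 bits survive. */
uint32_t truncate_seconds(int64_t secs)
{
    return (uint32_t)secs;
}
```

Calling `year_of_epoch_seconds(SIGNED_32_LIMIT)` and `year_of_epoch_seconds(UNSIGNED_32_LIMIT)` gives the years 2038 and 2106 respectively, and `truncate_seconds` wraps back to 0 one second past the unsigned limit.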

Comments

Steven Rostedt June 17, 2016, 3:36 p.m. UTC | #1
Jens,

You want to take this, or do you want me to?

-- Steve


On Fri, 17 Jun 2016 16:58:26 +0200
Arnd Bergmann <arnd@arndb.de> wrote:

> The blktrace code stores the current time in a 32-bit word in its
> user interface. This is a bad idea because 32-bit seconds overflow
> at some point.
> 
> We probably have until 2106 before this one overflows, as it seems
> to use an 'unsigned' variable, but we should confirm that user
> space treats it the same way.
> 
> Aside from this, we want to stop using 'struct timespec' here,
> so I'm adding a comment about the overflow and changing the code
> to use timespec64 instead, to make the loss of range more obvious.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  kernel/trace/blktrace.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index ef86b965ade3..b0816e4a61a5 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -127,12 +127,13 @@ static void trace_note_tsk(struct task_struct *tsk)
>  
>  static void trace_note_time(struct blk_trace *bt)
>  {
> -	struct timespec now;
> +	struct timespec64 now;
>  	unsigned long flags;
>  	u32 words[2];
>  
> -	getnstimeofday(&now);
> -	words[0] = now.tv_sec;
> +	/* need to check user space to see if this breaks in y2038 or y2106 */
> +	ktime_get_real_ts64(&now);
> +	words[0] = (u32)now.tv_sec;
>  	words[1] = now.tv_nsec;
>  
>  	local_irq_save(flags);

--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jens Axboe June 17, 2016, 9:39 p.m. UTC | #2
On 06/17/2016 05:36 PM, Steven Rostedt wrote:
>
> Jens,
>
> You want to take this, or do you want me to?

I'll add it to my 4.8 tree, thanks Arnd.
Jeff Moyer June 17, 2016, 9:54 p.m. UTC | #3
Jens Axboe <axboe@kernel.dk> writes:

> On 06/17/2016 05:36 PM, Steven Rostedt wrote:
>>
>> Jens,
>>
>> You want to take this, or do you want me to?
>
> I'll add it to my 4.8 tree, thanks Arnd.

+	/* need to check user space to see if this breaks in y2038 or y2106 */

Userspace just uses it to print the timestamp, right?  So do we need the
comment?

-Jeff
Arnd Bergmann June 18, 2016, 7:02 p.m. UTC | #4
On Friday, June 17, 2016 5:54:16 PM CEST Jeff Moyer wrote:
> Jens Axboe <axboe@kernel.dk> writes:
> 
> > On 06/17/2016 05:36 PM, Steven Rostedt wrote:
> >>
> >> Jens,
> >>
> >> You want to take this, or do you want me to?
> >
> > I'll add it to my 4.8 tree, thanks Arnd.
> 
> +       /* need to check user space to see if this breaks in y2038 or y2106 */
> 
> Userspace just uses it to print the timestamp, right?  So do we need the
> comment?

If we have more details, the comment should describe what happens and
when it overflows. If you have the source at hand, maybe you can answer
these:

How does it print the timestamp? Does it print the raw seconds value
using %u (correct) or %d (incorrect), or does it convert it into
year/month/day/hour/min/sec?

In the last case, how does it treat second values above 0x80000000? Are
those printed as year 2038 or year 1902?

Are we sure that there is only one user space implementation that reads
these values?

	Arnd
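
The %u versus %d question comes down to how a reader interprets the same 32 bits. A minimal sketch, assuming a reader that prints the raw seconds word; `format_unsigned` and `format_signed` are hypothetical names and are not taken from blkparse:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Print the raw seconds word as unsigned, as a "%u" reader would. */
int format_unsigned(char *buf, size_t len, uint32_t word)
{
    return snprintf(buf, len, "%" PRIu32, word);
}

/*
 * Print the same bits as a signed value, as a "%d" reader effectively
 * would. The two's-complement reinterpretation is spelled out with
 * arithmetic so that no out-of-range conversion is involved.
 */
int format_signed(char *buf, size_t len, uint32_t word)
{
    int32_t as_signed;

    if (word >= 0x80000000u)
        as_signed = (int32_t)(word - 0x80000000u) + INT32_MIN;
    else
        as_signed = (int32_t)word;

    return snprintf(buf, len, "%" PRId32, as_signed);
}
```

For the word 0x80000000 the unsigned form prints a positive number, while the signed form prints a negative one, which a date conversion would place before 1970.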
Jeff Moyer June 20, 2016, 2:59 p.m. UTC | #5
Arnd Bergmann <arnd@arndb.de> writes:

> On Friday, June 17, 2016 5:54:16 PM CEST Jeff Moyer wrote:
>> Jens Axboe <axboe@kernel.dk> writes:
>> 
>> > On 06/17/2016 05:36 PM, Steven Rostedt wrote:
>> >>
>> >> Jens,
>> >>
>> >> You want to take this, or do you want me to?
>> >
>> > I'll add it to my 4.8 tree, thanks Arnd.
>> 
>> +       /* need to check user space to see if this breaks in y2038 or y2106 */
>> 
>> Userspace just uses it to print the timestamp, right?  So do we need the
>> comment?

> If we have more details, the comment should describe what happens and
> when it overflows. If you have the source at hand, maybe you can answer
> these:

As far as I can tell, that value is only ever consulted when an
undocumented format option is given to blkparse.  I don't think this
matters very much.

> How does it print the timestamp? Does it print the raw seconds value
> using %u (correct) or %d (incorrect), or does it convert it into
> year/month/day/hour/min/sec?

It converts it, but only prints hour/min/sec (and nsec):

struct timespec         abs_start_time;

...
static void handle_notify(struct blk_io_trace *bit)
{
...
        __u32   two32[2];
...
                abs_start_time.tv_sec  = two32[0];
                abs_start_time.tv_nsec = two32[1];
                if (abs_start_time.tv_nsec < 0) {
                        abs_start_time.tv_sec--;
                        abs_start_time.tv_nsec += 1000000000;
                }
...

static const char *
print_time(unsigned long long timestamp)
{
        static char     timebuf[128];
        struct tm       *tm;
        time_t          sec;
        unsigned long   nsec;

        sec  = abs_start_time.tv_sec + SECONDS(timestamp);
        nsec = abs_start_time.tv_nsec + NANO_SECONDS(timestamp);
        if (nsec >= 1000000000) {
                nsec -= 1000000000;
                sec += 1;
        }

        tm = localtime(&sec);
        snprintf(timebuf, sizeof(timebuf),
                        "%02u:%02u:%02u.%06lu",
                        tm->tm_hour,
                        tm->tm_min,
                        tm->tm_sec,
                        nsec / 1000);
        return timebuf;
}

> In the last case, how does it treat second values above 0x80000000? Are
> those printed as year 2038 or year 1902?

We don't print the year.

> Are we sure that there is only one user space implementation that reads
> these values?

We're never sure about that.  However, I'd be very surprised if anything
outside of blktrace used this.

Cheers,
Jeff
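
Even with only hour/min/sec printed, the result depends on how the 32-bit word is widened before conversion. A minimal sketch under stated assumptions: it uses UTC modular arithmetic in place of blkparse's localtime() call so the result does not depend on the time zone, and it models a reader with a 32-bit signed time_t by reinterpreting the word as signed. `utc_time_of_day`, `widen_unsigned`, and `widen_signed` are hypothetical helpers, not blkparse functions:

```c
#include <stdint.h>

struct time_of_day {
    int hour;
    int min;
    int sec;
};

/* UTC time of day for a seconds count that may be negative. */
struct time_of_day utc_time_of_day(int64_t secs)
{
    struct time_of_day t;
    int64_t rem = secs % 86400;

    /* C's % can return a negative remainder; fold it into [0, 86400). */
    if (rem < 0)
        rem += 86400;

    t.hour = (int)(rem / 3600);
    t.min = (int)(rem % 3600 / 60);
    t.sec = (int)(rem % 60);
    return t;
}

/* Reader with a 64-bit time_t: the unsigned word is zero-extended. */
int64_t widen_unsigned(uint32_t word)
{
    return (int64_t)word;
}

/* Assumed reader with a 32-bit signed time_t: same bits, read as signed. */
int64_t widen_signed(uint32_t word)
{
    if (word >= 0x80000000u)
        return (int64_t)word - 4294967296LL;
    return (int64_t)word;
}
```

The two interpretations agree for every word below 0x80000000. Above that they differ by 2^32 seconds, which is not a whole number of days, so the printed time of day would shift even though the year is never shown.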

Patch

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index ef86b965ade3..b0816e4a61a5 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -127,12 +127,13 @@  static void trace_note_tsk(struct task_struct *tsk)
 
 static void trace_note_time(struct blk_trace *bt)
 {
-	struct timespec now;
+	struct timespec64 now;
 	unsigned long flags;
 	u32 words[2];
 
-	getnstimeofday(&now);
-	words[0] = now.tv_sec;
+	/* need to check user space to see if this breaks in y2038 or y2106 */
+	ktime_get_real_ts64(&now);
+	words[0] = (u32)now.tv_sec;
 	words[1] = now.tv_nsec;
 
 	local_irq_save(flags);