Message ID | 20190407184751.28027-1-t.gummerer@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ls-files: use correct format string | expand |
On Sun, Apr 07, 2019 at 07:47:51PM +0100, Thomas Gummerer wrote: > struct stat_data and struct cache_time both use unsigned ints for all > their members. However the format string for 'git ls-files --debug' > currently uses %d for formatting these numbers. This means that we > potentially print these values incorrectly if they are greater than > INT_MAX. > > This has been the case since the --debug option was introduced in 'git > ls-files' in 8497421715 ("ls-files: learn a debugging dump format", > 2010-07-31). I didn't see any comment on this, but it seems like it must be obviously correct, since as you note we do define those fields as unsigned. I'm really surprised that -Wformat doesn't catch this, though. I wonder why. -Peff
On 04/11, Jeff King wrote: > On Sun, Apr 07, 2019 at 07:47:51PM +0100, Thomas Gummerer wrote: > > > struct stat_data and struct cache_time both use unsigned ints for all > > their members. However the format string for 'git ls-files --debug' > > currently uses %d for formatting these numbers. This means that we > > potentially print these values incorrectly if they are greater than > > INT_MAX. > > > > This has been the case since the --debug option was introduced in 'git > > ls-files' in 8497421715 ("ls-files: learn a debugging dump format", > > 2010-07-31). > > I didn't see any comment on this, but it seems like it must be obviously > correct, since as you note we do define those fields as unsigned. I'm > really surprised that -Wformat doesn't catch this, though. I wonder why. Good point. A bit of digging led me to -Wformat-signedness, which should catch this. This turns up a lot of errors in our codebase. I didn't go through to see how many of them are actual errors, and how many are false-positives though. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65446 describes how the option can lead to false positives, e.g. printf ("%u\n", unsigned_short); might turn up an error. From a quick test this seems to work correctly with gcc 8.2.1 that I have on my machine though, so the issue might be fixed in newer gcc version, even though that bug report is still marked as new. Maybe it's worth going through the warnings at some point to see if it would be possible to turn -Wformat-signedness on.
On Thu, Apr 11, 2019 at 10:28:30PM +0100, Thomas Gummerer wrote: > > I didn't see any comment on this, but it seems like it must be obviously > > correct, since as you note we do define those fields as unsigned. I'm > > really surprised that -Wformat doesn't catch this, though. I wonder why. > > Good point. A bit of digging led me to -Wformat-signedness, which > should catch this. This turns up a lot of errors in our codebase. I > didn't go through to see how many of them are actual errors, and how > many are false-positives though. Ah, right, I totally forgot that signedness got its own warning class. Thanks for enlightening me. > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65446 describes how the > option can lead to false positives, e.g. > > printf ("%u\n", unsigned_short); > > might turn up an error. From a quick test this seems to work > correctly with gcc 8.2.1 that I have on my machine though, so the > issue might be fixed in newer gcc version, even though that bug report > is still marked as new. Interesting. Looking at that thread, I actually don't think it would be so bad to warn there anyway. It's true that due to integer promotion an unsigned short will work with %u, but I'd be just as happy to switch such a format to "%hu", which is more correct. > Maybe it's worth going through the warnings at some point to see if it > would be possible to turn -Wformat-signedness on. I skimmed over a few of the results. There are definitely some that could produce funny output. There are also many that are harmless (e.g., printing a constant 0 with "%o", which technically should be "0U"). I don't think it's high priority, but if anybody wants to chip away at it, be my guest. In the meantime, I think your patch here is an obvious improvement. -Peff
diff --git a/builtin/ls-files.c b/builtin/ls-files.c index 29a8762d46..463105ccb5 100644 --- a/builtin/ls-files.c +++ b/builtin/ls-files.c @@ -112,11 +112,11 @@ static void print_debug(const struct cache_entry *ce) if (debug_mode) { const struct stat_data *sd = &ce->ce_stat_data; - printf(" ctime: %d:%d\n", sd->sd_ctime.sec, sd->sd_ctime.nsec); - printf(" mtime: %d:%d\n", sd->sd_mtime.sec, sd->sd_mtime.nsec); - printf(" dev: %d\tino: %d\n", sd->sd_dev, sd->sd_ino); - printf(" uid: %d\tgid: %d\n", sd->sd_uid, sd->sd_gid); - printf(" size: %d\tflags: %x\n", sd->sd_size, ce->ce_flags); + printf(" ctime: %u:%u\n", sd->sd_ctime.sec, sd->sd_ctime.nsec); + printf(" mtime: %u:%u\n", sd->sd_mtime.sec, sd->sd_mtime.nsec); + printf(" dev: %u\tino: %u\n", sd->sd_dev, sd->sd_ino); + printf(" uid: %u\tgid: %u\n", sd->sd_uid, sd->sd_gid); + printf(" size: %u\tflags: %x\n", sd->sd_size, ce->ce_flags); } }
struct stat_data and struct cache_time both use unsigned ints for all their members. However the format string for 'git ls-files --debug' currently uses %d for formatting these numbers. This means that we potentially print these values incorrectly if they are greater than INT_MAX. This has been the case since the --debug option was introduced in 'git ls-files' in 8497421715 ("ls-files: learn a debugging dump format", 2010-07-31). Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> --- I noticed this running 'git ls-files --debug' the other day. Presumably there's not too many users of 'git ls-files', and even fewer will see values that trigger this behaviour. builtin/ls-files.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)