Message ID | 20250106190855.3098-2-soekkle@freenet.de (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Fixes typemissmatch warinigs from msvc | expand |
On 2025-01-06 at 19:08:52, Sören Krecker wrote: > Fix some compiler warings from msvw in add-patch.c for value truncation > form 64 bit to 32 bit integers.Change unsigned long to size_t for > correct variable size on linux and windows. > Add macro strtos for converting a string to size_t. > Test if convertion fails with over or underflow. A few minor nits here. We want to say "from" both here and in the title and "conversion" (and in the title, "mismatch"), and put a space after the period in a sentence. I think you meant "MSVC" instead of "msvw", but if not, please do explain what that is, since I'm not familiar with it and I'm curious. The commit message is a good place to explain lots in detail. > Signed-off-by: Sören Krecker <soekkle@freenet.de> > > Uses strtouq I don't see that we're using this function. > impove linux support > > Change Macro name We don't typically put comments about the revisions we've made to a patch in the commit message. We may put them below the --- so that they're visible to readers and reviewers, which is helpful, but we pretend that our patches were perfect to begin with in terms of the commit message, since the future reader of the history only cares about the actual end result and not what changes we made along the way. > --- > add-patch.c | 53 +++++++++++++++++++++++++++-------------------- > gettext.h | 2 +- > git-compat-util.h | 6 ++++++ > 3 files changed, 38 insertions(+), 23 deletions(-) > > diff --git a/add-patch.c b/add-patch.c > index 7b598e14df..67a7f68d23 100644 > --- a/add-patch.c > +++ b/add-patch.c > @@ -242,7 +242,7 @@ static struct patch_mode patch_mode_worktree_nothead = { > }; > > struct hunk_header { > - unsigned long old_offset, old_count, new_offset, new_count; > + size_t old_offset, old_count, new_offset, new_count; > /* > * Start/end offsets to the extra text after the second `@@` in the > * hunk header, e.g. the function signature. This is expected to > @@ -322,11 +322,12 @@ static void setup_child_process(struct add_p_state *s, > } > > static int parse_range(const char **p, > - unsigned long *offset, unsigned long *count) > + size_t *offset, size_t *count) > { > char *pend; > - > - *offset = strtoul(*p, &pend, 10); > + *offset = strtos(*p, &pend, 10); I see you've defined this below. > + if (errno == ERANGE) > + return error("Number dose not fit datatype"); I think the word you wanted was "does". However, perhaps we should provide a better, more meaningful error message so the user knows what data they provided that was invalid. Maybe "absurdly large value in diff header range"? It would be quite bizarre to get a value even as large as the maximum value of a 32-bit integer, and I don't think our diff code can even handle values larger than INT_MAX. In that context, it might not even be necessary to handle values larger than unsigned long, since we can't generate them. However, in the interests of compatibility with other implementations which might not have that limitation, size_t seems reasonable as a choice to handle more generally. Assuming we keep this, we probably also want to mark this for translation by wrapping it in `_(` and `)`. I also don't think this order is correct. In general, errno is not reset implicitly, so unless we know that an error occurred, errno is meaningless, since another function could have set it to ERANGE. We'd probably need to save errno, set it to 0, and restore to verify that we got the right value, since we can't distinguish here between a truncated value for range reasons and for other reasons. > if (pend == *p) > return -1; > > if (*pend != ',') { > @@ -334,7 +335,9 @@ static int parse_range(const char **p, > *p = pend; > return 0; > } > - *count = strtoul(pend + 1, (char **)p, 10); > + *count = strtos(pend + 1, (char **)p, 10); > + if (errno == ERANGE) > + return error("Number dose not fit datatype"); Same comment here. > return *p == pend + 1 ? -1 : 0; > } > > @@ -673,8 +676,8 @@ static void render_hunk(struct add_p_state *s, struct hunk *hunk, > */ > const char *p; > size_t len; > - unsigned long old_offset = header->old_offset; > - unsigned long new_offset = header->new_offset; > + size_t old_offset = header->old_offset; > + size_t new_offset = header->new_offset; > > if (!colored) { > p = s->plain.buf + header->extra_start; > @@ -700,12 +703,14 @@ static void render_hunk(struct add_p_state *s, struct hunk *hunk, > else > new_offset += delta; > > - strbuf_addf(out, "@@ -%lu", old_offset); > + strbuf_addf(out, "@@ -%" PRIuMAX, (uintmax_t)old_offset); > if (header->old_count != 1) > - strbuf_addf(out, ",%lu", header->old_count); > - strbuf_addf(out, " +%lu", new_offset); > + strbuf_addf(out, ",%" PRIuMAX, > + (uintmax_t)header->old_count); > + strbuf_addf(out, " +%" PRIuMAX, (uintmax_t)new_offset); If we're using size_t, we can use %zu. That's specified in C99 as the appropriate formatting type for size_t, and we require C99 or C11 for all systems. We don't need to cast to uintmax_t. > diff --git a/gettext.h b/gettext.h > index 484cafa562..d36f5a7ade 100644 > --- a/gettext.h > +++ b/gettext.h > @@ -53,7 +53,7 @@ static inline FORMAT_PRESERVING(1) const char *_(const char *msgid) > } > > static inline FORMAT_PRESERVING(1) FORMAT_PRESERVING(2) > -const char *Q_(const char *msgid, const char *plu, unsigned long n) > +const char *Q_(const char *msgid, const char *plu, size_t n) > { > if (!git_gettext_enabled) > return n == 1 ? msgid : plu; > diff --git a/git-compat-util.h b/git-compat-util.h > index e283c46c6f..4c33990a05 100644 > --- a/git-compat-util.h > +++ b/git-compat-util.h > @@ -291,6 +291,12 @@ static inline int _have_unix_sockets(void) > #ifdef HAVE_BSD_SYSCTL > #include <sys/sysctl.h> > #endif > +#if defined _WIN64 > +# define strtos strtoull > +#else > +#define strtos strtoul > +#endif This is not a great name for the function. First of all, it resembles the standard functions a lot, so it's something that POSIX could standardize or an OS could add, and then we'll have some fun compilation errors when we redefine things. Second, it's a lot less future-proof. While I do agree that only Windows 64-bit systems are likely to fall into this case, since we already include <limits.h>, we probably should do this: #if SIZE_MAX == ULONG_MAX #define str_to_size_t strtoul #else #define str_to_size_t strtoull #endif (or whatever you want to call the function). That expresses what we care about—that the type is suitable for the value we want to parse—and doesn't use the OS as a proxy for that data. Otherwise, the Unix developer who doesn't use Windows may not understand _why_ Windows is special and the reason we've chosen this change. On that note, it would be helpful if you explained in the commit message why that is for people who don't know. Maybe something like this: On 64-bit systems, size_t is a 64-bit type. On most Unix systems, unsigned long is also 64 bits in size, so we can use functions for that type to parse values of size_t. However, on Windows, unsigned long is always 32 bits, and if we want a 64-bit type, we must use unsigned long long. To future-proof our changes against other platforms that might be added in the future, we first check if unsigned long is sufficient, and otherwise, use unsigned long long, which will work in both cases. Of course, please feel free to edit as you see fit.
"brian m. carlson" <sandals@crustytoothpaste.net> writes: > We don't typically put comments about the revisions we've made to a > patch in the commit message. We may put them below the --- so that > they're visible to readers and reviewers, which is helpful, but we > pretend that our patches were perfect to begin with in terms of the > commit message, since the future reader of the history only cares about > the actual end result and not what changes we made along the way. Thanks, all true. The future readers would only *see* the end results, and we do not want to hear about the previous stumblings the author made before reaching an acceptable version. >> --- >> add-patch.c | 53 +++++++++++++++++++++++++++-------------------- >> gettext.h | 2 +- >> git-compat-util.h | 6 ++++++ >> 3 files changed, 38 insertions(+), 23 deletions(-) I already made this comment, but I think the offset/count being ulong is a very sane design decision, and what is causing the compiler warning is some earlier change that introduced size_t variables or parameters in the callchain. As far as I can tell, there is no system functions that yields size_t (hence we must use size_t everywhere) in the code paths that deal with offset and count. I suggested to find these abused size_t and fix them to use the matching type, i.e. "unsigned long", instead, as an alternative fix. I did not get an impression that the author tried the approach and found why we must use size_t for offset/count instead. And if we go that route, there is *no* need to talk about 64-bit ve 32-bit platforms. ulong used consistently everywhere would let you use offset/count that fits in ulong, and the apply machinery is artificially limited to limit the patch size to a few gigabytes, so 32-bit ulong should be plenty as Phillip pointed out earlier. If we needed to parse an integer into a large integer, the existing code seem to use strtoumax() into uintmax_t and move it to the target (while checking for truncation). "Ah we are on windows, so use strtoll, otherwise use strtol" is not something we want to see in our codebase. > If we're using size_t, we can use %zu. That's specified in C99 as the > appropriate formatting type for size_t, and we require C99 or C11 for > all systems. We don't need to cast to uintmax_t. You and Documentation/CodingGuidelines contradict with each other here. Thanks for a review.
Junio C Hamano <gitster@pobox.com> writes: >> If we're using size_t, we can use %zu. That's specified in C99 as the >> appropriate formatting type for size_t, and we require C99 or C11 for >> all systems. We don't need to cast to uintmax_t. > > You and Documentation/CodingGuidelines contradict with each other > here. By this, I do not necessarily mean that we should stick to the past tradition since d7d850e2 (CodingGuidelines: mention C99 features we can't use, 2022-10-10), written back when MSVC was claiming to do C99 without letting us use %z conversion. What I meant was that if we are to update our stance against %z conversion after re-evaluating the situation (and such time will certainly come someday---I do not offhand know if it can be today), we should update the documentation before or at least at the same time we recommend its use to new people. Thanks.
diff --git a/add-patch.c b/add-patch.c index 7b598e14df..67a7f68d23 100644 --- a/add-patch.c +++ b/add-patch.c @@ -242,7 +242,7 @@ static struct patch_mode patch_mode_worktree_nothead = { }; struct hunk_header { - unsigned long old_offset, old_count, new_offset, new_count; + size_t old_offset, old_count, new_offset, new_count; /* * Start/end offsets to the extra text after the second `@@` in the * hunk header, e.g. the function signature. This is expected to @@ -322,11 +322,12 @@ static void setup_child_process(struct add_p_state *s, } static int parse_range(const char **p, - unsigned long *offset, unsigned long *count) + size_t *offset, size_t *count) { char *pend; - - *offset = strtoul(*p, &pend, 10); + *offset = strtos(*p, &pend, 10); + if (errno == ERANGE) + return error("Number dose not fit datatype"); if (pend == *p) return -1; if (*pend != ',') { @@ -334,7 +335,9 @@ static int parse_range(const char **p, *p = pend; return 0; } - *count = strtoul(pend + 1, (char **)p, 10); + *count = strtos(pend + 1, (char **)p, 10); + if (errno == ERANGE) + return error("Number dose not fit datatype"); return *p == pend + 1 ? -1 : 0; } @@ -673,8 +676,8 @@ static void render_hunk(struct add_p_state *s, struct hunk *hunk, */ const char *p; size_t len; - unsigned long old_offset = header->old_offset; - unsigned long new_offset = header->new_offset; + size_t old_offset = header->old_offset; + size_t new_offset = header->new_offset; if (!colored) { p = s->plain.buf + header->extra_start; @@ -700,12 +703,14 @@ static void render_hunk(struct add_p_state *s, struct hunk *hunk, else new_offset += delta; - strbuf_addf(out, "@@ -%lu", old_offset); + strbuf_addf(out, "@@ -%" PRIuMAX, (uintmax_t)old_offset); if (header->old_count != 1) - strbuf_addf(out, ",%lu", header->old_count); - strbuf_addf(out, " +%lu", new_offset); + strbuf_addf(out, ",%" PRIuMAX, + (uintmax_t)header->old_count); + strbuf_addf(out, " +%" PRIuMAX, (uintmax_t)new_offset); if (header->new_count != 1) - strbuf_addf(out, ",%lu", header->new_count); + strbuf_addf(out, ",%" PRIuMAX, + (uintmax_t)header->new_count); strbuf_addstr(out, " @@"); if (len) @@ -1066,11 +1071,13 @@ static int split_hunk(struct add_p_state *s, struct file_diff *file_diff, /* last hunk simply gets the rest */ if (header->old_offset != remaining.old_offset) - BUG("miscounted old_offset: %lu != %lu", - header->old_offset, remaining.old_offset); + BUG("miscounted old_offset: %"PRIuMAX" != %"PRIuMAX, + (uintmax_t)header->old_offset, + (uintmax_t)remaining.old_offset); if (header->new_offset != remaining.new_offset) - BUG("miscounted new_offset: %lu != %lu", - header->new_offset, remaining.new_offset); + BUG("miscounted new_offset: %"PRIuMAX" != %"PRIuMAX, + (uintmax_t)header->new_offset, + (uintmax_t)remaining.new_offset); header->old_count = remaining.old_count; header->new_count = remaining.new_count; hunk->end = end; @@ -1354,9 +1361,10 @@ static void summarize_hunk(struct add_p_state *s, struct hunk *hunk, struct strbuf *plain = &s->plain; size_t len = out->len, i; - strbuf_addf(out, " -%lu,%lu +%lu,%lu ", - header->old_offset, header->old_count, - header->new_offset, header->new_count); + strbuf_addf(out, + " -%"PRIuMAX",%"PRIuMAX" +%"PRIuMAX",%"PRIuMAX" ", + (uintmax_t)header->old_offset, (uintmax_t)header->old_count, + (uintmax_t)header->new_offset, (uintmax_t)header->new_count); if (out->len - len < SUMMARY_HEADER_WIDTH) strbuf_addchars(out, ' ', SUMMARY_HEADER_WIDTH + len - out->len); @@ -1625,10 +1633,11 @@ static int patch_update_file(struct add_p_state *s, else if (0 < response && response <= file_diff->hunk_nr) hunk_index = response - 1; else - err(s, Q_("Sorry, only %d hunk available.", - "Sorry, only %d hunks available.", - file_diff->hunk_nr), - (int)file_diff->hunk_nr); + err(s, + Q_("Sorry, only %"PRIuMAX" hunk available.", + "Sorry, only %"PRIuMAX" hunks available.", + (uintmax_t)file_diff->hunk_nr), + (uintmax_t)file_diff->hunk_nr); } else if (s->answer.buf[0] == '/') { regex_t regex; int ret; diff --git a/gettext.h b/gettext.h index 484cafa562..d36f5a7ade 100644 --- a/gettext.h +++ b/gettext.h @@ -53,7 +53,7 @@ static inline FORMAT_PRESERVING(1) const char *_(const char *msgid) } static inline FORMAT_PRESERVING(1) FORMAT_PRESERVING(2) -const char *Q_(const char *msgid, const char *plu, unsigned long n) +const char *Q_(const char *msgid, const char *plu, size_t n) { if (!git_gettext_enabled) return n == 1 ? msgid : plu; diff --git a/git-compat-util.h b/git-compat-util.h index e283c46c6f..4c33990a05 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -291,6 +291,12 @@ static inline int _have_unix_sockets(void) #ifdef HAVE_BSD_SYSCTL #include <sys/sysctl.h> #endif +#if defined _WIN64 +# define strtos strtoull +#else +#define strtos strtoul +#endif + /* Used by compat/win32/path-utils.h, and more */ static inline int is_xplatform_dir_sep(int c)
Fix some compiler warings from msvw in add-patch.c for value truncation form 64 bit to 32 bit integers.Change unsigned long to size_t for correct variable size on linux and windows. Add macro strtos for converting a string to size_t. Test if convertion fails with over or underflow. Signed-off-by: Sören Krecker <soekkle@freenet.de> Uses strtouq impove linux support Change Macro name --- add-patch.c | 53 +++++++++++++++++++++++++++-------------------- gettext.h | 2 +- git-compat-util.h | 6 ++++++ 3 files changed, 38 insertions(+), 23 deletions(-)