mbox series

[0/3] actually detect bad file modes in fsck

Message ID YvQcNpizy9uOZiAz@coredump.intra.peff.net (mailing list archive)
Headers show
Series actually detect bad file modes in fsck | expand

Message

Jeff King Aug. 10, 2022, 8:59 p.m. UTC
On Wed, Aug 10, 2022 at 04:04:17PM -0400, Jeff King wrote:

> > Maybe downgrade to info or ignore by default then? It might still be
> > an issue for people who wilfully upgraded the diagnostic to error
> > hoping to catch the, but hopefully if they did that they'd rather get
> > the notice later than never?
> 
> Yeah, that may be a sensible resolution. All things being equal I think
> "warning" is the right level, but out of caution and the historical
> precedent, maybe downgrading it to "info" is justified.
> 
> It should be easy to work that into the patch I showed earlier.

OK, so here are cleaned-up patches to do that.

  [1/3]: tree-walk: add a mechanism for getting non-canonicalized modes
  [2/3]: fsck: actually detect bad file modes in trees
  [3/3]: fsck: downgrade tree badFilemode to "info"

 fsck.c                          |  4 ++--
 fsck.h                          |  2 +-
 packfile.c                      |  2 +-
 t/t1450-fsck.sh                 | 14 ++++++++++++++
 t/t5504-fetch-receive-strict.sh | 17 +++++++++++++++++
 tree-walk.c                     | 14 +++++++++-----
 tree-walk.h                     |  8 +++++++-
 7 files changed, 51 insertions(+), 10 deletions(-)

-Peff

Comments

Ævar Arnfjörð Bjarmason Aug. 11, 2022, 9:39 a.m. UTC | #1
On Wed, Aug 10 2022, Jeff King wrote:

> On Wed, Aug 10, 2022 at 04:04:17PM -0400, Jeff King wrote:
>
>> > Maybe downgrade to info or ignore by default then? It might still be
>> > an issue for people who wilfully upgraded the diagnostic to error
>> > hoping to catch the, but hopefully if they did that they'd rather get
>> > the notice later than never?
>> 
>> Yeah, that may be a sensible resolution. All things being equal I think
>> "warning" is the right level, but out of caution and the historical
>> precedent, maybe downgrading it to "info" is justified.
>> 
>> It should be easy to work that into the patch I showed earlier.
>
> OK, so here are cleaned-up patches to do that.
>
>   [1/3]: tree-walk: add a mechanism for getting non-canonicalized modes
>   [2/3]: fsck: actually detect bad file modes in trees
>   [3/3]: fsck: downgrade tree badFilemode to "info"

This LGTM.

I noticed/reported this issue more than a year ago, but the series I had
for fixing it ended up being dropped, here's the report/analysis at the
time:
https://lore.kernel.org/git/20210308150650.18626-31-avarab@gmail.com/

Basically I was taking a much longer way around to avoid...

>  	/* counts the number of bytes left in the `buffer`. */
>  	unsigned int size;
> +
> +	/* option flags passed via init_tree_desc_gently() */
> +	enum tree_desc_flags {
> +		TREE_DESC_RAW_MODES = (1 << 0),
> +	} flags;
>  };

...this from 1/3 here, i.e. now we're paying the cost of an extra entry
in every "struct tree_desc" user (which includes some hot codepaths),
and just for this one fsck caller.

I wonder if we couldn't introduce a init_tree_desc_gently_flags() for
this instead. You note in 1/3 that you're (rightly) avoiding the churn
of existing callers, but they could just use a "static inline" wrapper
function that would set those flags to 0, we'd pass the flags down, and
not put them into the "tree_desc" struct.

Arguably it doesn't belong there at all, since this new thing is really
an "opts" struct, but "tree_desc" is for "the state of the walk".

It might/would make sense as a "raw_mode" in "struct name_entry"
perhaps, but then we're gettin closer to the larger scope of my initial
larger series, oh well ... :)
Jeff King Aug. 14, 2022, 7:03 a.m. UTC | #2
On Thu, Aug 11, 2022 at 11:39:42AM +0200, Ævar Arnfjörð Bjarmason wrote:

> > OK, so here are cleaned-up patches to do that.
> >
> >   [1/3]: tree-walk: add a mechanism for getting non-canonicalized modes
> >   [2/3]: fsck: actually detect bad file modes in trees
> >   [3/3]: fsck: downgrade tree badFilemode to "info"
> 
> This LGTM.
> 
> I noticed/reported this issue more than a year ago, but the series I had
> for fixing it ended up being dropped, here's the report/analysis at the
> time:
> https://lore.kernel.org/git/20210308150650.18626-31-avarab@gmail.com/
> 
> Basically I was taking a much longer way around to avoid...

I took a look at that patch, but I'd really prefer _not_ to lose the
auto-canonicalizing for most code paths. It's an easy thing to forget
about, and the current state protects most code from getting confused by
malicious or buggy modes.

> >  	/* counts the number of bytes left in the `buffer`. */
> >  	unsigned int size;
> > +
> > +	/* option flags passed via init_tree_desc_gently() */
> > +	enum tree_desc_flags {
> > +		TREE_DESC_RAW_MODES = (1 << 0),
> > +	} flags;
> >  };
> 
> ...this from 1/3 here, i.e. now we're paying the cost of an extra entry
> in every "struct tree_desc" user (which includes some hot codepaths),
> and just for this one fsck caller.

Yes, but I don't think tree_desc itself is very hot. We allocate one per
iteration on the stack, not one per tree. So you'd have at most N at
once for a tree with depth N. And the rest of tree_desc is...already not
very lightweight.

In fact, I think this flag ends up in what is currently padding (I get
72 bytes for the struct before and after). Though in the long run that
"unsigned int" should almost certainly become a "size_t", which would
fill out that padding. I still doubt it's measurable.

> I wonder if we couldn't introduce a init_tree_desc_gently_flags() for
> this instead. You note in 1/3 that you're (rightly) avoiding the churn
> of existing callers, but they could just use a "static inline" wrapper
> function that would set those flags to 0, we'd pass the flags down, and
> not put them into the "tree_desc" struct.

You'd need not just that function, but wrappers for the iterator
functions. I agree it could work. It just seemed less clean to me.

> Arguably it doesn't belong there at all, since this new thing is really
> an "opts" struct, but "tree_desc" is for "the state of the walk".

I think that's a philosophical difference, and not one I really agree
with. I'd argue that options are part of the state (and we mingle them
already in other structs like rev_info).

> It might/would make sense as a "raw_mode" in "struct name_entry"
> perhaps, but then we're gettin closer to the larger scope of my initial
> larger series, oh well ... :)

Yes, I'd be OK with that approach, too. But aren't we now similarly
bloating things for a value that most callers won't care about? ;)

-Peff