Add ideas for GSoC 2024

Message ID 106b8e7be9ddc2d24670b01d54347dfcf9aef196.1707122040.git.ps@pks.im (mailing list archive)
State New

Commit Message

Patrick Steinhardt Feb. 5, 2024, 8:39 a.m. UTC
Add project ideas for the GSoC 2024.
---

I came up with four different topics:

  - The reftable unit test refactorings. This topic can also be squashed
    into the preexisting unit test topics, I wouldn't mind. In that case
    I'd be happy to be a possible mentor, too.

  - Ref consistency checks for git-fsck(1). This should be rather
    straightforward and make for an interesting topic.

  - Making git-bisect(1)'s state more self-contained as recently
    discussed. This topic is easy to implement, but the backwards
    compatibility issues might require a lot of attention.

  - Implementing support for reftables in the "dumb" HTTP protocol. It's
    quite niche given that the dumb protocol isn't really used much
    nowadays anymore. But it could make for an interesting project
    regardless.

It's hard for me to estimate whether their scope is too small or too
big. So please feel free to chime in and share your concerns if you
think that any of those proposals don't make much sense.

Patrick

 SoC-2024-Ideas.md | 129 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 129 insertions(+)

Comments

Christian Couder Feb. 5, 2024, 4:43 p.m. UTC | #1
On Mon, Feb 5, 2024 at 9:39 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> Add project ideas for the GSoC 2024.
> ---
>
> I came up with four different topics:
>
>   - The reftable unit test refactorings. This topic can also be squashed
>     into the preexisting unit test topics, I wouldn't mind. In that case
>     I'd be happy to be a possible mentor, too.
>
>   - Ref consistency checks for git-fsck(1). This should be rather
>     straightforward and make for an interesting topic.
>
>   - Making git-bisect(1)'s state more self-contained as recently
>     discussed. This topic is easy to implement, but the backwards
>     compatibility issues might require a lot of attention.
>
>   - Implementing support for reftables in the "dumb" HTTP protocol. It's
>     quite niche given that the dumb protocol isn't really used much
>     nowadays anymore. But it could make for an interesting project
>     regardless.
>
> It's hard for me to estimate whether their scope is too small or too
> big. So please feel free to chime in and share your concerns if you
> think that any of those proposals don't make much sense.

Thanks a lot for these ideas! I have applied your patch and pushed it.

I have a few concerns though, see below.

>  SoC-2024-Ideas.md | 129 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 129 insertions(+)
>
> diff --git a/SoC-2024-Ideas.md b/SoC-2024-Ideas.md
> index 3efbcaf..286aea0 100644
> --- a/SoC-2024-Ideas.md
> +++ b/SoC-2024-Ideas.md
> @@ -39,3 +39,132 @@ Languages: C, shell(bash)
>  Possible mentors:
>  * Christian Couder < <christian.couder@gmail.com> >
>
> +### Convert reftable unit tests to use the unit testing framework
> +
> +The "reftable" unit tests in "t0032-reftable-unittest.sh"
> +predate the unit testing framework that was recently
> +introduced into Git. These tests should be converted to use
> +the new framework.
> +
> +See:
> +
> +  - this discussion <https://lore.kernel.org/git/cover.1692297001.git.steadmon@google.com/>
> +
> +Expected Project Size: 175 hours or 350 hours
> +
> +Difficulty: Low

"Difficulty: Low" might not be very accurate from the point of view of
contributors. I think it's always quite difficult to contribute
something significant to Git, and sometimes more than we expected.

> +Languages: C, shell(bash)
> +
> +Possible mentors:
> +* Patrick Steinhardt < <ps@pks.im> >
> +* Karthik Nayak < <karthik.188@gmail.com> >
> +
> +### Implement consistency checks for refs
> +
> +The git-fsck(1) command is used to check various data
> +structures for consistency. Notably missing though are
> +consistency checks for the refdb. While git-fsck(1)
> +implicitly checks some of the properties of the refdb
> +because it uses its refs for a connectivity check, these
> +checks aren't sufficient to properly ensure that all refs
> +are properly consistent.
> +
> +The goal of this project would be to introduce consistency
> +checks that can be implemented by the ref backend. Initially
> +these checks may only apply to the "files" backend. With the
> +ongoing efforts to upstream a new "reftable" backend the
> +effort may be extended.
> +
> +See:
> +
> +  - https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
> +  - https://lore.kernel.org/git/cover.1706601199.git.ps@pks.im/
> +
> +Expected Project Size: 175 hours or 350 hours
> +
> +Difficulty: Medium
> +
> +Languages: C, shell(bash)
> +
> +Possible mentors:
> +* Patrick Steinhardt < <ps@pks.im> >
> +* Karthik Nayak < <karthik.188@gmail.com> >
> +
> +### Refactor git-bisect(1) to make its state self-contained
> +
> +The git-bisect(1) command is used to find a commit in a
> +range of commits that introduced a specific bug. Starting a
> +bisection run creates a set of state files into the Git
> +repository which record various different parameters like
> +".git/BISECT_START". These files look almost like refs
> +due to their names being all-uppercase. This has led to
> +confusion with the new "reftable" backend because it wasn't
> +quite clear whether those files are in fact refs or not.
> +
> +As it turns out they are not refs and should never be
> +treated like one. Overall, it has been concluded that the
> +way those files are currently stored is not ideal. Instead
> +of having a proliferation of files in the Git directory, it
> +was discussed whether the bisect state should be moved into
> +its own "bisect-state" subdirectory. This would make it more
> +self-contained and thereby avoid future confusion. It is
> +also aligned with the sequencer state used by rebases, which
> +is neatly contained in the "rebase-apply" and "rebase-merge"
> +directories.
> +
> +The goal of this project would be to realize this change.
> +While rearchitecting the layout should be comparatively easy
> +to do, the harder part will be to hash out how to handle
> +backwards compatibility.
> +
> +See:
> +
> +  - https://lore.kernel.org/git/Za-gF_Hp_lXViGWw@tanuki/

From reading the discussion it looks like everyone is Ok with doing
this. I really hope that we are not missing something that might make
us decide early not to do this though.

> +Expected Project Size: 175 hours or 350 hours
> +
> +Difficulty: Medium
> +
> +Languages: C, shell(bash)
> +
> +Possible mentors:
> +* Patrick Steinhardt < <ps@pks.im> >
> +* Karthik Nayak < <karthik.188@gmail.com> >

Kaartic Sivaraam Feb. 5, 2024, 6:55 p.m. UTC | #2
Hi Patrick, Christian and all,

On 05/02/24 22:13, Christian Couder wrote:
> 
> Thanks a lot for these ideas! I have applied your patch and pushed it.
> 

Yeah. Thanks for sharing these great ideas! I've submitted the 
application using the new ideas page now as mentioned in the parent thread.

>> +### Convert reftable unit tests to use the unit testing framework
>> +
>> +The "reftable" unit tests in "t0032-reftable-unittest.sh"
>> +predate the unit testing framework that was recently
>> +introduced into Git. These tests should be converted to use
>> +the new framework.
>> +
>> +See:
>> +
>> +  - this discussion <https://lore.kernel.org/git/cover.1692297001.git.steadmon@google.com/>
>> +
>> +Expected Project Size: 175 hours or 350 hours
>> +
>> +Difficulty: Low
> 
> "Difficulty: Low" might not be very accurate from the point of view of
> contributors. I think it's always quite difficult to contribute
> something significant to Git, and sometimes more than we expected.
> 

Makes sense. Also, I'm kind of cat-on-the-wall about whether it makes
sense to have two projects about the unit test migration effort itself. 
If we're clear that both of them would not overlap, it should be fine. 
Otherwise, it would be better to merge them as Patrick suggests.

That said, how helpful would it be to link the following doc in the unit 
testing related ideas?

https://github.com/git/git/blob/master/Documentation/technical/unit-tests.txt

>> +### Implement consistency checks for refs
>> +
>>
>> [ ... snip ... ]
>>
>> +
>> +  - https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
>> +  - https://lore.kernel.org/git/cover.1706601199.git.ps@pks.im/
>> +
>> [ ... snip ... ]
>> +
>> +### Implement support for reftables in "dumb" HTTP transport

Would it be worth linking the reftable technical doc for the above ideas?

https://git-scm.com/docs/reftable

I can see it goes into a lot of detail. I'm just wondering if a link to
it would help someone who's looking to learn about reftable.

Patrick Steinhardt Feb. 6, 2024, 5:47 a.m. UTC | #3
On Mon, Feb 05, 2024 at 05:43:17PM +0100, Christian Couder wrote:
> On Mon, Feb 5, 2024 at 9:39 AM Patrick Steinhardt <ps@pks.im> wrote:
> >
> > Add project ideas for the GSoC 2024.
> > ---
> >
> > I came up with four different topics:
> >
> >   - The reftable unit test refactorings. This topic can also be squashed
> >     into the preexisting unit test topics, I wouldn't mind. In that case
> >     I'd be happy to be a possible mentor, too.
> >
> >   - Ref consistency checks for git-fsck(1). This should be rather
> >     straightforward and make for an interesting topic.
> >
> >   - Making git-bisect(1)'s state more self-contained as recently
> >     discussed. This topic is easy to implement, but the backwards
> >     compatibility issues might require a lot of attention.
> >
> >   - Implementing support for reftables in the "dumb" HTTP protocol. It's
> >     quite niche given that the dumb protocol isn't really used much
> >     nowadays anymore. But it could make for an interesting project
> >     regardless.
> >
> > It's hard for me to estimate whether their scope is too small or too
> > big. So please feel free to chime in and share your concerns if you
> > think that any of those proposals don't make much sense.
> 
> Thanks a lot for these ideas! I have applied your patch and pushed it.
> 
> I have a few concerns though, see below.
> 
> >  SoC-2024-Ideas.md | 129 ++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 129 insertions(+)
> >
> > diff --git a/SoC-2024-Ideas.md b/SoC-2024-Ideas.md
> > index 3efbcaf..286aea0 100644
> > --- a/SoC-2024-Ideas.md
> > +++ b/SoC-2024-Ideas.md
> > @@ -39,3 +39,132 @@ Languages: C, shell(bash)
> >  Possible mentors:
> >  * Christian Couder < <christian.couder@gmail.com> >
> >
> > +### Convert reftable unit tests to use the unit testing framework
> > +
> > +The "reftable" unit tests in "t0032-reftable-unittest.sh"
> > +predate the unit testing framework that was recently
> > +introduced into Git. These tests should be converted to use
> > +the new framework.
> > +
> > +See:
> > +
> > +  - this discussion <https://lore.kernel.org/git/cover.1692297001.git.steadmon@google.com/>
> > +
> > +Expected Project Size: 175 hours or 350 hours
> > +
> > +Difficulty: Low
> 
> "Difficulty: Low" might not be very accurate from the point of view of
> contributors. I think it's always quite difficult to contribute
> something significant to Git, and sometimes more than we expected.

That's certainly true. I understood the difficulty levels here as being
relative to the already-high bar that the Git project typically sets.
Otherwise there wouldn't be much use to specify difficulty in the first
place if all items had the same difficulty.

Or is the intent of the difficulty level rather on a global GSoC level?
In that case I agree that we should bump the difficulty to "medium".

> > +Languages: C, shell(bash)
> > +
> > +Possible mentors:
> > +* Patrick Steinhardt < <ps@pks.im> >
> > +* Karthik Nayak < <karthik.188@gmail.com> >
> > +
> > +### Implement consistency checks for refs
> > +
> > +The git-fsck(1) command is used to check various data
> > +structures for consistency. Notably missing though are
> > +consistency checks for the refdb. While git-fsck(1)
> > +implicitly checks some of the properties of the refdb
> > +because it uses its refs for a connectivity check, these
> > +checks aren't sufficient to properly ensure that all refs
> > +are properly consistent.
> > +
> > +The goal of this project would be to introduce consistency
> > +checks that can be implemented by the ref backend. Initially
> > +these checks may only apply to the "files" backend. With the
> > +ongoing efforts to upstream a new "reftable" backend the
> > +effort may be extended.
> > +
> > +See:
> > +
> > +  - https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
> > +  - https://lore.kernel.org/git/cover.1706601199.git.ps@pks.im/
> > +
> > +Expected Project Size: 175 hours or 350 hours
> > +
> > +Difficulty: Medium
> > +
> > +Languages: C, shell(bash)
> > +
> > +Possible mentors:
> > +* Patrick Steinhardt < <ps@pks.im> >
> > +* Karthik Nayak < <karthik.188@gmail.com> >
> > +
> > +### Refactor git-bisect(1) to make its state self-contained
> > +
> > +The git-bisect(1) command is used to find a commit in a
> > +range of commits that introduced a specific bug. Starting a
> > +bisection run creates a set of state files into the Git
> > +repository which record various different parameters like
> > +".git/BISECT_START". These files look almost like refs
> > +due to their names being all-uppercase. This has led to
> > +confusion with the new "reftable" backend because it wasn't
> > +quite clear whether those files are in fact refs or not.
> > +
> > +As it turns out they are not refs and should never be
> > +treated like one. Overall, it has been concluded that the
> > +way those files are currently stored is not ideal. Instead
> > +of having a proliferation of files in the Git directory, it
> > +was discussed whether the bisect state should be moved into
> > +its own "bisect-state" subdirectory. This would make it more
> > +self-contained and thereby avoid future confusion. It is
> > +also aligned with the sequencer state used by rebases, which
> > +is neatly contained in the "rebase-apply" and "rebase-merge"
> > +directories.
> > +
> > +The goal of this project would be to realize this change.
> > +While rearchitecting the layout should be comparatively easy
> > +to do, the harder part will be to hash out how to handle
> > +backwards compatibility.
> > +
> > +See:
> > +
> > +  - https://lore.kernel.org/git/Za-gF_Hp_lXViGWw@tanuki/
> 
> From reading the discussion it looks like everyone is Ok with doing
> this. I really hope that we are not missing something that might make
> us decide early not to do this though.

I agree that this is a risky one, and that's what I tried to bring
across with the "harder part will be to hash out how to handle backwards
compatibility". Overall I think this project will be more about selling
the patch and reasoning about how it can be done without breaking
backwards compatibility.

We could bump the difficulty to high to reflect that better. But if you
deem the risk to be too high then I'm also happy to drop the topic
completely.

Patrick

Patrick Steinhardt Feb. 6, 2024, 5:51 a.m. UTC | #4
On Tue, Feb 06, 2024 at 12:25:31AM +0530, Kaartic Sivaraam wrote:
> Hi Patrick, Christian and all,
> 
> On 05/02/24 22:13, Christian Couder wrote:
> > 
> > Thanks a lot for these ideas! I have applied your patch and pushed it.
> > 
> 
> Yeah. Thanks for sharing these great ideas! I've submitted the application
> using the new ideas page now as mentioned in the parent thread.
> 
> > > +### Convert reftable unit tests to use the unit testing framework
> > > +
> > > +The "reftable" unit tests in "t0032-reftable-unittest.sh"
> > > +predate the unit testing framework that was recently
> > > +introduced into Git. These tests should be converted to use
> > > +the new framework.
> > > +
> > > +See:
> > > +
> > > +  - this discussion <https://lore.kernel.org/git/cover.1692297001.git.steadmon@google.com/>
> > > +
> > > +Expected Project Size: 175 hours or 350 hours
> > > +
> > > +Difficulty: Low
> > 
> > "Difficulty: Low" might not be very accurate from the point of view of
> > contributors. I think it's always quite difficult to contribute
> > something significant to Git, and sometimes more than we expected.
> > 
> 
> Makes sense. Also, I'm kind of cat-on-the-wall about whether it makes sense
> to have two projects about the unit test migration effort itself. If we're
> clear that both of them would not overlap, it should be fine. Otherwise, it
> would be better to merge them as Patrick suggests.

I don't quite mind either way. I think overall we have enough tests that
can be converted even if both projects got picked up separately. And the
reftable unit tests are a bit more involved than the other tests given
that their coding style doesn't fit at all into the Git project. So it's
not like they can just be copied over, they definitely need some special
care.

Also, the technical complexity of the "reftable" backend is rather high,
which is another hurdle to take.

Which overall makes me lean more towards keeping this as a separate
project now that I think about it.

> That said, how helpful would it be to link the following doc in the unit
> testing related ideas?
> 
> https://github.com/git/git/blob/master/Documentation/technical/unit-tests.txt

Makes sense to me.

> > > +### Implement consistency checks for refs
> > > +
> >>
> >> [ ... snip ... ]
> >>
> > > +
> > > +  - https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
> > > +  - https://lore.kernel.org/git/cover.1706601199.git.ps@pks.im/
> > > +
> >> [ .... snip ... ]
> > > +
> > > +### Implement support for reftables in "dumb" HTTP transport
> 
> Would it be worth linking the reftable technical doc for the above ideas?
>
> https://git-scm.com/docs/reftable
>
> I can see it goes into a lot of detail. I'm just wondering if a link to it
> would help someone who's looking to learn about reftable.

Definitely doesn't hurt.

Patrick

Christian Couder Feb. 6, 2024, 8:13 a.m. UTC | #5
On Tue, Feb 6, 2024 at 6:51 AM Patrick Steinhardt <ps@pks.im> wrote:
> On Tue, Feb 06, 2024 at 12:25:31AM +0530, Kaartic Sivaraam wrote:

> > Makes sense. Also, I'm kind of cat-on-the-wall about whether it makes sense
> > to have two projects about the unit test migration effort itself. If we're
> > clear that both of them would not overlap, it should be fine. Otherwise, it
> > would be better to merge them as Patrick suggests.
>
> I don't quite mind either way. I think overall we have enough tests that
> can be converted even if both projects got picked up separately. And the
> reftable unit tests are a bit more involved than the other tests given
> that their coding style doesn't fit at all into the Git project. So it's
> not like they can just be copied over, they definitely need some special
> care.
>
> Also, the technical complexity of the "reftable" backend is rather high,
> which is another hurdle to take.
>
> Which overall makes me lean more towards keeping this as a separate
> project now that I think about it.

Ok, for me. If we have a contributor working on each of these 2
projects, we just need to be clear that the contributors should not
work together on the 2 projects as I think the GSoC forbids that.

> > That said, how helpful would it be to link the following doc in the unit
> > testing related ideas?
> >
> > https://github.com/git/git/blob/master/Documentation/technical/unit-tests.txt
>
> Makes sense to me.

To me too.

> > Would it be worth linking the reftable technical doc for the above ideas?
> >
> > https://git-scm.com/docs/reftable
> >
> > I can see it goes into a lot of detail. I'm just wondering if a link to it
> > would help someone who's looking to learn about reftable.
>
> Definitely doesn't hurt.

I agree.

Thanks!

Christian Couder Feb. 6, 2024, 8:26 a.m. UTC | #6
On Tue, Feb 6, 2024 at 6:47 AM Patrick Steinhardt <ps@pks.im> wrote:
> On Mon, Feb 05, 2024 at 05:43:17PM +0100, Christian Couder wrote:

> > "Difficulty: Low" might not be very accurate from the point of view of
> > contributors. I think it's always quite difficult to contribute
> > something significant to Git, and sometimes more than we expected.
>
> That's certainly true. I understood the difficulty levels here as being
> relative to the already-high bar that the Git project typically sets.

I am not sure potential contributors are aware of the high bar that
the Git project typically sets.

> Otherwise there wouldn't be much use to specify difficulty in the first
> place if all items had the same difficulty.
>
> Or is the intent of the difficulty level rather on a global GSoC level?

Yeah, I think it makes more sense to consider it like this.

> In that case I agree that we should bump the difficulty to "medium".

Yeah, I think we should bump it to "medium".

> > From reading the discussion it looks like everyone is Ok with doing
> > this. I really hope that we are not missing something that might make
> > us decide early not to do this though.
>
> I agree that this is a risky one, and that's what I tried to bring
> across with the "harder part will be to hash out how to handle backwards
> compatibility". Overall I think this project will be more about selling
> the patch and reasoning about how it can be done without breaking
> backwards compatibility.
>
> We could bump the difficulty to high to reflect that better. But if you
> deem the risk to be too high then I'm also happy to drop the topic
> completely.

I think we can keep this topic with a "Medium" difficulty. Perhaps it
will actually not be very difficult if all goes well.

Yeah, it may seem strange, but unless we start to have projects that
are not much related to our code base, like perhaps projects related to
our web sites, some infrastructure, the Git Rev News or our docs, I
think most projects should have a "Medium" difficulty. We might want to
use "High" sometimes if we want to discourage contributors unless they
have some special background related to the specific topic (like
multi-threading, for example, if we had related projects).

Kaartic Sivaraam Feb. 8, 2024, 2:02 p.m. UTC | #7
Hi Patrick and Christian,


On 6 February 2024 1:43:02 pm IST, Christian Couder <christian.couder@gmail.com> wrote:
>On Tue, Feb 6, 2024 at 6:51 AM Patrick Steinhardt <ps@pks.im> wrote:
>> On Tue, Feb 06, 2024 at 12:25:31AM +0530, Kaartic Sivaraam wrote:
>
>> I don't quite mind either way. I think overall we have enough tests that
>> can be converted even if both projects got picked up separately. And the
>> reftable unit tests are a bit more involved than the other tests given
>> that their coding style doesn't fit at all into the Git project. So it's
>> not like they can just be copied over, they definitely need some special
>> care.
>>
>> Also, the technical complexity of the "reftable" backend is rather high,
>> which is another hurdle to take.
>>
>> Which overall makes me lean more towards keeping this as a separate
>> project now that I think about it.

Makes sense.  I suppose we need to capture the distinction more clearly in the ideas page.

I've tweaked the doc for the same. Do check it out and feel free to suggest any corrections.

Ideas page: https://git.github.io/SoC-2024-Ideas/

>Ok, for me. If we have a contributor working on each of these 2
>projects, we just need to be clear that the contributors should not
>work together on the 2 projects as I think the GSoC forbids that.
>

Indeed. We must make sure to communicate this to selected contributors if we end up choosing two of them for the unit test migration projects.

On a related note, I think I could help as a co-mentor for the non-reftable unit test migration project if we don't find any other willing volunteer. :-)

I'm hoping I can be of some help in guiding the contributor as a co-mentor. Feel free to correct me if I lack the required knowledge.

>> > That said, how helpful would it be to link the following doc in the unit
>> > testing related ideas?
>> >
>> > https://github.com/git/git/blob/master/Documentation/technical/unit-tests.txt
>>
>> Makes sense to me.
>
>To me too.
>
>> > Would it be worth linking the reftable technical doc for the above ideas?
>> >
>> > https://git-scm.com/docs/reftable
>> >
>> > I can see it goes into a lot of detail. I'm just wondering if a link to it
>> > would help someone who's looking to learn about reftable.
>>
>> Definitely doesn't hurt.
>
>I agree.
>

Thanks for the feedback. I've included both of these links in the relevant ideas too. Feel free to cross-check them!

Patrick Steinhardt Feb. 9, 2024, 6:27 a.m. UTC | #8
On Thu, Feb 08, 2024 at 07:32:50PM +0530, Kaartic Sivaraam wrote:
> Hi Patrick and Christian,
> 
> 
> On 6 February 2024 1:43:02 pm IST, Christian Couder <christian.couder@gmail.com> wrote:
> >On Tue, Feb 6, 2024 at 6:51 AM Patrick Steinhardt <ps@pks.im> wrote:
> >> On Tue, Feb 06, 2024 at 12:25:31AM +0530, Kaartic Sivaraam wrote:
> >
> >> I don't quite mind either way. I think overall we have enough tests that
> >> can be converted even if both projects got picked up separately. And the
> >> reftable unit tests are a bit more involved than the other tests given
> >> that their coding style doesn't fit at all into the Git project. So it's
> >> not like they can just be copied over, they definitely need some special
> >> care.
> >>
> >> Also, the technical complexity of the "reftable" backend is rather high,
> >> which is another hurdle to take.
> >>
> >> Which overall makes me lean more towards keeping this as a separate
> >> project now that I think about it.
> 
> Makes sense.  I suppose we need to capture the distinction more
> clearly in the ideas page.
> 
> I've tweaked the doc for the same. Do check it out and feel free to
> suggest any corrections.
> 
> Ideas page: https://git.github.io/SoC-2024-Ideas/

Yeah, the clarification looks good to me. Thanks!

Patrick

Christian Couder Feb. 9, 2024, 8:36 a.m. UTC | #9
Hi Kaartic,

On Thu, Feb 8, 2024 at 3:02 PM Kaartic Sivaraam
<kaartic.sivaraam@gmail.com> wrote:
> On 6 February 2024 1:43:02 pm IST, Christian Couder <christian.couder@gmail.com> wrote:
> >On Tue, Feb 6, 2024 at 6:51 AM Patrick Steinhardt <ps@pks.im> wrote:
> >
> >> I don't quite mind either way. I think overall we have enough tests that
> >> can be converted even if both projects got picked up separately. And the
> >> reftable unit tests are a bit more involved than the other tests given
> >> that their coding style doesn't fit at all into the Git project. So it's
> >> not like they can just be copied over, they definitely need some special
> >> care.
> >>
> >> Also, the technical complexity of the "reftable" backend is rather high,
> >> which is another hurdle to take.
> >>
> >> Which overall makes me lean more towards keeping this as a separate
> >> project now that I think about it.
>
> Makes sense.  I suppose we need to capture the distinction more clearly in the ideas page.
>
> I've tweaked the doc for the same. Do check it out and feel free to suggest any corrections.
>
> Ideas page: https://git.github.io/SoC-2024-Ideas/

Thanks! It looks good to me too.

> >Ok, for me. If we have a contributor working on each of these 2
> >projects, we just need to be clear that the contributors should not
> >work together on the 2 projects as I think the GSoC forbids that.
>
> Indeed. We must make sure to communicate this to selected contributors if we end up choosing two of them for the unit test migration projects.
>
> On a related note, I think I could help as a co-mentor for the non-reftable unit test migration project if we don't find any other willing volunteer. :-)
>
> I'm hoping I can be of some help in guiding the contributor as a co-mentor. Feel free to correct me if I lack the required knowledge.

Thanks a lot for volunteering to co-mentor with me! I think you don't
need any special knowledge and you will be very helpful as usual.

> >> > That said, how helpful would it be to link the following doc in the unit
> >> > testing related ideas?
> >> >
> >> > https://github.com/git/git/blob/master/Documentation/technical/unit-tests.txt
> >>
> >> Makes sense to me.
> >
> >To me too.
> >
> >> > Would it be worth linking the reftable technical doc for the above ideas?
> >> >
> >> > https://git-scm.com/docs/reftable
> >> >
> >> > I can see it goes into a lot of detail. I'm just wondering if a link to it
> >> > would help someone who's looking to learn about reftable.
> >>
> >> Definitely doesn't hurt.
> >
> >I agree.
>
> Thanks for the feedback. I've included both of these links in the relevant ideas too. Feel free to cross-check them!

Great, thanks!

Patch

diff --git a/SoC-2024-Ideas.md b/SoC-2024-Ideas.md
index 3efbcaf..286aea0 100644
--- a/SoC-2024-Ideas.md
+++ b/SoC-2024-Ideas.md
@@ -39,3 +39,132 @@  Languages: C, shell(bash)
 Possible mentors:
 * Christian Couder < <christian.couder@gmail.com> >
 
+### Convert reftable unit tests to use the unit testing framework
+
+The "reftable" unit tests in "t0032-reftable-unittest.sh"
+predate the unit testing framework that was recently
+introduced into Git. These tests should be converted to use
+the new framework.
+
+See:
+
+  - this discussion <https://lore.kernel.org/git/cover.1692297001.git.steadmon@google.com/>
+
+Expected Project Size: 175 hours or 350 hours
+
+Difficulty: Low
+
+Languages: C, shell(bash)
+
+Possible mentors:
+* Patrick Steinhardt < <ps@pks.im> >
+* Karthik Nayak < <karthik.188@gmail.com> >
+
+### Implement consistency checks for refs
+
+The git-fsck(1) command is used to check various data
+structures for consistency. Notably missing though are
+consistency checks for the refdb. While git-fsck(1)
+implicitly checks some of the properties of the refdb
+because it uses its refs for a connectivity check, these
+checks aren't sufficient to properly ensure that all refs
+are properly consistent.
+
+The goal of this project would be to introduce consistency
+checks that can be implemented by the ref backend. Initially
+these checks may only apply to the "files" backend. With the
+ongoing efforts to upstream a new "reftable" backend the
+effort may be extended.
+
+See:
+
+  - https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/
+  - https://lore.kernel.org/git/cover.1706601199.git.ps@pks.im/
+
+Expected Project Size: 175 hours or 350 hours
+
+Difficulty: Medium
+
+Languages: C, shell(bash)
+
+Possible mentors:
+* Patrick Steinhardt < <ps@pks.im> >
+* Karthik Nayak < <karthik.188@gmail.com> >
+
+### Refactor git-bisect(1) to make its state self-contained
+
+The git-bisect(1) command is used to find a commit in a
+range of commits that introduced a specific bug. Starting a
+bisection run creates a set of state files into the Git
+repository which record various different parameters like
+".git/BISECT_START". These files look almost like refs
+due to their names being all-uppercase. This has led to
+confusion with the new "reftable" backend because it wasn't
+quite clear whether those files are in fact refs or not.
+
+As it turns out they are not refs and should never be
+treated like one. Overall, it has been concluded that the
+way those files are currently stored is not ideal. Instead
+of having a proliferation of files in the Git directory, it
+was discussed whether the bisect state should be moved into
+its own "bisect-state" subdirectory. This would make it more
+self-contained and thereby avoid future confusion. It is
+also aligned with the sequencer state used by rebases, which
+is neatly contained in the "rebase-apply" and "rebase-merge"
+directories.
+
+The goal of this project would be to realize this change.
+While rearchitecting the layout should be comparatively easy
+to do, the harder part will be to hash out how to handle
+backwards compatibility.
+
+See:
+
+  - https://lore.kernel.org/git/Za-gF_Hp_lXViGWw@tanuki/
+
+Expected Project Size: 175 hours or 350 hours
+
+Difficulty: Medium
+
+Languages: C, shell(bash)
+
+Possible mentors:
+* Patrick Steinhardt < <ps@pks.im> >
+* Karthik Nayak < <karthik.188@gmail.com> >
+
+### Implement support for reftables in "dumb" HTTP transport
+
+Fetching Git repositories uses one of two major protocols:
+
+  - The "dumb" protocol works without requiring any kind of
+    interactive negotiation like a CGI module. It can thus
+    be served by a static web server.
+
+  - The "smart" protocol works by having the client and
+    server exchange multiple messages with each other. It is
+    more efficient, but requires support for Git in the
+    server.
+
+While almost all servers nowadays use the "smart" protocol,
+there are still some that use the "dumb" protocol.
+
+The "dumb" protocol cannot serve repositories which use the
+"reftable" backend though. While there exists a "info/refs"
+file that is supposed to be backend-agnostic, this file does
+not contain information about the default branch. Instead,
+clients are expected to download the "HEAD" file and derive
+the default branch like that. This file is a mere stub in
+the "reftable" backend though, which breaks this protocol.
+
+The goal of this project is to implement "reftable" support
+for "dumb" fetches.
+
+Expected Project Size: 175 hours or 350 hours
+
+Difficulty: Medium
+
+Languages: C, shell(bash)
+
+Possible mentors:
+* Patrick Steinhardt < <ps@pks.im> >
+* Karthik Nayak < <karthik.188@gmail.com> >
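
The breakage described in the "dumb" HTTP idea can be illustrated with a rough sketch of what a client does with the two static files it downloads. This is Python for brevity, not Git's actual C code, and the function names are made up for illustration. The "info/refs" line format (object ID, tab, ref name) is what git-update-server-info(1) writes. The stub value `refs/heads/.invalid` is an assumption about what the "reftable" backend puts into "HEAD"; the patch itself only says that the file is a mere stub.

```python
# Assumption: the "reftable" backend writes this placeholder target into "HEAD".
REFTABLE_HEAD_STUB = "refs/heads/.invalid"


def parse_info_refs(text):
    """Parse "info/refs" content: one "<oid>\\t<refname>" pair per line."""
    refs = {}
    for line in text.splitlines():
        if not line:
            continue
        oid, refname = line.split("\t", 1)
        refs[refname] = oid
    return refs


def default_branch(head_text, refs):
    """Return the ref that "HEAD" points to, or None if it cannot be derived."""
    head = head_text.strip()
    if not head.startswith("ref: "):
        return None  # detached HEAD: no default branch to report
    target = head[len("ref: "):]
    if target == REFTABLE_HEAD_STUB or target not in refs:
        return None  # stub or dangling symref: "HEAD" is of no use here
    return target


# Hypothetical repository with a single branch and a fake object ID.
refs = parse_info_refs("a" * 40 + "\trefs/heads/main\n")
print(default_branch("ref: refs/heads/main\n", refs))      # "files" backend
print(default_branch("ref: refs/heads/.invalid\n", refs))  # "reftable" stub
```

With the "files" backend the client finds "refs/heads/main"; with the stub it gets nothing usable, even though "info/refs" lists the branch. Any real fix would have to make the default branch discoverable some other way, which is the open design question of the project.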