diff mbox series

[v6,1/5] common-main: split init and exit code into new files

Message ID ff6cd62397ec2755d15e9d76f9af8a84b54a36c1.1736971328.git.steadmon@google.com (mailing list archive)
State New
Headers show
Series Introduce libgit-rs, a Rust wrapper around libgit.a | expand

Commit Message

Josh Steadmon Jan. 15, 2025, 8:05 p.m. UTC
Currently, object files in libgit.a reference common_exit(), which is
contained in common-main.o. However, common-main.o also includes main(),
which references cmd_main() in git.o, which in turn depends on all the
builtin/*.o objects.

We would like to allow external users to link libgit.a without needing
to include so many extra objects. Enable this by splitting common_exit()
and check_bug_if_BUG() into a new file common-exit.c, and add
common-exit.o to LIB_OBJS so that these are included in libgit.a.

This split has previously been proposed ([1], [2]) to support fuzz tests
and unit tests by avoiding conflicting definitions for main(). However,
both of those issues were resolved by other methods of avoiding symbol
conflicts. Now we are trying to make libgit.a more self-contained, so
hopefully we can revisit this approach.

Additionally, move the initialization code out of main() into a new
init_git() function in its own file. Include this in libgit.a as well,
so that external users can share our setup code without calling our
main().

[1] https://lore.kernel.org/git/Yp+wjCPhqieTku3X@google.com/
[2] https://lore.kernel.org/git/20230517-unit-tests-v2-v2-1-21b5b60f4b32@google.com/

Signed-off-by: Josh Steadmon <steadmon@google.com>
---
 Makefile      |  2 ++
 common-exit.c | 26 ++++++++++++++++
 common-init.c | 63 ++++++++++++++++++++++++++++++++++++++
 common-init.h |  6 ++++
 common-main.c | 83 ++-------------------------------------------------
 5 files changed, 99 insertions(+), 81 deletions(-)
 create mode 100644 common-exit.c
 create mode 100644 common-init.c
 create mode 100644 common-init.h

Comments

Junio C Hamano Jan. 15, 2025, 10:40 p.m. UTC | #1
Josh Steadmon <steadmon@google.com> writes:

> This split has previously been proposed ([1], [2]) to support fuzz tests
> and unit tests by avoiding conflicting definitions for main(). However,
> both of those issues were resolved by other methods of avoiding symbol
> conflicts. Now we are trying to make libgit.a more self-contained, so
> hopefully we can revisit this approach.

Yay!

> Additionally, move the initialization code out of main() into a new
> init_git() function in its own file. Include this in libgit.a as well,
> so that external users can share our setup code without calling our
> main().

Sounds good.

> diff --git a/common-main.c b/common-main.c
> index 8e68ac9e42..6b7ab077b0 100644
> --- a/common-main.c
> +++ b/common-main.c
> @@ -1,92 +1,13 @@
> ...
> +#include "common-init.h"
>  
>  int main(int argc, const char **argv)
>  {
>  	int result;
>  
> +	init_git(argv);
>  	result = cmd_main(argc, argv);
>  
>  	/* Not exit(3), but a wrapper calling our common_exit() */
>  	exit(result);
>  }

Nice.  Very nice.  I wasn't too carefully validating what we lost
here is identical to what we added to init_git(), but hopefully
others would spot and stop us if that is not the case.

Thanks.
Junio C Hamano Jan. 16, 2025, 2:46 a.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> Josh Steadmon <steadmon@google.com> writes:
>
>> This split has previously been proposed ([1], [2]) to support fuzz tests
>> and unit tests by avoiding conflicting definitions for main(). However,
>> both of those issues were resolved by other methods of avoiding symbol
>> conflicts. Now we are trying to make libgit.a more self-contained, so
>> hopefully we can revisit this approach.
>
> Yay!
>
>> Additionally, move the initialization code out of main() into a new
>> init_git() function in its own file. Include this in libgit.a as well,
>> so that external users can share our setup code without calling our
>> main().
>
> Sounds good.
>
>> diff --git a/common-main.c b/common-main.c
>> index 8e68ac9e42..6b7ab077b0 100644
>> --- a/common-main.c
>> +++ b/common-main.c
>> @@ -1,92 +1,13 @@
>> ...
>> +#include "common-init.h"
>>  
>>  int main(int argc, const char **argv)
>>  {
>>  	int result;
>>  
>> +	init_git(argv);
>>  	result = cmd_main(argc, argv);
>>  
>>  	/* Not exit(3), but a wrapper calling our common_exit() */
>>  	exit(result);
>>  }
>
> Nice.  Very nice.  I wasn't too carefully validating what we lost
> here is identical to what we added to init_git(), but hopefully
> others would spot and stop us if that is not the case.
>
> Thanks.

Unfortunately, build based on meson does not seem to like the
init_git() thing.  Perhaps we need to add some missing files to
relevant lists in meson.build file or something silly like that?

https://github.com/git/git/actions/runs/12800227601/job/35687658673#step:8:961
Junio C Hamano Jan. 16, 2025, 9:02 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> Unfortunately, build based on meson does not seem to like the
> init_git() thing.  Perhaps we need to add some missing files to
> relevant lists in meson.build file or something silly like that?
>
> https://github.com/git/git/actions/runs/12800227601/job/35687658673#step:8:961

I needed the following to get "meson compile" pass in my local
environment.  I suspect that Mesonized CI jobs exercise a bit more
than just "meson compile", so there is no guarantee that the
following is enough, but at least hopefully it would nudge you (and
those who may be interested in helping to build a working Rust
bindings) in the right direction.

I think it should be squashed into the step these files are added,
i.e. [PATCH 1/5].

Thanks.

 meson.build | 2 ++
 1 file changed, 2 insertions(+)

diff --git c/meson.build w/meson.build
index 0064eb64f5..e5ba28b47f 100644
--- c/meson.build
+++ w/meson.build
@@ -245,6 +245,8 @@ libgit_sources = [
   'commit-graph.c',
   'commit-reach.c',
   'commit.c',
+  'common-exit.c',
+  'common-init.c',
   'compat/nonblock.c',
   'compat/obstack.c',
   'compat/terminal.c',
Patrick Steinhardt Jan. 17, 2025, 9:44 a.m. UTC | #4
On Thu, Jan 16, 2025 at 01:02:33PM -0800, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Unfortunately, build based on meson does not seem to like the
> > init_git() thing.  Perhaps we need to add some missing files to
> > relevant lists in meson.build file or something silly like that?
> >
> > https://github.com/git/git/actions/runs/12800227601/job/35687658673#step:8:961
> 
> I needed the following to get "meson compile" pass in my local
> environment.  I suspect that Mesonized CI jobs exercise a bit more
> than just "meson compile", so there is no guarantee that the
> following is enough, but at least hopefully it would nudge you (and
> those who may be interested in helping to build a working Rust
> bindings) in the right direction.
> 
> I think it should be squashed into the step these files are added,
> i.e. [PATCH 1/5].
> 
> Thanks.
> 
>  meson.build | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git c/meson.build w/meson.build
> index 0064eb64f5..e5ba28b47f 100644
> --- c/meson.build
> +++ w/meson.build
> @@ -245,6 +245,8 @@ libgit_sources = [
>    'commit-graph.c',
>    'commit-reach.c',
>    'commit.c',
> +  'common-exit.c',
> +  'common-init.c',
>    'compat/nonblock.c',
>    'compat/obstack.c',
>    'compat/terminal.c',

Yeah, I remember having the same hunk while Meson was still in-flight in
order to make it compatible with "seen". So this should be sufficient.

Patrick
Josh Steadmon Jan. 21, 2025, 11:21 p.m. UTC | #5
On 2025.01.17 10:44, Patrick Steinhardt wrote:
> On Thu, Jan 16, 2025 at 01:02:33PM -0800, Junio C Hamano wrote:
> > Junio C Hamano <gitster@pobox.com> writes:
> > 
> > > Unfortunately, build based on meson does not seem to like the
> > > init_git() thing.  Perhaps we need to add some missing files to
> > > relevant lists in meson.build file or something silly like that?
> > >
> > > https://github.com/git/git/actions/runs/12800227601/job/35687658673#step:8:961
> > 
> > I needed the following to get "meson compile" pass in my local
> > environment.  I suspect that Mesonized CI jobs exercise a bit more
> > than just "meson compile", so there is no guarantee that the
> > following is enough, but at least hopefully it would nudge you (and
> > those who may be interested in helping to build a working Rust
> > bindings) in the right direction.
> > 
> > I think it should be squashed into the step these files are added,
> > i.e. [PATCH 1/5].
> > 
> > Thanks.
> > 
> >  meson.build | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git c/meson.build w/meson.build
> > index 0064eb64f5..e5ba28b47f 100644
> > --- c/meson.build
> > +++ w/meson.build
> > @@ -245,6 +245,8 @@ libgit_sources = [
> >    'commit-graph.c',
> >    'commit-reach.c',
> >    'commit.c',
> > +  'common-exit.c',
> > +  'common-init.c',
> >    'compat/nonblock.c',
> >    'compat/obstack.c',
> >    'compat/terminal.c',
> 
> Yeah, I remember having the same hunk while Meson was still in-flight in
> order to make it compatible with "seen". So this should be sufficient.
> 
> Patrick

ACK, sorry about that. I missed the Meson developments while I was
off-list, so it wasn't on my radar. I'll make sure to squash the fix in
to V7. Thanks both.
diff mbox series

Patch

diff --git a/Makefile b/Makefile
index 97e8385b66..27e68ac039 100644
--- a/Makefile
+++ b/Makefile
@@ -981,6 +981,8 @@  LIB_OBJS += combine-diff.o
 LIB_OBJS += commit-graph.o
 LIB_OBJS += commit-reach.o
 LIB_OBJS += commit.o
+LIB_OBJS += common-exit.o
+LIB_OBJS += common-init.o
 LIB_OBJS += compat/nonblock.o
 LIB_OBJS += compat/obstack.o
 LIB_OBJS += compat/terminal.o
diff --git a/common-exit.c b/common-exit.c
new file mode 100644
index 0000000000..1aaa538be3
--- /dev/null
+++ b/common-exit.c
@@ -0,0 +1,26 @@ 
+#include "git-compat-util.h"
+#include "trace2.h"
+
+static void check_bug_if_BUG(void)
+{
+	if (!bug_called_must_BUG)
+		return;
+	BUG("on exit(): had bug() call(s) in this process without explicit BUG_if_bug()");
+}
+
+/* We wrap exit() to call common_exit() in git-compat-util.h */
+int common_exit(const char *file, int line, int code)
+{
+	/*
+	 * For non-POSIX systems: Take the lowest 8 bits of the "code"
+	 * to e.g. turn -1 into 255. On a POSIX system this is
+	 * redundant, see exit(3) and wait(2), but as it doesn't harm
+	 * anything there we don't need to guard this with an "ifdef".
+	 */
+	code &= 0xff;
+
+	check_bug_if_BUG();
+	trace2_cmd_exit_fl(file, line, code);
+
+	return code;
+}
diff --git a/common-init.c b/common-init.c
new file mode 100644
index 0000000000..5cc73f058c
--- /dev/null
+++ b/common-init.c
@@ -0,0 +1,63 @@ 
+#define USE_THE_REPOSITORY_VARIABLE
+
+#include "git-compat-util.h"
+#include "common-init.h"
+#include "exec-cmd.h"
+#include "gettext.h"
+#include "attr.h"
+#include "repository.h"
+#include "setup.h"
+#include "strbuf.h"
+#include "trace2.h"
+
+/*
+ * Many parts of Git have subprograms communicate via pipe, expect the
+ * upstream of a pipe to die with SIGPIPE when the downstream of a
+ * pipe does not need to read all that is written.  Some third-party
+ * programs that ignore or block SIGPIPE for their own reason forget
+ * to restore SIGPIPE handling to the default before spawning Git and
+ * break this carefully orchestrated machinery.
+ *
+ * Restore the way SIGPIPE is handled to default, which is what we
+ * expect.
+ */
+static void restore_sigpipe_to_default(void)
+{
+	sigset_t unblock;
+
+	sigemptyset(&unblock);
+	sigaddset(&unblock, SIGPIPE);
+	sigprocmask(SIG_UNBLOCK, &unblock, NULL);
+	signal(SIGPIPE, SIG_DFL);
+}
+
+void init_git(const char **argv)
+{
+	struct strbuf tmp = STRBUF_INIT;
+
+	trace2_initialize_clock();
+
+	/*
+	 * Always open file descriptors 0/1/2 to avoid clobbering files
+	 * in die().  It also avoids messing up when the pipes are dup'ed
+	 * onto stdin/stdout/stderr in the child processes we spawn.
+	 */
+	sanitize_stdfds();
+	restore_sigpipe_to_default();
+
+	git_resolve_executable_dir(argv[0]);
+
+	setlocale(LC_CTYPE, "");
+	git_setup_gettext();
+
+	initialize_repository(the_repository);
+
+	attr_start();
+
+	trace2_initialize();
+	trace2_cmd_start(argv);
+	trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP);
+
+	if (!strbuf_getcwd(&tmp))
+		tmp_original_cwd = strbuf_detach(&tmp, NULL);
+}
diff --git a/common-init.h b/common-init.h
new file mode 100644
index 0000000000..3e6db20cae
--- /dev/null
+++ b/common-init.h
@@ -0,0 +1,6 @@ 
+#ifndef COMMON_INIT_H
+#define COMMON_INIT_H
+
+void init_git(const char **argv);
+
+#endif /* COMMON_INIT_H */
diff --git a/common-main.c b/common-main.c
index 8e68ac9e42..6b7ab077b0 100644
--- a/common-main.c
+++ b/common-main.c
@@ -1,92 +1,13 @@ 
-#define USE_THE_REPOSITORY_VARIABLE
-
 #include "git-compat-util.h"
-#include "exec-cmd.h"
-#include "gettext.h"
-#include "attr.h"
-#include "repository.h"
-#include "setup.h"
-#include "strbuf.h"
-#include "trace2.h"
-
-/*
- * Many parts of Git have subprograms communicate via pipe, expect the
- * upstream of a pipe to die with SIGPIPE when the downstream of a
- * pipe does not need to read all that is written.  Some third-party
- * programs that ignore or block SIGPIPE for their own reason forget
- * to restore SIGPIPE handling to the default before spawning Git and
- * break this carefully orchestrated machinery.
- *
- * Restore the way SIGPIPE is handled to default, which is what we
- * expect.
- */
-static void restore_sigpipe_to_default(void)
-{
-	sigset_t unblock;
-
-	sigemptyset(&unblock);
-	sigaddset(&unblock, SIGPIPE);
-	sigprocmask(SIG_UNBLOCK, &unblock, NULL);
-	signal(SIGPIPE, SIG_DFL);
-}
+#include "common-init.h"
 
 int main(int argc, const char **argv)
 {
 	int result;
-	struct strbuf tmp = STRBUF_INIT;
-
-	trace2_initialize_clock();
-
-	/*
-	 * Always open file descriptors 0/1/2 to avoid clobbering files
-	 * in die().  It also avoids messing up when the pipes are dup'ed
-	 * onto stdin/stdout/stderr in the child processes we spawn.
-	 */
-	sanitize_stdfds();
-	restore_sigpipe_to_default();
-
-	git_resolve_executable_dir(argv[0]);
-
-	setlocale(LC_CTYPE, "");
-	git_setup_gettext();
-
-	initialize_repository(the_repository);
-
-	attr_start();
-
-	trace2_initialize();
-	trace2_cmd_start(argv);
-	trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP);
-
-	if (!strbuf_getcwd(&tmp))
-		tmp_original_cwd = strbuf_detach(&tmp, NULL);
 
+	init_git(argv);
 	result = cmd_main(argc, argv);
 
 	/* Not exit(3), but a wrapper calling our common_exit() */
 	exit(result);
 }
-
-static void check_bug_if_BUG(void)
-{
-	if (!bug_called_must_BUG)
-		return;
-	BUG("on exit(): had bug() call(s) in this process without explicit BUG_if_bug()");
-}
-
-/* We wrap exit() to call common_exit() in git-compat-util.h */
-int common_exit(const char *file, int line, int code)
-{
-	/*
-	 * For non-POSIX systems: Take the lowest 8 bits of the "code"
-	 * to e.g. turn -1 into 255. On a POSIX system this is
-	 * redundant, see exit(3) and wait(2), but as it doesn't harm
-	 * anything there we don't need to guard this with an "ifdef".
-	 */
-	code &= 0xff;
-
-	check_bug_if_BUG();
-	trace2_cmd_exit_fl(file, line, code);
-
-	return code;
-}