diff mbox series

[v4,02/19] convert: add [async_]convert_to_working_tree_ca() variants

Message ID fc03417592c25a111fbf77f9e5b9e5468b9070ae.1604521275.git.matheus.bernardino@usp.br (mailing list archive)
State New, archived
Headers show
Series Parallel Checkout (part I) | expand

Commit Message

Matheus Tavares Nov. 4, 2020, 8:33 p.m. UTC
From: Jeff Hostetler <jeffhost@microsoft.com>

Separate the attribute gathering from the actual conversion by adding
_ca() variants of the conversion functions. These variants receive a
precomputed 'struct conv_attrs', not relying, thus, on a index state.
They will be used in a future patch adding parallel checkout support,
for two reasons:

- We will already load the conversion attributes in checkout_entry(),
  before conversion, to decide whether a path is eligible for parallel
  checkout. Therefore, it would be wasteful to load them again later,
  for the actual conversion.

- The parallel workers will be responsible for reading, converting and
  writing blobs to the working tree. They won't have access to the main
  process' index state, so they cannot load the attributes. Instead,
  they will receive the preloaded ones and call the _ca() variant of
  the conversion functions. Furthermore, the attributes machinery is
  optimized to handle paths in sequential order, so it's better to leave
  it for the main process, anyway.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
[matheus.bernardino: squash, remove one function definition and reword]
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 convert.c | 47 ++++++++++++++++++++++++-----------------------
 convert.h | 37 ++++++++++++++++++++++++++++---------
 2 files changed, 52 insertions(+), 32 deletions(-)

Comments

Christian Couder Dec. 5, 2020, 11:10 a.m. UTC | #1
On Wed, Nov 4, 2020 at 9:33 PM Matheus Tavares
<matheus.bernardino@usp.br> wrote:
>
> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Separate the attribute gathering from the actual conversion by adding
> _ca() variants of the conversion functions. These variants receive a
> precomputed 'struct conv_attrs', not relying, thus, on a index state.

s/a/an/

> They will be used in a future patch adding parallel checkout support,
> for two reasons:
>
> - We will already load the conversion attributes in checkout_entry(),
>   before conversion, to decide whether a path is eligible for parallel
>   checkout. Therefore, it would be wasteful to load them again later,
>   for the actual conversion.
>
> - The parallel workers will be responsible for reading, converting and
>   writing blobs to the working tree. They won't have access to the main
>   process' index state, so they cannot load the attributes. Instead,
>   they will receive the preloaded ones and call the _ca() variant of
>   the conversion functions. Furthermore, the attributes machinery is
>   optimized to handle paths in sequential order, so it's better to leave
>   it for the main process, anyway.

Well explained.

> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> [matheus.bernardino: squash, remove one function definition and reword]

<rant++>I'd rather have "Reworked-by: Matheus Tavares
<matheus.bernardino@usp.br>" or "Improved-by: Matheus Tavares
<matheus.bernardino@usp.br>" than lines such as the above
one.</rant++>

> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>

> @@ -1447,7 +1447,7 @@ void convert_to_git_filter_fd(const struct index_state *istate,
>         ident_to_git(dst->buf, dst->len, dst, ca.ident);
>  }
>
> -static int convert_to_working_tree_internal(const struct index_state *istate,
> +static int convert_to_working_tree_internal(const struct conv_attrs *ca,

It is still internal, but for consistency it might be better to add
"_ca" to the name of this function too, as we now pass it a "ca"
instead of an "istate".
Matheus Tavares Dec. 5, 2020, 10:20 p.m. UTC | #2
On Sat, Dec 5, 2020 at 8:10 AM Christian Couder
<christian.couder@gmail.com> wrote:
>
> On Wed, Nov 4, 2020 at 9:33 PM Matheus Tavares
> <matheus.bernardino@usp.br> wrote:
> >
> > From: Jeff Hostetler <jeffhost@microsoft.com>
> >
> > Separate the attribute gathering from the actual conversion by adding
> > _ca() variants of the conversion functions. These variants receive a
> > precomputed 'struct conv_attrs', not relying, thus, on a index state.
>
> s/a/an/

Good catch, thanks.

> > They will be used in a future patch adding parallel checkout support,
> > for two reasons:
> >
> > - We will already load the conversion attributes in checkout_entry(),
> >   before conversion, to decide whether a path is eligible for parallel
> >   checkout. Therefore, it would be wasteful to load them again later,
> >   for the actual conversion.
> >
> > - The parallel workers will be responsible for reading, converting and
> >   writing blobs to the working tree. They won't have access to the main
> >   process' index state, so they cannot load the attributes. Instead,
> >   they will receive the preloaded ones and call the _ca() variant of
> >   the conversion functions. Furthermore, the attributes machinery is
> >   optimized to handle paths in sequential order, so it's better to leave
> >   it for the main process, anyway.
>
> Well explained.
>
> > Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> > [matheus.bernardino: squash, remove one function definition and reword]
>
> <rant++>I'd rather have "Reworked-by: Matheus Tavares
> <matheus.bernardino@usp.br>" or "Improved-by: Matheus Tavares
> <matheus.bernardino@usp.br>" than lines such as the above
> one.</rant++>
>
> > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
>
> > @@ -1447,7 +1447,7 @@ void convert_to_git_filter_fd(const struct index_state *istate,
> >         ident_to_git(dst->buf, dst->len, dst, ca.ident);
> >  }
> >
> > -static int convert_to_working_tree_internal(const struct index_state *istate,
> > +static int convert_to_working_tree_internal(const struct conv_attrs *ca,
>
> It is still internal, but for consistency it might be better to add
> "_ca" to the name of this function too, as we now pass it a "ca"
> instead of an "istate".

Right, will do.
diff mbox series

Patch

diff --git a/convert.c b/convert.c
index f13b001273..ab3d517233 100644
--- a/convert.c
+++ b/convert.c
@@ -1447,7 +1447,7 @@  void convert_to_git_filter_fd(const struct index_state *istate,
 	ident_to_git(dst->buf, dst->len, dst, ca.ident);
 }
 
-static int convert_to_working_tree_internal(const struct index_state *istate,
+static int convert_to_working_tree_internal(const struct conv_attrs *ca,
 					    const char *path, const char *src,
 					    size_t len, struct strbuf *dst,
 					    int normalizing,
@@ -1455,11 +1455,8 @@  static int convert_to_working_tree_internal(const struct index_state *istate,
 					    struct delayed_checkout *dco)
 {
 	int ret = 0, ret_filter = 0;
-	struct conv_attrs ca;
-
-	convert_attrs(istate, &ca, path);
 
-	ret |= ident_to_worktree(src, len, dst, ca.ident);
+	ret |= ident_to_worktree(src, len, dst, ca->ident);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
@@ -1469,49 +1466,53 @@  static int convert_to_working_tree_internal(const struct index_state *istate,
 	 * is a smudge or process filter (even if the process filter doesn't
 	 * support smudge).  The filters might expect CRLFs.
 	 */
-	if ((ca.drv && (ca.drv->smudge || ca.drv->process)) || !normalizing) {
-		ret |= crlf_to_worktree(src, len, dst, ca.crlf_action);
+	if ((ca->drv && (ca->drv->smudge || ca->drv->process)) || !normalizing) {
+		ret |= crlf_to_worktree(src, len, dst, ca->crlf_action);
 		if (ret) {
 			src = dst->buf;
 			len = dst->len;
 		}
 	}
 
-	ret |= encode_to_worktree(path, src, len, dst, ca.working_tree_encoding);
+	ret |= encode_to_worktree(path, src, len, dst, ca->working_tree_encoding);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
 	}
 
 	ret_filter = apply_filter(
-		path, src, len, -1, dst, ca.drv, CAP_SMUDGE, meta, dco);
-	if (!ret_filter && ca.drv && ca.drv->required)
-		die(_("%s: smudge filter %s failed"), path, ca.drv->name);
+		path, src, len, -1, dst, ca->drv, CAP_SMUDGE, meta, dco);
+	if (!ret_filter && ca->drv && ca->drv->required)
+		die(_("%s: smudge filter %s failed"), path, ca->drv->name);
 
 	return ret | ret_filter;
 }
 
-int async_convert_to_working_tree(const struct index_state *istate,
-				  const char *path, const char *src,
-				  size_t len, struct strbuf *dst,
-				  const struct checkout_metadata *meta,
-				  void *dco)
+int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
+				     const char *path, const char *src,
+				     size_t len, struct strbuf *dst,
+				     const struct checkout_metadata *meta,
+				     void *dco)
 {
-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, dco);
+	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, dco);
 }
 
-int convert_to_working_tree(const struct index_state *istate,
-			    const char *path, const char *src,
-			    size_t len, struct strbuf *dst,
-			    const struct checkout_metadata *meta)
+int convert_to_working_tree_ca(const struct conv_attrs *ca,
+			       const char *path, const char *src,
+			       size_t len, struct strbuf *dst,
+			       const struct checkout_metadata *meta)
 {
-	return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, NULL);
+	return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, NULL);
 }
 
 int renormalize_buffer(const struct index_state *istate, const char *path,
 		       const char *src, size_t len, struct strbuf *dst)
 {
-	int ret = convert_to_working_tree_internal(istate, path, src, len, dst, 1, NULL, NULL);
+	struct conv_attrs ca;
+	int ret;
+
+	convert_attrs(istate, &ca, path);
+	ret = convert_to_working_tree_internal(&ca, path, src, len, dst, 1, NULL, NULL);
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
diff --git a/convert.h b/convert.h
index 5678e99922..a4838b5e5c 100644
--- a/convert.h
+++ b/convert.h
@@ -99,15 +99,34 @@  const char *get_convert_attr_ascii(const struct index_state *istate,
 int convert_to_git(const struct index_state *istate,
 		   const char *path, const char *src, size_t len,
 		   struct strbuf *dst, int conv_flags);
-int convert_to_working_tree(const struct index_state *istate,
-			    const char *path, const char *src,
-			    size_t len, struct strbuf *dst,
-			    const struct checkout_metadata *meta);
-int async_convert_to_working_tree(const struct index_state *istate,
-				  const char *path, const char *src,
-				  size_t len, struct strbuf *dst,
-				  const struct checkout_metadata *meta,
-				  void *dco);
+int convert_to_working_tree_ca(const struct conv_attrs *ca,
+			       const char *path, const char *src,
+			       size_t len, struct strbuf *dst,
+			       const struct checkout_metadata *meta);
+int async_convert_to_working_tree_ca(const struct conv_attrs *ca,
+				     const char *path, const char *src,
+				     size_t len, struct strbuf *dst,
+				     const struct checkout_metadata *meta,
+				     void *dco);
+static inline int convert_to_working_tree(const struct index_state *istate,
+					  const char *path, const char *src,
+					  size_t len, struct strbuf *dst,
+					  const struct checkout_metadata *meta)
+{
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return convert_to_working_tree_ca(&ca, path, src, len, dst, meta);
+}
+static inline int async_convert_to_working_tree(const struct index_state *istate,
+						const char *path, const char *src,
+						size_t len, struct strbuf *dst,
+						const struct checkout_metadata *meta,
+						void *dco)
+{
+	struct conv_attrs ca;
+	convert_attrs(istate, &ca, path);
+	return async_convert_to_working_tree_ca(&ca, path, src, len, dst, meta, dco);
+}
 int async_query_available_blobs(const char *cmd,
 				struct string_list *available_paths);
 int renormalize_buffer(const struct index_state *istate,