[v2,7/7] midx.c: avoid adding preferred objects twice

Message ID 887ab9485faa21f5a5cd889d97895ed41013803d.1661197803.git.me@ttaylorr.com (mailing list archive)
State Accepted
Commit 99e4d084ffc4c6f8cb28ec61fdbb44facdd47ac7
Series midx: permit changing the preferred pack when reusing the MIDX

Commit Message

Taylor Blau Aug. 22, 2022, 7:50 p.m. UTC
The last commit changes the behavior of midx.c's `get_sorted_entries()`
function to handle the case of writing a MIDX bitmap while reusing an
existing MIDX and changing the identity of the preferred pack
separately.

As part of this change, all objects from the (new) preferred pack are
added to the fanout table in a separate pass. Since these copies of the
objects all have their preferred bits set, any duplicates will be
resolved in their favor.

Importantly, this includes any copies of those same objects that come
from the existing MIDX. We know at the time of adding them that they'll
be redundant if their source pack is the (new) preferred one, so we can
avoid adding them to the list in this case.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 midx.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)
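
The skip described in the commit message can be illustrated with a small, self-contained sketch. It is not the code from midx.c: `struct toy_entry`, `toy_add_midx_entries()` and `toy_example()` are hypothetical stand-ins that model only the pack ID of each entry, and the sketch assumes, as the patch does, that a preferred pack of -1 means "none".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a MIDX entry; only the owning pack matters here. */
struct toy_entry {
	uint32_t pack_int_id;
	uint32_t oid; /* toy object ID, not a real hash */
};

/*
 * Copy entries from an existing (toy) MIDX into dst, skipping any entry
 * whose source pack is the new preferred pack. Those objects are assumed
 * to be added in a separate pass with their preferred bit set, so copying
 * them here would only create duplicates to be discarded later.
 *
 * preferred_pack == -1 means there is no preferred pack.
 * Returns the number of entries written to dst.
 */
static size_t toy_add_midx_entries(struct toy_entry *dst,
				   const struct toy_entry *src, size_t nr,
				   int preferred_pack)
{
	size_t i, out = 0;

	assert(dst && src);

	for (i = 0; i < nr; i++) {
		if (preferred_pack > -1 &&
		    (uint32_t)preferred_pack == src[i].pack_int_id)
			continue; /* added separately */
		dst[out++] = src[i];
	}
	return out;
}

/* Four entries, two of which live in pack 1. */
static size_t toy_example(int preferred_pack)
{
	static const struct toy_entry src[] = {
		{ 0, 10 }, { 1, 11 }, { 2, 12 }, { 1, 13 },
	};
	struct toy_entry dst[4];

	return toy_add_midx_entries(dst, src, 4, preferred_pack);
}
```

With pack 1 preferred, `toy_example(1)` copies only the two entries from packs 0 and 2. With no preferred pack, `toy_example(-1)` copies all four.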

Comments

Derrick Stolee Aug. 23, 2022, 4:22 p.m. UTC | #1
On 8/22/2022 3:50 PM, Taylor Blau wrote:
> The last commit changes the behavior of midx.c's `get_sorted_entries()`
> function to handle the case of writing a MIDX bitmap while reusing an
> existing MIDX and changing the identity of the preferred pack
> separately.
> 
> As part of this change, all objects from the (new) preferred pack are
> added to the fanout table in a separate pass. Since these copies of the
> objects all have their preferred bits set, any duplicates will be
> resolved in their favor.
> 
> Importantly, this includes any copies of those same objects that come
> from the existing MIDX. We know at the time of adding them that they'll
> be redundant if their source pack is the (new) preferred one, so we can
> avoid adding them to the list in this case.

Good call to reduce memory requirements.

> @@ -605,6 +606,15 @@ static void midx_fanout_add_midx_fanout(struct midx_fanout *fanout,
>  	end = ntohl(m->chunk_oid_fanout[cur_fanout]);
>  
>  	for (cur_object = start; cur_object < end; cur_object++) {
> +		if ((preferred_pack > -1) &&
> +		    (preferred_pack == nth_midxed_pack_int_id(m, cur_object))) {

nit: you don't need the extra parentheses here.

Thanks,
-Stolee

Patch

diff --git a/midx.c b/midx.c
index bd1d27090e..148ecc2f14 100644
--- a/midx.c
+++ b/midx.c
@@ -595,7 +595,8 @@  static void midx_fanout_sort(struct midx_fanout *fanout)
 
 static void midx_fanout_add_midx_fanout(struct midx_fanout *fanout,
 					struct multi_pack_index *m,
-					uint32_t cur_fanout)
+					uint32_t cur_fanout,
+					int preferred_pack)
 {
 	uint32_t start = 0, end;
 	uint32_t cur_object;
@@ -605,6 +606,15 @@  static void midx_fanout_add_midx_fanout(struct midx_fanout *fanout,
 	end = ntohl(m->chunk_oid_fanout[cur_fanout]);
 
 	for (cur_object = start; cur_object < end; cur_object++) {
+		if ((preferred_pack > -1) &&
+		    (preferred_pack == nth_midxed_pack_int_id(m, cur_object))) {
+			/*
+			 * Objects from preferred packs are added
+			 * separately.
+			 */
+			continue;
+		}
+
 		midx_fanout_grow(fanout, fanout->nr + 1);
 		nth_midxed_pack_midx_entry(m,
 					   &fanout->entries[fanout->nr],
@@ -680,7 +690,8 @@  static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
 		fanout.nr = 0;
 
 		if (m)
-			midx_fanout_add_midx_fanout(&fanout, m, cur_fanout);
+			midx_fanout_add_midx_fanout(&fanout, m, cur_fanout,
+						    preferred_pack);
 
 		for (cur_pack = start_pack; cur_pack < nr_packs; cur_pack++) {
 			int preferred = cur_pack == preferred_pack;