Message ID | 200909231525.n8NFP87C029957@hydrogen.msp.redhat.com (mailing list archive) |
---|---|
State | Rejected, archived |
Delegated to: | Alasdair Kergon |
Headers | show |
On Wed, Sep 23, 2009 at 10:25:08AM -0500, Jon Brassow wrote: > Don't insert into hash before an existing chunk. > Most inserts are after an existing extent anyway so this code is used > very rarely. Again, I'm not particularly keen on taking this one. Some systems are very short of memory and everything we can save matters and we have no statistics about how "rare" this actually is. Is it really that difficult to split into two when removing? Alasdair -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Thu, 24 Sep 2009, Alasdair G Kergon wrote: > On Wed, Sep 23, 2009 at 10:25:08AM -0500, Jon Brassow wrote: > > Don't insert into hash before an existing chunk. > > Most inserts are after an existing extent anyway so this code is used > > very rarely. > > Again, I'm not particularly keen on taking this one. Some systems are very > short of memory and everything we can save matters and we have no statistics > about how "rare" this actually is. > > Is it really that difficult to split into two when removing? It is code bloat --- it would increase coding/reviewing/testing time (the splitting would have to be simulated because it happens rarely). Also, it could lead to the situation that memory consumption starts growing when merging. Mikulas > Alasdair > -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Thu, 24 Sep 2009, Mikulas Patocka wrote: > On Thu, 24 Sep 2009, Alasdair G Kergon wrote: > > > On Wed, Sep 23, 2009 at 10:25:08AM -0500, Jon Brassow wrote: > > > Don't insert into hash before an existing chunk. > > > Most inserts are after an existing extent anyway so this code is used > > > very rarely. > > > > Again, I'm not particularly keen on taking this one. Some systems are very > > short of memory and everything we can save matters and we have no statistics > > about how "rare" this actually is. > > > > Is it really that difficult to split into two when removing? > > It is code bloat --- it would increase coding/reviewing/testing time (the > splitting would have to be simulated because it happens rarely). > > Also, it could lead to the situation that memory consumption starts > growing when merging. > > Mikulas These backmerges happen if someone writes to the logical volume backwards. Do you know any common usage scenario that does it? Mikulas -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Thu, Sep 24 2009 at 10:15am -0400, Mikulas Patocka <mpatocka@redhat.com> wrote: > On Thu, 24 Sep 2009, Mikulas Patocka wrote: > > > On Thu, 24 Sep 2009, Alasdair G Kergon wrote: > > > > > On Wed, Sep 23, 2009 at 10:25:08AM -0500, Jon Brassow wrote: > > > > Don't insert into hash before an existing chunk. > > > > Most inserts are after an existing extent anyway so this code is used > > > > very rarely. > > > > > > Again, I'm not particularly keen on taking this one. Some systems are very > > > short of memory and everything we can save matters and we have no statistics > > > about how "rare" this actually is. > > > > > > Is it really that difficult to split into two when removing? > > > > It is code bloat --- it would increase coding/reviewing/testing time (the > > splitting would have to be simulated because it happens rarely). > > > > Also, it could lead to the situation that memory consumption starts > > growing when merging. > > > > Mikulas > > These backmerges happen if someone writes to the logical volume backwards. > Do you know any common usage scenario that does it? I was able to reliably reproduce backmerges simply by taking a snapshot of a 32G LV and then formatting that origin with ext3. This resulted in a ~9% copy-out to the snapshot. AFAIK mkfs.ext3 does write backwards for updating block groups. In practice it clearly does because without disabling the insertion before other chunks I would reliably BUG in clear_exception (as I shared earlier today on dm-devel). Mike -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Thu, 24 Sep 2009, Mikulas Patocka wrote: > On Thu, 24 Sep 2009, Mikulas Patocka wrote: > > > On Thu, 24 Sep 2009, Alasdair G Kergon wrote: > > > > > On Wed, Sep 23, 2009 at 10:25:08AM -0500, Jon Brassow wrote: > > > > Don't insert into hash before an existing chunk. > > > > Most inserts are after an existing extent anyway so this code is used > > > > very rarely. > > > > > > Again, I'm not particularly keen on taking this one. Some systems are very > > > short of memory and everything we can save matters and we have no statistics > > > about how "rare" this actually is. > > > > > > Is it really that difficult to split into two when removing? > > > > It is code bloat --- it would increase coding/reviewing/testing time (the > > splitting would have to be simulated because it happens rarely). > > > > Also, it could lead to the situation that memory consumption starts > > growing when merging. > > > > Mikulas > > These backmerges happen if someone writes to the logical volume backwards. > Do you know any common usage scenario that does it? > > Mikulas I was wrong about this. If someone writes backwards, (from chunk 100 to 97), it will be allocated 100->2, 99->3, 98->4, 97->5 and no back-merge will happen. The backmerge will happen if the disk driver finishes the writes backwards: if someone writes to chunks 100-103, they get reallocated to 2-5 and copying starts. If the disk driver finishes the copies backwards (it first finishes 103->5, then 102->4, then 101->3 and finally 100->2), the exceptions will be added in this order to the table and they'll be back-merged. I think that disk shuffling writes is unlikely. There was already a fix to make kcopyd not reorder writes (b673c3a8192e28f13e2050a4b82c1986be92cc15, 2.6.28). On the other hand, merging could be implemented even without splitting (because the merge comes either from the beginning or from the end of the range), so we can go with backmerges. Mikulas -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Thu, Sep 24 2009 at 8:48am -0400, Alasdair G Kergon <agk@redhat.com> wrote: > On Wed, Sep 23, 2009 at 10:25:08AM -0500, Jon Brassow wrote: > > Don't insert into hash before an existing chunk. > > Most inserts are after an existing extent anyway so this code is used > > very rarely. > > Again, I'm not particularly keen on taking this one. Some systems are very > short of memory and everything we can save matters and we have no statistics > about how "rare" this actually is. > > Is it really that difficult to split into two when removing? Jon, Mikulas and I examined/discussed this in detail today. We had a nice break-through: snapshot-merge no longer needs to avoid inserting before an existing chunk. As such snapshot-merge no longer imposes less efficient packing of the exception cache. We needed a small change to the snapshot-merge clear_exception() code so that it increments e->old_chunk and e->new_chunk when it makes its call to dm_consecutive_chunk_count_dec(), e.g.: if (dm_consecutive_chunk_count(e)) { if ((de->old_chunk == e->old_chunk) && (de->new_chunk == dm_chunk_number(e->new_chunk))) { e->old_chunk++; e->new_chunk++; } dm_consecutive_chunk_count_dec(e); } This is the inverse of what is done in dm_insert_exception(): /* Insert before an existing chunk? */ if (new_e->old_chunk == (e->old_chunk - 1) && new_e->new_chunk == (dm_chunk_number(e->new_chunk) - 1)) { dm_consecutive_chunk_count_inc(e); e->old_chunk--; e->new_chunk--; dm_free_exception(eh, new_e); return; } Updated DM snapshot-merge patches are available here: http://people.redhat.com/msnitzer/patches/snapshot-merge/kernel_unified/2.6.31/ -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
Index: linux-2.6/drivers/md/dm-snap.c =================================================================== --- linux-2.6.orig/drivers/md/dm-snap.c +++ linux-2.6/drivers/md/dm-snap.c @@ -478,16 +478,6 @@ static void insert_completed_exception(s return; } - /* Insert before an existing chunk? */ - if (new_e->old_chunk == (e->old_chunk - 1) && - new_e->new_chunk == (dm_chunk_number(e->new_chunk) - 1)) { - dm_consecutive_chunk_count_inc(e); - e->old_chunk--; - e->new_chunk--; - free_exception(new_e); - return; - } - if (new_e->old_chunk > e->old_chunk) break; }