Message ID | 20150813112316.628f0de6@noble (mailing list archive) |
---|---|
State | Rejected, archived |
Delegated to: | Mike Snitzer |
Headers | show |
On Thu, 13 Aug 2015, NeilBrown wrote: > On Wed, 12 Aug 2015 10:16:17 -0400 (EDT) Mikulas Patocka > <mpatocka@redhat.com> wrote: > > > > > > > On Wed, 12 Aug 2015, NeilBrown wrote: > > > > > Ahh - thanks for the explanation. That makes sense. > > > Presumably you need to wait for old reads that you don't have a read > > > returning old data after a write has confirmed the new data is written? > > > Is there some other reason? > > > > > > As soon as the new "proper" exception is installed, no new reads will > > > use the old address, so it should be safe to wait without holding the > > > lock. > > > > > > So what about this? > > > > This is wrong too - when you add the exception to the complete hash table > > and drop the lock, writes to the chunk in the origin target are can modify > > the origin volume. These writes race with pending reads to the snapshot > > (__chunk_is_tracked tests for those reads). If the I/O scheduler schedules > > the write to the origin before the read from the snapshot, the read from > > the snapshot returns corrupted data. > > > > There can be more problems with snapshot merging - if the chunk is on the > > complete hash table, the merging code assumes that it can be safely > > merged. > > > > Thanks again. > > I came up with the following which should address the origin-write > issue, but I'm not so sure about snapshot merging, and it is becoming a > lot less simple that I had hoped. > > I'll see if I can come up with code for a more general solution that is > still localised to dm - along the lines of my bio_split suggestion. > > Thanks, > NeilBrown > > > > diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c > index 7c82d3ccce87..7f9e1b0429c8 100644 > --- a/drivers/md/dm-snap.c > +++ b/drivers/md/dm-snap.c > @@ -1461,14 +1461,22 @@ static void pending_complete(struct dm_snap_pending_exception *pe, int success) > goto out; > } > > - /* Check for conflicting reads */ > - __check_for_conflicting_io(s, pe->e.old_chunk); > + /* Add proper exception. Now all reads will be > + * redirected, so no new reads can be started on > + * 'old_chunk'. > + */ > + dm_insert_exception(&s->complete, e); > + > + /* Drain conflicting reads */ > + if (__chunk_is_tracked(s, pe->e.old_chunk)) { > + up_write(&s->lock); > + __check_for_conflicting_io(s, pe->e.old_chunk); > + down_write(&s->lock); > + } > > /* > - * Add a proper exception, and remove the > - * in-flight exception from the list. > + * And now we can remove the temporary exception. > */ > - dm_insert_exception(&s->complete, e); > > out: > dm_remove_exception(&pe->e); > @@ -2089,17 +2097,17 @@ static int __origin_write(struct list_head *snapshots, sector_t sector, > */ > chunk = sector_to_chunk(snap->store, sector); > > - /* > - * Check exception table to see if block > - * is already remapped in this snapshot > - * and trigger an exception if not. > - */ > - e = dm_lookup_exception(&snap->complete, chunk); > - if (e) > - goto next_snapshot; > - > pe = __lookup_pending_exception(snap, chunk); > if (!pe) { > + /* > + * Check exception table to see if block > + * is already remapped in this snapshot > + * and trigger an exception if not. > + */ > + e = dm_lookup_exception(&snap->complete, chunk); > + if (e) > + goto next_snapshot; > + > up_write(&snap->lock); > pe = alloc_pending_exception(snap); > down_write(&snap->lock); You forgot about another race after these calls to up_write/down_write - after the lock is taken again, you must call both dm_lookup_exception and __lookup_pending_exception. Mikulas -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c index 7c82d3ccce87..7f9e1b0429c8 100644 --- a/drivers/md/dm-snap.c +++ b/drivers/md/dm-snap.c @@ -1461,14 +1461,22 @@ static void pending_complete(struct dm_snap_pending_exception *pe, int success) goto out; } - /* Check for conflicting reads */ - __check_for_conflicting_io(s, pe->e.old_chunk); + /* Add proper exception. Now all reads will be + * redirected, so no new reads can be started on + * 'old_chunk'. + */ + dm_insert_exception(&s->complete, e); + + /* Drain conflicting reads */ + if (__chunk_is_tracked(s, pe->e.old_chunk)) { + up_write(&s->lock); + __check_for_conflicting_io(s, pe->e.old_chunk); + down_write(&s->lock); + } /* - * Add a proper exception, and remove the - * in-flight exception from the list. + * And now we can remove the temporary exception. */ - dm_insert_exception(&s->complete, e); out: dm_remove_exception(&pe->e); @@ -2089,17 +2097,17 @@ static int __origin_write(struct list_head *snapshots, sector_t sector, */ chunk = sector_to_chunk(snap->store, sector); - /* - * Check exception table to see if block - * is already remapped in this snapshot - * and trigger an exception if not. - */ - e = dm_lookup_exception(&snap->complete, chunk); - if (e) - goto next_snapshot; - pe = __lookup_pending_exception(snap, chunk); if (!pe) { + /* + * Check exception table to see if block + * is already remapped in this snapshot + * and trigger an exception if not. + */ + e = dm_lookup_exception(&snap->complete, chunk); + if (e) + goto next_snapshot; + up_write(&snap->lock); pe = alloc_pending_exception(snap); down_write(&snap->lock);