[2/2] dmcache: Implement a flush message

Message ID	20130509204751.GB5712@blackbox.djwong.org (mailing list archive)
State	Deferred, archived
Headers	show Return-Path: <dm-devel-bounces@redhat.com> Date: Thu, 9 May 2013 13:47:51 -0700 From: "Darrick J. Wong" <darrick.wong@oracle.com> To: Mike Snitzer <snitzer@redhat.com> Message-ID: <20130509204751.GB5712@blackbox.djwong.org> References: <20130508214845.GA7729@blackbox.djwong.org> <20130508220526.GA24132@redhat.com> <20130509203616.GA5713@blackbox.djwong.org> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20130509203616.GA5713@blackbox.djwong.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: device-mapper development <dm-devel@redhat.com>, Joe Thornber <thornber@redhat.com>, linux-kernel@vger.kernel.org Subject: [dm-devel] [PATCH 2/2] dmcache: Implement a flush message Precedence: junk Reply-To: device-mapper development <dm-devel@redhat.com> Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com

Darrick J. Wong May 9, 2013, 8:47 p.m. UTC

Create a new 'flush' message that causes the dmcache to write all of its
metadata out to disk.  This enables us to ensure that the disk reflects
whatever's in memory without having to tear down the cache device.  This helps
me in the case where I have a cached ro fs that I can't umount and therefore
can't tear down the cache device, but want to save the cache metadata anyway.
The command syntax is as follows:

# dmsetup message mycache 0 flush now

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 drivers/md/dm-cache-target.c |    4 ++++
 1 file changed, 4 insertions(+)


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Joe Thornber May 10, 2013, 10:22 a.m. UTC | #1

On Thu, May 09, 2013 at 01:47:51PM -0700, Darrick J. Wong wrote:
> Create a new 'flush' message that causes the dmcache to write all of its
> metadata out to disk.  This enables us to ensure that the disk reflects
> whatever's in memory without having to tear down the cache device.  This helps
> me in the case where I have a cached ro fs that I can't umount and therefore
> can't tear down the cache device, but want to save the cache metadata anyway.
> The command syntax is as follows:
> 
> # dmsetup message mycache 0 flush now

Nack.

[Ignoring the ugly 'now' parameter.]

I think you're in danger of hiding the real issue.  Which is if the
target's destructor and post suspend is not being called then, as far
as dm-cache is concerned this is a crash.  Any open transactions will
be lost as it automatically rolls back.

We need to understand more why this is happening.  It's actually
harmless atm for dm-cache, because we're forced to commit before using
a new migration.  But for dm-thin you can lose writes.  Why are you
never tearing down your dm devices?

- Joe

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Darrick J. Wong May 10, 2013, 5:51 p.m. UTC | #2

On Fri, May 10, 2013 at 11:22:24AM +0100, Joe Thornber wrote:
> On Thu, May 09, 2013 at 01:47:51PM -0700, Darrick J. Wong wrote:
> > Create a new 'flush' message that causes the dmcache to write all of its
> > metadata out to disk.  This enables us to ensure that the disk reflects
> > whatever's in memory without having to tear down the cache device.  This helps
> > me in the case where I have a cached ro fs that I can't umount and therefore
> > can't tear down the cache device, but want to save the cache metadata anyway.
> > The command syntax is as follows:
> > 
> > # dmsetup message mycache 0 flush now
> 
> Nack.
> 
> [Ignoring the ugly 'now' parameter.]
> 
> I think you're in danger of hiding the real issue.  Which is if the
> target's destructor and post suspend is not being called then, as far
> as dm-cache is concerned this is a crash.  Any open transactions will
> be lost as it automatically rolls back.
> 
> We need to understand more why this is happening.  It's actually
> harmless atm for dm-cache, because we're forced to commit before using
> a new migration.  But for dm-thin you can lose writes.  Why are you
> never tearing down your dm devices?

afaict, there isn't anything in the initscripts that tears down dm devices
prior to invoking reboot(), and the kernel drivers don't have reboot notifiers
to flush things out either.  I've been told that lvm does this, but I don't see
anything in the Ubuntu or RHEL6 that would suggest a teardown script...

# dpkg -L lvm2 dmsetup libdevmapper1.02.1 libdevmapper-event1.02.1 | grep etc
/etc
/etc/lvm
/etc/lvm/lvm.conf
# grep -rn dmsetup /etc
/etc/lvm/lvm.conf:333:    # waiting for udev, run 'dmsetup udevcomplete_all' manually to wake them up.

# rpm -ql lvm2 lvm2-libs device-mapper device-mapper-event device-mapper-event-libs device-mapper-libs | grep /etc
/etc/lvm
/etc/lvm/archive
/etc/lvm/backup
/etc/lvm/cache
/etc/lvm/cache/.cache
/etc/lvm/lvm.conf
/etc/rc.d/init.d/lvm2-monitor
# grep -rn dmsetup /etc/rc* /etc/init*
/etc/rc0.d/K75netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc0.d/S01halt:22:            if /sbin/dmsetup info "$dst" | grep -q '^Open count: *0$'; then
/etc/rc0.d/S01halt:120:	    && [ "$(dmsetup status "$dst" | cut -d ' ' -f 3)" = crypt ]; then
/etc/rc1.d/K75netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc2.d/K75netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc3.d/S25netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc4.d/S25netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc5.d/S25netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc6.d/K75netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc6.d/S01reboot:22:            if /sbin/dmsetup info "$dst" | grep -q '^Open count: *0$'; then
/etc/rc6.d/S01reboot:120:	    && [ "$(dmsetup status "$dst" | cut -d ' ' -f 3)" = crypt ]; then
/etc/rc.d/rc6.d/K75netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc.d/rc6.d/S01reboot:22:            if /sbin/dmsetup info "$dst" | grep -q '^Open count: *0$'; then
/etc/rc.d/rc6.d/S01reboot:120:	    && [ "$(dmsetup status "$dst" | cut -d ' ' -f 3)" = crypt ]; then
/etc/rc.d/rc0.d/K75netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc.d/rc0.d/S01halt:22:            if /sbin/dmsetup info "$dst" | grep -q '^Open count: *0$'; then
/etc/rc.d/rc0.d/S01halt:120:	    && [ "$(dmsetup status "$dst" | cut -d ' ' -f 3)" = crypt ]; then
/etc/rc.d/rc.sysinit:191:		/sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p" >/dev/null
/etc/rc.d/rc5.d/S25netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc.d/rc1.d/K75netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc.d/rc3.d/S25netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc.d/init.d/netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc.d/init.d/halt:22:            if /sbin/dmsetup info "$dst" | grep -q '^Open count: *0$'; then
/etc/rc.d/init.d/halt:120:	    && [ "$(dmsetup status "$dst" | cut -d ' ' -f 3)" = crypt ]; then
/etc/rc.d/rc4.d/S25netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc.d/rc2.d/K75netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/rc.sysinit:191:		/sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p" >/dev/null
/etc/init.d/netfs:53:		       /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a -p p"
/etc/init.d/halt:22:            if /sbin/dmsetup info "$dst" | grep -q '^Open count: *0$'; then
/etc/init.d/halt:120:	    && [ "$(dmsetup status "$dst" | cut -d ' ' -f 3)" = crypt ]; then

What am I missing?  My observation of Ubuntu is that at best it shuts down
services, umounts most of the filesystems, syncs, and reboots.  RHEL seems to
shut down multipath and dmcrypt, but that was all I found.  For /most/ users of
dm it seems like the system simply reboots, and nobody's the worse for the
wear.

In the meantime I've added a script to my dmcache test tools to tear things
down at the end, which works unless the umount fails. :/ I guess I could simply
suspend the devices, but the postsuspend flush only seems to get called if I
actually redefine the device to some driver that isn't cache.

(I guess I could suspend the device and replace cache with zero... yuck.)

--D

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Mike Snitzer May 11, 2013, 3:25 p.m. UTC | #3

[in the future please refrain from posting to LKML for such a narrow
topic like dm-cache... not seeing the point in adding to the LKML noise
-- dm-devel should suffice]

On Fri, May 10 2013 at  1:51pm -0400,
Darrick J. Wong <darrick.wong@oracle.com> wrote:

> On Fri, May 10, 2013 at 11:22:24AM +0100, Joe Thornber wrote:
> > On Thu, May 09, 2013 at 01:47:51PM -0700, Darrick J. Wong wrote:
> > > Create a new 'flush' message that causes the dmcache to write all of its
> > > metadata out to disk.  This enables us to ensure that the disk reflects
> > > whatever's in memory without having to tear down the cache device.  This helps
> > > me in the case where I have a cached ro fs that I can't umount and therefore
> > > can't tear down the cache device, but want to save the cache metadata anyway.
> > > The command syntax is as follows:
> > > 
> > > # dmsetup message mycache 0 flush now
> > 
> > Nack.
> > 
> > [Ignoring the ugly 'now' parameter.]
> > 
> > I think you're in danger of hiding the real issue.  Which is if the
> > target's destructor and post suspend is not being called then, as far
> > as dm-cache is concerned this is a crash.  Any open transactions will
> > be lost as it automatically rolls back.
> > 
> > We need to understand more why this is happening.  It's actually
> > harmless atm for dm-cache, because we're forced to commit before using
> > a new migration.  But for dm-thin you can lose writes.  Why are you
> > never tearing down your dm devices?
> 
> afaict, there isn't anything in the initscripts that tears down dm devices
> prior to invoking reboot(), and the kernel drivers don't have reboot notifiers
> to flush things out either.  I've been told that lvm does this, but I don't see
> anything in the Ubuntu or RHEL6 that would suggest a teardown script...

See: https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=c698ee14bbb1310cf2383c8977d14a8e29139f8c

But I'm not sure which distros have hooked blkdeactivate in (cc'ing
prajnoha for his insight).

> What am I missing?  My observation of Ubuntu is that at best it shuts down
> services, umounts most of the filesystems, syncs, and reboots.  RHEL seems to
> shut down multipath and dmcrypt, but that was all I found.  For /most/ users of
> dm it seems like the system simply reboots, and nobody's the worse for the
> wear.

DM devices should be properly torn down; as Joe said this is
particularly important for dm-thinp (otherwise it looks like a crash and
the open transaction is rolled back).

> In the meantime I've added a script to my dmcache test tools to tear things
> down at the end, which works unless the umount fails. :/ 

You should switch to using blkdeactivate.

> I guess I could simply suspend the devices, but the postsuspend flush
> only seems to get called if I actually redefine the device to some
> driver that isn't cache.
> 
> (I guess I could suspend the device and replace cache with zero... yuck.)

You _really_ shouldn't need to play these games.

postsuspend will get called regardless of whether you're changing the
table in any way.

See: do_suspend -> dm_suspend -> dm_table_postsuspend_targets -> suspend_targets

(the only way I'm seeing that the postsuspend could not get called is if
the freeze_bdev/thaw_bdev were to fail, via {lock,unlock}_fs())

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Peter Rajnoha May 13, 2013, 12:04 p.m. UTC | #4

On 11.05.2013 17:25, Mike Snitzer wrote:> On Fri, May 10 2013 at  1:51pm
-0400,
> Darrick J. Wong <darrick.wong@oracle.com> wrote:
>
...
>> afaict, there isn't anything in the initscripts that tears down dm
devices
>> prior to invoking reboot(), and the kernel drivers don't have reboot
notifiers
>> to flush things out either.  I've been told that lvm does this, but I
don't see
>> anything in the Ubuntu or RHEL6 that would suggest a teardown script...
>
> See:
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=c698ee14bbb1310cf2383c8977d14a8e29139f8c
>
> But I'm not sure which distros have hooked blkdeactivate in (cc'ing
> prajnoha for his insight).
>

The blk-availability initscript/systemd unit that gets called at
shutdown/reboot and which in turn calls the blkdeactivate is already
used in RHEL 6.4 onwards and also in Fedora 18 onwards. However, for
Fedora, you need to enable the systemd unit explicitly at the moment
(systemctl enable blk-availability.service). To have it enabled by
default, the distro-wide default systemd configuration needs to be
edited which is controlled by systemd-preset file (I hope F19 is going
to have this enabled by default finally).

As for any other distros, it's up to the maintainers in that distro to
make use of the new script - I haven't looked if they started using it
or not. But upstream already provides it since lvm2 v2.02.98.

Peter

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Darrick J. Wong May 13, 2013, 9:36 p.m. UTC | #5

On Mon, May 13, 2013 at 02:04:08PM +0200, Peter Rajnoha wrote:
> On 11.05.2013 17:25, Mike Snitzer wrote:> On Fri, May 10 2013 at  1:51pm
> -0400,
> > Darrick J. Wong <darrick.wong@oracle.com> wrote:
> >
> ...
> >> afaict, there isn't anything in the initscripts that tears down dm
> devices
> >> prior to invoking reboot(), and the kernel drivers don't have reboot
> notifiers
> >> to flush things out either.  I've been told that lvm does this, but I
> don't see
> >> anything in the Ubuntu or RHEL6 that would suggest a teardown script...
> >
> > See:
> https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=c698ee14bbb1310cf2383c8977d14a8e29139f8c
> >
> > But I'm not sure which distros have hooked blkdeactivate in (cc'ing
> > prajnoha for his insight).
> >
> 
> The blk-availability initscript/systemd unit that gets called at
> shutdown/reboot and which in turn calls the blkdeactivate is already
> used in RHEL 6.4 onwards and also in Fedora 18 onwards. However, for
> Fedora, you need to enable the systemd unit explicitly at the moment
> (systemctl enable blk-availability.service). To have it enabled by
> default, the distro-wide default systemd configuration needs to be
> edited which is controlled by systemd-preset file (I hope F19 is going
> to have this enabled by default finally).
> 
> As for any other distros, it's up to the maintainers in that distro to
> make use of the new script - I haven't looked if they started using it
> or not. But upstream already provides it since lvm2 v2.02.98.

Aha!  Thank you for providing the missing link.  Now it all makes sense. :)

(fwiw Ubuntu's latest is 2.02.95.)

--D
> 
> Peter
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

[2/2] dmcache: Implement a flush message

Commit Message

Comments

Patch