Message ID | 20240406090930.2252838-16-yukuai1@huaweicloud.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fs & block: remove bdev->bd_inode | expand |
On Sat, Apr 06, 2024 at 05:09:19PM +0800, Yu Kuai wrote: > From: Yu Kuai <yukuai3@huawei.com> > > Avoid to access bd_inode directly, prepare to remove bd_inode from > block_devcie. > > Signed-off-by: Yu Kuai <yukuai3@huawei.com> > Reviewed-by: Christoph Hellwig <hch@lst.de> > Reviewed-by: Jan Kara <jack@suse.cz> > --- > drivers/s390/block/dasd_ioctl.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c > index 7e0ed7032f76..c1201590f343 100644 > --- a/drivers/s390/block/dasd_ioctl.c > +++ b/drivers/s390/block/dasd_ioctl.c > @@ -215,8 +215,9 @@ dasd_format(struct dasd_block *block, struct format_data_t *fdata) > * enabling the device later. > */ > if (fdata->start_unit == 0) { > - block->gdp->part0->bd_inode->i_blkbits = > - blksize_bits(fdata->blksize); > + rc = set_blocksize(block->gdp->part0, fdata->blksize); Could somebody (preferably s390 folks) explain what is going on in dasd_format()? The change in this commit is *NOT* an equivalent transformation - mainline does not evict the page cache of device. Is that * intentional behaviour in mainline version, possibly broken by this patch * a bug in mainline accidentally fixed by this patch * something else? And shouldn't there be an exclusion between that and having a filesystem on a partition of that disk currently mounted?
On Tue, Apr 16, 2024 at 02:35:55AM +0100, Al Viro wrote: > > drivers/s390/block/dasd_ioctl.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c > > index 7e0ed7032f76..c1201590f343 100644 > > --- a/drivers/s390/block/dasd_ioctl.c > > +++ b/drivers/s390/block/dasd_ioctl.c > > @@ -215,8 +215,9 @@ dasd_format(struct dasd_block *block, struct format_data_t *fdata) > > * enabling the device later. > > */ > > if (fdata->start_unit == 0) { > > - block->gdp->part0->bd_inode->i_blkbits = > > - blksize_bits(fdata->blksize); > > + rc = set_blocksize(block->gdp->part0, fdata->blksize); > > Could somebody (preferably s390 folks) explain what is going on in > dasd_format()? The change in this commit is *NOT* an equivalent > transformation - mainline does not evict the page cache of device. > > Is that > * intentional behaviour in mainline version, possibly broken > by this patch > * a bug in mainline accidentally fixed by this patch > * something else? > > And shouldn't there be an exclusion between that and having a filesystem > on a partition of that disk currently mounted? CC-ing Stefan and Jan. Thanks!
Am 16.04.24 um 10:47 schrieb Alexander Gordeev: > On Tue, Apr 16, 2024 at 02:35:55AM +0100, Al Viro wrote: >>> drivers/s390/block/dasd_ioctl.c | 5 +++-- >>> 1 file changed, 3 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c >>> index 7e0ed7032f76..c1201590f343 100644 >>> --- a/drivers/s390/block/dasd_ioctl.c >>> +++ b/drivers/s390/block/dasd_ioctl.c >>> @@ -215,8 +215,9 @@ dasd_format(struct dasd_block *block, struct format_data_t *fdata) >>> * enabling the device later. >>> */ >>> if (fdata->start_unit == 0) { >>> - block->gdp->part0->bd_inode->i_blkbits = >>> - blksize_bits(fdata->blksize); >>> + rc = set_blocksize(block->gdp->part0, fdata->blksize); >> Could somebody (preferably s390 folks) explain what is going on in >> dasd_format()? The change in this commit is *NOT* an equivalent >> transformation - mainline does not evict the page cache of device. >> >> Is that >> * intentional behaviour in mainline version, possibly broken >> by this patch >> * a bug in mainline accidentally fixed by this patch >> * something else? >> >> And shouldn't there be an exclusion between that and having a filesystem >> on a partition of that disk currently mounted? > CC-ing Stefan and Jan. > > Thanks! Hi, from my point of view this was an equivalent transformation. set_blocksize() does basically also set i_blkbits like it was before. The dasd_format ioctl does only work on a disabled device. To achieve this all partitions need to be unmounted. The tooling also refuses to work on disks actually in use. So there should be no page cache to evict. The comment above this code says: /* Since dasdfmt keeps the device open after it was disabled, * there still exists an inode for this device. * We must update i_blkbits, otherwise we might get errors when * enabling the device later. */ This is the reason for updating i_blkbits. However, I get your point to question the code itself. Honestly this code exists for many years and I can not tell if the circumstances of the comment have changed in between somehow. A quick test without this code did not show any change or errors but there might be corner cases I am missing. Maybe you can give a hint if this makes any sense from your point of view. Thanks, Stefan
On Wed, Apr 17, 2024 at 02:47:14PM +0200, Stefan Haberland wrote: > set_blocksize() does basically also set i_blkbits like it was before. > The dasd_format ioctl does only work on a disabled device. To achieve this > all partitions need to be unmounted. > The tooling also refuses to work on disks actually in use. > > So there should be no page cache to evict. You mean this? if (base->state != DASD_STATE_BASIC) { pr_warn("%s: The DASD cannot be formatted while it is enabled\n", dev_name(&base->cdev->dev)); return -EBUSY; } OK, but what would prevent dasd_ioctl_disable() from working while disk is in use? And I don't see anything that would evict the page cache in dasd_ioctl_disable() either, actually... What am I missing here?
On Sun, Apr 28, 2024 at 07:58:23PM +0100, Al Viro wrote: > On Wed, Apr 17, 2024 at 02:47:14PM +0200, Stefan Haberland wrote: > > > set_blocksize() does basically also set i_blkbits like it was before. > > The dasd_format ioctl does only work on a disabled device. To achieve this > > all partitions need to be unmounted. > > The tooling also refuses to work on disks actually in use. > > > > So there should be no page cache to evict. > > You mean this? > if (base->state != DASD_STATE_BASIC) { > pr_warn("%s: The DASD cannot be formatted while it is enabled\n", > dev_name(&base->cdev->dev)); > return -EBUSY; > } > > OK, but what would prevent dasd_ioctl_disable() from working while > disk is in use? And I don't see anything that would evict the > page cache in dasd_ioctl_disable() either, actually... > > What am I missing here? BTW, you are updating block size according to new device size, before rc = base->discipline->format_device(base, fdata, 1); if (rc == -EAGAIN) rc = base->discipline->format_device(base, fdata, 0); Unless something very unidiomatic is going on, this attempt to format might fail...
Am 29.04.24 um 01:23 schrieb Al Viro: > On Sun, Apr 28, 2024 at 07:58:23PM +0100, Al Viro wrote: >> On Wed, Apr 17, 2024 at 02:47:14PM +0200, Stefan Haberland wrote: >> >>> set_blocksize() does basically also set i_blkbits like it was before. >>> The dasd_format ioctl does only work on a disabled device. To achieve this >>> all partitions need to be unmounted. >>> The tooling also refuses to work on disks actually in use. >>> >>> So there should be no page cache to evict. >> You mean this? >> if (base->state != DASD_STATE_BASIC) { >> pr_warn("%s: The DASD cannot be formatted while it is enabled\n", >> dev_name(&base->cdev->dev)); >> return -EBUSY; >> } >> >> OK, but what would prevent dasd_ioctl_disable() from working while >> disk is in use? And I don't see anything that would evict the >> page cache in dasd_ioctl_disable() either, actually... >> >> What am I missing here? Thank you for your input. Let me provide some more insides how it is intended to work. Maybe there is something we should improve. This whole code is basically intended to be used by the dasdfmt tool. For the dasdfmt tool and the dasd_format ioctl we are talking about DASD ECKD devices. An important note: for those devices a partition has to be used to access the disk because the first tracks of the disks are not safe to store user data. A partition has to be created by fdasd. A disk in use has the state DASD_STATE_ONLINE. To format a device the dasdfmt tool has to be called, it does the following: The dasdfmt tool checks if the disk is actually in use and refuses to work on an 'in use' DASD. So for example a partition that was in use has to be unmounted first. Afterwards it does the following calls: BIODASDDISABLE - to disable the device and prevent further usage - sets the disk in state DASD_STATE_BASIC BIODASDFMT - does the actual formatting - checks if the disk is in state DASD_STATE_BASIC (if BIODASDDISABLE was called before) - this ioctl is usually called multiple times to format smaller parts of the disk each time - in the first call to this ioctl the first track (track 0) is invalidated (basically wiped out) and format_data_t.intensity equals DASD_FMT_INT_INVAL - the last step is to finally format the first track to indicate a successful formatting of the whole disk BIODASDENABLE - to enable the disk again for general usage - sets the disk to state DASD_STATE_ONLINE again - NOTE: a disabled device refuses an open call, so the tooling needs to keep the file descriptor open. So the assumption in this processing is that a possibly used page cache is evicted when removing the partition from actual usage (e.g. unmounting, ..). While writing this I get to the point that it might not be the best idea to rely on proper tool handling only and it might be a good idea to check for an open count in BIODASDDISABLE as well so that the ioctls itself are safe to use. (While it does not make a lot sense to use them alone.) My assumption was that this is already done but obviously it isn't. > BTW, you are updating block size according to new device size, before > rc = base->discipline->format_device(base, fdata, 1); > if (rc == -EAGAIN) > rc = base->discipline->format_device(base, fdata, 0); > Unless something very unidiomatic is going on, this attempt to > format might fail... This is true. I guess the idea here was that the actual formatting of track 0 is done last after the whole disk was successfully formatted and everything went fine. But actually also the invalidation of the first track would do this here. So we should not only move this after the format_device call but we should also add a check for DASD_FMT_INT_INVAL which is the first step in the whole formatting. My current conclusion would be that this patch itself is fine as is but I should submit patches later to address the findings in this discussion.
On Mon, Apr 29, 2024 at 04:41:19PM +0200, Stefan Haberland wrote: > The dasdfmt tool checks if the disk is actually in use and refuses to > work on an 'in use' DASD. > So for example a partition that was in use has to be unmounted first. Hmm... How is that check done? Does it open device exclusive?
Am 30.04.24 um 02:30 schrieb Al Viro: > On Mon, Apr 29, 2024 at 04:41:19PM +0200, Stefan Haberland wrote: > >> The dasdfmt tool checks if the disk is actually in use and refuses to >> work on an 'in use' DASD. >> So for example a partition that was in use has to be unmounted first. > Hmm... How is that check done? Does it open device exclusive? > No, it just checks the open_count gathered from the driver through another ioctl. And yes, of course there is a race in this check that between gathering the data and disabling the device it could be opened.
diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c index 7e0ed7032f76..c1201590f343 100644 --- a/drivers/s390/block/dasd_ioctl.c +++ b/drivers/s390/block/dasd_ioctl.c @@ -215,8 +215,9 @@ dasd_format(struct dasd_block *block, struct format_data_t *fdata) * enabling the device later. */ if (fdata->start_unit == 0) { - block->gdp->part0->bd_inode->i_blkbits = - blksize_bits(fdata->blksize); + rc = set_blocksize(block->gdp->part0, fdata->blksize); + if (rc) + return rc; } rc = base->discipline->format_device(base, fdata, 1);