xfs: do not unconditionally enable hasalign feature on V5 filesystems

Message ID 1487174254-9002-1-git-send-email-chandan@linux.vnet.ibm.com
State New, archived

Commit Message

Chandan Rajendra Feb. 15, 2017, 3:57 p.m. UTC
On a ppc64 system, executing generic/256 test with 32k block size gives
the following call trace,

XFS: Assertion failed: args->maxlen > 0, file: /root/repos/linux/fs/xfs/libxfs/xfs_alloc.c, line: 2026

kernel BUG at /root/repos/linux/fs/xfs/xfs_message.c:113!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=2048
DEBUG_PAGEALLOC
NUMA
pSeries
Modules linked in:
CPU: 2 PID: 19361 Comm: mkdir Not tainted 4.10.0-rc5 #58
task: c000000102606d80 task.stack: c0000001026b8000
NIP: c0000000004ef798 LR: c0000000004ef798 CTR: c00000000082b290
REGS: c0000001026bb090 TRAP: 0700   Not tainted  (4.10.0-rc5)
MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>
CR: 28004428  XER: 00000000
CFAR: c0000000004ef180 SOFTE: 1
GPR00: c0000000004ef798 c0000001026bb310 c000000001157300 ffffffffffffffea
GPR04: 000000000000000a c0000001026bb130 0000000000000000 ffffffffffffffc0
GPR08: 00000000000000d1 0000000000000021 00000000ffffffd1 c000000000dd4990
GPR12: 0000000022004444 c00000000fe00800 0000000020000000 0000000000000000
GPR16: 0000000000000000 0000000043a606fc 0000000043a76c08 0000000043a1b3d0
GPR20: 000001002a35cd60 c0000001026bbb80 0000000000000000 0000000000000001
GPR24: 0000000000000240 0000000000000004 c00000062dc55000 0000000000000000
GPR28: 0000000000000004 c00000062ecd9200 0000000000000000 c0000001026bb6c0
NIP [c0000000004ef798] .assfail+0x28/0x30
LR [c0000000004ef798] .assfail+0x28/0x30
Call Trace:
[c0000001026bb310] [c0000000004ef798] .assfail+0x28/0x30 (unreliable)
[c0000001026bb380] [c000000000455d74] .xfs_alloc_space_available+0x194/0x1b0
[c0000001026bb410] [c00000000045b914] .xfs_alloc_fix_freelist+0x144/0x480
[c0000001026bb580] [c00000000045c368] .xfs_alloc_vextent+0x698/0xa90
[c0000001026bb650] [c0000000004a6200] .xfs_ialloc_ag_alloc+0x170/0x820
[c0000001026bb7c0] [c0000000004a9098] .xfs_dialloc+0x158/0x320
[c0000001026bb8a0] [c0000000004e628c] .xfs_ialloc+0x7c/0x610
[c0000001026bb990] [c0000000004e8138] .xfs_dir_ialloc+0xa8/0x2f0
[c0000001026bbaa0] [c0000000004e8814] .xfs_create+0x494/0x790
[c0000001026bbbf0] [c0000000004e5ebc] .xfs_generic_create+0x2bc/0x410
[c0000001026bbce0] [c0000000002b4a34] .vfs_mkdir+0x154/0x230
[c0000001026bbd70] [c0000000002bc444] .SyS_mkdirat+0x94/0x120
[c0000001026bbe30] [c00000000000b760] system_call+0x38/0xfc
Instruction dump:
4e800020 60000000 7c0802a6 7c862378 3c82ffca 7ca72b78 38841c18 7c651b78
38600000 f8010010 f821ff91 4bfff94d <0fe00000> 60000000 7c0802a6 7c892378

The root cause is that xfs_sb_version_hasalign() always returns true on
a V5 filesystem. As a result, args.minalignslop (in
xfs_ialloc_ag_alloc()) is assigned the unsigned equivalent of -1, which
later causes alloc_len in xfs_alloc_space_available() to have a value
of 0. When args.total is also 0 in this scenario, the assertion
"ASSERT(args->maxlen > 0);" fails.
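
The wraparound described above can be reproduced in user space. The
following is a minimal sketch, not kernel code: model_minalignslop() and
model_alloc_len() are hypothetical helpers, the 32-bit unsigned type
stands in for the kernel's extent-length type, and the alloc_len formula
(minimum length plus alignment slack plus minalignslop) is an assumption
about what xfs_alloc_space_available() computes.

```c
#include <stdint.h>

/*
 * Hypothetical model of "cluster alignment - 1" as computed in
 * xfs_ialloc_ag_alloc(). With a cluster alignment of 0, the unsigned
 * subtraction wraps around to the largest 32-bit value.
 */
static uint32_t model_minalignslop(uint32_t cluster_alignment)
{
	return cluster_alignment - 1u;
}

/*
 * Assumed shape of the contiguous-length estimate: minimum length plus
 * alignment slack plus minalignslop, all in 32-bit unsigned arithmetic.
 */
static uint32_t model_alloc_len(uint32_t minlen, uint32_t alignment,
				uint32_t minalignslop)
{
	return minlen + (alignment - 1u) + minalignslop;
}
```

With a cluster alignment of 0, model_minalignslop(0) yields 0xffffffff.
Taking, for example, a minimum length of 1 block and an alignment of 1,
model_alloc_len(1, 1, 0xffffffff) wraps to 0, which matches the values
reported above.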

This commit fixes the bug by checking for the existence of the
XFS_SB_VERSION_ALIGNBIT flag in the xfs_sb->sb_versionnum field even on
V5 filesystems.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
---
PS: From what I could understand from other code paths,
xfs_alloc_arg->total must be set to the sum of space required by the
current code path and the space already reserved in the current
transaction. Please let me know if my understanding is
incorrect. Also, in the code path listed in the above call trace we
have xfs_alloc_arg->total set to 0 which probably isn't correct
either.

 fs/xfs/libxfs/xfs_format.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Eric Sandeen Feb. 15, 2017, 4:13 p.m. UTC | #1
On 2/15/17 9:57 AM, Chandan Rajendra wrote:
> On a ppc64 system, executing generic/256 test with 32k block size gives
> the following call trace,
> 
> XFS: Assertion failed: args->maxlen > 0, file: /root/repos/linux/fs/xfs/libxfs/xfs_alloc.c, line: 2026
> 
> kernel BUG at /root/repos/linux/fs/xfs/xfs_message.c:113!
> Oops: Exception in kernel mode, sig: 5 [#1]
> SMP NR_CPUS=2048
> DEBUG_PAGEALLOC
> NUMA
> pSeries
> Modules linked in:
> CPU: 2 PID: 19361 Comm: mkdir Not tainted 4.10.0-rc5 #58
> task: c000000102606d80 task.stack: c0000001026b8000
> NIP: c0000000004ef798 LR: c0000000004ef798 CTR: c00000000082b290
> REGS: c0000001026bb090 TRAP: 0700   Not tainted  (4.10.0-rc5)
> MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>
> CR: 28004428  XER: 00000000
> CFAR: c0000000004ef180 SOFTE: 1
> GPR00: c0000000004ef798 c0000001026bb310 c000000001157300 ffffffffffffffea
> GPR04: 000000000000000a c0000001026bb130 0000000000000000 ffffffffffffffc0
> GPR08: 00000000000000d1 0000000000000021 00000000ffffffd1 c000000000dd4990
> GPR12: 0000000022004444 c00000000fe00800 0000000020000000 0000000000000000
> GPR16: 0000000000000000 0000000043a606fc 0000000043a76c08 0000000043a1b3d0
> GPR20: 000001002a35cd60 c0000001026bbb80 0000000000000000 0000000000000001
> GPR24: 0000000000000240 0000000000000004 c00000062dc55000 0000000000000000
> GPR28: 0000000000000004 c00000062ecd9200 0000000000000000 c0000001026bb6c0
> NIP [c0000000004ef798] .assfail+0x28/0x30
> LR [c0000000004ef798] .assfail+0x28/0x30
> Call Trace:
> [c0000001026bb310] [c0000000004ef798] .assfail+0x28/0x30 (unreliable)
> [c0000001026bb380] [c000000000455d74] .xfs_alloc_space_available+0x194/0x1b0
> [c0000001026bb410] [c00000000045b914] .xfs_alloc_fix_freelist+0x144/0x480
> [c0000001026bb580] [c00000000045c368] .xfs_alloc_vextent+0x698/0xa90
> [c0000001026bb650] [c0000000004a6200] .xfs_ialloc_ag_alloc+0x170/0x820
> [c0000001026bb7c0] [c0000000004a9098] .xfs_dialloc+0x158/0x320
> [c0000001026bb8a0] [c0000000004e628c] .xfs_ialloc+0x7c/0x610
> [c0000001026bb990] [c0000000004e8138] .xfs_dir_ialloc+0xa8/0x2f0
> [c0000001026bbaa0] [c0000000004e8814] .xfs_create+0x494/0x790
> [c0000001026bbbf0] [c0000000004e5ebc] .xfs_generic_create+0x2bc/0x410
> [c0000001026bbce0] [c0000000002b4a34] .vfs_mkdir+0x154/0x230
> [c0000001026bbd70] [c0000000002bc444] .SyS_mkdirat+0x94/0x120
> [c0000001026bbe30] [c00000000000b760] system_call+0x38/0xfc
> Instruction dump:
> 4e800020 60000000 7c0802a6 7c862378 3c82ffca 7ca72b78 38841c18 7c651b78
> 38600000 f8010010 f821ff91 4bfff94d <0fe00000> 60000000 7c0802a6 7c892378
> 
> The root cause of the problem is due to the fact that
> xfs_sb_version_hasalign() returns true when we are working on a V5
> filesystem. Due to this args.minalignslop (in xfs_ialloc_ag_alloc())
> gets the unsigned equivalent of -1 assigned to it. This later causes
> alloc_len in xfs_alloc_space_available() to have a value of 0. In such a
> scenario when args.total is also 0, the assert statement
> "ASSERT(args->maxlen > 0);" fails.

Hm, the intent of the _hasalign() function is to say that V5 must always
imply the "alignbit" - i.e. we don't want to grow an infinite feature
matrix, and by the time you get to V5 supers, there are many things which
cannot be turned on or off, such as this feature.

So what happens here... xfs_ialloc_ag_alloc does:

args.minalignslop = xfs_ialloc_cluster_alignment(args.mp) - 1;

so you're saying that cluster_alignment comes out as 0?

That function is checking _hasalign:

static inline int
xfs_ialloc_cluster_alignment(
        struct xfs_mount        *mp)
{
        if (xfs_sb_version_hasalign(&mp->m_sb) &&
            mp->m_sb.sb_inoalignmt >=
                        XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size))
                return mp->m_sb.sb_inoalignmt;
        return 1;
}

So are you saying that this function returns 0?  That would imply that
sb_inoalignmt and m_inode_cluster_size are both zero, yes?  Is this
what you see?


-Eric

> This commit fixes the bug by checking for the existence of
> XFS_SB_VERSION_ALIGNBIT bit in xfs_sb->sb_versionnum field even on V5
> filesystems.
> 
> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
> ---
> PS: From what I could understand from other code paths,
> xfs_alloc_arg->total must be set to the sum of space required by the
> current code path and the space already reserved in the current
> transaction. Please let me know if my understanding is
> incorrect. Also, In the code path listed in the above call trace we
> have xfs_alloc_arg->total set to 0 which probably isn't correct
> either.
> 
>  fs/xfs/libxfs/xfs_format.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
> index 6b7579e..57a5264 100644
> --- a/fs/xfs/libxfs/xfs_format.h
> +++ b/fs/xfs/libxfs/xfs_format.h
> @@ -346,8 +346,7 @@ static inline void xfs_sb_version_addquota(struct xfs_sb *sbp)
>  
>  static inline bool xfs_sb_version_hasalign(struct xfs_sb *sbp)
>  {
> -	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 ||
> -		(sbp->sb_versionnum & XFS_SB_VERSION_ALIGNBIT));
> +	return (sbp->sb_versionnum & XFS_SB_VERSION_ALIGNBIT);
>  }
>  
>  static inline bool xfs_sb_version_hasdalign(struct xfs_sb *sbp)
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Sandeen Feb. 15, 2017, 5:03 p.m. UTC | #2
On 2/15/17 10:13 AM, Eric Sandeen wrote:
>> The root cause of the problem is due to the fact that
>> xfs_sb_version_hasalign() returns true when we are working on a V5
>> filesystem. Due to this args.minalignslop (in xfs_ialloc_ag_alloc())
>> gets the unsigned equivalent of -1 assigned to it. This later causes
>> alloc_len in xfs_alloc_space_available() to have a value of 0. In such a
>> scenario when args.total is also 0, the assert statement
>> "ASSERT(args->maxlen > 0);" fails.
> Hm, the intent of the _hasalign() function is to say that V5 must always
> imply the "alignbit" - i.e. we don't want to grow an infinite feature
> matrix, and by the time you get to V5 supers, there are many things which
> cannot be turned on or off, such as this feature.
> 
> So what happens here... xfs_ialloc_ag_alloc does:
> 
> args.minalignslop = xfs_ialloc_cluster_alignment(args.mp) - 1;
> 
> so you're saying that cluster_alignment comes out as 0?
> 
> That function is checking _hasalign:
> 
> static inline int
> xfs_ialloc_cluster_alignment(
>         struct xfs_mount        *mp)
> {
>         if (xfs_sb_version_hasalign(&mp->m_sb) &&
>             mp->m_sb.sb_inoalignmt >=
>                         XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size))
>                 return mp->m_sb.sb_inoalignmt;
>         return 1;
> }
> 
> So are you saying that this function returns 0?  That would imply that
> sb_inoalignmt and m_inode_cluster_size are both zero, yes?  Is this
> what you see?

Sorry, I guess that means XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size) is
zero; inode cluster size is 8192 in this case I think, and that is in fact
0 filesystem blocks when computed with this macro.

I need to think about this a little bit to convince myself that the inode
alignment bit really /should/ be off for a filesystem of this geometry, vs
changing the macro to recognize the case.

-Eric
Darrick J. Wong Feb. 15, 2017, 5:16 p.m. UTC | #3
On Wed, Feb 15, 2017 at 11:03:11AM -0600, Eric Sandeen wrote:
> On 2/15/17 10:13 AM, Eric Sandeen wrote:
> >> The root cause of the problem is due to the fact that
> >> xfs_sb_version_hasalign() returns true when we are working on a V5
> >> filesystem. Due to this args.minalignslop (in xfs_ialloc_ag_alloc())
> >> gets the unsigned equivalent of -1 assigned to it. This later causes
> >> alloc_len in xfs_alloc_space_available() to have a value of 0. In such a
> >> scenario when args.total is also 0, the assert statement
> >> "ASSERT(args->maxlen > 0);" fails.
> > Hm, the intent of the _hasalign() function is to say that V5 must always
> > imply the "alignbit" - i.e. we don't want to grow an infinite feature
> > matrix, and by the time you get to V5 supers, there are many things which
> > cannot be turned on or off, such as this feature.
> > 
> > So what happens here... xfs_ialloc_ag_alloc does:
> > 
> > args.minalignslop = xfs_ialloc_cluster_alignment(args.mp) - 1;
> > 
> > so you're saying that cluster_alignment comes out as 0?
> > 
> > That function is checking _hasalign:
> > 
> > static inline int
> > xfs_ialloc_cluster_alignment(
> >         struct xfs_mount        *mp)
> > {
> >         if (xfs_sb_version_hasalign(&mp->m_sb) &&
> >             mp->m_sb.sb_inoalignmt >=
> >                         XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size))
> >                 return mp->m_sb.sb_inoalignmt;
> >         return 1;
> > }
> > 
> > So are you saying that this function returns 0?  That would imply that
> > sb_inoalignmt and m_inode_cluster_size are both zero, yes?  Is this
> > what you see?
> 
> Sorry, I guess that means XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size) is
> zero; inode cluster size is 8192 in this case I think, and that is in fact
> 0 filesystem blocks when computed with this macro.
> 
> I need to think about this a little bit to convince myself that the inode
> alignment bit really /should/ be off for a filesystem of this geometry, vs
> changing the macro to recognize the case.

Why isn't that XFS_B_TO_FSBT instead a call to xfs_icluster_size_fsb()?
That function is used elsewhere to compute the number of fsblocks
backing an inode cluster, which seems like what we need here to figure
out whether inoalignmt makes sense w.r.t. the size of an inode cluster.
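
To illustrate the difference, here is a minimal sketch comparing the
truncating conversion with one that never reports fewer than one block.
All model_* names are hypothetical. model_icluster_size_fsb() is written
from the description above (the number of fsblocks backing an inode
cluster); it is an assumption about, not a copy of,
xfs_icluster_size_fsb().

```c
#include <stdint.h>

/* Truncating conversion: an 8192-byte cluster in 32k blocks gives 0. */
static uint32_t model_b_to_fsbt(uint32_t bytes, uint32_t blocklog)
{
	return bytes >> blocklog;
}

/*
 * Hypothetical stand-in for xfs_icluster_size_fsb(): number of
 * filesystem blocks backing one inode cluster, never less than 1.
 */
static uint32_t model_icluster_size_fsb(uint32_t cluster_bytes,
					uint32_t blocklog)
{
	uint32_t blocksize = 1u << blocklog;

	if (blocksize >= cluster_bytes)
		return 1;
	return cluster_bytes >> blocklog;
}

/* Alignment check rewritten on top of the block-count helper. */
static uint32_t model_cluster_alignment_fsb(uint32_t inoalignmt,
					    uint32_t cluster_bytes,
					    uint32_t blocklog)
{
	if (inoalignmt >= model_icluster_size_fsb(cluster_bytes, blocklog))
		return inoalignmt;
	return 1;
}
```

For an 8192-byte cluster on 32k blocks (blocklog 15), the truncating
conversion gives 0 while the block-count helper gives 1. With
inoalignmt 0 the comparison "0 >= 1" is then false, the check falls
through to "return 1", and "alignment - 1" is 0 instead of wrapping.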

--D

> 
> -Eric
Chandan Rajendra Feb. 15, 2017, 5:36 p.m. UTC | #4
On Wednesday, February 15, 2017 11:03:11 AM Eric Sandeen wrote:
> On 2/15/17 10:13 AM, Eric Sandeen wrote:
> >> The root cause of the problem is due to the fact that
> >> xfs_sb_version_hasalign() returns true when we are working on a V5
> >> filesystem. Due to this args.minalignslop (in xfs_ialloc_ag_alloc())
> >> gets the unsigned equivalent of -1 assigned to it. This later causes
> >> alloc_len in xfs_alloc_space_available() to have a value of 0. In such a
> >> scenario when args.total is also 0, the assert statement
> >> "ASSERT(args->maxlen > 0);" fails.
> > Hm, the intent of the _hasalign() function is to say that V5 must always
> > imply the "alignbit" - i.e. we don't want to grow an infinite feature
> > matrix, and by the time you get to V5 supers, there are many things which
> > cannot be turned on or off, such as this feature.
> > 
> > So what happens here... xfs_ialloc_ag_alloc does:
> > 
> > args.minalignslop = xfs_ialloc_cluster_alignment(args.mp) - 1;
> > 
> > so you're saying that cluster_alignment comes out as 0?
> > 
> > That function is checking _hasalign:
> > 
> > static inline int
> > xfs_ialloc_cluster_alignment(
> >         struct xfs_mount        *mp)
> > {
> >         if (xfs_sb_version_hasalign(&mp->m_sb) &&
> >             mp->m_sb.sb_inoalignmt >=
> >                         XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size))
> >                 return mp->m_sb.sb_inoalignmt;
> >         return 1;
> > }
> > 
> > So are you saying that this function returns 0?  That would imply that
> > sb_inoalignmt and m_inode_cluster_size are both zero, yes?  Is this
> > what you see?
> 
> Sorry, I guess that means XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size) is
> zero; inode cluster size is 8192 in this case I think, and that is in fact
> 0 filesystem blocks when computed with this macro.

Yes, sb_inoalignmt is indeed 0 because of the following code from
main() in xfsprogs/mkfs/xfs_mkfs.c,

	if (sb_feat.inode_align) {
		int	cluster_size = XFS_INODE_BIG_CLUSTER_SIZE;
		if (sb_feat.crcs_enabled)
			cluster_size *= isize / XFS_DINODE_MIN_SIZE;
		sbp->sb_inoalignmt = cluster_size >> blocklog;
		sb_feat.inode_align = sbp->sb_inoalignmt != 0;
	}

And XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size) returns 0 as well. Hence the
condition (xfs_sb_version_hasalign(&mp->m_sb) && mp->m_sb.sb_inoalignmt >=
XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size)) evaluates to true, and
xfs_ialloc_cluster_alignment() returns sb_inoalignmt, i.e. 0.
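
The arithmetic can be checked with a small stand-alone model. This is a
sketch under stated assumptions: the constants 8192 and 256 are assumed
values for XFS_INODE_BIG_CLUSTER_SIZE and XFS_DINODE_MIN_SIZE, a 512-byte
inode size is assumed for the example, and all model_* helpers are
hypothetical stand-ins for the mkfs snippet, the truncating
XFS_B_TO_FSBT() conversion, and xfs_ialloc_cluster_alignment().

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_BIG_CLUSTER_SIZE	8192u	/* assumed XFS_INODE_BIG_CLUSTER_SIZE */
#define MODEL_DINODE_MIN_SIZE	256u	/* assumed XFS_DINODE_MIN_SIZE */

/* Mirrors the mkfs snippet: inode alignment in filesystem blocks. */
static uint32_t model_mkfs_inoalignmt(uint32_t isize, uint32_t blocklog,
				      bool crcs_enabled)
{
	uint32_t cluster_size = MODEL_BIG_CLUSTER_SIZE;

	if (crcs_enabled)
		cluster_size *= isize / MODEL_DINODE_MIN_SIZE;
	return cluster_size >> blocklog;
}

/* Truncating bytes-to-blocks conversion, as XFS_B_TO_FSBT() is described. */
static uint32_t model_b_to_fsbt(uint32_t bytes, uint32_t blocklog)
{
	return bytes >> blocklog;
}

/* Same shape as the xfs_ialloc_cluster_alignment() quoted in the thread. */
static uint32_t model_cluster_alignment(bool hasalign, uint32_t inoalignmt,
					uint32_t cluster_bytes,
					uint32_t blocklog)
{
	if (hasalign &&
	    inoalignmt >= model_b_to_fsbt(cluster_bytes, blocklog))
		return inoalignmt;
	return 1;
}
```

For a 32k block size (blocklog 15), model_mkfs_inoalignmt(512, 15, true)
is 16384 >> 15 = 0 and model_b_to_fsbt(8192, 15) is 0, so
model_cluster_alignment(true, 0, 8192, 15) returns 0 rather than 1. With
hasalign false, the same call returns 1.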

> 
> I need to think about this a little bit to convince myself that the inode
> alignment bit really /should/ be off for a filesystem of this geometry, vs
> changing the macro to recognize the case.
> 
> -Eric
>
Chandan Rajendra Feb. 15, 2017, 6 p.m. UTC | #5
On Wednesday, February 15, 2017 09:16:02 AM Darrick J. Wong wrote:
> On Wed, Feb 15, 2017 at 11:03:11AM -0600, Eric Sandeen wrote:
> > On 2/15/17 10:13 AM, Eric Sandeen wrote:
> > >> The root cause of the problem is due to the fact that
> > >> xfs_sb_version_hasalign() returns true when we are working on a V5
> > >> filesystem. Due to this args.minalignslop (in xfs_ialloc_ag_alloc())
> > >> gets the unsigned equivalent of -1 assigned to it. This later causes
> > >> alloc_len in xfs_alloc_space_available() to have a value of 0. In such a
> > >> scenario when args.total is also 0, the assert statement
> > >> "ASSERT(args->maxlen > 0);" fails.
> > > Hm, the intent of the _hasalign() function is to say that V5 must always
> > > imply the "alignbit" - i.e. we don't want to grow an infinite feature
> > > matrix, and by the time you get to V5 supers, there are many things which
> > > cannot be turned on or off, such as this feature.
> > > 
> > > So what happens here... xfs_ialloc_ag_alloc does:
> > > 
> > > args.minalignslop = xfs_ialloc_cluster_alignment(args.mp) - 1;
> > > 
> > > so you're saying that cluster_alignment comes out as 0?
> > > 
> > > That function is checking _hasalign:
> > > 
> > > static inline int
> > > xfs_ialloc_cluster_alignment(
> > >         struct xfs_mount        *mp)
> > > {
> > >         if (xfs_sb_version_hasalign(&mp->m_sb) &&
> > >             mp->m_sb.sb_inoalignmt >=
> > >                         XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size))
> > >                 return mp->m_sb.sb_inoalignmt;
> > >         return 1;
> > > }
> > > 
> > > So are you saying that this function returns 0?  That would imply that
> > > sb_inoalignmt and m_inode_cluster_size are both zero, yes?  Is this
> > > what you see?
> > 
> > Sorry, I guess that means XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size) is
> > zero; inode cluster size is 8192 in this case I think, and that is in fact
> > 0 filesystem blocks when computed with this macro.
> > 
> > I need to think about this a little bit to convince myself that the inode
> > alignment bit really /should/ be off for a filesystem of this geometry, vs
> > changing the macro to recognize the case.
> 
> Why isn't that XFS_B_TO_FSBT instead a call to xfs_icluster_size_fsb()?
> That function is used elsewhere to compute the number of fsblocks
> backing an inode cluster, which seems like what we need here to figure
> out whether inoalignmt makes sense w.r.t. the size of an inode cluster.
>

Thanks for the suggestion. Looks like xfs_icluster_size_fsb() is the right
function to use. I will test the fix and let you know the results.
Eric Sandeen Feb. 15, 2017, 10:53 p.m. UTC | #6
On 2/15/17 11:16 AM, Darrick J. Wong wrote:
> On Wed, Feb 15, 2017 at 11:03:11AM -0600, Eric Sandeen wrote:
>> On 2/15/17 10:13 AM, Eric Sandeen wrote:

...

>> Sorry, I guess that means XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size) is
>> zero; inode cluster size is 8192 in this case I think, and that is in fact
>> 0 filesystem blocks when computed with this macro.
>>
>> I need to think about this a little bit to convince myself that the inode
>> alignment bit really /should/ be off for a filesystem of this geometry, vs
>> changing the macro to recognize the case.
> 
> Why isn't that XFS_B_TO_FSBT instead a call to xfs_icluster_size_fsb()?
> That function is used elsewhere to compute the number of fsblocks
> backing an inode cluster, which seems like what we need here to figure
> out whether inoalignmt makes sense w.r.t. the size of an inode cluster.

Hm, I think this needs some care.  There are several places that use
XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size), see xfs_ialloc_cluster_alignment()
for example, and the sparse inode check in mount code:

        if (xfs_sb_version_hassparseinodes(&mp->m_sb) &&
            mp->m_sb.sb_spino_align !=
                        XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size)) {
                xfs_warn(mp,
        "Sparse inode block alignment (%u) must match cluster size (%llu).",
                         mp->m_sb.sb_spino_align,
                         XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size));
                error = -EINVAL;
                goto out_remove_uuid;
        }

This probably fails in your usecase as well, no?

-Eric


> --D
> 
>>
>> -Eric
Eric Sandeen Feb. 16, 2017, 3:29 a.m. UTC | #7
On 2/15/17 10:13 AM, Eric Sandeen wrote:
> Hm, the intent of the _hasalign() function is to say that V5 must always
> imply the "alignbit" - i.e. we don't want to grow an infinite feature
> matrix, and by the time you get to V5 supers, there are many things which
> cannot be turned on or off, such as this feature.

I'm rethinking this a bit; while my question may have uncovered
another real bug, I think maybe your patch /is/ ok.

While the intent /was/ to ensure that some features are not optional
on V5 filesystems, the mkfs code itself might end up turning off inode
alignment for sufficiently large filesystem blocks and sufficiently
small inode sizes.  The reality is, for sufficiently large fs blocks,
every new inode chunk allocation is aligned, because it is sub-block
sized.  And so there is nothing uniquely "aligned" about any allocation.

i.e. on v4 superblocks, the right combination of inode size & fs block
size will turn off the alignment feature, even if the user did not
request it.

So for v5 supers, it does seem odd to report the presence of the
alignment feature even when the flag is not set in the superblock
due to the geometry details.

This goes way back to:

 04a1e6c5b xfs: add CRC checks to the superblock

which doesn't really explain why the xfs_sb_version_hasalign got
changed.

Anyway; it probably makes sense to do a targeted fix to resolve
the issue you've run into, but we might want to take a broader
look at when the alignment feature (and feature flag) gets set
and reported, to make sure that it is all consistent and as expected.

-Eric
Chandan Rajendra Feb. 16, 2017, 12:22 p.m. UTC | #8
On Wednesday, February 15, 2017 09:29:35 PM Eric Sandeen wrote:
> On 2/15/17 10:13 AM, Eric Sandeen wrote:
> > Hm, the intent of the _hasalign() function is to say that V5 must always
> > imply the "alignbit" - i.e. we don't want to grow an infinite feature
> > matrix, and by the time you get to V5 supers, there are many things which
> > cannot be turned on or off, such as this feature.
> 
> I'm rethinking this a bit; while my question may have uncovered
> another real bug, I think maybe your patch /is/ ok.
> 
> While the intent /was/ to ensure that some features are not optional
> on V5 filesystems, the mkfs code itself might end up turning off inode
> alignment for sufficiently large filesystem blocks and sufficiently
> small inode sizes.  The reality is, for sufficiently large fs blocks,
> every new inode chunk allocation is aligned, because it is sub-block
> sized.  And so there is nothing uniquely "aligned" about any allocation.
> 
> i.e. on v4 superblocks, the right combination of inode size & fs block
> size will turn off the alignment feature, even if the user did not
> request it.
> 
> So for v5 supers, it does seem odd to report the presence of the
> alignment feature even when the flag is not set in the superblock
> due to the geometry details.
> 
> This goes way back to:
> 
>  04a1e6c5b xfs: add CRC checks to the superblock
> 
> which doesn't really explain why the xfs_sb_version_hasalign got
> changed.
> 
> Anyway; it probably makes sense to do a targeted fix to resolve
> the issue you've run into, but we might want to take a broader
> look at when the alignment feature (and feature flag) gets set
> and reported, to make sure that it is all consistent and as expected.
> 

Right. I will post the new patch (one that uses xfs_icluster_size_fsb()
instead of XFS_B_TO_FSBT()) and then look at the rest of the alignment feature
usages. Thanks for your guidance.

Patch

diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index 6b7579e..57a5264 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -346,8 +346,7 @@  static inline void xfs_sb_version_addquota(struct xfs_sb *sbp)
 
 static inline bool xfs_sb_version_hasalign(struct xfs_sb *sbp)
 {
-	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 ||
-		(sbp->sb_versionnum & XFS_SB_VERSION_ALIGNBIT));
+	return (sbp->sb_versionnum & XFS_SB_VERSION_ALIGNBIT);
 }
 
 static inline bool xfs_sb_version_hasdalign(struct xfs_sb *sbp)