Message ID | 20170203172403.GG45388@bfoster.bfoster
State      | Accepted
Brian Foster wrote:
> On Fri, Feb 03, 2017 at 03:50:09PM +0100, Michal Hocko wrote:
> > [Let's CC more xfs people]
> >
> > On Fri 03-02-17 19:57:39, Tetsuo Handa wrote:
> > [...]
> > > (1) I got an assertion failure.
> >
> > I suspect this is a result of
> > http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org
> > I have no idea what the assert means though.
> >
> > > [ 969.626518] Killed process 6262 (oom-write) total-vm:2166856kB, anon-rss:1128732kB, file-rss:4kB, shmem-rss:0kB
> > > [ 969.958307] oom_reaper: reaped process 6262 (oom-write), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> > > [ 972.114644] XFS: Assertion failed: oldlen > newlen, file: fs/xfs/libxfs/xfs_bmap.c, line: 2867
>
> Indirect block reservation underrun on delayed allocation extent merge.
> These extra blocks are used for the inode bmap btree when a delalloc
> extent is converted to physical blocks. We're in a case where we expect
> to only ever free excess blocks due to a merge of extents with
> independent reservations, but a situation occurs where we actually need
> blocks and hence the assert fails. This can occur if an extent is merged
> with one that has a reservation less than the expected worst case
> reservation for its size (due to previous extent splits due to hole
> punches, for example). Therefore, I think the core expectation that
> xfs_bmap_add_extent_hole_delay() will always have enough blocks
> pre-reserved is invalid.
>
> Can you describe the workload that reproduces this? FWIW, I think the
> way xfs_bmap_add_extent_hole_delay() currently works is likely broken
> and have a couple patches to fix up indlen reservation that I haven't
> posted yet. The diff that deals with this particular bit is appended.
> Care to give that a try?

The workload is to write to a single file on XFS from 10 processes demonstrated at
http://lkml.kernel.org/r/201512052133.IAE00551.LSOQFtMFFVOHOJ@I-love.SAKURA.ne.jp
using "while :; do ./oom-write; done" loop on a VM with 4CPUs / 2048MB RAM.
With this XFS_FILBLKS_MIN() change applied, I no longer hit assertion failures.
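A toy model of the scenario described above can make the accounting concrete. This is only a sketch: worst() below is a made-up, sub-linear stand-in for xfs_bmap_worst_indlen(), and the block counts are arbitrary; the point is that handing a split-off piece a proportional share of an existing reservation, and later merging it with a neighbour, can leave less reservation than the merged extent's worst case, so "oldlen > newlen" no longer holds.

/* toy_indlen.c - illustration only; not the real XFS reservation math */
#include <assert.h>
#include <stdio.h>

/* stand-in for xfs_bmap_worst_indlen(): one indirect block per 32 data blocks, plus one */
static unsigned int worst(unsigned int len)
{
	return (len + 31) / 32 + 1;
}

int main(void)
{
	unsigned int res_full = worst(160);             /* 160-block delalloc extent: 6 */

	/*
	 * A hole punch splits off a 32-block piece and hands it a proportional
	 * share of the existing reservation, which is less than worst(32).
	 */
	unsigned int res_piece = res_full * 32 / 160;   /* 1, while worst(32) == 2 */

	unsigned int res_new = worst(32);               /* fresh neighbouring delalloc: 2 */

	/* merge of the two delalloc extents */
	unsigned int oldlen = res_piece + res_new;      /* 3 */
	unsigned int newlen = worst(32 + 32);           /* 3: "oldlen > newlen" fails */

	printf("oldlen=%u newlen=%u\n", oldlen, newlen);

	/* the clamp from the appended diff: never claim more than was actually reserved */
	newlen = newlen < oldlen ? newlen : oldlen;
	assert(newlen <= oldlen);

	return 0;
}

With the clamp, the merge only ever frees blocks; it never tries to use indirect blocks that were never reserved.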
On Mon, Feb 06, 2017 at 03:29:24PM +0900, Tetsuo Handa wrote:
> Brian Foster wrote:
> > On Fri, Feb 03, 2017 at 03:50:09PM +0100, Michal Hocko wrote:
> > > [Let's CC more xfs people]
> > >
> > > On Fri 03-02-17 19:57:39, Tetsuo Handa wrote:
> > > [...]
> > > > (1) I got an assertion failure.
> > >
> > > I suspect this is a result of
> > > http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org
> > > I have no idea what the assert means though.
> > >
> > > > [ 969.626518] Killed process 6262 (oom-write) total-vm:2166856kB, anon-rss:1128732kB, file-rss:4kB, shmem-rss:0kB
> > > > [ 969.958307] oom_reaper: reaped process 6262 (oom-write), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> > > > [ 972.114644] XFS: Assertion failed: oldlen > newlen, file: fs/xfs/libxfs/xfs_bmap.c, line: 2867
> >
> > Indirect block reservation underrun on delayed allocation extent merge.
> > These extra blocks are used for the inode bmap btree when a delalloc
> > extent is converted to physical blocks. We're in a case where we expect
> > to only ever free excess blocks due to a merge of extents with
> > independent reservations, but a situation occurs where we actually need
> > blocks and hence the assert fails. This can occur if an extent is merged
> > with one that has a reservation less than the expected worst case
> > reservation for its size (due to previous extent splits due to hole
> > punches, for example). Therefore, I think the core expectation that
> > xfs_bmap_add_extent_hole_delay() will always have enough blocks
> > pre-reserved is invalid.
> >
> > Can you describe the workload that reproduces this? FWIW, I think the
> > way xfs_bmap_add_extent_hole_delay() currently works is likely broken
> > and have a couple patches to fix up indlen reservation that I haven't
> > posted yet. The diff that deals with this particular bit is appended.
> > Care to give that a try?
>
> The workload is to write to a single file on XFS from 10 processes demonstrated at
> http://lkml.kernel.org/r/201512052133.IAE00551.LSOQFtMFFVOHOJ@I-love.SAKURA.ne.jp
> using "while :; do ./oom-write; done" loop on a VM with 4CPUs / 2048MB RAM.
> With this XFS_FILBLKS_MIN() change applied, I no longer hit assertion failures.
>

Thanks for testing. Well, that's an interesting workload. I couldn't
reproduce on a few quick tries in a similarly configured vm.

Normally I'd expect to see this kind of thing on a hole punching
workload or dealing with large, sparse files that make use of
speculative preallocation (post-eof blocks allocated in anticipation of
file extending writes). I'm wondering if what is happening here is that
the appending writes and file closes due to oom kills are generating
speculative preallocs and prealloc truncates, respectively, and that
causes prealloc extents at the eof boundary to be split up and then
re-merged by surviving appending writers.

/tmp/file _is_ on an XFS filesystem in your test, correct? If so and if
you still have the output file from a test that reproduced, could you
get the 'xfs_io -c "fiemap -v" <file>' output?

I suppose another possibility is that prealloc occurs, write failure(s)
leads to extent splits via unmapping the target range of the write, and
then surviving writers generate the warning on a delalloc extent merge..

Brian
On Mon 06-02-17 09:35:33, Brian Foster wrote:
> On Mon, Feb 06, 2017 at 03:29:24PM +0900, Tetsuo Handa wrote:
> > Brian Foster wrote:
> > > On Fri, Feb 03, 2017 at 03:50:09PM +0100, Michal Hocko wrote:
> > > > [Let's CC more xfs people]
> > > >
> > > > On Fri 03-02-17 19:57:39, Tetsuo Handa wrote:
> > > > [...]
> > > > > (1) I got an assertion failure.
> > > >
> > > > I suspect this is a result of
> > > > http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org
> > > > I have no idea what the assert means though.
> > > >
> > > > > [ 969.626518] Killed process 6262 (oom-write) total-vm:2166856kB, anon-rss:1128732kB, file-rss:4kB, shmem-rss:0kB
> > > > > [ 969.958307] oom_reaper: reaped process 6262 (oom-write), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> > > > > [ 972.114644] XFS: Assertion failed: oldlen > newlen, file: fs/xfs/libxfs/xfs_bmap.c, line: 2867
> > >
> > > Indirect block reservation underrun on delayed allocation extent merge.
> > > These extra blocks are used for the inode bmap btree when a delalloc
> > > extent is converted to physical blocks. We're in a case where we expect
> > > to only ever free excess blocks due to a merge of extents with
> > > independent reservations, but a situation occurs where we actually need
> > > blocks and hence the assert fails. This can occur if an extent is merged
> > > with one that has a reservation less than the expected worst case
> > > reservation for its size (due to previous extent splits due to hole
> > > punches, for example). Therefore, I think the core expectation that
> > > xfs_bmap_add_extent_hole_delay() will always have enough blocks
> > > pre-reserved is invalid.
> > >
> > > Can you describe the workload that reproduces this? FWIW, I think the
> > > way xfs_bmap_add_extent_hole_delay() currently works is likely broken
> > > and have a couple patches to fix up indlen reservation that I haven't
> > > posted yet. The diff that deals with this particular bit is appended.
> > > Care to give that a try?
> >
> > The workload is to write to a single file on XFS from 10 processes demonstrated at
> > http://lkml.kernel.org/r/201512052133.IAE00551.LSOQFtMFFVOHOJ@I-love.SAKURA.ne.jp
> > using "while :; do ./oom-write; done" loop on a VM with 4CPUs / 2048MB RAM.
> > With this XFS_FILBLKS_MIN() change applied, I no longer hit assertion failures.
> >
>
> Thanks for testing. Well, that's an interesting workload. I couldn't
> reproduce on a few quick tries in a similarly configured vm.
>
> Normally I'd expect to see this kind of thing on a hole punching
> workload or dealing with large, sparse files that make use of
> speculative preallocation (post-eof blocks allocated in anticipation of
> file extending writes). I'm wondering if what is happening here is that
> the appending writes and file closes due to oom kills are generating
> speculative preallocs and prealloc truncates, respectively, and that
> causes prealloc extents at the eof boundary to be split up and then
> re-merged by surviving appending writers.

Can those preallocs be affected by
http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org ?
On Mon, Feb 06, 2017 at 03:42:22PM +0100, Michal Hocko wrote:
> On Mon 06-02-17 09:35:33, Brian Foster wrote:
> > On Mon, Feb 06, 2017 at 03:29:24PM +0900, Tetsuo Handa wrote:
> > > Brian Foster wrote:
> > > > On Fri, Feb 03, 2017 at 03:50:09PM +0100, Michal Hocko wrote:
> > > > > [Let's CC more xfs people]
> > > > >
> > > > > On Fri 03-02-17 19:57:39, Tetsuo Handa wrote:
> > > > > [...]
> > > > > > (1) I got an assertion failure.
> > > > >
> > > > > I suspect this is a result of
> > > > > http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org
> > > > > I have no idea what the assert means though.
> > > > >
> > > > > > [ 969.626518] Killed process 6262 (oom-write) total-vm:2166856kB, anon-rss:1128732kB, file-rss:4kB, shmem-rss:0kB
> > > > > > [ 969.958307] oom_reaper: reaped process 6262 (oom-write), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> > > > > > [ 972.114644] XFS: Assertion failed: oldlen > newlen, file: fs/xfs/libxfs/xfs_bmap.c, line: 2867
> > > >
> > > > Indirect block reservation underrun on delayed allocation extent merge.
> > > > These extra blocks are used for the inode bmap btree when a delalloc
> > > > extent is converted to physical blocks. We're in a case where we expect
> > > > to only ever free excess blocks due to a merge of extents with
> > > > independent reservations, but a situation occurs where we actually need
> > > > blocks and hence the assert fails. This can occur if an extent is merged
> > > > with one that has a reservation less than the expected worst case
> > > > reservation for its size (due to previous extent splits due to hole
> > > > punches, for example). Therefore, I think the core expectation that
> > > > xfs_bmap_add_extent_hole_delay() will always have enough blocks
> > > > pre-reserved is invalid.
> > > >
> > > > Can you describe the workload that reproduces this? FWIW, I think the
> > > > way xfs_bmap_add_extent_hole_delay() currently works is likely broken
> > > > and have a couple patches to fix up indlen reservation that I haven't
> > > > posted yet. The diff that deals with this particular bit is appended.
> > > > Care to give that a try?
> > >
> > > The workload is to write to a single file on XFS from 10 processes demonstrated at
> > > http://lkml.kernel.org/r/201512052133.IAE00551.LSOQFtMFFVOHOJ@I-love.SAKURA.ne.jp
> > > using "while :; do ./oom-write; done" loop on a VM with 4CPUs / 2048MB RAM.
> > > With this XFS_FILBLKS_MIN() change applied, I no longer hit assertion failures.
> > >
> >
> > Thanks for testing. Well, that's an interesting workload. I couldn't
> > reproduce on a few quick tries in a similarly configured vm.
> >
> > Normally I'd expect to see this kind of thing on a hole punching
> > workload or dealing with large, sparse files that make use of
> > speculative preallocation (post-eof blocks allocated in anticipation of
> > file extending writes). I'm wondering if what is happening here is that
> > the appending writes and file closes due to oom kills are generating
> > speculative preallocs and prealloc truncates, respectively, and that
> > causes prealloc extents at the eof boundary to be split up and then
> > re-merged by surviving appending writers.
>
> Can those preallocs be affected by
> http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org ?
>

Hmm, I wouldn't expect that to make much of a difference wrt the core
problem.

The prealloc is created on a file extending write that requires block
allocation (we basically just tack on extra blocks to an extending
alloc based on some heuristics like the size of the file and the
previous extent). Whether that allocation occurs on one iomap iteration
or another due to a short write and retry, I wouldn't expect to matter
that much.

I suppose it could change the behavior of a specialized workload
though. E.g., if it caused a write() call to return quicker and thus
lead to a file close(). We do use file release as an indication that
prealloc will not be used and can reclaim it at that point (presumably
causing an extent split with pre-eof blocks).

Brian

> --
> Michal Hocko
> SUSE Labs
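As a rough illustration of that behavior: the sketch below is not the actual XFS heuristic (which lives in xfs_iomap_prealloc_size() and is considerably more involved), it only models the idea described above, that each extending allocation tacks on post-EOF blocks scaled from the previous extent, and that a close() can trim them again, splitting the extent at EOF.

/* toy_prealloc.c - illustration only; not the actual XFS preallocation heuristic */
#include <stdio.h>

/* toy heuristic: speculate past EOF by the size of the previous extent, capped */
static unsigned long prealloc_blocks(unsigned long prev_extent)
{
	unsigned long cap = 8192;
	return prev_extent < cap ? prev_extent : cap;
}

int main(void)
{
	unsigned long write_blocks = 64;   /* each append adds this much real data */
	unsigned long prev_extent = 0;

	for (int i = 0; i < 5; i++) {
		unsigned long spec = prealloc_blocks(prev_extent);

		/* the delalloc extent covers the write plus the speculative post-EOF blocks */
		prev_extent = write_blocks + spec;
		printf("append %lu blocks -> delalloc extent of %lu blocks (%lu past EOF)\n",
		       write_blocks, prev_extent, spec);
	}

	/*
	 * A close() (e.g. after an OOM kill) reclaims the post-EOF blocks and
	 * splits the extent at EOF; a surviving writer that appends again then
	 * merges with the shortened extent - the situation the assert trips on.
	 */
	printf("close trims the speculative blocks past EOF\n");
	return 0;
}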
Brian Foster wrote: > > The workload is to write to a single file on XFS from 10 processes demonstrated at > > http://lkml.kernel.org/r/201512052133.IAE00551.LSOQFtMFFVOHOJ@I-love.SAKURA.ne.jp > > using "while :; do ./oom-write; done" loop on a VM with 4CPUs / 2048MB RAM. > > With this XFS_FILBLKS_MIN() change applied, I no longer hit assertion failures. > > > > Thanks for testing. Well, that's an interesting workload. I couldn't > reproduce on a few quick tries in a similarly configured vm. It takes 10 to 15 minutes. Maybe some size threshold involved? > /tmp/file _is_ on an XFS filesystem in your test, correct? If so and if > you still have the output file from a test that reproduced, could you > get the 'xfs_io -c "fiemap -v" <file>' output? Here it is. [ 720.199748] 0 pages HighMem/MovableOnly [ 720.199749] 150524 pages reserved [ 720.199749] 0 pages cma reserved [ 720.199750] 0 pages hwpoisoned [ 722.187335] XFS: Assertion failed: oldlen > newlen, file: fs/xfs/libxfs/xfs_bmap.c, line: 2867 [ 722.201784] ------------[ cut here ]------------ [ 722.205940] WARNING: CPU: 0 PID: 4877 at fs/xfs/xfs_message.c:105 asswarn+0x33/0x40 [xfs] [ 722.212333] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter coretemp crct10dif_pclmul vmw_vsock_vmci_transport crc32_pclmul ghash_clmulni_intel vsock aesni_intel crypto_simd cryptd glue_helper ppdev vmw_balloon pcspkr sg parport_pc i2c_piix4 shpchp vmw_vmci parport ip_tables xfs libcrc32c sd_mod sr_mod cdrom ata_generic pata_acpi crc32c_intel serio_raw vmwgfx drm_kms_helper syscopyarea sysfillrect [ 722.243207] sysimgblt fb_sys_fops mptspi scsi_transport_spi ata_piix ahci ttm mptscsih libahci drm libata mptbase e1000 i2c_core [ 722.247704] CPU: 0 PID: 4877 Comm: write Not tainted 4.10.0-rc6-next-20170202 #498 [ 722.250612] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 [ 722.254089] Call Trace: [ 722.255751] dump_stack+0x85/0xc9 [ 722.257650] __warn+0xd1/0xf0 [ 722.259420] warn_slowpath_null+0x1d/0x20 [ 722.261434] asswarn+0x33/0x40 [xfs] [ 722.263356] xfs_bmap_add_extent_hole_delay+0xb7f/0xdf0 [xfs] [ 722.265695] xfs_bmapi_reserve_delalloc+0x297/0x440 [xfs] [ 722.267792] ? xfs_ilock+0x1c9/0x360 [xfs] [ 722.269559] xfs_file_iomap_begin+0x880/0x1140 [xfs] [ 722.271606] ? iomap_write_end+0x80/0x80 [ 722.273377] iomap_apply+0x6c/0x130 [ 722.274969] iomap_file_buffered_write+0x68/0xa0 [ 722.276702] ? iomap_write_end+0x80/0x80 [ 722.278311] xfs_file_buffered_aio_write+0x132/0x390 [xfs] [ 722.280394] ? _raw_spin_unlock+0x27/0x40 [ 722.282247] xfs_file_write_iter+0x90/0x130 [xfs] [ 722.284257] __vfs_write+0xe5/0x140 [ 722.285924] vfs_write+0xc7/0x1f0 [ 722.287536] ? 
syscall_trace_enter+0x1d0/0x380 [ 722.289490] SyS_write+0x58/0xc0 [ 722.291025] do_int80_syscall_32+0x6c/0x1f0 [ 722.292671] entry_INT80_compat+0x38/0x50 [ 722.294298] RIP: 0023:0x8048076 [ 722.295684] RSP: 002b:00000000ffedf840 EFLAGS: 00000202 ORIG_RAX: 0000000000000004 [ 722.298075] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 0000000008048000 [ 722.300516] RDX: 0000000000001000 RSI: 0000000000000000 RDI: 0000000000000000 [ 722.302902] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 722.305278] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 722.307567] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 722.309792] ---[ end trace 5b7012eeb84093b7 ]--- [ 732.650867] oom_reaper: reaped process 4876 (oom-write), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB # ls -l /tmp/file -rw------- 1 kumaneko kumaneko 43426648064 Feb 7 19:25 /tmp/file # xfs_io -c "fiemap -v" /tmp/file /tmp/file: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..262015]: 358739712..359001727 262016 0x0 1: [262016..524159]: 367651920..367914063 262144 0x0 2: [524160..1048447]: 385063864..385588151 524288 0x0 3: [1048448..1238031]: 463702512..463892095 189584 0x0 4: [1238032..3335167]: 448234520..450331655 2097136 0x0 5: [3335168..4769775]: 36165320..37599927 1434608 0x0 6: [4769776..6897175]: 31677984..33805383 2127400 0x0 7: [6897176..15285759]: 450331656..458720239 8388584 0x0 8: [15285760..18520255]: 237497528..240732023 3234496 0x0 9: [18520256..21063607]: 229750248..232293599 2543352 0x0 10: [21063608..25257855]: 240732024..244926271 4194248 0x0 11: [25257856..29452159]: 179523440..183717743 4194304 0x0 12: [29452160..30380031]: 171930952..172858823 927872 0x0 13: [30380032..31428607]: 185220160..186268735 1048576 0x0 14: [31428608..32667751]: 232293600..233532743 1239144 0x0 15: [32667752..38474351]: 172858824..178665423 5806600 0x0 16: [38474352..39157119]: 188137184..188819951 682768 0x0 17: [39157120..40205695]: 234837584..235886159 1048576 0x0 18: [40205696..42302847]: 33805384..35902535 2097152 0x0 19: [42302848..44188591]: 37599928..39485671 1885744 0x0 20: [44188592..45112703]: 446735416..447659527 924112 0x0 21: [45112704..45436343]: 445337184..445660823 323640 0x0 22: [45436344..45960575]: 447659528..448183759 524232 0x0 23: [45960576..46484863]: 463892096..464416383 524288 0x0 24: [46484864..47533439]: 445660824..446709399 1048576 0x0 25: [47533440..48541959]: 233532744..234541263 1008520 0x0 26: [48541960..49294175]: 523533736..524285951 752216 0x0 27: [49294176..49630591]: 376080552..376416967 336416 0x0 28: [49630592..50154879]: 129846752..130371039 524288 0x0 29: [50154880..50844383]: 244926272..245615775 689504 0x0 30: [50844384..51203455]: 250812112..251171183 359072 0x0 31: [51203456..51727743]: 259555424..260079711 524288 0x0 32: [51727744..52295239]: 187350752..187918247 567496 0x0 33: [52295240..52776319]: 188819952..189301031 481080 0x0 34: [52776320..53300607]: 206841040..207365327 524288 0x0 35: [53300608..53824775]: 386221504..386745671 524168 0x0 36: [53824776..54348935]: 113736928..114261087 524160 0x0 37: [54348936..54854007]: 228911704..229416775 505072 0x0 38: [54854008..54905983]: 228760200..228812175 51976 0x0 39: [54905984..54971519]: 228597920..228663455 65536 0x0 40: [54971520..55364735]: 178998696..179391911 393216 0x0 41: [55364736..55868119]: 392669176..393172559 503384 0x0 42: [55868120..56370663]: 382896800..383399343 502544 0x0 43: [56370664..56836311]: 464416384..464882031 465648 0x0 44: [56836312..57085055]: 
458720240..458968983 248744 0x0 45: [57085056..57548743]: 92768112..93231799 463688 0x0 46: [57548744..57871487]: 102724384..103047127 322744 0x0 47: [57871488..58304623]: 124278664..124711799 433136 0x0 48: [58304624..58526847]: 124712024..124934247 222224 0x0 49: [58526848..58788991]: 125635832..125897975 262144 0x0 50: [58788992..59203767]: 508031384..508446159 414776 0x0 51: [59203768..59602871]: 109812624..110211727 399104 0x0 52: [59602872..59992183]: 385736856..386126167 389312 0x0 53: [59992184..60381311]: 237108384..237497511 389128 0x0 54: [60381312..60756863]: 506355968..506731519 375552 0x0 55: [60756864..61127487]: 186268736..186639359 370624 0x0 56: [61127488..61490767]: 112848304..113211583 363280 0x0 57: [61490768..61541503]: 113214200..113264935 50736 0x0 58: [61541504..61904775]: 112246776..112610047 363272 0x0 59: [61904776..62246247]: 106328512..106669983 341472 0x0 60: [62246248..62571991]: 126075640..126401383 325744 0x0 61: [62571992..62895759]: 108921744..109245511 323768 0x0 62: [62895760..63219159]: 380153016..380476415 323400 0x0 63: [63219160..63442047]: 381056248..381279135 222888 0x0 64: [63442048..63704191]: 379768072..380030215 262144 0x0 65: [63704192..64026847]: 108328888..108651543 322656 0x0 66: [64026848..64342415]: 251387232..251702799 315568 0x0 67: [64342416..64651407]: 183717744..184026735 308992 0x0 68: [64651408..64947983]: 384092440..384389015 296576 0x0 69: [64947984..65145983]: 381775560..381973559 198000 0x0 70: [65145984..65408127]: 186914504..187176647 262144 0x0 71: [65408128..65447943]: 125328232..125368047 39816 0x0 72: [65447944..65690599]: 372579112..372821767 242656 0x0 73: [65690600..65929863]: 130429664..130668927 239264 0x0 74: [65929864..66168935]: 120951784..121190855 239072 0x0 75: [66168936..66402279]: 372845976..373079319 233344 0x0 76: [66402280..66633199]: 113372616..113603535 230920 0x0 77: [66633200..66859943]: 115982256..116208999 226744 0x0 78: [66859944..67082127]: 127187600..127409783 222184 0x0 79: [67082128..67217407]: 127636680..127771959 135280 0x0 80: [67217408..67280095]: 129510736..129573423 62688 0x0 81: [67280096..67499063]: 119220288..119439255 218968 0x0 82: [67499064..67717935]: 507320248..507539119 218872 0x0 83: [67717936..67936119]: 129292544..129510727 218184 0x0 84: [67936120..68153903]: 125368048..125585831 217784 0x0 85: [68153904..68370703]: 117784232..118001031 216800 0x0 86: [68370704..68586039]: 121997008..122212343 215336 0x0 87: [68586040..68798855]: 379191840..379404655 212816 0x0 88: [68798856..68983935]: 378690808..378875887 185080 0x0 89: [68983936..69196727]: 90790848..91003639 212792 0x0 90: [69196728..69409287]: 123091672..123304231 212560 0x0 91: [69409288..69621503]: 377436856..377649071 212216 0x0 92: [69621504..69828847]: 128990088..129197431 207344 0x0 93: [69828848..70035391]: 497270968..497477511 206544 0x0 94: [70035392..70241111]: 391898048..392103767 205720 0x0 95: [70241112..70446207]: 260716672..260921767 205096 0x0 96: [70446208..70507647]: 260079712..260141151 61440 0x0 97: [70507648..70704255]: 245836040..246032647 196608 0x0 98: [70704256..70906591]: 107009096..107211431 202336 0x0 99: [70906592..71108807]: 389471224..389673439 202216 0x0 100: [71108808..71309703]: 224305904..224506799 200896 0x0 101: [71309704..71509487]: 388524632..388724415 199784 0x0 102: [71509488..71707119]: 87983688..88181319 197632 0x0 103: [71707120..71903015]: 236195680..236391575 195896 0x0 104: [71903016..72098791]: 389000248..389196023 195776 0x0 105: [72098792..72294471]: 386931872..387127551 
195680 0x0 106: [72294472..72342655]: 387127560..387175743 48184 0x0 107: [72342656..72408191]: 388031464..388096999 65536 0x0 108: [72408192..72539263]: 388194472..388325543 131072 0x0 109: [72539264..72562039]: 369903992..369926767 22776 0x0 110: [72562040..72753639]: 506916880..507108479 191600 0x0 111: [72753640..72945143]: 360577376..360768879 191504 0x0 112: [72945144..73136575]: 246426760..246618191 191432 0x0 113: [73136576..73326047]: 116629288..116818759 189472 0x0 114: [73326048..73515047]: 392203096..392392095 189000 0x0 115: [73515048..73699967]: 223549160..223734079 184920 0x0 116: [73699968..73883879]: 118860856..119044767 183912 0x0 117: [73883880..74067175]: 506143208..506326503 183296 0x0 118: [74067176..74249703]: 507108800..507291327 182528 0x0 119: [74249704..74401335]: 258917640..259069271 151632 0x0 120: [74401336..74583135]: 122742560..122924359 181800 0x0 121: [74583136..74764223]: 374250096..374431183 181088 0x0 122: [74764224..74945271]: 91175800..91356847 181048 0x0 123: [74945272..75124183]: 362484776..362663687 178912 0x0 124: [75124184..75302615]: 223086192..223264623 178432 0x0 125: [75302616..75479279]: 359280032..359456695 176664 0x0 126: [75479280..75655559]: 63083912..63260191 176280 0x0 127: [75655560..75831487]: 384469152..384645079 175928 0x0 128: [75831488..76006815]: 381459584..381634911 175328 0x0 129: [76006816..76181255]: 110626376..110800815 174440 0x0 130: [76181256..76355399]: 380785616..380959759 174144 0x0 131: [76355400..76527527]: 362768136..362940263 172128 0x0 132: [76527528..76698695]: 122571384..122742551 171168 0x0 133: [76698696..76868951]: 382399576..382569831 170256 0x0 134: [76868952..77039095]: 388353776..388523919 170144 0x0 135: [77039096..77209183]: 120236192..120406279 170088 0x0 136: [77209184..77379183]: 383464120..383634119 170000 0x0 137: [77379184..77548655]: 369926768..370096239 169472 0x0 138: [77548656..77717663]: 88823232..88992239 169008 0x0 139: [77717664..77884951]: 365878672..366045959 167288 0x0 140: [77884952..77897079]: 366445360..366457487 12128 0x0 141: [77897080..78063423]: 391500528..391666871 166344 0x0 142: [78063424..78229407]: 107876400..108042383 165984 0x0 143: [78229408..78395135]: 358573976..358739703 165728 0x0 144: [78395136..78560703]: 117078480..117244047 165568 0x0 145: [78560704..78726063]: 257377088..257542447 165360 0x0 146: [78726064..78889519]: 389678704..389842159 163456 0x0 147: [78889520..79052607]: 225850112..226013199 163088 0x0 148: [79052608..79215111]: 359822880..359985383 162504 0x0 149: [79215112..79376559]: 357914720..358076167 161448 0x0 150: [79376560..79538007]: 115473264..115634711 161448 0x0 151: [79538008..79698815]: 112610056..112770863 160808 0x0 152: [79698816..79857631]: 258732456..258891271 158816 0x0 153: [79857632..80015807]: 388725328..388883503 158176 0x0 154: [80015808..80173583]: 93847144..94004919 157776 0x0 155: [80173584..80331295]: 362940272..363097983 157712 0x0 156: [80331296..80488727]: 252008432..252165863 157432 0x0 157: [80488728..80646055]: 118387696..118545023 157328 0x0 158: [80646056..80803239]: 111368744..111525927 157184 0x0 159: [80803240..80960055]: 129573424..129730239 156816 0x0 160: [80960056..81116863]: 497936416..498093223 156808 0x0 161: [81116864..81272623]: 492109560..492265319 155760 0x0 162: [81272624..81427695]: 114554072..114709143 155072 0x0 163: [81427696..81582519]: 106854264..107009087 154824 0x0 164: [81582520..81735503]: 220700824..220853807 152984 0x0 165: [81735504..81887807]: 490724024..490876327 152304 0x0 166: 
[81887808..82038863]: 122393688..122544743 151056 0x0 167: [82038864..82189151]: 91659448..91809735 150288 0x0 168: [82189152..82337287]: 85811104..85959239 148136 0x0 169: [82337288..82484743]: 235886160..236033615 147456 0x0 170: [82484744..82631943]: 117486472..117633671 147200 0x0 171: [82631944..82777887]: 491753616..491899559 145944 0x0 172: [82777888..82923799]: 94927544..95073455 145912 0x0 173: [82923800..83068527]: 373754864..373899591 144728 0x0 174: [83068528..83116375]: 373980848..374028695 47848 0x0 175: [83116376..83261039]: 361766120..361910783 144664 0x0 176: [83261040..83404007]: 374431192..374574159 142968 0x0 177: [83404008..83546815]: 484667976..484810783 142808 0x0 178: [83546816..83689279]: 251702808..251845271 142464 0x0 179: [83689280..83831711]: 90474240..90616671 142432 0x0 180: [83831712..83972959]: 109362776..109504023 141248 0x0 181: [83972960..84113743]: 377296064..377436847 140784 0x0 182: [84113744..84254303]: 378416056..378556615 140560 0x0 183: [84254304..84393663]: 89517888..89657247 139360 0x0 184: [84393664..84532831]: 376569640..376708807 139168 0x0 185: [84532832..84671975]: 108725224..108864367 139144 0x0 186: [84671976..84810807]: 109637664..109776495 138832 0x0 187: [84810808..84901119]: 110211736..110302047 90312 0x1 -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Feb 07, 2017 at 07:30:54PM +0900, Tetsuo Handa wrote:
> Brian Foster wrote:
> > > The workload is to write to a single file on XFS from 10 processes demonstrated at
> > > http://lkml.kernel.org/r/201512052133.IAE00551.LSOQFtMFFVOHOJ@I-love.SAKURA.ne.jp
> > > using "while :; do ./oom-write; done" loop on a VM with 4CPUs / 2048MB RAM.
> > > With this XFS_FILBLKS_MIN() change applied, I no longer hit assertion failures.
> > >
> >
> > Thanks for testing. Well, that's an interesting workload. I couldn't
> > reproduce on a few quick tries in a similarly configured vm.
>
> It takes 10 to 15 minutes. Maybe some size threshold involved?
>
> > /tmp/file _is_ on an XFS filesystem in your test, correct? If so and if
> > you still have the output file from a test that reproduced, could you
> > get the 'xfs_io -c "fiemap -v" <file>' output?
>
> Here it is.
>
> [ 720.199748] 0 pages HighMem/MovableOnly
> [ 720.199749] 150524 pages reserved
> [ 720.199749] 0 pages cma reserved
> [ 720.199750] 0 pages hwpoisoned
> [ 722.187335] XFS: Assertion failed: oldlen > newlen, file: fs/xfs/libxfs/xfs_bmap.c, line: 2867
> [ 722.201784] ------------[ cut here ]------------
...
>
> # ls -l /tmp/file
> -rw------- 1 kumaneko kumaneko 43426648064 Feb 7 19:25 /tmp/file
> # xfs_io -c "fiemap -v" /tmp/file
> /tmp/file:
> EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
> 0: [0..262015]: 358739712..359001727 262016 0x0
...
> 187: [84810808..84901119]: 110211736..110302047 90312 0x1

Ok, from the size of the file I realized that I missed you were running
in a loop the first time around. I tried playing with it some more and
still haven't been able to reproduce.

Anyways, the patch intended to fix this has been reviewed[1] and queued
for the next release, so it's probably not a big deal since you've
already verified it. Thanks again.

Brian

[1] http://www.spinics.net/lists/linux-xfs/msg04083.html
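For anyone attempting to reproduce, the following is only a rough approximation of the setup described in the thread, not the original oom-write/write programs (whose source is in the linked 2015 post): ten children append 4096-byte writes to /tmp/file on the XFS filesystem under test while the parent consumes anonymous memory until the OOM killer fires, with the whole thing restarted in a shell loop.

/* repro_sketch.c - rough approximation of the reported workload, not the original test program */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	char buf[4096];
	memset(buf, 'x', sizeof(buf));

	/* ten writers appending to a single file on the XFS filesystem under test */
	for (int i = 0; i < 10; i++) {
		if (fork() == 0) {
			int fd = open("/tmp/file", O_WRONLY | O_CREAT | O_APPEND, 0600);
			if (fd < 0)
				_exit(1);
			for (;;)
				write(fd, buf, sizeof(buf));
		}
	}

	/* memory hog: allocate and touch anon memory until the OOM killer intervenes */
	for (;;) {
		char *p = malloc(1 << 20);
		if (p)
			memset(p, 0, 1 << 20);
	}
}

Run under something like "while :; do ./repro_sketch; done" so the test restarts after each OOM kill, matching the reported loop.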
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index bfc00de..d2e48ed 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -2809,7 +2809,8 @@ xfs_bmap_add_extent_hole_delay(
 		oldlen = startblockval(left.br_startblock) +
 			startblockval(new->br_startblock) +
 			startblockval(right.br_startblock);
-		newlen = xfs_bmap_worst_indlen(ip, temp);
+		newlen = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp),
+					 oldlen);
 		xfs_bmbt_set_startblock(xfs_iext_get_ext(ifp, *idx),
 			nullstartblock((int)newlen));
 		trace_xfs_bmap_post_update(ip, *idx, state, _THIS_IP_);
@@ -2830,7 +2831,8 @@ xfs_bmap_add_extent_hole_delay(
 		xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, *idx), temp);
 		oldlen = startblockval(left.br_startblock) +
 			startblockval(new->br_startblock);
-		newlen = xfs_bmap_worst_indlen(ip, temp);
+		newlen = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp),
+					 oldlen);
 		xfs_bmbt_set_startblock(xfs_iext_get_ext(ifp, *idx),
 			nullstartblock((int)newlen));
 		trace_xfs_bmap_post_update(ip, *idx, state, _THIS_IP_);
@@ -2846,7 +2848,8 @@ xfs_bmap_add_extent_hole_delay(
 		temp = new->br_blockcount + right.br_blockcount;
 		oldlen = startblockval(new->br_startblock) +
 			startblockval(right.br_startblock);
-		newlen = xfs_bmap_worst_indlen(ip, temp);
+		newlen = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp),
+					 oldlen);
 		xfs_bmbt_set_allf(xfs_iext_get_ext(ifp, *idx), new->br_startoff,
 			nullstartblock((int)newlen), temp, right.br_state);