Message ID | 20220829070212.2540615-1-guoxuenan@huawei.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | xfs: fix uaf when leaf dir bestcount not match with dir data blocks | expand |
On Mon, Aug 29, 2022 at 03:02:12PM +0800, Guo Xuenan wrote: > For leaf dir, there should be as many bestfree slots as there are dir data > blocks that can fit under i_size. Othrewise, which may cause UAF or > slab-out-of bound etc. Nice find, and thanks for the comprehensive description of the problem! > Root cause is we don't examin the number bestfree slots, when the slots > number less than dir data blocks, if we need to allocate new dir data block > and update the bestfree array, we will use the dir block number as index to > assign bestfree array, which may cause UAF or other memory access problem. > This issue can also triggered with test cases xfs/473 from fstests. > > Simplify the testcase xfs/473 with commands below: > DEV=/dev/sdb > MP=/mnt/sdb > WORKDIR=/mnt/sdb/341 #1. mkfs create new xfs image > mkfs.xfs -f ${DEV} > mount ${DEV} ${MP} > mkdir -p ${WORKDIR} > for i in `seq 1 341` #2. create leaf dir with 341 entries file name len 8 > do > touch ${WORKDIR}/$(printf "%08d" ${i}) > done > inode=$(ls -i ${MP} | cut -d' ' -f1) > umount ${MP} #3. xfs_db set bestcount to 0 > xfs_db -x ${DEV} -c "inode ${inode}" -c "dblock 8388608" \ > -c "write ltail.bestcount 0" > mount ${DEV} ${MP} > touch ${WORKDIR}/{1..100}.txt #4. touch new file, reproduce Ah, so it's verifier deficiency... Can you turn this reproducer back into a new fstest, please? [....] > Signed-off-by: Guo Xuenan <guoxuenan@huawei.com> > --- > fs/xfs/libxfs/xfs_dir2_leaf.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c > index d9b66306a9a7..09414651ac48 100644 > --- a/fs/xfs/libxfs/xfs_dir2_leaf.c > +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c > @@ -659,6 +659,20 @@ xfs_dir2_leaf_addname( > bestsp = xfs_dir2_leaf_bests_p(ltp); > length = xfs_dir2_data_entsize(dp->i_mount, args->namelen); > > + /* > + * There should be as many bestfree slots as there are dir data > + * blocks that can fit under i_size. Othrewise, which may cause > + * serious problems eg. UAF or slab-out-of bound etc. > + */ > + if (be32_to_cpu(ltp->bestcount) != > + xfs_dir2_byte_to_db(args->geo, dp->i_d.di_size)) { > + xfs_buf_ioerror_alert(lbp, __return_address); > + if (tp->t_flags & XFS_TRANS_DIRTY) > + xfs_force_shutdown(tp->t_mountp, > + SHUTDOWN_CORRUPT_INCORE); > + return -EFSCORRUPTED; > + } > + Yeah, that needs to go in xfs_dir3_leaf_check_int() so we catch the corruption as it comes in off disk (and before an in-memory corruption might get written back to disk) - we don't need to check it every time we add an entry to a leaf block. Indeed, I left a comment in xfs_dir3_leaf_check_int() indicating that the checking wasn't restrictive enough: /* * XXX (dgc): This value is not restrictive enough. * Should factor in the size of the bests table as well. * We can deduce a value for that from i_disk_size. */ if (hdr->count > geo->leaf_max_ents) return __this_address; IOWs, we need update xfs_dir3_leaf_check_int() to check the bestcount against the size of the bests table as you've done above, and then also use that table size info to reduce the bound that we validate the leaf block entry count against, too. Can you update your fix to validate both these fields correctly against the size of the bests table in xfs_dir3_leaf_check_int()? Cheers, Dave.
Hi Guo, Thank you for the patch! Yet something to improve: [auto build test ERROR on v6.0-rc3] [also build test ERROR on linus/master next-20220829] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Guo-Xuenan/xfs-fix-uaf-when-leaf-dir-bestcount-not-match-with-dir-data-blocks/20220829-144530 base: b90cb1053190353cc30f0fef0ef1f378ccc063c5 config: sh-allmodconfig (https://download.01.org/0day-ci/archive/20220829/202208291703.BcRRyCDy-lkp@intel.com/config) compiler: sh4-linux-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/26c85e96017e84257ac452f142a123bfd7dad776 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Guo-Xuenan/xfs-fix-uaf-when-leaf-dir-bestcount-not-match-with-dir-data-blocks/20220829-144530 git checkout 26c85e96017e84257ac452f142a123bfd7dad776 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sh SHELL=/bin/bash fs/ If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): fs/xfs/libxfs/xfs_dir2_leaf.c: In function 'xfs_dir2_leaf_addname': >> fs/xfs/libxfs/xfs_dir2_leaf.c:668:60: error: 'struct xfs_inode' has no member named 'i_d'; did you mean 'i_df'? 668 | xfs_dir2_byte_to_db(args->geo, dp->i_d.di_size)) { | ^~~ | i_df vim +668 fs/xfs/libxfs/xfs_dir2_leaf.c 604 605 /* 606 * Add an entry to a leaf form directory. 607 */ 608 int /* error */ 609 xfs_dir2_leaf_addname( 610 struct xfs_da_args *args) /* operation arguments */ 611 { 612 struct xfs_dir3_icleaf_hdr leafhdr; 613 struct xfs_trans *tp = args->trans; 614 __be16 *bestsp; /* freespace table in leaf */ 615 __be16 *tagp; /* end of data entry */ 616 struct xfs_buf *dbp; /* data block buffer */ 617 struct xfs_buf *lbp; /* leaf's buffer */ 618 struct xfs_dir2_leaf *leaf; /* leaf structure */ 619 struct xfs_inode *dp = args->dp; /* incore directory inode */ 620 struct xfs_dir2_data_hdr *hdr; /* data block header */ 621 struct xfs_dir2_data_entry *dep; /* data block entry */ 622 struct xfs_dir2_leaf_entry *lep; /* leaf entry table pointer */ 623 struct xfs_dir2_leaf_entry *ents; 624 struct xfs_dir2_data_unused *dup; /* data unused entry */ 625 struct xfs_dir2_leaf_tail *ltp; /* leaf tail pointer */ 626 struct xfs_dir2_data_free *bf; /* bestfree table */ 627 int compact; /* need to compact leaves */ 628 int error; /* error return value */ 629 int grown; /* allocated new data block */ 630 int highstale = 0; /* index of next stale leaf */ 631 int i; /* temporary, index */ 632 int index; /* leaf table position */ 633 int length; /* length of new entry */ 634 int lfloglow; /* low leaf logging index */ 635 int lfloghigh; /* high leaf logging index */ 636 int lowstale = 0; /* index of prev stale leaf */ 637 int needbytes; /* leaf block bytes needed */ 638 int needlog; /* need to log data header */ 639 int needscan; /* need to rescan data free */ 640 xfs_dir2_db_t use_block; /* data block number */ 641 642 trace_xfs_dir2_leaf_addname(args); 643 644 error = xfs_dir3_leaf_read(tp, dp, args->geo->leafblk, &lbp); 645 if (error) 646 return error; 647 648 /* 649 * Look up the entry by hash value and name. 650 * We know it's not there, our caller has already done a lookup. 651 * So the index is of the entry to insert in front of. 652 * But if there are dup hash values the index is of the first of those. 653 */ 654 index = xfs_dir2_leaf_search_hash(args, lbp); 655 leaf = lbp->b_addr; 656 ltp = xfs_dir2_leaf_tail_p(args->geo, leaf); 657 xfs_dir2_leaf_hdr_from_disk(dp->i_mount, &leafhdr, leaf); 658 ents = leafhdr.ents; 659 bestsp = xfs_dir2_leaf_bests_p(ltp); 660 length = xfs_dir2_data_entsize(dp->i_mount, args->namelen); 661 662 /* 663 * There should be as many bestfree slots as there are dir data 664 * blocks that can fit under i_size. Othrewise, which may cause 665 * serious problems eg. UAF or slab-out-of bound etc. 666 */ 667 if (be32_to_cpu(ltp->bestcount) != > 668 xfs_dir2_byte_to_db(args->geo, dp->i_d.di_size)) { 669 xfs_buf_ioerror_alert(lbp, __return_address); 670 if (tp->t_flags & XFS_TRANS_DIRTY) 671 xfs_force_shutdown(tp->t_mountp, 672 SHUTDOWN_CORRUPT_INCORE); 673 return -EFSCORRUPTED; 674 } 675 676 /* 677 * See if there are any entries with the same hash value 678 * and space in their block for the new entry. 679 * This is good because it puts multiple same-hash value entries 680 * in a data block, improving the lookup of those entries. 681 */ 682 for (use_block = -1, lep = &ents[index]; 683 index < leafhdr.count && be32_to_cpu(lep->hashval) == args->hashval; 684 index++, lep++) { 685 if (be32_to_cpu(lep->address) == XFS_DIR2_NULL_DATAPTR) 686 continue; 687 i = xfs_dir2_dataptr_to_db(args->geo, be32_to_cpu(lep->address)); 688 ASSERT(i < be32_to_cpu(ltp->bestcount)); 689 ASSERT(bestsp[i] != cpu_to_be16(NULLDATAOFF)); 690 if (be16_to_cpu(bestsp[i]) >= length) { 691 use_block = i; 692 break; 693 } 694 } 695 /* 696 * Didn't find a block yet, linear search all the data blocks. 697 */ 698 if (use_block == -1) { 699 for (i = 0; i < be32_to_cpu(ltp->bestcount); i++) { 700 /* 701 * Remember a block we see that's missing. 702 */ 703 if (bestsp[i] == cpu_to_be16(NULLDATAOFF) && 704 use_block == -1) 705 use_block = i; 706 else if (be16_to_cpu(bestsp[i]) >= length) { 707 use_block = i; 708 break; 709 } 710 } 711 } 712 /* 713 * How many bytes do we need in the leaf block? 714 */ 715 needbytes = 0; 716 if (!leafhdr.stale) 717 needbytes += sizeof(xfs_dir2_leaf_entry_t); 718 if (use_block == -1) 719 needbytes += sizeof(xfs_dir2_data_off_t); 720 721 /* 722 * Now kill use_block if it refers to a missing block, so we 723 * can use it as an indication of allocation needed. 724 */ 725 if (use_block != -1 && bestsp[use_block] == cpu_to_be16(NULLDATAOFF)) 726 use_block = -1; 727 /* 728 * If we don't have enough free bytes but we can make enough 729 * by compacting out stale entries, we'll do that. 730 */ 731 if ((char *)bestsp - (char *)&ents[leafhdr.count] < needbytes && 732 leafhdr.stale > 1) 733 compact = 1; 734 735 /* 736 * Otherwise if we don't have enough free bytes we need to 737 * convert to node form. 738 */ 739 else if ((char *)bestsp - (char *)&ents[leafhdr.count] < needbytes) { 740 /* 741 * Just checking or no space reservation, give up. 742 */ 743 if ((args->op_flags & XFS_DA_OP_JUSTCHECK) || 744 args->total == 0) { 745 xfs_trans_brelse(tp, lbp); 746 return -ENOSPC; 747 } 748 /* 749 * Convert to node form. 750 */ 751 error = xfs_dir2_leaf_to_node(args, lbp); 752 if (error) 753 return error; 754 /* 755 * Then add the new entry. 756 */ 757 return xfs_dir2_node_addname(args); 758 } 759 /* 760 * Otherwise it will fit without compaction. 761 */ 762 else 763 compact = 0; 764 /* 765 * If just checking, then it will fit unless we needed to allocate 766 * a new data block. 767 */ 768 if (args->op_flags & XFS_DA_OP_JUSTCHECK) { 769 xfs_trans_brelse(tp, lbp); 770 return use_block == -1 ? -ENOSPC : 0; 771 } 772 /* 773 * If no allocations are allowed, return now before we've 774 * changed anything. 775 */ 776 if (args->total == 0 && use_block == -1) { 777 xfs_trans_brelse(tp, lbp); 778 return -ENOSPC; 779 } 780 /* 781 * Need to compact the leaf entries, removing stale ones. 782 * Leave one stale entry behind - the one closest to our 783 * insertion index - and we'll shift that one to our insertion 784 * point later. 785 */ 786 if (compact) { 787 xfs_dir3_leaf_compact_x1(&leafhdr, ents, &index, &lowstale, 788 &highstale, &lfloglow, &lfloghigh); 789 } 790 /* 791 * There are stale entries, so we'll need log-low and log-high 792 * impossibly bad values later. 793 */ 794 else if (leafhdr.stale) { 795 lfloglow = leafhdr.count; 796 lfloghigh = -1; 797 } 798 /* 799 * If there was no data block space found, we need to allocate 800 * a new one. 801 */ 802 if (use_block == -1) { 803 /* 804 * Add the new data block. 805 */ 806 if ((error = xfs_dir2_grow_inode(args, XFS_DIR2_DATA_SPACE, 807 &use_block))) { 808 xfs_trans_brelse(tp, lbp); 809 return error; 810 } 811 /* 812 * Initialize the block. 813 */ 814 if ((error = xfs_dir3_data_init(args, use_block, &dbp))) { 815 xfs_trans_brelse(tp, lbp); 816 return error; 817 } 818 /* 819 * If we're adding a new data block on the end we need to 820 * extend the bests table. Copy it up one entry. 821 */ 822 if (use_block >= be32_to_cpu(ltp->bestcount)) { 823 bestsp--; 824 memmove(&bestsp[0], &bestsp[1], 825 be32_to_cpu(ltp->bestcount) * sizeof(bestsp[0])); 826 be32_add_cpu(<p->bestcount, 1); 827 xfs_dir3_leaf_log_tail(args, lbp); 828 xfs_dir3_leaf_log_bests(args, lbp, 0, 829 be32_to_cpu(ltp->bestcount) - 1); 830 } 831 /* 832 * If we're filling in a previously empty block just log it. 833 */ 834 else 835 xfs_dir3_leaf_log_bests(args, lbp, use_block, use_block); 836 hdr = dbp->b_addr; 837 bf = xfs_dir2_data_bestfree_p(dp->i_mount, hdr); 838 bestsp[use_block] = bf[0].length; 839 grown = 1; 840 } else { 841 /* 842 * Already had space in some data block. 843 * Just read that one in. 844 */ 845 error = xfs_dir3_data_read(tp, dp, 846 xfs_dir2_db_to_da(args->geo, use_block), 847 0, &dbp); 848 if (error) { 849 xfs_trans_brelse(tp, lbp); 850 return error; 851 } 852 hdr = dbp->b_addr; 853 bf = xfs_dir2_data_bestfree_p(dp->i_mount, hdr); 854 grown = 0; 855 } 856 /* 857 * Point to the biggest freespace in our data block. 858 */ 859 dup = (xfs_dir2_data_unused_t *) 860 ((char *)hdr + be16_to_cpu(bf[0].offset)); 861 needscan = needlog = 0; 862 /* 863 * Mark the initial part of our freespace in use for the new entry. 864 */ 865 error = xfs_dir2_data_use_free(args, dbp, dup, 866 (xfs_dir2_data_aoff_t)((char *)dup - (char *)hdr), 867 length, &needlog, &needscan); 868 if (error) { 869 xfs_trans_brelse(tp, lbp); 870 return error; 871 } 872 /* 873 * Initialize our new entry (at last). 874 */ 875 dep = (xfs_dir2_data_entry_t *)dup; 876 dep->inumber = cpu_to_be64(args->inumber); 877 dep->namelen = args->namelen; 878 memcpy(dep->name, args->name, dep->namelen); 879 xfs_dir2_data_put_ftype(dp->i_mount, dep, args->filetype); 880 tagp = xfs_dir2_data_entry_tag_p(dp->i_mount, dep); 881 *tagp = cpu_to_be16((char *)dep - (char *)hdr); 882 /* 883 * Need to scan fix up the bestfree table. 884 */ 885 if (needscan) 886 xfs_dir2_data_freescan(dp->i_mount, hdr, &needlog); 887 /* 888 * Need to log the data block's header. 889 */ 890 if (needlog) 891 xfs_dir2_data_log_header(args, dbp); 892 xfs_dir2_data_log_entry(args, dbp, dep); 893 /* 894 * If the bests table needs to be changed, do it. 895 * Log the change unless we've already done that. 896 */ 897 if (be16_to_cpu(bestsp[use_block]) != be16_to_cpu(bf[0].length)) { 898 bestsp[use_block] = bf[0].length; 899 if (!grown) 900 xfs_dir3_leaf_log_bests(args, lbp, use_block, use_block); 901 } 902 903 lep = xfs_dir3_leaf_find_entry(&leafhdr, ents, index, compact, lowstale, 904 highstale, &lfloglow, &lfloghigh); 905 906 /* 907 * Fill in the new leaf entry. 908 */ 909 lep->hashval = cpu_to_be32(args->hashval); 910 lep->address = cpu_to_be32( 911 xfs_dir2_db_off_to_dataptr(args->geo, use_block, 912 be16_to_cpu(*tagp))); 913 /* 914 * Log the leaf fields and give up the buffers. 915 */ 916 xfs_dir2_leaf_hdr_to_disk(dp->i_mount, leaf, &leafhdr); 917 xfs_dir3_leaf_log_header(args, lbp); 918 xfs_dir3_leaf_log_ents(args, &leafhdr, lbp, lfloglow, lfloghigh); 919 xfs_dir3_leaf_check(dp, lbp); 920 xfs_dir3_data_check(dp, dbp); 921 return 0; 922 } 923
On Mon, Aug 29, 2022 at 03:02:12PM +0800, Guo Xuenan wrote: > For leaf dir, there should be as many bestfree slots as there are dir data > blocks that can fit under i_size. Othrewise, which may cause UAF or > slab-out-of bound etc. > > Root cause is we don't examin the number bestfree slots, when the slots > number less than dir data blocks, if we need to allocate new dir data block > and update the bestfree array, we will use the dir block number as index to > assign bestfree array, which may cause UAF or other memory access problem. > This issue can also triggered with test cases xfs/473 from fstests. > > Simplify the testcase xfs/473 with commands below: > DEV=/dev/sdb > MP=/mnt/sdb > WORKDIR=/mnt/sdb/341 #1. mkfs create new xfs image > mkfs.xfs -f ${DEV} > mount ${DEV} ${MP} > mkdir -p ${WORKDIR} > for i in `seq 1 341` #2. create leaf dir with 341 entries file name len 8 > do > touch ${WORKDIR}/$(printf "%08d" ${i}) > done > inode=$(ls -i ${MP} | cut -d' ' -f1) > umount ${MP} #3. xfs_db set bestcount to 0 > xfs_db -x ${DEV} -c "inode ${inode}" -c "dblock 8388608" \ > -c "write ltail.bestcount 0" > mount ${DEV} ${MP} > touch ${WORKDIR}/{1..100}.txt #4. touch new file, reproduce > > The error log is shown as follows: > ================================================================== > BUG: KASAN: use-after-free in xfs_dir2_leaf_addname+0x1995/0x1ac0 > Write of size 2 at addr ffff88810168b000 by task touch/1552 > CPU: 5 PID: 1552 Comm: touch Not tainted 6.0.0-rc3+ #101 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.13.0-1ubuntu1.1 04/01/2014 > Call Trace: > <TASK> > dump_stack_lvl+0x4d/0x66 > print_report.cold+0xf6/0x691 > kasan_report+0xa8/0x120 > xfs_dir2_leaf_addname+0x1995/0x1ac0 > xfs_dir_createname+0x58c/0x7f0 > xfs_create+0x7af/0x1010 > xfs_generic_create+0x270/0x5e0 > path_openat+0x270b/0x3450 > do_filp_open+0x1cf/0x2b0 > do_sys_openat2+0x46b/0x7a0 > do_sys_open+0xb7/0x130 > do_syscall_64+0x35/0x80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > RIP: 0033:0x7fe4d9e9312b > Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 > 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 > f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 > RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 > RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b > RDX: 0000000000000941 RSI: 00007ffda4c17f33 RDI: 00000000ffffff9c > RBP: 00007ffda4c17f33 R08: 0000000000000000 R09: 0000000000000000 > R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941 > R13: 00007fe4d9f631a4 R14: 00007ffda4c17f33 R15: 0000000000000000 > </TASK> > > The buggy address belongs to the physical page: > page:ffffea000405a2c0 refcount:0 mapcount:0 mapping:0000000000000000 > index:0x0 pfn:0x10168b > flags: 0x2fffff80000000(node=0|zone=2|lastcpupid=0x1fffff) > raw: 002fffff80000000 ffffea0004057788 ffffea000402dbc8 0000000000000000 > raw: 0000000000000000 0000000000170000 00000000ffffffff 0000000000000000 > page dumped because: kasan: bad access detected > > Memory state around the buggy address: > ffff88810168af00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > ffff88810168af80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > >ffff88810168b000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff > ^ > ffff88810168b080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff > ffff88810168b100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff > ================================================================== > Disabling lock debugging due to kernel taint > 00000000: 58 44 44 33 5b 53 35 c2 00 00 00 00 00 00 00 78 > XDD3[S5........x > XFS (sdb): Internal error xfs_dir2_data_use_free at line 1200 of file > fs/xfs/libxfs/xfs_dir2_data.c. Caller > xfs_dir2_data_use_free+0x28a/0xeb0 > CPU: 5 PID: 1552 Comm: touch Tainted: G B 6.0.0-rc3+ > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.13.0-1ubuntu1.1 04/01/2014 > Call Trace: > <TASK> > dump_stack_lvl+0x4d/0x66 > xfs_corruption_error+0x132/0x150 > xfs_dir2_data_use_free+0x198/0xeb0 > xfs_dir2_leaf_addname+0xa59/0x1ac0 > xfs_dir_createname+0x58c/0x7f0 > xfs_create+0x7af/0x1010 > xfs_generic_create+0x270/0x5e0 > path_openat+0x270b/0x3450 > do_filp_open+0x1cf/0x2b0 > do_sys_openat2+0x46b/0x7a0 > do_sys_open+0xb7/0x130 > do_syscall_64+0x35/0x80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > RIP: 0033:0x7fe4d9e9312b > Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 > 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 > f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 > RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 > RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b > RDX: 0000000000000941 RSI: 00007ffda4c17f46 RDI: 00000000ffffff9c > RBP: 00007ffda4c17f46 R08: 0000000000000000 R09: 0000000000000001 > R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941 > R13: 00007fe4d9f631a4 R14: 00007ffda4c17f46 R15: 0000000000000000 > </TASK> > XFS (sdb): Corruption detected. Unmount and run xfs_repair > > Signed-off-by: Guo Xuenan <guoxuenan@huawei.com> > --- > fs/xfs/libxfs/xfs_dir2_leaf.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c > index d9b66306a9a7..09414651ac48 100644 > --- a/fs/xfs/libxfs/xfs_dir2_leaf.c > +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c > @@ -659,6 +659,20 @@ xfs_dir2_leaf_addname( > bestsp = xfs_dir2_leaf_bests_p(ltp); > length = xfs_dir2_data_entsize(dp->i_mount, args->namelen); > > + /* > + * There should be as many bestfree slots as there are dir data > + * blocks that can fit under i_size. Othrewise, which may cause Typo: "Otherwise..." > + * serious problems eg. UAF or slab-out-of bound etc. What about commit e5d1802c70f5 ("xfs: fix a bug in the online fsck directory leaf1 bestcount check")? Online fsck used to have this check in it, until I discovered that the kernel actually /can/ remove blocks from the end of the data section of a directory and forget to decrease i_disk_size. xfs_repair hasn't ever complained about that, which means it's part of the disk format. > + */ > + if (be32_to_cpu(ltp->bestcount) != > + xfs_dir2_byte_to_db(args->geo, dp->i_d.di_size)) { What is i_d.di_size? We changed that to i_disk_size ages ago... --D > + xfs_buf_ioerror_alert(lbp, __return_address); > + if (tp->t_flags & XFS_TRANS_DIRTY) > + xfs_force_shutdown(tp->t_mountp, > + SHUTDOWN_CORRUPT_INCORE); > + return -EFSCORRUPTED; > + } > + > /* > * See if there are any entries with the same hash value > * and space in their block for the new entry. > -- > 2.25.1 >
Hi Dave & Darrick: To reproduce this problem add xfs/554 [1]. And, I resend a patch v2 [2], considering the situation mentioned by Darrick, Meanwhile, check the i_disk_size or get dir blocks at xfs_dir3_leaf_check_int looks a little strange. So in my opinion, we can just add judgment in that situation, avoiding UAF and return EFSCORRUPTED, xfs_repair will help us fix it. [1] https://lore.kernel.org/all/20220902094046.3891252-1-guoxuenan@huawei.com/ [2] https://lore.kernel.org/all/20220831121639.3060527-1-guoxuenan@huawei.com/ thanks Xueanan
diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c index d9b66306a9a7..09414651ac48 100644 --- a/fs/xfs/libxfs/xfs_dir2_leaf.c +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c @@ -659,6 +659,20 @@ xfs_dir2_leaf_addname( bestsp = xfs_dir2_leaf_bests_p(ltp); length = xfs_dir2_data_entsize(dp->i_mount, args->namelen); + /* + * There should be as many bestfree slots as there are dir data + * blocks that can fit under i_size. Othrewise, which may cause + * serious problems eg. UAF or slab-out-of bound etc. + */ + if (be32_to_cpu(ltp->bestcount) != + xfs_dir2_byte_to_db(args->geo, dp->i_d.di_size)) { + xfs_buf_ioerror_alert(lbp, __return_address); + if (tp->t_flags & XFS_TRANS_DIRTY) + xfs_force_shutdown(tp->t_mountp, + SHUTDOWN_CORRUPT_INCORE); + return -EFSCORRUPTED; + } + /* * See if there are any entries with the same hash value * and space in their block for the new entry.