Message ID | 20170126031114.GS9134@birch.djwong.org (mailing list archive) |
---|---|
State | Superseded, archived |

On 1/25/17 9:11 PM, Darrick J. Wong wrote:
> In a bmapx call, bmv_count is the total size of the array, including the
> zeroth element that userspace uses to supply the search key. The output
> array starts at offset 1 so that we can set up the user for the next
> invocation. Since we now can split an extent into multiple bmap records
> due to shared/unshared status, we have to be careful that we don't
> overflow the output array.
>
> In the original patch f86f403794b ("xfs: teach get_bmapx about shared
> extents and the CoW fork") I used cur_ext (the output index) to check
> for overflows, albeit with an off-by-one error. Since nexleft describes
> the number of unfilled slots in the output, we can rip all that out and
> use nexleft for the check directly.
>
> Failure to do this causes heap corruption in bmapx callers such as
> xfs_io and xfs_scrub. xfs/328 can reproduce this problem.
>
> Suggested-by: Eric Sandeen <sandeen@sandeen.net>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Yup, I think this is better, thanks. Comments around the
whole inject_map business would be nice, but *shrug* doesn't
have to be in this patch.

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

> ---
> v2: simplify the loop accounting to use nexleft for the output checks
> ---
>  fs/xfs/xfs_bmap_util.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index b9abce5..fc6bdaf 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -697,8 +697,7 @@ xfs_getbmap(
>  			goto out_free_map;
>  		ASSERT(nmap <= subnex);
>
> -		for (i = 0; i < nmap && nexleft && bmv->bmv_length &&
> -				cur_ext < bmv->bmv_count; i++) {
> +		for (i = 0; i < nmap && nexleft && bmv->bmv_length; i++) {
>  			out[cur_ext].bmv_oflags = 0;
>  			if (map[i].br_state == XFS_EXT_UNWRITTEN)
>  				out[cur_ext].bmv_oflags |= BMV_OF_PREALLOC;
> @@ -760,16 +759,15 @@ xfs_getbmap(
>  				continue;
>  			}
>
> +			nexleft--;
>  			if (inject_map.br_startblock != NULLFSBLOCK) {
>  				map[i] = inject_map;
>  				i--;
> -			} else
> -				nexleft--;
> +			}
>  			bmv->bmv_entries++;
>  			cur_ext++;
>  		}
> -	} while (nmap && nexleft && bmv->bmv_length &&
> -			cur_ext < bmv->bmv_count);
> +	} while (nmap && nexleft && bmv->bmv_length);
>
>  out_free_map:
>  	kmem_free(map);

On Thu, Jan 26, 2017 at 11:33:03AM -0600, Eric Sandeen wrote:
> On 1/25/17 9:11 PM, Darrick J. Wong wrote:
> > In a bmapx call, bmv_count is the total size of the array, including the
> > zeroth element that userspace uses to supply the search key. The output
> > array starts at offset 1 so that we can set up the user for the next
> > invocation. Since we now can split an extent into multiple bmap records
> > due to shared/unshared status, we have to be careful that we don't
> > overflow the output array.
> >
> > In the original patch f86f403794b ("xfs: teach get_bmapx about shared
> > extents and the CoW fork") I used cur_ext (the output index) to check
> > for overflows, albeit with an off-by-one error. Since nexleft describes
> > the number of unfilled slots in the output, we can rip all that out and
> > use nexleft for the check directly.
> >
> > Failure to do this causes heap corruption in bmapx callers such as
> > xfs_io and xfs_scrub. xfs/328 can reproduce this problem.
> >
> > Suggested-by: Eric Sandeen <sandeen@sandeen.net>
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
>
> Yup, I think this is better, thanks. Comments around the
> whole inject_map business would be nice, but *shrug* doesn't
> have to be in this patch.
>
> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
>
> > ---
> > v2: simplify the loop accounting to use nexleft for the output checks
> > ---
> >  fs/xfs/xfs_bmap_util.c | 10 ++++------
> >  1 file changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> > index b9abce5..fc6bdaf 100644
> > --- a/fs/xfs/xfs_bmap_util.c
> > +++ b/fs/xfs/xfs_bmap_util.c
> > @@ -697,8 +697,7 @@ xfs_getbmap(
> >  			goto out_free_map;
> >  		ASSERT(nmap <= subnex);
> >
> > -		for (i = 0; i < nmap && nexleft && bmv->bmv_length &&
> > -				cur_ext < bmv->bmv_count; i++) {
> > +		for (i = 0; i < nmap && nexleft && bmv->bmv_length; i++) {

NAK.

I forgot that nexleft is min(bmv_count-1, di_nextents), which means that
if we have one partially shared bmbt extent and bmv_count = 1000, we
only return the first part of that bmbt extent to the user. Worse yet,
we also return with bmv_entries < bmv_count-1, which leads xfs_io to
stop calling bmapx prematurely.

That leads to xfs/280 regressing, so I'm going to resubmit the v1 of
this patch, but with improved commenting so that nobody else will miss
this again.

--D

> >  			out[cur_ext].bmv_oflags = 0;
> >  			if (map[i].br_state == XFS_EXT_UNWRITTEN)
> >  				out[cur_ext].bmv_oflags |= BMV_OF_PREALLOC;
> > @@ -760,16 +759,15 @@ xfs_getbmap(
> >  				continue;
> >  			}
> >
> > +			nexleft--;
> >  			if (inject_map.br_startblock != NULLFSBLOCK) {
> >  				map[i] = inject_map;
> >  				i--;
> > -			} else
> > -				nexleft--;
> > +			}
> >  			bmv->bmv_entries++;
> >  			cur_ext++;
> >  		}
> > -	} while (nmap && nexleft && bmv->bmv_length &&
> > -			cur_ext < bmv->bmv_count);
> > +	} while (nmap && nexleft && bmv->bmv_length);
> >
> >  out_free_map:
> >  	kmem_free(map);
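
The truncation described in the NAK can be modelled outside the kernel. The following is only a rough sketch under stated assumptions, not the xfs_getbmap code: `records_emitted`, `pieces`, and `budget_by_extents` are hypothetical names, and each on-disk extent is simply assumed to split into a given number of output records. It compares a budget of min(count - 1, nextents) records against a budget of count - 1 output slots.

```c
#include <stddef.h>

static size_t min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical model, not kernel code.
 *
 * pieces[i]  number of output records that on-disk extent i splits into
 *            (for example because of shared/unshared boundaries)
 * nextents   number of on-disk extents
 * count      size of the caller's array, including the key in slot 0
 *
 * If budget_by_extents is nonzero, the budget is min(count - 1, nextents),
 * mirroring how the thread describes nexleft.  Otherwise the budget is the
 * number of free output slots, count - 1.
 */
static size_t records_emitted(const size_t *pieces, size_t nextents,
			      size_t count, int budget_by_extents)
{
	size_t slots = count ? count - 1 : 0;
	size_t budget = budget_by_extents ? min_size(slots, nextents) : slots;
	size_t emitted = 0;

	for (size_t i = 0; i < nextents; i++) {
		for (size_t p = 0; p < pieces[i]; p++) {
			if (emitted == budget)
				return emitted;	/* budget exhausted */
			emitted++;
		}
	}
	return emitted;
}
```

With a single extent that splits into three records and count = 1000, the extent-based budget stops after one record, while the slot-based budget emits all three. In the first case the number of entries returned is also far below count - 1, which is the condition the NAK says makes xfs_io stop iterating early.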

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index b9abce5..fc6bdaf 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -697,8 +697,7 @@ xfs_getbmap(
 			goto out_free_map;
 		ASSERT(nmap <= subnex);

-		for (i = 0; i < nmap && nexleft && bmv->bmv_length &&
-				cur_ext < bmv->bmv_count; i++) {
+		for (i = 0; i < nmap && nexleft && bmv->bmv_length; i++) {
 			out[cur_ext].bmv_oflags = 0;
 			if (map[i].br_state == XFS_EXT_UNWRITTEN)
 				out[cur_ext].bmv_oflags |= BMV_OF_PREALLOC;
@@ -760,16 +759,15 @@ xfs_getbmap(
 				continue;
 			}

+			nexleft--;
 			if (inject_map.br_startblock != NULLFSBLOCK) {
 				map[i] = inject_map;
 				i--;
-			} else
-				nexleft--;
+			}
 			bmv->bmv_entries++;
 			cur_ext++;
 		}
-	} while (nmap && nexleft && bmv->bmv_length &&
-			cur_ext < bmv->bmv_count);
+	} while (nmap && nexleft && bmv->bmv_length);

 out_free_map:
 	kmem_free(map);

In a bmapx call, bmv_count is the total size of the array, including the
zeroth element that userspace uses to supply the search key. The output
array starts at offset 1 so that we can set up the user for the next
invocation. Since we now can split an extent into multiple bmap records
due to shared/unshared status, we have to be careful that we don't
overflow the output array.

In the original patch f86f403794b ("xfs: teach get_bmapx about shared
extents and the CoW fork") I used cur_ext (the output index) to check
for overflows, albeit with an off-by-one error. Since nexleft describes
the number of unfilled slots in the output, we can rip all that out and
use nexleft for the check directly.

Failure to do this causes heap corruption in bmapx callers such as
xfs_io and xfs_scrub. xfs/328 can reproduce this problem.

Suggested-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: simplify the loop accounting to use nexleft for the output checks
---
 fs/xfs/xfs_bmap_util.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)
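
The array convention in the commit message (slot 0 carries the search key, results start at offset 1, and the total size counts both) is easy to get wrong by one. Below is a minimal sketch of that layout, assuming a simplified record type; `struct rec` and `fill_records` are hypothetical and stand in for neither struct getbmapx nor xfs_getbmap. It only shows the bound that keeps writes inside an array of `count` elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified record; not the kernel's struct getbmapx. */
struct rec {
	long offset;
	long length;
};

/*
 * Copy records from avail[] into buf[1..count-1].  buf[0] is reserved for
 * the caller's search key, so at most count - 1 records fit.  Afterwards
 * buf[0].offset is advanced past the last record written, so the caller
 * can reuse the array for the next call (a simplification of "setting up
 * the user for the next invocation").
 *
 * Returns the number of records written.
 */
static size_t fill_records(struct rec *buf, size_t count,
			   const struct rec *avail, size_t navail)
{
	size_t written = 0;

	assert(buf != NULL && count >= 1);

	/*
	 * The bound is count - 1, not count: comparing against count would
	 * allow one record too many and write to buf[count].
	 */
	while (written < navail && written < count - 1) {
		buf[1 + written] = avail[written];
		written++;
	}

	if (written > 0) {
		const struct rec *last = &buf[written];

		buf[0].offset = last->offset + last->length;
	}
	return written;
}
```

For an array of four elements and five available records, this writes three records into slots 1 through 3 and leaves the resume offset in slot 0. A caller can treat a return value smaller than count - 1 as the end of the mapping only if the filler never stops early for any other reason.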