Message ID | 20241207182622.97113-2-idryomov@gmail.com (mailing list archive) |
---|---|
State | New |
Series | ceph: ceph_direct_read_write() fixes |
Reviewed-by: Alex Markuze <amarkuze@redhat.com>

On Sat, Dec 7, 2024 at 8:26 PM Ilya Dryomov <idryomov@gmail.com> wrote:
>
> The bvecs array which is allocated in iter_get_bvecs_alloc() is leaked
> and pages remain pinned if ceph_alloc_sparse_ext_map() fails.
>
> There is no need to delay the allocation of sparse_ext map until after
> the bvecs array is set up, so fix this by moving sparse_ext allocation
> a bit earlier. Also, make a similar adjustment in __ceph_sync_read()
> for consistency (a leak of the same kind in __ceph_sync_read() has been
> addressed differently).
>
> Cc: stable@vger.kernel.org
> Fixes: 03bc06c7b0bd ("ceph: add new mount option to enable sparse reads")
> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
> ---
>  fs/ceph/file.c | 43 ++++++++++++++++++++++---------------------
>  1 file changed, 22 insertions(+), 21 deletions(-)
>
> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> index f9bb9e5493ce..0df2ffc69e92 100644
> --- a/fs/ceph/file.c
> +++ b/fs/ceph/file.c
> @@ -1116,6 +1116,16 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
>  			len = read_off + read_len - off;
>  		more = len < iov_iter_count(to);
>
> +		op = &req->r_ops[0];
> +		if (sparse) {
> +			extent_cnt = __ceph_sparse_read_ext_count(inode, read_len);
> +			ret = ceph_alloc_sparse_ext_map(op, extent_cnt);
> +			if (ret) {
> +				ceph_osdc_put_request(req);
> +				break;
> +			}
> +		}
> +
>  		num_pages = calc_pages_for(read_off, read_len);
>  		page_off = offset_in_page(off);
>  		pages = ceph_alloc_page_vector(num_pages, GFP_KERNEL);
> @@ -1129,16 +1139,6 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
>  						 offset_in_page(read_off),
>  						 false, true);
>
> -		op = &req->r_ops[0];
> -		if (sparse) {
> -			extent_cnt = __ceph_sparse_read_ext_count(inode, read_len);
> -			ret = ceph_alloc_sparse_ext_map(op, extent_cnt);
> -			if (ret) {
> -				ceph_osdc_put_request(req);
> -				break;
> -			}
> -		}
> -
>  		ceph_osdc_start_request(osdc, req);
>  		ret = ceph_osdc_wait_request(osdc, req);
>
> @@ -1557,6 +1557,16 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
>  			break;
>  		}
>
> +		op = &req->r_ops[0];
> +		if (sparse) {
> +			extent_cnt = __ceph_sparse_read_ext_count(inode, size);
> +			ret = ceph_alloc_sparse_ext_map(op, extent_cnt);
> +			if (ret) {
> +				ceph_osdc_put_request(req);
> +				break;
> +			}
> +		}
> +
>  		len = iter_get_bvecs_alloc(iter, size, &bvecs, &num_pages);
>  		if (len < 0) {
>  			ceph_osdc_put_request(req);
> @@ -1566,6 +1576,8 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
>  		if (len != size)
>  			osd_req_op_extent_update(req, 0, len);
>
> +		osd_req_op_extent_osd_data_bvecs(req, 0, bvecs, num_pages, len);
> +
>  		/*
>  		 * To simplify error handling, allow AIO when IO within i_size
>  		 * or IO can be satisfied by single OSD request.
> @@ -1597,17 +1609,6 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
>  			req->r_mtime = mtime;
>  		}
>
> -		osd_req_op_extent_osd_data_bvecs(req, 0, bvecs, num_pages, len);
> -		op = &req->r_ops[0];
> -		if (sparse) {
> -			extent_cnt = __ceph_sparse_read_ext_count(inode, size);
> -			ret = ceph_alloc_sparse_ext_map(op, extent_cnt);
> -			if (ret) {
> -				ceph_osdc_put_request(req);
> -				break;
> -			}
> -		}
> -
>  		if (aio_req) {
>  			aio_req->total_len += len;
>  			aio_req->num_reqs++;
> --
> 2.46.1
>
>
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index f9bb9e5493ce..0df2ffc69e92 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1116,6 +1116,16 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
 			len = read_off + read_len - off;
 		more = len < iov_iter_count(to);
 
+		op = &req->r_ops[0];
+		if (sparse) {
+			extent_cnt = __ceph_sparse_read_ext_count(inode, read_len);
+			ret = ceph_alloc_sparse_ext_map(op, extent_cnt);
+			if (ret) {
+				ceph_osdc_put_request(req);
+				break;
+			}
+		}
+
 		num_pages = calc_pages_for(read_off, read_len);
 		page_off = offset_in_page(off);
 		pages = ceph_alloc_page_vector(num_pages, GFP_KERNEL);
@@ -1129,16 +1139,6 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
 						 offset_in_page(read_off),
 						 false, true);
 
-		op = &req->r_ops[0];
-		if (sparse) {
-			extent_cnt = __ceph_sparse_read_ext_count(inode, read_len);
-			ret = ceph_alloc_sparse_ext_map(op, extent_cnt);
-			if (ret) {
-				ceph_osdc_put_request(req);
-				break;
-			}
-		}
-
 		ceph_osdc_start_request(osdc, req);
 		ret = ceph_osdc_wait_request(osdc, req);
 
@@ -1557,6 +1557,16 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
 			break;
 		}
 
+		op = &req->r_ops[0];
+		if (sparse) {
+			extent_cnt = __ceph_sparse_read_ext_count(inode, size);
+			ret = ceph_alloc_sparse_ext_map(op, extent_cnt);
+			if (ret) {
+				ceph_osdc_put_request(req);
+				break;
+			}
+		}
+
 		len = iter_get_bvecs_alloc(iter, size, &bvecs, &num_pages);
 		if (len < 0) {
 			ceph_osdc_put_request(req);
@@ -1566,6 +1576,8 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
 		if (len != size)
 			osd_req_op_extent_update(req, 0, len);
 
+		osd_req_op_extent_osd_data_bvecs(req, 0, bvecs, num_pages, len);
+
 		/*
 		 * To simplify error handling, allow AIO when IO within i_size
 		 * or IO can be satisfied by single OSD request.
@@ -1597,17 +1609,6 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
 			req->r_mtime = mtime;
 		}
 
-		osd_req_op_extent_osd_data_bvecs(req, 0, bvecs, num_pages, len);
-		op = &req->r_ops[0];
-		if (sparse) {
-			extent_cnt = __ceph_sparse_read_ext_count(inode, size);
-			ret = ceph_alloc_sparse_ext_map(op, extent_cnt);
-			if (ret) {
-				ceph_osdc_put_request(req);
-				break;
-			}
-		}
-
 		if (aio_req) {
 			aio_req->total_len += len;
 			aio_req->num_reqs++;
The bvecs array which is allocated in iter_get_bvecs_alloc() is leaked
and pages remain pinned if ceph_alloc_sparse_ext_map() fails.

There is no need to delay the allocation of sparse_ext map until after
the bvecs array is set up, so fix this by moving sparse_ext allocation
a bit earlier. Also, make a similar adjustment in __ceph_sync_read()
for consistency (a leak of the same kind in __ceph_sync_read() has been
addressed differently).

Cc: stable@vger.kernel.org
Fixes: 03bc06c7b0bd ("ceph: add new mount option to enable sparse reads")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
---
 fs/ceph/file.c | 43 ++++++++++++++++++++++---------------------
 1 file changed, 22 insertions(+), 21 deletions(-)