Message ID | 20241120024306.156920-1-zhenghaoran@buaa.edu.cn (mailing list archive) |
---|---|
State | New |
Series | fs: Fix data race in inode_set_ctime_to_ts |
On Wed 20-11-24 10:43:06, Hao-ran Zheng wrote:
> A data race may occur when the function `inode_set_ctime_to_ts()` and
> the function `inode_get_ctime_sec()` are executed concurrently. When
> two threads call `aio_read` and `aio_write` respectively, they will
> be distributed to the read and write functions of the corresponding
> file system respectively. Taking the btrfs file system as an example,
> the `btrfs_file_read_iter` and `btrfs_file_write_iter` functions are
> finally called. These two functions created a data race when they
> finally called `inode_get_ctime_sec()` and `inode_set_ctime_to_ns()`.
> The specific call stack that appears during testing is as follows:
>
> ```
> ============DATA_RACE============
> btrfs_delayed_update_inode+0x1f61/0x7ce0 [btrfs]
> btrfs_update_inode+0x45e/0xbb0 [btrfs]
> btrfs_dirty_inode+0x2b8/0x530 [btrfs]
> btrfs_update_time+0x1ad/0x230 [btrfs]
> touch_atime+0x211/0x440
> filemap_read+0x90f/0xa20
> btrfs_file_read_iter+0xeb/0x580 [btrfs]
> aio_read+0x275/0x3a0
> io_submit_one+0xd22/0x1ce0
> __se_sys_io_submit+0xb3/0x250
> do_syscall_64+0xc1/0x190
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> ============OTHER_INFO============
> btrfs_write_check+0xa15/0x1390 [btrfs]
> btrfs_buffered_write+0x52f/0x29d0 [btrfs]
> btrfs_do_write_iter+0x53d/0x1590 [btrfs]
> btrfs_file_write_iter+0x41/0x60 [btrfs]
> aio_write+0x41e/0x5f0
> io_submit_one+0xd42/0x1ce0
> __se_sys_io_submit+0xb3/0x250
> do_syscall_64+0xc1/0x190
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> ```
>
> The call chain after traceability is as follows:
>
> ```
> Thread1:
> btrfs_delayed_update_inode() ->
> fill_stack_inode_item() ->
> inode_get_ctime_sec()
>
> Thread2:
> btrfs_write_check() ->
> update_time_for_write() ->
> inode_set_ctime_to_ts()
> ```
>
> To address this issue, it is recommended to
> add WRITE_ONCE when writing the `inode->i_ctime_sec` variable.

Thanks for the patch! This is really, really theoretic but with LTO I suppose the compiler could get inventive and compile this in some other way than plain stores. But WRITE_ONCE() alone is not enough. You should have READ_ONCE() in the reading counterparts as well.

Honza

>
> Signed-off-by: Hao-ran Zheng <zhenghaoran@buaa.edu.cn>
> ---
> include/linux/fs.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 3559446279c1..d11b257a35e1 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1674,8 +1674,8 @@ static inline struct timespec64 inode_get_ctime(const struct inode *inode)
> static inline struct timespec64 inode_set_ctime_to_ts(struct inode *inode,
> struct timespec64 ts)
> {
> - inode->i_ctime_sec = ts.tv_sec;
> - inode->i_ctime_nsec = ts.tv_nsec;
> + WRITE_ONCE(inode->i_ctime_sec, ts.tv_sec);
> + WRITE_ONCE(inode->i_ctime_nsec, ts.tv_nsec);
> return ts;
> }
>
> --
> 2.34.1
>
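To illustrate the review comment about pairing marked writes with marked reads, here is a minimal userspace sketch. It is not the kernel code: the `WRITE_ONCE`/`READ_ONCE` macros below are simplified volatile-cast stand-ins that rely on the GNU `__typeof__` extension, and `struct demo_inode`, `struct demo_timespec`, and every `demo_*` function are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified stand-ins for the kernel macros (assumption: the real kernel
 * definitions are more elaborate). The volatile access asks the compiler to
 * emit the load or store once, as written.
 */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

/* Hypothetical miniature of the two ctime fields touched by the patch. */
struct demo_inode {
	int64_t  i_ctime_sec;
	uint32_t i_ctime_nsec;
};

/* Hypothetical stand-in for struct timespec64. */
struct demo_timespec {
	int64_t tv_sec;
	long    tv_nsec;
};

/* Writer side: both stores are marked, mirroring the proposed change. */
static struct demo_timespec demo_set_ctime_to_ts(struct demo_inode *inode,
						 struct demo_timespec ts)
{
	WRITE_ONCE(inode->i_ctime_sec, ts.tv_sec);
	WRITE_ONCE(inode->i_ctime_nsec, (uint32_t)ts.tv_nsec);
	return ts;
}

/* Reader side: the counterpart loads are marked as well. */
static int64_t demo_get_ctime_sec(const struct demo_inode *inode)
{
	return READ_ONCE(inode->i_ctime_sec);
}

static long demo_get_ctime_nsec(const struct demo_inode *inode)
{
	return (long)READ_ONCE(inode->i_ctime_nsec);
}

/* Single-threaded sanity check: what was stored is what is read back. */
static int demo_roundtrip(void)
{
	struct demo_inode inode = { 0, 0 };
	struct demo_timespec ts = { 1732070586, 156920 };

	demo_set_ctime_to_ts(&inode, ts);
	assert(demo_get_ctime_sec(&inode) == ts.tv_sec);
	assert(demo_get_ctime_nsec(&inode) == ts.tv_nsec);
	return 1;
}
```

Note that marking the accesses only constrains how the compiler emits each individual load and store. It does not make the seconds/nanoseconds pair atomic as a unit, so a concurrent reader in this sketch could still see a new `i_ctime_sec` together with an old `i_ctime_nsec`.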
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3559446279c1..d11b257a35e1 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1674,8 +1674,8 @@ static inline struct timespec64 inode_get_ctime(const struct inode *inode)
 static inline struct timespec64 inode_set_ctime_to_ts(struct inode *inode,
 						      struct timespec64 ts)
 {
-	inode->i_ctime_sec = ts.tv_sec;
-	inode->i_ctime_nsec = ts.tv_nsec;
+	WRITE_ONCE(inode->i_ctime_sec, ts.tv_sec);
+	WRITE_ONCE(inode->i_ctime_nsec, ts.tv_nsec);
 	return ts;
 }
A data race may occur when the function `inode_set_ctime_to_ts()` and
the function `inode_get_ctime_sec()` are executed concurrently. When
two threads call `aio_read` and `aio_write` respectively, the calls are
dispatched to the read and write functions of the underlying file
system. Taking btrfs as an example, `btrfs_file_read_iter` and
`btrfs_file_write_iter` are eventually called, and these two paths race
when they reach `inode_get_ctime_sec()` and `inode_set_ctime_to_ts()`.
The call stacks observed during testing are as follows:

```
============DATA_RACE============
btrfs_delayed_update_inode+0x1f61/0x7ce0 [btrfs]
btrfs_update_inode+0x45e/0xbb0 [btrfs]
btrfs_dirty_inode+0x2b8/0x530 [btrfs]
btrfs_update_time+0x1ad/0x230 [btrfs]
touch_atime+0x211/0x440
filemap_read+0x90f/0xa20
btrfs_file_read_iter+0xeb/0x580 [btrfs]
aio_read+0x275/0x3a0
io_submit_one+0xd22/0x1ce0
__se_sys_io_submit+0xb3/0x250
do_syscall_64+0xc1/0x190
entry_SYSCALL_64_after_hwframe+0x77/0x7f
============OTHER_INFO============
btrfs_write_check+0xa15/0x1390 [btrfs]
btrfs_buffered_write+0x52f/0x29d0 [btrfs]
btrfs_do_write_iter+0x53d/0x1590 [btrfs]
btrfs_file_write_iter+0x41/0x60 [btrfs]
aio_write+0x41e/0x5f0
io_submit_one+0xd42/0x1ce0
__se_sys_io_submit+0xb3/0x250
do_syscall_64+0xc1/0x190
entry_SYSCALL_64_after_hwframe+0x77/0x7f
```

The traced call chains are as follows:

```
Thread1:
btrfs_delayed_update_inode() ->
fill_stack_inode_item() ->
inode_get_ctime_sec()

Thread2:
btrfs_write_check() ->
update_time_for_write() ->
inode_set_ctime_to_ts()
```

To address this issue, it is recommended to add WRITE_ONCE when writing
the `inode->i_ctime_sec` variable.

Signed-off-by: Hao-ran Zheng <zhenghaoran@buaa.edu.cn>
---
 include/linux/fs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)