vfs: check i_count under lock in evict_inodes

Message ID 1468282504-2272-1-git-send-email-david.chen@osnexus.com (mailing list archive)
State New, archived

Commit Message

Chunwei Chen July 12, 2016, 12:15 a.m. UTC
We need to check i_count again with i_lock held, because iput might re-add
i_count when lazytime is on. Without this check, we could end up with
double-free or use-after-free.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
---
 fs/inode.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Al Viro July 12, 2016, 12:46 a.m. UTC | #1
On Mon, Jul 11, 2016 at 05:15:04PM -0700, Chunwei Chen wrote:
> We need to check i_count again with i_lock held, because iput might re-add
> i_count when lazytime is on. Without this check, we could end up with
> double-free or use-after-free.

Details, please.  Ideally - with a reproducer.  Who is calling that iput()
at that point of generic_shutdown_super() (has to be another thread) and
just what will happen if the same iput() is delayed until *after*
evict_inodes(), all the way into ->put_super().  At which point there's
no promise whatsoever that the data structures used by ->evict_inode()
hadn't been already freed...
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chunwei Chen July 12, 2016, 1:31 a.m. UTC | #2
Hi Al,

I'm not sure about the in-tree fs, but in zfsonlinux, it would offload
iput to a thread, so this would happen there. And it would wait for
the thread in put_super(), so that part is not a problem...

Thanks

2016-07-11 17:46 GMT-07:00 Al Viro <viro@zeniv.linux.org.uk>:
> On Mon, Jul 11, 2016 at 05:15:04PM -0700, Chunwei Chen wrote:
>> We need to check i_count again with i_lock held, because iput might re-add
>> i_count when lazytime is on. Without this check, we could end up with
>> double-free or use-after-free.
>
> Details, please.  Ideally - with a reproducer.  Who is calling that iput()
> at that point of generic_shutdown_super() (has to be another thread) and
> just what will happen if the same iput() is delayed until *after*
> evict_inodes(), all the way into ->put_super().  At which point there's
> no promise whatsoever that the data structures used by ->evict_inode()
> hadn't been already freed...
>
Christoph Hellwig July 12, 2016, 1:40 a.m. UTC | #3
On Mon, Jul 11, 2016 at 06:31:57PM -0700, David Chen wrote:
> Hi Al,
> 
> I'm not sure about the in-tree fs, but in zfsonlinux, it would offload
> iput to a thread, so this would happen there. And it would wait for
> the thread in put_super(), so that part is not a problem...

And why exactly is your use of a broken and undistributable out of tree
module our problem?
Al Viro July 12, 2016, 5:26 a.m. UTC | #4
On Mon, Jul 11, 2016 at 06:31:57PM -0700, David Chen wrote:
> Hi Al,
> 
> I'm not sure about the in-tree fs, but in zfsonlinux, it would offload
> iput to a thread, so this would happen there. And it would wait for
> the thread in put_super(), so that part is not a problem...

*shrug*  I hadn't looked (and won't look) at zfs glue, but I'd suggest
trying something along the line of stopping that thread in the beginning
of your ->kill_sb() (having told the sucker to stop offloading, of course)
and only then calling generic_shutdown_super()...
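
The ordering suggested above (stop offloading, stop the thread, and only then tear down) can be modelled in userspace. What follows is a minimal sketch with pthreads, not kernel or ZFS code: struct fs_model, worker_fn, fs_start, fs_defer_put and fs_kill_sb are hypothetical names, and setting torn_down merely stands in for the call to generic_shutdown_super().

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model; no name here comes from ZFS or the kernel. */
struct fs_model {
	pthread_mutex_t lock;
	pthread_cond_t  wake;
	pthread_t       worker;
	bool            stopping;   /* once set, nothing more may be offloaded */
	int             pending;    /* deferred puts queued for the worker */
	bool            torn_down;  /* stand-in for generic_shutdown_super() having run */
	int             late_puts;  /* deferred puts that ran after teardown */
};

static void *worker_fn(void *arg)
{
	struct fs_model *fs = arg;

	pthread_mutex_lock(&fs->lock);
	for (;;) {
		while (fs->pending == 0 && !fs->stopping)
			pthread_cond_wait(&fs->wake, &fs->lock);
		if (fs->pending == 0)
			break;          /* stopping, and the queue is drained */
		fs->pending--;          /* "run" one deferred put */
		if (fs->torn_down)
			fs->late_puts++;
	}
	pthread_mutex_unlock(&fs->lock);
	return NULL;
}

static int fs_start(struct fs_model *fs)
{
	pthread_mutex_init(&fs->lock, NULL);
	pthread_cond_init(&fs->wake, NULL);
	fs->stopping = false;
	fs->pending = 0;
	fs->torn_down = false;
	fs->late_puts = 0;
	return pthread_create(&fs->worker, NULL, worker_fn, fs);
}

/* Returns false once shutdown has begun, so nothing is queued after that. */
static bool fs_defer_put(struct fs_model *fs)
{
	bool queued = false;

	pthread_mutex_lock(&fs->lock);
	if (!fs->stopping) {
		fs->pending++;
		queued = true;
		pthread_cond_signal(&fs->wake);
	}
	pthread_mutex_unlock(&fs->lock);
	return queued;
}

static void fs_kill_sb(struct fs_model *fs)
{
	/* 1. Tell producers to stop offloading and wake the worker. */
	pthread_mutex_lock(&fs->lock);
	fs->stopping = true;
	pthread_cond_signal(&fs->wake);
	pthread_mutex_unlock(&fs->lock);

	/* 2. Wait for the worker to drain its queue and exit. */
	pthread_join(fs->worker, NULL);

	/* 3. Only now do the teardown that evicts everything. */
	pthread_mutex_lock(&fs->lock);
	fs->torn_down = true;
	pthread_mutex_unlock(&fs->lock);
}
```

Because the worker is joined before teardown starts, late_puts stays at zero in this model: no deferred put can run concurrently with, or after, the eviction step.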

Patch

diff --git a/fs/inode.c b/fs/inode.c
index 4ccbc21..10bb020 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -607,7 +607,12 @@  again:
 			continue;
 
 		spin_lock(&inode->i_lock);
-		if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) {
+		/*
+		 * check i_count again with lock, because iput might re-add
+		 * it when lazytime is on.
+		 */
+		if (atomic_read(&inode->i_count) ||
+		    (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE))) {
 			spin_unlock(&inode->i_lock);
 			continue;
 		}
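
The hunk above follows a common pattern: a reference count is tested once without the lock as a fast path, then tested again once the lock is held, because the count may have been raised in between. A minimal userspace sketch of that pattern follows. It assumes hypothetical names (struct obj, OBJ_FREEING, obj_init, obj_try_claim) and uses a pthread mutex in place of i_lock, so it is a model of the idea rather than the kernel code.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct inode; not kernel code. */
#define OBJ_FREEING 0x1u

struct obj {
	pthread_mutex_t lock;   /* plays the role of inode->i_lock */
	atomic_int      count;  /* plays the role of inode->i_count */
	unsigned int    state;  /* plays the role of inode->i_state */
};

static void obj_init(struct obj *o, int count)
{
	pthread_mutex_init(&o->lock, NULL);
	atomic_init(&o->count, count);
	o->state = 0;
}

/* Returns true if the caller now owns teardown of the object. */
static bool obj_try_claim(struct obj *o)
{
	/* Unlocked fast path: skip objects that are visibly in use. */
	if (atomic_load(&o->count) != 0)
		return false;

	pthread_mutex_lock(&o->lock);
	/*
	 * Recheck under the lock: another thread may have raised the
	 * count, or started freeing, after the unlocked test above.
	 */
	if (atomic_load(&o->count) != 0 || (o->state & OBJ_FREEING)) {
		pthread_mutex_unlock(&o->lock);
		return false;
	}
	o->state |= OBJ_FREEING;        /* claim the object for teardown */
	pthread_mutex_unlock(&o->lock);
	return true;
}
```

In this model an object can be claimed at most once, and never while its count is nonzero at the moment the lock is held. The recheck is only sound if every path that raises the count from zero does so while holding the same lock.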