[v2,06/11] don't put symlink bodies in pagecache into highmem

Message ID: 20160114222544.GB17997@ZenIV.linux.org.uk (mailing list archive)
State: New, archived

Commit Message

Al Viro Jan. 14, 2016, 10:25 p.m. UTC
On Thu, Jan 14, 2016 at 01:40:32PM -0800, Linus Torvalds wrote:
> On Thu, Jan 14, 2016 at 1:02 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Arrrgh.  Try this:
> 
> Yeah, that would do it.
> 
> Al, did you check any other filesystems do this?

There's one more turd like that - shmem should've done inode_nohighmem()
a bit earlier in shmem_symlink().  The rest is OK.

> Also, I'm wondering if we should perhaps revert the "don't use
> highmem". Do we actually have examples of running out of kmaps? Do we
> care?

For one thing, we'll lose RCU ->get_link() for those.  For another... yes,
it was a nasty bug (I missed the possibility that filesystem might seed
the page cache on ->symlink() directly and use a highmem page - mea culpa),
but we can easily catch it at runtime.  We really shouldn't put highmem
pages into address_space without __GFP_HIGHMEM, and catching those in
__add_to_page_cache_locked() isn't costly.

Anyway, mm/shmem.c bit follows.  With that + NFS one we ought to be OK
wrt that class of bogosities.  I'll write the bits for
Documentation/filesystems/porting (basically, "if you preseed the pagecache
at ->symlink() time, don't put highmem pages there; page_symlink() will
take care of that, provided that inode_nohighmem() is called first") and
push the combined patch to #for-linus.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Al Viro Jan. 14, 2016, 11:33 p.m. UTC | #1
On Thu, Jan 14, 2016 at 10:25:44PM +0000, Al Viro wrote:

> Anyway, mm/shmem.c bit follows.  With that + NFS one we ought to be OK
> wrt that class of bogosities.  I'll write the bits for
> Documentation/filesystems/porting (basically, "if you preseed the pagecache
> at ->symlink() time, don't put highmem pages there; page_symlink() will
> take care of that, provided that inode_nohighmem() is called first") and
> push the combined patch to #for-linus.

Done and pushed.  Please, pull from
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git

Shortlog:
Al Viro (1):
      Make sure that highmem pages are not added to symlink page cache

Diffstat:
 Documentation/filesystems/porting | 6 +++++-
 fs/nfs/dir.c                      | 5 ++---
 mm/shmem.c                        | 2 +-
 3 files changed, 8 insertions(+), 5 deletions(-)
Linus Torvalds Jan. 14, 2016, 11:58 p.m. UTC | #2
On Thu, Jan 14, 2016 at 2:25 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> For one thing, we'll lose RCU ->get_link() for those.

Why couldn't we just do that in the RCU walker? kmap should be fine..

That said, as long as you think it's ok now, I guess I don't care.
Having some sanity testing in __add_to_page_cache_locked might be a
good safety net.

             Linus
Al Viro Jan. 15, 2016, 12:05 a.m. UTC | #3
On Thu, Jan 14, 2016 at 03:58:22PM -0800, Linus Torvalds wrote:
> On Thu, Jan 14, 2016 at 2:25 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > For one thing, we'll lose RCU ->get_link() for those.
> 
> Why couldn't we just do that in the RCU walker? kmap should be fine..

In map_new_virtual():
                        __set_current_state(TASK_UNINTERRUPTIBLE);
                        add_wait_queue(pkmap_map_wait, &wait);
                        unlock_kmap();
                        schedule();
                        remove_wait_queue(pkmap_map_wait, &wait);
                        lock_kmap();
IOW, not in RCU mode ;-/

> That said, as long as you think it's ok now, I guess I don't care.
> Having some sanity testing in __add_to_page_cache_locked might be a
> good safety net.

That's better as a separate commit, IMO.  The thing I'm not sure about is
whether we want a BUG() in there - VM_WARN_ON(), perhaps?  OTOH, we do
have VM_BUG_ON_PAGE() in the same place already...
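
The safety net discussed in this thread can be modelled in user space. What follows is a toy sketch, not kernel code: toy_page, toy_mapping, TOY_GFP_HIGHMEM and toy_add_to_page_cache() are made-up stand-ins for struct page, struct address_space, __GFP_HIGHMEM and __add_to_page_cache_locked(). It assumes a warn-and-continue check in the spirit of VM_WARN_ON(); whether the real check should warn or BUG() is exactly the open question above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-ins; none of these are the kernel's real definitions. */
#define TOY_GFP_HIGHMEM 0x02u

struct toy_page {
	bool highmem;              /* models PageHighMem(page) */
};

struct toy_mapping {
	unsigned int gfp_mask;     /* models mapping_gfp_mask(mapping) */
	unsigned long nrpages;     /* pages added so far */
};

/* True when a highmem page is headed for a mapping that forbids highmem. */
static bool toy_highmem_violation(const struct toy_mapping *mapping,
				  const struct toy_page *page)
{
	return page->highmem && !(mapping->gfp_mask & TOY_GFP_HIGHMEM);
}

/*
 * Assumed warn-and-continue behaviour: the page is still added, but the
 * violation is reported. Returns the number of warnings emitted (0 or 1).
 */
static int toy_add_to_page_cache(struct toy_mapping *mapping,
				 const struct toy_page *page)
{
	int warned = 0;

	if (toy_highmem_violation(mapping, page)) {
		fprintf(stderr, "toy: highmem page in lowmem-only mapping\n");
		warned = 1;
	}
	mapping->nrpages++;
	return warned;
}
```

The test itself is a flag check plus a mask test per insertion, which matches the "isn't costly" remark earlier in the thread. A BUG()-style variant would refuse the page instead of counting a warning.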

Patch

diff --git a/mm/shmem.c b/mm/shmem.c
index 5813b7f..642471b 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2469,6 +2469,7 @@  static int shmem_symlink(struct inode *dir, struct dentry *dentry, const char *s
 		inode->i_op = &shmem_short_symlink_operations;
 		inode->i_link = info->symlink;
 	} else {
+		inode_nohighmem(inode);
 		error = shmem_getpage(inode, 0, &page, SGP_WRITE, NULL);
 		if (error) {
 			iput(inode);
@@ -2476,7 +2477,6 @@  static int shmem_symlink(struct inode *dir, struct dentry *dentry, const char *s
 		}
 		inode->i_mapping->a_ops = &shmem_aops;
 		inode->i_op = &shmem_symlink_inode_operations;
-		inode_nohighmem(inode);
 		memcpy(page_address(page), symname, len);
 		SetPageUptodate(page);
 		set_page_dirty(page);
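
Why moving the one line matters can be shown with a small user-space model. This is a sketch under stated assumptions, not the shmem code: every toy_* name is hypothetical, toy_getpage() stands in for shmem_getpage(), and the allocator is assumed to hand out a highmem page whenever the mapping's mask still allows it (the worst case).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for __GFP_HIGHMEM; not the kernel's value. */
#define TOY_GFP_HIGHMEM 0x02u

struct toy_inode {
	unsigned int mapping_gfp;   /* models mapping_gfp_mask(inode->i_mapping) */
};

struct toy_page {
	bool highmem;               /* models PageHighMem(page) */
};

/* Models inode_nohighmem(): forbid highmem for this inode's page cache. */
static void toy_inode_nohighmem(struct toy_inode *inode)
{
	inode->mapping_gfp &= ~TOY_GFP_HIGHMEM;
}

/*
 * Models a page allocation that honours the mapping's gfp mask.
 * Worst case assumed: if highmem is allowed, the page comes from highmem.
 */
static struct toy_page toy_getpage(const struct toy_inode *inode)
{
	struct toy_page page = {
		.highmem = (inode->mapping_gfp & TOY_GFP_HIGHMEM) != 0
	};
	return page;
}

/* Pre-patch order: allocate first, restrict the mask afterwards. */
static struct toy_page toy_symlink_before_patch(struct toy_inode *inode)
{
	struct toy_page page = toy_getpage(inode);

	toy_inode_nohighmem(inode);   /* too late for the page above */
	return page;
}

/* Post-patch order: restrict the mask, then allocate. */
static struct toy_page toy_symlink_after_patch(struct toy_inode *inode)
{
	toy_inode_nohighmem(inode);
	return toy_getpage(inode);
}
```

In the model, both orders leave the inode's mask without the highmem bit, but only the post-patch order guarantees the page already seeded into the cache is a lowmem one. That is the property the memcpy(page_address(page), ...) line relies on, since page_address() is only safe to use directly on a page that has a permanent kernel mapping.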