diff mbox

[10/13] VFS/namei: handle LOOKUP_RCU in page_follow_link_light.

Message ID 20150316044320.23648.90827.stgit@notabene.brown (mailing list archive)
State New, archived
Headers show

Commit Message

NeilBrown March 16, 2015, 4:43 a.m. UTC
If the symlink has already be been read-in, then
page_follow_link_light can succeed in RCU-walk mode.
page_getlink_rcu() is added to support this.

With this many filesystems can follow links in RCU-walk
mode when everything is cached.  This  includes ext?fs and
others.

If the page is a HighMem page we do *not* try to kmap_atomic,
but simply give up - only page_address() is used.
This is because we need to be able to sleep while holding
the address of the page, particularly over calls to do_last()
which can be quite slow and in particular takes a mutex.

If this were a problem, then copying into a GFP_ATOMIC allocation
might be a workable solution.

This selective calling of kmap requires us to know, in page_put_link,
whether or not kunmap() need to be called.  Pass this information in
the lsb of the cookie.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/namei.c |   35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Al Viro March 16, 2015, 10:50 p.m. UTC | #1
On Mon, Mar 16, 2015 at 03:43:20PM +1100, NeilBrown wrote:
> +	char *kaddr;
> +	struct page *page;
> +	struct address_space *mapping = dentry->d_inode->i_mapping;

Who said that dentry->d_inode hasn't gone NULL by that point?

> +	nd_terminate_link(kaddr, dentry->d_inode->i_size, PAGE_SIZE - 1);

... or changed here.  Again, dentry->d_inode is stable only if you are
holding a reference to dentry.  That's why we have those dances around
nd->inode, for example.  Doing unlazy_walk() is enough to stabilize the
damn thing, so currently ->follow_link() doesn't have to worry about it.
With your changes, though...
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
NeilBrown March 19, 2015, 10:38 p.m. UTC | #2
On Mon, 16 Mar 2015 22:50:40 +0000 Al Viro <viro@ZenIV.linux.org.uk> wrote:

> On Mon, Mar 16, 2015 at 03:43:20PM +1100, NeilBrown wrote:
> > +	char *kaddr;
> > +	struct page *page;
> > +	struct address_space *mapping = dentry->d_inode->i_mapping;
> 
> Who said that dentry->d_inode hasn't gone NULL by that point?
> 
> > +	nd_terminate_link(kaddr, dentry->d_inode->i_size, PAGE_SIZE - 1);
> 
> ... or changed here.  Again, dentry->d_inode is stable only if you are
> holding a reference to dentry.  That's why we have those dances around
> nd->inode, for example.  Doing unlazy_walk() is enough to stabilize the
> damn thing, so currently ->follow_link() doesn't have to worry about it.
> With your changes, though...

Ahhh - that's what nd->inode is for.  I wondered.

Am I correct in thinking that dentry->d_inode can only become NULL - it cannot
then become some other inode?

In that case the various follow_link methods that are sufficiently atomic for
rcu-walk just need something like:

 struct inode *inode = dentry->d_inode;

 if (!inode)
     return -ECHILD;

If ->d_inode can become another inode, then I suspect we need to pass the
inode as well as the dentry to ->follow_link.


Thanks,
NeilBrown
Al Viro March 19, 2015, 11:46 p.m. UTC | #3
On Fri, Mar 20, 2015 at 09:38:33AM +1100, NeilBrown wrote:

> Ahhh - that's what nd->inode is for.  I wondered.
> 
> Am I correct in thinking that dentry->d_inode can only become NULL - it cannot
> then become some other inode?

It can - consider somebody doing mkdir on that name right under you.
_All_ we are guaranteed is that at some moment nd->inode matched the
pathname this far and so was (at the same moment) path->dentry.  We
are not promised that these inode and dentry will remain associated
with each other, etc.

We ought to check ->d_seq after checking ->d_flags, BTW.  _That_ will confirm
that inode remained corresponding to that dentry until the time we'd
observed d_is_symlink(dentry), i.e. make sure that inode *is* a symlink one.

And yes, we probably would have to pass dentry and inode separately, more's
the pity.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/namei.c b/fs/namei.c
index c9c58cd1af2a..2602d31ecc99 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4499,6 +4499,28 @@  static char *page_getlink(struct dentry * dentry, struct page **ppage)
 	return kaddr;
 }
 
+/* get the link contents from pagecache under RCU */
+static char *page_getlink_rcu(struct dentry * dentry, struct page **ppage)
+{
+	char *kaddr;
+	struct page *page;
+	struct address_space *mapping = dentry->d_inode->i_mapping;
+	page = find_get_page(mapping, 0);
+	if (page &&
+	    (!PageUptodate(page) || PageHighMem(page))) {
+		put_page(page);
+		page = NULL;
+	}
+	if (!page) {
+		*ppage = ERR_PTR(-ECHILD);
+		return NULL;
+	}
+	*ppage = page;
+	kaddr = page_address(page);
+	nd_terminate_link(kaddr, dentry->d_inode->i_size, PAGE_SIZE - 1);
+	return kaddr;
+}
+
 int page_readlink(struct dentry *dentry, char __user *buffer, int buflen)
 {
 	struct page *page = NULL;
@@ -4514,19 +4536,22 @@  EXPORT_SYMBOL(page_readlink);
 void *page_follow_link_light(struct dentry *dentry, int flags)
 {
 	struct page *page = NULL;
-	if (flags & LOOKUP_RCU)
-		return ERR_PTR(-ECHILD);
-	nd_set_link(page_getlink(dentry, &page));
+	if (flags & LOOKUP_RCU) {
+		nd_set_link(page_getlink_rcu(dentry, &page));
+		page = (void*)((unsigned long)page | 1);
+	} else
+		nd_set_link(page_getlink(dentry, &page));
 	return page;
 }
 EXPORT_SYMBOL(page_follow_link_light);
 
 void page_put_link(struct dentry *dentry, void *cookie)
 {
-	struct page *page = cookie;
+	struct page *page = (void*)((unsigned long)cookie & ~1UL) ;
 
 	if (page) {
-		kunmap(page);
+		if (page == cookie)
+			kunmap(page);
 		page_cache_release(page);
 	}
 }