mbox series

[GIT,PULL] Rename page_offset() to page_pos()

Message ID 20200411203220.GG21484@bombadil.infradead.org (mailing list archive)
State New, archived
Headers show
Series [GIT,PULL] Rename page_offset() to page_pos() | expand

Pull-request

git://git.infradead.org/users/willy/linux-dax.git tags/page_pos

Message

Matthew Wilcox April 11, 2020, 8:32 p.m. UTC
Hi Linus,

We've had some trouble recently with page_offset() being confusingly
named.  To minimise conflicts, I automatically generated these two rename
patches just now so you can merge it right before -rc1.  I included
the coccinelle scripts in the commit messages so you can generate the
patches yourself if you'd rather do that.

The following changes since commit b032227c62939b5481bcd45442b36dfa263f4a7c:

  Merge tag 'nios2-v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 (2020-04-11 11:38:44 -0700)

are available in the Git repository at:

  git://git.infradead.org/users/willy/linux-dax.git tags/page_pos

for you to fetch changes up to d80f6dd912904b261010b14928b8ba89d8dd88bd:

  fs: Rename page_file_offset() to page_file_pos() (2020-04-11 16:17:34 -0400)

----------------------------------------------------------------
Rename page_offset to page_pos

----------------------------------------------------------------
Matthew Wilcox (Oracle) (2):
      fs: Rename page_offset() to page_pos()
      fs: Rename page_file_offset() to page_file_pos()

 fs/9p/vfs_addr.c        |  4 ++--
 fs/afs/dir.c            |  2 +-
 fs/btrfs/compression.c  |  8 ++++----
 fs/btrfs/disk-io.c      |  4 ++--
 fs/btrfs/extent_io.c    | 28 ++++++++++++++--------------
 fs/btrfs/file-item.c    |  4 ++--
 fs/btrfs/inode.c        | 22 +++++++++++-----------
 fs/btrfs/ioctl.c        |  6 +++---
 fs/btrfs/relocation.c   |  2 +-
 fs/buffer.c             |  2 +-
 fs/ceph/addr.c          | 26 +++++++++++++-------------
 fs/cifs/cifssmb.c       |  2 +-
 fs/cifs/file.c          |  8 ++++----
 fs/erofs/zdata.c        |  2 +-
 fs/ext2/dir.c           |  6 +++---
 fs/ext4/inode.c         |  2 +-
 fs/f2fs/file.c          |  2 +-
 fs/fuse/file.c          | 10 +++++-----
 fs/gfs2/file.c          |  4 ++--
 fs/hostfs/hostfs_kern.c |  4 ++--
 fs/iomap/buffered-io.c  | 22 +++++++++++-----------
 fs/iomap/seek.c         |  4 ++--
 fs/isofs/compress.c     |  2 +-
 fs/minix/dir.c          |  6 +++---
 fs/nfs/file.c           |  4 ++--
 fs/nfs/write.c          |  6 +++---
 fs/nilfs2/dir.c         |  4 ++--
 fs/nilfs2/file.c        |  2 +-
 fs/nilfs2/page.c        |  2 +-
 fs/ocfs2/aops.c         |  6 +++---
 fs/ocfs2/mmap.c         |  4 ++--
 fs/orangefs/inode.c     | 48 ++++++++++++++++++++++++------------------------
 fs/romfs/super.c        |  2 +-
 fs/sysv/dir.c           |  6 +++---
 fs/ubifs/file.c         |  2 +-
 fs/ufs/dir.c            |  6 +++---
 fs/vboxsf/file.c        |  4 ++--
 fs/xfs/xfs_aops.c       |  2 +-
 include/linux/pagemap.h |  4 ++--
 mm/page_io.c            |  4 ++--
 40 files changed, 144 insertions(+), 144 deletions(-)

Comments

Linus Torvalds April 11, 2020, 8:57 p.m. UTC | #1
On Sat, Apr 11, 2020 at 1:32 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> We've had some trouble recently with page_offset() being confusingly
> named.

This makes little sense to me.

I don't find "page_pos()" to be in the least more intuitive than
"page_offset()".  Yes, you have some numbers of "offset" vs "pos"
being used for the position in the file, but they aren't _that_
different, and honestly, if you look at things like the man-page for
"lseek()", the byte offset you seek to is called an "offset".

The fact that somebody was confused by the current name is a red
herring - there's nothing to say that they wouldn't have been confused
by "page_pos()", except for the fact that that wasn't the name.

So honestly, i the confusion is that we have "pgoff_t", which is the
offset of the page counted in _pages_, then my reaction is that

 (a) I think the truly confusing name is "pgoff_t" (and any
"page_offset" variable of that type). Calling that "pgindex_t" and
"page_index" would be a real clarification.

 (b) if we really do want to rename page_offset() because of confusion
with the page index "offset", then the logical thing would be to
clarify that it's a byte offset, not the page index.

So "page_pos()" to me sounds not at all more descriptive, and having
two names (for stable kernels, for people with memories, for
historical patches, whatever) only sounds like a source of even more
confusion in the future.

If we'd want a _descriptive_ name, then "byte_offset_of_page()" would
probably be that. That's hard to mis-understand.

Yes that's also more of a mouthful, and it still has the "two
different names for the same thing" issue wrt
stable/old/rebased/whatever patches.

But if there are enough people who find "page_offset()" to be a source
of confusion, then I'd at least prefer to _truly_ remove any
possibility of confusion with that longer name.

I'd like to have a few more people step up and say "I find that name
confusing enough that I think it's worth the confusion of renaming
it".

We've had the "page_offset()" name _forever_, this is the first time I
hear it being a problem (it goes back to 2005, and before that it was
used inside the NFS code).

Of course, we've also had "pgoff_t" forever - that name goes back to 2002.

But unlike "page_offset()", I do think that "pgoff_t" is actually a
truly bad name.

Which is why I'd much rather change "pgoff_t" to "pgindex_t" and
related "page_offset" variables to "page_index" variables.

            Linus
Matthew Wilcox April 11, 2020, 9:48 p.m. UTC | #2
On Sat, Apr 11, 2020 at 01:57:56PM -0700, Linus Torvalds wrote:
> So honestly, i the confusion is that we have "pgoff_t", which is the
> offset of the page counted in _pages_, then my reaction is that
> 
>  (a) I think the truly confusing name is "pgoff_t" (and any
> "page_offset" variable of that type). Calling that "pgindex_t" and
> "page_index" would be a real clarification.

I think you're right.  I have a patch series queued for 5.8 which
renames a lot of 'pgoff_t offset' to 'pgoff_t index'.  I wouldn't mind
at all renaming pgoff_t to pgindex_t.  If you're amenable, pgidx_t would
be shorter.

>  (b) if we really do want to rename page_offset() because of confusion
> with the page index "offset", then the logical thing would be to
> clarify that it's a byte offset, not the page index.

I wasn't entirely forthcoming ... I actually want to introduce a new

#define page_offset(page, x) ((unsigned long)(x) & (page_size(page) - 1))

to simplify handling huge pages.  So I always want to see offset be a
byte count.  offset_in_page() is already taken, and I have no idea what
else to call the function to get the offset of this address within a
particular page.

> If we'd want a _descriptive_ name, then "byte_offset_of_page()" would
> probably be that. That's hard to mis-understand.
> 
> Yes that's also more of a mouthful, and it still has the "two
> different names for the same thing" issue wrt
> stable/old/rebased/whatever patches.

That was one of the options we discussed, along with file_offset_of_page().

> Which is why I'd much rather change "pgoff_t" to "pgindex_t" and
> related "page_offset" variables to "page_index" variables.

There's only about 20 of those out of the 938 pgoff_t users.  But there's
over a hundred called 'pgoff'.  I need to get smarter about using
Coccinelle; I'm sure it can do this.
Linus Torvalds April 11, 2020, 10:02 p.m. UTC | #3
On Sat, Apr 11, 2020 at 2:48 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> I wasn't entirely forthcoming ... I actually want to introduce a new
>
> #define page_offset(page, x) ((unsigned long)(x) & (page_size(page) - 1))

No, no, no.

THAT would be confusing. Re-using a name (even if you renamed it) for
something new and completely different is just bad taste. It would
also be a horrible problem - again - for any stable backport etc.

Just call that "offset_in_page()" and be done with it.

                Linus
Matthew Wilcox April 11, 2020, 10:06 p.m. UTC | #4
On Sat, Apr 11, 2020 at 03:02:40PM -0700, Linus Torvalds wrote:
> On Sat, Apr 11, 2020 at 2:48 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > I wasn't entirely forthcoming ... I actually want to introduce a new
> >
> > #define page_offset(page, x) ((unsigned long)(x) & (page_size(page) - 1))
> 
> No, no, no.
> 
> THAT would be confusing. Re-using a name (even if you renamed it) for
> something new and completely different is just bad taste. It would
> also be a horrible problem - again - for any stable backport etc.
> 
> Just call that "offset_in_page()" and be done with it.

But we _have_ an offset_in_page() and it doesn't take a struct page
argument.  Reusing the name wouldn't be too bad because it would take
two arguments, so nothing would inadvertently apply cleanly.

Anyway, if you give me a decision on pgindex_t vs pgidx_t, I'll work
that up before -rc1 closes.
Linus Torvalds April 11, 2020, 10:09 p.m. UTC | #5
On Sat, Apr 11, 2020 at 3:06 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> But we _have_ an offset_in_page() and it doesn't take a struct page
> argument.

.. it doesn't take a struct page argument because a struct page always
has one compile-time fixed size.

The only reason you seem to want to get the new interface is because
you want to change that fact.

So yes, you'd have to change the _existing_ offset_in_page() to take
that extra "which page" argument.

That's not confusing.

             Linus
Matthew Wilcox April 11, 2020, 11:22 p.m. UTC | #6
On Sat, Apr 11, 2020 at 03:09:35PM -0700, Linus Torvalds wrote:
> On Sat, Apr 11, 2020 at 3:06 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > But we _have_ an offset_in_page() and it doesn't take a struct page
> > argument.
> 
> .. it doesn't take a struct page argument because a struct page always
> has one compile-time fixed size.
> 
> The only reason you seem to want to get the new interface is because
> you want to change that fact.
> 
> So yes, you'd have to change the _existing_ offset_in_page() to take
> that extra "which page" argument.
> 
> That's not confusing.

Unfortunately there isn't always a struct page around.  For example:

int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
                struct list_head *uf, bool downgrade)
{
        unsigned long end;
        struct vm_area_struct *vma, *prev, *last;

        if ((offset_in_page(start)) || start > TASK_SIZE || len > TASK_SIZE-start)
                return -EINVAL;

where we don't care _which_ page, we just want to know the offset relative
to the architecturally defined page size.  In this specific case, it
should probably be changed to is_page_aligned(start).

There are trickier ones like on powerpc:

unsigned long vmalloc_to_phys(void *va)
{
        unsigned long pfn = vmalloc_to_pfn(va);

        BUG_ON(!pfn);
        return __pa(pfn_to_kaddr(pfn)) + offset_in_page(va);
}

where there actually _is_ a struct page, but it will need to be found.
Maybe we can pass in NULL to indicate to use the base page size.  Or
rename all current callers to offset_in_base_page() before adding a
struct page pointer to offset_in_page().  Tedious, but doable.