mbox series

[0/3] dedupe smb unicode files

Message ID 20230628011439.159678-1-linux@treblig.org (mailing list archive)
Headers show
Series dedupe smb unicode files | expand

Message

Dr. David Alan Gilbert June 28, 2023, 1:14 a.m. UTC
From: "Dr. David Alan Gilbert" <linux@treblig.org>

The smb client and server code have (mostly) duplicated code
for unicode manipulation, in particular upper case handling.

Flatten this lot into shared code.

There's some code that's slightly different between the two, and
I've not attempted to share that - this should be strictly a no
behaviour change set.

I'd love to also boil out the same code from fs/jfs/ - but that's
a thought for another time (and harder since there's no good test
for it).

Lightly tested with a module and a monolithic build, and just mounting
itself.

This dupe was found using PMD:
  https://pmd.github.io/pmd/pmd_userdocs_cpd.html

Dave

Dr. David Alan Gilbert (3):
  fs/smb: Remove unicode 'lower' tables
  fs/smb: Swing unicode common code from server->common
  fs/smb/client: Use common code in client

 fs/smb/client/cifs_unicode.c                  |   1 -
 fs/smb/client/cifs_unicode.h                  | 313 +-----------------
 fs/smb/client/cifs_uniupr.h                   | 239 -------------
 fs/smb/common/Makefile                        |   1 +
 .../uniupr.h => common/cifs_unicode_common.c} | 156 +--------
 fs/smb/common/cifs_unicode_common.h           | 279 ++++++++++++++++
 fs/smb/server/unicode.c                       |   1 -
 fs/smb/server/unicode.h                       | 301 +----------------
 8 files changed, 298 insertions(+), 993 deletions(-)
 delete mode 100644 fs/smb/client/cifs_uniupr.h
 rename fs/smb/{server/uniupr.h => common/cifs_unicode_common.c} (50%)
 create mode 100644 fs/smb/common/cifs_unicode_common.h

Comments

Tom Talpey June 28, 2023, 1:15 p.m. UTC | #1
On 6/27/2023 9:14 PM, linux@treblig.org wrote:
> From: "Dr. David Alan Gilbert" <linux@treblig.org>
> 
> The smb client and server code have (mostly) duplicated code
> for unicode manipulation, in particular upper case handling.
> 
> Flatten this lot into shared code.
> 
> There's some code that's slightly different between the two, and
> I've not attempted to share that - this should be strictly a no
> behaviour change set.
> 
> I'd love to also boil out the same code from fs/jfs/ - but that's
> a thought for another time (and harder since there's no good test
> for it).
> 
> Lightly tested with a module and a monolithic build, and just mounting
> itself.
> 
> This dupe was found using PMD:
>    https://pmd.github.io/pmd/pmd_userdocs_cpd.html
> 
> Dave
> 
> Dr. David Alan Gilbert (3):
>    fs/smb: Remove unicode 'lower' tables
>    fs/smb: Swing unicode common code from server->common
>    fs/smb/client: Use common code in client
> 
>   fs/smb/client/cifs_unicode.c                  |   1 -
>   fs/smb/client/cifs_unicode.h                  | 313 +-----------------
>   fs/smb/client/cifs_uniupr.h                   | 239 -------------
>   fs/smb/common/Makefile                        |   1 +
>   .../uniupr.h => common/cifs_unicode_common.c} | 156 +--------
>   fs/smb/common/cifs_unicode_common.h           | 279 ++++++++++++++++

So far so good, but please drop the "cifs_" prefix from this new file's
name, since its contents apply to later smb dialects as well.

Tom.

>   fs/smb/server/unicode.c                       |   1 -
>   fs/smb/server/unicode.h                       | 301 +----------------
>   8 files changed, 298 insertions(+), 993 deletions(-)
>   delete mode 100644 fs/smb/client/cifs_uniupr.h
>   rename fs/smb/{server/uniupr.h => common/cifs_unicode_common.c} (50%)
>   create mode 100644 fs/smb/common/cifs_unicode_common.h
>
Dr. David Alan Gilbert June 28, 2023, 1:41 p.m. UTC | #2
* Tom Talpey (tom@talpey.com) wrote:
> On 6/27/2023 9:14 PM, linux@treblig.org wrote:
> > From: "Dr. David Alan Gilbert" <linux@treblig.org>
> > 
> > The smb client and server code have (mostly) duplicated code
> > for unicode manipulation, in particular upper case handling.
> > 
> > Flatten this lot into shared code.
> > 
> > There's some code that's slightly different between the two, and
> > I've not attempted to share that - this should be strictly a no
> > behaviour change set.
> > 
> > I'd love to also boil out the same code from fs/jfs/ - but that's
> > a thought for another time (and harder since there's no good test
> > for it).
> > 
> > Lightly tested with a module and a monolithic build, and just mounting
> > itself.
> > 
> > This dupe was found using PMD:
> >    https://pmd.github.io/pmd/pmd_userdocs_cpd.html
> > 
> > Dave
> > 
> > Dr. David Alan Gilbert (3):
> >    fs/smb: Remove unicode 'lower' tables
> >    fs/smb: Swing unicode common code from server->common
> >    fs/smb/client: Use common code in client
> > 
> >   fs/smb/client/cifs_unicode.c                  |   1 -
> >   fs/smb/client/cifs_unicode.h                  | 313 +-----------------
> >   fs/smb/client/cifs_uniupr.h                   | 239 -------------
> >   fs/smb/common/Makefile                        |   1 +
> >   .../uniupr.h => common/cifs_unicode_common.c} | 156 +--------
> >   fs/smb/common/cifs_unicode_common.h           | 279 ++++++++++++++++
> 
> So far so good, but please drop the "cifs_" prefix from this new file's
> name, since its contents apply to later smb dialects as well.

Sure.

Dave

> Tom.
> 
> >   fs/smb/server/unicode.c                       |   1 -
> >   fs/smb/server/unicode.h                       | 301 +----------------
> >   8 files changed, 298 insertions(+), 993 deletions(-)
> >   delete mode 100644 fs/smb/client/cifs_uniupr.h
> >   rename fs/smb/{server/uniupr.h => common/cifs_unicode_common.c} (50%)
> >   create mode 100644 fs/smb/common/cifs_unicode_common.h
> >
Dr. David Alan Gilbert June 28, 2023, 1:46 p.m. UTC | #3
* Dr. David Alan Gilbert (dave@treblig.org) wrote:
> * Tom Talpey (tom@talpey.com) wrote:
> > On 6/27/2023 9:14 PM, linux@treblig.org wrote:
> > > From: "Dr. David Alan Gilbert" <linux@treblig.org>
> > > 
> > > The smb client and server code have (mostly) duplicated code
> > > for unicode manipulation, in particular upper case handling.
> > > 
> > > Flatten this lot into shared code.
> > > 
> > > There's some code that's slightly different between the two, and
> > > I've not attempted to share that - this should be strictly a no
> > > behaviour change set.
> > > 
> > > I'd love to also boil out the same code from fs/jfs/ - but that's
> > > a thought for another time (and harder since there's no good test
> > > for it).
> > > 
> > > Lightly tested with a module and a monolithic build, and just mounting
> > > itself.
> > > 
> > > This dupe was found using PMD:
> > >    https://pmd.github.io/pmd/pmd_userdocs_cpd.html
> > > 
> > > Dave
> > > 
> > > Dr. David Alan Gilbert (3):
> > >    fs/smb: Remove unicode 'lower' tables
> > >    fs/smb: Swing unicode common code from server->common
> > >    fs/smb/client: Use common code in client
> > > 
> > >   fs/smb/client/cifs_unicode.c                  |   1 -
> > >   fs/smb/client/cifs_unicode.h                  | 313 +-----------------
> > >   fs/smb/client/cifs_uniupr.h                   | 239 -------------
> > >   fs/smb/common/Makefile                        |   1 +
> > >   .../uniupr.h => common/cifs_unicode_common.c} | 156 +--------
> > >   fs/smb/common/cifs_unicode_common.h           | 279 ++++++++++++++++
> > 
> > So far so good, but please drop the "cifs_" prefix from this new file's
> > name, since its contents apply to later smb dialects as well.
> 
> Sure.

Actually, would you be ok with smb_unicode_common ?  The reason is that
you end up with a module named unicode_common  that sounds too generic.

Dave

> Dave
> 
> > Tom.
> > 
> > >   fs/smb/server/unicode.c                       |   1 -
> > >   fs/smb/server/unicode.h                       | 301 +----------------
> > >   8 files changed, 298 insertions(+), 993 deletions(-)
> > >   delete mode 100644 fs/smb/client/cifs_uniupr.h
> > >   rename fs/smb/{server/uniupr.h => common/cifs_unicode_common.c} (50%)
> > >   create mode 100644 fs/smb/common/cifs_unicode_common.h
> > > 
> -- 
>  -----Open up your eyes, open up your mind, open up your code -------   
> / Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
> \        dave @ treblig.org |                               | In Hex /
>  \ _________________________|_____ http://www.treblig.org   |_______/
Dave Kleikamp June 28, 2023, 2:02 p.m. UTC | #4
On 6/28/23 8:46AM, Dr. David Alan Gilbert wrote:
> * Dr. David Alan Gilbert (dave@treblig.org) wrote:
>> * Tom Talpey (tom@talpey.com) wrote:
>>> On 6/27/2023 9:14 PM, linux@treblig.org wrote:
>>>> From: "Dr. David Alan Gilbert" <linux@treblig.org>
>>>>
>>>> The smb client and server code have (mostly) duplicated code
>>>> for unicode manipulation, in particular upper case handling.
>>>>
>>>> Flatten this lot into shared code.
>>>>
>>>> There's some code that's slightly different between the two, and
>>>> I've not attempted to share that - this should be strictly a no
>>>> behaviour change set.
>>>>
>>>> I'd love to also boil out the same code from fs/jfs/ - but that's
>>>> a thought for another time (and harder since there's no good test
>>>> for it).
>>>>
>>>> Lightly tested with a module and a monolithic build, and just mounting
>>>> itself.
>>>>
>>>> This dupe was found using PMD:
>>>>     https://pmd.github.io/pmd/pmd_userdocs_cpd.html
>>>>
>>>> Dave
>>>>
>>>> Dr. David Alan Gilbert (3):
>>>>     fs/smb: Remove unicode 'lower' tables
>>>>     fs/smb: Swing unicode common code from server->common
>>>>     fs/smb/client: Use common code in client
>>>>
>>>>    fs/smb/client/cifs_unicode.c                  |   1 -
>>>>    fs/smb/client/cifs_unicode.h                  | 313 +-----------------
>>>>    fs/smb/client/cifs_uniupr.h                   | 239 -------------
>>>>    fs/smb/common/Makefile                        |   1 +
>>>>    .../uniupr.h => common/cifs_unicode_common.c} | 156 +--------
>>>>    fs/smb/common/cifs_unicode_common.h           | 279 ++++++++++++++++
>>>
>>> So far so good, but please drop the "cifs_" prefix from this new file's
>>> name, since its contents apply to later smb dialects as well.
>>
>> Sure.
> 
> Actually, would you be ok with smb_unicode_common ?  The reason is that
> you end up with a module named unicode_common  that sounds too generic.

I'd suggest make it generic and move it to fs/nls/. I'd run it by the 
nls maintainers, but I don't think there are any.

Shaggy

> 
> Dave
> 
>> Dave
>>
>>> Tom.
>>>
>>>>    fs/smb/server/unicode.c                       |   1 -
>>>>    fs/smb/server/unicode.h                       | 301 +----------------
>>>>    8 files changed, 298 insertions(+), 993 deletions(-)
>>>>    delete mode 100644 fs/smb/client/cifs_uniupr.h
>>>>    rename fs/smb/{server/uniupr.h => common/cifs_unicode_common.c} (50%)
>>>>    create mode 100644 fs/smb/common/cifs_unicode_common.h
>>>>
>> -- 
>>   -----Open up your eyes, open up your mind, open up your code -------
>> / Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \
>> \        dave @ treblig.org |                               | In Hex /
>>   \ _________________________|_____ http://www.treblig.org   |_______/
Steve French June 28, 2023, 2:02 p.m. UTC | #5
On Wed, Jun 28, 2023 at 8:56 AM Dr. David Alan Gilbert
<linux@treblig.org> wrote:
>
> * Dr. David Alan Gilbert (dave@treblig.org) wrote:
> > * Tom Talpey (tom@talpey.com) wrote:
> > > On 6/27/2023 9:14 PM, linux@treblig.org wrote:
> > > > From: "Dr. David Alan Gilbert" <linux@treblig.org>
> > > >
> > > > The smb client and server code have (mostly) duplicated code
> > > > for unicode manipulation, in particular upper case handling.
> > > >
> > > > Flatten this lot into shared code.
> > > >
> > > > There's some code that's slightly different between the two, and
> > > > I've not attempted to share that - this should be strictly a no
> > > > behaviour change set.
> > > >
> > > > I'd love to also boil out the same code from fs/jfs/ - but that's
> > > > a thought for another time (and harder since there's no good test
> > > > for it).
> > > >
> > > > Lightly tested with a module and a monolithic build, and just mounting
> > > > itself.
> > > >
> > > > This dupe was found using PMD:
> > > >    https://pmd.github.io/pmd/pmd_userdocs_cpd.html
> > > >
> > > > Dave
> > > >
> > > > Dr. David Alan Gilbert (3):
> > > >    fs/smb: Remove unicode 'lower' tables
> > > >    fs/smb: Swing unicode common code from server->common
> > > >    fs/smb/client: Use common code in client
> > > >
> > > >   fs/smb/client/cifs_unicode.c                  |   1 -
> > > >   fs/smb/client/cifs_unicode.h                  | 313 +-----------------
> > > >   fs/smb/client/cifs_uniupr.h                   | 239 -------------
> > > >   fs/smb/common/Makefile                        |   1 +
> > > >   .../uniupr.h => common/cifs_unicode_common.c} | 156 +--------
> > > >   fs/smb/common/cifs_unicode_common.h           | 279 ++++++++++++++++
> > >
> > > So far so good, but please drop the "cifs_" prefix from this new file's
> > > name, since its contents apply to later smb dialects as well.
> >
> > Sure.
>
> Actually, would you be ok with smb_unicode_common ?  The reason is that
> you end up with a module named unicode_common  that sounds too generic.

Since it is already in the smb/common directory, seems easier to name them:
smb/common/unicode.c
or smb/common/smb_unicode.c
Tom Talpey June 28, 2023, 2:11 p.m. UTC | #6
On 6/28/2023 10:02 AM, Dave Kleikamp wrote:
> On 6/28/23 8:46AM, Dr. David Alan Gilbert wrote:
>> * Dr. David Alan Gilbert (dave@treblig.org) wrote:
>>> * Tom Talpey (tom@talpey.com) wrote:
>>>> On 6/27/2023 9:14 PM, linux@treblig.org wrote:
>>>>> From: "Dr. David Alan Gilbert" <linux@treblig.org>
>>>>>
>>>>> The smb client and server code have (mostly) duplicated code
>>>>> for unicode manipulation, in particular upper case handling.
>>>>>
>>>>> Flatten this lot into shared code.
>>>>>
>>>>> There's some code that's slightly different between the two, and
>>>>> I've not attempted to share that - this should be strictly a no
>>>>> behaviour change set.
>>>>>
>>>>> I'd love to also boil out the same code from fs/jfs/ - but that's
>>>>> a thought for another time (and harder since there's no good test
>>>>> for it).
>>>>>
>>>>> Lightly tested with a module and a monolithic build, and just mounting
>>>>> itself.
>>>>>
>>>>> This dupe was found using PMD:
>>>>>     https://pmd.github.io/pmd/pmd_userdocs_cpd.html
>>>>>
>>>>> Dave
>>>>>
>>>>> Dr. David Alan Gilbert (3):
>>>>>     fs/smb: Remove unicode 'lower' tables
>>>>>     fs/smb: Swing unicode common code from server->common
>>>>>     fs/smb/client: Use common code in client
>>>>>
>>>>>    fs/smb/client/cifs_unicode.c                  |   1 -
>>>>>    fs/smb/client/cifs_unicode.h                  | 313 
>>>>> +-----------------
>>>>>    fs/smb/client/cifs_uniupr.h                   | 239 -------------
>>>>>    fs/smb/common/Makefile                        |   1 +
>>>>>    .../uniupr.h => common/cifs_unicode_common.c} | 156 +--------
>>>>>    fs/smb/common/cifs_unicode_common.h           | 279 
>>>>> ++++++++++++++++
>>>>
>>>> So far so good, but please drop the "cifs_" prefix from this new file's
>>>> name, since its contents apply to later smb dialects as well.
>>>
>>> Sure.
>>
>> Actually, would you be ok with smb_unicode_common ?  The reason is that
>> you end up with a module named unicode_common  that sounds too generic.
> 
> I'd suggest make it generic and move it to fs/nls/. I'd run it by the 
> nls maintainers, but I don't think there are any.

I agree that would be best. If it stays in smb/common, with or
without extra filename decoration, it will still need to move
someday. But I have no strong preference on prefix apart from
not constraining it to a single protocol dialect.

Tom.

> Shaggy
> 
>>
>> Dave
>>
>>> Dave
>>>
>>>> Tom.
>>>>
>>>>>    fs/smb/server/unicode.c                       |   1 -
>>>>>    fs/smb/server/unicode.h                       | 301 
>>>>> +----------------
>>>>>    8 files changed, 298 insertions(+), 993 deletions(-)
>>>>>    delete mode 100644 fs/smb/client/cifs_uniupr.h
>>>>>    rename fs/smb/{server/uniupr.h => common/cifs_unicode_common.c} 
>>>>> (50%)
>>>>>    create mode 100644 fs/smb/common/cifs_unicode_common.h
>>>>>
>>> -- 
>>>   -----Open up your eyes, open up your mind, open up your code -------
>>> / Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \
>>> \        dave @ treblig.org |                               | In Hex /
>>>   \ _________________________|_____ http://www.treblig.org   |_______/
>
Dave Kleikamp June 28, 2023, 2:17 p.m. UTC | #7
On 6/28/23 8:46AM, Dr. David Alan Gilbert wrote:
> * Dr. David Alan Gilbert (dave@treblig.org) wrote:
>> * Tom Talpey (tom@talpey.com) wrote:
>>> On 6/27/2023 9:14 PM, linux@treblig.org wrote:
>>>> From: "Dr. David Alan Gilbert" <linux@treblig.org>
>>>>
>>>> The smb client and server code have (mostly) duplicated code
>>>> for unicode manipulation, in particular upper case handling.
>>>>
>>>> Flatten this lot into shared code.
>>>>
>>>> There's some code that's slightly different between the two, and
>>>> I've not attempted to share that - this should be strictly a no
>>>> behaviour change set.
>>>>
>>>> I'd love to also boil out the same code from fs/jfs/ - but that's
>>>> a thought for another time (and harder since there's no good test
>>>> for it).
>>>>
>>>> Lightly tested with a module and a monolithic build, and just mounting
>>>> itself.
>>>>
>>>> This dupe was found using PMD:
>>>>     https://pmd.github.io/pmd/pmd_userdocs_cpd.html
>>>>
>>>> Dave
>>>>
>>>> Dr. David Alan Gilbert (3):
>>>>     fs/smb: Remove unicode 'lower' tables
>>>>     fs/smb: Swing unicode common code from server->common
>>>>     fs/smb/client: Use common code in client
>>>>
>>>>    fs/smb/client/cifs_unicode.c                  |   1 -
>>>>    fs/smb/client/cifs_unicode.h                  | 313 +-----------------
>>>>    fs/smb/client/cifs_uniupr.h                   | 239 -------------
>>>>    fs/smb/common/Makefile                        |   1 +
>>>>    .../uniupr.h => common/cifs_unicode_common.c} | 156 +--------
>>>>    fs/smb/common/cifs_unicode_common.h           | 279 ++++++++++++++++
>>>
>>> So far so good, but please drop the "cifs_" prefix from this new file's
>>> name, since its contents apply to later smb dialects as well.
>>
>> Sure.
> 
> Actually, would you be ok with smb_unicode_common ?  The reason is that
> you end up with a module named unicode_common  that sounds too generic.

A bit off topic, but a question for Steve.

Is there a need for separate modules under fs/smb/common/? Or could the 
makefile do something like:

obj-$(CONFIG_SMBFS) += smb_common.o

smb_common-y := cifs.arc4.o cifs_md4.o smb_unicode.o

Shaggy

> 
> Dave
> 
>> Dave
>>
>>> Tom.
>>>
>>>>    fs/smb/server/unicode.c                       |   1 -
>>>>    fs/smb/server/unicode.h                       | 301 +----------------
>>>>    8 files changed, 298 insertions(+), 993 deletions(-)
>>>>    delete mode 100644 fs/smb/client/cifs_uniupr.h
>>>>    rename fs/smb/{server/uniupr.h => common/cifs_unicode_common.c} (50%)
>>>>    create mode 100644 fs/smb/common/cifs_unicode_common.h
>>>>
>> -- 
>>   -----Open up your eyes, open up your mind, open up your code -------
>> / Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \
>> \        dave @ treblig.org |                               | In Hex /
>>   \ _________________________|_____ http://www.treblig.org   |_______/
Dr. David Alan Gilbert June 28, 2023, 2:39 p.m. UTC | #8
* Dave Kleikamp (dave.kleikamp@oracle.com) wrote:
> On 6/28/23 8:46AM, Dr. David Alan Gilbert wrote:
> > * Dr. David Alan Gilbert (dave@treblig.org) wrote:
> > > * Tom Talpey (tom@talpey.com) wrote:
> > > > On 6/27/2023 9:14 PM, linux@treblig.org wrote:
> > > > > From: "Dr. David Alan Gilbert" <linux@treblig.org>
> > > > > 
> > > > > The smb client and server code have (mostly) duplicated code
> > > > > for unicode manipulation, in particular upper case handling.
> > > > > 
> > > > > Flatten this lot into shared code.
> > > > > 
> > > > > There's some code that's slightly different between the two, and
> > > > > I've not attempted to share that - this should be strictly a no
> > > > > behaviour change set.
> > > > > 
> > > > > I'd love to also boil out the same code from fs/jfs/ - but that's
> > > > > a thought for another time (and harder since there's no good test
> > > > > for it).
> > > > > 
> > > > > Lightly tested with a module and a monolithic build, and just mounting
> > > > > itself.
> > > > > 
> > > > > This dupe was found using PMD:
> > > > >     https://pmd.github.io/pmd/pmd_userdocs_cpd.html
> > > > > 
> > > > > Dave
> > > > > 
> > > > > Dr. David Alan Gilbert (3):
> > > > >     fs/smb: Remove unicode 'lower' tables
> > > > >     fs/smb: Swing unicode common code from server->common
> > > > >     fs/smb/client: Use common code in client
> > > > > 
> > > > >    fs/smb/client/cifs_unicode.c                  |   1 -
> > > > >    fs/smb/client/cifs_unicode.h                  | 313 +-----------------
> > > > >    fs/smb/client/cifs_uniupr.h                   | 239 -------------
> > > > >    fs/smb/common/Makefile                        |   1 +
> > > > >    .../uniupr.h => common/cifs_unicode_common.c} | 156 +--------
> > > > >    fs/smb/common/cifs_unicode_common.h           | 279 ++++++++++++++++
> > > > 
> > > > So far so good, but please drop the "cifs_" prefix from this new file's
> > > > name, since its contents apply to later smb dialects as well.
> > > 
> > > Sure.
> > 
> > Actually, would you be ok with smb_unicode_common ?  The reason is that
> > you end up with a module named unicode_common  that sounds too generic.
> 
> I'd suggest make it generic and move it to fs/nls/. I'd run it by the nls
> maintainers, but I don't think there are any.

Steve & Tom - would you be OK with that?

(Copying in Gabriel Bertazi, owner of fs/unicode; although this isn't
utf-8)

Dave

> Shaggy
> 
> > 
> > Dave
> > 
> > > Dave
> > > 
> > > > Tom.
> > > > 
> > > > >    fs/smb/server/unicode.c                       |   1 -
> > > > >    fs/smb/server/unicode.h                       | 301 +----------------
> > > > >    8 files changed, 298 insertions(+), 993 deletions(-)
> > > > >    delete mode 100644 fs/smb/client/cifs_uniupr.h
> > > > >    rename fs/smb/{server/uniupr.h => common/cifs_unicode_common.c} (50%)
> > > > >    create mode 100644 fs/smb/common/cifs_unicode_common.h
> > > > > 
> > > -- 
> > >   -----Open up your eyes, open up your mind, open up your code -------
> > > / Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \
> > > \        dave @ treblig.org |                               | In Hex /
> > >   \ _________________________|_____ http://www.treblig.org   |_______/
Steve French June 28, 2023, 2:47 p.m. UTC | #9
On Wed, Jun 28, 2023 at 9:24 AM Dave Kleikamp <dave.kleikamp@oracle.com> wrote:
>
> On 6/28/23 8:46AM, Dr. David Alan Gilbert wrote:
> > * Dr. David Alan Gilbert (dave@treblig.org) wrote:
> >> * Tom Talpey (tom@talpey.com) wrote:
> >>> On 6/27/2023 9:14 PM, linux@treblig.org wrote:
> >>>> From: "Dr. David Alan Gilbert" <linux@treblig.org>
> >>>>
> >>>> The smb client and server code have (mostly) duplicated code
> >>>> for unicode manipulation, in particular upper case handling.
> >>>>
> >>>> Flatten this lot into shared code.
> >>>>
> >>>> There's some code that's slightly different between the two, and
> >>>> I've not attempted to share that - this should be strictly a no
> >>>> behaviour change set.
> >>>>
> >>>> I'd love to also boil out the same code from fs/jfs/ - but that's
> >>>> a thought for another time (and harder since there's no good test
> >>>> for it).
> >>>>
> >>>> Lightly tested with a module and a monolithic build, and just mounting
> >>>> itself.
> >>>>
> >>>> This dupe was found using PMD:
> >>>>     https://pmd.github.io/pmd/pmd_userdocs_cpd.html
> >>>>
> >>>> Dave
> >>>>
> >>>> Dr. David Alan Gilbert (3):
> >>>>     fs/smb: Remove unicode 'lower' tables
> >>>>     fs/smb: Swing unicode common code from server->common
> >>>>     fs/smb/client: Use common code in client
> >>>>
> >>>>    fs/smb/client/cifs_unicode.c                  |   1 -
> >>>>    fs/smb/client/cifs_unicode.h                  | 313 +-----------------
> >>>>    fs/smb/client/cifs_uniupr.h                   | 239 -------------
> >>>>    fs/smb/common/Makefile                        |   1 +
> >>>>    .../uniupr.h => common/cifs_unicode_common.c} | 156 +--------
> >>>>    fs/smb/common/cifs_unicode_common.h           | 279 ++++++++++++++++
> >>>
> >>> So far so good, but please drop the "cifs_" prefix from this new file's
> >>> name, since its contents apply to later smb dialects as well.
> >>
> >> Sure.
> >
> > Actually, would you be ok with smb_unicode_common ?  The reason is that
> > you end up with a module named unicode_common  that sounds too generic.
>
> A bit off topic, but a question for Steve.
>
> Is there a need for separate modules under fs/smb/common/? Or could the
> makefile do something like:
>
> obj-$(CONFIG_SMBFS) += smb_common.o
>
> smb_common-y := cifs.arc4.o cifs_md4.o smb_unicode.o


Since arc4 and md4 are used more rarely used than smb_unicode (and in
some environments
use of md4 could be forbidden), and also since arc4 and md4 are not
really smb/cifs but crypto,
seems more logical to keep them separate.    There are other things
like quic support
(which is important for smb3.1.1) that will probably be much larger
(even with upcalls) that
could also be distinct modules in fs/smb/common in the future.
Tom Talpey June 28, 2023, 3:01 p.m. UTC | #10
On 6/28/2023 10:39 AM, Dr. David Alan Gilbert wrote:
> * Dave Kleikamp (dave.kleikamp@oracle.com) wrote:
>> On 6/28/23 8:46AM, Dr. David Alan Gilbert wrote:
>>> * Dr. David Alan Gilbert (dave@treblig.org) wrote:
>>>> * Tom Talpey (tom@talpey.com) wrote:
>>>>> On 6/27/2023 9:14 PM, linux@treblig.org wrote:
>>>>>> From: "Dr. David Alan Gilbert" <linux@treblig.org>
>>>>>>
>>>>>> The smb client and server code have (mostly) duplicated code
>>>>>> for unicode manipulation, in particular upper case handling.
>>>>>>
>>>>>> Flatten this lot into shared code.
>>>>>>
>>>>>> There's some code that's slightly different between the two, and
>>>>>> I've not attempted to share that - this should be strictly a no
>>>>>> behaviour change set.
>>>>>>
>>>>>> I'd love to also boil out the same code from fs/jfs/ - but that's
>>>>>> a thought for another time (and harder since there's no good test
>>>>>> for it).
>>>>>>
>>>>>> Lightly tested with a module and a monolithic build, and just mounting
>>>>>> itself.
>>>>>>
>>>>>> This dupe was found using PMD:
>>>>>>      https://pmd.github.io/pmd/pmd_userdocs_cpd.html
>>>>>>
>>>>>> Dave
>>>>>>
>>>>>> Dr. David Alan Gilbert (3):
>>>>>>      fs/smb: Remove unicode 'lower' tables
>>>>>>      fs/smb: Swing unicode common code from server->common
>>>>>>      fs/smb/client: Use common code in client
>>>>>>
>>>>>>     fs/smb/client/cifs_unicode.c                  |   1 -
>>>>>>     fs/smb/client/cifs_unicode.h                  | 313 +-----------------
>>>>>>     fs/smb/client/cifs_uniupr.h                   | 239 -------------
>>>>>>     fs/smb/common/Makefile                        |   1 +
>>>>>>     .../uniupr.h => common/cifs_unicode_common.c} | 156 +--------
>>>>>>     fs/smb/common/cifs_unicode_common.h           | 279 ++++++++++++++++
>>>>>
>>>>> So far so good, but please drop the "cifs_" prefix from this new file's
>>>>> name, since its contents apply to later smb dialects as well.
>>>>
>>>> Sure.
>>>
>>> Actually, would you be ok with smb_unicode_common ?  The reason is that
>>> you end up with a module named unicode_common  that sounds too generic.
>>
>> I'd suggest make it generic and move it to fs/nls/. I'd run it by the nls
>> maintainers, but I don't think there are any.
> 
> Steve & Tom - would you be OK with that?

It's fine with me but there are "CifsUni<foo>" and "SmbUni<foo>" entry
points which will look odd in fs/nls. I'd be fine with removing those
dialect-specific Cifs/Smb prefixes, but there might be some collisions
to work out.

Tom.

> (Copying in Gabriel Bertazi, owner of fs/unicode; although this isn't
> utf-8)
> 
> Dave
> 
>> Shaggy
>>
>>>
>>> Dave
>>>
>>>> Dave
>>>>
>>>>> Tom.
>>>>>
>>>>>>     fs/smb/server/unicode.c                       |   1 -
>>>>>>     fs/smb/server/unicode.h                       | 301 +----------------
>>>>>>     8 files changed, 298 insertions(+), 993 deletions(-)
>>>>>>     delete mode 100644 fs/smb/client/cifs_uniupr.h
>>>>>>     rename fs/smb/{server/uniupr.h => common/cifs_unicode_common.c} (50%)
>>>>>>     create mode 100644 fs/smb/common/cifs_unicode_common.h
>>>>>>
>>>> -- 
>>>>    -----Open up your eyes, open up your mind, open up your code -------
>>>> / Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \
>>>> \        dave @ treblig.org |                               | In Hex /
>>>>    \ _________________________|_____ http://www.treblig.org   |_______/
Steve French June 28, 2023, 3:03 p.m. UTC | #11
On Wed, Jun 28, 2023 at 9:40 AM Dr. David Alan Gilbert
<linux@treblig.org> wrote:
> > > Actually, would you be ok with smb_unicode_common ?  The reason is that
> > > you end up with a module named unicode_common  that sounds too generic.
> >
> > I'd suggest make it generic and move it to fs/nls/. I'd run it by the nls
> > maintainers, but I don't think there are any.
>
> Steve & Tom - would you be OK with that?

Yes - absolutely

> (Copying in Gabriel Bertazi, owner of fs/unicode; although this isn't
> utf-8)

Unicode UCS-2
Dr. David Alan Gilbert June 28, 2023, 3:13 p.m. UTC | #12
* Steve French (smfrench@gmail.com) wrote:
> On Wed, Jun 28, 2023 at 9:40 AM Dr. David Alan Gilbert
> <linux@treblig.org> wrote:
> > > > Actually, would you be ok with smb_unicode_common ?  The reason is that
> > > > you end up with a module named unicode_common  that sounds too generic.
> > >
> > > I'd suggest make it generic and move it to fs/nls/. I'd run it by the nls
> > > maintainers, but I don't think there are any.
> >
> > Steve & Tom - would you be OK with that?
> 
> Yes - absolutely

OK.

> > (Copying in Gabriel Bertazi, owner of fs/unicode; although this isn't
> > utf-8)
> 
> Unicode UCS-2

(I'm going to regret the next question...)
So how does this compare to the stuff in include/linux/ucs2_string.h
and lib/ucs2_string.c ?

Dave

> 
> -- 
> Thanks,
> 
> Steve