mbox series

[v2,0/2] block: Use 'read-zeroes=true' mode by default with 'null-co' driver

Message ID 20210211142656.3818078-1-philmd@redhat.com (mailing list archive)
Headers show
Series block: Use 'read-zeroes=true' mode by default with 'null-co' driver | expand

Message

Philippe Mathieu-Daudé Feb. 11, 2021, 2:26 p.m. UTC
The null-co driver doesn't zeroize buffer in its default config,
because it is designed for testing and tests want to run fast.
However this confuses security researchers (access to uninit
buffers).

A one-line patch supposed which became a painful one, because
there is so many different syntax to express the same usage:

 opt = qdict_new();
 qdict_put_str(opt, "read-zeroes", "off");
 null_bs = bdrv_open("null-co://", NULL, opt, BDRV_O_RDWR | BDRV_O_PROTOCOL,
                     &error_abort);

vm.qmp('blockdev-add', driver='null-co', read_zeroes=False, ...)

vm.add_drive_raw("id=drive0,driver=null-co,read-zeroes=off,if=none")

    blk0 = { 'node-name': 'src',
        'driver': 'null-co',
        'read-zeroes': 'off' }

    'file': {
        'driver': 'null-co',
        'read-zeroes': False,
    }

    "file": {
        "driver": "null-co",
        "read-zeroes": "off"
    }

    { "execute": "blockdev-add",
      "arguments": {
          "driver": "null-co",
          "read-zeroes": false,
          "node-name": "disk0"
        }
    }

opts = {'driver': 'null-co,read-zeroes=off', 'node-name': 'root', 'size': 1024}

qemu -drive driver=null-co,read-zeroes=off

qemu-io ... "json:{'driver': 'null-co', 'read-zeroes': false, 'size': 65536}"

qemu-img null-co://,read-zeroes=off

qemu-img ... -o data_file="json:{'driver':'null-co',,'read-zeroes':false,,'size':'4294967296'}"

There are probably more.

Anyhow, the iotests I am not sure and should be audited are 056, 155
(I don't understand the syntax) and 162.

Please review,

Phil.

Philippe Mathieu-Daud=C3=A9 (2):
  block: Explicit null-co uses 'read-zeroes=3Dfalse'
  block/null: Enable 'read-zeroes' mode by default

 docs/devel/testing.rst                     | 14 +++++++-------
 tests/qtest/fuzz/generic_fuzz_configs.h    | 11 ++++++-----
 block/null.c                               |  2 +-
 tests/test-bdrv-drain.c                    | 10 ++++++++--
 tests/acceptance/virtio_check_params.py    |  2 +-
 tests/perf/block/qcow2/convert-blockstatus |  6 +++---
 tests/qemu-iotests/040                     |  2 +-
 tests/qemu-iotests/041                     | 12 ++++++++----
 tests/qemu-iotests/051                     |  2 +-
 tests/qemu-iotests/051.out                 |  2 +-
 tests/qemu-iotests/051.pc.out              |  4 ++--
 tests/qemu-iotests/087                     |  6 ++++--
 tests/qemu-iotests/118                     |  2 +-
 tests/qemu-iotests/133                     |  2 +-
 tests/qemu-iotests/153                     |  8 ++++----
 tests/qemu-iotests/184                     |  2 ++
 tests/qemu-iotests/184.out                 | 10 +++++-----
 tests/qemu-iotests/218                     |  3 +++
 tests/qemu-iotests/224                     |  3 ++-
 tests/qemu-iotests/224.out                 |  8 ++++----
 tests/qemu-iotests/225                     |  2 +-
 tests/qemu-iotests/227                     |  4 ++--
 tests/qemu-iotests/227.out                 |  4 ++--
 tests/qemu-iotests/228                     |  2 +-
 tests/qemu-iotests/235                     |  1 +
 tests/qemu-iotests/245                     |  2 +-
 tests/qemu-iotests/270                     |  2 +-
 tests/qemu-iotests/283                     |  3 ++-
 tests/qemu-iotests/283.out                 |  4 ++--
 tests/qemu-iotests/299                     |  1 +
 tests/qemu-iotests/299.out                 |  2 +-
 tests/qemu-iotests/300                     |  4 ++--
 32 files changed, 82 insertions(+), 60 deletions(-)

--=20
2.26.2

Comments

Alexander Bulekov Feb. 11, 2021, 3:42 p.m. UTC | #1
On 210211 1526, Philippe Mathieu-Daudé wrote:
> The null-co driver doesn't zeroize buffer in its default config,
> because it is designed for testing and tests want to run fast.
> However this confuses security researchers (access to uninit
> buffers).
> 

Interesting.. Is there an example bug report, where it raised alarms
because of an un-zeroed null-co:// buffer?
-Alex

> A one-line patch supposed which became a painful one, because
> there is so many different syntax to express the same usage:
> 
>  opt = qdict_new();
>  qdict_put_str(opt, "read-zeroes", "off");
>  null_bs = bdrv_open("null-co://", NULL, opt, BDRV_O_RDWR | BDRV_O_PROTOCOL,
>                      &error_abort);
> 
> vm.qmp('blockdev-add', driver='null-co', read_zeroes=False, ...)
> 
> vm.add_drive_raw("id=drive0,driver=null-co,read-zeroes=off,if=none")
> 
>     blk0 = { 'node-name': 'src',
>         'driver': 'null-co',
>         'read-zeroes': 'off' }
> 
>     'file': {
>         'driver': 'null-co',
>         'read-zeroes': False,
>     }
> 
>     "file": {
>         "driver": "null-co",
>         "read-zeroes": "off"
>     }
> 
>     { "execute": "blockdev-add",
>       "arguments": {
>           "driver": "null-co",
>           "read-zeroes": false,
>           "node-name": "disk0"
>         }
>     }
> 
> opts = {'driver': 'null-co,read-zeroes=off', 'node-name': 'root', 'size': 1024}
> 
> qemu -drive driver=null-co,read-zeroes=off
> 
> qemu-io ... "json:{'driver': 'null-co', 'read-zeroes': false, 'size': 65536}"
> 
> qemu-img null-co://,read-zeroes=off
> 
> qemu-img ... -o data_file="json:{'driver':'null-co',,'read-zeroes':false,,'size':'4294967296'}"
> 
> There are probably more.
> 
> Anyhow, the iotests I am not sure and should be audited are 056, 155
> (I don't understand the syntax) and 162.
> 
> Please review,
> 
> Phil.
> 
> Philippe Mathieu-Daud=C3=A9 (2):
>   block: Explicit null-co uses 'read-zeroes=3Dfalse'
>   block/null: Enable 'read-zeroes' mode by default
> 
>  docs/devel/testing.rst                     | 14 +++++++-------
>  tests/qtest/fuzz/generic_fuzz_configs.h    | 11 ++++++-----
>  block/null.c                               |  2 +-
>  tests/test-bdrv-drain.c                    | 10 ++++++++--
>  tests/acceptance/virtio_check_params.py    |  2 +-
>  tests/perf/block/qcow2/convert-blockstatus |  6 +++---
>  tests/qemu-iotests/040                     |  2 +-
>  tests/qemu-iotests/041                     | 12 ++++++++----
>  tests/qemu-iotests/051                     |  2 +-
>  tests/qemu-iotests/051.out                 |  2 +-
>  tests/qemu-iotests/051.pc.out              |  4 ++--
>  tests/qemu-iotests/087                     |  6 ++++--
>  tests/qemu-iotests/118                     |  2 +-
>  tests/qemu-iotests/133                     |  2 +-
>  tests/qemu-iotests/153                     |  8 ++++----
>  tests/qemu-iotests/184                     |  2 ++
>  tests/qemu-iotests/184.out                 | 10 +++++-----
>  tests/qemu-iotests/218                     |  3 +++
>  tests/qemu-iotests/224                     |  3 ++-
>  tests/qemu-iotests/224.out                 |  8 ++++----
>  tests/qemu-iotests/225                     |  2 +-
>  tests/qemu-iotests/227                     |  4 ++--
>  tests/qemu-iotests/227.out                 |  4 ++--
>  tests/qemu-iotests/228                     |  2 +-
>  tests/qemu-iotests/235                     |  1 +
>  tests/qemu-iotests/245                     |  2 +-
>  tests/qemu-iotests/270                     |  2 +-
>  tests/qemu-iotests/283                     |  3 ++-
>  tests/qemu-iotests/283.out                 |  4 ++--
>  tests/qemu-iotests/299                     |  1 +
>  tests/qemu-iotests/299.out                 |  2 +-
>  tests/qemu-iotests/300                     |  4 ++--
>  32 files changed, 82 insertions(+), 60 deletions(-)
> 
> --=20
> 2.26.2
> 
>
Philippe Mathieu-Daudé Feb. 12, 2021, 2:32 p.m. UTC | #2
On 2/11/21 4:42 PM, Alexander Bulekov wrote:
> On 210211 1526, Philippe Mathieu-Daudé wrote:
>> The null-co driver doesn't zeroize buffer in its default config,
>> because it is designed for testing and tests want to run fast.
>> However this confuses security researchers (access to uninit
>> buffers).
>>
> 
> Interesting.. Is there an example bug report, where it raised alarms
> because of an un-zeroed null-co:// buffer?

No, but I found a similar mention here:
https://www.mail-archive.com/qemu-block@nongnu.org/msg52045.html

Example:

$ valgrind qemu-system-i386 -S -drive
file=null-co://,format=raw,file.read-zeroes=on

$ valgrind qemu-system-i386 -S -drive
file=null-co://,format=raw,file.read-zeroes=off
==4048219== Conditional jump or move depends on uninitialised value(s)
==4048219==    at 0x4E19CC: guess_disk_lchs (hd-geometry.c:70)
==4048219==    by 0x4E1C72: hd_geometry_guess (hd-geometry.c:131)
==4048219==    by 0x4E0F0F: blkconf_geometry (block.c:183)
==4048219==    by 0x563727: ide_dev_initfn (qdev.c:201)
==4048219==    by 0x563AE4: ide_hd_realize (qdev.c:278)
==4048219==    by 0x563320: ide_qdev_realize (qdev.c:124)
==4048219==    by 0x8F8EAA: device_set_realized (qdev.c:761)
==4048219==    by 0x902347: property_set_bool (object.c:2255)
==4048219==    by 0x900441: object_property_set (object.c:1400)
==4048219==    by 0x904467: object_property_set_qobject (qom-qobject.c:28)
==4048219==    by 0x9007A4: object_property_set_bool (object.c:1470)
==4048219==    by 0x8F7F3B: qdev_realize (qdev.c:389)
Fam Feb. 13, 2021, 9:54 p.m. UTC | #3
On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote:
> The null-co driver doesn't zeroize buffer in its default config,
> because it is designed for testing and tests want to run fast.
> However this confuses security researchers (access to uninit
> buffers).

I'm a little surprised.

Is changing default the only way to fix this? I'm not opposed to
changing the default but I'm not convinced this is the easiest way.
block/nvme.c also doesn't touch the memory, but defers to the device
DMA, why doesn't that confuse the security checker?

Cannot we just somehow annotate it in a way that the checker can
understand (akin to how we provide coverity models) and be done?

Thanks,
Fam

> 
> A one-line patch supposed which became a painful one, because
> there is so many different syntax to express the same usage:
> 
>  opt = qdict_new();
>  qdict_put_str(opt, "read-zeroes", "off");
>  null_bs = bdrv_open("null-co://", NULL, opt, BDRV_O_RDWR | BDRV_O_PROTOCOL,
>                      &error_abort);
> 
> vm.qmp('blockdev-add', driver='null-co', read_zeroes=False, ...)
> 
> vm.add_drive_raw("id=drive0,driver=null-co,read-zeroes=off,if=none")
> 
>     blk0 = { 'node-name': 'src',
>         'driver': 'null-co',
>         'read-zeroes': 'off' }
> 
>     'file': {
>         'driver': 'null-co',
>         'read-zeroes': False,
>     }
> 
>     "file": {
>         "driver": "null-co",
>         "read-zeroes": "off"
>     }
> 
>     { "execute": "blockdev-add",
>       "arguments": {
>           "driver": "null-co",
>           "read-zeroes": false,
>           "node-name": "disk0"
>         }
>     }
> 
> opts = {'driver': 'null-co,read-zeroes=off', 'node-name': 'root', 'size': 1024}
> 
> qemu -drive driver=null-co,read-zeroes=off
> 
> qemu-io ... "json:{'driver': 'null-co', 'read-zeroes': false, 'size': 65536}"
> 
> qemu-img null-co://,read-zeroes=off
> 
> qemu-img ... -o data_file="json:{'driver':'null-co',,'read-zeroes':false,,'size':'4294967296'}"
> 
> There are probably more.
> 
> Anyhow, the iotests I am not sure and should be audited are 056, 155
> (I don't understand the syntax) and 162.
> 
> Please review,
> 
> Phil.
> 
> Philippe Mathieu-Daud=C3=A9 (2):
>   block: Explicit null-co uses 'read-zeroes=3Dfalse'
>   block/null: Enable 'read-zeroes' mode by default
> 
>  docs/devel/testing.rst                     | 14 +++++++-------
>  tests/qtest/fuzz/generic_fuzz_configs.h    | 11 ++++++-----
>  block/null.c                               |  2 +-
>  tests/test-bdrv-drain.c                    | 10 ++++++++--
>  tests/acceptance/virtio_check_params.py    |  2 +-
>  tests/perf/block/qcow2/convert-blockstatus |  6 +++---
>  tests/qemu-iotests/040                     |  2 +-
>  tests/qemu-iotests/041                     | 12 ++++++++----
>  tests/qemu-iotests/051                     |  2 +-
>  tests/qemu-iotests/051.out                 |  2 +-
>  tests/qemu-iotests/051.pc.out              |  4 ++--
>  tests/qemu-iotests/087                     |  6 ++++--
>  tests/qemu-iotests/118                     |  2 +-
>  tests/qemu-iotests/133                     |  2 +-
>  tests/qemu-iotests/153                     |  8 ++++----
>  tests/qemu-iotests/184                     |  2 ++
>  tests/qemu-iotests/184.out                 | 10 +++++-----
>  tests/qemu-iotests/218                     |  3 +++
>  tests/qemu-iotests/224                     |  3 ++-
>  tests/qemu-iotests/224.out                 |  8 ++++----
>  tests/qemu-iotests/225                     |  2 +-
>  tests/qemu-iotests/227                     |  4 ++--
>  tests/qemu-iotests/227.out                 |  4 ++--
>  tests/qemu-iotests/228                     |  2 +-
>  tests/qemu-iotests/235                     |  1 +
>  tests/qemu-iotests/245                     |  2 +-
>  tests/qemu-iotests/270                     |  2 +-
>  tests/qemu-iotests/283                     |  3 ++-
>  tests/qemu-iotests/283.out                 |  4 ++--
>  tests/qemu-iotests/299                     |  1 +
>  tests/qemu-iotests/299.out                 |  2 +-
>  tests/qemu-iotests/300                     |  4 ++--
>  32 files changed, 82 insertions(+), 60 deletions(-)
> 
> --=20
> 2.26.2
> 
> 
>
Max Reitz Feb. 19, 2021, 11:07 a.m. UTC | #4
On 13.02.21 22:54, Fam Zheng wrote:
> On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote:
>> The null-co driver doesn't zeroize buffer in its default config,
>> because it is designed for testing and tests want to run fast.
>> However this confuses security researchers (access to uninit
>> buffers).
> 
> I'm a little surprised.
> 
> Is changing default the only way to fix this? I'm not opposed to
> changing the default but I'm not convinced this is the easiest way.
> block/nvme.c also doesn't touch the memory, but defers to the device
> DMA, why doesn't that confuse the security checker?
> 
> Cannot we just somehow annotate it in a way that the checker can
> understand (akin to how we provide coverity models) and be done?

The question is, why wouldn’t we change the default?  read-zeroes=true 
seems the better default to me.  I consider silencing valgrind warnings 
and the like a nice side effect.

Max
Philippe Mathieu-Daudé Feb. 19, 2021, 2:09 p.m. UTC | #5
On 2/19/21 12:07 PM, Max Reitz wrote:
> On 13.02.21 22:54, Fam Zheng wrote:
>> On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote:
>>> The null-co driver doesn't zeroize buffer in its default config,
>>> because it is designed for testing and tests want to run fast.
>>> However this confuses security researchers (access to uninit
>>> buffers).
>>
>> I'm a little surprised.
>>
>> Is changing default the only way to fix this? I'm not opposed to
>> changing the default but I'm not convinced this is the easiest way.
>> block/nvme.c also doesn't touch the memory, but defers to the device
>> DMA, why doesn't that confuse the security checker?

Generally speaking, there is a balance between security and performance.
We try to provide both, but when we can't, my understanding is security
is more important.

Customers expect a secure product. If they prefer performance and
at the price of security, it is also possible by enabling an option
that is not the default.

I'm not sure why you mention block/nvme here. I have the understanding
the null-co driver is only useful for testing. Are there production
cases where null-co is used?

BTW block/nvme is a particular driver where performance matters more
than security (but we have to make sure the users are aware of that).

>> Cannot we just somehow annotate it in a way that the checker can
>> understand (akin to how we provide coverity models) and be done?
> 
> The question is, why wouldn’t we change the default?  read-zeroes=true
> seems the better default to me.  I consider silencing valgrind warnings
> and the like a nice side effect.
> 
> Max
>
Fam Feb. 22, 2021, 5:35 p.m. UTC | #6
On 2021-02-19 15:09, Philippe Mathieu-Daudé wrote:
> On 2/19/21 12:07 PM, Max Reitz wrote:
> > On 13.02.21 22:54, Fam Zheng wrote:
> >> On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote:
> >>> The null-co driver doesn't zeroize buffer in its default config,
> >>> because it is designed for testing and tests want to run fast.
> >>> However this confuses security researchers (access to uninit
> >>> buffers).
> >>
> >> I'm a little surprised.
> >>
> >> Is changing default the only way to fix this? I'm not opposed to
> >> changing the default but I'm not convinced this is the easiest way.
> >> block/nvme.c also doesn't touch the memory, but defers to the device
> >> DMA, why doesn't that confuse the security checker?
> 
> Generally speaking, there is a balance between security and performance.
> We try to provide both, but when we can't, my understanding is security
> is more important.

Why is hiding the code path behind a non-default more secure? What is
not secure now?

Fam
Philippe Mathieu-Daudé Feb. 22, 2021, 5:55 p.m. UTC | #7
On 2/22/21 6:35 PM, Fam Zheng wrote:
> On 2021-02-19 15:09, Philippe Mathieu-Daudé wrote:
>> On 2/19/21 12:07 PM, Max Reitz wrote:
>>> On 13.02.21 22:54, Fam Zheng wrote:
>>>> On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote:
>>>>> The null-co driver doesn't zeroize buffer in its default config,
>>>>> because it is designed for testing and tests want to run fast.
>>>>> However this confuses security researchers (access to uninit
>>>>> buffers).
>>>>
>>>> I'm a little surprised.
>>>>
>>>> Is changing default the only way to fix this? I'm not opposed to
>>>> changing the default but I'm not convinced this is the easiest way.
>>>> block/nvme.c also doesn't touch the memory, but defers to the device
>>>> DMA, why doesn't that confuse the security checker?
>>
>> Generally speaking, there is a balance between security and performance.
>> We try to provide both, but when we can't, my understanding is security
>> is more important.
> 
> Why is hiding the code path behind a non-default more secure? What is
> not secure now?

Se we are back to the problem of having default values.

I'd like to remove the default and have the option explicit,
but qemu_opt_get_bool() expects a 'default' value.

Should we rename qemu_opt_get_bool() -> qemu_opt_get_bool_with_default()
and add a simpler qemu_opt_get_bool()?
Daniel P. Berrangé Feb. 22, 2021, 6:15 p.m. UTC | #8
On Fri, Feb 19, 2021 at 03:09:43PM +0100, Philippe Mathieu-Daudé wrote:
> On 2/19/21 12:07 PM, Max Reitz wrote:
> > On 13.02.21 22:54, Fam Zheng wrote:
> >> On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote:
> >>> The null-co driver doesn't zeroize buffer in its default config,
> >>> because it is designed for testing and tests want to run fast.
> >>> However this confuses security researchers (access to uninit
> >>> buffers).
> >>
> >> I'm a little surprised.
> >>
> >> Is changing default the only way to fix this? I'm not opposed to
> >> changing the default but I'm not convinced this is the easiest way.
> >> block/nvme.c also doesn't touch the memory, but defers to the device
> >> DMA, why doesn't that confuse the security checker?
> 
> Generally speaking, there is a balance between security and performance.
> We try to provide both, but when we can't, my understanding is security
> is more important.
> 
> Customers expect a secure product. If they prefer performance and
> at the price of security, it is also possible by enabling an option
> that is not the default.
> 
> I'm not sure why you mention block/nvme here. I have the understanding
> the null-co driver is only useful for testing. Are there production
> cases where null-co is used?

Do we have any real world figures for the performance of null-co
with & without  zero'ing ?  Before worrying about a tradeoff of
security vs performance, it'd be good to know if there is actually
a real world performance problem in the first place. Personally I'd
go for zero'ing by defualt unless the performance hit was really
bad.

> BTW block/nvme is a particular driver where performance matters more
> than security (but we have to make sure the users are aware of that).


Regards,
Daniel
Philippe Mathieu-Daudé Feb. 22, 2021, 6:36 p.m. UTC | #9
On 2/22/21 7:15 PM, Daniel P. Berrangé wrote:
> On Fri, Feb 19, 2021 at 03:09:43PM +0100, Philippe Mathieu-Daudé wrote:
>> On 2/19/21 12:07 PM, Max Reitz wrote:
>>> On 13.02.21 22:54, Fam Zheng wrote:
>>>> On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote:
>>>>> The null-co driver doesn't zeroize buffer in its default config,
>>>>> because it is designed for testing and tests want to run fast.
>>>>> However this confuses security researchers (access to uninit
>>>>> buffers).
>>>>
>>>> I'm a little surprised.
>>>>
>>>> Is changing default the only way to fix this? I'm not opposed to
>>>> changing the default but I'm not convinced this is the easiest way.
>>>> block/nvme.c also doesn't touch the memory, but defers to the device
>>>> DMA, why doesn't that confuse the security checker?
>>
>> Generally speaking, there is a balance between security and performance.
>> We try to provide both, but when we can't, my understanding is security
>> is more important.
>>
>> Customers expect a secure product. If they prefer performance and
>> at the price of security, it is also possible by enabling an option
>> that is not the default.
>>
>> I'm not sure why you mention block/nvme here. I have the understanding
>> the null-co driver is only useful for testing. Are there production
>> cases where null-co is used?
> 
> Do we have any real world figures for the performance of null-co
> with & without  zero'ing ?  Before worrying about a tradeoff of
> security vs performance, it'd be good to know if there is actually
> a real world performance problem in the first place. Personally I'd
> go for zero'ing by defualt unless the performance hit was really
> bad.

I simply wanted to help the security team by removing the
amount of reports using the null-co driver. This is probably
easier to resolve with one documentation line.
As I am not understanding where this thread is heading, I am
taking a step back with this topic. Please disregard this series.

Regards,

Phil.
Max Reitz Feb. 23, 2021, 8:44 a.m. UTC | #10
On 22.02.21 19:15, Daniel P. Berrangé wrote:
> On Fri, Feb 19, 2021 at 03:09:43PM +0100, Philippe Mathieu-Daudé wrote:
>> On 2/19/21 12:07 PM, Max Reitz wrote:
>>> On 13.02.21 22:54, Fam Zheng wrote:
>>>> On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote:
>>>>> The null-co driver doesn't zeroize buffer in its default config,
>>>>> because it is designed for testing and tests want to run fast.
>>>>> However this confuses security researchers (access to uninit
>>>>> buffers).
>>>>
>>>> I'm a little surprised.
>>>>
>>>> Is changing default the only way to fix this? I'm not opposed to
>>>> changing the default but I'm not convinced this is the easiest way.
>>>> block/nvme.c also doesn't touch the memory, but defers to the device
>>>> DMA, why doesn't that confuse the security checker?
>>
>> Generally speaking, there is a balance between security and performance.
>> We try to provide both, but when we can't, my understanding is security
>> is more important.
>>
>> Customers expect a secure product. If they prefer performance and
>> at the price of security, it is also possible by enabling an option
>> that is not the default.
>>
>> I'm not sure why you mention block/nvme here. I have the understanding
>> the null-co driver is only useful for testing. Are there production
>> cases where null-co is used?
> 
> Do we have any real world figures for the performance of null-co
> with & without  zero'ing ?  Before worrying about a tradeoff of
> security vs performance, it'd be good to know if there is actually
> a real world performance problem in the first place. Personally I'd
> go for zero'ing by defualt unless the performance hit was really
> bad.

AFAIU, null-co is only used for testing, be it to just create some block 
nodes in the iotests, or perhaps for performance testing where you want 
to get the minimal roundtrip time through the block layer.  So there is 
no "real world performance problem", because there is no real world use 
of null-co or null-aio.  At least there shouldn’t be.

That begs the question of whether read-zeroes=off even makes sense, and 
I think it absolutely does.

In cases where we have a test that just wants a simple block node that 
doesn’t use disk space, the memset() can’t be noticeable.  But it’s just 
a test, so do we even need the memset()?  Strictly speaking, perhaps 
not, but if someone is to run it via Valgrind or something, they may get 
false positives, so just doing the memset() is the right thing to do.

For performance tests, it must be possible to set read-zeroes=off, 
because even though “that memset() isn’t noticeable in a functional 
test”, in a hard-core performance test, it will be.

So we need a switch.  It should default to memset(), because (1) making 
tools like Valgrind happy seems like a reasonable objective to me, and 
(2) in the majority of cases, the memset() cannot have a noticeable impact.

Max
Fam Feb. 23, 2021, 9:21 a.m. UTC | #11
On 2021-02-22 18:55, Philippe Mathieu-Daudé wrote:
> On 2/22/21 6:35 PM, Fam Zheng wrote:
> > On 2021-02-19 15:09, Philippe Mathieu-Daudé wrote:
> >> On 2/19/21 12:07 PM, Max Reitz wrote:
> >>> On 13.02.21 22:54, Fam Zheng wrote:
> >>>> On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote:
> >>>>> The null-co driver doesn't zeroize buffer in its default config,
> >>>>> because it is designed for testing and tests want to run fast.
> >>>>> However this confuses security researchers (access to uninit
> >>>>> buffers).
> >>>>
> >>>> I'm a little surprised.
> >>>>
> >>>> Is changing default the only way to fix this? I'm not opposed to
> >>>> changing the default but I'm not convinced this is the easiest way.
> >>>> block/nvme.c also doesn't touch the memory, but defers to the device
> >>>> DMA, why doesn't that confuse the security checker?
> >>
> >> Generally speaking, there is a balance between security and performance.
> >> We try to provide both, but when we can't, my understanding is security
> >> is more important.
> > 
> > Why is hiding the code path behind a non-default more secure? What is
> > not secure now?
> 
> Se we are back to the problem of having default values.
> 
> I'd like to remove the default and have the option explicit,
> but qemu_opt_get_bool() expects a 'default' value.
> 
> Should we rename qemu_opt_get_bool() -> qemu_opt_get_bool_with_default()
> and add a simpler qemu_opt_get_bool()?

My point is I still don't get the full context of the problem this
series is trying to solve. If the problem is tools are confused, I would
like to understand why. What is the thing that matters here, exactly?

But there's always been nullblk.ko in kernel that doesn't lie in the
name. If we change the default we are not very honest any more about
what the driver actually does.

Even if null-co:// and null-aio:// is a bad idea, then let's add
zero-co://co and zero-aio://, and deprecate null-*://.

Fam
Daniel P. Berrangé Feb. 23, 2021, 9:29 a.m. UTC | #12
On Tue, Feb 23, 2021 at 09:44:51AM +0100, Max Reitz wrote:
> On 22.02.21 19:15, Daniel P. Berrangé wrote:
> > On Fri, Feb 19, 2021 at 03:09:43PM +0100, Philippe Mathieu-Daudé wrote:
> > > On 2/19/21 12:07 PM, Max Reitz wrote:
> > > > On 13.02.21 22:54, Fam Zheng wrote:
> > > > > On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote:
> > > > > > The null-co driver doesn't zeroize buffer in its default config,
> > > > > > because it is designed for testing and tests want to run fast.
> > > > > > However this confuses security researchers (access to uninit
> > > > > > buffers).
> > > > > 
> > > > > I'm a little surprised.
> > > > > 
> > > > > Is changing default the only way to fix this? I'm not opposed to
> > > > > changing the default but I'm not convinced this is the easiest way.
> > > > > block/nvme.c also doesn't touch the memory, but defers to the device
> > > > > DMA, why doesn't that confuse the security checker?
> > > 
> > > Generally speaking, there is a balance between security and performance.
> > > We try to provide both, but when we can't, my understanding is security
> > > is more important.
> > > 
> > > Customers expect a secure product. If they prefer performance and
> > > at the price of security, it is also possible by enabling an option
> > > that is not the default.
> > > 
> > > I'm not sure why you mention block/nvme here. I have the understanding
> > > the null-co driver is only useful for testing. Are there production
> > > cases where null-co is used?
> > 
> > Do we have any real world figures for the performance of null-co
> > with & without  zero'ing ?  Before worrying about a tradeoff of
> > security vs performance, it'd be good to know if there is actually
> > a real world performance problem in the first place. Personally I'd
> > go for zero'ing by defualt unless the performance hit was really
> > bad.
> 
> AFAIU, null-co is only used for testing, be it to just create some block
> nodes in the iotests, or perhaps for performance testing where you want to
> get the minimal roundtrip time through the block layer.  So there is no
> "real world performance problem", because there is no real world use of
> null-co or null-aio.  At least there shouldn’t be.
> 
> That begs the question of whether read-zeroes=off even makes sense, and I
> think it absolutely does.
> 
> In cases where we have a test that just wants a simple block node that
> doesn’t use disk space, the memset() can’t be noticeable.  But it’s just a
> test, so do we even need the memset()?  Strictly speaking, perhaps not, but
> if someone is to run it via Valgrind or something, they may get false
> positives, so just doing the memset() is the right thing to do.
> 
> For performance tests, it must be possible to set read-zeroes=off, because
> even though “that memset() isn’t noticeable in a functional test”, in a
> hard-core performance test, it will be.
> 
> So we need a switch.  It should default to memset(), because (1) making
> tools like Valgrind happy seems like a reasonable objective to me, and (2)
> in the majority of cases, the memset() cannot have a noticeable impact.

Yes, that all makes sense to me.


Regards,
Daniel
Max Reitz Feb. 23, 2021, 4:01 p.m. UTC | #13
On 23.02.21 10:21, Fam Zheng wrote:
> On 2021-02-22 18:55, Philippe Mathieu-Daudé wrote:
>> On 2/22/21 6:35 PM, Fam Zheng wrote:
>>> On 2021-02-19 15:09, Philippe Mathieu-Daudé wrote:
>>>> On 2/19/21 12:07 PM, Max Reitz wrote:
>>>>> On 13.02.21 22:54, Fam Zheng wrote:
>>>>>> On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote:
>>>>>>> The null-co driver doesn't zeroize buffer in its default config,
>>>>>>> because it is designed for testing and tests want to run fast.
>>>>>>> However this confuses security researchers (access to uninit
>>>>>>> buffers).
>>>>>>
>>>>>> I'm a little surprised.
>>>>>>
>>>>>> Is changing default the only way to fix this? I'm not opposed to
>>>>>> changing the default but I'm not convinced this is the easiest way.
>>>>>> block/nvme.c also doesn't touch the memory, but defers to the device
>>>>>> DMA, why doesn't that confuse the security checker?
>>>>
>>>> Generally speaking, there is a balance between security and performance.
>>>> We try to provide both, but when we can't, my understanding is security
>>>> is more important.
>>>
>>> Why is hiding the code path behind a non-default more secure? What is
>>> not secure now?
>>
>> Se we are back to the problem of having default values.
>>
>> I'd like to remove the default and have the option explicit,
>> but qemu_opt_get_bool() expects a 'default' value.
>>
>> Should we rename qemu_opt_get_bool() -> qemu_opt_get_bool_with_default()
>> and add a simpler qemu_opt_get_bool()?
> 
> My point is I still don't get the full context of the problem this
> series is trying to solve. If the problem is tools are confused, I would
> like to understand why. What is the thing that matters here, exactly?

My personal opinion is that it isn’t even about tools, it’s just about 
the fact that operating on uninitialized data is something that should 
generally be avoided.  For the null drivers, there is a reason to allow 
it (performance testing), but that should be a special case, not the 
default.

> But there's always been nullblk.ko in kernel that doesn't lie in the
> name. If we change the default we are not very honest any more about
> what the driver actually does.

Then we’re already lying, because if we model it after /dev/null, we 
should probably return -EIO on every read.

If a null device returns data, that data may be arbitrary, so we might 
as well memset() it to 0.  As I wrote in my reply to Daniel, I find it 
perfectly reasonable to make that the default: For functional tests 
(which I think are the majority of null’s users), it doesn’t make a 
difference except that operating on uninitialized data just isn’t a nice 
thing to do.

The only reasons I can see we wouldn’t change the default are (1) 
compatibility, which I don’t think is an issue for a test driver (plus, 
the only thing it might break are performance tests, which naively I 
think wouldn’t be a problem), and (2) it’s an additional gotcha when 
performance testing, but there are usually so many gotchas with 
performance testing, that I don’t see this as a problem either.

> Even if null-co:// and null-aio:// is a bad idea, then let's add
> zero-co://co and zero-aio://, and deprecate null-*://.
I find that too much work simply because it’s more work than just making 
read-zeroes=on the default, and I find doing that reasonable.

(Furthermore, we wouldn’t deprecate null-*, because it’s needed for 
performance testing.  We could add read-zeroes as an option to the new 
zero-* drivers, but I find that silly.)

Max
Fam Feb. 23, 2021, 5:21 p.m. UTC | #14
On 2021-02-23 17:01, Max Reitz wrote:
> On 23.02.21 10:21, Fam Zheng wrote:
> > On 2021-02-22 18:55, Philippe Mathieu-Daudé wrote:
> > > On 2/22/21 6:35 PM, Fam Zheng wrote:
> > > > On 2021-02-19 15:09, Philippe Mathieu-Daudé wrote:
> > > > > On 2/19/21 12:07 PM, Max Reitz wrote:
> > > > > > On 13.02.21 22:54, Fam Zheng wrote:
> > > > > > > On 2021-02-11 15:26, Philippe Mathieu-Daudé wrote:
> > > > > > > > The null-co driver doesn't zeroize buffer in its default config,
> > > > > > > > because it is designed for testing and tests want to run fast.
> > > > > > > > However this confuses security researchers (access to uninit
> > > > > > > > buffers).
> > > > > > > 
> > > > > > > I'm a little surprised.
> > > > > > > 
> > > > > > > Is changing default the only way to fix this? I'm not opposed to
> > > > > > > changing the default but I'm not convinced this is the easiest way.
> > > > > > > block/nvme.c also doesn't touch the memory, but defers to the device
> > > > > > > DMA, why doesn't that confuse the security checker?
> > > > > 
> > > > > Generally speaking, there is a balance between security and performance.
> > > > > We try to provide both, but when we can't, my understanding is security
> > > > > is more important.
> > > > 
> > > > Why is hiding the code path behind a non-default more secure? What is
> > > > not secure now?
> > > 
> > > Se we are back to the problem of having default values.
> > > 
> > > I'd like to remove the default and have the option explicit,
> > > but qemu_opt_get_bool() expects a 'default' value.
> > > 
> > > Should we rename qemu_opt_get_bool() -> qemu_opt_get_bool_with_default()
> > > and add a simpler qemu_opt_get_bool()?
> > 
> > My point is I still don't get the full context of the problem this
> > series is trying to solve. If the problem is tools are confused, I would
> > like to understand why. What is the thing that matters here, exactly?
> 
> My personal opinion is that it isn’t even about tools, it’s just about the
> fact that operating on uninitialized data is something that should generally
> be avoided.  For the null drivers, there is a reason to allow it
> (performance testing), but that should be a special case, not the default.

How do we define uninitialized data? Are buffers passed to a successful
read (2) syscall initialized? We cannot know, because it's up to the
fs/driver/storage. It's the same to bdrv_pread().

In fact block/null.c doesn't operate on uninitialized data, we should
really fix guess_disk_lchs() and similar.

> 
> > But there's always been nullblk.ko in kernel that doesn't lie in the
> > name. If we change the default we are not very honest any more about
> > what the driver actually does.
> 
> Then we’re already lying, because if we model it after /dev/null, we should
> probably return -EIO on every read.

nullblk.ko is not /dev/null, it's /dev/nullb*:

https://www.kernel.org/doc/Documentation/block/null_blk.txt

Fam