diff mbox series

[v2,01/16] nvme: fix pci doorbell size calculation

Message ID 20200415130159.611361-2-its@irrelevant.dk (mailing list archive)
State New, archived
Headers show
Series nvme: refactoring and cleanups | expand

Commit Message

Klaus Jensen April 15, 2020, 1:01 p.m. UTC
From: Klaus Jensen <k.jensen@samsung.com>

The size of the BAR is 0x1000 (main registers) + 8 bytes for each
queue. Currently, the size of the BAR is calculated like so:

    n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4);

Since the 'num_queues' parameter already accounts for the admin queue,
this should in any case not need to be incremented by one. Also, the
size should be initialized to (0x1000).

    n->reg_size = pow2ceil(0x1000 + 2 * n->num_queues * 4);

This, with the default value of num_queues (64), we will set aside room
for 1 admin queue and 63 I/O queues (4 bytes per doorbell, 2 doorbells
per queue).

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 hw/block/nvme.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Philippe Mathieu-Daudé April 15, 2020, 1:13 p.m. UTC | #1
On 4/15/20 3:01 PM, Klaus Jensen wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
> 
> The size of the BAR is 0x1000 (main registers) + 8 bytes for each
> queue. Currently, the size of the BAR is calculated like so:
> 
>      n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4);
> 
> Since the 'num_queues' parameter already accounts for the admin queue,
> this should in any case not need to be incremented by one. Also, the
> size should be initialized to (0x1000).
> 
>      n->reg_size = pow2ceil(0x1000 + 2 * n->num_queues * 4);
> 
> This, with the default value of num_queues (64), we will set aside room
> for 1 admin queue and 63 I/O queues (4 bytes per doorbell, 2 doorbells
> per queue).
> 
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>   hw/block/nvme.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index d28335cbf377..5b5f75c9d29e 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -43,6 +43,9 @@
>   #include "trace.h"
>   #include "nvme.h"
>   
> +#define NVME_REG_SIZE 0x1000
> +#define NVME_DB_SIZE  4
> +
>   #define NVME_GUEST_ERR(trace, fmt, ...) \
>       do { \
>           (trace_##trace)(__VA_ARGS__); \
> @@ -1345,7 +1348,9 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
>       pcie_endpoint_cap_init(pci_dev, 0x80);
>   
>       n->num_namespaces = 1;
> -    n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4);
> +
> +    /* num_queues is really number of pairs, so each has two doorbells */
> +    n->reg_size = pow2ceil(NVME_REG_SIZE + 2 * n->num_queues * NVME_DB_SIZE);

Unrelated to this change, but it would be cleaner to initialize reg_size 
using MAX_NUM_QUEUES, then in the I/O handler log GUEST_ERROR when 
registers > n->num_queues accessed. This would model closer to the hardware.

>       n->ns_size = bs_size / (uint64_t)n->num_namespaces;
>   
>       n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
>
Maxim Levitsky April 21, 2020, 9:39 a.m. UTC | #2
On Wed, 2020-04-15 at 15:13 +0200, Philippe Mathieu-Daudé wrote:
> On 4/15/20 3:01 PM, Klaus Jensen wrote:
> > From: Klaus Jensen <k.jensen@samsung.com>
> > 
> > The size of the BAR is 0x1000 (main registers) + 8 bytes for each
> > queue. Currently, the size of the BAR is calculated like so:
> > 
> >      n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4);
> > 
> > Since the 'num_queues' parameter already accounts for the admin queue,
> > this should in any case not need to be incremented by one. Also, the
> > size should be initialized to (0x1000).
> > 
> >      n->reg_size = pow2ceil(0x1000 + 2 * n->num_queues * 4);
> > 
> > This, with the default value of num_queues (64), we will set aside room
> > for 1 admin queue and 63 I/O queues (4 bytes per doorbell, 2 doorbells
> > per queue).
> > 
> > Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> > Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > ---
> >   hw/block/nvme.c | 7 ++++++-
> >   1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index d28335cbf377..5b5f75c9d29e 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -43,6 +43,9 @@
> >   #include "trace.h"
> >   #include "nvme.h"
> >   
> > +#define NVME_REG_SIZE 0x1000
> > +#define NVME_DB_SIZE  4
> > +
> >   #define NVME_GUEST_ERR(trace, fmt, ...) \
> >       do { \
> >           (trace_##trace)(__VA_ARGS__); \
> > @@ -1345,7 +1348,9 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
> >       pcie_endpoint_cap_init(pci_dev, 0x80);
> >   
> >       n->num_namespaces = 1;
> > -    n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4);
> > +
> > +    /* num_queues is really number of pairs, so each has two doorbells */
> > +    n->reg_size = pow2ceil(NVME_REG_SIZE + 2 * n->num_queues * NVME_DB_SIZE);
> 
> Unrelated to this change, but it would be cleaner to initialize reg_size 
> using MAX_NUM_QUEUES, then in the I/O handler log GUEST_ERROR when 
> registers > n->num_queues accessed. This would model closer to the hardware.
Agree.

Also keep in mind that NVME_DB_SIZE is configurable by setting the doorbell stride.
(but this is optional, so currently this code is OK)

Other than that,
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

> 
> >       n->ns_size = bs_size / (uint64_t)n->num_namespaces;
> >   
> >       n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
> > 
> 
> 


Best regards,
	Maxim Levitsky
diff mbox series

Patch

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d28335cbf377..5b5f75c9d29e 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -43,6 +43,9 @@ 
 #include "trace.h"
 #include "nvme.h"
 
+#define NVME_REG_SIZE 0x1000
+#define NVME_DB_SIZE  4
+
 #define NVME_GUEST_ERR(trace, fmt, ...) \
     do { \
         (trace_##trace)(__VA_ARGS__); \
@@ -1345,7 +1348,9 @@  static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     pcie_endpoint_cap_init(pci_dev, 0x80);
 
     n->num_namespaces = 1;
-    n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4);
+
+    /* num_queues is really number of pairs, so each has two doorbells */
+    n->reg_size = pow2ceil(NVME_REG_SIZE + 2 * n->num_queues * NVME_DB_SIZE);
     n->ns_size = bs_size / (uint64_t)n->num_namespaces;
 
     n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);