[net-next,v20,02/13] rtase: Implement the .ndo_open function

Message ID 20240607084321.7254-3-justinlai0215@realtek.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series Add Realtek automotive PCIe driver

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 5 of 5 maintainers
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch warning WARNING: quoted string split across lines
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Justin Lai June 7, 2024, 8:43 a.m. UTC
Implement the .ndo_open function to set default hardware settings
and initialize the descriptor ring and interrupts. Among them,
when requesting irq, because the first group of interrupts needs to
process more events, the overall structure will be different from
other groups of interrupts, so it needs to be processed separately.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
---
 .../net/ethernet/realtek/rtase/rtase_main.c   | 425 ++++++++++++++++++
 1 file changed, 425 insertions(+)

Comments

Markus Elfring June 13, 2024, 7:50 a.m. UTC | #1
> when requesting irq, because the first group of interrupts needs to
> process more events, the overall structure will be different from
> other groups of interrupts, so it needs to be processed separately.

Can such a change description become clearer anyhow?


…
> +++ b/drivers/net/ethernet/realtek/rtase/rtase_main.c
…
> +static int rtase_alloc_desc(struct rtase_private *tp)
> +{
…
> +			netdev_err(tp->dev, "Failed to allocate dma memory of "
> +					    "tx descriptor.\n");
…

Would you like to keep the message (from such string literals) in a single line?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?h=v6.10-rc3#n116
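
For reference, the single-line form that coding-style.rst asks for (a sketch of the corrected call, not the applied patch) would be:

	netdev_err(tp->dev, "Failed to allocate dma memory of tx descriptor.\n");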


…
> +static int rtase_alloc_rx_skb(const struct rtase_ring *ring,
…
> +{
…
> +	struct sk_buff *skb = NULL;
…
> +	int ret = 0;
…
> +	if (!page) {
> +		netdev_err(tp->dev, "failed to alloc page\n");
> +		goto err_out;
…
> +	if (!skb) {
> +		netdev_err(tp->dev, "failed to build skb\n");
> +		goto err_out;
> +	}
…
> +	return ret;

I find the following statement more appropriate.

	return 0;


> +
> +err_out:
> +	if (skb)
> +		dev_kfree_skb(skb);

Why would you like to repeat such a check after it can be determined
from the control flow that the used variable contains still a null pointer?


> +
> +	ret = -ENOMEM;
> +	rtase_make_unusable_by_asic(desc);
> +
> +	return ret;
> +}
…

It seems that the following statement can be more appropriate.

	return -ENOMEM;


May the local variable “ret” be omitted here?
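
Taken together, the suggested tail of the error path would read (a sketch only, based on the quoted code):

err_out:
	/* skb is known to be NULL on every path that reaches this label,
	 * so no dev_kfree_skb() call is needed.
	 */
	rtase_make_unusable_by_asic(desc);

	return -ENOMEM;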


…
> +static int rtase_open(struct net_device *dev)
> +{
…
> +	int ret;
> +
> +	ivec = &tp->int_vector[0];
> +	tp->rx_buf_sz = RTASE_RX_BUF_SIZE;
> +
> +	ret = rtase_alloc_desc(tp);
> +	if (ret)
> +		goto err_free_all_allocated_mem;
…

I suggest to return directly after such a resource allocation failure.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?h=v6.10-rc3#n532


How do you think about to increase the application of scope-based resource management?
https://elixir.bootlin.com/linux/v6.10-rc3/source/include/linux/cleanup.h#L8
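
As a point of comparison, a minimal sketch of the cleanup.h style (illustrative only; example_no_goto() is a hypothetical helper, not part of this patch):

#include <linux/cleanup.h>
#include <linux/slab.h>

static int example_no_goto(size_t len)
{
	/* __free(kfree) frees buf automatically when it goes out of
	 * scope, so no goto-based unwinding is needed.
	 */
	void *buf __free(kfree) = kzalloc(len, GFP_KERNEL);

	if (!buf)
		return -ENOMEM;

	/* ... use buf; kfree(buf) runs on every return path ... */
	return 0;
}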

Regards,
Markus
Justin Lai June 17, 2024, 9:25 a.m. UTC | #2
> …
> > when requesting irq, because the first group of interrupts needs to
> > process more events, the overall structure will be different from
> > other groups of interrupts, so it needs to be processed separately.
> 
> Can such a change description become clearer anyhow?

Do you think it's ok if I change the description to the following?
"When requesting interrupt, because the first group of interrupts needs
to process more events, the overall structure and interrupt handler will
be different from other groups of interrupts, so it needs to be handled
separately. The first set of interrupt handlers need to handle the
interrupt status of RXQ0 and TXQ0, TXQ4~7, while other groups of
interrupt handlers will handle the interrupt status of RXQ1&TXQ1 or
RXQ2&TXQ2 or RXQ3&TXQ3 according to the interrupt vector."

> 
> 
> …
> > +++ b/drivers/net/ethernet/realtek/rtase/rtase_main.c
> …
> > +static int rtase_alloc_desc(struct rtase_private *tp)
> > +{
> …
> > +                     netdev_err(tp->dev, "Failed to allocate dma memory of "
> > +                                         "tx descriptor.\n");
> …
> 
> Would you like to keep the message (from such string literals) in a single line?
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?h=v6.10-rc3#n116
> 

Yes, I will make corrections to keep the message in a single line.

> 
> …
> > +static int rtase_alloc_rx_skb(const struct rtase_ring *ring,
> …
> > +{
> …
> > +     struct sk_buff *skb = NULL;
> …
> > +     int ret = 0;
> …
> > +     if (!page) {
> > +             netdev_err(tp->dev, "failed to alloc page\n");
> > +             goto err_out;
> …
> > +     if (!skb) {
> …
> > +             netdev_err(tp->dev, "failed to build skb\n");
> > +             goto err_out;
> > +     }
> …
> > +     return ret;
> 
> I find the following statement more appropriate.
> 
>         return 0;

Thank you, I will make the changes.

> 
> 
> > +
> > +err_out:
> > +     if (skb)
> > +             dev_kfree_skb(skb);
> 
> Why would you like to repeat such a check after it can be determined from the
> control flow that the used variable contains still a null pointer?

Indeed, it seems unnecessary to make this judgment here, I will remove it.

> 
> 
> > +
> > +     ret = -ENOMEM;
> > +     rtase_make_unusable_by_asic(desc);
> > +
> > +     return ret;
> > +}
> …
> 
> It seems that the following statement can be more appropriate.
> 
>         return -ENOMEM;

Ok, I will modify it.

> 
> 
> May the local variable “ret” be omitted here?

Yes, it seems like "ret" is no longer needed.

> 
> 
> …
> > +static int rtase_open(struct net_device *dev)
> > +{
> …
> > +     int ret;
> > +
> > +     ivec = &tp->int_vector[0];
> > +     tp->rx_buf_sz = RTASE_RX_BUF_SIZE;
> > +
> > +     ret = rtase_alloc_desc(tp);
> > +     if (ret)
> > +             goto err_free_all_allocated_mem;
> …
> 
> I suggest to return directly after such a resource allocation failure.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?h=v6.10-rc3#n532
> 
> 
> How do you think about to increase the application of scope-based resource
> management?
> https://elixir.bootlin.com/linux/v6.10-rc3/source/include/linux/cleanup.h#L8

Due to our tx and rx each having multiple queues that need to
allocate descriptors, if any one of the queues fails to allocate,
rtase_alloc_desc() will return an error. Therefore, using 'goto'
here rather than directly returning seems to be reasonable.

> 
> Regards,
> Markus
Markus Elfring June 17, 2024, 10:07 a.m. UTC | #3
>> How do you think about to increase the application of scope-based resource management?
>> https://elixir.bootlin.com/linux/v6.10-rc3/source/include/linux/cleanup.h#L8
>
> Due to our tx and rx each having multiple queues that need to
> allocate descriptors, if any one of the queues fails to allocate,
> rtase_alloc_desc() will return an error. Therefore, using 'goto'
> here rather than directly returning seems to be reasonable.

Some goto chains can be replaced by further usage of advanced cleanup techniques,
can't they?

Regards,
Markus
Justin Lai June 17, 2024, 1:28 p.m. UTC | #4
> >> How do you think about to increase the application of scope-based resource
> >> management?
> >> https://elixir.bootlin.com/linux/v6.10-rc3/source/include/linux/cleanup.h#L8
> >
> > Due to our tx and rx each having multiple queues that need to allocate
> > descriptors, if any one of the queues fails to allocate,
> > rtase_alloc_desc() will return an error. Therefore, using 'goto'
> > here rather than directly returning seems to be reasonable.
> 
> Some goto chains can be replaced by further usage of advanced cleanup
> techniques, can't they?
> 
> Regards,
> Markus

rtase_alloc_desc() is used to allocate DMA memory. 
I'd like to ask if it's better to keep our current method?
Simon Horman June 17, 2024, 6:59 p.m. UTC | #5
On Mon, Jun 17, 2024 at 01:28:51PM +0000, Justin Lai wrote:
> 
> > >> How do you think about to increase the application of scope-based resource
> > >> management?
> > >> https://elixir.bootlin.com/linux/v6.10-rc3/source/include/linux/cleanup.h#L8
> > >
> > > Due to our tx and rx each having multiple queues that need to allocate
> > > descriptors, if any one of the queues fails to allocate,
> > > rtase_alloc_desc() will return an error. Therefore, using 'goto'
> > > here rather than directly returning seems to be reasonable.
> > 
> > Some goto chains can be replaced by further usage of advanced cleanup
> > techniques, can't they?
> > 
> > Regards,
> > Markus
> 
> rtase_alloc_desc() is used to allocate DMA memory. 
> I'd like to ask if it's better to keep our current method?

Hi Justin,

It may be the case that the techniques recently added by cleanup.h could be
used here. But, OTOH, it is the case that using goto for unwinding errors
is in keeping with current Networking driver best practices.

Regardless of the above, I would suggest that if an error occurs in
rtase_alloc_desc() then it release any resources it has allocated. Assuming
the elements of tp->tx_ring and tp->rx_ring are initialised to NULL when
rtase_alloc_desc is called, it looks like that can be achieved by
rtase_alloc_desc() calling rtase_free_desc().

Something like the following (completely untested!).
Please also note that there is probably no need to call netdev_err on
error, as the memory core should already log on error.

static int rtase_alloc_desc(struct rtase_private *tp)
{
	struct pci_dev *pdev = tp->pdev;
	u32 i;

	/* rx and tx descriptors needs 256 bytes alignment.
	 * dma_alloc_coherent provides more.
	 */
	for (i = 0; i < tp->func_tx_queue_num; i++) {
		tp->tx_ring[i].desc =
				dma_alloc_coherent(&pdev->dev,
						   RTASE_TX_RING_DESC_SIZE,
						   &tp->tx_ring[i].phy_addr,
						   GFP_KERNEL);
		if (!tp->tx_ring[i].desc)
			goto err;
	}

	for (i = 0; i < tp->func_rx_queue_num; i++) {
		tp->rx_ring[i].desc =
				dma_alloc_coherent(&pdev->dev,
						   RTASE_RX_RING_DESC_SIZE,
						   &tp->rx_ring[i].phy_addr,
						   GFP_KERNEL);
		if (!tp->rx_ring[i].desc)
			goto err;
	}

	return 0;

err:
	rtase_free_desc(tp);
	return -ENOMEM;
}

And then rtase_alloc_desc can be called like this in rtase_open():

static int rtase_open(struct net_device *dev)
{
	struct rtase_private *tp = netdev_priv(dev);
	const struct pci_dev *pdev = tp->pdev;
	struct rtase_int_vector *ivec;
	u16 i = 0, j;
	int ret;

	ivec = &tp->int_vector[0];
	tp->rx_buf_sz = RTASE_RX_BUF_SIZE;

	ret = rtase_alloc_desc(tp);
	if (ret)
		return ret;

	ret = rtase_init_ring(dev);
	if (ret)
		goto err_free_all_allocated_mem;

...

err_free_all_allocated_mem:
	rtase_free_desc(tp);

	return ret;
}

This would be in keeping with my understanding of best practices for
Networking drivers: that callers don't have to worry about cleaning up
resources allocated by functions that return an error.


I would also suggest reading Markus's advice with due care,
as it is not always aligned with best practice for Networking code.
Markus Elfring June 17, 2024, 8:22 p.m. UTC | #6
> I would also suggest reading Markus's advice with due care,
> as it is not always aligned with best practice for Networking code.

I dare to propose further collateral evolution according to available programming interfaces.

Regards,
Markus
Justin Lai June 18, 2024, 7:51 a.m. UTC | #7
> On Mon, Jun 17, 2024 at 01:28:51PM +0000, Justin Lai wrote:
> >
> > > >> How do you think about to increase the application of scope-based
> > > >> resource management?
> > > >> https://elixir.bootlin.com/linux/v6.10-rc3/source/include/linux/cleanup.h#L8
> > > >
> > > > Due to our tx and rx each having multiple queues that need to
> > > > allocate descriptors, if any one of the queues fails to allocate,
> > > > rtase_alloc_desc() will return an error. Therefore, using 'goto'
> > > > here rather than directly returning seems to be reasonable.
> > >
> > > Some goto chains can be replaced by further usage of advanced
> > > cleanup techniques, can't they?
> > >
> > > Regards,
> > > Markus
> >
> > rtase_alloc_desc() is used to allocate DMA memory.
> > I'd like to ask if it's better to keep our current method?
> 
> Hi Justin,
> 
> It may be the case that the techniques recently added by cleanup.h could be
> used here. But, OTOH, it is the case that using goto for unwinding errors is in
> keeping with current Networking driver best practices.
> 
> Regardless of the above, I would suggest that if an error occurs in
> rtase_alloc_desc() then it release any resources it has allocated. Assuming the
> elements of tp->tx_ring and tp->rx_ring are initialised to NULL when
> rtase_alloc_desc is called, it looks like that can be achieved by
> rtase_alloc_desc() calling rtase_free_desc().
> 
> Something like the following (completely untested!).
> Please also note that there is probably no need to call netdev_err on error, as
> the memory core should already log on error.
> 
> [...]
> 
> And then rtase_alloc_desc can be called like this in rtase_open():
> 
> [...]
> 
> This would be in keeping with my understanding of best practices for
> Networking drivers: that callers don't have to worry about cleaning up
> resources allocated by functions that return an error.
> 
> 
> I would also suggest reading Markus's advice with due care, as it is not always
> aligned with best practice for Networking code.

Hi Simon,
Thank you for your response. Based on your suggestion, if the descriptor
allocation of DMA memory fails, I will directly do error handling in
rtase_alloc_desc() using the 'goto' method. Moreover, in the error
handling section, I will call rtase_free_desc(tp) to free the already
allocated descriptor.
Justin Lai June 18, 2024, 11:01 a.m. UTC | #8
> 
> > I would also suggest reading Markus's advice with due care, as it is
> > not always aligned with best practice for Networking code.
> 
> I dare to propose further collateral evolution according to available
> programming interfaces.
> 
> Regards,
> Markus

Thank you for your suggestion, but since we still need to survey the new
method, we want to use the goto method for this current version of the
patch and make modifications based on Simon's suggestions.
Markus Elfring June 18, 2024, 11:30 a.m. UTC | #9
>> I dare to propose further collateral evolution according to available
>> programming interfaces.

> Thank you for your suggestion,

I became curious how the clarification will evolve further for adjusting
API usage in some ways.


> but since we still need to survey the new method,

Would you like to take another look at any intermediate application statistics?

Example:
Looking at guard usage (with SmPL)
https://lore.kernel.org/cocci/2dc6a1c7-79bf-42e3-95cc-599a1e154f57@web.de/
https://sympa.inria.fr/sympa/arc/cocci/2024-05/msg00090.html
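
For context, the guard() form referenced there looks roughly like this (a hypothetical sketch; some_lock is illustrative, not from this patch):

#include <linux/cleanup.h>
#include <linux/mutex.h>

static DEFINE_MUTEX(some_lock);	/* illustrative lock */

static void example_guard(void)
{
	guard(mutex)(&some_lock);	/* unlocked automatically at scope exit */
	/* ... critical section ... */
}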


> we want to use the goto method for this current version of the patch

Goto chains can still be applied for another while.


> and make modifications based on Simon's suggestions.

The change acceptance is evolving also according to known software transformations,
isn't it?

Regards,
Markus

Patch

diff --git a/drivers/net/ethernet/realtek/rtase/rtase_main.c b/drivers/net/ethernet/realtek/rtase/rtase_main.c
index b2749cc72ebe..80ee963971c0 100644
--- a/drivers/net/ethernet/realtek/rtase/rtase_main.c
+++ b/drivers/net/ethernet/realtek/rtase/rtase_main.c
@@ -131,6 +131,297 @@  static u32 rtase_r32(const struct rtase_private *tp, u16 reg)
 	return readl(tp->mmio_addr + reg);
 }
 
+static int rtase_alloc_desc(struct rtase_private *tp)
+{
+	struct pci_dev *pdev = tp->pdev;
+	u32 i;
+
+	/* rx and tx descriptors needs 256 bytes alignment.
+	 * dma_alloc_coherent provides more.
+	 */
+	for (i = 0; i < tp->func_tx_queue_num; i++) {
+		tp->tx_ring[i].desc =
+				dma_alloc_coherent(&pdev->dev,
+						   RTASE_TX_RING_DESC_SIZE,
+						   &tp->tx_ring[i].phy_addr,
+						   GFP_KERNEL);
+		if (!tp->tx_ring[i].desc) {
+			netdev_err(tp->dev, "Failed to allocate dma memory of "
+					    "tx descriptor.\n");
+			return -ENOMEM;
+		}
+	}
+
+	for (i = 0; i < tp->func_rx_queue_num; i++) {
+		tp->rx_ring[i].desc =
+				dma_alloc_coherent(&pdev->dev,
+						   RTASE_RX_RING_DESC_SIZE,
+						   &tp->rx_ring[i].phy_addr,
+						   GFP_KERNEL);
+		if (!tp->rx_ring[i].desc) {
+			netdev_err(tp->dev, "Failed to allocate dma memory of "
+					    "rx descriptor.\n");
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
+static void rtase_free_desc(struct rtase_private *tp)
+{
+	struct pci_dev *pdev = tp->pdev;
+	u32 i;
+
+	for (i = 0; i < tp->func_tx_queue_num; i++) {
+		if (!tp->tx_ring[i].desc)
+			continue;
+
+		dma_free_coherent(&pdev->dev, RTASE_TX_RING_DESC_SIZE,
+				  tp->tx_ring[i].desc,
+				  tp->tx_ring[i].phy_addr);
+		tp->tx_ring[i].desc = NULL;
+	}
+
+	for (i = 0; i < tp->func_rx_queue_num; i++) {
+		if (!tp->rx_ring[i].desc)
+			continue;
+
+		dma_free_coherent(&pdev->dev, RTASE_RX_RING_DESC_SIZE,
+				  tp->rx_ring[i].desc,
+				  tp->rx_ring[i].phy_addr);
+		tp->rx_ring[i].desc = NULL;
+	}
+}
+
+static void rtase_mark_to_asic(union rtase_rx_desc *desc, u32 rx_buf_sz)
+{
+	u32 eor = le32_to_cpu(desc->desc_cmd.opts1) & RTASE_RING_END;
+
+	desc->desc_status.opts2 = 0;
+	/* force memory writes to complete before releasing descriptor */
+	dma_wmb();
+	WRITE_ONCE(desc->desc_cmd.opts1,
+		   cpu_to_le32(RTASE_DESC_OWN | eor | rx_buf_sz));
+}
+
+static void rtase_tx_desc_init(struct rtase_private *tp, u16 idx)
+{
+	struct rtase_ring *ring = &tp->tx_ring[idx];
+	struct rtase_tx_desc *desc;
+	u32 i;
+
+	memset(ring->desc, 0x0, RTASE_TX_RING_DESC_SIZE);
+	memset(ring->skbuff, 0x0, sizeof(ring->skbuff));
+	ring->cur_idx = 0;
+	ring->dirty_idx = 0;
+	ring->index = idx;
+
+	for (i = 0; i < RTASE_NUM_DESC; i++) {
+		ring->mis.len[i] = 0;
+		if ((RTASE_NUM_DESC - 1) == i) {
+			desc = ring->desc + sizeof(struct rtase_tx_desc) * i;
+			desc->opts1 = cpu_to_le32(RTASE_RING_END);
+		}
+	}
+
+	ring->ring_handler = tx_handler;
+	if (idx < 4) {
+		ring->ivec = &tp->int_vector[idx];
+		list_add_tail(&ring->ring_entry,
+			      &tp->int_vector[idx].ring_list);
+	} else {
+		ring->ivec = &tp->int_vector[0];
+		list_add_tail(&ring->ring_entry, &tp->int_vector[0].ring_list);
+	}
+}
+
+static void rtase_map_to_asic(union rtase_rx_desc *desc, dma_addr_t mapping,
+			      u32 rx_buf_sz)
+{
+	desc->desc_cmd.addr = cpu_to_le64(mapping);
+
+	rtase_mark_to_asic(desc, rx_buf_sz);
+}
+
+static void rtase_make_unusable_by_asic(union rtase_rx_desc *desc)
+{
+	desc->desc_cmd.addr = cpu_to_le64(RTK_MAGIC_NUMBER);
+	desc->desc_cmd.opts1 &= ~cpu_to_le32(RTASE_DESC_OWN | RSVD_MASK);
+}
+
+static int rtase_alloc_rx_skb(const struct rtase_ring *ring,
+			      struct sk_buff **p_sk_buff,
+			      union rtase_rx_desc *desc,
+			      dma_addr_t *rx_phy_addr, u8 in_intr)
+{
+	struct rtase_int_vector *ivec = ring->ivec;
+	const struct rtase_private *tp = ivec->tp;
+	struct sk_buff *skb = NULL;
+	dma_addr_t mapping;
+	struct page *page;
+	void *buf_addr;
+	int ret = 0;
+
+	page = page_pool_dev_alloc_pages(tp->page_pool);
+	if (!page) {
+		netdev_err(tp->dev, "failed to alloc page\n");
+		goto err_out;
+	}
+
+	buf_addr = page_address(page);
+	mapping = page_pool_get_dma_addr(page);
+
+	skb = build_skb(buf_addr, PAGE_SIZE);
+	if (!skb) {
+		page_pool_put_full_page(tp->page_pool, page, true);
+		netdev_err(tp->dev, "failed to build skb\n");
+		goto err_out;
+	}
+
+	*p_sk_buff = skb;
+	*rx_phy_addr = mapping;
+	rtase_map_to_asic(desc, mapping, tp->rx_buf_sz);
+
+	return ret;
+
+err_out:
+	if (skb)
+		dev_kfree_skb(skb);
+
+	ret = -ENOMEM;
+	rtase_make_unusable_by_asic(desc);
+
+	return ret;
+}
+
+static u32 rtase_rx_ring_fill(struct rtase_ring *ring, u32 ring_start,
+			      u32 ring_end, u8 in_intr)
+{
+	union rtase_rx_desc *desc_base = ring->desc;
+	u32 cur;
+
+	for (cur = ring_start; ring_end - cur > 0; cur++) {
+		u32 i = cur % RTASE_NUM_DESC;
+		union rtase_rx_desc *desc = desc_base + i;
+		int ret;
+
+		if (ring->skbuff[i])
+			continue;
+
+		ret = rtase_alloc_rx_skb(ring, &ring->skbuff[i], desc,
+					 &ring->mis.data_phy_addr[i],
+					 in_intr);
+		if (ret)
+			break;
+	}
+
+	return cur - ring_start;
+}
+
+static void rtase_mark_as_last_descriptor(union rtase_rx_desc *desc)
+{
+	desc->desc_cmd.opts1 |= cpu_to_le32(RTASE_RING_END);
+}
+
+static void rtase_rx_ring_clear(struct rtase_ring *ring)
+{
+	union rtase_rx_desc *desc;
+	u32 i;
+
+	for (i = 0; i < RTASE_NUM_DESC; i++) {
+		desc = ring->desc + sizeof(union rtase_rx_desc) * i;
+
+		if (!ring->skbuff[i])
+			continue;
+
+		skb_mark_for_recycle(ring->skbuff[i]);
+
+		dev_kfree_skb(ring->skbuff[i]);
+
+		ring->skbuff[i] = NULL;
+
+		rtase_make_unusable_by_asic(desc);
+	}
+}
+
+static void rtase_rx_desc_init(struct rtase_private *tp, u16 idx)
+{
+	struct rtase_ring *ring = &tp->rx_ring[idx];
+	u16 i;
+
+	memset(ring->desc, 0x0, RTASE_RX_RING_DESC_SIZE);
+	memset(ring->skbuff, 0x0, sizeof(ring->skbuff));
+	ring->cur_idx = 0;
+	ring->dirty_idx = 0;
+	ring->index = idx;
+
+	for (i = 0; i < RTASE_NUM_DESC; i++)
+		ring->mis.data_phy_addr[i] = 0;
+
+	ring->ring_handler = rx_handler;
+	ring->ivec = &tp->int_vector[idx];
+	list_add_tail(&ring->ring_entry, &tp->int_vector[idx].ring_list);
+}
+
+static void rtase_rx_clear(struct rtase_private *tp)
+{
+	u32 i;
+
+	for (i = 0; i < tp->func_rx_queue_num; i++)
+		rtase_rx_ring_clear(&tp->rx_ring[i]);
+
+	page_pool_destroy(tp->page_pool);
+	tp->page_pool = NULL;
+}
+
+static int rtase_init_ring(const struct net_device *dev)
+{
+	struct rtase_private *tp = netdev_priv(dev);
+	struct page_pool_params pp_params = { 0 };
+	struct page_pool *page_pool;
+	u32 num;
+	u16 i;
+
+	pp_params.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV;
+	pp_params.order = 0;
+	pp_params.pool_size = RTASE_NUM_DESC * tp->func_rx_queue_num;
+	pp_params.nid = dev_to_node(&tp->pdev->dev);
+	pp_params.dev = &tp->pdev->dev;
+	pp_params.dma_dir = DMA_FROM_DEVICE;
+	pp_params.max_len = PAGE_SIZE;
+	pp_params.offset = 0;
+
+	page_pool = page_pool_create(&pp_params);
+	if (IS_ERR(page_pool)) {
+		netdev_err(tp->dev, "failed to create page pool\n");
+		return -ENOMEM;
+	}
+
+	tp->page_pool = page_pool;
+
+	for (i = 0; i < tp->func_tx_queue_num; i++)
+		rtase_tx_desc_init(tp, i);
+
+	for (i = 0; i < tp->func_rx_queue_num; i++) {
+		rtase_rx_desc_init(tp, i);
+		num = rtase_rx_ring_fill(&tp->rx_ring[i], 0,
+					 RTASE_NUM_DESC, 0);
+		if (num != RTASE_NUM_DESC)
+			goto err_out;
+
+		rtase_mark_as_last_descriptor(tp->rx_ring[i].desc +
+					      sizeof(union rtase_rx_desc) *
+					      (RTASE_NUM_DESC - 1));
+	}
+
+	return 0;
+
+err_out:
+	rtase_rx_clear(tp);
+	return -ENOMEM;
+}
+
 static void rtase_tally_counter_clear(const struct rtase_private *tp)
 {
 	u32 cmd = lower_32_bits(tp->tally_paddr);
@@ -139,6 +430,130 @@  static void rtase_tally_counter_clear(const struct rtase_private *tp)
 	rtase_w32(tp, RTASE_DTCCR0, cmd | RTASE_COUNTER_RESET);
 }
 
+static void rtase_nic_enable(const struct net_device *dev)
+{
+	const struct rtase_private *tp = netdev_priv(dev);
+	u16 rcr = rtase_r16(tp, RTASE_RX_CONFIG_1);
+	u8 val;
+
+	rtase_w16(tp, RTASE_RX_CONFIG_1, rcr & ~RTASE_PCIE_RELOAD_EN);
+	rtase_w16(tp, RTASE_RX_CONFIG_1, rcr | RTASE_PCIE_RELOAD_EN);
+
+	val = rtase_r8(tp, RTASE_CHIP_CMD);
+	rtase_w8(tp, RTASE_CHIP_CMD, val | RTASE_TE | RTASE_RE);
+
+	val = rtase_r8(tp, RTASE_MISC);
+	rtase_w8(tp, RTASE_MISC, val & ~RTASE_RX_DV_GATE_EN);
+}
+
+static void rtase_enable_hw_interrupt(const struct rtase_private *tp)
+{
+	const struct rtase_int_vector *ivec = &tp->int_vector[0];
+	u32 i;
+
+	rtase_w32(tp, ivec->imr_addr, ivec->imr);
+
+	for (i = 1; i < tp->int_nums; i++) {
+		ivec = &tp->int_vector[i];
+		rtase_w16(tp, ivec->imr_addr, ivec->imr);
+	}
+}
+
+static void rtase_hw_start(const struct net_device *dev)
+{
+	const struct rtase_private *tp = netdev_priv(dev);
+
+	rtase_nic_enable(dev);
+	rtase_enable_hw_interrupt(tp);
+}
+
+static int rtase_open(struct net_device *dev)
+{
+	struct rtase_private *tp = netdev_priv(dev);
+	const struct pci_dev *pdev = tp->pdev;
+	struct rtase_int_vector *ivec;
+	u16 i = 0, j;
+	int ret;
+
+	ivec = &tp->int_vector[0];
+	tp->rx_buf_sz = RTASE_RX_BUF_SIZE;
+
+	ret = rtase_alloc_desc(tp);
+	if (ret)
+		goto err_free_all_allocated_mem;
+
+	ret = rtase_init_ring(dev);
+	if (ret)
+		goto err_free_all_allocated_mem;
+
+	rtase_hw_config(dev);
+
+	if (tp->sw_flag & RTASE_SWF_MSIX_ENABLED) {
+		ret = request_irq(ivec->irq, rtase_interrupt, 0,
+				  dev->name, ivec);
+		if (ret)
+			goto err_free_all_allocated_irq;
+
+		/* request other interrupts to handle multiqueue */
+		for (i = 1; i < tp->int_nums; i++) {
+			ivec = &tp->int_vector[i];
+			snprintf(ivec->name, sizeof(ivec->name), "%s_int%i",
+				 tp->dev->name, i);
+			ret = request_irq(ivec->irq, rtase_q_interrupt, 0,
+					  ivec->name, ivec);
+			if (ret)
+				goto err_free_all_allocated_irq;
+		}
+	} else {
+		ret = request_irq(pdev->irq, rtase_interrupt, 0, dev->name,
+				  ivec);
+		if (ret)
+			goto err_free_all_allocated_mem;
+	}
+
+	rtase_hw_start(dev);
+
+	for (i = 0; i < tp->int_nums; i++) {
+		ivec = &tp->int_vector[i];
+		napi_enable(&ivec->napi);
+	}
+
+	netif_carrier_on(dev);
+	netif_wake_queue(dev);
+
+	return 0;
+
+err_free_all_allocated_irq:
+	for (j = 0; j < i; j++)
+		free_irq(tp->int_vector[j].irq, &tp->int_vector[j]);
+
+err_free_all_allocated_mem:
+	rtase_free_desc(tp);
+
+	return ret;
+}
+
+static int rtase_close(struct net_device *dev)
+{
+	struct rtase_private *tp = netdev_priv(dev);
+	const struct pci_dev *pdev = tp->pdev;
+	u32 i;
+
+	rtase_down(dev);
+
+	if (tp->sw_flag & RTASE_SWF_MSIX_ENABLED) {
+		for (i = 0; i < tp->int_nums; i++)
+			free_irq(tp->int_vector[i].irq, &tp->int_vector[i]);
+
+	} else {
+		free_irq(pdev->irq, &tp->int_vector[0]);
+	}
+
+	rtase_free_desc(tp);
+
+	return 0;
+}
+
 static void rtase_enable_eem_write(const struct rtase_private *tp)
 {
 	u8 val;
@@ -171,6 +586,11 @@  static void rtase_rar_set(const struct rtase_private *tp, const u8 *addr)
 	rtase_w16(tp, RTASE_LBK_CTRL, RTASE_LBK_ATLD | RTASE_LBK_CLR);
 }
 
+static const struct net_device_ops rtase_netdev_ops = {
+	.ndo_open = rtase_open,
+	.ndo_stop = rtase_close,
+};
+
 static void rtase_get_mac_address(struct net_device *dev)
 {
 	struct rtase_private *tp = netdev_priv(dev);
@@ -191,6 +611,11 @@  static void rtase_get_mac_address(struct net_device *dev)
 	rtase_rar_set(tp, dev->dev_addr);
 }
 
+static void rtase_init_netdev_ops(struct net_device *dev)
+{
+	dev->netdev_ops = &rtase_netdev_ops;
+}
+
 static void rtase_reset_interrupt(struct pci_dev *pdev,
 				  const struct rtase_private *tp)
 {