Page MenuHomeFreeBSD

mlx4_en_free_resources(): fix error handling for priv->stat_ctx

Authored by hselasky on Dec 21 2016, 2:55 AM.
Referenced Files
Unknown Object (File)
Sat, Jun 1, 7:55 PM
Unknown Object (File)
Thu, May 23, 12:01 PM
Unknown Object (File)
Thu, May 23, 2:38 AM
Unknown Object (File)
May 21 2024, 4:14 PM
Unknown Object (File)
May 10 2024, 12:55 PM
Unknown Object (File)
May 5 2024, 11:05 AM
Unknown Object (File)
Apr 30 2024, 3:45 PM
Unknown Object (File)
Apr 30 2024, 3:45 PM



When trying to enable Mellanox ConnectX3 VF for Hyper-V,
I got the below error without

mlx4_en: mlx4_core0: Port 1: failed reserving qp for TX ring
mlx4_en: mlx4_core0: Port 1: Failed to allocate NIC resources

and the related error handling has a bug, causing a kernel crash:

Fatal trap 12: page fault while in kernel mode
Stopped at sysctl_ctx_free+0x7c: movq 0x8(%rax), %rax

The cause is:

In mlx4_en_init_netdev() -> mlx4_en_alloc_resources(),
if mlx4_en_create_tx_ring() fails, mlx4_en_sysctl_stat() won't be called;
later mlx4_en_init_netdev() -> mlx4_en_destroy_netdev() ->
mlx4_en_free_resources() -> sysctl_ctx_free(&priv->stat_ctx) will
dereference the NULL stat_ctx, because sysctl_ctx_init(&priv->stat_ctx)
hasn't been called.

To fix the issue, let's move the sysctl_ctx_init() to an earlier
BTW, we also move the init for priv->conf_ctx for consistency of
coding style, though it's not a must.

Diff Detail

rS FreeBSD src repository - subversion
Lint Not Applicable
Tests Not Applicable

Event Timeline retitled this revision from to mlx4_en_free_resources(): fix error handling for priv->stat_ctx. updated this object. edited the test plan for this revision. (Show Details)

It's probably better to call sysctl_ctx_init for the stat_ctx, when priv->sysctl is assigned. Or just moving the ctx_inits and sysctl assignment to common place. TAILQ requires explicit initialization, and we normally treat sysctl_ctx as opaque struct in drivers. edited edge metadata.

addressed sephe's comment by using a better method: let's move mlx4_en_sysctl_stat() to an earlier place. updated this object.

Let me update the patch, because the previous version causes a sysctl NULL pointer dereferencing issue: we must make sure mlx4_en_sysctl_stat() is called after mlx4_en_create_rx/tx_ring(), otherwise the stat sysctl nodes can't reference the correct ring pointers.

I'll have a look. Please give me some time (1 week).

Thanks! No hurry. This is is only a trivial fix to improve error handling.
We don't have a strong dependency on it at all. Just to make sure I don't forget this patch. :-)

I'm currently working on this patch. Sorry for the delay.

Slight refactor of the patch.

This revision was automatically updated to reflect the committed changes.