Page MenuHomeFreeBSD

mlx4_en_free_resources(): fix error handling for priv->stat_ctx
ClosedPublic

Authored by hselasky on Dec 21 2016, 2:55 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Apr 7, 5:29 AM
Unknown Object (File)
Sat, Apr 6, 1:31 PM
Unknown Object (File)
Sat, Mar 23, 1:26 AM
Unknown Object (File)
Feb 24 2024, 2:48 AM
Unknown Object (File)
Jan 23 2024, 4:35 PM
Unknown Object (File)
Dec 22 2023, 9:40 PM
Unknown Object (File)
Nov 14 2023, 8:21 AM
Unknown Object (File)
Nov 13 2023, 11:27 AM
Subscribers

Details

Summary

When trying to enable Mellanox ConnectX3 VF for Hyper-V,
I got the below error without https://reviews.freebsd.org/D8868

mlx4_en: mlx4_core0: Port 1: failed reserving qp for TX ring
mlx4_en: mlx4_core0: Port 1: Failed to allocate NIC resources

and the related error handling has a bug, causing a kernel crash:

Fatal trap 12: page fault while in kernel mode
...
Stopped at sysctl_ctx_free+0x7c: movq 0x8(%rax), %rax

The cause is:

In mlx4_en_init_netdev() -> mlx4_en_alloc_resources(),
if mlx4_en_create_tx_ring() fails, mlx4_en_sysctl_stat() won't be called;
later mlx4_en_init_netdev() -> mlx4_en_destroy_netdev() ->
mlx4_en_free_resources() -> sysctl_ctx_free(&priv->stat_ctx) will
dereference the NULL stat_ctx, because sysctl_ctx_init(&priv->stat_ctx)
hasn't been called.

To fix the issue, let's move the sysctl_ctx_init() to an earlier
place.
BTW, we also move the init for priv->conf_ctx for consistency of
coding style, though it's not a must.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 6339
Build 6578: arc lint + arc unit

Event Timeline

decui_microsoft.com retitled this revision from to mlx4_en_free_resources(): fix error handling for priv->stat_ctx.
decui_microsoft.com updated this object.
decui_microsoft.com edited the test plan for this revision. (Show Details)

It's probably better to call sysctl_ctx_init for the stat_ctx, when priv->sysctl is assigned. Or just moving the ctx_inits and sysctl assignment to common place. TAILQ requires explicit initialization, and we normally treat sysctl_ctx as opaque struct in drivers.

decui_microsoft.com edited edge metadata.

addressed sephe's comment by using a better method: let's move mlx4_en_sysctl_stat() to an earlier place.

decui_microsoft.com updated this object.

Let me update the patch, because the previous version causes a sysctl NULL pointer dereferencing issue: we must make sure mlx4_en_sysctl_stat() is called after mlx4_en_create_rx/tx_ring(), otherwise the stat sysctl nodes can't reference the correct ring pointers.

I'll have a look. Please give me some time (1 week).

Thanks! No hurry. This is is only a trivial fix to improve error handling.
We don't have a strong dependency on it at all. Just to make sure I don't forget this patch. :-)

I'm currently working on this patch. Sorry for the delay.

Slight refactor of the patch.

This revision was automatically updated to reflect the committed changes.