Page MenuHomeFreeBSD

mlx4_en_free_resources(): fix error handling for priv->stat_ctx
ClosedPublic

Authored by hselasky on Dec 21 2016, 2:55 AM.
Tags
None
Referenced Files
F133234882: D8876.id23651.diff
Fri, Oct 24, 5:01 AM
Unknown Object (File)
Mon, Oct 20, 2:34 AM
Unknown Object (File)
Thu, Oct 16, 4:33 AM
Unknown Object (File)
Thu, Oct 16, 12:46 AM
Unknown Object (File)
Fri, Oct 10, 3:40 AM
Unknown Object (File)
Sun, Oct 5, 4:44 AM
Unknown Object (File)
Fri, Sep 26, 12:15 AM
Unknown Object (File)
Sep 18 2025, 11:42 AM
Subscribers

Details

Summary

When trying to enable Mellanox ConnectX3 VF for Hyper-V,
I got the below error without https://reviews.freebsd.org/D8868

mlx4_en: mlx4_core0: Port 1: failed reserving qp for TX ring
mlx4_en: mlx4_core0: Port 1: Failed to allocate NIC resources

and the related error handling has a bug, causing a kernel crash:

Fatal trap 12: page fault while in kernel mode
...
Stopped at sysctl_ctx_free+0x7c: movq 0x8(%rax), %rax

The cause is:

In mlx4_en_init_netdev() -> mlx4_en_alloc_resources(),
if mlx4_en_create_tx_ring() fails, mlx4_en_sysctl_stat() won't be called;
later mlx4_en_init_netdev() -> mlx4_en_destroy_netdev() ->
mlx4_en_free_resources() -> sysctl_ctx_free(&priv->stat_ctx) will
dereference the NULL stat_ctx, because sysctl_ctx_init(&priv->stat_ctx)
hasn't been called.

To fix the issue, let's move the sysctl_ctx_init() to an earlier
place.
BTW, we also move the init for priv->conf_ctx for consistency of
coding style, though it's not a must.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

decui_microsoft.com retitled this revision from to mlx4_en_free_resources(): fix error handling for priv->stat_ctx.
decui_microsoft.com updated this object.
decui_microsoft.com edited the test plan for this revision. (Show Details)

It's probably better to call sysctl_ctx_init for the stat_ctx, when priv->sysctl is assigned. Or just moving the ctx_inits and sysctl assignment to common place. TAILQ requires explicit initialization, and we normally treat sysctl_ctx as opaque struct in drivers.

decui_microsoft.com edited edge metadata.

addressed sephe's comment by using a better method: let's move mlx4_en_sysctl_stat() to an earlier place.

decui_microsoft.com updated this object.

Let me update the patch, because the previous version causes a sysctl NULL pointer dereferencing issue: we must make sure mlx4_en_sysctl_stat() is called after mlx4_en_create_rx/tx_ring(), otherwise the stat sysctl nodes can't reference the correct ring pointers.

I'll have a look. Please give me some time (1 week).

Thanks! No hurry. This is is only a trivial fix to improve error handling.
We don't have a strong dependency on it at all. Just to make sure I don't forget this patch. :-)

I'm currently working on this patch. Sorry for the delay.

Slight refactor of the patch.

This revision was automatically updated to reflect the committed changes.