mlx4_en_free_resources(): fix error handling for priv->stat_ctx

Authored by hselasky on Dec 21 2016, 2:55 AM.



When trying to enable Mellanox ConnectX3 VF for Hyper-V,
I got the below error without

mlx4_en: mlx4_core0: Port 1: failed reserving qp for TX ring
mlx4_en: mlx4_core0: Port 1: Failed to allocate NIC resources

and the related error handling has a bug, causing a kernel crash:

Fatal trap 12: page fault while in kernel mode
Stopped at sysctl_ctx_free+0x7c: movq 0x8(%rax), %rax

The cause is:

In mlx4_en_init_netdev() -> mlx4_en_alloc_resources(),
if mlx4_en_create_tx_ring() fails, mlx4_en_sysctl_stat() won't be called;
later mlx4_en_init_netdev() -> mlx4_en_destroy_netdev() ->
mlx4_en_free_resources() -> sysctl_ctx_free(&priv->stat_ctx) will
dereference the NULL stat_ctx, because sysctl_ctx_init(&priv->stat_ctx)
hasn't been called.

To fix the issue, let's move the sysctl_ctx_init() to an earlier
BTW, we also move the init for priv->conf_ctx for consistency of
coding style, though it's not a must.

Diff Detail

rS FreeBSD src repository
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable. retitled this revision from to mlx4_en_free_resources(): fix error handling for priv->stat_ctx.Dec 21 2016, 2:55 AM updated this object. edited the test plan for this revision. (Show Details)

It's probably better to call sysctl_ctx_init for the stat_ctx, when priv->sysctl is assigned. Or just moving the ctx_inits and sysctl assignment to common place. TAILQ requires explicit initialization, and we normally treat sysctl_ctx as opaque struct in drivers.

addressed sephe's comment by using a better method: let's move mlx4_en_sysctl_stat() to an earlier place. updated this object.

Let me update the patch, because the previous version causes a sysctl NULL pointer dereferencing issue: we must make sure mlx4_en_sysctl_stat() is called after mlx4_en_create_rx/tx_ring(), otherwise the stat sysctl nodes can't reference the correct ring pointers.

@hselasky, can you please review this patch?

I'll have a look. Please give me some time (1 week).

Thanks! No hurry. This is is only a trivial fix to improve error handling.
We don't have a strong dependency on it at all. Just to make sure I don't forget this patch. :-)

I'm currently working on this patch. Sorry for the delay.

hselasky commandeered this revision.Apr 27 2017, 1:15 PM
hselasky edited reviewers, added:; removed: hselasky.
hselasky updated this revision to Diff 27788.Apr 27 2017, 2:31 PM

Slight refactor of the patch.

This revision was automatically updated to reflect the committed changes.