Page MenuHomeFreeBSD

Zero initialize all fields of socket structure
ClosedPublic

Authored by hselasky on Jul 4 2017, 8:53 AM.
Tags
None
Referenced Files
F108148601: D11475.id30395.diff
Tue, Jan 21, 9:50 PM
F108099003: D11475.id30395.diff
Tue, Jan 21, 9:16 AM
Unknown Object (File)
Mon, Dec 23, 6:25 AM
Unknown Object (File)
Dec 9 2024, 11:12 AM
Unknown Object (File)
Nov 22 2024, 4:32 PM
Unknown Object (File)
Oct 19 2024, 8:05 AM
Unknown Object (File)
Oct 4 2024, 9:59 AM
Unknown Object (File)
Oct 4 2024, 12:49 AM
Tokens
"Yellow Medal" token, awarded by sylvain_sylvaingarrigues.com.

Details

Summary

During recent changes some fields were left uninitialized when transforming the socket structure into a listening socket. Only when INVARIANTS is set, the full structure is zeroed. Make sure all fields are zeroed when INVARIANTS is undefined. This happens because the fields in question are part of a union.

PR: 220452
PR: 220358

Sponsored by: Mellanox Technologies

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

What I am not sure is why it broke only 32-bit architectures?

Hi,

@sylvain_sylvaingarrigues.com

Look at "struct socket" and how these fields are part of a union, where platform size dependent fields are mixed with fixed size fields. Try to extract the offsets of all the fields in the last union with -m32 and without -m32 for example.

I'm so glad I cc'ed you on this PR, you rock. I shall be able to confirm my armv6 kernel no longer panic with this fix within the next 24h. THANKS!

This change addresses the direct, observed consequences for switching from
the first anonymous struct to the second one: it avoids interpreting old
"first struct" material as upcall information.

But what I wonder is how many other fields of the "second struct" get
odd interpretations from the prior "first struct" content. These may not
be so obvious as "can not boot because of a fatal kernel trap" and so
be much harder to track down even if not correct. Even noticing
the (potential) incorrectness of the overall operation would be more
difficult.

But my wondering does not mean that I have specific examples of
specific odd-value problems from the context switching. It is more
of a generic worry given the evidence that at least sol_upcall and
its sol_upcallarg were not handled correctly for the issue: How much
checking for such issues was done by someone that understands
what is needed in this area?

@markmi_dsl-only.net

But what I wonder is how many other fields of the "second struct" get
odd interpretations from the prior "first struct" content.

From my code inspection all other fields are properly initialized when doing the switchover. From what I can see it is only these two fields missing.

@sylvain_sylvaingarrigues.com

I've reproduced the panic with my RPI and can confirm this patch fixes it. I think you will have similar results.

gallatin added a subscriber: gallatin.

This seems reasonable to me

This revision is now accepted and ready to land.Jul 4 2017, 1:31 PM
ae accepted this revision.EditedJul 4 2017, 5:30 PM
ae added a subscriber: ae.

Please, describe in commit message your calculations.

@ae :

With 64-bit we have the following overlapping offsets in "struct socket *":

sol_upcall=0x190
sol_upcallarg=0x198
so_rcv.sb_mtx=0x148
so_rcv.sb_sx=0x168
so_rcv.sb_sel=0x188
so_rcv.sb_state=0x190
so_rcv.sb_mb=0x198
so_rcv.sb_mbtail=0x1a0

We see that so->sol_upcall overlaps so->so_rcv.sb_state which is zero during listen. Can also be verified by adding this line of code to socketvar.h:

CTASSERT(offsetof(struct socket, so_rcv.sb_state) == offsetof(struct socket, sol_upcall));

For 32-bit we have the following offsets:

sol_upcall=0xe0
sol_upcallarg=0xe4
so_rcv.sb_mtx=0xb8
so_rcv.sb_sx=0xcc
so_rcv.sb_sel=0xe0
so_rcv.sb_state=0xe4
so_rcv.sb_mb=0xe8
so_rcv.sb_mbtail=0xec
so_rcv.sb_lastrecord=0xf0
so_rcv.sb_sndptr=0xf4

sol_upcall overlaps with so_rcv.sb_sel which is initialized to a valid data pointer:

so->so_rcv.sb_sel = &so->so_rdsel;

And so calling &so->so_rdsel, results in an invalid instruction for 32-bit platforms.

extern int printf(const char *fmt, ...);
#define NULL ((void *)0)

#include <sys/cdefs.h>
#include <sys/types.h>
#include <sys/stdint.h>
#include <sys/socketvar.h>

static void
dump_sockbuf(const char *prefix, struct sockbuf *sb)
{
  printf("%s.sb_mtx=%p\n", prefix, &sb->sb_mtx);
  printf("%s.sb_sx=%p\n", prefix, &sb->sb_sx);
  printf("%s.sb_sel=%p\n", prefix, &sb->sb_sel);
  printf("%s.sb_state=%p\n", prefix, &sb->sb_state);
  printf("%s.sb_mb=%p\n", prefix, &sb->sb_mb);
  printf("%s.sb_mbtail=%p\n", prefix, &sb->sb_mbtail);
  printf("%s.sb_lastrecord=%p\n", prefix, &sb->sb_lastrecord);
  printf("%s.sb_sndptr=%p\n", prefix, &sb->sb_sndptr);
  printf("%s.sb_fnrdy=%p\n", prefix, &sb->sb_fnrdy);
  printf("%s.sb_sndptroff=%p\n", prefix, &sb->sb_sndptroff);
  printf("%s.sb_acc=%p\n", prefix, &sb->sb_acc);
  printf("%s.sb_ccc=%p\n", prefix, &sb->sb_ccc);
  printf("%s.sb_hiwat=%p\n", prefix, &sb->sb_hiwat);
  printf("%s.sb_mbcnt=%p\n", prefix, &sb->sb_mbcnt);
  printf("%s.sb_mcnt=%p\n", prefix, &sb->sb_mcnt);
  printf("%s.sb_ccnt=%p\n", prefix, &sb->sb_ccnt);
  printf("%s.sb_mbmax=%p\n", prefix, &sb->sb_mbmax);
  printf("%s.sb_ctl=%p\n", prefix, &sb->sb_ctl);
  printf("%s.sb_lowat=%p\n", prefix, &sb->sb_lowat);
  printf("%s.sb_timeo=%p\n", prefix, &sb->sb_timeo);
  printf("%s.sb_flags=%p\n", prefix, &sb->sb_flags);
  printf("%s.sb_upcall=%p\n", prefix, &sb->sb_upcall);
  printf("%s.sb_upcallarg=%p\n", prefix, &sb->sb_upcallarg);
  printf("%s.sb_aiojobq=%p\n", prefix, &sb->sb_aiojobq);
  printf("%s.sb_aiotask=%p\n", prefix, &sb->sb_aiotask);
}

static void
dump_socket(struct socket *so)
{
	printf("sol_upcall=%p\n", &so->sol_upcall);
	printf("sol_upcallarg=%p\n", &so->sol_upcallarg);

	dump_sockbuf("so_rcv", &so->so_rcv);
	dump_sockbuf("so_snd", &so->so_snd);
}

int
main()
{
	dump_socket(NULL);
  	return (0);
}
This revision was automatically updated to reflect the committed changes.