Event Timeline
On latest main, NFSv2 and NFSv3 pass, while NFSv4 panics:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe008dfd0490
assert_vop_locked() at assert_vop_locked+0x49/frame 0xfffffe008dfd04b0
VOP_PATHCONF_APV() at VOP_PATHCONF_APV+0x42/frame 0xfffffe008dfd04e0
nfsv4_fillattr() at nfsv4_fillattr+0xfa8/frame 0xfffffe008dfd0670
nfsvno_fillattr() at nfsvno_fillattr+0xdd/frame 0xfffffe008dfd0710
nfsrvd_getattr() at nfsrvd_getattr+0x3c6/frame 0xfffffe008dfd09a0
nfsrvd_dorpc() at nfsrvd_dorpc+0x167e/frame 0xfffffe008dfd0bb0
nfssvc_program() at nfssvc_program+0x852/frame 0xfffffe008dfd0db0
svc_run_internal() at svc_run_internal+0xaa8/frame 0xfffffe008dfd0ee0
svc_thread_start() at svc_thread_start+0xb/frame 0xfffffe008dfd0ef0
fork_exit() at fork_exit+0x82/frame 0xfffffe008dfd0f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008dfd0f30
--- trap 0xc, rip = 0x2a07bda3b4ea, rsp = 0x2a07bc818ed8, rbp = 0x2a07bc819170 ---
vnode 0xfffff8007dfdac08: type VDIR state VSTATE_CONSTRUCTED op 0xffffffff822fc120
    usecount 3, writecount 0, refcount 1 seqc users 0 mountedhere 0
    hold count flags ()
    flags (VV_ROOT)
    lock type tmpfs: UNLOCKED
    tag VT_TMPFS, tmpfs_node 0xfffff800ada100f0, flags 0x0, links 2
    mode 0755, owner 0, group 0, size 0, status 0x0
VOP_PATHCONF Entry (vp): 0xfffff8007dfdac08 is not locked but should be
I would be delighted if someone could suggest a better way to populate a thick(ish) jail than what I'm doing here, btw. The jails weigh in at around 200 MB each on amd64.
I think this is because you are trying to run an NFS mount
inside a jail and that cannot be done.
nfsrvd_getattr() will have the vnode locked. (It is locked before
the call to nfsrvd_getattr() by the code in nfsrv_compound()).
If you can reproduce this without jails, then there is something broken.
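(For reference, a minimal attempt at reproducing this outside any jail might look like the sketch below. The export path and mount point are placeholders, and it assumes the server side is already exporting that path for NFSv4.)

    # Exercise the same GETATTR path with no jails involved:
    mount -t nfs -o nfsv4 127.0.0.1:/usr/obj /mnt
    stat /mnt          # GETATTR on the export root, the kind of request that panicked above
    umount /mnt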
It clearly can. I'm doing it right here. And if we'd had this a month ago, the previous nfsd panic would have been spotted sooner.
I meant I am not sure it can be done reliably and safely.
You are correct in that I should have been testing with
DEBUG_VFS_LOCKS so that I would have seen the bug.
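For reference, a throwaway kernel config for that kind of testing can be as small as the sketch below. The VFSLOCKS name is arbitrary; DEBUG_VFS_LOCKS is the option that makes assert_vop_locked() fire, as in the trace above.

    # Create a minimal config that just adds the lock assertions to GENERIC:
    printf 'include GENERIC\nident VFSLOCKS\noptions DEBUG_VFS_LOCKS\n' \
        > /usr/src/sys/amd64/conf/VFSLOCKS
    cd /usr/src && make -j"$(sysctl -n hw.ncpu)" buildkernel installkernel KERNCONF=VFSLOCKS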
As for whether or not a non-vnet jail can safely do an
NFS mount, I am not conversant enough with jails to
be sure.
I do know that TCP reconnects need to use the correct
vnet to get them to work.
I also suspect that the nfscbd daemons will need to be
run outside of the jails.
You need to test with nfscbd running, both inside and
outside the jails and also do TCP reconnect tests.
--> If all those work fine, it might be ok so long as there
is a check for non-vnet and the risk of exposing parts of the file system tree outside of the jail into the jail is documented.
Others with jail expertise will need to review w.r.t. enabling them.
Oh, and I was thinking of vnet jails. I did not know you were
running a non-vnet jail. (I have never run a non-vnet jail.)
Oh, and let's not forget the other daemons: nfsuserd, gssd
and rpc.tlsclntd.
These cannot be run per-vnet, so you can only have one non-vnet
instance of them.
The first two (nfsuserd and gssd) have to access the correct passwd
and group databases (which are often files within a jail, I think?).
So, you'll end up with something like...
- NFS mounts can be done in one (and only one) non-vnet jail, maybe working correctly.
Making NFS mounts work in vnet jails could be done, but it is quite
a bit of work. Mostly making all the threads (any that make VOP_xxx()
or VFS_xxx() calls, plus the nfsiod threads that currently run from taskqueue
and maybe some others) do the correct CURVNET_SET()/CURVNET_RESTORE()s.
I started on making NFS mounts work in vnet jails, but stopped when it
got messy and no one seemed to really need it.
rick
You don't need to create a thick jail. In fact, you don't need a separate jail file system at all. It's totally fine to create a jail anchored at /. That's the way I did it in D48473. The nfsd jail is anchored at /, and I created an exports file at a temporary location, which I passed on the cmdline to mountd.
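The shape of that setup is roughly the sketch below; the jail name, export path and daemon ordering here are only illustrative, and the real version is in D48473.

    # Server side: a vnet jail anchored at /, exporting via a throwaway exports file.
    exports=$(mktemp)
    printf '/usr/obj -maproot=root\nV4: /\n' > "$exports"
    jail -c name=server path=/ vnet persist allow.nfsd
    # (plumb an epair interface and an address into the jail before this point)
    jexec server /usr/sbin/rpcbind
    jexec server /usr/sbin/mountd "$exports"   # mountd takes alternate exports files as arguments
    jexec server /usr/sbin/nfsd -u -t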
I also don't think it's necessary to jail the client.
tests/sys/fs/nfs/nfs_test.sh, line 151:
vnet_cleanup isn't doing its job correctly. At least on my system, the epair interfaces don't get cleaned up.
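Until that is fixed, a crude way to find and clear the leftovers by hand might be something like the following, assuming the orphaned interfaces end up back in the host once the jails are gone:

    ifconfig -g epair                          # list any epair halves left behind
    for i in $(ifconfig -g epair); do
        ifconfig "$i" destroy 2>/dev/null      # destroying one half also removes its peer
    done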
No, it is absolutely not fine. Both jails run daemons which create pidfiles and sockets in hardcoded locations.
That sounds like a good reason to add a "-P pidfile" argument. There's plenty of precedent for that.