
Add audit(4) support to NFS(v3)
Needs Revision · Public

Authored by shivank on Aug 31 2020, 4:57 AM.
Details

Summary

Security event auditing permits selective, fine-grained, configurable logging of security-relevant system events for the purposes of post-mortem analysis, intrusion detection, and run-time monitoring, and is intended to meet the requirements of the Common Criteria (CC) Controlled Access Protection Profile (CAPP) evaluation. The audit subsystem in FreeBSD, audit(4), can record a variety of system events such as user logins, file-system activity, network activity, and process creation.

The auditd(8) daemon on the server does not generate any audit trail records for NFS activity, because auditing works mostly at the syscall level and the NFS server is implemented within the kernel.

To audit NFS activity across the network, auditd(8) would have to run on each NFS client. That arrangement works fine on trusted networks, but on an untrusted network running auditd(8) on each client is not an option, and audit(4) support in the NFS server is the missing feature for such networks. The aim of this project is therefore to audit each NFS RPC, which allows all NFS activity on the network to be audited by running auditd(8) on the server alone.

Test Plan

A FreeBSD Port based on net/libnfs: https://github.com/shivankgarg98/NFSAuditTestSuite

mkdir /usr/local/tests/nfs-audit
make && make install
cd /usr/local/tests/nfs-audit
kyua test -k Kyuafile

Note: Please make sure net/libnfs is already installed.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

shivank created this revision.

This was earlier being reviewed in D25869, but due to a change of base revision it showed changes that were not mine, so I created a new review here.

sys/kern/vfs_cache.c
2698

This function is nearly identical to vn_vptocnp. You should merge the two.

shivank marked an inline comment as done.
  • merge vn_fullpath_any and vn_vptocnp with their locked counterparts so they work for optionally locked vnodes.
asomers requested changes to this revision. Sep 7 2020, 5:29 PM

The new code looks better. But grrr, there are two big problems:

  1. It doesn't compile due to some recent changes on head. I suggest the following:
    • Remove the <rpc/rpc.h>, <sys/mount.h>, and <fs/nfs/*> includes from audit.h. In addition to fixing the compile failure, including headers from other headers is generally discouraged; it's sometimes necessary, but it causes header pollution and slows down builds. Instead of including those files, just forward declare struct nfsrv_descript; and struct kaudit_record;.
    • Add <netinet/in.h>, <rpc/rpc.h>, <fs/nfs/nfsdport.h>, <fs/nfs/nfsproto.h>, and <fs/nfs/nfs.h> to audit_bsm_db.c.
    • Add <rpc/rpc.h>, <fs/nfs/nfsport.h>, <fs/nfs/nfsproto.h>, and <fs/nfs/nfs.h> to audit.c.
  2. It triggers a witness panic in the NFS client, even when auditing is disabled. It's not obvious to me why that would be. @rmacklem can you guess?
panic: Lock (lockmgr) nfs not locked @ /usr/home/somers/freebsd/base/head/sys/kern/vfs_subr.c:3141.
cpuid = 3
time = 1599497966
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0062c276e0
vpanic() at vpanic+0x182/frame 0xfffffe0062c27730
panic() at panic+0x43/frame 0xfffffe0062c27790
witness_assert() at witness_assert+0x23a/frame 0xfffffe0062c277c0
lockmgr_upgrade() at lockmgr_upgrade+0x4e/frame 0xfffffe0062c27810
nfs_lock() at nfs_lock+0x2c/frame 0xfffffe0062c27860
vop_sigdefer() at vop_sigdefer+0x2f/frame 0xfffffe0062c27890
vput_final() at vput_final+0x1db/frame 0xfffffe0062c278f0
vn_vptocnp() at vn_vptocnp+0x316/frame 0xfffffe0062c27970
vn_fullpath_dir() at vn_fullpath_dir+0x14a/frame 0xfffffe0062c279e0
vn_fullpath_any() at vn_fullpath_any+0x60/frame 0xfffffe0062c27a40
vn_getcwd() at vn_getcwd+0xd2/frame 0xfffffe0062c27a90
sys___getcwd() at sys___getcwd+0x54/frame 0xfffffe0062c27ad0
amd64_syscall() at amd64_syscall+0x140/frame 0xfffffe0062c27bf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0062c27bf0
--- syscall (326, FreeBSD ELF64, sys___getcwd), rip = 0x80035746a, rsp = 0x7fffffffa2d8, rbp = 0x7fffffffa430 ---
KDB: enter: panic
This revision now requires changes to proceed. Sep 7 2020, 5:29 PM

I feel the vfs_cache.c changes making vn_fullpath_global work for an optionally locked vnode are causing the trouble, though I'm not sure what the problem is. I request Mateusz Guzik, @mjg, to have a look at my vfs_cache.c changes. I would be grateful for your time.

Audit support for regular lookup starts with AUDIT_ARG_UPATH1_VP/AUDIT_ARG_UPATH2_VP without any vnodes locked. Later on visited vnodes get added with AUDIT_ARG_VNODE1/AUDIT_ARG_VNODE2 which only performs VOP_GETATTR (i.e. does *NOT* resolve any paths). Your code should follow the same scheme.

As you can see path resolving routines can take vnode locks on their own (modulo the smr case). This means they can't be called with locked vnodes to begin with, as otherwise you risk violating global lock ordering and consequently deadlocking the kernel.

The VOP_ISLOCKED routine is not entirely legal to call if you don't hold the lock. The name is perhaps misleading, but it can only reliably tell you that you have an exclusive lock or that *SOMEONE* has a shared lock (and it may be you). Or to put it differently, if you don't have the vnode locked but someone else has it shared locked, you will get non-0 and that's how you get the panic. Regardless of this problem, adding the call reduces performance and most notably suggests a bug on its own.

So the question is why are you calling here with any vnodes locked.

In D26243#587132, @mjg wrote:

Audit support for regular lookup starts with AUDIT_ARG_UPATH1_VP/AUDIT_ARG_UPATH2_VP without any vnodes locked. Later on visited vnodes get added with AUDIT_ARG_VNODE1/AUDIT_ARG_VNODE2 which only performs VOP_GETATTR (i.e. does *NOT* resolve any paths). Your code should follow the same scheme.

As you can see path resolving routines can take vnode locks on their own (modulo the smr case). This means they can't be called with locked vnodes to begin with, as otherwise you risk violating global lock ordering and consequently deadlocking the kernel.

The VOP_ISLOCKED routine is not entirely legal to call if you don't hold the lock. The name is perhaps misleading, but it can only reliably tell you that you have an exclusive lock or that *SOMEONE* has a shared lock (and it may be you). Or to put it differently, if you don't have the vnode locked but someone else has it shared locked, you will get non-0 and that's how you get the panic. Regardless of this problem, adding the call reduces performance and most notably suggests a bug on its own.

So the question is why are you calling here with any vnodes locked.

I wish to audit the canonical path of the file requested by the NFS clients. The requested path from the client is extracted in the NFS server using nfsrv_parsename, but the vnode is locked in some NFS services. I thought of unlocking/relocking the vnode for the path audit, but Rick advised against it. That's why I had to make this call with a locked vnode.

Thanks for your question which made me rethink the problem from scratch and I got a new idea for auditing path.

Hi @rmacklem and @asomers, if I use nfsvno_namei to get the canonical path for the client, I will not need AUDIT_ARG_UPATH1_VP, which will save me from all the trouble of passing a locked vnode to vn_fullpath_global. Please provide your opinion on the same.


Isn't audit_nfsarg_vnode1 the problem? You already know the path when you call AUDIT_ARG_UPATH1_VP, right?

Isn't audit_nfsarg_vnode1 the problem? You already know the path when you call AUDIT_ARG_UPATH1_VP, right?

Sorry, I didn't understand the question. When we call AUDIT_ARG_UPATH1_VP, we only know one component of the path and the directory vnode; we then construct the whole path using vn_fullpath_global, and that is what is stored in the kernel audit record.