
Panic on ufsdirhash_findfree: free mismatch
Accepted, Public

Authored by akumar3_isilon.com on Jul 5 2021, 9:05 AM.

Details

Summary

It looks like we have hit a corner case.
After the command "mkdir /base" was issued, vfs/ufs pulls up inode 2 (the root inode) and starts looking for space to create a directory entry for /base.
It pulls out the dirhash for inode 2 and sees that there is space for this directory entry in block 29.
Block 29 is read from disk into memory, and we start looking for free space in this block (block size 512 bytes).
While iterating over the entries, it reaches the very end of the block.
This condition (i == 512) is not handled by ufsdirhash_findfree, and we hit the panic:

ufsdirhash_findfree (ip=<optimized out>, slotneeded=16, slotsize=0xfffffe99a1441590)
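
The walk described above can be modelled in user space. This is only a sketch under stated assumptions: struct rec, BLKSIZ and first_free_offset are hypothetical stand-ins for the real struct direct, DIRBLKSIZ and the scan inside ufsdirhash_findfree, and the block is represented as an array of records rather than raw bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLKSIZ 512 /* stands in for DIRBLKSIZ in this report */

/* Hypothetical, simplified record; the real on-disk entry is struct direct. */
struct rec {
	uint32_t ino;    /* 0 marks an unused record */
	uint16_t reclen; /* bytes the record occupies in the block */
	uint16_t used;   /* bytes the entry itself needs */
};

/*
 * Return the byte offset of the first record with spare room,
 * BLKSIZ if every record is full, or -1 if the records are malformed
 * (zero length, crossing the block end, or not covering the block).
 */
static int
first_free_offset(const struct rec *recs, size_t nrecs)
{
	int i = 0;

	assert(recs != NULL);
	for (size_t n = 0; n < nrecs && i < BLKSIZ; n++) {
		if (recs[n].reclen == 0 || i + recs[n].reclen > BLKSIZ)
			return (-1);
		if (recs[n].ino == 0 || recs[n].reclen > recs[n].used)
			return (i);
		i += recs[n].reclen;
	}
	return (i == BLKSIZ ? BLKSIZ : -1);
}
```

With 32 fully used 16-byte records the walk ends at offset 512, which is the situation reported here: the hash promised free space in the block, but the block itself has none.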

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

akumar3_isilon.com created this revision.
This revision is now accepted and ready to land. Jul 5 2021, 11:41 AM
sys/ufs/ufs/ufs_dirhash.c
751–769

I don't think this is right. Doesn't this prevent us from using the last slot? That is, it's fine to get here with i == DIRBLKSIZ as long as freebytes >= slotneeded. So I think the condition becomes if (freebytes < slotneeded), which is exactly what the panic is about.

It may be fine to defensively error out in these cases instead of panicking, but something else is also going on here. I think dh_firstfree has to be corrupt to get here.
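
To illustrate the last-slot concern, here is a small user-space model, not the kernel code. struct rec, BLKSIZ, gather_free and slot_fits are hypothetical names, and the accumulation rule (the whole record if unused, otherwise only the slack after the entry) is my reading of what the loop does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLKSIZ 512 /* stands in for DIRBLKSIZ */

/* Hypothetical, simplified record; not the real struct direct. */
struct rec {
	uint32_t ino;    /* 0 marks an unused record */
	uint16_t reclen; /* bytes the record occupies in the block */
	uint16_t used;   /* bytes the entry itself needs */
};

/*
 * Starting at record index `first`, whose byte offset is *off, add up
 * reusable bytes until slotneeded is reached or the block ends.
 * *off is advanced past every record consumed.
 */
static int
gather_free(const struct rec *recs, size_t nrecs, size_t first, int *off,
    int slotneeded)
{
	int freebytes = 0;

	assert(recs != NULL && off != NULL);
	for (size_t n = first;
	    n < nrecs && *off < BLKSIZ && freebytes < slotneeded; n++) {
		freebytes += recs[n].reclen;
		if (recs[n].ino != 0)
			freebytes -= recs[n].used;
		*off += recs[n].reclen;
	}
	return (freebytes);
}

/* The condition suggested in the comment: judge by bytes, not by offset. */
static int
slot_fits(int freebytes, int slotneeded)
{
	return (freebytes >= slotneeded);
}
```

When the only free record is the last one in the block, gather_free finishes with *off == BLKSIZ and freebytes == 16. A check that rejects *off == BLKSIZ would refuse this valid slot, while slot_fits accepts it.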

sys/ufs/ufs/ufs_dirhash.c
751–769

Indeed, I think this is covering up something else. If i > DIRBLKSIZ then we have a record which crosses a directory block boundary, and I believe that is not permitted. Perhaps there's some race which creates a window where the directory blocks and hash table can get out of sync, or some memory corruption modified either dh_firstfree or the corresponding directory block.
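
A rough way to state the invariant in code, as a user-space sketch only: the record lengths in one directory block should tile it exactly, so the running offset never exceeds the block size. reclens_tile_block, BLKSIZ and ALIGN are hypothetical names, and the 4-byte alignment is an assumption standing in for DIRALIGN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLKSIZ 512 /* stands in for DIRBLKSIZ */
#define ALIGN  4   /* assumed record alignment, standing in for DIRALIGN */

/*
 * Return 1 if the record lengths tile the block exactly, 0 otherwise.
 * A record that would run past the end of the block is rejected.
 */
static int
reclens_tile_block(const uint16_t *reclen, size_t nrecs)
{
	size_t off = 0;

	assert(reclen != NULL);
	for (size_t n = 0; n < nrecs; n++) {
		if (reclen[n] == 0 || reclen[n] % ALIGN != 0)
			return (0);
		if ((size_t)reclen[n] > (size_t)BLKSIZ - off)
			return (0); /* would cross the block boundary */
		off += reclen[n];
	}
	return (off == (size_t)BLKSIZ);
}
```

For example, lengths { 12, 12, 488 } pass, while { 12, 12, 492 } fail because the last record would end at offset 516.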

sys/ufs/ufs/ufs_dirhash.c
751–769

Yes, dh_firstfree (expected to be -1, not 29) seems corrupt. I am digging into it; no culprits found yet.
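
For anyone following along, a sketch of why a stale dh_firstfree entry leads straight to the panic. It assumes dh_firstfree is a table indexed by free space in alignment-sized units, where -1 means "no such block". pick_block, ALIGN and NFSTATS are hypothetical names, and the bucket count of 20 is purely illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGN   4  /* assumed, standing in for DIRALIGN */
#define NFSTATS 20 /* assumed number of free-space buckets; illustrative */

/*
 * Return the first block recorded as having at least slotneeded bytes
 * free, or -1 if no bucket from the required size upwards is populated.
 * firstfree must have NFSTATS + 1 elements.
 */
static int
pick_block(const int *firstfree, int slotneeded)
{
	assert(firstfree != NULL && slotneeded > 0);
	for (int i = (slotneeded + ALIGN - 1) / ALIGN; i <= NFSTATS; i++)
		if (firstfree[i] != -1)
			return (firstfree[i]);
	return (-1);
}
```

With every bucket at -1, pick_block returns -1 for slotneeded = 16 and the caller would look elsewhere. If bucket 4 wrongly holds 29, the caller is sent to block 29, scans it to the end without finding 16 free bytes, and hits the free mismatch.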