When a dirent is removed or changed due to a remove or rename, we
correctly made the write of the inode that decremented the link count
depend on the write of the buf containing the change to the dirent.
However, we did not prevent the end-of-life truncate from running under
the vput / vinactive following the remove or rename. As a result, the
end-of-life truncate, including the inode write with a size of 0 (but
still with the old link count), could reach disk before the dirent
write. If we then crashed before the dirent write, we would recover
with a state where the file was still linked, but had been truncated
to zero. The resulting state could be considered corruption.

An example where upon reboot /tmp/foo is linked but has size zero:

    dd if=/usr/share/dict/words of=/tmp/foo bs=8k count=1
    fsync /tmp/foo /tmp
    sync
    rm -f /tmp/foo
    sleep 10
    reboot -nq # or panic, reset, etc

The same problem happens with rename, where the consequence is worse:
"mv -f /etc/foo.tmp /etc/foo" can result in /etc/foo being truncated to
zero upon crash recovery.

Now, delay the end-of-life truncate by taking an extra vref in the
softdep code when a directory removal is set up, and only releasing it
after the directory page is written.
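
The ordering argument can be illustrated with a toy userland model. This
is only a sketch and not the kernel code: every name in it (toy_vnode,
toy_ref, toy_rele, toy_inactive, toy_remove) is hypothetical, and it
assumes the simplification that the truncate runs as soon as the last
reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the kernel's data structures. */
struct toy_vnode {
    int  usecount;         /* active references */
    bool dirent_on_disk;   /* has the dirent change been written? */
    bool truncated;        /* end-of-life truncate has run */
    bool truncated_early;  /* truncate ran before the dirent write */
};

/* Stand-in for the inactive path: performs the end-of-life truncate. */
static void toy_inactive(struct toy_vnode *vp)
{
    vp->truncated = true;
    if (!vp->dirent_on_disk)
        vp->truncated_early = true;  /* a crash now leaves a linked, empty file */
}

static void toy_ref(struct toy_vnode *vp)
{
    vp->usecount++;
}

static void toy_rele(struct toy_vnode *vp)
{
    assert(vp->usecount > 0);
    if (--vp->usecount == 0)
        toy_inactive(vp);
}

/*
 * Model one remove. Returns true if the truncate ran before the dirent
 * write, i.e. if the crash window described above exists.
 */
static bool toy_remove(bool hold_extra_ref)
{
    struct toy_vnode vp = { .usecount = 1 };  /* the caller's reference */

    if (hold_extra_ref)
        toy_ref(&vp);          /* taken when the directory removal is set up */
    toy_rele(&vp);             /* the release following the remove */
    vp.dirent_on_disk = true;  /* the directory page is written */
    if (hold_extra_ref)
        toy_rele(&vp);         /* released only after the directory write */

    assert(vp.truncated);      /* the truncate always runs eventually */
    return vp.truncated_early;
}
```

In this model, toy_remove(false) reports that the truncate ran before
the dirent write, while toy_remove(true) reports that it was deferred
until after the write.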
---
I'm not at all sure that this is the best way to enforce such a
dependency. However, it is simple and seems to work.