Page MenuHomeFreeBSD

makefs: Add a mmap'd buffer cache
Needs ReviewPublic

Authored by imp on Mar 11 2023, 7:17 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, May 24, 10:49 AM
Unknown Object (File)
Mar 12 2024, 4:03 AM
Unknown Object (File)
Dec 23 2023, 4:02 AM
Unknown Object (File)
Dec 20 2023, 5:54 AM
Unknown Object (File)
Dec 12 2023, 8:41 AM
Unknown Object (File)
Oct 27 2023, 2:05 PM
Unknown Object (File)
Jul 9 2023, 7:36 PM
Unknown Object (File)
May 15 2023, 5:36 AM
Subscribers
None

Details

Reviewers
markj
emaste
Summary

Rather than read / write things, where possible map in the filesystem
image into memory with mmap. Then provide a zero-copy interface to
getblk, bread, bwrite. This speeds up creation of msdos filesystems
from 30s to .6s:
Before:
7.330u 23.841s 0:31.17 100.0% 198+360k 0+250522io 4pf+0w
After:
0.165u 0.464s 0:00.62 100.0% 193+352k 0+19io 1013pf+0w

Not yet enabled for ffs, since it dumps core. Also, the win is smaller
since w/o this ffs takes only 3s and mmap is optional.

Sponsored by: Netflix

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 50341
Build 47233: arc lint + arc unit

Event Timeline

imp requested review of this revision.Mar 11 2023, 7:17 PM
imp created this revision.
imp added inline comments.
usr.sbin/makefs/ffs.c
578

I'm not sure why this dumps core so in clrbuf, so I need to look at that closely.

usr.sbin/makefs/ffs/buf.c
255

I do not understand this memset here. It makes no sense to me. All users of getblk replace the contents by calling clrbuf, memset or memcpy.

usr.sbin/makefs/ffs/buf.c
72

We might want to introduce flags to bp's here to know when this is still 'valid' and when it's just been allocated. That would save the read. But since we brelse things aggressively and toss away the buffer cache in the non mmap case, I'm not sure how big a win that would be.

usr.sbin/makefs/ffs/buf.c
255

I guess it's just a safety belt? To make sure that a programming bug doesn't cause us to write uninitialized bytes into the image.

usr.sbin/makefs/msdos.c
59

This should be sorted with the sys/param.h include above.

187

Shouldn't this be MAP_SHARED?

usr.sbin/makefs/ffs.c
578

I have found that if I always malloc the buffer and copy into and out of the mmap'd image, then things work. But this
eliminates a lot of the benefit from the change for msdos. There's something in the ufs code that's silently depending on this slight difference that I've not been able to find.

Even when I account for the 'negative' blocks being used to store metadata and getblk being used to remember the mapping of -lbn to actual block on the disk. This is a second reason why things don't work. We get further if we have a hybrid approach of allocating the buffer until we know the block and then moving the pointer to it (after copying the contents of the buffer), but it still doesn't quite work right :(

usr.sbin/makefs/ffs/buf.c
255

Except we do it also when we change the size of the buffer, which doesn't make a lot of sense to me. It seems to be critical to making things work, though, based on my experiments over the weekend for reasons I've not found the root cause for.

usr.sbin/makefs/msdos.c
59

ok

187

I'm unsure. I thought MAP_SHARED was for multiple programs accessing the file, of which this is not.
I did see elsewhere a MAP_FILE | MAP_PRIVATE as well, but 'MAP_FILE' is 0 so only can convey intent.

usr.sbin/makefs/msdos.c
187

MAP_PRIVATE means that the changes will not be visible to other processes, so accesses to the file mapping are copy-on-write and get discarded when the mapping is destroyed. So I'm confused about how this works.

usr.sbin/makefs/msdos.c
187

It worked due to a testing error on my part. MAP_SHARED is needed for these changes to survive the process dying. So I've updated this and made a couple of other changes.

Rework to have both zero-copy and copy-back semantics of the buffer cache.

usr.sbin/makefs/ffs.c
577

= NULL?

usr.sbin/makefs/ffs.c
577

yup

usr.sbin/makefs/ffs/buf.c
141–142

NULL here too

Did you test on ZFS? I wonder if double buffering via mmap there will outweigh the improvement.

Did you test on ZFS? I wonder if double buffering via mmap there will outweigh the improvement.

no. I didn't test ZFS since zfs didn't use bread/bwrite interface. I haven't looked at its buffer management at all.

usr.sbin/makefs/ffs/buf.c
141–142

I think the proper test here is B_ALLOC bit in flags, no?

In D39026#889255, @imp wrote:

Did you test on ZFS? I wonder if double buffering via mmap there will outweigh the improvement.

no. I didn't test ZFS since zfs didn't use bread/bwrite interface. I haven't looked at its buffer management at all.

I meant, did you time makefs when the output image file is on a ZFS filesystem? I don't think makefs -t zfs is amenable to this approach, indeed.

brelse fix and sys/mman.h sorting.

In D39026#889255, @imp wrote:

Did you test on ZFS? I wonder if double buffering via mmap there will outweigh the improvement.

no. I didn't test ZFS since zfs didn't use bread/bwrite interface. I haven't looked at its buffer management at all.

I meant, did you time makefs when the output image file is on a ZFS filesystem? I don't think makefs -t zfs is amenable to this approach, indeed.

zfs seems a little faster than msdos. 0.6s vs ffs 0.4s on the machine I'm using.... Though times < 1s can be hard to measure meaningful differences w/o more effort than i want to put into this :)