Page MenuHomeFreeBSD

Add support for asynchrous file system operations to aio
Needs RevisionPublic

Authored by mmacy on Oct 9 2020, 8:53 PM.

Details

Reviewers
asomers
jhb
scottl
imp
Group Reviewers
manpages
Summary

There is a WIP branch of openzfs that supports fully asynchronous reads and largely asynchronous writes (tx assignment and prefaulting indirect blocks might block). Leveraging this outside of zvol requires VFS support and changes to aio. This is fairly simplistic:

    • limited to 32MB per operation and will fall back to synchronous for larger requests
  • uses a uio_bio which passes an array of held pages to the file system. The file system calls uiobiomove to do the actual data movement.
  • currently doesn't expose async fsync to aio

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 34965

Event Timeline

mmacy requested review of this revision.Oct 9 2020, 8:53 PM

Generally looks fine to me.

sys/kern/vfs_aio.c
1814

Maybe call it aio_queue_vfs? qasync is a pretty generic name but the function is VFS specific.

Could we get some documentation for the uiobiomove, aio_qasync, and VOP_UBOP ?

Could we get some documentation for the uiobiomove, aio_qasync, and VOP_UBOP ?

It depends on how loosely you define 'documentation'. think the sanest thing is to follow precedent. uiobiomove should be documented in a manpage. aio_qasync is an implementation detail that perhaps justifies a few sentence comment on top of it. VOP_UBOP could stand a comment on top of it that describes the significance of the uio_bio parameters and expected return values. I don't envision it being used outside of aio.

It depends on how loosely you define 'documentation'. think the sanest thing is to follow precedent. uiobiomove should be documented in a manpage. aio_qasync is an implementation detail that perhaps justifies a few sentence comment on top of it. VOP_UBOP could stand a comment on top of it that describes the significance of the uio_bio parameters and expected return values. I don't envision it being used outside of aio.

+1 to comments for aio_qasync and man page for uiobiomove. But I think VOP_UBOP deserves a manpage, too. Most VOPs have them. And in addition to being used by aio, VIO_UBOP might be useful for md(4), and it might have many implementors. fusefs could profitably implement it as well as ZFS.

scottl added inline comments.
sys/kern/vfs_aio.c
1401

Just say "Reading from disk means writing into memory"

This revision is now accepted and ready to land.Oct 10 2020, 12:00 AM
  • comment new functions in vfs_aio.c
  • add man page for uio_bio (includes uiobiomove) and VOP_UBOP.
This revision now requires review to proceed.Oct 13 2020, 8:10 PM

@asomers can you let me what more you'd like to see in the documentation.

  • add sparse page array support to uiobiomove

Some man pages nits.

share/man/man9/VOP_UBOP.9
1

Is this required?

69

It would be nice if you could also add a HISTORY section to the man page.

share/man/man9/uio_bio.9
66

Whitespace at the end of the sentence.

119

This .Pp can be deleted since the next paragraph is a new section.

223

The comma is needed since the .Xr reference is the last one.

share/man/man9/VOP_UBOP.9
63

If the operation is not completed immediately, how will completion notification be delivered? The man page should mention the asynchronous nature of UIO_UBOP, since that's its whole point.

share/man/man9/uio_bio.9
28

This date needs fixing.

57

I think you should remove the trailing comma

227

It would be nice to say "and first appeared in FreeBSD 13.0."

sys/sys/_uio.h
46

DELETE needed as well?

sys/sys/_uio.h
46

That would be really sweet. But I don't think we need a UIO_BIO_DELETE until somebody writes aio_delete().

imp added inline comments.
sys/sys/_uio.h
46

Fair point....

This revision is now accepted and ready to land.Oct 18 2020, 3:47 PM
  • update uio_bio structure to support scatter gather as well as improve interop with linux
This revision now requires review to proceed.Nov 23 2020, 1:37 AM
asomers requested changes to this revision.Dec 17 2020, 3:35 AM
asomers added inline comments.
share/man/man9/uio_bio.9
60

This should be uiobiomove, right?

85

uio_bv_offset ?

sys/kern/vfs_aio.c
1388

Why not use << PAGE_SHIFT?

1389

Is io_pages the same thing as btoc(cb->aio_nbytes + cb->aio_buf) - btoc(cb->aio_buf) + 1?

1391

pages is a field from the bio backend's union member. But you're adding a new backend. I think you should create a new union member in struct kaiocb, and put vfs_pages or something in there. Especially because I'm eliminating the pages field in D27624. Also, what initializes pages at this point? I thought it was only used by aio_qbio, which hasn't been called yet, right?

1424

This is dead code. Since io_pages is unsigned, you'll always return -1 in the previous block before you get here. Or maybe this check should come before the previous one?

sys/sys/uio.h
65

Where is the matching #endif ? And what defines HAVE_UBOP?

69

What are the UIO_BIO_USER, UIO_BIO_PREEMPTED, and UIO_BIO_SG flags for?

This revision now requires changes to proceed.Dec 17 2020, 3:35 AM
sys/sys/uio.h
87

Would it be preferrable to make the field off_t if we want to support 64bit operation length with DELETE operation on 32bit platform potentially?

sys/sys/uio.h
87

Would it be preferrable to make the field off_t if we want to support 64bit operation length with DELETE operation on 32bit platform potentially?

Unfortunately, what I said above sounds quite impossible, since our current aio interface use ssize_t for length also.