Page MenuHomeFreeBSD

memdesc: Add routines for copying data to/from memory descriptors.
ClosedPublic

Authored by jhb on Jun 20 2023, 5:56 AM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Jun 20, 3:33 PM
Unknown Object (File)
Wed, May 29, 6:05 PM
Unknown Object (File)
Wed, May 29, 6:05 PM
Unknown Object (File)
Wed, May 29, 6:05 PM
Unknown Object (File)
May 18 2024, 12:26 PM
Unknown Object (File)
May 8 2024, 12:29 PM
Unknown Object (File)
May 8 2024, 7:20 AM
Unknown Object (File)
May 7 2024, 3:43 PM
Subscribers

Details

Summary

These are modeled on the API used for m_copydata/m_copyback and the
crypto buffer APIs. One day it might be nice to reduce the
proliferation of these by adding cursors and using memdesc directly
for crypto request buffers.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

jhb requested review of this revision.Jun 20 2023, 5:56 AM

I use this in the NVMe over TCP transport to copy data between command buffers associated with NVMe commands and mbuf chains containing NVMe/TCP PDUs.

imp requested changes to this revision.Jun 20 2023, 8:35 AM
imp added inline comments.
sys/kern/kern_memdesc.c
1 ↗(On Diff #123467)

can I get a spdx-indentifier for this?
Also, does Chelsio require all rights reserved? I gotta ask :)

53 ↗(On Diff #123467)

any way to static_assert this?

171 ↗(On Diff #123467)

I'm not entirely sure I'm happy that we need to decode these details in yet another place. Maybe this should be a cam function that's called instead? ksan needs it as does _bus_dmamap_load_ccb
both pull out some/all of this data. And I wonder why there's no MMC I/O requests in any of those places...

406 ↗(On Diff #123467)

and another copy here, see above :(

This revision now requires changes to proceed.Jun 20 2023, 8:35 AM
sys/kern/kern_memdesc.c
284 ↗(On Diff #123467)

Why do we need page_off? PHYS_TO_DMAP will return the same sub-page offset.

310 ↗(On Diff #123467)

Shouldn't this be >= 1?

342 ↗(On Diff #123467)

Should this be >= 1?

496 ↗(On Diff #123467)

Why not handle uio?

sys/kern/kern_memdesc.c
53 ↗(On Diff #123467)

Not really. The question is if this if the MD code creates mappings this way or not. The other way to fix this is to do a loop invoking PHYS_TO_DMAP on each page which I could do easily enough.

171 ↗(On Diff #123467)

I do think it would be nice to basically have a function which returns a memdesc for a ccb. That would basically be the moral equivalent. I could add that and fix the existing places.

284 ↗(On Diff #123467)

Humm, presumably if i fix this to do a loop on each page and not rely on the assumption (which probably isn't safe in some cases?) then it would matter, but yes, otherwise it would not.

496 ↗(On Diff #123467)

Because this routine is intended to be safe to use in contexts where you can't sleep or fault. It could perhaps permit UIO_SYSSPACE, but I also knew I didn't need it for nvmf so figured it could be added in when there was an actual user.

TBH, I would probably not like to have uio in memdesc and instead change MEMDESC_VLIST to use a iovec array instead of abusing bus_dma_segment_t to store virtual pointers in ds_addr_t. That really means fixing that abuse in CAM CCBs though which is a bigger change.

sys/kern/kern_memdesc.c
171 ↗(On Diff #123467)

See D40880 for an approach that centralizes this. The MMC CCB doesn't have an I/O data buffer which is why it is not covered here. With that change in place, these routines simply go away as by the time these routines are called the memdesc is some other type instead.

  • Fix license and rebase after dropping MEMDESC_CCB/BIO.
jhb marked 2 inline comments as done.Jul 14 2023, 6:48 PM
jhb marked 3 inline comments as done.Jul 14 2023, 6:52 PM
jhb marked 2 inline comments as done.Jul 14 2023, 7:06 PM

This still does not handle uio's (for which I think uiomove is already a sufficient API), but I think I've addressed all the other feedback.

As an aside, I'm never sure when a file in sys/kern should start with kern_ vs. subr_. Are there any rules around that?

sys/kern/kern_memdesc.c
208 ↗(On Diff #124691)

It should be possible to fold this prologue into the loop if the loop sets page_off = 0 at the end of the first iteration.

208 ↗(On Diff #124691)

Don't you want to write pa += PAGE_SIZE - page_off here?

305 ↗(On Diff #124691)

Perhaps assert off >= 0 && size >= 0.

As an aside, I'm never sure when a file in sys/kern should start with kern_ vs. subr_. Are there any rules around that?

I'm not aware of any rules. My vague hand-wavy thought is that subr_* is lower level than kern_*, and given some other examples like subr_sglist.c and subr_bus_dma.c, maybe subr_ is the better prefix in this case?

sys/kern/kern_memdesc.c
208 ↗(On Diff #124691)

I was debating if I wanted to deal with page_off in the loop, vs make the loop more streamlined. However, it probably is less code to just set page_off to 0 at the end of the loop. And yes, I messed up the pa addition, and forgot to axe the trunc_page from PHYS_TO_DMAP above. Bah. My problem is that my test case (my NVMeoF host) doesn't currently use memdesc's with paddr ranges. It does test both bio and ccb's with unmapped I/O, but those use the vmpages memdesc.

In D40615#935118, @jhb wrote:

As an aside, I'm never sure when a file in sys/kern should start with kern_ vs. subr_. Are there any rules around that?

I'm not aware of any rules. My vague hand-wavy thought is that subr_* is lower level than kern_*, and given some other examples like subr_sglist.c and subr_bus_dma.c, maybe subr_ is the better prefix in this case?

subr_ makes more sense to me.

jhb marked an inline comment as done.

rename to subr, fix phys copy loops, add asserts

jhb marked 2 inline comments as done.Jul 24 2023, 5:46 PM
This revision was not accepted when it landed; it landed in state Needs Review.Jul 31 2023, 8:27 PM
This revision was automatically updated to reflect the committed changes.