Details

Reviewers

mav
chuck
jrtc27

Commits

rGd85d7244d8bb: nvme: Fix alignment on nvme structures
rG8c66f8811d52: nvme: Fix alignment on nvme structures
rGfea3cf1d6da0: nvme: Fix alignment on nvme structures

Summary

Remove __packed from nvme_command, nvme_completion and
nvme_dsm_trim. Add super-alignment to nvme_completion since it's always
at least that aligned in hardware (and in our existing uses of it
embedded in structures). It generates better code in
nvme_qpair_process_completions on riscv, and the same on all other
platforms.

Test Plan

This came from discussions with Jessica about issues on riscv, with a proposed fix in D31002 (Jessica also has one that has more background, which I've linked from that mine)

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 40235
Build 37124: arc lint + arc unit

Event Timeline

imp created this revision.Jul 2 2021, 2:15 PM

Herald added a subscriber: dab. · View Herald TranscriptJul 2 2021, 2:15 PM

imp requested review of this revision.Jul 2 2021, 2:15 PM

Harbormaster completed remote builds in B40235: Diff 91668.Jul 2 2021, 2:15 PM

imp edited the test plan for this revision. (Show Details)Jul 2 2021, 2:17 PM

imp added reviewers: mav, chuck, jrtc27.

imp added a child revision: D31002: nvme: coherently read status.

I agree that __packed changes nothing there and is not needed. But what is the point of adding explicit alignment? I've quickly looked through the use of the structures and found nothing that this would change.

In D31001#697509, @mav wrote:

I agree that __packed changes nothing there and is not needed. But what is the point of adding explicit alignment? I've quickly looked through the use of the structures and found nothing that this would change.

I was thinking for the nvme_completion we could read the completion record more efficiently, though maybe aligned(8) is sufficient for it to always use a movq. Lemme check code generation, if there's no advantage, I'll remove it.
For nvme_dsm_range, I have no good argument for that, so I'll remove it.

Just to document, there was a similar set of proposed changes for bhyve in D29126
I had a concern about the change, but @jhb explained __packed was more accurately "alignment of 1" and not really what the code wanted. He also mentioned that the sizeof asserts should catch any compiler shenanigans.

I agree with @mav that the __aligned(16) probably doesn't do anything especially since CQ base addresses must MPS aligned.

I've looked at the generated code for an artificial example.

#include <sys/cdefs.h>
#include <stdint.h>

struct gerbil {
        uint32_t w1;
        uint32_t w2;
        uint32_t w3;
        uint32_t w4;
};

struct hamster {
        uint32_t w1;
        uint32_t w2;
        uint32_t w3;
        uint32_t w4;
} __aligned(8);

extern void doit(void *);

void
copy_g(struct gerbil *g)
{
        struct gerbil gg = *g;

        if (gg.w4)
                doit(&gg);
}

void
copy_h(struct hamster *h)
{
        struct hamster hh = *h;

        if (hh.w4)
                doit(&hh);
}

For i386, aarch64, amd64, powerpc, and powerpc64 the code for copy_h and copy_g are identical. They don't care about the actual alignment, at least with the latest clang in the tree. I didn't try mips* or 32-bit arm because we don't support nvme there at all.

For riscv64, copy_h has significantly better code because it assumes super alignment (8), while copy_g does not. Changing it to alignment(4) causes the code generated to be the same, I presume because of the riscv64 ABI dictating this alignment.

I selected the above code because it's the bit of code racing in another commit, abstracted down to something easy to examine.

remove definitely not worth it alignment

Harbormaster completed remote builds in B40237: Diff 91671.Jul 2 2021, 3:32 PM

imp edited the summary of this revision. (Show Details)Jul 2 2021, 3:33 PM

mav accepted this revision.Jul 2 2021, 3:35 PM

This revision is now accepted and ready to land.Jul 2 2021, 3:35 PM

chuck accepted this revision.Jul 2 2021, 3:44 PM

better comment

This revision now requires review to proceed.Jul 2 2021, 3:47 PM

Harbormaster completed remote builds in B40238: Diff 91672.Jul 2 2021, 3:48 PM

What about nvme_registers? I don't think we actually use it for anything where it'd matter (the regs field is only ever assigned to, the actual reads/writes use offsetot and bus_space) but for completeness we probably should change that too.