Page MenuHomeFreeBSD

nvdimm(4): Fix various problems when the using the second label index block
ClosedPublic

Authored by scottph on Wed, Nov 6, 1:34 AM.

Details

Summary

struct nvdimm_label_index is dynamically sized, with the free
bitfield expanding to hold slot_cnt entries. Fix a few places
where we were treating the struct as though it had a fixed sized.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

scottph created this revision.Wed, Nov 6, 1:34 AM
cem added a reviewer: kib.Wed, Nov 6, 7:07 AM

Thanks!

Some initial tentative feedback and CCing kib@. Caveat, I haven't yet consulted any official documentation nor have I looked at the adjacent code to understand what it is doing, so my comments may just be uninformed. But, provisionally, I don't understand the motivation for some of this changeset.

sys/dev/nvdimm/nvdimm.c
260–281 ↗(On Diff #63987)

The portion above here seems not to have any functional change? Just refactoring? It's not clear to me why these changes were made; in particular, they seem less clear to me, and they don't seem to change any aspect of how label_index's dynamic-sizing is handled. I might be missing something, it's late Pacific time.

282–287 ↗(On Diff #63987)

This part (and label_index_is_valid above) seem to be the functional change?

In D22253#486458, @cem wrote:

Thanks!
Some initial tentative feedback and CCing kib@. Caveat, I haven't yet consulted any official documentation nor have I looked at the adjacent code to understand what it is doing, so my comments may just be uninformed. But, provisionally, I don't understand the motivation for some of this changeset.

No worries, the relevant spec material to look at for NVDIMM labels is the UEFI
spec, sec 13.19. The description of the label index block begins
unceremoniously after the listing of return codes for LabelStorageWrite(). A
few pages later, under the heading of Label Storage Area Description, it says:

The size of an Index Block depends on how many label slots fit into the Label
Storage Area. The minimum size of an Index Block is 256 bytes and the size
must be a multiple of EFI_NVDIMM_LABEL_INDEX_ALIGN bytes. As necessary,
padding with zero bytes at the end of the structure is used to meet these
size requirements. The minimum size of the Label Storage Area is large
enough to hold 2 index blocks and 2 labels.

sys/dev/nvdimm/nvdimm.c
260–281 ↗(On Diff #63987)

There are a few functional changes above. The important thing to note is that index_size != sizeof(struct nvdimm_label_index). index_size is required to be a multiple of EFI_NVDIMM_LABEL_INDEX_ALIGN (= 256), where sizeof(struct nvdimm_label_index) == 72. So before, where index[1] meant to index to the second label index block, it actually was indexing to the address of index[0].free

Also, in the seq comparison code above, the check of index1.seq > index0.seq would incorrectly assign the priority to index1 in the index0.seq==1 && index1.seq==3 case, where priority should go to index0.

282–287 ↗(On Diff #63987)

yes, if the free bitfield is smaller than it possibly could be, we go with the smaller size in the index block.

cem added inline comments.Mon, Nov 11, 4:49 AM
sys/dev/nvdimm/nvdimm.c
260–281 ↗(On Diff #63987)

I see, thanks!

I think it might make the logical bugfix clearer to leave indices alone, adding the new index1 with the careful arithmetic and removing references to indices[1]. (Much like how we refer to getdirentries buffers with struct dirent despite a trailing variable-length member.)

As far as the sequence number comparison, I think the update wheel algorithm described in the UEFI spec (1 -> 2 -> 3 -> 1) might be more clearly expressed with subtraction mod 3. I believe this is correct (it's easy to verify exhaustively¹):

if (index_0_valid && index_1_valid) {
    if (((int)index1->seq - (int)index0->seq + 3) % 3 == 1) {
        /* index 1 was mostly recently updated */
        index_0_valid = false;
    } else {
        /* index 0 was mostly recently updated */
        index_1_valid = false;
    }
}

(The explicit +3 is needed because C's % operator isn't really modulus and doesn't work on negative integers. Before + 3, the result must be in the range [-2, 2].)

[¹]:

idx1->seqidx0->seqdifference plus 3mod 3newer index
12220
13111
21411
23220
31520
32411
282–287 ↗(On Diff #63987)

This part LGTM. Do we sanity check slot_cnt from ACPI is <= num_labels (i.e., it's sane and within bounds)? Should we?

cem accepted this revision.Mon, Nov 11, 4:52 AM

Change looks functionally correct to me, my suggestions are all stylistic.

This revision is now accepted and ready to land.Mon, Nov 11, 4:52 AM
scottph updated this revision to Diff 64214.Tue, Nov 12, 12:55 AM

Updated with @cem's suggestions. Reposting for mentor approval.

This revision now requires review to proceed.Tue, Nov 12, 12:55 AM
scottl accepted this revision.Tue, Nov 12, 1:01 AM
This revision is now accepted and ready to land.Tue, Nov 12, 1:01 AM
cem accepted this revision.Tue, Nov 12, 1:52 AM