
makewhatis: make output reproducible
ClosedPublic

Authored by emaste on Oct 10 2016, 4:59 PM.
Tags
None
Subscribers
None

Details

Summary

The mandoc search database generation uses each page's inode number as a hash key so that hard-linked pages are indexed only once. However, it also processed the pages in hash-key order, which made the output effectively non-deterministic.

Instead:

  1. provide fts_open() with a comparison function to process directories and files in a deterministic order
  2. in addition to the hash, insert pages into a linked list which will be sorted (by virtue of 1)
  3. iterate over pages by the list in 2, instead of hash order
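
The three steps above can be sketched as a small standalone model. This is not the mandocdb.c code: `struct page`, `struct page_set`, and the helper names are hypothetical, `qsort()` stands in for the ordering that the fts_open() comparison function would provide, and entries are appended at the tail for simplicity. The inode-keyed table still prevents duplicates, while the list fixes the iteration order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a manual page entry; not the mandocdb.c struct. */
struct page {
    unsigned long inode;  /* shared by all hard links to one file */
    const char   *path;
    struct page  *next;   /* list in insertion order */
};

#define TABLE_SIZE 64     /* power of two; large enough for this sketch */

struct page_set {
    struct page *slots[TABLE_SIZE]; /* open-addressing hash keyed by inode */
    struct page *head, *tail;       /* deterministic iteration order */
    size_t       count;
};

/* Step 1 (modelled with qsort): visit entries in a fixed, sorted order. */
static int by_path(const void *a, const void *b)
{
    const struct page *pa = a, *pb = b;
    return strcmp(pa->path, pb->path);
}

/* Step 2: record the page in the hash and in the list.
 * Returns 1 if inserted, 0 if the inode was already present. */
static int page_set_add(struct page_set *s, struct page *p)
{
    size_t i = (size_t)(p->inode * 2654435761UL) & (TABLE_SIZE - 1);

    assert(s->count < TABLE_SIZE - 1);  /* keep at least one free slot */
    while (s->slots[i] != NULL) {
        if (s->slots[i]->inode == p->inode)
            return 0;                   /* hard link to a known page */
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    s->slots[i] = p;
    p->next = NULL;
    if (s->tail != NULL)
        s->tail->next = p;
    else
        s->head = p;
    s->tail = p;
    s->count++;
    return 1;
}

/* Sort the input, then index each page once. */
static void page_set_build(struct page_set *s, struct page *pages, size_t n)
{
    memset(s, 0, sizeof(*s));
    qsort(pages, n, sizeof(pages[0]), by_path);
    for (size_t i = 0; i < n; i++)
        page_set_add(s, &pages[i]);
}
```

Step 3 then amounts to walking from `s.head` through `next` rather than scanning `slots`, so the result no longer depends on inode values.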

I will work on getting this upstream as well (although the patch is not very invasive).

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

emaste retitled this revision to makewhatis: make output reproducible.
emaste updated this object.
emaste edited the test plan for this revision.
emaste added reviewers: bapt, des, cem.

use linked list in mpages_free(), leaving mpage_head as NULL

  • remove unused variable
  • add missing {
  • reverse fts sort order, to counter the reversal introduced by the linked list

having the entries in the db sorted is helpful when inspecting the database during development, even though the resulting db order does not matter

contrib/mdocml/mandocdb.c
106 ↗(On Diff #21233)

Why not use queue.h? Even Linux has TAILQ.

581 ↗(On Diff #21233)

Why sort inverted? And why not just swap a and b instead?

contrib/mdocml/mandocdb.c
106 ↗(On Diff #21233)

Only to limit diffs wrt OpenBSD upstream, which already used un-macroized linked lists.

581 ↗(On Diff #21233)

mpages are added at the head of the linked list, so they're processed in the opposite order of being added. probably worth a comment.

I think negating strcmp's return value makes it slightly more obvious that the reverse ordering is intentional, but it is worth a comment.
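
To illustrate the interaction described here, the following is a minimal sketch, not the code from the diff: `struct node`, `compare_reversed`, and `build_list` are hypothetical names, and `qsort()` stands in for the fts comparison callback. Sorting in descending order and then inserting each element at the head of the list yields a list that reads in ascending order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
    const char  *name;
    struct node *next;
};

/* Deliberately reversed: the negated strcmp() sorts in descending order,
 * because head insertion in build_list() reverses the order once more. */
static int compare_reversed(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return -strcmp(*sa, *sb);
}

/* Sorts names in descending order, then pushes each one at the head.
 * The caller supplies storage for n nodes. Returns the list head. */
static struct node *build_list(const char **names, struct node *storage,
    size_t n)
{
    struct node *head = NULL;

    assert(names != NULL && storage != NULL);
    qsort(names, n, sizeof(names[0]), compare_reversed);
    for (size_t i = 0; i < n; i++) {
        storage[i].name = names[i];
        storage[i].next = head;  /* new element becomes the head */
        head = &storage[i];
    }
    return head;
}
```

Swapping the two arguments to strcmp() would give the same ordering; the negation is only a way of making the reversal visible at the call site.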

add comment to fts_compare

bapt edited edge metadata.

please upstream that as soon as possible so it is not a pain to update later

This revision is now accepted and ready to land. Oct 10 2016, 6:16 PM
contrib/mdocml/mandocdb.c
106 ↗(On Diff #21233)

Macroizing this list wouldn't change the size of the diff too much. You don't have to rewrite the other non-macro list code.

581 ↗(On Diff #21233)

With a TAILQ, they could easily be added at the end of the list instead :-).
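
For comparison, a rough sketch of the TAILQ alternative suggested here, using the `<sys/queue.h>` macros. The `struct entry` type and the helper functions are hypothetical and are not taken from mandocdb.c. Because TAILQ_INSERT_TAIL appends, elements come back in the order they were added and no reversed sort is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical list element; the link field is managed by the macros. */
struct entry {
    const char *name;
    TAILQ_ENTRY(entry) link;
};

TAILQ_HEAD(entry_list, entry);

/* Append n items in the order given; the list must be initialized. */
static void entry_list_append(struct entry_list *list, struct entry *items,
    size_t n)
{
    assert(list != NULL);
    for (size_t i = 0; i < n; i++)
        TAILQ_INSERT_TAIL(list, &items[i], link);
}

/* Copy up to max names out of the list in iteration order.
 * Returns the number of names written. */
static size_t entry_list_names(struct entry_list *list, const char **out,
    size_t max)
{
    struct entry *e;
    size_t n = 0;

    TAILQ_FOREACH(e, list, link) {
        if (n == max)
            break;
        out[n++] = e->name;
    }
    return n;
}
```

A caller would run TAILQ_INIT on the list head once before appending.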

This revision was automatically updated to reflect the committed changes.