Page MenuHomeFreeBSD

simple trimming before swapon

Authored by dougm on Jun 11 2019, 6:07 AM.



Inspired by trim.c, add a function to be called before swapon that will trim everything.

Test Plan

It compiles. I don't know how to test it.

Diff Detail

rS FreeBSD src repository
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

dougm created this revision.Jun 11 2019, 6:07 AM
markj added a comment.Jun 11 2019, 4:41 PM

For testing, if you have a spare SSD available, it should be sufficient to just invoke "swapon <device>" from the command line. If not, I can help test it.

732 ↗(On Diff #58510)

Don't you need to open RDWR or WRONLY?

There should not be any need to use O_DIRECT here.

742 ↗(On Diff #58510)

I would find it clearer to have

off_t sz;
if (S_ISREG(sb.st_mode) {
    sz = sb.st_size;
} else if (S_ISCHR(sb.st_mode) || S_ISBLK(sb.st_mode)) {
    if (ioctl(fd, DIOCGMEDIASIZE, &sz) != 0)
        err(1, "ioctl(DIOCGMEDIASIZE");
} else {
    errx(1, "%s has an invalid file type", name);

ioarg[0] = 0;
ioarg[1] = sz;
if (ioctl(fd, DIOCGDELETE, ioarg) != 0)
    warn(1, "ioctl(DIOCGDELETE)");

In particular, it does not make sense to swap on a directory.

I am not sure how we should handle errors. I think it is reasonable to log them and continue, i.e., using warn(3) as above.

754 ↗(On Diff #58510)

My feeling is that this should be explicitly configured by the administrator. It's not unusual for devices to have slow or buggy TRIM support.

dougm updated this revision to Diff 58554.Jun 12 2019, 7:35 AM
dougm marked 3 inline comments as done.

Apply reviewer suggestions. Muddle through trying to process a "swaponce" option in fstab.

markj added inline comments.Jun 13 2019, 4:17 PM
389 ↗(On Diff #58554)

Hmm, see the description of -T in geli(8). The combination of "notrim" and "trimonce" should probably result in an error.

395 ↗(On Diff #58554)

Style: continuing lines should be indented by four space.

741 ↗(On Diff #58554)

Why is O_DIRECT needed?

803 ↗(On Diff #58554)

I guess it should be

[-F fstab] -aLq | [-E] file ...

In the first type of usage we're getting the list of swap devices and options from the fstab, and in the second we're specifying the swap device on the command line. -E should only be available in the latter usage.

dougm updated this revision to Diff 58595.Jun 13 2019, 8:23 PM
dougm marked 4 inline comments as done.

Address reviewer comments.

markj added a comment.Jun 17 2019, 3:42 PM

The "trimonce" option should be documented in the fstab(5) man page, share/man/man5/fstab.5, and the -E option should be documented in the swapon(8) man page, sbin/swapon/swapon.8.

The syntax used for man pages is documented in the mdoc(7) page. It's helpful also to look at existing man pages for examples of mandoc syntax.

192 ↗(On Diff #58595)

The options are per-swap device and the fstab may describe multiple swap devices. This enables the "trimonce" option for all of them following the first one with "trimonce" configured.

742 ↗(On Diff #58595)

Extra newline.

749 ↗(On Diff #58595)

This should be a fatal error.

771 ↗(On Diff #58595)

Extra newline.

dougm updated this revision to Diff 58751.Jun 17 2019, 10:14 PM

Endeavor to apply reviewer suggestions.

markj added inline comments.Jun 19 2019, 5:51 PM
31 ↗(On Diff #58751)

.Dd should be bumped for non-trivial changes like adding a new option.

92 ↗(On Diff #58751)

Same comment as below regarding .Dv.

181 ↗(On Diff #58751)

Hmm, does this code just blindly look for option names without checking against partial matches? That will need to be fixed at some point if we later add a plain "trim" option for online trim.

191 ↗(On Diff #58751)

Doesn't this still leave Eflag set for all remaining swap devices in the fstab?

400 ↗(On Diff #58751)

We should handle this consistently with "late" and "noauto", i.e., have a separate else if statement, or combine all of the strcmp() calls into a single predicate.

219 ↗(On Diff #58751)

I would suggest qualifying this: "For swap devices, the keyword..."

221 ↗(On Diff #58751)

Identifiers like BIO_DELETE should be marked with .Dv ("defined variable"):

... triggers the delivery of a
command ...
dougm updated this revision to Diff 58804.Jun 19 2019, 8:14 PM
dougm marked 8 inline comments as done.

Addressed reviewer comments about these changes.

dougm added inline comments.Jun 21 2019, 3:37 AM
191 ↗(On Diff #58751)

No, it will be reset for every iteration of the getfsent loop - every swap device in the fstab, as I understand it. However, it could leave Eflag set for all the devices on the command line that are processed after the fstab is processed, so I'll use the 1 bit of Eflags for fstab trimming and the 2 bit for the rest.

markj added a comment.Jun 21 2019, 5:29 PM

With the nits I mentioned fixed, the patch seems to work. Thanks for doing this.

BTW, to build and install a new swapon, run:

$ cd /usr/src/sbin/swapon
$ make cleandir all
$ sudo make install

Or from /usr/src:

$ make -C sbin/swapon cleandir all
$ sudo make -C sbin/swapon install
191 ↗(On Diff #58751)

Ok, I was looking at an older version of the diff.

757 ↗(On Diff #58804)

You'll need to #include <sys/disk.h>

764 ↗(On Diff #58804)

warn() doesn't take an exit status parameter.

dougm updated this revision to Diff 58870.Jun 21 2019, 6:34 PM

Add a header. Drop an extra argument to warn().

markj accepted this revision.Jun 21 2019, 7:51 PM
This revision is now accepted and ready to land.Jun 21 2019, 7:51 PM
alc added a subscriber: alc.Aug 8 2019, 4:44 PM
alc added inline comments.

We need to change the arguments to DIOCGDELETE. I enabled "trimonce" on a swap partition that resides at the start of a slice:

# gpart show nvd0s1
=>        0  976773104  nvd0s1  BSD  (466G)
          0  134217728       2  freebsd-swap  (64G)
  134217728  842555376       4  freebsd-ufs  (402G)

And, on the next reboot after the first application of trim, the BSD label has disappeared. I have seen this happen twice now, once by accident and just now by intention.

On a related note, recall that the swap pager sets aside the first two pages on the swap device, never using them.

markj added inline comments.Aug 8 2019, 7:38 PM

Oh, that's surprising. I didn't know that the swap pager does that.

imp added inline comments.Aug 8 2019, 7:40 PM

Most likely because 2 pages is 8k which is the size of a bsdlabel...

markj added inline comments.Aug 8 2019, 7:40 PM

I discovered another problem earlier this week: if the swap partition is used as a kernel dump device, the TRIMs will wipe a kernel dump before savecore runs. One workaround is to set the "late" option on the swap device in /etc/fstab, so that swapon is called after savecore runs. We should probably document that as a caveat in the swapon man page.

markj added inline comments.Aug 8 2019, 7:42 PM


2300         /*                                                                                                                                               
2301          * Do not free the first two block in order to avoid overwriting                                                                                 
2302          * any bsd label at the front of the partition                                                                                                   
2303          */                                                                                                                                              
2304         blist_free(sp->sw_blist, 2, nblks - 2);

I just wasn't aware of that is all.

imp added inline comments.Aug 8 2019, 7:43 PM

Yea, we should avoid the 'overlapping' areas of the drive, but skipping two pages will safely skip over all that junk on any platform we care about that supports trim. Otherwise, we'll stomp on the disklabel as alc noticed :) added inline comments.

How about return or error+exit here?
DIOCGDELETE ioctl fails for NFS swapfile.