Inspired by trim.c, add a function to be called before swapon that will trim everything.
Details
- Reviewers
markj - Group Reviewers
manpages - Commits
- rS349286: Modify swapon(8) to invoke BIO_DELETE to trim swap devices, either if
It compiles. I don't know how to test it.
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
For testing, if you have a spare SSD available, it should be sufficient to just invoke "swapon <device>" from the command line. If not, I can help test it.
swapon.c | ||
---|---|---|
747 | Don't you need to open RDWR or WRONLY? There should not be any need to use O_DIRECT here. | |
757 | I would find it clearer to have off_t sz; ... if (S_ISREG(sb.st_mode) { sz = sb.st_size; } else if (S_ISCHR(sb.st_mode) || S_ISBLK(sb.st_mode)) { if (ioctl(fd, DIOCGMEDIASIZE, &sz) != 0) err(1, "ioctl(DIOCGMEDIASIZE"); } else { errx(1, "%s has an invalid file type", name); } ioarg[0] = 0; ioarg[1] = sz; if (ioctl(fd, DIOCGDELETE, ioarg) != 0) warn(1, "ioctl(DIOCGDELETE)"); In particular, it does not make sense to swap on a directory. I am not sure how we should handle errors. I think it is reasonable to log them and continue, i.e., using warn(3) as above. | |
775 | My feeling is that this should be explicitly configured by the administrator. It's not unusual for devices to have slow or buggy TRIM support. |
Apply reviewer suggestions. Muddle through trying to process a "swaponce" option in fstab.
swapon.c | ||
---|---|---|
395 | Hmm, see the description of -T in geli(8). The combination of "notrim" and "trimonce" should probably result in an error. | |
401 | Style: continuing lines should be indented by four space. | |
747 | Why is O_DIRECT needed? | |
809 | I guess it should be [-F fstab] -aLq | [-E] file ... In the first type of usage we're getting the list of swap devices and options from the fstab, and in the second we're specifying the swap device on the command line. -E should only be available in the latter usage. |
The "trimonce" option should be documented in the fstab(5) man page, share/man/man5/fstab.5, and the -E option should be documented in the swapon(8) man page, sbin/swapon/swapon.8.
The syntax used for man pages is documented in the mdoc(7) page. It's helpful also to look at existing man pages for examples of mandoc syntax.
swapon.c | ||
---|---|---|
192 | The options are per-swap device and the fstab may describe multiple swap devices. This enables the "trimonce" option for all of them following the first one with "trimonce" configured. | |
742 | Extra newline. | |
749 | This should be a fatal error. | |
771 | Extra newline. |
sbin/swapon/swapon.8 | ||
---|---|---|
31 ↗ | (On Diff #58751) | .Dd should be bumped for non-trivial changes like adding a new option. |
92 ↗ | (On Diff #58751) | Same comment as below regarding .Dv. |
sbin/swapon/swapon.c | ||
181 ↗ | (On Diff #58751) | Hmm, does this code just blindly look for option names without checking against partial matches? That will need to be fixed at some point if we later add a plain "trim" option for online trim. |
191 ↗ | (On Diff #58751) | Doesn't this still leave Eflag set for all remaining swap devices in the fstab? |
400 ↗ | (On Diff #58751) | We should handle this consistently with "late" and "noauto", i.e., have a separate else if statement, or combine all of the strcmp() calls into a single predicate. |
share/man/man5/fstab.5 | ||
219 ↗ | (On Diff #58751) | I would suggest qualifying this: "For swap devices, the keyword..." |
221 ↗ | (On Diff #58751) | Identifiers like BIO_DELETE should be marked with .Dv ("defined variable"): ... triggers the delivery of a .Dv BIO_DELETE command ... |
sbin/swapon/swapon.c | ||
---|---|---|
191 ↗ | (On Diff #58751) | No, it will be reset for every iteration of the getfsent loop - every swap device in the fstab, as I understand it. However, it could leave Eflag set for all the devices on the command line that are processed after the fstab is processed, so I'll use the 1 bit of Eflags for fstab trimming and the 2 bit for the rest. |
With the nits I mentioned fixed, the patch seems to work. Thanks for doing this.
BTW, to build and install a new swapon, run:
$ cd /usr/src/sbin/swapon $ make cleandir all $ sudo make install
Or from /usr/src:
$ make -C sbin/swapon cleandir all $ sudo make -C sbin/swapon install
sbin/swapon/swapon.c | ||
---|---|---|
757 ↗ | (On Diff #58804) | You'll need to #include <sys/disk.h> |
764 ↗ | (On Diff #58804) | warn() doesn't take an exit status parameter. |
191 ↗ | (On Diff #58751) | Ok, I was looking at an older version of the diff. |
head/sbin/swapon/swapon.c | ||
---|---|---|
762–765 ↗ | (On Diff #58886) | We need to change the arguments to DIOCGDELETE. I enabled "trimonce" on a swap partition that resides at the start of a slice: # gpart show nvd0s1 => 0 976773104 nvd0s1 BSD (466G) 0 134217728 2 freebsd-swap (64G) 134217728 842555376 4 freebsd-ufs (402G) And, on the next reboot after the first application of trim, the BSD label has disappeared. I have seen this happen twice now, once by accident and just now by intention. On a related note, recall that the swap pager sets aside the first two pages on the swap device, never using them. |
head/sbin/swapon/swapon.c | ||
---|---|---|
762–765 ↗ | (On Diff #58886) | Oh, that's surprising. I didn't know that the swap pager does that. |
head/sbin/swapon/swapon.c | ||
---|---|---|
762–765 ↗ | (On Diff #58886) | Most likely because 2 pages is 8k which is the size of a bsdlabel... |
head/sbin/swapon/swapon.c | ||
---|---|---|
762–765 ↗ | (On Diff #58886) | I discovered another problem earlier this week: if the swap partition is used as a kernel dump device, the TRIMs will wipe a kernel dump before savecore runs. One workaround is to set the "late" option on the swap device in /etc/fstab, so that swapon is called after savecore runs. We should probably document that as a caveat in the swapon man page. |
head/sbin/swapon/swapon.c | ||
---|---|---|
762–765 ↗ | (On Diff #58886) | Indeed: 2300 /* 2301 * Do not free the first two block in order to avoid overwriting 2302 * any bsd label at the front of the partition 2303 */ 2304 blist_free(sp->sw_blist, 2, nblks - 2); I just wasn't aware of that is all. |
head/sbin/swapon/swapon.c | ||
---|---|---|
762–765 ↗ | (On Diff #58886) | Yea, we should avoid the 'overlapping' areas of the drive, but skipping two pages will safely skip over all that junk on any platform we care about that supports trim. Otherwise, we'll stomp on the disklabel as alc noticed :) |
head/sbin/swapon/swapon.c | ||
---|---|---|
756 ↗ | (On Diff #58886) | How about return or error+exit here? |