Page MenuHomeFreeBSD

kldxref: report errors and handle cross-device renames
Needs ReviewPublic

Authored by mqudsi_neosmart.net on Dec 27 2021, 6:58 PM.
Tags
None
Referenced Files
Unknown Object (File)
Feb 29 2024, 4:48 PM
Unknown Object (File)
Jan 9 2024, 7:13 PM
Unknown Object (File)
Dec 22 2023, 11:49 PM
Unknown Object (File)
Dec 21 2023, 4:02 AM
Unknown Object (File)
Nov 17 2023, 8:40 PM
Unknown Object (File)
Nov 7 2023, 8:50 AM
Unknown Object (File)
Nov 5 2023, 2:13 PM
Unknown Object (File)
Nov 4 2023, 4:37 PM
Subscribers

Details

Summary

kldxref: report errors and handle cross-device rename

If the temporary hint file was created on a different filesystem than
the final resting place of the resulting linker.hints file, rename(2)
would fail with EXDEV. There was no checking for errors after the call
to rename(2), so the kldxref operation would appear to succeed but
would not actually write out a new linker.hints file.

This patch adds both error checking/reporting and a fix for the EXDEV
situation by resorting to copying the file and unlinking the temporary
source.

Additionally, if rename(2) failed but it wasn't an EXDEV situation, we
now inform the user that this step failed.

This error was observed when putting together a filesystem layout for a live cd from disparate sources.

Sponsored by: NeoSmart Technologies

Test Plan

You can verify the existing broken behavior as follows:

sudo rm /boot/kernel/linker.hints
sudo mount -t tmpfs -o union /dev/null /boot/kernel
sudo kldxref /boot/kernel # no trailing slash!

No errors will be reported but a linker.hints file will not be created. Tracing with truss or dtrace reveals the rename failed with ERR#18
The same process when run with the patched kldxref will succeed. Relevant portion of trace logs for changed behavior:

write(4,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,37400) = 37400 (0x9218)
close(4)                                         = 0 (0x0)
rename("/boot/lhint.pSrbDz","/boot/kernel/linker.hints") ERR#18 'Cross-device link'
openat(AT_FDCWD,"/boot/lhint.pSrbDz",O_RDONLY,00) = 4 (0x4)
openat(AT_FDCWD,"/boot/kernel/linker.hints",O_WRONLY|O_CREAT|O_TRUNC,00) = 5 (0x5)
copy_file_range(0x4,0x0,0x5,0x0,0x7fffffffffffffff,0x0) = 299544 (0x49218)
close(4)                                         = 0 (0x0)
close(5)                                         = 0 (0x0)
unlink("/boot/lhint.pSrbDz")                     = 0 (0x0)
fchdir(0x3)                                      = 0 (0x0)
close(3)                                         = 0 (0x0)
exit(0x0)

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

mqudsi_neosmart.net created this revision.

Use warn() not err() to continue if possible.

jrtc27 added inline comments.
usr.sbin/kldxref/kldxref.c
662

style(9) requires the opening curly brace for functions to be on its own line

662

Not const char *?...

669

style(9) requires parentheses around all return values

677

If it succeeds, the call returns the number of bytes copied, which can be
fewer than len. Returning fewer bytes than len does not necessarily in-
dicate that EOF was reached. However, a return of zero for a non-zero
len argument indicates that the offset for infd is at or beyond EOF.
copy_file_range() should be used in a loop until copying of the desired
byte range has been completed. If an error has occurred, a -1 is re-
turned and the error code is placed in the global variable errno.

(https://www.freebsd.org/cgi/man.cgi?query=copy_file_range)

769

If you unlink after warning then it can be hoisted

mqudsi_neosmart.net set the repository for this revision to rG FreeBSD src repository.

Update per style(9) and ensure handling of partial copy_file_range() calls.

usr.sbin/kldxref/kldxref.c
138

Leave this alone

670

ival is not a helpful variable name

675

Now you've changed to bool these are the wrong way round

685

style(9) is 4 spaces indent for continuation lines

685

If you change this to > 0 then you'll exit the loop for < 0

695

... and then this could just be return (ival_renamed == 0), if you keep the return value at all, or

if (new_ival_name < 0) {
        warn(...);
        return (false);
}

return (true);
777

Having a return value for a static function that's always ignored is a bit odd

mqudsi_neosmart.net added inline comments.
usr.sbin/kldxref/kldxref.c
670

When doing "declarations up front" where a value may be repurposed a few times, I've always noticed a preference for less-specific names (the name is copied from elsewhere in the same file, actually) and I was trying to stick to the conventions I discovered in the existing codebase as I was exposed to them. I'll rename it to be more specific, happy to "be the change you want to see" :)

685

style(9) is 4 spaces indent for continuation lines

You're right. I thought indent(1) respected style(9)? There were a few other incorrect changes it made I had to fix by hand, but I missed this one.

685

style(9) is 4 spaces indent for continuation lines

695

I really prefer to keep error handling as close as possible to the code that triggered it, both to make it easier to review and to prevent refactoring/maintenance issues down the line. If I were to move the warn to the end, someone could potentially introduce another syscall that would clobber errno. Actually, I think that's already the case since this would come after the close(2) calls. But just checking the renamed variable to bubble up the return code would be fine.

777

I agree, which is why I made sure to document why it's ignored here. But the return value is there to be more generic/useful (and it doesn't hurt besides looking odd without that comment to explain). If an operation needed to abort after a failed file copy, the return code would then be used to abort but that's not the case right here.

usr.sbin/kldxref/kldxref.c
667

Why is src still a non-const char *?

670

ival just tells you it's an integer. ret and err are often used to hint that they're return values/error returns from something, which at least tells you more than the type. The only place ival is used in this file is for writing the version integer, 1, as the first integer in the hints file.

685

indent(1) isn't perfect, no. In this case, it defaults to -lp, whereas style(9) wants -nlp.

695

Oh, right, warn implicitly checks errno. You can still put the if < 0 warn directly after the loop though and avoid the nesting.

777

Ok, but that's not quite what your comment says. Your comment says it reports its own errors, but that would also be true of a void-returning copyfile.

usr.sbin/kldxref/kldxref.c
667

Sorry, I rewrote the prototype a dozen times when I was fighting with indent, and I must have hit u one time too many.

777

If you start with the assumption that the only reason to check the return code would be to either report an error or perform some sort of control flow based off the result, if the comment says it does its own error checking then the fact that there's no control flow logic here indicates that we're just going to continue. I'll make it void if that's what you prefer, though.

Updated to address code review issues.

I was about to make copyfile() a void function but then I realized how much work it would be for someone to change it if they need to reuse it but act on its success/failure in the future (revisiting all the early return sites and making sure to bubble up the correct result) and just expanded the comment at the callsite instead.

Let me know if you still want it to be a void function and I'll be happy to oblige.

usr.sbin/kldxref/kldxref.c
685

You can also give clang-format a try. It's not able to exactly match style(9) but gets pretty close.

I usually redirect the formatted output to a temp file, diff the input and formatted files, and iteratively edit.

777

If you start with the assumption that the only reason to check the return code would be to either report an error or perform some sort of control flow based off the result, if the comment says it does its own error checking then the fact that there's no control flow logic here indicates that we're just going to continue. I'll make it void if that's what you prefer, though.

usr.sbin/kldxref/kldxref.c
684

This is now a 12 space indent, not a 4 space one

usr.sbin/kldxref/kldxref.c
777

Oops, did not mean to add that.

mqudsi_neosmart.net added inline comments.
usr.sbin/kldxref/kldxref.c
685

You can also give clang-format a try. It's not able to exactly match style(9) but gets pretty close.

Hi Ed! Yeah, I just discovered that FreeBSD has an (experimental?) .clang-format file in the project root that gave me much more sane results than ident(1) (with -nlp), but clang-format was responsible for the 12-space continuation indent jrtc27@ just pointed out (and it seems to have an unreasonably short hard limit to the line length?). Getting there!

@emaste @jrtc27 is there anything else you would like for me to do here? I've checked and the attached diff still appears to apply cleanly against main. I'm not sure which future releases you would like to target this against; it is a fairly innocuous bugfix and it's hard to see any negative fallout from applying it across the board but I defer to your wisdom.

usr.sbin/kldxref/kldxref.c
775

I still think this approach is a bad design

usr.sbin/kldxref/kldxref.c
775

As I mentioned, if you just want the static function to be void, I'm happy to change it. Would that work for you?

usr.sbin/kldxref/kldxref.c
775

Yes, that would be ok

The copyfile helper function has been converted to void so that there are no ignored return codes.

@jrtc27 is there anything else you would like for me to do here?

@jrtc27 is there anything else you would like for me to do here?

mqudsi_neosmart.net edited the summary of this revision. (Show Details)

Fixed reversed src/dst names in warning message when rename fails and its not EXDEV.

@emaste do you mind taking a look and seeing if there's anything else this needs?

This should be made obsolete by commit f4613af424cc93d42f35730fd9862f0c6f964cbd which stops creating the temporary file in a different directory. Now it should always be in the same directory.

Thanks for the info! I can confirm that the patch in question addresses both the basic case and where there are different mount points under the top-level modules directory, each containing additional kernel modules.

I think this differential can be closed out.