Page MenuHomeFreeBSD

vn_fullpath: const'ify retbuf
AbandonedPublic

Authored by kevans on Feb 23 2020, 4:17 AM.

Details

Reviewers
kib
rlibby
markj
Group Reviewers
manpages
Summary

This is arguably a little churny, but I suspect it doesn't matter all that much as most of the churn are in various declarations. This change may serve two purposes:

  • Document the intention of vn_fullpath to not actually modify whatever string retbuf may point to, and
  • Allow callers a little more flexibility in what they're passing to retbuf.

As a side-effect, it also takes care of a couple of -Wwrite-strings issues in the process as a common pattern is to initialize a local retbuf to a literal string as a default value before passing the address to vn_fullpath.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 29542

Event Timeline

I really do not understand it. The *retbuf *is* modified by vn_fullpath. I believe you are lying to the compiler' aliasing analysis this way.

In D23802#523059, @kib wrote:

I really do not understand it. The *retbuf *is* modified by vn_fullpath. I believe you are lying to the compiler' aliasing analysis this way.

*retbuf is, but importantly **retbuf is not. This should be the difference between const char ** and char * const *

In D23802#523059, @kib wrote:

I really do not understand it. The *retbuf *is* modified by vn_fullpath. I believe you are lying to the compiler' aliasing analysis this way.

*retbuf is, but importantly **retbuf is not. This should be the difference between const char ** and char * const *

Well, there are three values, retbuf which constness is only important if completely blatant bug in vn_fullpath1() is added. The *retbuf is just a pointer, and the function assigns to it. And the last, the buffer where *retbuf points to, which is the buffer containing the calculated path.

The buffer was written by the function, this is my point of the aliasing breakage. And I do not see why the recipient of the pointer to the result buffer must be disallowed from modifying it.

In D23802#523379, @kib wrote:
In D23802#523059, @kib wrote:

I really do not understand it. The *retbuf *is* modified by vn_fullpath. I believe you are lying to the compiler' aliasing analysis this way.

*retbuf is, but importantly **retbuf is not. This should be the difference between const char ** and char * const *

Well, there are three values, retbuf which constness is only important if completely blatant bug in vn_fullpath1() is added. The *retbuf is just a pointer, and the function assigns to it. And the last, the buffer where *retbuf points to, which is the buffer containing the calculated path.

The buffer was written by the function, this is my point of the aliasing breakage. And I do not see why the recipient of the pointer to the result buffer must be disallowed from modifying it.

vn_fullpath* must not attempt to modify *retbuf as-is since it's not guaranteed that it's pointing to a region where it's safe to do so; and in-fact it's not safe in some of the callers where it is initially valid.

The callers could maybe modify it, but as a caller the signature and description of vn_fullpath doesn't leave me thinking that it will necessarily be safe to do so.

First, one must jump through the small hoop to make sure that *retbuf isn't the buffer you passed in for those circumstances where you shouldn't be modifying the buffer you passed in as the caller -- unless you set that up after, e.g. fdescfs's use.

Second, we've got two buffers having been returned: one that we may free (freebuf), and one of unknown origin that may point into freebuf but there's no guarantee that it will do so and it's not clear that there's a good way to check this without knowing the extent of *freebuf since the vn_fullpath* family will, AFAICT, build up the path from the end of the allocated buffer if it has done so. This one may just be a documentation concern, as we use the weak wording (emphasis mine) "may (on success) point at a newly allocated buffer containing the resulting pathname" rather than asserting that it will either remain unchanged or point into *freebuf.

In D23802#523379, @kib wrote:

The buffer was written by the function, this is my point of the aliasing breakage.

A const pointer does not require that the pointed-to object remain unchanged; it merely suggests that it not be modified via the pointer. There may be other non-const pointers to the same object or const may be cast away.

(On the other hand, writing to an object declared const or to a string literal is undefined behaviour, and often causes a segmentation fault in practice.)

And I do not see why the recipient of the pointer to the result buffer must be disallowed from modifying it.

This is indeed a trade-off. Logically, *retbuf is either a copy of its original value or borrows from *freebuf. In the former case it should inherit the constness and in the latter case it is safe to modify. This is much like the constness issue with strchr().

In practice, though, it seems like most callers want to initialize *retbuf with a string literal, and do not need the ability to write to **retbuf.

Also, if it is guaranteed that a buffer is allocated in a successful call, then *freebuf + (*retbuf - *freebuf) is a deconsted pointer to the result that does not involve a cast.

In D23802#523379, @kib wrote:

The buffer was written by the function, this is my point of the aliasing breakage.

A const pointer does not require that the pointed-to object remain unchanged; it merely suggests that it not be modified via the pointer. There may be other non-const pointers to the same object or const may be cast away.

Yes, but why is it useful even slightly to prevent modifications to (successfull) result of the vn_fullpathX() ?

(On the other hand, writing to an object declared const or to a string literal is undefined behaviour, and often causes a segmentation fault in practice.)

And I do not see why the recipient of the pointer to the result buffer must be disallowed from modifying it.

This is indeed a trade-off. Logically, *retbuf is either a copy of its original value or borrows from *freebuf. In the former case it should inherit the constness and in the latter case it is safe to modify. This is much like the constness issue with strchr().

In practice, though, it seems like most callers want to initialize *retbuf with a string literal, and do not need the ability to write to **retbuf.

But this has nothing to do with the function signature. It is callers trying to save single line of error checking path and the bizarre vn_fullpath() interface causing them to mis-use out argument, relying on the function not modifying it in case of non-successful return.

Also, if it is guaranteed that a buffer is allocated in a successful call, then *freebuf + (*retbuf - *freebuf) is a deconsted pointer to the result that does not involve a cast.

This is another illustration of the same strangeness of the interface. I would suggest to not depend on this detail, despite it looking 'obvious' and safe.

I think in terms of const semantics, this is okay, although it does seem accidental and undesirable to prevent callers from modifying the buf on success as a side-effect.

I agree the interface is strange. It would seem simpler to me if it were just

int vn_fullpath(struct thread *td, struct vnode *vp, char **retbuf);

And have *retbuf set unconditionally, either to a path on success or else to NULL.

EXAMPLE:
const char *pathstr;
char *pathbuf;
vn_fullpath(td, vp, &pathbuf); /* pathbuf == NULL iff error != 0 */
pathstr = (pathbuf != NULL) ? pathbuf : "unknown";
printf("%s\n", pathstr);
free(pathbuf, M_TEMP);

That would obviously mean changing logic in callers, which maybe isn't worth the effort. We do seem to be touching (nearly?) all callers here...

I agree the interface is strange. It would seem simpler to me if it were just

int vn_fullpath(struct thread *td, struct vnode *vp, char **retbuf);

Maybe with a compatibility macro wrapper

#define	vn_fullpath(td, vp, pretbuf, pfreebuf)				\
({									\
	char **__pbuf;							\
	int __ret;							\
	__pbuf = (pfreebuf);						\
	__ret = vn_fullpath1((td), (vp), __pbuf);			\
	if (__ret == 0)							\
		*(pretbuf) = *__pbuf;					\
	__ret;								\
})

That allows the caller to use the old pattern with pretbuf being const or non-const as the caller likes.