Page MenuHomeFreeBSD

Per-thread credentials
AbandonedPublic

Authored by cem on Jan 23 2019, 4:45 PM.
Tags
None
Referenced Files
F133409913: D18930.id53314.diff
Sat, Oct 25, 2:34 PM
Unknown Object (File)
Mon, Oct 20, 1:10 AM
Unknown Object (File)
Fri, Oct 17, 3:55 PM
Unknown Object (File)
Thu, Oct 16, 1:52 AM
Unknown Object (File)
Thu, Oct 16, 1:52 AM
Unknown Object (File)
Thu, Oct 16, 1:52 AM
Unknown Object (File)
Thu, Oct 16, 1:52 AM
Unknown Object (File)
Thu, Oct 16, 1:52 AM

Details

Summary

Gandi has a need for per-thread credential api for a userspace file server. This idea isn't new as it was proposed by Isilon and a design was drafted here: https://wiki.freebsd.org/Per-Thread%20Credentials . This differential implements this design.

Interaction with other parts: These syscalls may only be called by priviledged processes, the tainted threads will necessarily be "dropping" privileges from the process, so all cred checks (cansched, candebug, cansee, etc) will be less permissive for the tainted threads.

Process tainting: at the moment, tainting a thread will taint the containing process, to comply with issetugid(2). However there is no way to untaint a process once setcred(2) has been called because this would mean having to check every thread. A solution to this would be to add a per-process counter of tainted threads and only untaint the process iff the counter hits 0.

Jails: Tainted threads inherit prisons from the containing process, I'm not familiar with the jail internals in the kernel but as far as I can tell the existing mechanisms with processes should hold.

Auditing: a PR has been opened with the openbsm project here to reserve event numbers 4326[3-5]. I've also added some code in sys/security/audit/audit_bsm.c to encode arguments.

For visibility, I added the new information to ddb. Specifically the ps command now displays the thread in the UID column if the thread is tainted (used to be empty space), and a more complete breakdown with all the GIDs is available with show thread ... command.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 22214
Build 21423: arc lint + arc unit

Event Timeline

rwatson requested changes to this revision.Jan 23 2019, 5:55 PM
rwatson added inline comments.
sys/kern/kern_prot.c
233–235

Is this change safe without holding the process lock? Historically, p_ucred hasn't been a stable pointer without it.

259–264

Is this change safe without holding the process lock? Historically, p_ucred hasn't been a stable pointer without it.

278–286

Is this change safe without holding the process lock? Historically, p_ucred hasn't been a stable pointer without it.

309–314

Is this change safe without holding the process lock? Historically, p_ucred hasn't been a stable pointer without it.

343–344

Is this change safe without holding the process lock? Historically, p_ucred hasn't been a stable pointer without it.

1252–1253

Is this change safe without holding the process lock? Historically, p_ucred hasn't been a stable pointer without it.

1291–1292

Is this change safe without holding the process lock? Historically, p_ucred hasn't been a stable pointer without it.

2390

Is this change safe without holding the process lock? Historically, p_ucred hasn't been a stable pointer without it.

2446

Is this change safe without holding the process lock? Historically, p_ucred hasn't been a stable pointer without it.

2473

Is this change safe without holding the process lock? Historically, p_ucred hasn't been a stable pointer without it.

This revision now requires changes to proceed.Jan 23 2019, 5:55 PM

A quick question:

Right now, certain types of credential changes are allowed to diverge for per-thread credentials, but others aren't. For example, RES UID and GIDs are handled, as are additional groups. What are the implications of not making other properties such as jail, audit, and MAC labels per-thread?

lib/libc/sys/setcred.2
56

line break after .

77

This isn't quite true in that it's unchanged if uid is an invalid pointer. Changing the implementation to allow getcred(NULL, &an_int, NULL) probably makes sense.

sys/kern/kern_prot.c
2452

Personal style thing, I'd make this sizeof(*uap->uid) so the size is the declared size of the destination.

2461

else not needed.

sys/kern/syscalls.master
3174
_Out_writes_(*gidsetlen) gid_t *gidset

(Need to dereference to get the length.)

Is this change safe without holding the process lock? Historically, p_ucred hasn't been a stable pointer without it.

clearly I overlooked this, I'm going to restore the previous mechanism: thread local storage of the process credential (the old use of td->td_ucred but in a new td->p_ucred)

A quick question:

Right now, certain types of credential changes are allowed to diverge for per-thread credentials, but others aren't. For example, RES UID and GIDs are handled, as are additional groups. What are the implications of not making other properties such as jail, audit, and MAC labels per-thread?

I'm going to need to investigate these subsystems in order to prove there aren't any harmful implications.

@rwatson: The patch has been updated to take into account the other parts of credentials: jail, MAC, audit and flags are written to thread credential in order for them to always be in sync with the process credentials.

I also added locks when using the process credential as you mentionned, although with a fastpath that will be taken most of the time, not impacting existing userspaces.

lib/libthr/thread/thr_cred_np.c
43

What is the reasoning behind this pthread wrappers ? What functionality do they provide beyond what is available using libc sycalls wrappers ?

sys/kern/kern_jail.c
2406 ↗(On Diff #53200)

What is the purpose of this lock acq ?

2408 ↗(On Diff #53200)

So right after this call, if the td thread is running and accesing its td_ucred, it might deref a freed memory, am I right ?

lib/libthr/thread/thr_cred_np.c
43

at this moment the added functionality is none. from the wiki page on the design: "We don't want to assume that the FreeBSD kernel will always provide a 1-to-1 mapping between userspace threads and kernel threads. Thus we add an explicit pthread_getcred_np() function to retrieve the uid/gid/groups from the thread."

Moved the process to thread credential syncronization logic into thread_cow_update instead of everywhere scattered around the codebase.

Also added thread credential information in procstat -s.

mjg added inline comments.
sys/kern/kern_prot.c
332

This cannot be legit. kern_getgroups performs copyout which can block indefinitely. Unfortunately the current routines don't provide any warning even with debug enabled. I may add something to change it. There are probably other places with the same problem.

@mjg: moved copyout outside of PROC_LOCK for getgroups, getresuid and getresgid.

I don't have an opinion about the userspace API. It does seem a little bit fishy that there is no tight control from that end. I would expect a fully-privileged daemon to get create a credfd an allow certain uids/gids to be switched to. Then it can drop privs. But again, I did not think this through.

As for the current code, thread_cow_update use to revert threads is an abuse imo. There should be a dedicated func to take care of it, which would also be called by this one.

I think struct thread should get extended -- what's td_ucred in the current kernel should be moved to td_pucred. I.e. td_pucred is a pointer to refed proc credentials. Threads which don't use per-thread creds would happen to have td_ucred == td_pucred. Such an approach would mitigate a lot of complexity. In particular code which reports creds an branche on TD_IS_SUGID can blindly go for td_pucred. This also allows cred testing without grabbing the proc lock. I think this approach is significantly more future-proof as it does not add the new constraint (process lock) to code which needs to check non-sugid creds.

Even without the pointer it would probably be better to

if (!TD_IS_SUGID(td) {
     cred = td->td_ucred;
} else {
     PROC_LOCK(p);
     cred = crhold(p->p_cred);
     PROC_UNLOCK(p);
}
.....

@mjg: good call, this revision is likely the best regarding performance, and code complexity. As for future-proofing, it's obvious that process credentials will always be necessary for backwards compatibility so caching them entirely in each thread is a good idea.

The reference hold/free logic was wrong. now in the case where td_ucred==td_pucred only one reference is held.

That's an error-prone approach. You should always have separate references for both ucred and pucred.

@mjg: reverted back to one reference per object. You're right, better safe than sorry.

Also removed credential checking for getcred(2), a thread should always be able to check its own credential.

In D18930#406796, @mjg wrote:

I don't have an opinion about the userspace API. It does seem a little bit fishy that there is no tight control from that end. I would expect a fully-privileged daemon to get create a credfd an allow certain uids/gids to be switched to. Then it can drop privs. But again, I did not think this through.

I'm not sure what you mean by this? To clarify the current design: a root process can set individual thread credentials, and then can drop process credentials, in this case the thread will not be able to revert back its credentials.

So I'm in the process of reviewing this. I don't have a complete review yet, but I do have some comments.

In D18930#406796, @mjg wrote:

I don't have an opinion about the userspace API. It does seem a little bit fishy that there is no tight control from that end. I would expect a fully-privileged daemon to get create a credfd an allow certain uids/gids to be switched to. Then it can drop privs. But again, I did not think this through.

I'm not sure what you mean by this? To clarify the current design: a root process can set individual thread credentials, and then can drop process credentials, in this case the thread will not be able to revert back its credentials.

I'm saying a better design would have possible credentials ready for use and would tell the thread what to switch to. One way to do it would be to create a new facility -- credential file descriptor. Instead of setcred accepting some numbers, you would accept the fd. There are at least 2 major wins here: 1. creds can be prepared upfront and root privs dropped (assuming the total number of creds is low enough to make it viable) 2. the kernel has all the data allocated so it's a matter of just referencing it. instead of costly construction of new creds, you just td->td_ucred = crhold(newcreds); and of course deref old ones. However, I don't insist on this. Some form of cred caching may be useful here though.

sys/kern/kern_proc.c
977

why is this moved?

sys/kern/kern_prot.c
885–886

It's very iffy to have a kern_ prefixed function expect a locked process. More importantly though, since such an expectation is now in place, it should assert that it holds.

sys/kern/kern_thread.c
542

My major gripe is here. I already noted manipulations below should be moved elsewhere. But more importantly they look extremely iffy. credentials are supposed to be copy-on-write -- after they are initialized, they never change. Yet here several changes are made and I don't see any assurance the creds being modified are not referenced elsewhere.

sys/kern/kern_proc.c
977

credential information is moved into the thread-specific function below. This is relevant for the parts of the credential that become thread-specific eith this patch, for the parts that still aren’t thread specific: I moved them anyway to keep consistency in the code and future proof further credential changes.

I've tried to justify the current implementation as best I can to see if it can get past the finish line instead of going for a credfd solution (which I don't think I'll have the will to design and implement).
For the credfd solution, I agree with the wins, although I think it would be expensive to prepare everything upfront, also that design doesn't solve the copy-on-write problem you mentioned.

sys/kern/kern_prot.c
885–886

I can add an assert, I could also suffix with '_locked' if that helps

sys/kern/kern_thread.c
542

I agree placing this here isn't ideal, the first intention wasn't to do this (see previous patch versions).
To be clear the problem we are solving is when parts of the process credential change, the thread credential needs to catch up. The other option is to do this when doing this at the time of change (ie: when the jail changes), we needed to add logic and locking scattered around the code base, it made much more sense to bundle everything here.
About the credential being immutable, again this isn't ideal but was done for speed, if this is blocking I could always crdup each time and free the old thread credential.

I haven't looked too closely at the actual diff. I assume Eric tagged me into this for some reason, so I'll echo my earlier remarks:

We ended up with special credential file descriptors, rather than using uid_t's and gid_t's directly, because of a need for compatibility with arbitrary Windows LDAP users ("SID"s?) not present in the local id database.

The APIs we use today look like:

663     AUE_NULL        STD     { int modifytcred2(int fd, \
                                    struct native_token *token, \
                                    int flags); }
664     AUE_NULL        STD     { int modifytcred(int fd1, int fd2, \
                                    int flags); }
665     AUE_NULL        STD     { int accesstcred(char *path, int flags, \
                                    int fd); }
666     AUE_NULL        STD     { int buildtcred(struct native_token *token, \
                                    int current); }
667     AUE_NULL        STD     { int gettcred(char *user, int thread); }
668     AUE_NULL        STD     { int settcred(int fd, int flags, \
                                    struct native_token *token); }
669     AUE_NULL        STD     { int reverttcred(void); }
670     AUE_NULL        STD     { int restricttcred(int fd, struct
native_token *token); }

Which lines up with mjg's concern, I think.

To expand on that a bit, struct native_token are constructed in userspace and contain any relevant authorization information. It is associated with a set of userspace manipulation APIs below. Not all of this is necessary relevant to FreeBSD, and it's possible FreeBSD would want to expand (something like) this with MAC label, jail, etc.

/**
 * Native token.
 * Represents the authorization information of a thread.
 *
 * @ingroup native_token
 */
struct native_token {
        u_int            nt_fields;     /* valid fields */
#define NTOKEN_UID         0x00000001
#define NTOKEN_GID         0x00000002
#define NTOKEN_SID         0x00000004
#define NTOKEN_GSID        0x00000008
#define NTOKEN_GROUPS      0x00000010
#define NTOKEN_IPRIVS      0x00000020
#define NTOKEN_LADDR       0x00000040
#define NTOKEN_RADDR       0x00000080
#define NTOKEN_PROTOCOL    0x00000100
#define NTOKEN_ZID         0x00000200
#define NTOKEN_FIELD_MASK  0x000003FF
#define NTOKEN_USEUID      0x00010000
#define NTOKEN_USEGID      0x00020000
#define NTOKEN_DENYIFS     0x00040000
#define NTOKEN_BACKUP      0x00080000   /* Used with ISI_PRIV_IFS_BACKUP */
#define NTOKEN_RESTORE     0x00100000   /* Used with ISI_PRIV_IFS_RESTORE */
#define NTOKEN_FLAG_MASK   0x001F0000
        uid_t            nt_uid;        /* user id */
        struct persona  *nt_sid;        /* user sid */
        gid_t            nt_gid;        /* group id */
        struct persona  *nt_gsid;       /* group sid */
        struct persona **nt_groups;     /* group personae */
        short            nt_ngroups;    /* number of groups */
        struct isi_priv *nt_iprivs;     /* isilon privileges */
        short            nt_niprivs;    /* number of isilon privileges */
        zid_t            nt_zid;        /* zone id */
        struct in6_addr  nt_laddr;      /* local IP address */
        struct in6_addr  nt_raddr;      /* remote client IP address */
        short            nt_protocol;   /* see enum isp_protocol */
};
...
struct native_token * ntoken_get(void) __malloc_like;
void ntoken_clear(struct native_token *token);
void ntoken_free(struct native_token *token);
bool ntoken_extend_groups(struct native_token *token, int ngroups);
bool ntoken_extend_iprivs(struct native_token *token, int niprivs);
struct fmt_conv_ctx native_token_fmt(const struct native_token *p);
struct fmt_conv_ctx native_token_json_fmt(const struct native_token *p);
struct native_token *ntoken_copy(const struct native_token *nt);
bool ntoken_copy_contents(struct native_token *dest, const struct native_token *src);
bool ntoken_copy_partial(struct native_token *token, const struct native_token *nt, uint32_t fields);
/**
 * Compare two tokens. @a nt2 matches @a nt1 if all fields in @a nt1 match
 * those in @a nt2.  @a nt2 might have more fields defined than @a nt1
 */
bool ntoken_matches(const struct native_token *nt1,
    const struct native_token *nt2);
bool ntoken_equal(const struct native_token *nt1,
    const struct native_token *nt2);

/**
 * Convert a fd credential into a native token
 * @ingroup native_token
 */
struct native_token *fd_to_ntoken(int fd);

(IPv4 IPs are v4mapped into the in6 laddr/raddr; fmt_print() is basically sbuf, but fmt_conv_ctx is a nice extension allowing structure formatting of sorts with %{}.)

lib/libthr/thread/thr_cred_np.c
43

That document was probably written against FreeBSD 7 or earlier and had KSE fresh in mind. I don't know if FreeBSD will ever do M:N threading again, and this API can be added if it ever does?

Hello FreeBSD maintainers,

I'm not employed at Gandi anymore so I haven't been working on this patch, and have no intention to in the future. You can close this revision. However, It would be nice to keep the revision public so the next person working on per-thread credentials can draw inspiration and maybe choose a different implementation.

Thank you for taking the time to review this, and for merging the other patches that got through! farewell

cem added a reviewer: jack_gandi.net.

Thanks for the head's up, Jack! Closed revisions stay public, to the best of my knowledge.