Page MenuHomeFreeBSD

Fix build with DRM and INVARIANTS enabled.
ClosedPublic

Authored by jhb on Jul 18 2019, 10:00 PM.

Details

Summary

The DRM drivers use the lockdep assertion macros with spinlock_t locks
which are backed by mutexes, not sx locks. This causes compile failures
since you can't use sx_assert with a mutex. As a hack for now, just
disable these macros.

Test Plan
  • build drm-current-kmod sources as part of a kernel build with INVARIANTS enabled

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

jhb created this revision.Jul 18 2019, 10:00 PM

I don't think the drm kmods are built with INVARIANTS, unless it's somehow passed into the build environment when building out of tree kmods. In general, are there other options added when building the DRM kmods (or other kmods) as part of the kernel build?

It is probably not enough to just build test this, it probably need a run time test as well. The lkpi in general is also used by other drivers, although I don't know if this part of it is.

I would like for @johalun and @hselasky to weight in on this as well.

What about lockedep_is_held() ??

jhb added a comment.Jul 20 2019, 4:20 PM

@zeising: With the related changes (see the stack, and you've commented on the drm-current-kmod one), a normal GENERIC kernel on head defines INVARIANTS and trips over this. If you apply the other two changes and install the updated package and try to build GENERIC you will hit this.

@hselasky: The drm modules don't currently use lockdep_is_held. The longer-term solution I think would be to change struct mutex and spinlock_t so that these macros could access the nested lock_object member similar to how cv_wait works, though in this case you'd have to inline it a bit. For example, if both struct mutex and spinlock_t both contained a single member named lock you could do something like:

#ifdef INVARIANTS
#define lockdep_assert_held(l) \
    (LOCK_CLASS(&(l)->lock_object).lc_assert(&(l)->lock_object, LA_XLOCKED)
#else
#define lockdep_assert_held(l) do { } while (0)
#endif

Actually, I had forgotten about 'lc_assert', that actually makes the real fix not too painful. For 'lockdep_is_held' we might have to add a new method to 'lock_class' but that's fairly easy. Alternatively, if GCC and clang support some kind of compile-time 'typeof' type extension we could have it switch on the type to decide what to do.

@zeising : Do you use CONFIG_LOCKDEP in drm-next ?

#if IS_ENABLED(CONFIG_LOCKDEP)

GEM_BUG_ON(debug_locks &&
           !!lockdep_is_held(&rq->i915->drm.struct_mutex) !=
           !!(flags & I915_WAIT_LOCKED));

#endif

Else I suggest we simply stub all lockdep functions by default.

Can you update the patch?

--HPS

Just remove lockdep_is_held(m) .

In D20992#455803, @jhb wrote:

@zeising: With the related changes (see the stack, and you've commented on the drm-current-kmod one), a normal GENERIC kernel on head defines INVARIANTS and trips over this. If you apply the other two changes and install the updated package and try to build GENERIC you will hit this.

I'm not sure I understand this comment.
The kmod is built using the kmod facility in ports, and bsd.kmod.mk. I don't know if INVARIANTS and other options gets set by that, but it seems they don't. If the build when building out of tree (building the port) vs. building in tree is different because of this (and other options), this would mean we get different drivers depending on how they are built, and this feels like it's going to become a support nightmare. Perhaps options such as these should be stripped before building ports modules together with source, to avoid this issue.

@zeising : Do you use CONFIG_LOCKDEP in drm-next ?

Looking at the drm-current-kmod-soruces, CONFIG_LOCKDEP seems to not be defined. See https://github.com/FreeBSDDesktop/kms-drm/blob/drm-v5.0/drivers/gpu/drm/drm_os_config.h#L107 .

jhb added a comment.Jul 31 2019, 11:08 PM
In D20992#455803, @jhb wrote:

@zeising: With the related changes (see the stack, and you've commented on the drm-current-kmod one), a normal GENERIC kernel on head defines INVARIANTS and trips over this. If you apply the other two changes and install the updated package and try to build GENERIC you will hit this.

I'm not sure I understand this comment.
The kmod is built using the kmod facility in ports, and bsd.kmod.mk. I don't know if INVARIANTS and other options gets set by that, but it seems they don't. If the build when building out of tree (building the port) vs. building in tree is different because of this (and other options), this would mean we get different drivers depending on how they are built, and this feels like it's going to become a support nightmare. Perhaps options such as these should be stripped before building ports modules together with source, to avoid this issue.

It's more like getting reliably working modules on HEAD instead of random crashes and panics due to a mismatch in the kernel and module environment. Standalone modules only work with non-changing kernels (e.g. a user staying on a release) or with modules that use stable, well-supported interfaces like (the graphics drivers do not as they regularly use VM internals). Plus, running the graphics drivers with debugging enabled on HEAD such as INVARIANTS is more likely to find bugs more quickly and more nicely. virtualbox is in a similar situation where it needs to be paired with the running kernel instead of mixing old and new versions of kernel interfaces.

jhb added a comment.Jul 31 2019, 11:09 PM
In D20992#455803, @jhb wrote:

@hselasky: The drm modules don't currently use lockdep_is_held. The longer-term solution I think would be to change struct mutex and spinlock_t so that these macros could access the nested lock_object member similar to how cv_wait works, though in this case you'd have to inline it a bit. For example, if both struct mutex and spinlock_t both contained a single member named lock you could do something like:

#ifdef INVARIANTS
#define lockdep_assert_held(l) \
    (LOCK_CLASS(&(l)->lock_object).lc_assert(&(l)->lock_object, LA_XLOCKED)
#else
#define lockdep_assert_held(l) do { } while (0)
#endif

I can do a variation of this now with the current code by just assuming that the first member of any lock type is a FreeBSD lock and using a cast to 'struct lock_object *'. I will try that approach next.

jhb updated this revision to Diff 60348.Aug 1 2019, 5:01 AM
  • Use LOCK_CLASS and lock_object casts.
In D20992#458955, @jhb wrote:
In D20992#455803, @jhb wrote:

@zeising: With the related changes (see the stack, and you've commented on the drm-current-kmod one), a normal GENERIC kernel on head defines INVARIANTS and trips over this. If you apply the other two changes and install the updated package and try to build GENERIC you will hit this.

I'm not sure I understand this comment.
The kmod is built using the kmod facility in ports, and bsd.kmod.mk. I don't know if INVARIANTS and other options gets set by that, but it seems they don't. If the build when building out of tree (building the port) vs. building in tree is different because of this (and other options), this would mean we get different drivers depending on how they are built, and this feels like it's going to become a support nightmare. Perhaps options such as these should be stripped before building ports modules together with source, to avoid this issue.

It's more like getting reliably working modules on HEAD instead of random crashes and panics due to a mismatch in the kernel and module environment. Standalone modules only work with non-changing kernels (e.g. a user staying on a release) or with modules that use stable, well-supported interfaces like (the graphics drivers do not as they regularly use VM internals). Plus, running the graphics drivers with debugging enabled on HEAD such as INVARIANTS is more likely to find bugs more quickly and more nicely. virtualbox is in a similar situation where it needs to be paired with the running kernel instead of mixing old and new versions of kernel interfaces.

I'm not disagreeing about enabling INVARIANTS (or other options) in the modules. What I worry about is that you get a different set of options when building the module from ports (or using packages) compared to when the driver is built together with the kernel. It is possible to rebuild the module after each kernel update, and this is the recommended way to do it on CURRENT or with custom kernels today.
Having the modules built differently when building from ports as opposed to with the kernel, will lead to support issues, I believe.
One idea we had in the graphics team meeting yesterday, to alleviate this is to have some infrastructure, either something like config.h but for the kernel, or a makefile include, or something, that can tell which options the kernel is built with, and use that when building kmods from ports. That doesn't solve the issue however, when kmods are built against other sources than the currently installed system. I'm not sure if this is a good idea or not, but it's an idea.
There is also the issue where people rebuild kernel first, and then update the packages and get the new soruces for drm-kmod (or any kmod), but perhaps that's just solved by telling people to update the package first. However, pkg usually warns if you try to update to packages that are built for a later kernel version than the one you are running on.
These issues are all general issues that I think needs at least a little thought with this infrastructure. They are not specific to this review or to drm-kmods, I just thought I'd raise them, since this work got me thinking about them.

jhb added a comment.Aug 6 2019, 6:56 PM

So with this change people will rebuild the module on current without having to reinstall the port, so I think this will effectively do the recommended change, but even more strongly on current. I think that any pre-built kernel modules are always going to have to be targeted at a kind of lowest-common-denominator. We aim to do this already for standalone kernel module builds (so cd /sys/modules/foo; make is the same problem space).

If you really want to customize kernel modules to the running kernel then I think the next step you might consider is to build modules during the post-install step of the package being installed. However, for people running releases, I think shipping a pre-built module that will work with GENERIC is sufficient. In that sense I'm pretty happy with where the current LOCAL_MODULES scheme ends up.

jhb added a comment.Aug 6 2019, 6:57 PM

@hselasky Do you have any thoughts on the LOCK_CLASS-based approach? Is this something you can test? I've only tested it with drm.

In D20992#460070, @jhb wrote:

So with this change people will rebuild the module on current without having to reinstall the port, so I think this will effectively do the recommended change, but even more strongly on current. I think that any pre-built kernel modules are always going to have to be targeted at a kind of lowest-common-denominator. We aim to do this already for standalone kernel module builds (so cd /sys/modules/foo; make is the same problem space).
If you really want to customize kernel modules to the running kernel then I think the next step you might consider is to build modules during the post-install step of the package being installed. However, for people running releases, I think shipping a pre-built module that will work with GENERIC is sufficient. In that sense I'm pretty happy with where the current LOCAL_MODULES scheme ends up.

I'm not sure I follow what you mean here. What I'm worried about are a couple of things, one, is that if you build the driver from the port (as is done now), or if it's built with the kernel, will yield different modules, because different kernel options, such as INVARIANTS, are applied. This will lead to support issues in the end. If you build the kernel module from ports today, it will match the sources you have installed in /usr/src (or SRCDIR, I think), and thusly also match your kernel version.
The other issue is that if you update the port after you've built the new kernel, you have built the old version of the driver, not the one you just updated to. This can lead to surprises, as well as issues if the kmod needs to be updated to be buildable on the new system (which, with drm drivers, has happened multiple times). The solution is to tell people to update the package (with the new source) before building a new kernel, but pkg, in general, won't let you do that if the packages in pkg are built against a newer kernel version.

This is mostly issues on CURRENT, and to a lesser extent STABLE, for releases, the pre-packaged kernel modules should work, with one notable exception. Packages are built on the lowest common release, so, since both 11.2 and 11.3 are supported, all packages are built on 11.2, which can cause issues wit kmods. This is however, out of scope for this discussion. :)

hselasky added inline comments.Aug 7 2019, 8:02 AM
sys/compat/linuxkpi/common/include/linux/lockdep.h
66 ↗(On Diff #60348)

Shared locks do not have an owner. How do we handle this?

@hselasky Do you have any thoughts on the LOCK_CLASS-based approach? Is this something you can test? I've only tested it with drm.

@jhb I'm probably fine with the lock-class approach. Though I would really like these functions stubbed and not used in FreeBSD. On the other hand asserting locks can be useful for debugging.

jhb added inline comments.Aug 7 2019, 4:04 PM
sys/compat/linuxkpi/common/include/linux/lockdep.h
66 ↗(On Diff #60348)

So I wasn't 100% sure what the right semantic is for lockdep_is_held. lc_owner() returns true if there is either a shared or exclusive lock. However, AFAIK, the only locks we use in linuxkpi are spinlock_t and 'struct mutex' both of which only use exclusive locks and never use sx_slock, so in practice this would be ok? If Linux includes some kind of reader/writer lock that we support in linuxkpi, and Linux supports lockdep_is_held for that lock, then we would need to ensure we match whatever semantics Linux uses in that case. I was hopeful that you might be able to help answer those questions?

Hi,

Are we certain that lockep() is not used with read_lock()? Also I don't like that cast to lock_object. It would be better to do this clean by referring to the lock object field in the lock, assuming they are named then same. Or use WITNESS for selected locks, which is much better. The reason WITNESS is currently disabled for LinuxKPI spinlocks is that spinlocks are not destroyed under Linux, so the WITNESS object never goes away.

common/include/linux/rwlock.h:
#define	read_lock(_l)		rw_rlock(&(_l)->rw)

--HPS

jhb added a comment.Aug 7 2019, 7:11 PM

Hi,
Are we certain that lockep() is not used with read_lock()? Also I don't like that cast to lock_object. It would be better to do this clean by referring to the lock object field in the lock, assuming they are named then same. Or use WITNESS for selected locks, which is much better. The reason WITNESS is currently disabled for LinuxKPI spinlocks is that spinlocks are not destroyed under Linux, so the WITNESS object never goes away.

The reason I did the cast is that the subfields of spinlock_t and struct mutex are not the same, so I can't depend on, e.g. &(l)->m.lock_object. We can rename the fields if you want, but that's a much larger change. That could also be done as a followup that removes the casts.

common/include/linux/rwlock.h:
#define	read_lock(_l)		rw_rlock(&(_l)->rw)

Ok, can you tell me what the semantic in Linux is for lockdep_is_held() after on a lock locked via read_lock()?

Hi,

If we look at Linux all the LOCKDEP stuff is under ifdef:

#define lockdep_is_held(lock)           lock_is_held(&(lock)->dep_map)
#define lockdep_is_held_type(lock, r)   lock_is_held_type(&(lock)->dep_map, (r))

They add an own object into the various locks to track usage, similar to witness:

typedef struct raw_spinlock {
        arch_spinlock_t raw_lock;
#ifdef CONFIG_DEBUG_SPINLOCK
        unsigned int magic, owner_cpu;
        void *owner;
#endif
#ifdef CONFIG_DEBUG_LOCK_ALLOC
        struct lockdep_map dep_map;
#endif
} raw_spinlock_t;

The LOCKDEP is also supported for rwlocks:

include/linux/rwlock_api_smp.h:		rwlock_acquire_read(&lock->dep_map, 0, 1, _RET_IP_);
include/linux/rwlock_api_smp.h:		rwlock_acquire(&lock->dep_map, 0, 1, _RET_IP_);

For now I think we should just stub LOCKDEP as a feature in FreeBSD. Else we risk incompatibilities, that for example the is lock held returns different values in Linux than in FreeBSD and we get a bunch of false positives.

Ok, can you tell me what the semantic in Linux is for lockdep_is_held() after on a lock locked via read_lock()?

Linux doesn't document this function from what I can see.

jhb added a comment.Aug 8 2019, 4:56 PM

Hi,
If we look at Linux all the LOCKDEP stuff is under ifdef:

#define lockdep_is_held(lock)           lock_is_held(&(lock)->dep_map)
#define lockdep_is_held_type(lock, r)   lock_is_held_type(&(lock)->dep_map, (r))

If you look just below this though you see what the semantics are:

/*
 * Same "read" as for lock_acquire(), except -1 means any.
 */
extern int lock_is_held_type(const struct lockdep_map *lock, int read);

static inline int lock_is_held(const struct lockdep_map *lock)
{
        return lock_is_held_type(lock, -1);
}

So lockdep_is_held() returns 1 if the current thread holds any lock (shared or written) and 0
if it doesn't hold the lock. The existing change I have is about as close to that as we can
get. It will return 1 when other threads hold a read lock and not this thread, but there's
not much we can do about that. If code uses the assertions directly instead of abusing
lockdep_is_held for home-grown assertions it will work correctly though.

@jhb : OK, if you can move lockdep_is_held() under #ifdef INVARIANTS and otherwise define lockdep_is_held() as 1, I think we are in-line.

jhb added a comment.Aug 13 2019, 4:07 PM

@jhb : OK, if you can move lockdep_is_held() under #ifdef INVARIANTS and otherwise define lockdep_is_held() as 1, I think we are in-line.

Ok. I suspect that the existing lockdep_is_held is a bit closer to the truth (it can still return a false positive of 1 if some other thread holds a share lock) as opposed to always returning a false positive of 1, but I can move it under invariants.

jhb updated this revision to Diff 60737.Aug 13 2019, 4:10 PM
  • Map lockdep_is_held 1 for !INVARIANTS.
hselasky accepted this revision.Aug 13 2019, 5:40 PM

Looks good!

This revision is now accepted and ready to land.Aug 13 2019, 5:40 PM
This revision was automatically updated to reflect the committed changes.