Page MenuHomeFreeBSD

linsysfs: Reimplement bus scan code.
Needs ReviewPublic

Authored by dchagin on Feb 12 2023, 3:52 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Oct 21, 5:53 PM
Unknown Object (File)
Tue, Oct 21, 3:06 PM
Unknown Object (File)
Fri, Oct 17, 1:34 PM
Unknown Object (File)
Sun, Oct 12, 7:33 AM
Unknown Object (File)
Sun, Oct 12, 7:33 AM
Unknown Object (File)
Sat, Oct 11, 2:27 PM
Unknown Object (File)
Sat, Oct 11, 12:00 PM
Unknown Object (File)
Sat, Oct 11, 12:00 PM
Subscribers

Details

Reviewers
kib
Group Reviewers
linuxkpi
Summary

The linsysfs code continues to grow, so split bus scan code to a separate
functions, add some helpers.
The primary goal is to scan bus as much as it's needed. The first scan create
the directory structure, other scans can use it, finding nodes by pci device.
Rewrote scsi and drm code using new helpers.

  1. Yes, the change is big, but it doesn't make sense to me to optimize code

and then completely rewrite it.

  1. The drm code needed more attention, as it derefences device_t to lkpi

struct device. Perhaps here make sence to do differently.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 49820
Build 46711: arc lint + arc unit

Event Timeline

If this works, I am fine with it.

sys/compat/linsysfs/linsysfs.c
78

Fix style, continuation should use 4-space indent.

96

What protects the list manipulations?

sys/compat/linsysfs/linsysfs.h
31
dchagin added inline comments.
sys/compat/linsysfs/linsysfs.c
96

What protects the list manipulations?

For now it's not protected, as well as bus scan code,
both used only at the module load/unload time.
I plan to add protection, ponders how to handle device add/remove.

sys/compat/linsysfs/linsysfs.c
96

Giant is most likely held there, but you allocate sleepable.

I think it is worth adding a comment mentioning the locking state. Also, you might consider starting using bus_topo_lock() there, even if formally.

locking done, now bus scan code unsleepable,
there is still a problem with tracking the life of refs to device_t,
however while the drm-kmod doesn't know how to unload, that's ok

locking done, now bus scan code unsleepable,
there is still a problem with tracking the life of refs to device_t,
however while the drm-kmod doesn't know how to unload, that's ok

It knows how to unload, however it's true that currently this panic but I'm working on fixing this and keep it stable as we need this for work.

In D38545#878793, @manu wrote:

locking done, now bus scan code unsleepable,
there is still a problem with tracking the life of refs to device_t,
however while the drm-kmod doesn't know how to unload, that's ok

It knows how to unload, however it's true that currently this panic but I'm working on fixing this and keep it stable as we need this for work.

ah, I wrote as well as I could, I don't mean to criticize, I read Linux code every day and I understand how hard your work is.

sys/compat/linsysfs/linsysfs.c
85

So the possible failure is not caused by any persistent conditions, it is just transient internal unability to allocate memory. It is bad, IMO.

Just lock with bus_topo for now.

In D38545#878793, @manu wrote:

locking done, now bus scan code unsleepable,
there is still a problem with tracking the life of refs to device_t,
however while the drm-kmod doesn't know how to unload, that's ok

It knows how to unload, however it's true that currently this panic but I'm working on fixing this and keep it stable as we need this for work.

ah, I wrote as well as I could, I don't mean to criticize, I read Linux code every day and I understand how hard your work is.

Don't worry I didn't take it badly, I was just saying that you shouldn't rely on the fact that currently drm-kmod doesn't unload because this is a bug :)

I still do not understand what are you doing.

You need to lock the newbus subsystem to safely access it, no? The newbus lock is bus_topo_lock(). You need to use it, then you can reuse it for your structures. The only quirk is that right now bus_topo_lock() is Giant so it is dropped on sleep.

sys/compat/linsysfs/linsysfs.h
35

Indents there definitely do not follow style

In D38545#879586, @kib wrote:

I still do not understand what are you doing.

You need to lock the newbus subsystem to safely access it, no? The newbus lock is bus_topo_lock(). You need to use it, then you can reuse it for your structures. The only quirk is that right now bus_topo_lock() is Giant so it is dropped on sleep.

bus_topo_lock is acquired below in linsysfs_bus_scan,
however, I have a one question before any changes further, does device tree persistent? If not, should I after any sleep, rescan device tree?
Also, do we have plan to replace Giant here by normal mutex? If the tree is not persistent, this complicates things

In D38545#879586, @kib wrote:

I still do not understand what are you doing.

You need to lock the newbus subsystem to safely access it, no? The newbus lock is bus_topo_lock(). You need to use it, then you can reuse it for your structures. The only quirk is that right now bus_topo_lock() is Giant so it is dropped on sleep.

bus_topo_lock is acquired below in linsysfs_bus_scan,
however, I have a one question before any changes further, does device tree persistent? If not, should I after any sleep, rescan device tree?
Also, do we have plan to replace Giant here by normal mutex? If the tree is not persistent, this complicates things

What do you mean by persistence? Imagine that somebody plugs in a USB/COM adapter, or removes a hot-pluggable PCIe device. Of course, it is reflected in the newbus structure.

On the other hand, right now newbus is not really locked. It is somewhat protected by the Giant, but because Giant is dropped for sleep, the protection is somewhat cheesy. Locking newbus for real is quite non-trivial, because the changes affect all drivers, and this is a lot of testing. Another hard aspect is that many drivers use Giant special properties (recursion, autodrop on sleep), which is impossible to emulate with regular locks. At least several attempts to lock newbus failed.

bus_topo_lock() is mostly an indicator for the places which needs attention for newbus locking. This is why I want it to be present in your changes, and why I am opposed to add specific local locking scheme.

In D38545#879892, @kib wrote:
In D38545#879586, @kib wrote:

I still do not understand what are you doing.

You need to lock the newbus subsystem to safely access it, no? The newbus lock is bus_topo_lock(). You need to use it, then you can reuse it for your structures. The only quirk is that right now bus_topo_lock() is Giant so it is dropped on sleep.

bus_topo_lock is acquired below in linsysfs_bus_scan,
however, I have a one question before any changes further, does device tree persistent? If not, should I after any sleep, rescan device tree?
Also, do we have plan to replace Giant here by normal mutex? If the tree is not persistent, this complicates things

What do you mean by persistence? Imagine that somebody plugs in a USB/COM adapter, or removes a hot-pluggable PCIe device. Of course, it is reflected in the newbus structure.

On the other hand, right now newbus is not really locked. It is somewhat protected by the Giant, but because Giant is dropped for sleep, the protection is somewhat cheesy. Locking newbus for real is quite non-trivial, because the changes affect all drivers, and this is a lot of testing. Another hard aspect is that many drivers use Giant special properties (recursion, autodrop on sleep), which is impossible to emulate with regular locks. At least several attempts to lock newbus failed.

bus_topo_lock() is mostly an indicator for the places which needs attention for newbus locking. This is why I want it to be present in your changes, and why I am opposed to add specific local locking scheme.

Kib is right: there are plans for making this better. Experiments look promising, but I've not had enough time to torture test them. Having bus_topo_lock() etc will show where we may need to change or at least analyze

In D38545#879892, @kib wrote:
In D38545#879586, @kib wrote:

I still do not understand what are you doing.

You need to lock the newbus subsystem to safely access it, no? The newbus lock is bus_topo_lock(). You need to use it, then you can reuse it for your structures. The only quirk is that right now bus_topo_lock() is Giant so it is dropped on sleep.

bus_topo_lock is acquired below in linsysfs_bus_scan,
however, I have a one question before any changes further, does device tree persistent? If not, should I after any sleep, rescan device tree?
Also, do we have plan to replace Giant here by normal mutex? If the tree is not persistent, this complicates things

What do you mean by persistence? Imagine that somebody plugs in a USB/COM adapter, or removes a hot-pluggable PCIe device. Of course, it is reflected in the newbus structure.

On the other hand, right now newbus is not really locked. It is somewhat protected by the Giant, but because Giant is dropped for sleep, the protection is somewhat cheesy. Locking newbus for real is quite non-trivial, because the changes affect all drivers, and this is a lot of testing. Another hard aspect is that many drivers use Giant special properties (recursion, autodrop on sleep), which is impossible to emulate with regular locks. At least several attempts to lock newbus failed.

bus_topo_lock() is mostly an indicator for the places which needs attention for newbus locking. This is why I want it to be present in your changes, and why I am opposed to add specific local locking scheme.

Thakn you for clarification. I understand.
I spent some time to adapt linsysfs and others to ifAPI. And found that anyway I need some generic way to create pseudofs nodes in an unsleepable context. As there two places here now where it needed - bus scan code, as Giant would be replaced after all, and the listnics code which should be rewrited to iterate over ifnets in the net epoch. Also both of this functions should start by eventhandler, to handle device attach/detach etc, not only at mount time.
I ponders the best way I can use, may be thread taskqueue?
Or I missing something?