
Update LinuxKPI layer in FreeBSD-11-current
Abandoned, Public

Authored by hselasky on Oct 7 2015, 8:06 AM.

Details

Summary

High-level list of changes:

  • import Linux KPI changes from DragonflyBSD
  • adapt the iounmap() function to FreeBSD, by adding a second argument
  • adapt the vunmap() function to FreeBSD, by adding a second argument
Test Plan

Test drivers using current code.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

hselasky retitled this revision to Update LinuxKPI layer in FreeBSD-11-current.
hselasky updated this object.
hselasky edited the test plan for this revision.
hselasky added reviewers: np, dumbbell.
hselasky set the repository for this revision to rS FreeBSD src repository - subversion.

Use ffsl() instead of flsl() in the find_next_bit() and find_next_zero_bit() functions. This fixes a regression.
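To illustrate why the choice of primitive matters here, the following is a minimal userspace sketch, not the LinuxKPI code. All names prefixed with sketch_ are hypothetical; the two helpers only mimic the documented semantics of ffsl(3) (lowest set bit) and flsl(3) (highest set bit). A "next bit" search needs the lowest set bit at or after the offset, which is what the find-first-set primitive provides.

```c
#include <limits.h>

#define SKETCH_BITS_PER_LONG ((unsigned long)(sizeof(long) * CHAR_BIT))

/* Stand-in for ffsl(3): 1-based index of the lowest set bit, 0 if none. */
int
sketch_ffsl(unsigned long mask)
{
	int bit;

	if (mask == 0)
		return (0);
	for (bit = 1; (mask & 1UL) == 0; bit++)
		mask >>= 1;
	return (bit);
}

/* Stand-in for flsl(3): 1-based index of the highest set bit, 0 if none. */
int
sketch_flsl(unsigned long mask)
{
	int bit;

	for (bit = 0; mask != 0; bit++)
		mask >>= 1;
	return (bit);
}

/*
 * Return the index of the first set bit at or after 'offset',
 * or 'size' when no such bit exists.
 */
unsigned long
sketch_find_next_bit(const unsigned long *addr, unsigned long size,
    unsigned long offset)
{
	unsigned long idx, mask;

	while (offset < size) {
		idx = offset / SKETCH_BITS_PER_LONG;
		/* Discard the bits below 'offset' within the current word. */
		mask = addr[idx] & (~0UL << (offset % SKETCH_BITS_PER_LONG));
		if (mask != 0) {
			offset = idx * SKETCH_BITS_PER_LONG +
			    sketch_ffsl(mask) - 1;
			return (offset < size ? offset : size);
		}
		/* Nothing in this word, continue at the next word boundary. */
		offset = (idx + 1) * SKETCH_BITS_PER_LONG;
	}
	return (size);
}
```

With bits 3 and 10 set in the first word, a search from offset 0 should yield 3. Substituting the find-last-set helper in the same place would pick bit 10 and skip bit 3, which is the kind of regression the update above describes.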

Fix a minor typo that sneaked in.

The cxgbe parts are pretty straightforward and look ok to me. I'm not familiar enough with the Linux KPIs to review the rest. We should try to find others who are interested in the Linux KPIs and are familiar enough with them to review this.

hselasky updated this object.
hselasky edited edge metadata.

Update patch after committing most parts of it to 11-current.

hselasky updated this object.

Update patch after recent commit.

kib added inline comments.
sys/ofed/include/linux/io-mapping.h
71

This is a comment about both io_mapping_create_wc() and io_mapping_free(). PAT attributes have page granularity. In other words, if UMA used the same page for an M_KMALLOC item and any other item, then accesses to the other item would suffer from the weaker cacheability mode and slow down.

Next, when you do io_mapping_free(), if any other item, allocated with io_mapping_create_wc(), is co-located with the freed item, then you change the mapping attributes of both items to WB.

In summary, you must not combine malloc with non-standard mapping attributes; this is wrong.
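The page-granularity argument can be sketched with a toy model. This is not the FreeBSD pmap; the types, the 4096-byte page size, and every sketch_ name are assumptions made up for illustration. Attributes are stored per page, so changing them for one small item changes them for everything else on that page.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_NPAGES 4u

enum sketch_memattr { SKETCH_WB = 0, SKETCH_WC = 1 };

/* Toy address space: one cacheability attribute per page. */
struct sketch_pmap {
	enum sketch_memattr attr[SKETCH_NPAGES];
};

/*
 * Apply 'attr' to every page touched by [off, off + len).
 * Offsets are relative to the start of the toy address space.
 */
void
sketch_change_attr(struct sketch_pmap *pm, size_t off, size_t len,
    enum sketch_memattr attr)
{
	size_t first, last, p;

	if (len == 0)
		return;
	first = off / SKETCH_PAGE_SIZE;
	last = (off + len - 1) / SKETCH_PAGE_SIZE;
	for (p = first; p <= last && p < SKETCH_NPAGES; p++)
		pm->attr[p] = attr;
}

/* Attribute seen by an access at 'off' (must be inside the model). */
enum sketch_memattr
sketch_attr_of(const struct sketch_pmap *pm, size_t off)
{
	return (pm->attr[off / SKETCH_PAGE_SIZE]);
}
```

In this model, marking a 128-byte item at offset 64 as WC also makes an unrelated item at offset 1024 WC, and resetting the first item to WB silently resets the second one too. Both effects mirror the two objections raised in the comment above.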

sys/ofed/include/linux/io-mapping.h
71

Hi Konstantin,

In the code mentioned we don't map the kmalloc'ed area, but the area pointed to by "base". If the memory at "base" has a granularity of PAGE_SIZE, then it should be fine?

What is the consequence of not restoring the so-called memory attribute? Can we simply leave out line 69?

sys/ofed/include/linux/io-mapping.h
71

I misread the code, sorry.

Well, then the question is what base is, and how it is used once this code is done with it.
If base maps e.g. device register memory, like a BAR of some PCI device, then reverting it to WB is wrong. It is also wrong if base points to some kind of aperture, since the strongest cacheability attribute that an aperture can tolerate is typically write-combining.

sys/ofed/include/linux/io-mapping.h
71

"base" is pointing to a PCI BAR, in current clients.

In other words, you suggest I simply leave the memory attribute at the last value which a driver set, because then it won't be invalid?

sys/ofed/include/linux/io-mapping.h
71

It is impossible to tell. Probably, if the accesses with WC attributes worked for the driver, they would not be totally wrong for other code. Resetting the attributes to WB is definitely wrong for typical device BARs.

Restore old behaviour in io_mapping_free().

sys/ofed/include/linux/io-mapping.h
71

Hi Kib,

Thank you for looking at this. Then DFBSD is not doing it right. I've restored the old behaviour.

Is it fine to call pmap_unmapdev() even if the mapped areas are within the same PAGE?

sys/ofed/include/linux/io-mapping.h
71

Of course not. Since pmap_unmapdev() truncates the address and rounds up the size, it cannot act on anything less than a page.
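The truncate-and-round-up arithmetic can be sketched as follows. This is only an illustration of the rounding described above, not the pmap_unmapdev() source; the 4096-byte page size, the struct, and the sketch_ function names are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (SKETCH_PAGE_SIZE - 1)

/* Half-open, page-aligned range [start, end). */
struct sketch_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Compute the page-aligned span that an unmap of [va, va + size)
 * would actually affect: the start is truncated to a page boundary
 * and the end is rounded up to the next page boundary.
 */
struct sketch_range
sketch_page_span(uintptr_t va, size_t size)
{
	struct sketch_range r;

	r.start = va & ~(uintptr_t)SKETCH_PAGE_MASK;
	r.end = (va + size + SKETCH_PAGE_MASK) & ~(uintptr_t)SKETCH_PAGE_MASK;
	return (r);
}

/* Non-zero if [va, va + size) overlaps the span 'r'. */
int
sketch_span_overlaps(struct sketch_range r, uintptr_t va, size_t size)
{
	return (va < r.end && va + size > r.start);
}
```

For two hypothetical 0x40-byte register blocks at 0x10000100 and 0x10000800, the span computed for the first block is the whole page from 0x10000000 up to 0x10001000, which also contains the second block. Unmapping the first one for real would therefore take the second one with it.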

Typically on amd64 the BARs are located inside the PCI hole, which is often between 3G and 4G, which is covered by the direct map. Then, pmap_unmapdev() actually does nothing. But for register pages outside the direct map, or on i386, the mapping is removed for real, so the other area is no longer accessible after the call.

That said, I do not think that there are many, if any, devices which co-locate their registers in the same page. Such a configuration is not recommended by Intel, because you can only pass through a whole page to a virtualization domain, and so you cannot provide fine-grained access to one device without also providing access to the other.

Keep the implementation as is until further notice.