Page MenuHomeFreeBSD

Rework ofed build.
ClosedPublic

Authored by kib on Jun 1 2018, 9:35 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mar 10 2024, 1:42 PM
Unknown Object (File)
Mar 10 2024, 1:42 PM
Unknown Object (File)
Mar 10 2024, 1:41 PM
Unknown Object (File)
Mar 10 2024, 1:41 PM
Unknown Object (File)
Mar 10 2024, 1:41 PM
Unknown Object (File)
Mar 10 2024, 1:39 PM
Unknown Object (File)
Mar 10 2024, 1:37 PM
Unknown Object (File)
Mar 10 2024, 1:35 PM

Details

Summary

This change aligns the build with the FreeBSD traditional approach to track inter-dependencies.

I only converted libs, binaries will follow if this approach if found sensible.

Also, I do not understand the __L stuff, please provide the pointers.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

lib/ofed/Makefile
3–24 ↗(On Diff #43257)

For __L you basically just need to copy the LDADD/LIBADD of each library (the directory for that LDADD) and remap it. The list above is something to work off of.
A few months ago I came up with this (based on different library refactoring than you did):

_ofed_lib=             contrib/ofed/usr.lib
_prebuild_libs+=       contrib/ofed/usr.lib/libosmcomp
_prebuild_libs+=       contrib/ofed/usr.lib/libibverbs
_prebuild_libs+=       contrib/ofed/usr.lib/libibmad
_prebuild_libs+=       contrib/ofed/usr.lib/libibumad

contrib/ofed/usr.lib/libosmcomp__L: lib/libthr__L
contrib/ofed/usr.lib/libibmad__L: contrib/ofed/usr.lib/libibumad__L

Based on my own version of:

SUBDIR_DEPEND_libcxgb4=        libibverbs
SUBDIR_DEPEND_libibcm= libibverbs
SUBDIR_DEPEND_libibmad=        libibumad
SUBDIR_DEPEND_libibnetdisc=    libosmcomp libibmad libibumad
SUBDIR_DEPEND_libmlx4= libibverbs
SUBDIR_DEPEND_libmlx5= libibverbs
SUBDIR_DEPEND_libosmvendor=    libibumad
SUBDIR_DEPEND_librdmacm=       libibverbs
SUBDIR_PARALLEL=

Note I only have 2 __L lines because I've only had to add 4 libraries to _prebuild_libs. The rest just build in _ofed_lib which is part of _generic_libs. Only dependencies for "all of contrib/ofed/usr.lib" need to be built in _prebuild_libs. That means all of the right side of your SUBDIR_DEPEND above should be in _prebuild_libs. And if any of them depend on another directory then list it as consumer_dir__L: dep_library_dir__L (which is where I mean remap the SUBDIR_DEPEND line as the order is the same). If there's no dependencies (like for libumad) then it does not need any __L line for itself.

Also note the libthr dependency there, whereever you moved the osmcomp library that wants libthr will need that dependency.

So you will likely end up with __L lines for the following and should be added to _prebuild_libs:

  • libibverbers
  • libibumad
  • complib
Makefile.inc1
2463 ↗(On Diff #43257)

You don't need to depend on the include directory as make includes is ran before make libraries.
Though I did need this a few months ago in Makefile.libcompat:

diff --git Makefile.libcompat Makefile.libcompat
index b8fc76c4e152..b2eddc1280f2 100644
--- Makefile.libcompat
+++ Makefile.libcompat
@@ -139,6 +139,7 @@ _LC_LIBDIRS.yes=            lib gnu/lib
 _LC_LIBDIRS.${MK_CDDL:tl}+=    cddl/lib
 _LC_LIBDIRS.${MK_CRYPT:tl}+=   secure/lib
 _LC_LIBDIRS.${MK_KERBEROS:tl}+=        kerberos5/lib
+_LC_LIBDIRS.${MK_OFED:tl}+=    contrib/ofed/include

 _LC_INCDIRS=   \
                include \
Makefile.inc1
2459–2463 ↗(On Diff #43257)

This should be all that is needed:

_ofed_lib= contrib/ofed/lib
_prebuild_libs+= contrib/ofed/lib/libibverbs
_prebuild_libs+= contrib/ofed/lib/libibumad
_prebuild_libs+= contrib/ofed/lib/libibmad
_prebuild_libs+= contrib/ofed/lib/complib

contrib/ofed/lib/libibmad__L: contrib/ofed/lib/libibumad__L
contrib/ofed/lib/complib__L: lib/libthr__L
lib/ofed/Makefile
3–24 ↗(On Diff #43257)

Was trying to explain, just ignore. I put the right solution in Makefile.inc1 section.

Import suggested handling of libraries dependencies.
Move infiniband/include build out of contrib/.

Only binaries left to handle.

Why can't the base library build system have a multi-stage build system by default?

@kib: Make sure multi-threaded build doesn't fail.

Abusing prebuild_libs seems like a hack!

Add a five step library build with accompanying variables. That will make the Makefile code more clean and easier to understand instead of all these manual dependencies.

Abusing prebuild_libs seems like a hack!

Sort of, yes. I agree with the statement that ib libs should be not put into the prebuild libs.

For now I used the recipe provided by bdrewery. When the build system grows the ability to handle this case cleaner, we will adopt.

Complete patch, no components (even include) are built in contrib, everything is moved to lib/ofed and usr.bin/ofed. Patch requires some more quality checks, I only build world for amd64. I plan to do make universe with and without ofed for it, and installation checks on amd64.

I think that for the purpose of enabling ofed by default, the list of the binaries should be examined and trimmed. The useless binaries should be put under some build knob, like WITH_OFED_EXTRAS, default to off. This is for later.

Update to the current state, in particular, lib/ofed/Makefile.inc added.

When running make tinderbox to check the build, I noted that sporadically mips and powerpc buildworlds fail with the

nm: elf_begin: Invalid argument

error. This comes from the nm invocation in the sorting subcommand for linking dso. The errors are random and seem to occur only for gcc-built arches. Reading of the nm and libelf source code suggested that the most likely cause is the invalid ELF structure of the .pico file is damaged.

So I applied the following debugging patch to see what is going on:

diff --git a/contrib/elftoolchain/nm/nm.c b/contrib/elftoolchain/nm/nm.c
index 33ab44295c7..70ceeb5e05e 100644
--- a/contrib/elftoolchain/nm/nm.c
+++ b/contrib/elftoolchain/nm/nm.c
@@ -1467,7 +1467,7 @@ read_object(const char *filename)
 
 	assert(filename != NULL && "filename is null");
 
-	if ((fd = open(filename, O_RDONLY)) == -1) {
+	if ((fd = open(filename, O_RDONLY | O_EXLOCK | O_SHLOCK)) == -1) {
 		warn("'%s'", filename);
 		return (1);
 	}
diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c
index 09f665e8f3f..b6700c6eeb8 100644
--- a/sys/kern/vfs_vnops.c
+++ b/sys/kern/vfs_vnops.c
@@ -348,6 +348,13 @@ vn_open_vnode(struct vnode *vp, int fmode, struct ucred *cred,
 	if ((error = VOP_OPEN(vp, fmode, cred, td, fp)) != 0)
 		return (error);
 
+	if ((fmode & (O_EXLOCK | O_SHLOCK)) == (O_EXLOCK | O_SHLOCK) &&
+	    (fmode & FWRITE) == 0 && vp->v_writecount > 0) {
+		error = EDOOFUS;
+		goto exx;
+	}
+		
+
 	while ((fmode & (O_EXLOCK | O_SHLOCK)) != 0) {
 		KASSERT(fp != NULL, ("open with flock requires fp"));
 		if (fp->f_type != DTYPE_NONE && fp->f_type != DTYPE_VNODE) {
@@ -387,7 +394,7 @@ vn_open_vnode(struct vnode *vp, int fmode, struct ucred *cred,
 			error = vn_writechk(vp);
 		break;
 	}
-
+exx:
 	if (error != 0) {
 		fp->f_flag |= FOPENFAILED;
 		fp->f_vnode = vp;

What the patch does is it makes nm(1) to pass very specific combination of flags to the kernel code for open(2). When kernel detects read-only open of the file with that combination of flags, it checks the count of writers to the underlying vnode (v_writecount). If there is a writer, the open(2) syscall fails with EDOOFUS.

And indeed, I got the error

--- libibnetdisc.so.5.full ---
building shared library libibnetdisc.so.5
c++ -isystem /da/1/obj/da/1/src/powerpc.powerpc64/tmp/usr/include -L/da/1/obj/da/1/src/powerpc.powerpc64/tmp/usr/lib -B/da/1/obj/da/1/src/powerpc.powerpc64/tmp/usr/lib --sysroot=/da/1/obj/da/1/srcpowerpc.powerpc64/tmp -B/da/1/obj/da/1/src/powerpc.powerpc64/tmp/usr/bin  -Wl,--version-script=/da/1/src/contrib/ofed/libibnetdisc/libibnetdisc.map  -fstack-protector-strong -shared -Wl,-x -Wl,--fatal-warnings -Wl,--warn-shared-textrel  -o libibnetdisc.so.5.full -Wl,-soname,libibnetdisc.so.5  `NM='nm' NMFLAGS='' lorder chassis.pico g_hash_table.pico ibnetdisc.pico ibnetdisc_cache.pico query_smp.pico |  tsort -q` -L/da/1/obj/da/1/src/powerpc.powerpc64/lib/ofed/libibmad -L/da/1/obj/da/1/src/powerpc.powerpc64/lib/ofed/libibumad -L/da/1/obj/da/1/src/powerpc.powerpc64/lib/ofed/complib  -losmcomp  -libmad  -libumad
nm: 'query_smp.pico': Programming error

So it seems that linking command was started before the compiling stage finished, there was still some process writing to a .pico file.

I do not see anything unusual in my Makefiles for libraries. Is it a race in the build system ? In make ?

Makefile.inc1
2459 ↗(On Diff #43337)

libvendor not needed in prebuild_libs since nothing depends on it

2464 ↗(On Diff #43337)

libvendor not needed here since both libibumad and libthr are in prebuild_libs

2467 ↗(On Diff #43337)

lib/ofed__L not needed as libthr is in prebuild libs, already built when lib/ofed is built in generic_libs.

lib/ofed/libvendor/Makefile
15 ↗(On Diff #43337)

I loaded a random Makefile. No -L should be needed.

In D15648#330673, @kib wrote:

Abusing prebuild_libs seems like a hack!

Sort of, yes. I agree with the statement that ib libs should be not put into the prebuild libs.

For now I used the recipe provided by bdrewery. When the build system grows the ability to handle this case cleaner, we will adopt.

"prebuild libs" are really just a list of libraries that other libraries or directories with many libraries in them (like lib/ or ofed/lib) depend on. It is a hack in the sense that someone didn't want to write out the whole dependency graph for everything in lib/ on lib/libthr for example. By just moving libthr to the "prebuild libs" phase then all of lib/ can be built later in "generic" without explicit dependencies. The same applies here so we can just build "ofed/lib" in generic libs phase without writing out a whole large dependency list (like ofed/lib/Makefile's SUBDIR_DEPEND plus all of the LIBADDS for each library for a true dependency graph). We get a simpler 2 or 3 __L lines rather than 15.

Incorporate feedback:

  • remove -L's.
  • remove unneeded _L deps.

Consider it Reviewed after these are fixed.

Makefile.inc1
2462 ↗(On Diff #43369)

Here's the reason for the nm failure. I missed this before but since lib is already part of _generic_libs on line 2476, _ofed_lib` creates a race where lib and lib/ofed can be building twice concurrently.

So remove _ofed_lib entirely from Makefile.inc1 now due to the dir rename.

Makefile.libcompat
152 ↗(On Diff #43369)

Oops, this can come out now that lib/ofed/include is part of lib (line 148 already descends into it)

lib/Makefile
126 ↗(On Diff #43369)

Note this line above is bogus. I may commit a fix which will conflict for you but I can wait.

209–212 ↗(On Diff #43369)

Should be:

SUBDIR.${MK_OFED}+= ofed
usr.bin/ofed/libibverbs/Makefile.inc
8 ↗(On Diff #43369)

These should be LIBADD+=

usr.bin/ofed/librdmacm/Makefile.inc
8 ↗(On Diff #43369)

These should be LIBADD+=

This revision now requires changes to proceed.Jun 14 2018, 4:28 PM
kib marked 5 inline comments as done.Jun 14 2018, 4:39 PM
kib added inline comments.
lib/Makefile
126 ↗(On Diff #43369)

I do not mind resolving small conflict.

Fix libpcap dependency on libmlx5

This revision is now accepted and ready to land.Jun 14 2018, 11:55 PM

Tinderbox and installworld tests passed with WITH_OFED=yes. I think that the patch is ready to go.

Just make sure buildworld WITH_OFED -jXXX does not break.

Just make sure buildworld WITH_OFED -jXXX does not break.

I obviously never ever build this patch without -j. All the testing was done with make -j 32 on 8x2x2 skylake xeon.

This revision was automatically updated to reflect the committed changes.