Page MenuHomeFreeBSD

Improve cross-tools build concurrency
Needs ReviewPublic

Authored by on Nov 7 2016, 12:27 PM.



Make tasks of _cross-tools run concurrently. Change for-loop to dependencies.

Test Plan

On a 4-core rpi2 it improves build time for ~10% from 435 to 390 secs, utilization from ~325% to ~370%
On a 12-core hsw-ep it improves build time for ~30% from 25 to 16 secs, utilization from ~400% to ~650%

It is far from ideal mostly due to chained dependencies in binutils.

Diff Detail

rS FreeBSD src repository - subversion
Lint Skipped
Unit Tests Skipped

Event Timeline retitled this revision from to Improve cross-tools build concurrency. updated this object. edited the test plan for this revision. (Show Details) set the repository for this revision to rS FreeBSD src repository - subversion.

Update diff to handle cross build correctly. Tested with TARGET=arm64 TARGET_ARCH=aarch64.

I'm worried that this doesn't encode dependencies between the cross tools. Have you confirmed that there's no dependencies this will gloss over?

I added libelftc dependency after failed test build. Otherwise they looked good and survived -j240 _cross-build on 24 vcpu. But I'd admit not really checking each makefiles into detail before first submit.

Quick scan through I found that elfcopy indeed has libpe in LIB_ADD. Update patch to address it.

Any suggestions to check for more?

Another way to test is to have __TESTTOOLS= <all the stuff in the for statement> and use ${_TESTTOOLS:Ox} to randomize the list to make sure. Run that a few times, and you'll have at least a good probabilistic belief that there's no hidden dependencies.

Put the list of cross-tools prerequisites to randomized test. 100 times -j240 passed ok.

imp added a reviewer: imp.

OK. I'm cool with this now that we've done random testing on it. Thanks for running them. We've had issues with cross tools in the past and those tests give me the confidence we'll not be repeating some unfortunate mistakes from the past.

This revision is now accepted and ready to land.Nov 17 2016, 4:42 AM

I did this and never committed:

commit b936f4e146817fd8b450580d72569b6b16b9a29e
Author: Bryan Drewery <>
Date:   Thu Oct 22 16:53:19 2015 -0700

    Parallelize cross-tools.
    Sponsored by:   EMC / Isilon Storage Division
    MFC after:      3 weeks

diff --git Makefile.inc1 Makefile.inc1
index 491a086..7aff350 100644
--- Makefile.inc1
+++ Makefile.inc1
@@ -1588,12 +1588,21 @@ _elftctools=	lib/libelftc \
+# Eltoolchain tools depend on libelftc
+.if !empty(_elftctools)
+.for _tool in ${_elftctools:Nlib/libelftc}
+cross-tools_${_tool}: cross-tools_lib/libelftc
 # If an full path to an external cross compiler is given, don't build
 # a cross compiler.
 .if ${XCC:M/*} == "" && ${MK_CROSS_COMPILER} != "no"
 .if ${MK_CLANG_BOOTSTRAP} != "no"
 _clang=		usr.bin/clang
 _clang_libs=	lib/clang
+cross-tools_usr.bin/clang: cross-tools_lib/clang
 .if ${MK_GCC_BOOTSTRAP} != "no"
 _cc=		gnu/usr.bin/cc
@@ -1603,7 +1612,8 @@ _cc=		gnu/usr.bin/cc
 _usb_tools=	sys/boot/usb/tools
-cross-tools: .MAKE .PHONY
+cross-tools: .PHONY
 .for _tool in \
     ${_clang_libs} \
     ${_clang} \
@@ -1614,12 +1624,14 @@ cross-tools: .MAKE .PHONY
     ${_btxld} \
     ${_crunchide} \
+cross-tools_${_tool}: .PHONY
 	${_+_}@${ECHODIR} "===> ${_tool} (obj,depend,all,install)"; \
 		cd ${.CURDIR}/${_tool} && \
 		${MAKE} DIRPRFX=${_tool}/ obj && \
 		${MAKE} DIRPRFX=${_tool}/ depend && \
 		${MAKE} DIRPRFX=${_tool}/ all && \
+cross-tools: cross-tools_${_tool}

From my own work it seems you're missing clang->libclang dependency.

Note also the simpler-to-maintain way I added in the elftoolchain dependencies:

+# Eltoolchain tools depend on libelftc
+.if !empty(_elftctools)
+.for _tool in ${_elftctools:Nlib/libelftc}
+cross-tools_${_tool}: cross-tools_lib/libelftc

This lets us maintain only 1 list of the same tools.

Lastly, how much time does this save in the build? I did the work and never committed it, so I must have hit a problem or found it saved such little time it wasn't warranted.

Ah yes I should have tested with MK_SYSTEM_COMPILER=no to cache clang.

The _elftctools list now has two libs libelftc and libpe. I 'd go your way if there is only one.

The benefit gained is marginal. for amd64 it saves 10 secs in ~36 mins 12-core buildworld or say ~0.5%, on env mentioned above, without building clang. For usual 2- or 4-core desktop PC it would be even less. You've optimized the most other stages. ;-) I only did it at free time and see if it does not harm to get further improved. edited edge metadata.

Add clang dependency.

with MK_SYSTEM_COMPILER=no, both src & obj on tmpfs, timing looks like below:

before patch:


_worldtmp 0.213u 0.378s 0:00.58 100.0%  3654+208k 0+0io 0pf+0w
  _legacy 0.370u 0.177s 0:00.53 101.8%  6112+258k 0+0io 0pf+0w

_bootstrap-tools 501.781u 15.202s 0:56.77 910.6% 61201+651k 0+20io 0pf+0w

   _cleanobj 175.950u 50.315s 0:19.42 1165.0%      820+246k 0+447io 0pf+0w
        _obj 78.108u 16.179s 0:08.21 1148.2%       874+251k 0+0io 0pf+0w
_build-tools 12.827u 1.928s 0:02.74 537.9% 28651+384k 0+0io 0pf+0w
_cross-tools 6916.140u 122.121s 10:46.53 1088.6%   59670+677k 0+918io 0pf+0w
   _includes 74.681u 17.964s 0:08.28 1118.8%       1040+254k 0+0io 0pf+0w
  _libraries 12073.224u 424.004s 18:36.65 1119.1%  54540+649k 0+1619io 0pf+0w
  everything 3393.358u 352.163s 5:17.55 1179.5%    50595+531k 0+6837io 0pf+0w
     build32 2884.314u 308.760s 5:26.20 978.8%     52410+527k 0+528io 0pf+0w
 buildkernel 3046.829u 194.845s 4:58.29 1086.7%    48546+508k 0+4305io 0pf+0w

29157.910u 1504.101s 46:41.90 1094.3% 53755+608k 0+14674io 0pf+0w

after patch:


_worldtmp 0.143u 0.445s 0:00.58 100.0%  3627+206k 0+0io 0pf+0w
  _legacy 0.373u 0.174s 0:00.53 101.8%  8962+278k 0+0io 0pf+0w

_bootstrap-tools 500.693u 16.050s 0:56.53 914.0% 61179+651k 0+9io 0pf+0w

   _cleanobj 174.024u 52.174s 0:19.45 1162.9%      820+246k 0+0io 0pf+0w
        _obj 77.499u 16.759s 0:08.22 1146.4%       887+252k 0+73io 0pf+0w
_build-tools 12.829u 1.890s 0:02.73 538.8% 28742+383k 0+10io 0pf+0w
_cross-tools 6919.590u 122.245s 10:28.67 1120.1%   59745+678k 0+768io 0pf+0w
   _includes 76.151u 16.420s 0:08.32 1112.6%       1061+255k 0+0io 0pf+0w
  _libraries 12083.793u 419.015s 18:33.20 1123.1%  54583+649k 0+2407io 0pf+0w
  everything 3394.471u 347.383s 5:16.56 1182.0%    50624+531k 0+7764io 0pf+0w
     build32 2882.364u 308.298s 5:24.95 981.8%     52415+527k 0+1027io 0pf+0w
 buildkernel 3048.868u 192.933s 4:58.05 1087.6%    48555+509k 0+4408io 0pf+0w

29171.022u 1493.846s 46:17.94 1103.8% 53796+608k 0+16466io 0pf+0w

This revision now requires review to proceed.Nov 25 2016, 11:40 AM