Page MenuHomeFreeBSD

Proposed performance enhancements of Uses/python.mk
AbandonedPublic

Authored by marino on Nov 30 2015, 9:40 PM.
Tags
None
Referenced Files
F107927527: D4325.id10614.diff
Sun, Jan 19, 1:50 PM
Unknown Object (File)
Fri, Jan 17, 1:04 PM
Unknown Object (File)
Sat, Jan 11, 10:19 AM
Unknown Object (File)
Dec 21 2024, 4:53 AM
Unknown Object (File)
Dec 1 2024, 12:55 AM
Unknown Object (File)
Nov 27 2024, 8:36 PM
Unknown Object (File)
Nov 6 2024, 10:26 AM
Unknown Object (File)
Oct 31 2024, 9:26 PM

Details

Reviewers
mva
danfe
antoine
wg
Group Reviewers
portmgr
Summary

The python.mk file as it is currently is very unfriendly to full ports
tree scans, such as those down by poudriere bulk -a.

For example, even when no python is installed as is always the case
on the full scan, every single port that has USES=python in it is
attempting to launch python anyway, and falling back to the default,
which is then piped into tail(1)! An obvious improvement is don't
shell commands when they are 100% guaranteed to fail.

A similar case exists for PYTHONBASE. An attempt to launch python
is made every time and fails every time in the case above.

The worst offender of all is the definition of PYTHON_PORTVERSION
which unconditionally spawns another instance of make just to figure
out the port version. This is extremely heavy during porttree scanning.

Even when python is on the system, it is unnecessary and heavy work
to spawn pytho over and over to retrieve extremly simple information.

This solution:

  1. lang/python creates a makefile fragment with the sought information already defined. This actions as a cache.
  1. If this cache is present, it is used thus eliminating all chained shell commands
  1. if it's not present, and bin/python is not present, the default is used. Note that over time, all the lang/python ports will be updated and the existence check for bin/python will ALWAYS fail. In the meantime, there are going to be older lang/python ports installed so the bin/python fallback has to remain, at least for a while. In time it can be removed.
  1. The lang/pythonXY ports each define PYTHON_PORTVERSION in a separate file called Makefile.version. This eliminated another shell command (replacing it with a friendlier include). As a side benefit, python35 was really cleaned it. It was a mess with all the regex which actually emitted matching errors. It's much better now.

The improvements are dramatic. On DragonFly poudriere on a 4-core
machine, the time needed to scan the entire tree dropped from 13.5
minutes to 11.5 minutes just with this one modification.

Hopefully these modications will be seen as useful.

Diff Detail

Repository
rP FreeBSD ports repository
Lint
No Lint Coverage
Unit
No Test Coverage
Build Status
Buildable 1641
Build 1647: arc lint + arc unit

Event Timeline

marino retitled this revision from to Proposed performance enhancements of Uses/python.mk.
marino updated this object.
marino added reviewers: koobs, danfe.

For some reason, arc wouldn't accept "Python" in reviewers list.

hold on, lang/python wasn't committed.

marino edited edge metadata.

Add missed lang/python changes

marino edited edge metadata.

Removed extra line from lang/python that should not have been there.

By the way, I added the ports_env stuff for this purpose (avoiding execs). It avoids running python in every port. Poudriere master uses it now. If you use it you may find even more speedup in your project.

How poudriere uses it:

if [ -f "${MASTERMNT}/usr/ports/Mk/Scripts/ports_env.sh" ]; then

eval "$(injail env \
    SCRIPTSDIR=/usr/ports/Mk/Scripts \
    PORTSDIR=/usr/ports \
    MAKE=/usr/bin/make \
    /bin/sh /usr/ports/Mk/Scripts/ports_env.sh | \
    grep '^export [^;&]*')"

fi

lang/python/Makefile
18

Please do not mix functional changes with whitespace. It is a separate review/commit.

By the way, I added the ports_env stuff for this purpose (avoiding execs). It avoids running python in every port. Poudriere master uses it now. If you use it you may find even more speedup in your project.

ah, that explains this line, ".if !defined(_PYTHON_DEFAULT_VERSION)". I asked antoine about it earlier. I couldn't figure out why it was there since it would only work on an "illegally" defined internal name was used in make.conf. I guess it was used in ports_env.sh instead.

lang/python35/Makefile
10

This change should probably be reverted. I think I thought line 5 was PORTVERSION. It probably doesn't matter, but there's no reason to change it either.

koobs added subscribers: koobs, Python.

Update reviewers to those that have the expertise/knowledge to do so

Can I get some eyeballs on this one?

It's a huge performance improvement for scan time, but if FreeBSD is not interested then I'll just move it over to DPorts and forget about it here.

I don't think it's all that much to review. It won't take long, I promise.

I can't believe there's not a single person on either portmgr or python team that has time and willingness to review this.

If nothing changes in the next 3 days, I'll probably close this request.

I'll have a look at this patch tomorrow. Sorry, would have done it this weekend but ENOTIME :-(.

I believe the ABIVER!= is wrong for python 3.
It is already wrong in python.mk but now the wrongness is duplicated (will give different result depending on what is installed)

I believe the ABIVER!= is wrong for python 3.
It is already wrong in python.mk but now the wrongness is duplicated (will give different result depending on what is installed)

see lang/python/Makefile:

.if ${PYTHON_MAJOR_VER} == 2
ABIVER=
RUN_DEPENDS+= python2:${PORTSDIR}/lang/python2
.elif ${PYTHON_MAJOR_VER} == 3
ABIVER!= ${PYTHON_CMD}-config --abiflags 2>/dev/null || true
RUN_DEPENDS+= python3:${PORTSDIR}/lang/python3
.endif

It maintains the definition that is currently there. The ABIVER should be the same before and after this diff.

A general comment, how would this patch integrate with different implementations of python (i.e. when/if pypy becomes a first class citizen next to cpython). If I read your patch correctly it will integrate without difficulty.

I hope my review was helpful, it was my first review :-/

Mk/Uses/python.mk
271

Would it not be possible to export METAPY_INSTALLED_VERSION as _PYTHON_DEFAULT_VERSION (perhaps conditionally: ?=) and get the same behaviour without needing to patch this section? The same goes, I think, for PYTHON_BASE?

273

This changes existing behaviour: if the user (or a tool) defined _PYTHON_DEFAULT_VERSION then it would get overridden in your patch with the version from python whereas the existing behaviour does not override it.

419

If you are going to change the METAPY variables per the previous comment then maybe change this line to:
.if !defined(PYTHON_ABIVER) && exists...

(and change the exported METAPY_ABIVER to PYTHON_ABIVER).

lang/python/Makefile
33

The definition of ABIVER could be inlined with line 54 and instead you could do something like make -V PYTHON_ABIVER to avoid duplicate definitions for ABIVER

I can't comment about pipi. I responded to all three inline comments.

Mk/Uses/python.mk
273

You are right that behavior is changed. I didn't know why this was here until Bryan talked about ports_env.sh. On the other hand, this is an internal variable and no tool should be modifying it, right?

Regarding ports_env.sh, is support for _PYTHON_DEFAULT_VERSION still required? It is there for performance reasons which we are addressing with this change. Or I am missing something? Is the behavior change a bad thing?

419

I'm not following this one. If by previous comment you are talking about Antoines, there was no functionality change (thus no need to change this) ... assuming I'm not missing something.

I don't understand what the issue is about PYTHON_ABIVER. It looks functional as it is to me. Maybe somebody can illustrate a problem with an example to help me understand?

lang/python/Makefile
33
  1. I am DESPERATELY removing all spawning of make. This is an extremely heavy penality. Getting rid of make is basically the point of this review.
  1. This is the only definition of ABIVER! The original one is removed.
lang/python/Makefile
33

I'm re-reading this comment again. It cannot be inlined with line 54 because there are 2 possible values of ABIVER depending if python 2 or 3 is used. Another reason it can't be inlined is because it's a shell command. I also don't see what benefit inlining has... ?

Mk/Uses/python.mk
271

I am working on an update to address several of the comments. I think it does not make sense to override _PYTHON_DEFAULT_VERSION when METAPY_INSTALLED_VERSION is installed though. All the data from vitals.mk have to stick together as a unit.

marino edited edge metadata.

Incorporate some comments:

  1. Remove unrelated whitespace changes
  2. Restore _PYTHON_DEFAULT_VERSION override, but only when vitals.mk is not imported
  3. Remove DISTNAME change to lang/python35

Because ABIVER doesn't change doesn't mean it's correct.
Now the wrong line is copied twice.
Using != for things that may not be present is wrong!

Because ABIVER doesn't change doesn't mean it's correct.

But if ABIVER is wrong before this patch, that's out of scope of the patch. right?

Now the wrong line is copied twice.

Well, the original line is only there for a transitional period (however long that may be). Once the metaport is reinstalled, it will never be run again. But the above point remains: Why is an existing bug being held against this review? because it copies the bug? Do you want to fix the bug first?

Using != for things that may not be present is wrong!

This doesn't happen.
If you think it happens, please give me a concrete example and I'll illustrate why it doesn't happen.

This definitely happens, on a system with nothing installed:

# echo DEFAULT_VERSIONS=python=3.4 >> /etc/make.conf
# make -C /usr/ports/lang/python install
# cat /usr/local/share/python/vitals.mk
METAPY_INSTALLED_VERSION=3.4
METAPY_ABIVER=
METAPY_BASE=/usr/local
# python3.4-config --abiflags
m

What do I win?

nothing. That case is impossible.

vitals.mk only exists is lang/python is installed.
If lang/python is installed, then METAPY_ABIVER= would be defined as "m"

which is the same as the output on the last line.

Maybe take another look to see how vitals.mk is generated?

So what happened on my computer 10 minutes ago is impossible????

hmmm.

how is
.elif ${PYTHON_MAJOR_VER} == 3
ABIVER!= ${PYTHON_CMD}-config --abiflags 2>/dev/null || true

getting ABIVER evalued to null string?

And more importantly, if you know what the problem is, how should the patch be modified to address the problem?

Something like this seems to fix the problem:

@@ -15,12 +15,14 @@
 
 USES=                  python:run
 NO_BUILD=              yes
+VITALMK=       ${STAGEDIR}${DATADIR}/vitals.mk
 
 PLIST_FILES=   bin/2to3 \
                bin/idle \
                bin/pydoc \
                bin/python \
-               bin/python-config
+               bin/python-config \
+               share/python/vitals.mk
 
 .include <bsd.port.pre.mk>
 
@@ -45,6 +47,14 @@
 .endif
        ${LN} -sf python${PYTHON_MAJOR_VER}-config \
                ${STAGEDIR}${PREFIX}/bin/python-config
+       ${MKDIR} ${VITALMK:H}
+       ${ECHO} METAPY_INSTALLED_VERSION=${PYTHON_VER} > ${VITALMK}
+.if ${PYTHON_MAJOR_VER} == 2
+       ${ECHO} METAPY_ABIVER= >> ${VITALMK}
+.elif ${PYTHON_MAJOR_VER} == 3
+       ${ECHO} METAPY_ABIVER=$$(${PYTHON_CMD}-config --abiflags) >> ${VITALMK}
+.endif
+       ${ECHO} METAPY_BASE=${PREFIX} >> ${VITALMK}
 
 # Major upgrade support
 PORTUPGRADE_CMD=       ${LOCALBASE}/sbin/portupgrade
antoine edited edge metadata.

The use of METAPY_ABIVER in python.mk looks wrong too.
If a port has USES=python:3.4+ , but python is installed (and links to python2 -> python 2.7), the ABIVER will be set to nothing (from METAPY_ABIVER) instead of m

In my opinion the whole patch creates more bugs than the performance enhancement it may or may not solve...

This revision now requires changes to proceed.Dec 14 2015, 10:59 PM

The performance issues are very real and demonstrable, e.g. even when it's known that python isn't installed, python is launched (and fails), then is piped through head. This happens 25,000 times on every full poudriere bulk run. that's low hanging fruit.

on your example, how can the port build?
When poudriere tries to build ports like that, it fails before it starts.

marino edited edge metadata.

Incorporated antoine's lang/python modification

????
Gnome builds in poudriere and it depends on some python2.7 ports, the python symlink (via gtk20) and some python3.x ports

assuming your example is legit (lang/python installed for python2 but python3 can coexist and build individual ports) then we can do 2 things:

  1. don't cache ABIVERSION (with major impact of spawning python again)
  2. cache ABIVERSION on a per-python basis

the removal of the "make" is the biggest performance enhancement and that definitely should be done in any case.

actually, ABIVER shell command is not a big impact.

  1. it's not a python spawn
  2. It only happens on python3

It's probably best just to remove that bit (restore it how it was)

If I do that, are there any other immediately obvious issues?

marino edited edge metadata.

Restore ABIVER handling (it's not cached by vitals.mk anymore)

so did I address all the concerns?

Right now this is the overview:
A) _PYTHON_DEFAULT_VERSION is cached by lang/python. An externally defined _PYTHON_DEFAULT_VERSION will not be used unless lang/python is either not installed or not upgraded. (big performance gain by avoiding python spawn)

B) A) PYTHONBASE is cached by lang/python. An externally defined PYTHONBASE will override it. (big performance gain by avoiding python spawn)

C) PYTHON_PORTVERSION is determined by including lang/pythonXX/Makefile.version unless it is overridden by _PORTS_ENV_CHECK (huge performance gain by avoiding make spawn)

D) PYTHON_ABIVER was not changed after all.

Quoting "marino (John Marino)" <phabric-noreply@freebsd.org>:

marino added a comment.

so did I address all the concerns?

Right now this is the overview:
A) _PYTHON_DEFAULT_VERSION is cached by lang/python.  An

externally defined _PYTHON_DEFAULT_VERSION will not be used unless
lang/python is either not installed or not upgraded. (big
performance gain by avoiding python spawn)

B) A) PYTHONBASE is cached by lang/python.  An externally defined

PYTHONBASE will override it. (big performance gain by avoiding
python spawn)

PYTHONBASE was previously defined via PYTHON_CMD, which may vary based
on what the build toolchain requires (e.g. USES=...). Setting it based
on lang/python is wrong, especially since PYTHONBASE is not defined
externally within a port build chain. Maybe I'm missing something in
your patch, but the description of the changed behaviour is problematic.

Cheers
Marcus

okay, I see that now.

Is there something we can check to only use the cached version when PYTHON_CMD is unset (and therefore the default version)?

By the way, there is more python spawning in Mk/Uses/python.mk. Is it possibly a duplicate of stuff in Mk/bsd.python.mk ?

Quoting "marino (John Marino)" <phabric-noreply@freebsd.org>:

marino added a comment.

okay, I see that now.

Is there something we can check to only use the cached version

when PYTHON_CMD is unset (and therefore the default version)?

Not without a lot of additional conditionals, which complicate
maintenance. I wonder about what the performance gain is, if only your
PYTHON_DEFAULT_VERSION improvement (avoiding the make call) is added.
Do you have any numbers?
That'd allow us to estimate, whether it is worth to investigate or not.

By the way, there is more python spawning in Mk/Uses/python.mk.

Is it possibly a duplicate of stuff in Mk/bsd.python.mk ?

bsd.python.mk does not exist anymore, since it was moved to Mk/Uses/python.mk.

Cheers
Marcus

The impacts are huge, I put this on the summary:
"The improvements are dramatic. On DragonFly poudriere on a 4-core
machine, the time needed to scan the entire tree dropped from 13.5
minutes to 11.5 minutes just with this one modification."

The use of "make" to get the version was the very biggest impact. Frankly that technique should be banned. At the very, very least this needs to be removed.

The second biggest impact is spawning python even when it's known it's not installed. x50,000 it's a big hit.

I can't imagine it's *that* difficult to figure out how to switch between cached or not.

Quoting "marino (John Marino)" <phabric-noreply@freebsd.org>:

marino added a comment.

The impacts are huge, I put this on the summary:
"The improvements are dramatic. On DragonFly poudriere on a 4-core
machine, the time needed to scan the entire tree dropped from 13.5
minutes to 11.5 minutes just with this one modification."

I'm aware of that. I'm asking about the run-time behaviour with only
the make removal. If that already saves *a lot*, let's get it into
python.mk first and think about how to avoid python invocations in a
a second round, so that the patch is

a) easier to test
b) has a smaller impact and risk of unwanted side effects.

The use of "make" to get the version was the very biggest impact.

Frankly that technique should be banned. At the very, very least
this needs to be removed.

The second biggest impact is spawning python even when it's known

it's not installed. x50,000 it's a big hit.

Yes, but does it consume 0.2 seconds or 50 seconds or something else?

Cheers
Marcus

okay, fair enough, I created D4660 to isolate "${MAKE}" changes. When it's accepted, I'll update this one with fixes.

"Yes, but does it consume 0.2 seconds or 50 seconds or something else?"

More than 50 seconds during a full scan.

Reviewing the original code, it's clear now that bin/python won't be called if it's not installed. That means it's not impacting the case of scanning the whole tree prior to a full tree build (or repository rebuild).

Pursuing additional performance is diminishing returns for me, so I'll just abandon this review since we got the big one (make) taken care of already.