lang/python: do not write bytecode when running as root
Abandoned (Public)

Authored by vishwin on Feb 9 2023, 7:49 AM.

Details

Reviewers
tcberner
fluffy
arrowd
antoine
Group Reviewers
Python
Summary

By default, Python writes bytecode on import when such files do not already exist, which happens whenever the executing user has write permission in the directories of the modules being imported. This is problematic when running as root, both in port build and in system administration/operation scenarios. When building ports as root, including the optional interactive shell phase, written bytecode results in filesystem violations. In system administration/operation, when system services are written in Python and executed by root, bytecode gets written to system directories, again resulting in filesystem pollution and integrity violations.

The Python interpreter has both a -B flag and a PYTHONDONTWRITEBYTECODE environment variable to disable the default behaviour. However, both assume that the flag or environment variable can be passed or set at will, which is not the case in either scenario described above. There is also a settable sys.dont_write_bytecode variable in the Python standard library, but it is only useful inside a Python script or interactive session.
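The three existing switches can be compared side by side. The following is a minimal sketch, not part of the patch: the helper name probe is hypothetical, and it simply starts a child interpreter and reports the value of sys.dont_write_bytecode that the child sees.

```python
import os
import subprocess
import sys

# One-liner executed by each child interpreter.
PROBE = "import sys; print(sys.dont_write_bytecode)"


def probe(extra_args=(), extra_env=None):
    """Hypothetical helper: run a child interpreter and return its
    sys.dont_write_bytecode value as printed text."""
    env = dict(os.environ)
    env.pop("PYTHONDONTWRITEBYTECODE", None)  # start from a clean baseline
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", PROBE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Value with the -B command-line flag.
flag_value = probe(extra_args=["-B"])
# Value with the environment variable set to a non-empty string.
env_value = probe(extra_env={"PYTHONDONTWRITEBYTECODE": "1"})
# Value with neither; depends on the interpreter and how it was built.
default_value = probe()

print("-B:", flag_value)
print("PYTHONDONTWRITEBYTECODE=1:", env_value)
print("default:", default_value)
```

Both switches have to be supplied by whoever launches the interpreter, which is exactly the limitation described above for port builds and root-run services.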

This patch unconditionally disables writing bytecode when the effective UID is 0, which corresponds to root. There are three variants of the patch: one for 3.7, one for 3.8-3.10, and one for 3.11 and later.
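The rule itself is small. The patch is not reproduced on this page, so the following is only a Python-level sketch of the described behaviour, not the actual change to the interpreter; the function names should_skip_bytecode and apply_root_policy are hypothetical.

```python
import os
import sys


def should_skip_bytecode(euid):
    """Hypothetical helper: True when the effective UID is root's (0)."""
    return euid == 0


def apply_root_policy(euid=None):
    """Set sys.dont_write_bytecode when running as root and leave it
    untouched otherwise. Returns the resulting value of the flag."""
    if euid is None:
        euid = os.geteuid()  # POSIX only; assumed available on FreeBSD
    if should_skip_bytecode(euid):
        sys.dont_write_bytecode = True
    return sys.dont_write_bytecode
```

Something like this could be called early in a script that may be run by root. It only illustrates the decision, since a script-level call cannot cover code imported before it runs.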

Test Plan

The 3.10 example is shown; operation is identical for the other versions:

vishwin@ardmore:~ % python
Python 3.10.9 (main, Feb  9 2023, 04:54:29) [Clang 14.0.5 (https://github.com/llvm/llvm-project.git llvmorg-14.0.5-0-gc12386 on freebsd14
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.dont_write_bytecode
False
>>>
vishwin@ardmore:~ % sudo -E python
Python 3.10.9 (main, Feb  9 2023, 04:54:29) [Clang 14.0.5 (https://github.com/llvm/llvm-project.git llvmorg-14.0.5-0-gc12386 on freebsd14
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.dont_write_bytecode
True
>>>

Further, the failure mode exhibited in the D38429 comment goes away entirely.

Reported and submitted upstream.

Diff Detail

Repository
rP FreeBSD ports repository
Lint
No Lint Coverage
Unit
No Test Coverage
Build Status
Buildable 49647
Build 46537: arc lint + arc unit

Event Timeline

antoine requested changes to this revision. Feb 9 2023, 7:52 AM

This makes no sense.

This revision now requires changes to proceed. Feb 9 2023, 7:52 AM

It makes sense in that packaged bytecode is going away.

No, it makes no sense to have special code for root and to diverge from upstream.

There is no difference from upstream as far as actual Python interpretation and execution is concerned. The only difference is that there is no more filesystem-polluting instruction caching when run as root if such files do not already exist. Manually invoking compile_py, compileall, et al. is unchanged.

This makes no sense; on systems where binaries are installed as user "bin", your code will not work.

pkg's INSTALL_AS_USER is not a default setting.

The first priority is still ensuring that ports, particularly Python ports, have their do-extract, do-patch, do-configure, do-build, do-stage, and do-package stages working properly as root, which is a default. This does exactly that, and those stages continue to work when not run as root. If this becomes a big enough problem when D34739 is completed, this can go away then, and only then.

Alternatively, instead of special handling for root, writing bytecode on import can be unconditionally disabled by default. This still does not affect any manual bytecode compilation via compile_py, compileall, et al. (apparent in that none of the lang/python ports have changed plists), nor does it affect the use of existing bytecode so long as it matches the timestamps and the specific Python interpreter it was built against. Most importantly, there is still no divergence from upstream as far as code interpretation and execution is concerned, just less filesystem pollution.
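The claim that manual compilation is unaffected can be checked with the standard library alone. This is a minimal sketch, assuming a writable temporary directory; the module name demo_mod_bytecode and the function demo are hypothetical. It imports a throwaway module with sys.dont_write_bytecode set, then compiles the same file explicitly with py_compile.

```python
import importlib
import importlib.util
import pathlib
import py_compile
import sys
import tempfile


def demo():
    """Return (cache_exists_after_import, cache_exists_after_py_compile)."""
    saved_flag = sys.dont_write_bytecode
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical throwaway module used only for this check.
        source = pathlib.Path(tmp) / "demo_mod_bytecode.py"
        source.write_text("VALUE = 42\n")
        # Where the interpreter would cache bytecode for this source file.
        cache_file = pathlib.Path(importlib.util.cache_from_source(str(source)))
        sys.path.insert(0, tmp)
        try:
            # Import with bytecode writing disabled: nothing should be cached.
            sys.dont_write_bytecode = True
            importlib.invalidate_caches()
            importlib.import_module("demo_mod_bytecode")
            after_import = cache_file.exists()
            # Explicit compilation writes the cache file regardless of the flag.
            py_compile.compile(str(source), doraise=True)
            after_compile = cache_file.exists()
        finally:
            # Restore interpreter state so the check has no side effects.
            sys.dont_write_bytecode = saved_flag
            sys.path.remove(tmp)
            sys.modules.pop("demo_mod_bytecode", None)
    return after_import, after_compile


print(demo())
```

The first value reports whether the import left a cache file behind, and the second whether the explicit py_compile call did. compileall goes through the same explicit compilation path, which is why packaged bytecode would be unaffected.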