The current binary goal uses PEX to build a python binary. Unfortunately PEX does not include a python interpreter in the binary. An alternative solution is to build a binary using pyinstaller (which bundles the python interpreter in the binary) and staticx (which makes the binary platform independent). After a discussion on the slack channel with the pants team I have opened this ticket.
Possible CLI:
binary goal: ./pants binary --type=pyinstaller <binary-target>
new goal: ./pants pyinstaller-bin <binary-target>
If this cannot be implemented by your team, can you point me in the right direction so I can start working on this?
Thanks
Adrian
Thanks for posting this. To summarize a private conversation: I think the best (and also easiest) way to get this done is to create a new target type, pyinstaller_binary, similar to python_binary. Then create a corresponding PyInstallerBinaryFieldSet member of the BinaryFieldSet union, along with a rule that turns a PyInstallerBinaryFieldSet into a CreatedBinary, similar to the existing rule for PythonBinaryFieldSet. Happy to help work through this.
@thamenato is going to take a look at this 🚀
Josh Reed made a very apt comment the other day, that the hardest part of adding a plugin is scoping what it should do. A Pants plugin is essentially a fancy wrapper around subprocess.run(), meaning that we need to know first what command line arguments we're going to use with pyinstaller.
You'll want to start by downloading pyinstaller https://www.pyinstaller.org and checking out --help. (I personally love pipx to install things like this.) Create a small example repo, or use https://github.com/pantsbuild/example-python, and try creating things with pyinstaller _without_ Pants.
We don't need to worry about all the options, like --strip. We can add those later easily. For now, the focus is getting the core functionality.
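To make the "wrapper around subprocess.run()" framing concrete, here is a minimal sketch of what scoping the command line could look like. The function names and parameters are hypothetical and not part of Pants; the flags (--noconfirm, --name, --onefile, --hidden-import) are ordinary PyInstaller options, and the choice of which ones to expose first is only an assumption.

```python
import subprocess
from typing import List, Sequence


def build_pyinstaller_argv(
    entry_point: str,
    name: str,
    hidden_imports: Sequence[str] = (),
    onefile: bool = True,
) -> List[str]:
    """Assemble a PyInstaller command line for a single entry-point script."""
    argv = ["pyinstaller", "--noconfirm", "--name", name]
    if onefile:
        argv.append("--onefile")
    for module in hidden_imports:
        argv.append(f"--hidden-import={module}")
    # The script to analyze always goes last.
    argv.append(entry_point)
    return argv


def run_pyinstaller(argv: Sequence[str]) -> int:
    """Run the command and return its exit code (needs pyinstaller on PATH)."""
    return subprocess.run(list(argv), check=False).returncode


if __name__ == "__main__":
    # Only prints the command; call run_pyinstaller() to actually build.
    print(" ".join(build_pyinstaller_argv("app/main.py", "app", ["csv"])))
```

Keeping the argv construction separate from the subprocess call makes it easy to unit test the option handling before any rule is written.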
One thing I'm a bit confused by is how you specify third party requirements. https://pyinstaller.readthedocs.io/en/stable/operating-mode.html#analysis-finding-the-files-your-program-needs makes it sound like PyInstaller inspects your import statements. But I don't know how it would handle things like which version of a dependency to use for the binary.
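A rough illustration of why import scanning alone cannot answer the version question: an import statement only names a module. The sketch below uses the standard library's ast module and is not PyInstaller's actual analysis code; the function name is hypothetical.

```python
import ast
from typing import Set


def top_level_imports(source: str) -> Set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import os.path" contributes "os"
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # absolute "from colors import red" contributes "colors";
            # relative imports (level > 0) are skipped
            modules.add(node.module.split(".")[0])
    return modules
```

The result is a set of names with no version information, so whichever version is installed in the environment that runs the tool is the one that can be found.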
@adabuleanu any insights on how you use PyInstaller would be much appreciated, including if you can share any of the argvs you're using.
This is helpful too: https://realpython.com/pyinstaller-python/.
pyinstaller_binary target
See https://www.pantsbuild.org/v2.1/docs/plugins-package-goal#1-set-up-a-package-target-type-recommended.
It's not super clear to me which fields we'll want - see ./pants help pex_binary for an example of how those fields closely mirror pex --help. For now, it's probably sufficient to start with COMMON_PYTHON_FIELDS, Dependencies, and a new PyinstallerBinarySources field.
You can define this in pants/backend/python/target_types.py.
Start by reading https://www.pantsbuild.org/v2.1/docs/rules-api-concepts for an overview of the Rules API. It'll be helpful to read some of the other parts too, like "File System" and "Processes". It's okay if this doesn't all make sense at first.
See https://www.pantsbuild.org/v2.1/docs/plugins-package-goal for instructions on hooking into the package goal. As a first step, you'll want to pip install pyinstaller, which you can do with https://www.pantsbuild.org/v2.1/docs/rules-api-installing-tools#pex-install-binaries-through-pip.
This is the trickiest part - we're happy to pair on this.
https://www.pantsbuild.org/v2.1/docs/rules-api-testing talks about how to test. We'll likely want to use Approach #2, similar to the tests we have for Pex at https://github.com/pantsbuild/pants/blob/5ab354061780921a0784f9a4da3677d345f3f214/src/python/pants/backend/python/util_rules/pex_test.py#L310-L559
We're happy to help with this.
Thanks @Eric-Arellano I appreciate your guidance once again :100:
I will share the workaround that I use in CI/CD for building a binary using pants + pyinstaller + staticx (installed with pipenv).
# execute docker run
dockerRun () {
  local container_image="${1}"
  shift
  local working_dir="/app"
  local docker_cmd="docker"
  # fall back to sudo when not running as root
  if [[ $(id -u) -ne 0 ]]; then
    docker_cmd="sudo docker"
  fi
  # docker_cmd is left unquoted on purpose so that "sudo docker" splits into two words
  ${docker_cmd} run --rm \
    --workdir "${working_dir}" \
    -v "${bamboo_working_directory:-$(pwd)}:${working_dir}" \
    -e V_ENV_USER="$(id -u)" \
    -e V_ENV_GROUP="$(id -g)" \
    "${container_image}" \
    /bin/bash -c "${*}"
}
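# set cmd var based on branching strategy
setCmd() {
  local cmd1="${1}"
  local cmd2="${2}"
  # default to an empty command so that "set -u" in the caller does not abort on master
  local cmd=""
  local branch
  branch=$(git rev-parse --abbrev-ref HEAD)
  case "${branch}" in
    # the glob must stay outside the quotes, otherwise "*" is matched literally
    "develop"|"release/version_"*)
      cmd="${cmd1}"
      ;;
    "master")
      # do nothing
      ;;
    *)
      cmd="${cmd2}"
      ;;
  esac
  echo "${cmd}"
}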
#!/usr/bin/env bash
set -exuo pipefail
RUNNING_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null && pwd)"
RUNNING_DIR_RELATIVE="$(dirname "$0")"
utils_dir="${RUNNING_DIR}/../utils"
source "${utils_dir}/utils.sh"
BRANCH="develop"
# TODO: add a new target type "python_binary_pyinstaller" to support binary build using pyinstaller + staticx
# below are the commands to run when above is finished
# cmd1='rm -rf dist &&
# ./pants --version &&
# ./pants filter --filter-target-type=python_binary_pyinstaller :: | xargs ./pants binary'
# cmd2='rm -rf dist &&
# ./pants --version &&
# ./pants --changed-since=${BRANCH} --changed-dependees=transitive filter --filter-target-type=python_binary_pyinstaller | xargs ./pants binary'
cmd1='rm -rf dist &&
./pants --version &&
./pants filter --filter-target-type=python_distribution :: | xargs ./pants setup-py &&
./pants filter --filter-target-type=python_distribution :: | xargs -i ./pants setup-py {} -- sdist &&
./pants filter --filter-target-type=python_binary :: | xargs -i -d ":" -n 2 '${RUNNING_DIR_RELATIVE}'/build_nci_installer_binary.sh'
cmd2='rm -rf dist &&
./pants --version &&
./pants filter --filter-target-type=python_distribution :: | xargs ./pants setup-py &&
./pants filter --filter-target-type=python_distribution :: | xargs -i ./pants setup-py {} -- sdist &&
./pants --changed-since='${BRANCH}' --changed-dependees=transitive filter --filter-target-type=python_binary --address-regex="^(.(?!(.py:)))*$" | xargs -i -d ":" -n 2 '${RUNNING_DIR_RELATIVE}'/build_nci_installer_binary.sh'
cmd=$(setCmd "${cmd1}" "${cmd2}")
# you can run this inside docker with
# dockerRun ${DOCKER_IMAGE} "${cmd}"
bash -c "${cmd}"
Note: the PyInstaller CLI invocation is customized for our environment (but it can be made generic), and we run a "cleanup" on the ansible modules to reduce the size of the final binary.
#!/usr/bin/env bash
# Note: script is currently custom for installer binary
# TODO: make it generic to support multiple binaries using the following strategies:
# - add a new target type "python_binary_pyinstaller" to support binary build using pyinstaller + staticx (preferred solution)
# - make this script generic to support multiple binaries
set -euxo pipefail
RUNNING_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null && pwd)"
binary_path="$(echo -e "${1}" | tr -d '[:space:]')"
binary_name="$(echo -e "${2}" | tr -d '[:space:]')"
dist_path=$(realpath "dist")
binary_dist_path="${dist_path}/${binary_name}"
unstripped_binary_name="unstripped-${binary_name}"
unstripped_binary_dist_path="${dist_path}/${unstripped_binary_name}"
nci_install_main="${binary_path}/nci.py"
nci_install_add_data_src="${binary_path}/../additional_data"
nci_install_add_data_dest="nci_installer/additional_data"
nci_install_hooks_src="${binary_path}/../hooks"
env_valid_add_data_src="${binary_path}/../../nci_env_validation/additional_data"
env_valid_add_data_dest="nci_env_validation/additional_data"
ansible_add_data_src="/tmp/ansible"
ansible_add_data_dest="ansible"
cd ${RUNNING_DIR}
rm -rf Pipfile*
# install pyinstaller and staticx
pipenv --python 3.6 install pyinstaller==4.0 staticx==0.11.0
# install libraries created in the previous pipelines from dist directory
pipenv install -e ${dist_path}/*
venv_path=$(pipenv --venv)
export PATH=${venv_path}/bin:${PATH}
rm -rf Pipfile*
cd -
python_site_pkgs=$(python -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])')
if [[ -z ${LD_LIBRARY_PATH+x} ]]; then
  export LD_LIBRARY_PATH=${python_site_pkgs}/.libs_cffi_backend/
else
  export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${python_site_pkgs}/.libs_cffi_backend/
fi
# cleanup ansible module to make the binary smaller
rm -rf "${ansible_add_data_src}" && mkdir -p "${ansible_add_data_src}"
cp -r "${python_site_pkgs}/${ansible_add_data_dest}"/* "${ansible_add_data_src}"
# match inside find itself: a grep with no matches would abort the script under "set -eo pipefail"
find "${ansible_add_data_src}/" -type d -name '__pycache__' -prune -exec rm -rf {} +
# drop PowerShell modules (*.ps1, *.psm1)
find "${ansible_add_data_src}/" -type f \( -name '*.ps1' -o -name '*.psm1' \) -delete
rm -rf "${ansible_add_data_src}/modules/cloud"
rm -rf "${ansible_add_data_src}/modules/network"
rm -rf "${ansible_add_data_src}/modules/web_infrastructure"
rm -rf "${ansible_add_data_src}/modules/windows"
# build binary
pyinstaller \
  --noconfirm \
  --onefile \
  --strip \
  --add-data "${nci_install_add_data_src}:${nci_install_add_data_dest}" \
  $(if [[ -d ${env_valid_add_data_src} ]]; then echo "--add-data ${env_valid_add_data_src}:${env_valid_add_data_dest}"; fi) \
  --add-data "${ansible_add_data_src}:${ansible_add_data_dest}" \
  --name ${unstripped_binary_name} \
  --clean \
  --additional-hooks-dir=${nci_install_hooks_src} \
  --hidden-import='pty' \
  --hidden-import='xml.etree' \
  --hidden-import='xml.etree.ElementTree' \
  --hidden-import='selectors' \
  --hidden-import='csv' \
  --hidden-import='smtplib' \
  --hidden-import='ansible' \
  --hidden-import='logging.handlers' \
  ${nci_install_main}
# make binary run on all distributions
staticx ${unstripped_binary_dist_path} ${binary_dist_path}
# cleanup
rm -rf ${unstripped_binary_dist_path} ./build ./${unstripped_binary_name}.spec ${ansible_add_data_src}
Adding some notes here:
If you use pyenv, you have to compile the Python that you want to use with one of the following flags:
env PYTHON_CONFIGURE_OPTS="--enable-shared" (Linux)
env PYTHON_CONFIGURE_OPTS="--enable-framework" (macOS)
Took me a while to figure this out (I thought it was my OS at first), but they have it in their FAQ: https://github.com/pyinstaller/pyinstaller/wiki/FAQ#gnulinux
pyinstaller will use whatever library version you currently have installed locally, which probably means that we can let pants create the pyinstaller pex binary with all the dependencies (?). I'm not 100% sure on this one, but I couldn't find anything in their documentation, and StackOverflow "told me" that's how it's done.
Great finds! That's tricky with the interpreter constraints restrictions. Users will need to use [python-setup].interpreter_search_paths to influence that. Perhaps we can have a warning in our docs about that.
it seems like pyinstaller will use whatever library version you currently have installed locally, which probably means that we can let pants create the pyinstaller pex binary with all the dependencies
This sounds correct based on what @adabuleanu shared and talking to him a little about this. Further, I believe the interpreter you run with determines which interpreter is shipped in the binary.
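Since both the interpreter and the installed library versions come from the environment PyInstaller runs in, it can help to print them before building. This is a minimal sketch assuming Python 3.8+ (for importlib.metadata); the function name is hypothetical, and Py_ENABLE_SHARED only reflects the --enable-shared build option on Unix-like platforms, so it may be 0 or None elsewhere.

```python
import sys
import sysconfig
from importlib import metadata
from typing import Dict, Optional, Sequence


def describe_build_environment(packages: Sequence[str]) -> Dict[str, Optional[str]]:
    """Report the interpreter and package versions visible in this environment."""
    report: Dict[str, Optional[str]] = {
        "interpreter": sys.executable,
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        # "1" when CPython was configured with --enable-shared
        "enable_shared": str(sysconfig.get_config_var("Py_ENABLE_SHARED")),
    }
    for package in packages:
        try:
            report[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = None  # not installed in this environment
    return report


if __name__ == "__main__":
    for key, value in describe_build_environment(["ansicolors"]).items():
        print(f"{key}: {value}")
```

Running this with the same interpreter that will run PyInstaller shows which versions would end up available to the binary.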
I agree that we will want to use Pex if possible. In particular, we use a pattern for other tools of having two PEXes: one for the tool itself and one for the user's own third party requirements.
Why two PEXes? It gives us finer grained caching. If you change the version of PyInstaller, we shouldn't need to re-resolve your own third party reqs.
--
The next step would be to try to get PyInstaller to work via Pex's CLI, without Pants.
This will look something like pex pyinstaller -o pyinstaller.pex -m pyinstaller:main, then ./pyinstaller.pex [normal cli args]. The -m value will be something from https://github.com/pyinstaller/pyinstaller/blob/5bc2c4620bfc5dde5a76a007c7ff50ee74223ca5/setup.cfg#L85-L92, I suspect the first one.
Now add a requirement to the PEX command above, something like pex ansicolors pyinstaller .... Check that ./pyinstaller.pex correctly loads the requirement when creating the binary.
This step is key to ensuring that we can in fact use a Pex to resolve requirements.
--pex-path
It's possible to "merge" two PEXes together at runtime through the --pex-path option. Create a dedicated requirements PEX, like pex ansicolors req2 -o reqs.pex, then a dedicated tool PEX, like pex pyinstaller -o pyinstaller.pex -m pyinstaller:main --pex-path reqs.pex. You will end up running ./pyinstaller.pex as before, but it will merge in reqs.pex.
Once you have all three of these steps working, then we can start to proceed to porting this to Pants.
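The three experiments can be written down as plain argv lists, which is roughly the shape they would take once ported to a rule. This is only a sketch: the helper names are hypothetical, and the pyinstaller:main entry point is the placeholder from the comment above, not a verified value. The real one has to be taken from PyInstaller's setup.cfg.

```python
import shlex
from typing import List, Optional, Sequence

# Placeholder copied from the discussion; replace with the real
# console_scripts entry point from PyInstaller's setup.cfg.
ENTRY_POINT = "pyinstaller:main"


def requirements_pex(requirements: Sequence[str], output: str = "reqs.pex") -> List[str]:
    """Command for a PEX that holds only third party requirements."""
    return ["pex", *requirements, "-o", output]


def tool_pex(
    extra_requirements: Sequence[str] = (),
    pex_path: Optional[str] = None,
    output: str = "pyinstaller.pex",
) -> List[str]:
    """Command for a PEX that runs PyInstaller, optionally merging in another PEX."""
    argv = ["pex", *extra_requirements, "pyinstaller", "-o", output, "-m", ENTRY_POINT]
    if pex_path is not None:
        argv += ["--pex-path", pex_path]
    return argv


def experiment_commands() -> List[List[str]]:
    """The experiments in order: tool only, tool plus a requirement, split PEXes."""
    return [
        tool_pex(),
        tool_pex(extra_requirements=["ansicolors"]),
        requirements_pex(["ansicolors"]),
        tool_pex(pex_path="reqs.pex"),
    ]


if __name__ == "__main__":
    for argv in experiment_commands():
        print(shlex.join(argv))
```

Printing the commands first and running them by hand keeps the experiment independent of Pants, which is the point of these steps.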