Conda-forge.github.io: [discussion] should post-link/pre-unlink scripts output stderr/stdout?

Created on 9 Jun 2016 · 28 Comments · Source: conda-forge/conda-forge.github.io

I preemptively went through and found all of the post-link scripts based on my initial, naive pattern for using jupyter (server|nb)extension enable and added them, but should have created this issue first... if only to make the list below simpler! Anyhow, by having them up there, we can already see the impact this might have.

staged recipes:

feedstocks:

@jakevdp What's problematic about the output?

@jakirkham Agreed. I kind of like seeing this output as it tells me the post linking worked. Also, it is pretty short, so I don't really find it a hindrance for following log output. Why should it be removed here or elsewhere?

I've heard from a few downstream users of conda with jupyter extensions that post-link/pre-unlink scripts generating output is a Bad Thing, specifically breaking assumptions about conda remove|install --json. As conda is increasingly used in more automated use cases, i.e. constructor, provisioning tools, this seems pretty important to address.

Here is some example --json output... very weird that it truncates the name in "fetch": "nb_anacondaclo"... that's not going to be super useful!

nbollweg@rocinante:~/Documents/projects/nbpresent (master)$ conda install --json nb_anacondacloud
Using Anaconda Cloud api site https://api.anaconda.org
{"maxval": 18683, "finished": false, "progress": 0, "fetch": "nb_anacondaclo"}
{"maxval": 18683, "finished": false, "progress": 16384, "fetch": "nb_anacondaclo"}
{"maxval": 18683, "finished": false, "progress": 18683, "fetch": "nb_anacondaclo"}
{"maxval": 18683, "finished": false, "progress": 18683, "fetch": "nb_anacondaclo"}
{"maxval": 18683, "finished": true, "progress": 18683, "fetch": "nb_anacondaclo"}
{"maxval": 1, "finished": false, "progress": 0}
{"maxval": 1, "finished": false, "progress": 0, "name": "nb_anacondacloud"}
{"maxval": 1, "finished": true, "progress": 1}
{"maxval": 12, "finished": false, "progress": 0}
{"maxval": 12, "finished": false, "progress": 0, "name": "yaml"}
{"maxval": 12, "finished": false, "progress": 1, "name": "funcsigs"}
{"maxval": 12, "finished": false, "progress": 2, "name": "pytz"}
{"maxval": 12, "finished": false, "progress": 3, "name": "pyyaml"}
{"maxval": 12, "finished": false, "progress": 4, "name": "requests"}
{"maxval": 12, "finished": false, "progress": 5, "name": "six"}
{"maxval": 12, "finished": false, "progress": 6, "name": "clyent"}
{"maxval": 12, "finished": false, "progress": 7, "name": "python-dateutil"}
{"maxval": 12, "finished": false, "progress": 8, "name": "anaconda-client"}
{"maxval": 12, "finished": false, "progress": 9, "name": "nb_config_manager"}
Enabling nb_config_manager in /Users/nbollweg/miniconda3/envs/nbpresent-foo/etc/jupyter
Existing config...
{}
New config...
{'NotebookApp': {'config_manager_class': 'nb_config_manager.EnvironmentConfigManager'}}
{"maxval": 12, "finished": false, "progress": 10, "name": "nbsetuptools"}
{"maxval": 12, "finished": false, "progress": 11, "name": "nb_anacondacloud"}
Installing nb_anacondacloud  OK
Enabling nb_anacondacloud  OK
{"maxval": 12, "finished": true, "progress": 12}
{
  "actions": {
    "EXTRACT": [
      "nb_anacondacloud-0.2.0-py35_0"
    ],
    "FETCH": [
      "nb_anacondacloud-0.2.0-py35_0"
    ],
    "LINK": [
      "yaml-0.1.6-0 /Users/nbollweg/miniconda3/pkgs 1",
      "funcsigs-0.4-py35_0 /Users/nbollweg/miniconda3/pkgs 1",
      "pytz-2016.4-py35_0 /Users/nbollweg/miniconda3/pkgs 1",
      "pyyaml-3.11-py35_4 /Users/nbollweg/miniconda3/pkgs 1",
      "requests-2.10.0-py35_0 /Users/nbollweg/miniconda3/pkgs 1",
      "six-1.10.0-py35_0 /Users/nbollweg/miniconda3/pkgs 1",
      "clyent-1.2.2-py35_0 /Users/nbollweg/miniconda3/pkgs 1",
      "python-dateutil-2.5.3-py35_0 /Users/nbollweg/miniconda3/pkgs 1",
      "anaconda-client-1.4.0-py35_0 /Users/nbollweg/miniconda3/pkgs 1",
      "nb_config_manager-0.1.3-py35_0 /Users/nbollweg/miniconda3/pkgs 1",
      "nbsetuptools-0.1.5-py35_0 /Users/nbollweg/miniconda3/pkgs 1",
      "nb_anacondacloud-0.2.0-py35_0 /Users/nbollweg/miniconda3/pkgs 1"
    ],
    "PREFIX": "/Users/nbollweg/miniconda3/envs/nbpresent-foo",
    "SYMLINK_CONDA": [
      "/Users/nbollweg/miniconda3"
    ]
  },
  "success": true
}
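
To make the breakage concrete, here is a minimal Python sketch of a downstream consumer that reads the progress stream line by line. The function name split_progress_stream is hypothetical, and the assumption that each progress record is a one-line JSON object with a "progress" key comes only from the output above, not from any documented conda format.

```python
import json


def split_progress_stream(lines):
    """Separate one-line JSON progress records from everything else."""
    records, stray = [], []
    for line in lines:
        try:
            obj = json.loads(line)
        except ValueError:
            # Not JSON at all, e.g. "Existing config..."
            stray.append(line)
            continue
        # Assumption: a real progress record is a dict with a "progress" key.
        if isinstance(obj, dict) and "progress" in obj:
            records.append(obj)
        else:
            stray.append(line)
    return records, stray


# A few lines copied from the output above.
sample = [
    '{"maxval": 12, "finished": false, "progress": 9, "name": "nb_config_manager"}',
    "Enabling nb_config_manager in /Users/nbollweg/miniconda3/envs/nbpresent-foo/etc/jupyter",
    "Existing config...",
    "{}",
    "New config...",
    '{"maxval": 12, "finished": false, "progress": 10, "name": "nbsetuptools"}',
]
records, stray = split_progress_stream(sample)
print(len(records), "progress records;", len(stray), "stray lines")
```

Note that the "{}" printed by the post-link script is itself valid JSON, so a consumer that only checks "does this line parse?" would silently accept it as a record.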

Perhaps the "right" solution would be for conda install to handle this more actively, and package stderr/stdout from post-link into some stanza, with the above generating something like:

{"maxval": 12, "finished": false, "progress": 11, "name": "nb_anacondacloud", "post-link": [
        {"type": "out", "value": "Installing nb_anacondacloud  OK"},
        {"type": "out", "value": "Enabling nb_anacondacloud  OK"}
]}

or, alternately:

{
  "actions": {
    "POST-LINK": {
      "name": "nb_anacondacloud",
      "output": [
        {"type": "out", "value": "Installing nb_anacondacloud  OK"},
        {"type": "out", "value": "Enabling nb_anacondacloud  OK"}
      ]
    }
  }
}

I haven't looked into what it would take to change conda to do this, but perhaps that is the easier fix.
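
As a rough illustration of the first proposal, and not a description of how conda works, the following Python sketch runs a command, captures its output, and packages it into a "post-link" list. The function name capture_post_link and the "err" type label are assumptions; the proposal above only shows "out". A child Python process stands in for a real post-link script.

```python
import json
import subprocess
import sys


def capture_post_link(name, cmd):
    """Run cmd and return its output packaged as a JSON-ready stanza."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    output = [{"type": "out", "value": line} for line in proc.stdout.splitlines()]
    # "err" is an assumed label for stderr lines.
    output += [{"type": "err", "value": line} for line in proc.stderr.splitlines()]
    return {"name": name, "post-link": output}


# Stand-in for a post-link script: a child process that prints one line.
fake_script = [sys.executable, "-c", "print('Installing nb_anacondacloud  OK')"]
stanza = capture_post_link("nb_anacondacloud", fake_script)
print(json.dumps(stanza))
```

Capturing the two streams separately, as done here, loses the relative order of stdout and stderr lines; a real implementation would have to decide whether that matters.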

cc: @ilanschnell @ijstokes @kalefranz

All 28 comments

Thanks Nick. My perspective is that conda does not need to change. When post-link scripts output things, it is probably important (usually some error message). In this case, the post-link script is used in a way that it was never designed for. I think we have to better communicate the purpose of these scripts.

Post-link (and pre-unlink) scripts should:
1.) be avoided whenever possible
2.) not touch anything other than the files being installed
3.) not write anything to stdout (or stderr), unless an error occurs
4.) not depend on any installed (or to be installed) conda packages
5.) only depend on simple system tools rm, cp, mv, ln, etc.

I see the problem you are contemplating now, @bollwyvl. So, my gut feeling at least with nb_anacondacloud is it shouldn't have so much output or it should be doing some logging of its own. In some of the other cases, this has been a line or two that basically says things were configured correctly post-install. The latter not only seems ok to me, but actually seems desirable as it shows the user what is going on in a simple way. Personally seeing something installed ok is desirable too. It just should be brief and to the point.

cc @conda-forge/bqplot @conda-forge/ipyleaflet @conda-forge/nb_anacondacloud @conda-forge/nbpresent @conda-forge/pythreejs @conda-forge/vega @conda-forge/widgetsnbextension

post-link script is used in a way that it was never designed for.

Didn't know that – is there another way to tell conda to add additional install/uninstall steps for things like jupyter extensions?

Thanks for the feedback all.

@ilanschnell Post-link (and pre-unlink) scripts should:

Yes, let's please communicate that more clearly!

However, in the case of a --json install, I do feel like wrapping them up into the result object in some way or another would be good.

@jakirkham So, my gut feeling at least with nb_anacondacloud is it shouldn't have so much output or it should be doing some logging of its own.

We're trying to get this to be a more deterministic process with:

Basically, nb_anacondacloud would create its own file in a directory at _build_ time, such that the order of increasing config importance would be:

$PREFIX/etc/jupyter/jupyter_notebook_config.json.d/nb_anacondacloud.json
$PREFIX/etc/jupyter/jupyter_notebook_config.json
$HOME/.jupyter/jupyter_notebook_config.json

With a merge occurring at each step.
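
A minimal sketch of such a layered merge, assuming "merge" means a recursive dictionary merge in which later (more important) files win. The merge semantics and the example values are assumptions for illustration; only the config_manager_class entry is taken from the output earlier in this thread.

```python
def merge_config(base, override):
    """Return a new dict where override wins and nested dicts are merged."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical contents of the three files, lowest importance first.
layers = [
    # $PREFIX/etc/jupyter/jupyter_notebook_config.json.d/nb_anacondacloud.json
    {"NotebookApp": {"config_manager_class": "nb_config_manager.EnvironmentConfigManager"}},
    # $PREFIX/etc/jupyter/jupyter_notebook_config.json
    {"NotebookApp": {"port": 8888}},
    # $HOME/.jupyter/jupyter_notebook_config.json
    {"NotebookApp": {"port": 9999}},
]

config = {}
for layer in layers:
    config = merge_config(config, layer)
print(config)
```

Because each package only drops its own file into the directory at build time, nothing has to run (or print) at install time.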

The conda documentation regarding post-link scripts is now up-to-date, see http://conda.pydata.org/docs/building/build-scripts.html

Regarding points 4-5: this means that jupyter nbextensions should not be installed/uninstalled using these scripts, because it depends on the jupyter package.

These packages need to run, e.g. jupyter nbextension install packagename at the end of conda install, and jupyter nbextension uninstall packagename at the beginning of conda remove. What's the recommended alternative for this case?

@jakevdp at the moment, post-link and pre-unlink scripts are the only solution.

However, in the next minor version of the notebook, registering an nbextension will amount to adding a file in a configuration directory, which can be done by the package manager (or even with a python wheel).

https://github.com/jupyter/notebook/issues/1508#issuecomment-224956351

I'll update all the PRs to write to $PREFIX/.messages.txt, so we can at least be partially compliant...

Why is 4 an issue at this point? Package dependencies are now installed first, so a runtime dependency can be an install time dependency.

Number 4 is an issue, because we want to keep these post link scripts simple, and not have them do all sorts of fancy things (those type of things which are now causing problems).

those type of things which are now causing problems

Just to be clear, by "causing problems", you mean "printing to stdout", yes? Or is there something else?

Number 4 is an issue, because we want to keep these post link scripts simple, and not have them do all sorts of fancy things (those type of things which are now causing problems).

Isn't simple covered by 5? Also, it seems like post-link steps are the place one can add functionality that conda can't or won't do. Conda can't rewrite shebang lines for other languages, so you have to write a post-link to fix those lines _at install time_. Being able to extend conda in this way for special cases puts less of a burden on conda extending support for other use cases.

@bkreider Conda is Python agnostic. Why do you say it cannot rewrite shebangs for other languages? All it does is replace a prefix placeholder with the actual install prefix. Nowhere in the conda code is python hard-coded when it comes to shebangs.
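
The language-agnostic claim can be illustrated with a small Python sketch. This is not conda's code: the placeholder string, the function name relocate, and the example script are all hypothetical, and the sketch only handles plain text.

```python
# Hypothetical placeholder; the actual string used at build time is not given here.
PLACEHOLDER = "/placeholder/build/prefix"


def relocate(text, install_prefix, placeholder=PLACEHOLDER):
    """Replace every occurrence of the build-time placeholder with the real prefix."""
    return text.replace(placeholder, install_prefix)


# A made-up node script: nothing in relocate() knows or cares about the language.
script = "#!/placeholder/build/prefix/bin/node\nconsole.log('hi');\n"
relocated = relocate(script, "/Users/nbollweg/miniconda3/envs/nbpresent-foo")
print(relocated.splitlines()[0])
```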

@ilanschnell This issue from 2.5 years ago around node: https://github.com/conda/conda/issues/492

We've been manually (post link) patching all node binaries since then.

I think we have a clear recommendation to avoid stdout in post link scripts, and am also in favour of that. Point 4 about not depending on installed packages is worrying as we have no other mechanism to do this kind of registration, but I suspect the reason for this is that the post-link is called immediately after the installation/linking of each package, not after _all_ packages have been installed.

Is there anything left to discuss in this issue?

Point 4 about not depending on installed packages is worrying as we have no other mechanism to do this kind of registration, but I suspect the reason for this is that the post-link is called immediately after the installation/linking of each package, not after all packages have been installed.

The only reason it works in the way it does currently is that dependencies of a distribution are linked before itself, so if a package depends on Jupyter, it is currently the case that Jupyter will already be installed.

That is unfortunately not a strict guarantee. Things in the past have been linked alphabetically, and there has not been strong interest in keeping things happening in dependency order.
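
The difference between the two orderings can be sketched in Python. The dependency graph below is hypothetical, and link_order is an illustrative depth-first topological sort with no cycle detection, not conda's implementation.

```python
def link_order(deps):
    """Return package names so that each package comes after its dependencies."""
    order, seen = [], set()

    def visit(pkg):
        if pkg in seen:
            return
        seen.add(pkg)
        for dep in sorted(deps.get(pkg, ())):
            visit(dep)
        order.append(pkg)

    for pkg in sorted(deps):
        visit(pkg)
    return order


# Hypothetical graph: an extension package that depends on jupyter.
deps = {"bqplot": ["jupyter"], "jupyter": []}

alphabetical = sorted(deps)
by_dependency = link_order(deps)
print("alphabetical: ", alphabetical)
print("by dependency:", by_dependency)
```

With alphabetical linking, the extension's post-link script would run before the jupyter command exists in the environment; with dependency ordering it runs afterwards.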

there has not been strong interest in keeping things happening in dependency order.

Here's registering that interest 😉

I'll update all the PRs to write to $PREFIX/.messages.txt, so we can at least be partially compliant...

So, this is a good idea. Few thoughts.

First, if we have multiple things getting installed that do this, we end up with a situation where one of them "wins" the log and overwrites the other ones. That could be solved by appending. Though that log will soon become a huge mess. Maybe some directory structure can solve this (e.g. $PREFIX/etc/conda/logs/link/<pkg>/output.log and one for error too).

Second, builds of other packages that install dependencies doing this will pick up the log file(s). So we need to start excluding this location explicitly in the builds. I don't know of a way to simply specify this to conda-build. Though we could discuss adding some functionality like this. For the meantime removing a standard location should solve this (e.g. rm -r $PREFIX/etc/conda/logs/link).
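
Both points can be sketched together in Python, assuming the directory layout suggested above ($PREFIX/etc/conda/logs/link/<pkg>/output.log). The helper names are hypothetical, the package messages are made up, and a temporary directory stands in for $PREFIX.

```python
import os
import shutil
import tempfile


def link_log_path(prefix, pkg, stream="output"):
    """Per-package log location, following the layout suggested above."""
    return os.path.join(prefix, "etc", "conda", "logs", "link", pkg, stream + ".log")


def append_log(prefix, pkg, message, stream="output"):
    """Append one line to the package's own log and return the log path."""
    path = link_log_path(prefix, pkg, stream)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "a") as fh:
        fh.write(message + "\n")
    return path


prefix = tempfile.mkdtemp()  # stand-in for $PREFIX
log_a = append_log(prefix, "nbpresent", "Enabling nbpresent  OK")
log_b = append_log(prefix, "vega", "Enabling vega  OK")
logs_exist = os.path.exists(log_a) and os.path.exists(log_b)

# Equivalent of `rm -r $PREFIX/etc/conda/logs/link` before packaging a build.
shutil.rmtree(os.path.join(prefix, "etc", "conda", "logs", "link"))
logs_removed = not os.path.exists(log_a) and not os.path.exists(log_b)

shutil.rmtree(prefix)  # clean up the temporary directory
```

Since each package writes only under its own directory, no package can overwrite another's log, and a build can discard everything with a single removal.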

That is unfortunately not a strict guarantee. Things in the past have been linked alphabetically, and there has not been strong interest in keeping things happening in dependency order.

I would be interested in this if it doesn't cost much to keep. I know there are internal projects where people are relying on dependency install order (can't remember them off the top of my head, but it's come up many times). Without a guaranteed order, it is impossible to coordinate packages that are related and need to share install-time information.

I was an early adopter (2013) of conda for everything and hit the pain of alphabetically ordered package installation (I shouldn't have to rename my packages to get the desired order).

@bkreider you (and others) are hitting the pain because you are using post-link scripts to do things which they were never designed for (that is configuration) and not following the guidelines for post-link scripts: http://conda.pydata.org/docs/building/build-scripts.html
To solve configuration problems (among other things), we created: https://github.com/conda/kapsel

are hitting the pain because you are using post-link scripts to do things which they were never designed for

There is no pain anymore. Ever since the "install in dependency order" happened, they work perfectly and in a way that doesn't surprise. If conda moves back to alphabetical install order, I can see the pain happening again.

which they were never designed for (that is configuration)

That just shifts the question. If it is just because of mis-design, it's an easier problem to fix as compared to a technological problem.

Kapsel is not the answer for this problem. Post-link scripts are the glue that makes some conda packages possible that would otherwise be a serious pain to integrate between conda and yum/apt/etc because conda doesn't manage the entire OS (yet).

So I'm going to take the proactive step of closing this issue. I think we have arrived at the general consensus that we _should not_ be outputting during the link scripts in general.

Thanks for the excellent detail in here - I certainly hope we can maintain the existing conda behaviour of ordering the link script calls in dependency order.

Actually, just to clarify on @ilanschnell's point about kapsel, just like post-link scripts written in bash, kapsel isn't designed to be a solution for configuration management.

Salt, ansible, ADaM, puppet, chef, and in some cases fabric. NOT BASH

On Jul 26, 2016, at 7:35 PM, Ilan Schnell wrote:

@bkreider you (and others) are hitting the pain because you are using post-link scripts to do things which they were never designed for (that is configuration) and not following the guidelines for post-link scripts: http://conda.pydata.org/docs/building/build-scripts.html
To solve configuration problems (among other things), we created: https://github.com/conda/kapsel


Could someone please clarify if we need to be appending to .messages.txt or simply writing to it? I want to make sure that when we are writing to it that we aren't overwriting something from another pre/post [un]link script.

It's "just a file," so use >> rather than >.
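
The same distinction exists in Python's file modes, where "a" corresponds to >> and "w" to >. The sketch below uses a temporary directory as a stand-in for $PREFIX; the helper name write_message and the messages are hypothetical.

```python
import os
import tempfile


def write_message(prefix, text, mode="a"):
    """Write one line to .messages.txt; mode "a" appends, mode "w" truncates."""
    with open(os.path.join(prefix, ".messages.txt"), mode) as fh:
        fh.write(text + "\n")


def read_messages(prefix):
    with open(os.path.join(prefix, ".messages.txt")) as fh:
        return fh.read().splitlines()


prefix = tempfile.mkdtemp()  # stand-in for $PREFIX

# Two scripts appending: both messages survive.
write_message(prefix, "pkg_a: enabled")
write_message(prefix, "pkg_b: enabled")
appended = read_messages(prefix)

# A third script overwriting: the earlier messages are lost.
write_message(prefix, "pkg_c: enabled", mode="w")
overwritten = read_messages(prefix)

print(appended)
print(overwritten)
```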

Thanks for clarifying.
