Cylc-flow: cylc installation: packages for Debian, Red Hat, etc...

Created on 16 Feb 2015 · 26 Comments · Source: cylc/cylc-flow

We should look into creating proper packages for cylc installation, ideally supporting
the multi-version wrapper way of working.

All 26 comments

Issue numbering is deliberate!

I'm packaging cylc (also fcm, rose) for inclusion in Debian.

regards
Alastair McKinstry

@amckinstry - that's exciting to hear! Please keep us posted on progress (and let us know if any quirks in the system get in the way of proper packaging, of course). Are you a user yourself?

@hjoliver I've been experimenting with it at work (www.ichec.ie) where we do NWP and climate with Met Éireann, the Irish met. agency. I've been setting it up for satellite and radar data assimilation for some of our projects.

That's great! There's a Vagrant VM install recipe which may help at https://github.com/metomi/metomi-vms/tree/master/ubuntu-1404. It has working installations of fcm, rose, and cylc.

Struggled a little bit last week trying to install cylc on Ubuntu trusty because of pygraphviz. Managed to fix that after finding an issue on GitHub with a workaround.

This is a Dockerfile that builds Ubuntu 14.04 trusty + system dependencies + Python and packages, and cylc 6.8.1.

https://github.com/kinow/docker/blob/master/cylc/Dockerfile

Looked for some place in the docs folder where I could submit a pull request with instructions for cylc on Ubuntu, but I didn't find a good place to put this information. Let me know if there's any documentation that could be updated specifically for Ubuntu and I'll send a pull request.

@kinow - thanks for that. I had not run into this pygraphviz problem myself yet. Installation documentation - at present we only have the top level INSTALL file, which just refers to the cylc tarball and leaves external dependencies up to the user; and the _Installation_ section of the User Guide (doc/cug.tex) which is less minimal but could certainly be improved a lot - especially re docker, vagrant.

Thanks for the prompt reply @hjoliver . I'll take a look at the Installation section of the User Guide to see if I can send a pull request.

I wonder if it would be useful to create a new directory tree (under conf/ or admin/, or perhaps etc/) to hold Docker files...

Or a separate repository... right now we have only 1 repository under the cylc organisation... but maybe we could have one for docker.

This way it could be easier to use it as link for sources in Docker Hub, and also manage contributions/issues.

Some other OSS projects do that

But totally doable to have a folder like conf or dockerfiles. The latter is actually what JupyterHub does

Having played with Docker a bit, I think this is a good idea, although it's not clear to me which way is best (separate repo or not?). And @kinow we probably need to know/document how to use Cylc+Docker in a cluster environment. As discussed already, Docker seems to be a no-go in HPC at the moment, but at the least Cylc is normally used across multiple hosts (suite host + one or more job hosts). Presumably you can deploy the container on all hosts, and have jobs invoke cylc commands correctly on the job hosts. What about the multiple Cylc versions case (e.g. if you have a long-running suite that you don't want to upgrade yet, and a new suite for the new Cylc version)? Can we make this work transparently with our existing central wrapper script's CYLC_VERSION selection mechanism?
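
For reference, the version-selection part of that question can be sketched in a few lines. This is only an illustration, not the actual wrapper script: the function name, the `cylc-<version>` directory layout and the `/opt` root are assumptions made for the example; only the CYLC_VERSION variable comes from the discussion above.

```python
import os
from pathlib import PurePosixPath


def select_cylc_home(installed, default, root="/opt", environ=None):
    """Pick the Cylc installation directory for this invocation.

    `installed` lists the versions available on this host. The layout
    <root>/cylc-<version> is a hypothetical convention for this sketch.
    """
    environ = os.environ if environ is None else environ
    # An unset or empty CYLC_VERSION falls back to the site default.
    version = environ.get("CYLC_VERSION") or default
    if version not in installed:
        raise ValueError(f"Cylc {version} is not installed on this host")
    return PurePosixPath(root) / f"cylc-{version}"
```

With containers, the same choice could presumably be made by picking an image tag instead of a directory, but that is untested here.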

As I'm not aware this has been written down anywhere:

Packaging Cylc

With future developments in Cylc our dependency base is going to broaden. We are going to pick up dependencies which relate only to particular components of functionality, for example:

  • Build deps

    • CSS compilation (using SASS or an alternative): for development we might want to compile on the fly in JS, but when it comes to deployment it makes sense to ship the compiled CSS.

    • Code minimisation (CSS, JS)

  • Documentation deps

    • Sphinx (#2651) and a whole host of Sphinx-related dependencies (theming, auto-documentation tools, etc)

  • GUI deps

    • Web server framework

    • Javascript libraries

  • etc

We wouldn't want build or documentation dependencies to bloat the Cylc package; similarly, sites won't want to install web frameworks on job hosts. Should Cylc releases include cylc.tests? We will want to break the dependencies down by usage, giving users more fine-grained control over what they install where.

Our package groups might look something like:

  • Regular installation (More-or-less as present)
  • Client installation (pared-down package providing the Cylc command line)
  • Web Server
  • Web Service (in case there is a motivation to run services in different environments, or if a particular service has a more controversial dependency)
  • Job Host
  • Developer Install (build, documentation, testing)

PIP & Conda

Packaging systems can support this in different ways. With pip we could do something like this:

$ pip install cylc  # default Cylc installation
$ pip install cylc[developer]  # default + developer stuff
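
On the packaging side, pip "extras" like these are declared through the `extras_require` argument of setuptools. A minimal sketch of how the groups could be defined follows; the group names and their contents are purely illustrative (sphinx and tornado are mentioned in this thread, pytest is a hypothetical stand-in), and nothing here reflects an actual Cylc setup.py.

```python
def build_extras(groups):
    """Return an extras_require mapping with an aggregate "all" group added."""
    extras = {name: list(deps) for name, deps in groups.items()}
    # "all" is the de-duplicated union of every other group.
    extras["all"] = sorted({dep for deps in groups.values() for dep in deps})
    return extras


# Hypothetical dependency groups, for illustration only.
EXTRAS_REQUIRE = build_extras({
    "developer": ["sphinx", "pytest"],
    "webserver": ["tornado"],
    "job-host": [],
})

# In a setup.py this mapping would be passed to setuptools, e.g.
#     setuptools.setup(name="cylc", extras_require=EXTRAS_REQUIRE)
```

With that in place, `pip install cylc[webserver]` would pull in the default dependencies plus whatever the "webserver" group lists.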

Conda doesn't appear to have a universal solution to this yet. Separate packages are one option; outputs might be another:

package:
    name: cylc-split
    version: "8.0.0"

requirements:
    run:
        - python=3.7

outputs:
    - name: cylc.webserver
      requirements:
          run:
              - tornado
    - name: cylc.developer
      requirements:
          run:
              - javascript
              - sass

$ # basic
$ conda create -n 'mycylcenv' python cylc
$ # installing components
$ conda create -n 'mycylcenv' python cylc cylc.developer cylc.webserver

How Far Do We Want To Go?

$ # too far?
$ conda create -n 'service-group-1' python cylc cylc.suite-list-service cylc.suite-run-service

Do we:

  1. Just Package Cylc's dependencies (install the whole Cylc codebase for every type of installation)

    • Simple, easy, fast, done

  2. Package properly (install the bits of Cylc needed for the specified purpose)

    • Requires more modular code

    • Easy to do for the web-framework stuff

    • We would have to build Cylc for deployment (which we are probably going to have to do anyway)

  3. Lump up the Cylc codebase into different repositories.

Vote for more modular approach.

+1 for modular approach. Something along the lines of

$ pip install cylc  # basic cylc, as in the CUG
$ pip install cylc[kafka]  # whatever dependencies we need for cylc and kafka
$ pip install cylc[em]  # not sure if necessary, but just an example
$ pip install cylc[all]  # when you need everything

pip install cylc[job-host] # minimal client-only installation for job hosts

See also #2802.

Krita has been releasing official snap packages for a while. I used one once, but later compiled from source with Qt. It's a bit scary to have a huge binary with bundled dependencies... à la statically linked binaries. I prefer to manage my dependencies.

But I understand why it is useful to some users. It might also make it simpler for Krita developers to support users, who are normally not tech savvy and are more concerned with just getting it running so they can draw, illustrate, etc.

I know it works on Ubuntu. No idea whether RHEL, Fedora, etc. support it too.

B

The Snap format is an Ubuntu invention and, to my knowledge, not widely adopted. It might be a dead end in the long term, like some of their other inventions (Upstart, Mir).

Note via @arjclark on PyPI "typo squatting" cyber attacks:

https://medium.com/@bertusk/detecting-cyber-attacks-in-the-python-package-index-pypi-61ab2b585c67

We might want to upload a duplicate package to "cycl", or a stub that simply prints "Did you mean to install cylc?". Ping @kinow

I agree this is a problem, and I don't have a solution for that now... :disappointed_relieved: npm has the same problem.

We could register cycl... but there are other possible typos. But trying to register all of them doesn't sound like the best approach.
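
To give a rough feel for how many near-miss names there are, here is a small sketch that lists names one adjacent swap or one dropped character away from a package name. It is an illustration only: the function name is made up, and it ignores substitutions and insertions, which would add many more candidates.

```python
def typo_candidates(name):
    """Return names one adjacent swap or one deleted character away from name."""
    candidates = set()
    # Swap each pair of neighbouring characters, e.g. "cylc" -> "cycl".
    for i in range(len(name) - 1):
        candidates.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    # Drop each character in turn, e.g. "cylc" -> "cyl".
    for i in range(len(name)):
        candidates.add(name[:i] + name[i + 1:])
    # Swapping two identical neighbours gives the name back; leave it out.
    candidates.discard(name)
    return sorted(candidates)
```

For "cylc" this already yields seven names, "cycl" among them, which is why registering every variant looks impractical.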

Not sure if npm and PyPI are filtering _homoglyphs_, but those are another possible way to attack users if not filtered well.
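
As a partial data point, the project-name pattern given in PEP 508 only allows ASCII letters, digits, ".", "-" and "_". The sketch below checks a name against that pattern; whether a given index actually enforces it is an assumption here, and the function name is made up for the example.

```python
import re

# Name pattern from PEP 508. re.ASCII stops IGNORECASE from also matching
# the few non-ASCII letters that case-fold to ASCII ones.
_NAME_RE = re.compile(
    r"[A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9]",
    re.IGNORECASE | re.ASCII,
)


def is_ascii_project_name(name):
    """Return True if the whole name fits the PEP 508 name pattern."""
    return _NAME_RE.fullmatch(name) is not None
```

A name using a Cyrillic "с" in place of the Latin "c" fails this check, but ASCII look-alikes such as "cy1c" still pass, so this kind of filter does not remove the problem.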

I think instead we should later discuss the possibility of instructing users to host their own PyPI repositories. Some organisations already do that for Debian/RHEL repositories, and also for Java Maven artefacts. There are alternatives for PyPI too I think, though I have never done it.

This way we provide the safest solution, where you pick the dependencies you need and bring them within your organisation. Otherwise you allow connections to Anaconda/PyPI/GitHub/etc, where there is the risk of attacks like typo squatting (NB: cycl could happen in any of these, as pip supports Anaconda's pip repos, PyPI, GitHub, or even URLs to install from... same for the conda utility, which also connects to Conda repositories).

-- digress...

For me, the best approach for PyPI would be to follow what's done in Java's Maven repositories, and I believe also in Perl's CPAN repositories, and have some sort of group per artefact. So in our case we could have something like pip install cylc@cycl.

npm now has org-scoped packages, e.g. npm install @vue/cli. In the past there would be something like vue-cli, but now with the scope it's easier to group packages and have simpler names.

It's much easier to filter the creation of these scopes in npm, instead of filtering all new packages. Once you have an org/scope/group, you can safely publish whatever you want within that.

We might want to upload a duplicate package to "cycl"

As cycl is such a common typo that might be a good idea.

I think we are now pretty settled on the Conda model for packaging Cylc which opens up quick simple installation for the different Unix variants, multi-version installation, etc.

Given this do we still have an interest in maintaining packages with other package managers?

+1 to support only conda as we are doing. Users should be able to create rpm packages if they'd like/need

Agreed, I'll close this.
