Ckeditor5: Switch to versioned dependencies

Created on 2 Jan 2017  Â·  14Comments  Â·  Source: ckeditor/ckeditor5

tl;dr It discusses the need to change how we install packages in dev mode (i.e. cloned repos) and proposes creating a tool like Lerna but for multirepo.

Problem

We'll need to switch to versioned dependencies. Each package (i.e. ckeditor5 and all its dependencies) have dependencies in other ckeditor5-* packages and these dependencies should be defined using semver range ^x.y.z.

Currently, since we haven't released any package yet, we're using GitHub URLs in our package.jsons:

  "dependencies": {
    "ckeditor5-autoformat": "ckeditor/ckeditor5-autoformat",
    "ckeditor5-basic-styles": "ckeditor/ckeditor5-basic-styles",
    "ckeditor5-clipboard": "ckeditor/ckeditor5-clipboard",
    "ckeditor5-core": "ckeditor/ckeditor5-core",
    ...
  }

Thanks to that npm always installs the latest development version of each package. This makes testing a package or the whole project on CI plain simple – npm takes care of getting the latest code.

GitHub URLs have one more nice feature – we can specify tag/branch/hash of a dependency and npm i will install exactly this version. So if we want to switch to a configuration of branches we simply change package.json to:

  "dependencies": {
    "ckeditor5-autoformat": "ckeditor/ckeditor5-autoformat#t/xxx",
    "ckeditor5-basic-styles": "ckeditor/ckeditor5-basic-styles",
    "ckeditor5-clipboard": "ckeditor/ckeditor5-clipboard#t/yyy",
    "ckeditor5-core": "ckeditor/ckeditor5-core",
    ...
  }

And run our gulp update tool (or npm i if CI is installing this for the first time) and exactly these versions are installed.

Once we'll start using versioned dependencies we'll lose all that. npm i and our tool will always install the latest released version of each dependency.

Solution

Before I'll start, let's just first reconsider one thing – we're now symlinking dependencies in ckeditor5/node_modules/ to their cloned git repos. This is done only for the first level, so then you have ckeditor5/node_modules/ckeditor5-core/node_modules/ckeditor5-utils linking to some long outdated version of that package. To make Webpack and other bundlers load all ckeditor5-* packages from ckeditor5/node_modules/ we developed plugins for them which resolve import paths accordingly.

But then... why to symlink the packages at all? These symlinks are totally broken anyway, because if a module in ckeditor5-core requires some module from ckeditor5-utils, then Node, Webpack, Rollup, etc. will look for it inside ckeditor5-core/node_modules/.

So, my idea is to stop pretending that node_modules make any sense and stop symlinking anything. This will also prevent npm from logging all those ugly warning messages and randomly breaking deps installed in symlinked packages (which constantly happen to me, but it's a total bugfoot).

Another thing which I'd like to change is extracting the env tasks to a separate command line tool called e.g. mnpm (which means multi-npm or "metrów nad poziomem morza" :D we can find a better name ofc ;P).

Finally, let's not forget that we also need to switch to scoped modules, which means that we need to review all our existing env tools anyway because now the packages can be installed in node_modules/<scope>/*

Based on that we say:

  1. You can build CKEditor using versioned packages if you don't configure Webpack to use the dev versions (default behaviour). Then, Webpack sources packages from node_modules/.
  2. Or you enable the custom module resolution mechanism which can source packages from any directory which you pointed it to. It can be ckeditor5/packages/ and doesn't need to be ckeditor5/node_modules/ any more.

So, now the solution to versioning becomes simpler. You run mnpm install and it takes all @ckeditor/ckeditor5-* dependencies and dev dependencies (which were previously installed by npm i) and, if not cloned already, clones related repos to ckeditor5/packages/ and pulls latest changes.

Now, what we're missing is:

  1. How to tell this script to install concrete branches of certain dependencies?
  2. How to tell this whole system (esp. the plugin for bundler) that a certain package isn't installed at all? I'm not sure we need this because, perhaps, we can just say that ckeditor5 contains all the needed deps. But it's possible to have a situation that you will forget to specify that e.g. ckeditor5-image requires ckeditor5-widget and everything will work in our dev env even though none of the packages would install ckeditor5-widget in a normal scenario, just because ckeditor5-widget is always defined as a dep of ckeditor5. OTOH, it should be enough if we have this verified on CI.
  3. Which brings me to – how to run tests of latest dev versions on CI if a tested package will define versioned deps?

CI

For ckeditor5:

  1. Clone ckeditor5.
  2. Install mnpm.
  3. Run mnpm install --no-npm-install (which is going to clone all deps, but won't execute npm i inside them, to save time).
  4. Run tests based for all packages in ckeditor5/packages/.

For ckeditor5-foo:

  1. Clone ckeditor5-foo.
  2. Install mnpm.
  3. Run mnpm install --no-npm-install (which is going to clone all deps, but won't execute npm i inside them, to save time).
  4. Configure Webpack to source packages from ckeditor5-foo/ and ckeditor5-foo/packages/* and run tests for ckeditor5-foo/tests.

Dependencies setup

So, the last topic is how to tell the mnpm install which packages should be in which versions. My idea is that we should avoid magic, so it'd be best to introduce a simple mpnm.json config which will have keys like fixedDependencies where you can manually define some deps versions. If not defined, the mnpm will install master.

Finding dependencies to install in dev mode

mnpm must be able to understand which packages needs to be cloned. I think that the easiest way is to say:

  1. If something is your dependency you add it to your package.json as you'd normally do.
  2. You run npm i which install all your deps and deps of your deps in a flat way.
  3. You run mnpm i which scans node_modules/@ckeditor/ and finds there all the packages which need to be cloned.
  4. Alternatively, you run mnpm i @ckeditor/ckeditor5-foo or configure list of packages to be installed in mnpm.json (cause they can be your packages and the script wouldn't find them by applying some general pattern (which needs to be configurable in mnpm.json because mpmn should be project agnostic)).

This will work when executed in ckeditor5 and also when executed in ckeditor5-foo on CI. We can avoid putting mnpm.json in every repo by calling mnpm install --package-pattern='@ckeditor/ckeditor5-!(dev)*' on CI for ckeditor5-foo.

Last but not least

  • mnpm is meant to be a generic tool, not only for CKEditor 5. This means that mnpm.json may contain more configuration. It should be like Lerna but for multi-repo.
  • Keeping all packages inside ckeditor5/packages/ will make this similar to how we keep them now in https://github.com/ckeditor/ckeditor5-dev (and which is the default for Lerna), with that difference, that they won't be committed to the same repo.
  • There's only one thing which I'm worried and which is how to remove packages from ckeditor5/packages/ if they are removed from the project real dependencies. Perhaps it's enough that eventual missing deps will be verified on CI... Hope you agree :)
dev discussion feature

Most helpful comment

With my first attempt to create a testing env for submodules I did something (apparently) unexpected and I got:

(master 6244d58) p@m /...ckeditor5/test-submodules/test-main> git submodule add [email protected]:Reinmar/test-a.git packages/test-a/
A git directory for 'packages/test-a' is found locally with remote(s):
  origin  [email protected]:Reinmar/test-a.git
If you want to reuse this local git directory instead of cloning again from
  [email protected]:Reinmar/test-a.git
use the '--force' option. If the local git directory is not the correct repo
or you are unsure what this means choose another name with the '--name' option.

Problem is... I have no idea what it means. I only know it happens after my previous try to add this submodule, when the repos where completely empty. And now I don't see any new file in the repo, but the submodule was previously installed somehow. I checked .git/config because, according to one of the articles, the added submodule should be there, but it isn't there.

Super confusing. rm -rf test-submodules and starting from scratch.


My first observation was that I don't like the fact that the hashes of submodules aren't stored in some retrievable format.

http://stackoverflow.com/questions/5033441/where-does-git-store-the-sha1-of-the-commit-for-a-submodule

git submodule status

This means that operating on them will be trickier. Not a big deal, but it wasn't a coincidence that I looked for this because it's natural.


Next, I made a ticket branch in one of the sub repos, committed it, then committed a master ticket branch in the main repo and pushed that.

In a separate clone of the main repo (to simulate that the above changes were made by some other developer) I fetched the changes and switched to the master ticket branch. Then you need to execute git submodule update --init.

I then made one more commit in the submodule's ticket branch and pushed it. Now, a different developer wants to pull all changes done in the submodules (so he does that from the context of the main repo). I realised that, obviously, this won't work – the submodule is not set to a specific branch, but rather a commit. So in order to recursively get changes in all submodules I need to guess what branch I should be pulling... which means that we'd need to have a separate config file to keep on which branches the submodules should be and a tool to read it and fetch changes.

http://stackoverflow.com/questions/1030169/easy-way-pull-latest-of-all-submodules

To sum up, since there's no configuration of on which branch a certain submodule is we'd have to face several problems:

  • git clone --resursive [email protected]:Reinmar/test-main.git leaves you the submodules in detached state. This is both dangerous (a git beginner could lose they work if they commit to such a submodule) and inconvenient. A post install hook would need to be created to fix this.

    Looking for a solution I found http://stackoverflow.com/questions/6474847/how-do-i-git-clone-recursive-and-checkout-master-on-all-submodules-in-a-single where there's a comment:

    Perhaps you should consider gitslave as an alternative to git-submodule, depending on your development workflow it may be better (or worse). Specifically to your issue, gitslave keeps all members of the superproject on the same branch (absent specific git (not gitslave) commands to do otherwise, and even then many commands warn you when you are on different branches).

    Gitslave is useful when you control and develop on the subprojects at more of less the same time as the superproject, and furthermore when you typically want to tag, branch, push, pull, etc all repositories at the same time.

    git-submodule is better when you do not control the subprojects or more specifically wish to fix the subproject at a specific revision even as the subproject changes.

  • You can't configure to which branch you'd like to have some submodule checked out. You can only specify a hash, which will be inconvenient to share some dev environment setups.

  • Since the hashes aren't kept in a plain file... you can't diff them easily. Plus, hashes gives you no clue what they were supposed to mean.
  • Since we need an ability to specify the branch some times, we'll need to create a separate file for it anyway...
  • Since submodules are locked to a hash, we'd need a GitHub hook which would update submodules in ckeditor5 every time some change lands in any of the submodules. We talked about this already – we'll need this anyway for us to be able to do git bisect, but in the previous situation it was an option. Now it'd be a must. Also, this would mean that ckeditor5 repo would change very frequently, which might be irritating. I'm not sure in what cases this might backfire, but right now the main repo changes very infrequently and these changes contain only some dev tools update. I'm also a bit unsure how would this work for ckeditor5 forks, since resolving conflicts between submodule versions is a nightmare.
  • I also thought about this – if, e.g., you're working on a ticket in some submodule, but want to pull changes to all other submodules, to be sure that you're testing the latest changes, then you, again, need a tool...

Let me quote this bit again:

git-submodule is better when you do not control the subprojects or more specifically wish to fix the subproject at a specific revision even as the subproject changes.

Submodules may sound like what we need, but it seems now that they are not the best solution in our situation. They would be great if we'd have some big libraries, developed by separate teams and rarely updated in the main project. But in our case, all the packages are developed at the same time, hence a more dynamic solution is needed.

So, let's try some tools:

All 14 comments

A crazy idea... what if we just create a tool to clone all the repos to ckeditor5/packages/ and then use Lerna to manage them? :D:D Lerna symlinks all the modules between themselves.

I really like the idea. I am wondering if there will be no problems in having multiple repositories inside main ckeditor5 repository. But since we are not using git submodules, putting ckeditor5/packages/ to .gitingore should be fine I think.

+1 for creating generic tool from our env tasks, we gained some experience in working on multi-repository project and it would be cool if others could benefit from that.

I see now that we have two options which I described in my 1st and 2nd comment:

  1. Say that we don't have energy to hack npm and module import resolution on file system level and move this to Webpack. This is the idea in which we keep packages inside some custom directory like packages/.
  2. Say that we want to use the original module resolution logic and symlink all the packages and their deps together. This is what Lerna does, because all the deps inside packages/ are symlinked between each other.

The second approach sounds more correct, because you adhere to the environment you work with. But it's really similar to what we have now, but with a deeper symlinking. We can save some time by using Lerna to do that, but TBH, I'm not quite sure if that's going to allow us getting rid of the custom module resolution. I don't know today if modules accessible under:

  • ckeditor5/packages/ckeditor5-engine/node_modules/ckeditor5-utils/*
  • ckeditor5/packages/ckeditor5-core/node_modules/ckeditor5-utils/*

will be resolved by Webpack or Rollup to: ckeditor5/packages/ckeditor5-utils/* (assuming that all this is cross-symlinked) or whether such packages will be sourced twice. This would need to be checked.

So, before we make any decision, I think we should test it. Because perhaps cross-symlinking everything using Lerna is just the simplest possible solution.

Uh... This was interesting. I decided to test how Webpack and Rollup will resolve the imports in case of symlinked node modules. First I created such structure:

index -> cke5-a
cke5-a -> cke5-b
cke5-a -> cke5-c
cke5-b -> cke5-c

And everything worked fine:

You can only see one console log for the module c.

But then I added relative imports, exactly like we ue them:

index -> cke5-a
index -> cke5-c/foo
cke5-a -> cke5-b
cke5-a -> cke5-c
cke5-a -> cke5-c/foo
cke5-b -> cke5-c

And here are the results:

So it worked with Webpack and broke with Rollup (the "c2" console.log is duplicated). I'm going to add Browserify to see which behaviour is more frequent (Rollup may have a bug).

OK, Browserify works fine, but after I converted the code to Node.js's require() (so Browserify could parse it) it generated a totally broken output.

The issue for Rollup is already opened https://github.com/rollup/rollup-plugin-node-resolve/issues/67 but there was no response yet.

We worked together with @pomek on designing how the new workflow could look like. I described many use cases and features in https://gist.github.com/Reinmar/b1e6c2aa571adbc11b04ed3fca0e18f9.

Then, we talked with @wwalc, @pjasiun and @oleq and they pointed out that the mgit tool that we'd like to propose is very similar to what git submodules do. I'll work now on understanding whether and how git submodules will be usable for us. Their obvious advantage is that they are a pretty well known tool, but the doubt we have are related to how submodules worked. We had very unpleasant experience with them in the past.

Anyway, let's see...

With my first attempt to create a testing env for submodules I did something (apparently) unexpected and I got:

(master 6244d58) p@m /...ckeditor5/test-submodules/test-main> git submodule add [email protected]:Reinmar/test-a.git packages/test-a/
A git directory for 'packages/test-a' is found locally with remote(s):
  origin  [email protected]:Reinmar/test-a.git
If you want to reuse this local git directory instead of cloning again from
  [email protected]:Reinmar/test-a.git
use the '--force' option. If the local git directory is not the correct repo
or you are unsure what this means choose another name with the '--name' option.

Problem is... I have no idea what it means. I only know it happens after my previous try to add this submodule, when the repos where completely empty. And now I don't see any new file in the repo, but the submodule was previously installed somehow. I checked .git/config because, according to one of the articles, the added submodule should be there, but it isn't there.

Super confusing. rm -rf test-submodules and starting from scratch.


My first observation was that I don't like the fact that the hashes of submodules aren't stored in some retrievable format.

http://stackoverflow.com/questions/5033441/where-does-git-store-the-sha1-of-the-commit-for-a-submodule

git submodule status

This means that operating on them will be trickier. Not a big deal, but it wasn't a coincidence that I looked for this because it's natural.


Next, I made a ticket branch in one of the sub repos, committed it, then committed a master ticket branch in the main repo and pushed that.

In a separate clone of the main repo (to simulate that the above changes were made by some other developer) I fetched the changes and switched to the master ticket branch. Then you need to execute git submodule update --init.

I then made one more commit in the submodule's ticket branch and pushed it. Now, a different developer wants to pull all changes done in the submodules (so he does that from the context of the main repo). I realised that, obviously, this won't work – the submodule is not set to a specific branch, but rather a commit. So in order to recursively get changes in all submodules I need to guess what branch I should be pulling... which means that we'd need to have a separate config file to keep on which branches the submodules should be and a tool to read it and fetch changes.

http://stackoverflow.com/questions/1030169/easy-way-pull-latest-of-all-submodules

To sum up, since there's no configuration of on which branch a certain submodule is we'd have to face several problems:

  • git clone --resursive [email protected]:Reinmar/test-main.git leaves you the submodules in detached state. This is both dangerous (a git beginner could lose they work if they commit to such a submodule) and inconvenient. A post install hook would need to be created to fix this.

    Looking for a solution I found http://stackoverflow.com/questions/6474847/how-do-i-git-clone-recursive-and-checkout-master-on-all-submodules-in-a-single where there's a comment:

    Perhaps you should consider gitslave as an alternative to git-submodule, depending on your development workflow it may be better (or worse). Specifically to your issue, gitslave keeps all members of the superproject on the same branch (absent specific git (not gitslave) commands to do otherwise, and even then many commands warn you when you are on different branches).

    Gitslave is useful when you control and develop on the subprojects at more of less the same time as the superproject, and furthermore when you typically want to tag, branch, push, pull, etc all repositories at the same time.

    git-submodule is better when you do not control the subprojects or more specifically wish to fix the subproject at a specific revision even as the subproject changes.

  • You can't configure to which branch you'd like to have some submodule checked out. You can only specify a hash, which will be inconvenient to share some dev environment setups.

  • Since the hashes aren't kept in a plain file... you can't diff them easily. Plus, hashes gives you no clue what they were supposed to mean.
  • Since we need an ability to specify the branch some times, we'll need to create a separate file for it anyway...
  • Since submodules are locked to a hash, we'd need a GitHub hook which would update submodules in ckeditor5 every time some change lands in any of the submodules. We talked about this already – we'll need this anyway for us to be able to do git bisect, but in the previous situation it was an option. Now it'd be a must. Also, this would mean that ckeditor5 repo would change very frequently, which might be irritating. I'm not sure in what cases this might backfire, but right now the main repo changes very infrequently and these changes contain only some dev tools update. I'm also a bit unsure how would this work for ckeditor5 forks, since resolving conflicts between submodule versions is a nightmare.
  • I also thought about this – if, e.g., you're working on a ticket in some submodule, but want to pull changes to all other submodules, to be sure that you're testing the latest changes, then you, again, need a tool...

Let me quote this bit again:

git-submodule is better when you do not control the subprojects or more specifically wish to fix the subproject at a specific revision even as the subproject changes.

Submodules may sound like what we need, but it seems now that they are not the best solution in our situation. They would be great if we'd have some big libraries, developed by separate teams and rarely updated in the main project. But in our case, all the packages are developed at the same time, hence a more dynamic solution is needed.

So, let's try some tools:

I've checked mr. I was looking forward to this tool. Unfortunately, it does not work on Windows…

But… the tool allows doing everything we want to do:

  • cloning repositories,
  • updating repositories,
  • executing a custom command on all repositories.

Installation is very simple:

brew install myrepos
apt-get install mr

Configuration is very simple and can be placed in /ckeditor5/.mrconfig:

[packages/ckeditor5-autoformat]
checkout = git clone [email protected]:ckeditor/ckeditor5-autoformat.git

[packages/ckeditor5-basic-styles]
checkout = git clone [email protected]:ckeditor/ckeditor5-basic-styles.git
update = echo "Updated." && git pull

We can define a command for checkout, update, etc… From my point of view - this tool does exactly we want to do. But… as I said - Windows isn't supported.

mr supports also a job in parallel.

image

I tried to use gr, but I cannot. I read the documentation and IMO this tool won't help us.

I've checked mr. I was looking forward to this tool. Unfortunately, it does not work on Windows…

I think that the ease of install and use is one of the most important requirements. This one isn't maintained any more, it's written in C/C++ (which means we won't be able to contribute) and it isn't broadly available.

mu and gr are both written in Python, which is better, but not perfect. But all this made me think that we need so few features from this tool that spending more time on the research doesn't make sense. So if any of these tools didn't look very promising, then let's get back to our mgit.

BTW, there are 3 other tools called mgit already:

The first one looks interesting. The rest of them are absolutely dead.

But again, I don't want to waste any more time. Please only check in what format the first one keeps the installed repos, because, perhaps, we might follow it. But again, none of this is JS based, none of this can be installed using npm, etc.

To make it clear for everybody, for MVP we need a tool which can do:

  • mgit bootstrap
  • mgit update
  • mgit exec

We should be able to have this in 2 days. We already wasted more than 1 on research which didn't give us anything.

And one more tool called mgit: https://github.com/cpsubrian/mgit :D.

This one, AFAIU works a bit differently and may not have any configuration. But it may give you some inspiration too.

So mgit2 was published https://www.npmjs.com/package/mgit2 :) Congrats, @pomek!

We'll continue work in https://github.com/ckeditor/ckeditor5/issues/389.

Was this page helpful?
0 / 5 - 0 ratings