Json: JSON library as a git submodule

Created on 5 May 2020  ·  13 Comments  ·  Source: nlohmann/json

In order to use nlohmann/json as a submodule of another project it would be great to have a small subset of the JSON project without unit tests, documentation and benchmarks.

The size of the v3.7.3 release is 252 MB, while the size of the include folder is 800 KB. It takes time to download the library on CI if you build from scratch.

As a workaround, one can download the zip file and keep only the include folder.

Labels: enhancement/improvement, stale


All 13 comments

This is odd, because we just recently in #2081 removed the test data from the develop branch. A shallow checkout is now just 6 MB:

โฏ git clone --depth 1 https://github.com/nlohmann/json.git
Cloning into 'json'...
remote: Enumerating objects: 855, done.
remote: Counting objects: 100% (855/855), done.
remote: Compressing objects: 100% (733/733), done.
remote: Total 855 (delta 121), reused 588 (delta 62), pack-reused 0
Receiving objects: 100% (855/855), 6.45 MiB | 3.93 MiB/s, done.
Resolving deltas: 100% (121/121), done.

I'm not sure if I can configure the depth of the clone on CI... A full clone is still massive.

$ git clone https://github.com/nlohmann/json.git
Cloning into 'json'...
remote: Enumerating objects: 124, done.
remote: Counting objects: 100% (124/124), done.
remote: Compressing objects: 100% (81/81), done.
Receiving objects:  11% (5825/50464), 64.34 MiB | 240.00 KiB/s
...

Looks like the repo is huge because of its history. The checkout itself is 11 MB, while the .git folder is 180 MB.

So --depth is the only workaround.
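Most CI systems ultimately just invoke git, so the depth can usually be set on the clone command itself. A minimal local sketch (using throwaway repos, so it runs without network) of what `--depth 1` buys:

```shell
# Throwaway sandbox; the identity exports just let the demo commits succeed.
set -e
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# An "upstream" repo with three commits of history.
git init -q upstream
for i in 1 2 3; do git -C upstream commit -q --allow-empty -m "commit $i"; done

# Shallow clones need a real transport, hence file:// rather than a bare path.
git clone -q --depth 1 "file://$tmp/upstream" shallow
git -C shallow rev-list --count HEAD   # prints 1; a full clone would show 3
```

Only the latest commit is transferred, regardless of how long the history is.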

I believe a new git tag is needed now instead of v3.7.3.

There will be a new tag once version 3.8.0 is released which should happen this month. I will also investigate how to remove the large files from the git history.

Have you tried using shallow submodules?

No, I haven't. Thanks for pointing that out.

Looks like this feature is broken in git 2.14.1+. See the "Summary of buggy / unexpected / annoying behaviour as of Git 2.14.1" answer.

It seems that answer discusses two specific cases relating to checking out a custom branch/commit of the submodule. Given the length of the discussions (especially the first answer), I understand that shallow submodules are probably brittle and might fail in some cases, but in your use case they might solve the problem.

I tried applying the git config ... shallow true command to another submodule in my repo. Then I called git clone --recurse-submodules again, but it changed nothing. Git still downloads the entire history.

I guess it is because I pin submodules to tags (== hash), so git has to download the entire history to find that hash. Or I'm using it incorrectly.
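For the pinned-to-a-hash case, one workaround is to fetch exactly that commit at depth 1. This relies on the server allowing fetches of unadvertised SHA-1s (the `uploadpack.allowAnySHA1InWant` setting); GitHub appears to permit this, which is how actions/checkout fetches a single pinned commit. A local sketch:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A "library" repo where the commit we pinned is NOT the branch tip.
git init -q lib
git -C lib commit -q --allow-empty -m "the pinned release"
git -C lib commit -q --allow-empty -m "newer work"
pinned=$(git -C lib rev-parse HEAD~1)
git -C lib config uploadpack.allowAnySHA1InWant true

# Fetch only that commit; history before it is never transferred.
git init -q checkout
cd checkout
git remote add origin "file://$tmp/lib"
git fetch -q --depth 1 origin "$pinned"
git checkout -q --detach FETCH_HEAD
git rev-list --count HEAD   # prints 1
```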

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

There will be a new tag once version 3.8.0 is released which should happen this month. I will also investigate how to remove the large files from the git history.

There's exactly one way to remove large files from history. Well, two ways if you count "do a shallow clone".

You will need to use the git-filter-branch(1) command (see the EXAMPLES section of the manpage) to rewrite history, and then force-push to overwrite all commits. This will break previous git tags, commit SHA-1 references, etc., which is obviously not an ideal situation. Some people would argue that for a one-time change, if the reward is great enough (like removing lots of very large files from the history), it makes sense to do this. Other people would argue history should never be rewritten for any reason whatsoever. Ultimately, this is a personal choice.
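A sketch of that rewrite, following the `--index-filter` pattern from the manpage's EXAMPLES, on a throwaway local repo (the paths `test/data` and `json.hpp` are illustrative; destructive in real use, since every downstream commit hash changes):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the "use filter-repo" pause

# A repo whose history contains a large fixture directory.
git init -q repo
cd repo
mkdir -p test/data
echo 'huge fixture' > test/data/big.json
echo 'header v1'    > json.hpp
git add -A && git commit -q -m "add header and test data"
echo 'header v2' > json.hpp
git add -A && git commit -q -m "update header"

# Rewrite every commit, dropping test/data from the index;
# --ignore-unmatch keeps the filter from failing on commits without it.
git filter-branch -f --index-filter \
  'git rm -r --cached -q --ignore-unmatch test/data' HEAD

git ls-files   # prints json.hpp only; test/data is gone from all commits
```

Newer git versions recommend the third-party git-filter-repo tool for the same job, but the filter-branch form above is what ships with git itself.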

If you're interested in doing this, don't hesitate to ping me with questions.

Thanks for letting me know. I did some brief research and also found the mentioned downsides of removing large files from the history. Right now, I am more than hesitant to break existing tags and hashes, and hope that submodules get nicer support for shallow clones.

I believe the least intrusive way would be to start a new repository on GitHub (e.g. nlohmann/modern-json) and leave this one as is (possibly marking it as read-only/archived/deprecated). This provides a clean history for users who want to adopt the latest version, while keeping the whole history intact for whoever depends on it. I guess it might even be possible to negotiate with GitHub to transfer watchers/stars to the new repository.

Even such a change, which would not be disruptive for users, needs careful consideration and might only be carried out in a major release (e.g. 4.0.0), if ever.

