Tldr: Move to a Monorepo

Created on 12 Jun 2017 · 25 Comments · Source: tldr-pages/tldr

Hey guys! How's it going?

I was wondering if you'd be interested in turning tldr into a monorepo that includes the pages, and also the clients.

The goal would be to centralize issue handling rather than pointing people to other places, or linking issues between repos, and to gain more control over the quality of the clients that are built around the content we host.

Some things we'd have to figure out are:

  • Unified build process for different clients
  • Dependencies and points where things could be simplified (like js stuff sharing code)
  • Permissions that would need to be issued (for people whose project we'd move into this repository)
  • Look into generating some of the docs that are currently maintained manually (like listing the clients and pointing to the right versions of them)
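
To make the last bullet concrete, here is a minimal sketch of what generating the client list could look like. Everything in it is an assumption made for illustration: the `Client` fields, the example.org URLs, and the version numbers are hypothetical, since no metadata format for clients has been agreed on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Client:
    # All fields are hypothetical; no client metadata format has been defined.
    name: str
    language: str
    repo: str
    version: str


def render_client_list(clients):
    """Return a Markdown bullet list of clients, sorted by name."""
    lines = []
    for c in sorted(clients, key=lambda c: c.name.lower()):
        lines.append(f"- [{c.name}]({c.repo}) ({c.language}), latest: {c.version}")
    return "\n".join(lines)


# Placeholder data; the URLs and versions are made up for the example.
CLIENTS = [
    Client("node-client", "JavaScript", "https://example.org/node-client", "1.0.0"),
    Client("cpp-client", "C++", "https://example.org/cpp-client", "0.3.1"),
]

if __name__ == "__main__":
    print(render_client_list(CLIENTS))
```

The generated text could then be pasted (or written by a build step) into the README section that is currently maintained by hand.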

But the outcome would be a single source of truth for dealing with everything tldr :)

I'd like to volunteer and take a stab at this if you guys think it's a good idea

@tldr-pages/content
@tldr-pages/cpp-client
@tldr-pages/exilir-client
@tldr-pages/node-client
@tldr-pages/python-client

architecture clients decision documentation mass changes tooling

All 25 comments

Wouldn't it be better to actually merge the clients and just move to one single client working on all platforms by just adding an executable to your path with no dependencies?
I would love to provide support for the cpp-client, which is able to achieve said goal, but I would need some assistance, since my time is currently very scarce.

As proposed before: Make the cpp-client the sole client, since it already works on each platform without any installed dependencies, which means we could provide binaries for each platform that can simply be installed through an "installer" (some script, or even an actual installer for Windows systems).

I agree. A single client that compiles to all platforms would be ideal.

Just keep in mind that we'd still need to move the web client and the mobile clients. Code-sharing between those, to some extent, would be very very nice.


Re: the cpp client, I'm not down for that. I'd like a higher level language, not so high as Haskell though, that can still be compiled down to all necessary platforms. And no, I'm not thinking Go either, but Rust would be an interesting option.

And of course if we are all angry about languages we can just say to hell with syntax and use Chicken Scheme (https://www.call-cc.org/) or Racket (https://racket-lang.org/) 😝

> Just keep in mind that we'd still need to move the web client and the mobile clients. Code-sharing between those, to some extent, would be very very nice.

Should be somewhat possible, if it's written in C (or C++, or anything that compiles to C) to at least share most of the code between all platforms (WebAssembly / Emscripten for the web, no special treatment required for iOS, native lib for Android, although that's a little cumbersome, and Win/Linux/macOS is obvious).

I don't care about the language too much, though. My preferred language is C, which is why I chose to write a client in C.
I do agree that having at least a network component in the standard library would remove the need for any third-party dependencies, or rather move their maintenance away from us (the C version currently depends on libcurl).
Choosing a language widely known is a plus, since that will encourage / increase user contributions (no, JS is not an option, in my opinion). C is daunting for a lot of people.

I actually like Chicken Scheme / Racket! 👍 :trollface:

I would be more inclined to go in the direction of decentralization, to be honest -- i.e. establishing a solid, well-defined ecosystem, rather than a(n even more) centrally managed project. That would give people more flexibility to participate in the project (both as users and as contributors) in the ways that are most comfortable for them, and if we do it right, we wouldn't lose in terms of quality or consistency.

By "do it right", I mean mostly wrapping up the client spec (#1065), and moving the rest of the clients out of the tldr-pages organization (https://github.com/tldr-pages/tldr/issues/1104#issuecomment-251525707).

That way, we are able to (1) allow people to contribute and play around using whatever tools they prefer, (2) make sure there isn't too much stuff for maintainers to handle, which can be problematic during low-availability periods, and (3) easily ensure that any clients we endorse (by listing in the Readme and website) conform to the explicit quality/functionality requirements we've defined.

I agree with @waldyrious on this. I like the system as it is now: a pages repo and individual client repos. Having different clients is kinda the fun part and people get to choose what client they want to use.

Re: issue handling - Yea I agree that's slightly painful.
Re: single source of truth - Well actually there are no multiple sources of truth here. We have multiple truths and that's totally fine by me. Each repo being concerned with its own stuff. Furthermore, unifying the build process for all clients is just going to be plain messy. I would recommend against that.

So my vote is to keep things as is, as I don't see any big gain from doing it.

> Having different clients is kinda the fun part and people get to choose what client they want to use.

I don't understand what the users gain by that. I can only imagine confusion over which client to choose, and then later noticing that some new feature might not be working because it's only implemented in other clients. That's simply frustrating.

I think we can all agree that doing brew install foo (as an example) is much more convenient for a macOS user than, say, curl / configure / make / make install, not to mention the ability to update, uninstall, etc. Similarly, for a user who already has node, or pip, or cargo, etc. in their system, the most convenient way to install a tool is to use those ecosystems. What I'm saying is YMMV, and consolidating around a single client will force people into a single workflow that may not be the most adequate for them.

By going the ecosystem route, rather than the monorepo one, we ensure that (1) people retain the freedom to choose whatever client they prefer; (2) the maintainers are not strained to support all the convenient functions that those package managers provide; and (3) we remain a fertile ground for people to experiment with new clients and new features for the clients.

> I can only imagine confusion over which client to choose, and then later noticing that some new feature might not be working because it's only implemented in other clients.

That's precisely what we'd gain by establishing a well-defined set of features that all sanctioned clients should follow: then any extras would be cherries on top, but they would be advertised by those specific clients, not as general features of the tldr client.

Having a single client doesn't mean we're going to drop convenient ways of installing. It's the exact opposite.

Having a single client ensures that we can focus on properly delivering it to the users. Instead of trying to maintain multiple clients with multiple ways of installing, we can focus on delivering our client via a host of different methods:

  • Package managers on Linux, one for each popular distro (Debian/Ubuntu, Arch Linux, CentOS, etc.)
  • Homebrew on macOS
  • Chocolatey or an installer on Windows
  • A binary for each system, for an easy drop into the path

Each method (in the best case) requires a single maintainer to make sure it's working and complies with our quality requirements. I count 9 different methods, which means that if we want to provide a unified user experience, each client would require at the very least 10 people to provide this.

If all of this burden is on each of the client maintainers, there will be compromises made, for example, by dropping methods of delivering or not integrating new features, due to the sheer amount of work required.

Apart from the amount of work and people required to maintain multiple clients vs a single client, it's going to be a lot more difficult to make sure each of the clients implements the spec completely and complies with our quality requirements. Even if the spec route is taken, how will changes be handled? It'll take some time for clients to implement changes, and if features are advertised but not yet implemented in a user's client, that's going to cause frustration. It would even require us to regularly check clients and remove them if they no longer comply.

I honestly don't see any good point in trying to make multiple clients work for us as an organization. Keep in mind, I'm not saying we should disallow anybody from implementing a tldr client in their favorite language, I'm just saying that we should have a single client to focus on and endorse as a whole. This is going to make the quality of said client a lot better, compared to splitting our users among many different clients.

To answer some of your concerns:

> people retain the freedom to choose whatever client they prefer;

What is that buying us? I've never seen a user complaining that their cat (or man, grep, whatever) implementation is not available in their favorite programming language.

> the maintainers are not strained to support all the convenient functions that those package managers provide

This problem is only caused by having multiple clients and would be completely gone with a unified client experience.

> we remain a fertile ground for people to experiment with new clients and new features for the clients.

Why would you want to "experiment" with the tldr client? It either works, or it doesn't. If you install a client which doesn't, 90% of users will never come back to tldr because their FTUE was terrible. We, as an organization, can still experiment with features, by simply having multiple channels where users can choose to install bleeding-edge versions, beta versions or stable releases.

I think I should weigh in on this discussion. Personally, I think that having multiple clients and the tldr-pages repo for just the pages themselves makes more sense - and not just because of the single responsibility principle. What if I find a terrible terrible bug in the client that I use? I'll report it, sure, but I still want to be able to use tldr-pages easily (what about a REST api too? Just a random thought), so I could use another client instead.

And a multi-client environment helps bring tldr-pages to all sorts of platforms we can't bring it to ourselves. What about an app for an iOS / Android device? Or a client implemented in hardware with an Arduino? Or a terribly obscure architecture that we haven't considered in the C / C++ code?

Also, having a client in C / C++ will invariably limit the people who could contribute to it. Sure, you know C and C++, but what about everyone else? It isn't a simple language to learn at _all_. I've taken a whole year's worth of university lectures and I'm still shaky on a number of key points. I prefer C#. The next guy might prefer Ruby. The next person might like a bit of Python. The solution: have multiple clients - then people can contribute in their 'native' language. Contribution helps people find an easy way to get involved in writing code.

This is just my opinion. I welcome counter-arguments.

Also, what are those at-symbols in the initial comment, @ostera? Here they are:

@tldr-pages/content
@tldr-pages/cpp-client
@tldr-pages/exilir-client
@tldr-pages/node-client
@tldr-pages/python-client

They all lead to 404s, so I think you might have gotten some syntax wrong.

> What if I find a terrible terrible bug in the client that I use?

That's a case which should never happen in a stable release; that's what beta versions are for: to discover this sort of bug.

> And a multi-client environment helps bring tldr-pages to all sorts of platforms we can't bring it to ourselves

A single client can, without any problems, run on any of the most used platforms (Windows, macOS, Linux, *BSD).
Why would you want to use tldr on an embedded platform to begin with? You very likely never work directly on it, and therefore always have a host running one of the mainstream operating systems, where you can use tldr.

> Also, having a client in C / C++ will invariably limit the people who could contribute to it

Nobody said it'll be C or C++ in the end. For what it's worth, it could be anything that can be compiled to a self-contained executable on all platforms (like Go, Rust, Haskell).

> Contribution helps people find an easy way to get involved in writing code.

If you want to write a client in your favorite language, nobody is stopping you. Why do we have to provide clients for people to contribute to? That's not the point of it. The point is to have a single stable client for our users. If nobody is contributing, but it works like a charm, that's totally fine by me. (How often have you contributed to any of the GNU tools? 99% of people use them without ever even looking at the source, and they have worked like a charm for 25 years.)

@sbrl Those are mentions to the teams in the tldr organization, they link to it, which is why everybody not in the org gets a 404.

Hey! Amazing discussion everybody. Thanks for chiming in :) I'll try to clarify some of my points and comment directly to some of the things you said.

On a _single client_

Just to clarify a small point, when I said:

> A single client that compiles to all platforms would be ideal.

By _platform_ I also meant runtimes like Ruby, Python, or Node.

So that changes the view from

                    +------>.deb
                    |
+--------------+    +------>.rpm
|              |    |
|  One Client  +----------->.exe
|              |    |
+--------------+    +------>.app
                    |
                    +------>.apk

To

+--------------+     +-----------+    +-----------------+
|              |     | Generated |    |                 |
| Client Spec  +----->  Client   +----> Target Platform |
|              |     |   Code    |    |                 |
+--------------+     +-----------+    +-----------------+

Which means we generate interpreted clients:

  • Python
  • Ruby
  • Javascript
  • _other interpreted languages_

And compiled clients that we can compile down to different platforms:

  • Rust (just as an example, would work for most binary platforms)
  • Swift for iOS
  • Java/Kotlin/Frege/whatever JVM language you want to use for Android

This effectively gives you a "single client" (behavior) that ends up as a multiplicity of "single clients" (artifacts of the code, runnables).

And everyone gets to use whatever they feel like, knowing that it's _the same client (behavior)_ everywhere, even when it's _not the same client (artifact)_.
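
As a toy illustration of the "Client Spec → Generated Client Code" arrow, the sketch below renders one shared spec into two artifacts using templates. It is only a sketch under loud assumptions: the `SPEC` keys, the example.org URL, and the behavior of the generated clients are all hypothetical and not taken from the actual client spec (#1065). Note that the per-target templates are still written by hand; only the values they share come from the spec.

```python
from string import Template

# Hypothetical spec: these keys and values are invented for the example.
SPEC = {
    "usage": "usage: tldr <page>",
    "base_url": "https://example.org/pages",
}

# One hand-written template per target; `$$` is a literal `$` in the output.
PYTHON_TEMPLATE = Template('''\
import sys
import urllib.request

def main(argv):
    if len(argv) != 2:
        print("$usage", file=sys.stderr)
        return 2
    with urllib.request.urlopen("$base_url/" + argv[1] + ".md") as response:
        sys.stdout.write(response.read().decode("utf-8"))
    return 0

if __name__ == "__main__":
    sys.exit(main(sys.argv))
''')

SHELL_TEMPLATE = Template('''\
#!/bin/sh
if [ "$$#" -ne 1 ]; then
  echo "$usage" >&2
  exit 2
fi
curl -fsS "$base_url/$$1.md"
''')

TEMPLATES = {"python": PYTHON_TEMPLATE, "shell": SHELL_TEMPLATE}


def generate(spec):
    """Render every target template with the shared spec values."""
    return {target: tpl.substitute(spec) for target, tpl in TEMPLATES.items()}


if __name__ == "__main__":
    for target, source in generate(SPEC).items():
        print(f"--- {target} ---")
        print(source)
```

Both artifacts share the usage message and page location from the spec, which is the "same client (behavior), different client (artifact)" idea in miniature. Anything platform-specific still lives in the individual templates.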


@waldyrious, when you say:

> (3) easily ensure that any clients we endorse (by listing in the Readme and website) conform to the explicit quality/functionality requirements we've defined.

I'd argue that listing doesn't guarantee us anything. As an example, I could put up ads on tldr.jsx and actively be getting (some) money out of hits on tldr.sh -- adding my link to the README is no guarantee that I can't do that. And that's a barely malicious one; people download far more dangerous crap onto their computers without verifying it (I mean, let whoever has never "curl http://something | bash"-ed something throw the first stone).

But generating the client on the other hand...does give you the guarantee.

On workload and maintenance

@Leandros

> If all of this burden is on each of the client maintainers, there will be compromises made, for example, by dropping methods of delivering or not integrating new features, due to the sheer amount of work required.

@waldyrious

> (2) make sure there isn't too much stuff for maintainers to handle, which can be problematic during low-availability periods

By automating the point I made at the beginning of this comment, you reduce the amount of _maintenance_ drastically.

@agnivade

> Furthermore, unifying the build process for all clients is just going to be plain messy

Actually: bazel.io. Most big companies work in monorepos -- the mess that comes with multi-repo issue-cross-linking and inter-dependency-management is far heavier than solving the build process once for all things.

On a REST API

@sbrl:

> (what about a REST api too? Just a random thought)

That's just another _projection_ of the client spec.

But I don't want to generate stuff!

If we didn't generate any clients, I can imagine us setting up the build process to host Python, Ruby, Node, Rust, C++, Java, Bash, and Swift clients. The initial cost of doing it is high, which is why I volunteered for it, but the outcome would be that the _spec_ could be actual usage tests of all clients.

No change to any client that didn't pass the end-to-end tests / spec would get merged. Either you conform to the spec, or you're out.
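
A minimal sketch of what "the spec as usage tests" could look like: each spec case is a command-line invocation with an expected exit status and expected output, and any client executable can be run against the same cases. The `SpecCase` shape, the two example cases, and the stand-in `FAKE_CLIENT` are assumptions made for illustration; they are not taken from the real client spec.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecCase:
    # Hypothetical shape of one spec entry; not taken from the real client spec.
    description: str
    args: tuple
    expected_exit: int
    expected_in_stdout: str = ""


def check_client(command, cases, timeout=10):
    """Run `command + case.args` for each case; return descriptions of failures."""
    failures = []
    for case in cases:
        result = subprocess.run(
            list(command) + list(case.args),
            capture_output=True, text=True, timeout=timeout,
        )
        ok = (result.returncode == case.expected_exit
              and case.expected_in_stdout in result.stdout)
        if not ok:
            failures.append(case.description)
    return failures


# Stand-in "client" so the sketch runs anywhere: prints a page for `tar`
# and exits with status 1 for anything else.
FAKE_CLIENT = [
    sys.executable, "-c",
    "import sys\n"
    "page = sys.argv[1] if len(sys.argv) > 1 else ''\n"
    "if page == 'tar':\n"
    "    print('tar: archiving utility')\n"
    "else:\n"
    "    sys.exit(1)\n",
]

CASES = [
    SpecCase("shows an existing page", ("tar",), 0, "tar"),
    SpecCase("fails on a missing page", ("no-such-page",), 1),
]

if __name__ == "__main__":
    failed = check_client(FAKE_CLIENT, CASES)
    print("conforms" if not failed else f"rejected: {failed}")
```

Because the harness only talks to a client through its command line, the same cases could be pointed at a Python, Rust, or Bash client without changes; a CI job would then refuse to merge when the returned list is non-empty.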


Again, thanks everyone for the conversation!

Hi, I'm still not sold on the usefulness of having multiple clients. Can someone explain what the benefit is?

For me the parallel is if https://git-scm.com/ showed a choice of a Ruby client, C++ client, Python client... As a consumer, I don't really care about the language it was written in, I just want to download it the easiest way. That way is typically brew install or apt-get install, or downloading binaries for my platform.

There are tools out there that chose to depend on a given runtime (e.g. Jekyll) but they simply say that in the README and ask people to use bundle install. I think it would be odd if Jekyll maintained a Node/Ruby/Python/Rust/GoLang/Haskell client, which all had roughly the same features but not quite. I agree with @ostera that the feature problem goes away if we auto-generate all the clients, and :+1: that would probably be the way to go... but it sounds like a lot of work for something that could be solved by having a program that can be apt-get installed.

It is true that an official client in C++ or Golang might limit the number of people who can contribute, but I assume we're not suggesting to add new clients for the main purpose of adding contributors, especially if these contributors will effectively be duplicating work in different languages.

I do like the fact that the current multitude of clients is a great breeding ground for innovation, features and ideas. We definitely shouldn't prevent that... Personally I would simply not make them official clients, and instead:

  • list them at the bottom of the README with a YMMV disclaimer
  • have a chat with the maintainers to see if any ideas should be brought back into the official client

Of course this is focussed on the desktop client. There will always be the need for Web/iOS/Android clients too, which could be 3 solutions, or maybe 1 solution repackaged with the many technologies available (Cordova etc).

In my personal opinion, the bash client has the lowest requirements and widest possible install base. I'd like to see it as the default command line client, but I also feel like it needs more work to correctly implement caching before it should be seriously considered as the CLI default.
Contrariwise, the mobile and web clients are also supremely important and handle use cases the bash client will never cover. So I don't really see the adoption of a single blessed client as a reasonable case either.

@ostera You say

> Which means we generate interpreted clients

How can you automatically generate a client in all those languages? Transpiling in particular just sounds messy. In addition, how would you manage to get a single codebase to act both as a webpage, for example, and a command line client at the same time? I can't see how it's possible. You'd need some platform-specific code at _least_, which already breaks your theory if I'm reading this correctly.

Sure, I think that having an 'official' client is a good idea - especially for new users.

In addition, what's stopping someone from just creating and maintaining their own client? Since tldr-pages is open source, it's bound to happen.

@rprieto I can see how having a client where you can apt install tldr is a good idea. Not everyone has apt though. e.g. alpine linux, opensuse, arch, centos, windows, mac, solaris, android, ios, windows phone, etc.

In my mind, I compare it to letsencrypt - it's got a single api and an official client, but there are lots of other clients for different situations too.

Yeah - I think that an official / unofficial system might work well.

I really don't get this repackaging business. Could someone explain how it would be possible?

@sethwoodworth Yeah, I rather like the bash client too :D It's got much saner defaults and is much faster than either the node.js or the python clients - I've had far fewer issues with it :D

Regarding package managers, I meant an official binary we can distribute via apt-get, apk (Alpine), homebrew etc... Assuming it won't be an official package we'll simply have to say (like many projects) "please run brew tap <...> && brew install <...>".

That would be for the CLI tool. I agree there are obviously other platforms, such as Web / Desktop / iOS / Android / Windows phone, which could be 5 native clients or even 1 "web" client repackaged in wrapper apps.

Just leaving two quick notes here:

1) the transpiling stuff sure sounds interesting, although I have no experience in that, so I don't have an opinion on whether that's reasonably feasible (or desirable wrt hand-written clients in the languages that natively target specific runtimes such as Python). Still, I'd like to mention the obvious Haxe, and this interactive graph of compilers and transpilers which may be helpful to inform discussion of this point in particular.

2) I've come to reconsider my stance on this -- I found @Leandros' arguments, in particular, to be rather compelling. I'm on the fence at the moment, and will need to carefully read this thread to adopt a position, which I'm not sure I'll have the time to do in the near future. That's to say that if the rest of the community (meaning those who've manifested interest in this by participating in this discussion thread) decides one way or the other before I'm able to make up my mind, I'd be fine with the outcome -- i.e., consider my vote neutral at this point.

  1. Transpiling can be rather unstable, from what I've seen / heard. We'd have to do some practical investigation.

Cross-referencing #91, "Separate tldr-pages into different git repo" (last time I tried to find it, I wasn't able to).

Relevant quote:

> This is now the "pages" only repo. The _Node.js_ client has moved to tldr-node-client.

I don't think the transpiling idea could work within a reasonable amount of work/time, and I don't think it would be stable: every language moves on, so it would take a huge amount of work to maintain.
A single client would be great if it worked perfectly, but I remember I had issues the first time I tried tldr, and I heard the same from a colleague to whom I showed my client. That's why I think the diversity of clients should be preserved, and more than just one official client should be supported. I had the possibility to just install another client from the same organization. I'm not sure I would have been willing to install a client from some other source, because I wouldn't have known whether that client would be updated or would react to changes by the official organisation, like when you change the markdown format of your pages or the location the pages are downloaded from.

I don't really think a monorepo across all clients could work. It would mean that this repo would have TONS of issues and pull requests. You would try to sort the issues and PRs via labels, but that gets confusing sooner than one might think.

There should perhaps be a repository (or a wiki) for general tldr stuff (not the pages) where things like the client specification are written, and the maintainers of the individual repositories should be responsible for bringing issues they receive into the general tldr repo/wiki if they concern all clients, e.g. new features.

@sethwoodworth How does the bash client not correctly do caching?? I'd welcome any issues you find.

What's the state of this? I'm pretty much against this and the issue seems very stale.

I'm pretty sure the ship has sailed on this subject. While generating client code for multiple platforms and runtimes from the same base spec would be awesome, I'm pretty sure we can agree it's unworkable for the tldr pages project as it currently exists, for several reasons:

  1. it would be quite complex to implement and maintain (indeed, maintenance would have to be done over a much wider surface area of potential bugs, security issues, setup configurations, etc.)
  2. it would likely not be able to cover all the platforms we currently serve via our distributed network of clients (from operating systems like linux and windows; to runtimes like node and python; to targets other than desktop binaries like android apps and websites; and so on);
  3. it would centralize development and publishing across a wide range of platforms, languages and repositories, to be managed by a small team of core maintainers.

Beyond that, my subjective opinion is (as I stated earlier on this thread) that we gain more by having a rich, loosely-coupled ecosystem of multiple clients that both allows developers to experiment with new ways to present and interact with the tldr pages content, taking advantage of the specific strong suits of each platform or language (or their own skills and experience), and simultaneously allows users to pick and choose the clients _they_ feel do the job best for them.

We have always received issues and bug reports in the main repo that are actually related to specific clients, but they were never overwhelming, and we've always been able to direct people to the appropriate issue trackers, so there isn't even a sizeable load on the maintainers with the current system.

For all the above, my proposal would be to stick with the current system of multiple clients managed in a decentralized way.

Great way of putting it, @waldyrious! I agree.

I agree with @waldyrious.

Totally understand @waldyrious' points 😄 so I'll close this issue 🙌
