Tldr: question: automated example runner testing framework

Created on 2 Jul 2017 · 9 comments · Source: tldr-pages/tldr

I was just wondering if any thought has been given to running the examples for commands in a Docker container or Travis build, and having the results be part of the PR checks.

Pros:

  • The ability to validate the description.
  • Checks the command syntax.
  • Automated notification if the command syntax ever changes.

Cons:

  • High barrier for entry into contributing.
  • Longer build times.
  • Added architectural/framework complexity.

I don't think there is anything wrong with the existing project model, but I thought such a tool might make things easier on a PR reviewer (by offloading some of the work to the contributor).
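To make the idea a bit more concrete, here's a rough sketch of the kind of check I have in mind. The image, package, and example command below are just placeholders, not a proposal for specific tooling:

```bash
# Run a page's example inside a throwaway container, so nothing gets
# installed on the CI host. Everything here (image, package, command)
# is a placeholder for illustration only.
docker run --rm ubuntu:16.04 bash -c "
  apt-get update -qq &&
  apt-get install -y -qq jq > /dev/null &&
  echo '{}' | jq '.'    # an example taken from the page under test
"
```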

tooling

All 9 comments

This is very close to impossible IMO.

The project contains non-builtin commands too, which means one would have to install _all_ of the commands before even trying to test them. And the installation steps are not specified in the pages.

Take rustc, for example. Do you plan to download the Rust compiler, create a dummy Rust program on the fly and then execute it? Same for Go?

How would you validate mongo or psql? Are you going to connect to actual mongo or postgres instances and check whether the commands run or not? You would have to create additional users and give them passwords too.

How about youtube-dl? Will you actually download a YouTube video and test it?

You get my point. There are hundreds of things like these in the repo. And I believe it's not possible to test all of them. And doing only a subset of them doesn't make sense IMO.

Yeah, I understand where you are coming from.

The project contains non-builtin commands too, which means one would have to install all of the commands before even trying to test them.

In the workflow for an _add page_ PR, I feel like the contributor typically has the command installed for the documentation they wish to add (no evidence to back this claim). I was more envisioning that you only run tests on the command you are updating/adding, similar to how some build tools only run the tests affected by a change set.
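Something like this rough sketch is what I'm picturing; the tests/ layout and the branch handling are made up for illustration, and nothing like this exists in the repo:

```bash
#!/usr/bin/env bash
# Run a test script only for the pages touched by this PR.
# The tests/<platform>/<command>.sh layout is hypothetical.
set -euo pipefail

target="origin/${TRAVIS_BRANCH:-master}"   # branch the PR is merging into

for page in $(git diff --name-only "$target"... -- 'pages/*/*.md'); do
  platform=$(basename "$(dirname "$page")")
  cmd=$(basename "$page" .md)
  test_script="tests/${platform}/${cmd}.sh"

  if [ -x "$test_script" ]; then
    echo "Testing ${cmd} (${platform})"
    "$test_script"
  else
    echo "No test script for ${cmd}, skipping"
  fi
done
```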

There are hundreds of things like these in the repo. And I believe it's not possible to test all of them. And doing only a subset of them doesn't make sense IMO.

I totally agree with you that some commands are much harder to test, and therefore probably not worth it. And while I disagree that having tests on a subset of commands doesn't make sense, at the end of the day, if those maintaining the project don't find it helpful, there's no point in adding something like this IMO.

Thanks for sharing your perspective with me. :)

No worries, glad to share my view. I understand your wanting to run tests only on the command added/changed, but that runs into the same problem if the command is not a built-in. Even if it were a built-in, you would still have to test whether it actually works or not.

at the end of the day, if those maintaining the project don't find it helpful, there's no point in adding something like this IMO.

Not at all. Disagreements are bound to happen in any collaborative project. But that shouldn't stop development work.

I hope this does not discourage you from sending PRs for more pages! Thanks again for sharing your thoughts.

I actually think it would be really neat if we could have a minimal example for each command page that would be tested whenever that page is edited (or we could go crazy and test all commands on every commit). For example, for the youtube-dl command, we'd have, say, a tests/common/youtube-dl.sh file that Travis would run, which would install the command and test every example.
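As a very rough sketch (nothing like this exists in the repo yet, and the installation step is just one possibility), such a script could look like:

```bash
#!/usr/bin/env bash
# Hypothetical tests/common/youtube-dl.sh
set -euo pipefail

# Install the command under test inside the CI environment.
pip install --user youtube-dl
export PATH="$HOME/.local/bin:$PATH"

# Cheap sanity checks that don't touch the network.
youtube-dl --version
youtube-dl --help > /dev/null

# Checks exercising the actual examples from the page would go here.
```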

This wouldn't be mandatory for contributing to the project (except perhaps when editing an existing page which already has a test script), and the unavoidable incompleteness would be handled with a coverage metric; so we'd have a badge in the README for the status of running each command, and a badge for the test coverage of existing commands.
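The coverage number itself could be computed with something as simple as this (again assuming the hypothetical tests/ layout sketched above):

```bash
# Fraction of common pages that have a matching test script.
total=$(ls pages/common/*.md | wc -l)
covered=$(ls tests/common/*.sh 2>/dev/null | wc -l)
echo "Test coverage: ${covered}/${total} common pages"
```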

Of course, implementing this requires work that the current maintenance team can't afford to commit to, but anyone (wink wink, @deekim) is welcome to hack on this and submit a WIP pull request (or link to a simplified demo) so we could eventually integrate the idea into the project.

Well, anything that makes a request outside the machine is hairy to test. I am comfortable with testing commands like date, find, grep, etc.

But if you want to properly test youtube-dl, you will have to actually download a video and check that it has indeed been downloaded by confirming that the hashes match. And what if the video is taken down? Are we going to maintain our own video?

Yeah, I was thinking that the test script for the youtube-dl case would indeed download a video and confirm that the command worked as expected, using whatever checks make sense in that case.
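Roughly along these lines; the URL and the expected hash below are placeholders (we'd need a stable video we control, which is exactly the maintenance question you raise):

```bash
#!/usr/bin/env bash
set -euo pipefail

# Placeholders: a stable video we control and its known checksum.
url="https://example.com/some-stable-test-video"
expected="<known sha256 of the test video>"

youtube-dl --output test-video.mp4 "$url"

actual=$(sha256sum test-video.mp4 | cut -d ' ' -f 1)
if [ "$actual" != "$expected" ]; then
  echo "Downloaded file does not match the expected hash" >&2
  exit 1
fi
```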

I don't think it's that hard to cater to each command's specific behavior for testing, since the automation of the tests would only be in their execution, not in their writing. That is, I'm not envisioning a way to automatically test a new command given its examples alone; I was thinking more of gradually building a test harness from human-provided contributions, and this test set would then be run automatically on each commit that touches the corresponding page (or on every commit, if that's viable).

In that way, tests can easily be changed if they start failing, for instance by replacing the URL for a resource that the test uses.

I see. IMHO, it's not worth doing. I can see this quickly growing out of bounds and becoming a maintenance nightmare. But that's just me. Would love to hear other opinions. /cc @sbrl.

Again, just to be clear, I did mention that I don't think we, the current maintainers, should focus on this at the moment, since we have several meaty items on our todo list that take priority. But I'd be happy to incorporate such a system if someone offers an implementation.

I don't think it would be too big of a maintenance burden once in place: we can always drop individual tests that stop working or become otherwise problematic (e.g. taking too long to complete, failing often for external reasons, etc.)

@agnivade and @waldyrious, thanks for your input :)
I completely understand that there are a lot of other, meatier items for the current maintainers to work on. I will be vacationing for a large part of September, but I'll try to whip up a prototype towards the end of the month if you'd be interested in reviewing it.
