Dvc.org: ci: test to check all links

Created on 27 Sep 2019  路  26Comments  路  Source: iterative/dvc.org

We have internal and external links, mainly in our MD files, which tend to change regularly and its easy to break them accidentally for different reasons. We could probably automate this check?

Per https://github.com/iterative/dvc.org/pull/637#pullrequestreview-293487957

doc-content doc-engine

Most helpful comment

@shcheklein sure I will look into it 馃憣馃徏

All 26 comments

p.s per https://github.com/iterative/dvc.org/pull/637#discussion_r328993919 (and biopython/biopython.github.io/issues/62)

There is also this: https://wummel.github.io/linkchecker/ But it seems to be a bit old and bloated.

If this issue is still interesting, I may have a look at it and try to build a script.

I did not understand, what was the problem with the other script, that was deleted?

yep it's definitely useful if it's made part of the CI flow (CircleCI)

The script is submitted here: https://github.com/iterative/dvc.org/pull/690

It is not completely robust and reliable, sometimes it may give false-positives, so maybe not suitable for automatic checking and CI. But it may still be useful for manual checking once in a while.

Thanks. I'm a little behind on my own work but will check it out ASAP.

@jorgeorpinel @dashohoxha left my comments in the PR. I think it's a good first step, but does not resolve the issue. My major concern, no one will be using it (and someone will just do a PR do delete it like happened already) if it's not integrated into CI system.

My major concern, no one will be using it (and someone will just do a PR do delete it like happened already)

I was actually using the deleted script each time before committing, and it always would find something to fix. But if it was only me that was using it, no problem, I can use a local copy of the script.

I can accept and merge this, but I feel it won't be used/useful w/o introducing a proper CI setup.

I think it's a good first step, but does not resolve the issue.

If it is not useful then there is no point in merging it.

Although I tried it manually and it reported a lot of issues (most of them actually are redirection problems, which are not really breaking any thing). It might be useful for manual checking, especially when we restructure the hierarchy of the docs (which we are planning to do).
But again, I can have a local copy and use it for myself.

I was actually using the deleted script each time before committing, and it always would find something to fix. But if it was only me that was using it, no problem, I can use a local copy of the script.

why not pre-commit hook? or the yarn command we have now?

If it is not useful then there is no point in merging it.

I'm not saying that it's not useful. I'm saying that there are extremely high chances that everyone will forget about it later if it's not part of the CI system and even not documented anywhere. It seems to me that it's not that big of a problem to make part of the CI system and run it on changed files? @algomaster99 could you take a look and help @dashohoxha with this?

why not pre-commit hook?

Does this mean that I don't need to check and fix them because the git hooks will do it automatically?

or the yarn command we have now?

Which yarn command? The one mentioned here: https://dvc.org/doc/user-guide/contributing-docs#doc-style-guidelines-and-tips-for-java-script-and-markdown?
If I have to go to the docs and copy/paste a command, I usually make a small script for it.
By the way, the double star globing syntax mentioned there (**) as far as I know does not work by default, needs to be activated/enabled.

@algomaster99 could you take a look and help @dashohoxha with this?

I would prefer if @algomaster99 takes this script and makes it part of the CI system, because I don't know anything about it.

@dashohoxha

Does this mean that I don't need to check and fix them because the git hooks will do it automatically?

Yep. That's the whole point - it takes care about everything automatically before you do git commit.

Which yarn command? The one mentioned here: https://dvc.org/doc/user-guide/contributing-docs#doc-style-guidelines-and-tips-for-java-script-and-markdown?
If I have to go to the docs and copy/paste a command, I usually make a small script for it.
By the way, the double star globing syntax mentioned there (**) as far as I know does not work by default, needs to be activated/enabled.

I think, yes. @jorgeorpinel can address this question better I believe and concern with **.

makes it part of the CI system, because I don't know anything about it.

sure, np. Please, address some point in the PR and I'll merge it, will keep this ticket open. We can update the description of the ticket to mention the script.

@shcheklein sure I will look into it 馃憣馃徏

Sorry guys, just getting back to this. I think most has been said so let's try to integrate it to CI.

Making scripts for personal use is totally fine of course, but I agree they don't need to be committed if they're not something everyone should be using and integrated to git hooks and/or CI system, or at least documented somewhere as a recommendation (e.g. shell autocompletion scripts in the core repo).
Tip: Personal scripts or other files can also be "gitignored" in the .git/info/exclude file (not to change .gitingore).


Off topic discussion about Prettier command

the double star globing syntax mentioned there (**) as far as I know does not work by default.

It seems to be part of the regular Prettier syntax as it's used in several usage examples here: https://prettier.io/docs/en/cli.html (The example yarn prettier ... command in the docs contrib. guide did have an error that Ivan already fixed, though.)

@algomaster99 will you reuse PR #690 (check-links branch)?

@jorgeorpinel I can use it. Its logic seems correct. But first we need to fix the engine bug.

Not necessarily. Please see comments in the PR. Thanks

a good option is to use some external tools - like Hexometer. Any other tools available that will keep monitoring the website?

use some external tools - like Hexometer

Good idea. Here's the report:

image

See https://hexometer.com/broken-links/dvc.org

The 6 bad links are:

404 https://dvc.org/assets/1ce1117a69b7e41551e0.js
404 https://dvc.org/assets/c1f03f6aff19cf266795.js
404 https://dvc.org/assets/575df23d1a8ff04b8d54.js
404 https://dvc.org/assets/0.4d6f2247fe215b4f7184.css
404 https://dvc.org/assets/07dca80a102d4149e9736d4b162cff6f.ico

Which are not from docs or md files. Probably related to next build? Idk why they would be broken though.

See also https://hexometer.com/tools/dvc.org for a full report of the website. Mostly good results, it seems.

Any other tools available that will keep monitoring the website?

Well, Hexometer costs for checking continuously and it would only send us reports. I don't think we can integrate in the CI pipeline.

I only know SAM (includes Web Link monitor) which is expensive. But probably most APM tools include this?

I think I'm fine with that tools sending a weekly report or something as a solution.

OK, we can close this then? Just curious about:

The 6 bad links are:

404 https://dvc.org/assets/1ce1117a69b7e41551e0.js
404 https://dvc.org/assets/c1f03f6aff19cf266795.js
404 https://dvc.org/assets/575df23d1a8ff04b8d54.js
404 https://dvc.org/assets/0.4d6f2247fe215b4f7184.css
404 https://dvc.org/assets/07dca80a102d4149e9736d4b162cff6f.ico

Which are not from docs or md files. Probably related to next build?

Cc @iAdramelk

yep, let's close for now (one ticket less 馃槄 ) .... those come from the dvc.org/chat - looks like it's bug in Discord integration?

I can be assigned here - could add some fixes on top of #690. I'd add both pre-commit diff as well as full CI.

@casperdcl just curious, don't you think it can be a bit overkill to support this? Hexometer detects changes once a week I suppose - more or less enough?

Not sure it's overkill since I think it'll be quick to implement. Hexometer has false positives and doesn't provide instant feedback on a PR...

@casperdcl kk, let's try this!

Kewl

Was this page helpful?
0 / 5 - 0 ratings