Docs: Consider running an automated PR to compress all images

Created on 28 Aug 2020 · 12 Comments · Source: dotnet/docs

Docs readers with slow internet connections may get a better experience if images are compressed.

Related to https://github.com/dotnet/docs/pull/19737#issuecomment-682643419

cc: @IEvangelist

Information Architecture Pri3 discussion doc-enhancement

All 12 comments

I was going to suggest this. I use ImgBot on my own sites and it works great.

I don't know if it's really worth it. Doing some quick PowerShell calculations on the png/jpg files in the repo, I only see ~20 files bigger than 500k and ~130 files over 200k, and the average file size is ~58k.
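For readers who want to reproduce this kind of measurement, here is a minimal sketch in Python rather than PowerShell. It is not the script used above; the function name `image_size_stats` and the exact byte thresholds are assumptions chosen to mirror the 200k and 500k figures in this comment.

```python
from pathlib import Path

# Extensions counted as images; the comment above only mentions png/jpg.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def image_size_stats(root, thresholds=(200_000, 500_000)):
    """Return (count, average_bytes, {threshold: number_of_files_over_it})."""
    # Collect the size in bytes of every image file under root, recursively.
    sizes = [
        p.stat().st_size
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    ]
    # Guard against an empty repo so we never divide by zero.
    average = sum(sizes) / len(sizes) if sizes else 0.0
    # For each threshold, count the files strictly larger than it.
    over = {t: sum(1 for s in sizes if s > t) for t in thresholds}
    return len(sizes), average, over
```

Calling `image_size_stats(".")` from the root of a local clone would return the total image count, the average size in bytes, and the number of files above each threshold.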

@adegeo We could find a way to apply the automation only to files bigger than 200k, to reduce the number of files changed. Would that be better?

We already have image compression capabilities built into our VS Code tooling, see image compression. We have also explored ImgBot; in fact, I was driving this effort months ago. Ideally, we'd compress images before they are committed to git - otherwise the change history continues to grow even if the images are compressed after the fact. Does that make sense?
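As an illustration of catching large images before they are committed, the sketch below flags image files over a size limit. It is not part of the VS Code tooling or ImgBot; the helper name `oversized_images` and the 200k default limit are assumptions borrowed from the numbers discussed in this thread.

```python
import os

# Extensions treated as images; adjust to match the repo's content.
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg")


def oversized_images(paths, limit_bytes=200_000):
    """Return (path, size) pairs for image files larger than limit_bytes, largest first."""
    found = []
    for path in paths:
        # Skip non-images and paths that no longer exist (e.g. deleted files).
        if path.lower().endswith(IMAGE_SUFFIXES) and os.path.isfile(path):
            size = os.path.getsize(path)
            if size > limit_bytes:
                found.append((path, size))
    # Largest files first, so the worst offenders are reported at the top.
    return sorted(found, key=lambda item: item[1], reverse=True)
```

One way to use it would be in a pre-commit hook: pass it the paths printed by `git diff --cached --name-only` and abort the commit if the returned list is not empty, asking the contributor to compress those files first.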

That's true but those are wholly different concerns. ImgBot saves bandwidth and improves the experience for many thousands (millions?) of visitors to the public site; keeping the git repo small only matters to the hundred(s) of developers working with the source. Both are good, and may be used together to provide "defense in depth" for when things slip through (for instance, when 3rd party open source developers submit PRs but don't have your tooling installed).

@IEvangelist The VS Code extension looks great!
Good point about the change history size too; I wasn't considering that. So yes, I agree it's better if the image is compressed in #19737 before it's merged.

But what about the existing large images?

I'd rather not have an automated PR for all images because, as you said, it makes the git history grow. Hence, I'm closing my ImgBot PR.

But I think it could be worth having a smaller PR that compresses either the 20 images larger than 500k or the 130 larger than 200k (my preference is the 130 images).

@ardalis I think we should balance between both.

I've had problems before when cloning very large repos over a poor 5MB/sec internet connection (which is really common in many countries) :smile:

Visitors to the public site certainly should be a higher priority, but I think they will mostly be affected by the extremely large images, which are probably the 130 images over 200k?

We agree - let's do both. :)

Another problem is that compressing images causes our localization teams to pick up those changes and manually update/localize every single image that has text on it, which I'm told is very time consuming and costly. I know that the issue is being further investigated higher up by the platform team. So the post-processing stuff is going to have to wait.

One idea that I floated to the team was to empower the content developers to automatically create the machine-translated text overlays for images as part of our common workflow. They were discussing using ML.NET or Cognitive Services to perform the translations and automatically overlay the corresponding translated text. Again, that is something that would ideally be done before a merge. Also, in the Azure docs repo this is a much more serious problem. They have over 50k images! 😮

In that case, would 130 images be a large amount of work for the localization teams?

It would; that is why there is hesitation around mass updates. We're still looking into that. The main point is to try to keep images compressed before they are merged.

Great point, @IEvangelist. I forgot about loc translation.