Gitea: Gitea is slow when external avatar services are down

Created on 12 Feb 2019  路  17Comments  路  Source: go-gitea/gitea

  • Gitea version (or commit ref): 1.7
  • Git version:
  • Operating system: RedHat 7
  • Database (use [x]):

    • [ ] PostgreSQL

    • [ ] MySQL

    • [ ] MSSQL

    • [X] SQLite

  • Can you reproduce the bug at https://try.gitea.io:

    • [ ] Yes (provide example URL)

    • [ ] No

    • [X] Not relevant

  • Log gist:

Console log:
[Macaron] 2019-02-12 14:47:11: Started GET / for 10.10.10.1
[Macaron] 2019-02-12 14:48:11: Completed GET / 200 OK in 1m0.064527745s

app.log -file:
404 2019/02/12 14:47:11 [D] Session ID: 541269a9dc47baf3
405 2019/02/12 14:47:11 [D] CSRF Token: -IiHK1CVOgkwyuyXL13P1Gri6Xc6MTU0OTk3NTQ5NzM3NzY5MDYxNg==
406 2019/02/12 14:47:11 [D] Template: user/dashboard/dashboard
407 2019/02/12 14:47:31 [...modules/base/tool.go:216 SizedAvatarLink()] [E] LibravatarService.FromEmail(email=x@localhost): error 407 lookup _avatars-sec._tcp.localhost on 4.2.2.2:53: read udp 10.0.2.15:45383->4.2.2.2:53: i/o timeout
408 2019/02/12 14:47:51 [...modules/base/tool.go:216 SizedAvatarLink()] [E] LibravatarService.FromEmail(email=x@localhost): error 408 lookup _avatars-sec._tcp.localhost on 4.2.2.2:53: read udp 10.0.2.15:58838->4.2.2.2:53: i/o timeout
409 2019/02/12 14:48:11 [...modules/base/tool.go:216 SizedAvatarLink()] [E] LibravatarService.FromEmail(email=x@localhost): error 409 lookup _avatars-sec._tcp.localhost on 4.2.2.2:53: read udp 10.0.2.15:42802->4.2.2.2:53: i/o timeout

Description

When enabling external avatars, when those sites are down or doesn't reply in a timely fashion, I think Gitea isn't behaving in a nice way...
As seen from the logs above... Browser makes a request, then exactly 1m after it actually starts loading the page and resources aso... 1 minute per page load, because an external avatar request timed out.

I think this should be a lot shorter cut off time than one minute, or be configurable.

kinenhancement revieweconfirmed

Most helpful comment

As @strk is a member of Gitea maintainer team, sending patches upstream would likely be possible.

We could use github.com/miekg/dns which can leverage DNS timeouts, and also possible cache timeouts (maybe for 1 hour, or something smaller than caching the positive responses).

All 17 comments

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.

You can change the config to disable that external avatar service.

@lunny Yes I know that. That is not the issue.

The issue is that when you want to use them, but the service is down the gitea pages stop loading, waiting for services that are down for a full minute.
Better behavior is that the pages should load anyway (since avatars, although nice to have isn't crucial for gitea usage).

Agreed. Why don't we provide a per user avatar link which on request redirects to the looked up avatar instead of providing the avatar link directly.

@zeripath I think that would have the same problem, that each page would request those links trying to display an avatar, which would then need the real link.

I think there are probably multiple problems related to avatars from what I've heard, but after going down a hole I think this is something particular (sorry for long comments):

The error logs above are specific to using federated avatars and looking up the SRV record for the domain name in the users email address, which comes from the go-libavatar library we use here:

strk.kbt.io/projects/go/libravatar

https://gitlab.com/strk/go-libravatar/blob/master/libravatar.go#L191

This exmple seems to have ENABLE_FEDERATED_AVATAR = true and the user email is set to x@localhost

So the library above is doing an SRV query to localhost based on that email address.

The go dns resolver will look in /etc/resolv.conf on a unix machine by default usually has something likeoptions timeout:2 attempts:5

So timeout after 2 seconds and try 5 times. That is per nameserver. So if somebody were using two name servers (say 8.8.8.8, 8.8.4.4) it would do that SRV query 5 times for each nameserver, timing out after 2 seconds. That would be 20 seconds total for each avatar -- as seen in the logs above. If the person above had say, 3 avatars to look up it would time out after exactly 1 minute (20 * 3).

Obviously this problem can be exponentially worse if you are trying to look up 5, 10, 20 of these. Or hard to diagnose if you use federated avatars and of 10 of them, one is a bad email address that adds an extra 20 seconds load.

So there are a couple of things to think about:

1: Don't try to load more than one of the same avatar if the first one has failed. Perhaps this can happen by looking up what needs to be loaded and doing it all at once, failing if the first one times out.

2: Don't rely on even one DNS lookup that can take 20 seconds to fail. This would probably mean changes to the referenced library at strk.kbt.io/projects/go/libravatar

3: Just store a copy of federated/gravatar avatar locally and refresh that copy every x amount of time as a background task. Do the original lookup when an account is set up so it doesn't block normal page loads and then refresh at an interval.

4: Maybe as a temporary fix for some examples like this one, don't lookup an avatar for localhost or other probably not real domain names

I think a good future solution is to do lookup of the avatar and save that image locally, refreshing either on a time basis, or perhaps if somebody visits their profile page as well. Even if the timeouts are made better, I don't think each page load should rely on links and/or connections to outside sites that may or may not be working at the moment. This would fix the "gitea is slow because of another server" problem and also avoid the potential "sometimes my pages loose all avatars" problem too. I believe this is something that popular services that make use of Gravatar do (Github, bigger WordPress sites, etc...)

Alternatively, if the above change isn't likely to happen perhaps dropping support for federated avatars (not gravatar or similar services). It can't ever be reasonable to run a dozen SRV queries when trying to load a page, even with a better timeouts.

As @strk is a member of Gitea maintainer team, sending patches upstream would likely be possible.

We could use github.com/miekg/dns which can leverage DNS timeouts, and also possible cache timeouts (maybe for 1 hour, or something smaller than caching the positive responses).

As @mrsdizzie 's advice, we could cache all external avatars and refresh them every day.

I think we could/should do both of those: Make the fetching of the avatars better but also cache them on disk for displaying; Ideally Gitea shouldn't rely on connections to any third party servers for content to display on page views if possible.

That was really what I wanted to say, as pointed out above. I wrote about avatars because that's how I noticed it, but my problem is really that gitea should function as it should, even if all/any external links/sites are down.

And, IMHO, external services add value (I love avatars) so it would really be unfortunate if it was removed... Just treat it as a best effort, if it works it works but don't let core functionality depend on them.

I'm very happy to accept a patch for go-libravatar to tweak timeouts/TTLs.

Lookup could be maybe done by some background process to reduce the pain.

As per local cache, that one could be implemented in Gitea itself, no need to be in the library (for example we might want Gitea to include in that cache avatars that were obtained in a different way)

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.

@strk any progress? I think maybe you can move go-libravatar from gitlab to gitea instance if you want some help on that.

There's already a mirror on gitlab: https://gitlab.com/strk/go-libravatar/

And yes, help is welcome (although I still think cache should be done in Gitea itself)

I've a plan: gitea always passes to the browser a url to self (/user/x/avatar/size) and the server takes care of the redirect. How does that sound ?

Please give #8245 a try

Can this be closed now that #8245 was merged ? Or do we want to wait for the per-commit avatar link first ? (non-registered users don't have a redirect endpoint at the moment, to avoid showing the contributors email to clients, although I'm not sure what's the point of that, given email is visible in git history anyway)

Was this page helpful?
0 / 5 - 0 ratings

Related issues

jorise7 picture jorise7  路  3Comments

internalfx picture internalfx  路  3Comments

lunny picture lunny  路  3Comments

jonasfranz picture jonasfranz  路  3Comments

thehowl picture thehowl  路  3Comments