Is your feature request related to a problem? Please describe.
Some servers use custom models excessively and often end up with several gigabytes of initial download. For people with slower internet connections, or who simply don't want to download that much, this is a problem.
Describe the solution you'd like
I suggest adding an accurate total download size field to the server browser (as already suggested in PR #154).
This would make download sizes more transparent, and it might also put some pressure on server owners not to overuse custom models when they aren't necessary.
Additional context
Wouldn't the server browser slow down if MTA were downloading the file list and checking what is cached for each server in the browser? Would it be possible to perform an asynchronous check only for the servers that are visible on the screen?
The way you described it, yes, it would slow down.
But if the initial download size were sent to the server list server with the query request, there wouldn't be any performance issue.
You can't account for cached files by just having the download size.
Then what about this: use the method you described, but instead of doing it for every server, open a panel with the info when you click on a server name.
Nobody uses the info panel and I did not suggest doing it for every server. I suggested doing it for the servers visible on the screen.
How about sending the download file list to the server list?
That's basically what I said in my first post.
Yeah, but what I said is different: it doesn't need an asynchronous check.
By asynchronous, I meant checking the download size while scrolling down the list instead of checking all the servers at once.
It's not as if a server's file size is going to change every second, so a single size report on server start, sent with the query request, should be sufficient: a static value like the server name. ~ @Deihim007
I believe @Dezash is talking about client-side latency in looking up + hashing all the client resources. As long as we store the size of each resource, there is no problem in updating the file size when a resource is updated.
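To illustrate the point about storing the size of each resource, here is a minimal sketch of how a server could keep its advertised total up to date without rescanning everything. The class and method names are hypothetical and are not part of MTA.

```python
class ResourceSizeTracker:
    """Keeps a running total of the download size a server advertises."""

    def __init__(self):
        self._sizes = {}      # resource name -> size in bytes
        self.total_bytes = 0  # value that would be reported to the server list

    def set_resource_size(self, name, size_bytes):
        # Swap the old size (0 if the resource is new) for the new one.
        self.total_bytes += size_bytes - self._sizes.get(name, 0)
        self._sizes[name] = size_bytes

    def remove_resource(self, name):
        # Removing an unknown resource leaves the total unchanged.
        self.total_bytes -= self._sizes.pop(name, 0)
```

Updating one resource only touches that resource's entry, so the reported value can stay as cheap to serve as a static field.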
A potential solution to the client slowdown:
Problems with the above solution:
`last_update = max(map(get_last_update, filepaths))`, but then we still need to do disk (metadata) reads for every single file in every single resource.
Notes:
- if client resource folder was updated _after_ the resource download time (stored in cache), we invalidate that resource and re-hash each file (and store it in the cache)
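A rough sketch of this download-time heuristic, built around the `max(map(get_last_update, filepaths))` expression quoted above. It assumes the cache stores the download time as seconds since the epoch; `resource_last_update` and `needs_rehash` are hypothetical helper names, and the sketch still pays the per-file metadata read mentioned as a problem.

```python
import os


def get_last_update(path):
    """Modification time of one file, in seconds since the epoch."""
    return os.stat(path).st_mtime


def resource_last_update(resource_dir):
    """Newest modification time among all files in a resource folder."""
    filepaths = [
        os.path.join(root, name)
        for root, _dirs, names in os.walk(resource_dir)
        for name in names
    ]
    # default=0.0 covers an empty resource folder.
    return max(map(get_last_update, filepaths), default=0.0)


def needs_rehash(resource_dir, cached_download_time):
    """True if any file changed after the download time stored in the cache."""
    return resource_last_update(resource_dir) > cached_download_time
```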
What if the server updated the resource? I think there should be an expiration time of those hashes.
when we query for a server's size:
- the filenames+hashes+filesizes of all resources are received
- if the server filename/hash DOES NOT MATCH the cache filename/hash, we know that the file will be redownloaded when you join the server, so we increase the estimated download size by the server entry's filesize
- if the server filename/hash DOES MATCH the same details in the cache, the resource will not download
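The comparison in the list above can be sketched as follows. The tuple layout of the server's file list and the shape of the local cache are assumptions made for illustration; the actual data sent by a server may look different.

```python
def estimate_download_size(server_files, cached_hashes):
    """Sum the sizes of server files whose hash does not match the cache.

    server_files: iterable of (filename, file_hash, size_bytes) tuples.
    cached_hashes: dict mapping filename -> hash of the locally cached copy.
    """
    total = 0
    for filename, file_hash, size_bytes in server_files:
        if cached_hashes.get(filename) != file_hash:
            # Missing or outdated locally: it will be downloaded on join.
            total += size_bytes
    return total
```

A file that is absent from the cache is treated the same as a file with a mismatched hash, since both would be downloaded.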
When/how do you suggest querying for server size though? If the client were to query around 5000 servers on every startup, downloading their filenames, hashes and filesizes and then checking each hash, it could take quite a bit of time.
The server always sends over the hashes for updated resources. If the user's client resources are out of date, their hashes will not match the server's hashes. This means that the mismatched file needs to be redownloaded and therefore the size added to the client's perceived server download size.
An expiration time for hashes would not be necessary as they would always be up to date (because of the download time heuristic).
Generally I feel that expiring resources might be a good idea, though. i.e. resources that servers have not used in a long time should be deleted. This specifically can be discussed in a separate issue.
Also, servers sending the resource structure (filename) across make it possible for us to provide a "delete resources" button after disconnecting from a server. Again to be discussed in a separate issue (there are a few usability issues I can think of).
The "filename data" (filenames, hashes, sizes) is sent alongside the other metadata. Indeed, this might end up being a lot of information: we'd have to calculate the average number of files in started resources per server and determine whether or not it would be a significant increase.
In the (very likely) case that it is too expensive to send this information over at once, we can still:
I'm not sure if hash comparison will take a while, but I may be underestimating it. We also don't have to be perfectly accurate, so we can apply some other heuristics or cheat a little.
One way of cheating at hash comparison would be to ignore filenames and just use Bloom filters (on the resource level). See "yourbasic.org Bloom filters explained" (or this other unverified non-Go related tutorial).
Actually, for Bloom filters, we don't even need to ignore filenames: we can test for membership of concatenated strings like so: `hash + "-" + filename`.
The Google Chrome web browser used to use a Bloom filter to identify malicious URLs. Any URL was first checked against a local Bloom filter, and only if the Bloom filter returned a positive result was a full check of the URL performed (and the user warned, if that too returned a positive result).
We only want an estimate, so we would not need to perform a full check. We would have a Bloom filter for each resource the client has already downloaded, and (I assume) files are usually spread thin across many resources, so there's probably a low probability of failure as well (someone should verify this).
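To make the idea concrete, here is a small Bloom filter sketch using the `hash + "-" + filename` key described above. The filter size, the number of hash functions and the use of salted SHA-256 are arbitrary choices for illustration, not a tuned design, and all names are hypothetical.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # May return a false positive, never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def file_key(file_hash, filename):
    """Membership key combining the file hash and the filename."""
    return file_hash + "-" + filename


def estimate_size_with_bloom(server_files, cached_filter):
    """Sum sizes of (filename, hash, size) entries the filter does not know."""
    total = 0
    for filename, file_hash, size_bytes in server_files:
        if file_key(file_hash, filename) not in cached_filter:
            total += size_bytes
    return total
```

Because a Bloom filter can only err with false positives, the estimate can only be too low (a file wrongly assumed cached), never too high.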
Also, further reading about why Chrome no longer uses Bloom filters:
PrefixSet as an alternate to BloomFilter for safe-browsing.
The safe-browsing prefix data is uniformly distributed across the 32-bit integer space. When sorted, the average delta between items is about 8,000, which can be encoded in a 16-bit integer. PrefixSet takes advantage of this to compress the prefixes into a structure which is relatively efficient to query.
The sorting issue could probably be solved without the need of two columns. The values could be initialized using the estimated download size and asynchronously updated with the accurate download size (prioritizing the servers that are visible on the screen).
That's an even better idea 💖
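One way to picture the "estimate first, refine asynchronously" idea is the sketch below. It assumes each row starts with the estimated size and that some coroutine (here the hypothetical `fetch_accurate_size`) can compute the accurate value for one server. Servers visible on screen are refreshed first.

```python
import asyncio


async def refresh_sizes(sizes, visible_ids, fetch_accurate_size):
    """Replace estimated sizes with accurate ones, visible servers first.

    sizes: dict mapping server id -> estimated size, updated in place.
    visible_ids: set of server ids currently shown on screen.
    fetch_accurate_size: coroutine function taking a server id.
    """
    # False sorts before True, so visible servers come first; sorted() is
    # stable, so the original list order is kept within each group.
    order = sorted(sizes, key=lambda sid: sid not in visible_ids)
    for sid in order:
        sizes[sid] = await fetch_accurate_size(sid)
```

Since the same column is overwritten in place, sorting by download size keeps working while the accurate values arrive.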
We might go through more than two prototypes (the first one being #154) which we don't have enough room for on this issue, so I haven't marked this as "Likely Accept", and have marked it as "Accepted" instead.