NuGetGallery: Package statistics not updating properly

Created on 2 Apr 2020 · 11 Comments · Source: NuGet/NuGetGallery

This is an issue which has been going on for a while, but every time it's fixed it seems to occur again, slightly differently 😊

In the past the download stats were frozen, and when navigating to "View full stats" you could see that the date of the last update was old. Now the stats are not updating, but the full stats page says they have been updated. For example, it's been a while since the download stats increased for our main package:

https://www.nuget.org/packages/Piranha

However, nuget.org claims it was updated yesterday.

Given that we have several years of user statistics showing that this never happens, it's highly unlikely that there would be zero downloads for a longer period. Also, when this has occurred in the past, several hundred downloads were suddenly added to the statistics, which gives the impression that the stats are recorded but not propagated properly at package level for the site to show.

Best regards

All 11 comments

More input. Today the download count increased by 132 from yesterday, which statistically is too high to be one day of downloads for our project; on the other hand, it seems too low to be the total for the days the statistics stood still. The fact that it increased too much for one day after standing still, even though the dates said the statistics were updated, somewhat validates the suspicion that stats are recorded somewhere but are not propagated properly through the system.

I've put up an Excel sheet where I'll store the values from every day the website says it has been updated, to see how it continues to behave.
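For anyone keeping a similar daily log, here is a minimal sketch of how the pattern could be checked automatically. It only assumes a list of (date, displayed total) rows copied by hand from the package page. The function names, the `typical_per_day` threshold, and every number except the 132 increase mentioned above are made up for illustration.

```python
from datetime import date


def daily_deltas(observations):
    """Turn (day, cumulative_total) rows into (day, increase) pairs."""
    rows = sorted(observations)
    return [
        (cur_day, cur_total - prev_total)
        for (_, prev_total), (cur_day, cur_total) in zip(rows, rows[1:])
    ]


def catch_up_spikes(deltas, typical_per_day):
    """Flag increases that follow a frozen run and exceed a typical day.

    Returns (day, increase, number_of_frozen_days_before_it) tuples.
    """
    flagged = []
    frozen_days = 0
    for day, increase in deltas:
        if increase == 0:
            frozen_days += 1
            continue
        if frozen_days > 0 and increase > typical_per_day:
            flagged.append((day, increase, frozen_days))
        frozen_days = 0
    return flagged


# Hypothetical log: dates and totals are invented, only the jump of 132
# mirrors the observation reported in this thread.
observations = [
    (date(2020, 3, 29), 10000),
    (date(2020, 3, 30), 10000),
    (date(2020, 3, 31), 10000),
    (date(2020, 4, 1), 10000),
    (date(2020, 4, 2), 10132),
    (date(2020, 4, 3), 10170),
]

deltas = daily_deltas(observations)
spikes = catch_up_spikes(deltas, typical_per_day=40)
for day, increase, frozen in spikes:
    print(f"{day}: +{increase} after {frozen} frozen day(s)")
```

A flagged day does not prove that counts were held back; it only marks the "frozen, then a jump" shape so it can be compared against other traffic sources.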

Best regards

Hi, @tidyui. Thank you for notifying us! We had an incident in our statistics pipeline over the past few days, and it is being mitigated now. We are really sorry for the inconvenience. Our current statistics pipeline is not very scalable, but we have scheduled work to refresh and improve it. I think we will provide a much better statistics pipeline soon.
Yeah, you are right. The download counts from the CDN side for the past several days are still flowing through our system gradually until the incident is fully mitigated. So an update to the download counts may include several days' counts.

So it sounds like this is a NuGet Statistics issue, and not Covid destroying the number of downloads?

I have the same issue with my project: https://www.nuget.org/packages/ETLBox/
The statistics are updated very randomly after ~7 to 10 days; in between they seem to be frozen (though there are for sure downloads). It's really hard to trust the numbers, as they get updated no more often than once a week with a high number, and then stand still again for quite some time. Looking at other tools and statistics (Google Analytics, GitHub) I can see a more or less steady amount of traffic.

@daryllabar @roadrunnerlenny It looks like there are surely some problems with the NuGet download count and sync. The number of downloads it shows after 4 days is not trustworthy anyway.

Hi, guys. Thank you for your patience here! We have already fed in all the download counts we missed during the past period, and the statistics data should now be as expected.
The reason the increase in download counts was not steady during the past period is that the statistics pipeline was designed and implemented several years ago and is not scalable enough to handle some very large log files, which slow down ingestion and lead to dead letters. We have to take action to feed these failed log files in again and let them go through the pipeline.
So that's why the download counts seemed frozen during the period and increased a lot later: we were feeding in the unprocessed logs, which contained multiple days' download counts.
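To make the "frozen, then a jump" behaviour concrete, here is a toy simulation of the mechanism described above. It is not the NuGet pipeline: the `LogFile` type, the size limit, the replay day, and all numbers are assumptions chosen only to show how oversized log files landing in a dead-letter list, then being replayed together, would produce that shape.

```python
from dataclasses import dataclass


@dataclass
class LogFile:
    day: int        # day the log file covers
    size_mb: int    # hypothetical file size
    downloads: int  # downloads recorded in the file


def ingest(files, max_size_mb):
    """Count downloads from files within the size limit.

    Files over the limit are not counted; they are returned as dead letters.
    """
    counted = 0
    dead_letters = []
    for log in files:
        if log.size_mb > max_size_mb:
            dead_letters.append(log)
        else:
            counted += log.downloads
    return counted, dead_letters


def simulate(files, max_size_mb, replay_day):
    """Return the displayed total per day, replaying dead letters on replay_day."""
    total = 0
    dead_letters = []
    timeline = []
    for log in files:
        counted, failed = ingest([log], max_size_mb)
        total += counted
        dead_letters.extend(failed)
        if log.day == replay_day:
            # Manual re-feed: process everything that failed earlier, no limit.
            replayed, _ = ingest(dead_letters, max_size_mb=float("inf"))
            total += replayed
            dead_letters = []
        timeline.append((log.day, total))
    return timeline


# Invented data: days 2 to 4 produce files too large for the toy pipeline.
files = [
    LogFile(day=1, size_mb=50, downloads=40),
    LogFile(day=2, size_mb=900, downloads=45),
    LogFile(day=3, size_mb=950, downloads=42),
    LogFile(day=4, size_mb=980, downloads=44),
    LogFile(day=5, size_mb=60, downloads=39),
]

timeline = simulate(files, max_size_mb=500, replay_day=5)
for day, total in timeline:
    print(f"day {day}: displayed total {total}")
```

In this sketch the displayed total stays flat for days 2 to 4 and then rises by the sum of several days at once, while no download is actually lost.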

We apologize for this very bad experience, and it's also hurting us now. We have scheduled a rebuild of this pipeline with a much better architecture and implementation, starting this sprint, and it will replace our legacy statistics pipeline soon.

I see this is closed, but is the issue fixed? As a user it is incredibly frustrating to see issues closed without an actual solution implemented, or at the bare minimum a reference to a new ticket to track it. This isn't how issue trackers are supposed to work!

For the following package the stats details are:

https://www.nuget.org/packages/SixLabors.ImageSharp/

Statistics last updated at 2020-04-30 16:47:41 UTC

This indicates that the system is broken again.

You mean it’s still broken 😁 I agree that it would be awesome to have somewhere we could follow what’s happening with this, or any form of timeline. Right now the stats are just a guessing game.

I agree, shouldn’t this be kept open until a fix has been implemented? Someone else is going to just create a new issue thinking this hasn’t been addressed. It would be nice to track this somewhere.

We are all divas and need to know how much people love us via our download counts 😂

Yeah, I mean for us developing completely free software with no additional registration or built-in spyware crap, these statistics are the only indicator of which version people are using, how fast people migrate to new versions and so on.

If these numbers don’t start to work we have to look into other ways of tracking usage statistics, because it’s extremely important for maintenance.

This, and that I like to update my CV with the download count every day (just kidding) 🤪
