In our maps we are now loading data from https://publiclab.org/api/srch/nearbyPeople?nwlat=10.0&selat=0.0&nwlng=0.0&selng=10.0
This code is located here: https://github.com/publiclab/plots2/blob/master/app/api/srch/search.rb#L333
Accessing this API takes a lot longer than it should. For example, it takes 19.43s to load.
Meanwhile another API we use in our map layers has far more results in only 83ms!
Some of our other Search API requests perform much better than nearbyPeople. Why is that particular one so slow? In order for our maps to be useful we can't have such a long delay in loading every time we refresh or move the map.
I suspect that we are going to have to use fewer search params in order to get the speeds we want. The problem is I'm not sure how we can test those params without pushing and merging the changes to the live site, and that would make for a really slow iteration cycle. Anyone want to look into this?
@nstjean I would like to try this one. I think the issue could be the larger number of parameters being processed, which makes loading and processing the data take longer.
@dburugupalli Go for it!
Yes we need to decrease the time taken by that query. Thanks!
@dburugupalli my first suggestion is to try adding a limit to this. we should try to trace the query -- adding more info here:
This is the query, which i suspect runs real slow on the whole huge users table. We can limit with an additional:
.where(status: 1)
which will help a lot. We could also join the nodes table and insist that the user has a node with node.status == 1, which would help too.
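As a rough ActiveRecord sketch (the association and column names here are assumptions, not confirmed against the plots2 models, so treat this as a starting point rather than a drop-in fix), the two suggestions combined might look like:

```ruby
# Hypothetical sketch -- association and column names are assumed.
User.where(status: 1)              # only active users
    .joins(:node)                  # require at least one node...
    .where(node: { status: 1 })    # ...that is itself published
    .limit(100)                    # cap the result set
```

The `limit` matters as much as the filters: without it, the query scans and returns the whole matching slice of a huge users table on every map refresh.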
Yes, here is the Skylight report on this query:
Full report here: https://oss.skylight.io/app/applications/GZDPChmcfm1Q/1583245980/6h/endpoints/Srch::Search%20%5BGET%5D%20srch%2FnearbyPeople
That actually says the slow query is the user tags query...
So maybe when we run this, it is running on a TON of records:
We could limit it to 100
Also, are we re-running a user_tag query on each instance, on this line?
item.user_tags.none? do |user_tag|
If so, we could optimize this.
I think this'll take a little experimentation, but if that is a nested query on item.user_tags.none?
we should be able to speed this up a LOT by a) limiting the results and b) not nesting the query -- instead inspecting the item to see whether it has any tags without generating a new nested query.
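To illustrate the difference, here's a pure-Ruby sketch with stand-in data (not the real models): the "nested" style issues one lookup per user, while the "preloaded" style does one lookup total and filters in memory.

```ruby
# Stand-in data: user id => list of tag names (not the real models).
TAGS = { 1 => ["lat:44"], 2 => ["location:blurred"], 3 => [] }

queries = 0

# Nested style: one "query" per user.
nested = TAGS.keys.reject do |uid|
  queries += 1 # each lookup would hit the database
  TAGS[uid].include?("location:blurred")
end

# Preloaded style: one "query" total, then filter in memory.
queries_preloaded = 1
all_tags = TAGS # imagine this came back from a single query
preloaded = all_tags.reject { |_uid, tags| tags.include?("location:blurred") }.keys

puts nested.inspect    # => [1, 3]
puts preloaded.inspect # => [1, 3]
puts queries           # => 3
puts queries_preloaded # => 1
```

Same result, but the query count drops from N (one per user) to 1, which is exactly the kind of win we'd hope for on a large result set.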
Looking at whether we need to add an index, or reduce recursion in query nesting...
Hmm. user_tags
has no indices: https://github.com/publiclab/plots2/blob/15589c95b54759b2828ea0d9b3571d2bebf8c2dc/db/schema.rb.example#L417-L422
Not sure if it needs them? Checking # of records now.
Primary key ought to auto-index...
So, i ran it in the production console and it's really fast: UserTag.find_by(uid: 1)
So, i think it's the nesting - i bet we are running it on a ton of users.
https://github.com/publiclab/plots2/blob/783fbe2161bd70e13ad5e62ac6ab9254de3661b5/app/services/search_service.rb#L197
and
https://github.com/publiclab/plots2/blob/783fbe2161bd70e13ad5e62ac6ab9254de3661b5/app/services/search_service.rb#L176
have
User.where('rusers.status <> 0')
.joins(:user_tags)
the same statement running twice! Can we optimize here by storing the result?
Shouldn't there be a status <> 0
clause on the user query in https://github.com/publiclab/plots2/blob/783fbe2161bd70e13ad5e62ac6ab9254de3661b5/app/services/search_service.rb#L187 too?
I think there should be!
Two possible ways to reduce time are short-circuiting where possible, and rearranging the SQL statements so that the first clause eliminates more rows than the later clauses.
I think the first and second referenced in your comment are the latitude and longitude calcs I want to combine.
https://blog.nrowegt.com/rails-pg-ambiguous-column-reference/
OK, i was able to get this to work:
u = User.where('rusers.status <> 0').joins(:user_tags).where('user_tags.value LIKE ?', 'lat%').where('REPLACE(user_tags.value, "lat:", "") BETWEEN 44 AND 44.1').joins('INNER JOIN `user_tags` AS `lontags` ON lontags.uid = rusers.id').where('lontags.value LIKE ?', 'lon%').where('REPLACE(lontags.value, "lon:", "") BETWEEN 20 AND 80')
Formatted:
u = User.where('rusers.status <> 0')
.joins(:user_tags)
.where('user_tags.value LIKE ?', 'lat%')
.where('REPLACE(user_tags.value, "lat:", "") BETWEEN 44 AND 44.1')
.joins('INNER JOIN `user_tags` AS `lontags` ON lontags.uid = rusers.id')
.where('lontags.value LIKE ?', 'lon%')
.where('REPLACE(lontags.value, "lon:", "") BETWEEN 20 AND 80')
Hmm. @cesswairimu actually the block I'm still not sure about is:
item.user_tags.none? do |user_tag|
user_tag.name == "location:blurred"
end
What's going on here?
i think the block is evaluated for each item of the array, and the whole thing returns true only if the block returns true for none of the array members: https://ruby-doc.org/core-2.7.0/Enumerable.html#method-i-none-3F
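Right -- a quick plain-Ruby illustration of that behavior (just sample tag strings, nothing from the real models):

```ruby
# Enumerable#none? with a block: true only if the block is false for every element.
tags = ["lat:44.0", "lon:20.5"]
no_blur_before = tags.none? { |t| t == "location:blurred" }

tags << "location:blurred"
no_blur_after = tags.none? { |t| t == "location:blurred" }

puts no_blur_before # true: no blurred tag yet
puts no_blur_after  # false: blurred tag now present
```

So in the search code, the check passes only for users who have no `location:blurred` tag at all.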
OK, i added .limit(limit)
in a couple places too. I hope this helps!!!
https://github.com/publiclab/plots2/issues/7556#issuecomment-610625735
Could the REPLACE statements move towards the end? After the other clauses have filtered the rows, there will be fewer of them, so running REPLACE as a last step could improve performance. (I don't have full context, so this may be wrong.)
https://github.com/publiclab/plots2/pull/7795 is now merged, although one of the .limit()
statements had to be removed. We should re-test this after today's live site republication and see if performance improved?
Current:
https://oss.skylight.io/app/applications/GZDPChmcfm1Q/1586809440/1d/endpoints/Srch::Search%20%5BGET%5D%20srch%2FnearbyPeople
Now not seeing much more than 8s, much better than 60!!!
https://oss.skylight.io/app/applications/GZDPChmcfm1Q/1586902080/30m/endpoints/Srch::Search%20%5BGET%5D%20srch%2FnearbyPeople
but needs more testing.
Test it out at this zoom level: https://publiclab.org/map#9/40.71372914097432/-73.47581131383778
cc zoom level optimizations: https://github.com/publiclab/plots2/issues/7822
OK, so i'm not seeing a huge difference between before and after April 14; both are averaging around 17 seconds or so, and both have outliers up to more than a minute of load time, so not good. But even before the 14th, the "what time is spent on" chart is quite different for different periods.
We just don't have a huge amount of traffic so statistics aren't great, but I'll screenshot a few "moments" so we can look at different possible sources of inefficiency:
https://oss.skylight.io/app/applications/GZDPChmcfm1Q/1586919420/1d/endpoints/Srch::Search%20%5BGET%5D%20srch%2FnearbyPeople
this has 3 queries:
SELECT DISTINCT `rusers`.* FROM `rusers` INNER JOIN `user_tags` ON `user_tags`.`uid` = `rusers`.`id` WHERE (rusers.status <> ?) AND (value LIKE ?) AND (REPLACE(value, ?, ?) BETWEEN ? AND ?)
SELECT `rusers`.* FROM `rusers` INNER JOIN `user_tags` ON `user_tags`.`uid` = `rusers`.`id` WHERE (rusers.status <> ?) AND (rusers.id IN ?) AND (user_tags.value LIKE ?) AND (REPLACE(user_tags.value, ?, ?) BETWEEN ? AND ?)
SELECT `rusers`.* FROM `rusers` INNER JOIN `user_tags` ON `user_tags`.`uid` = `rusers`.`id` WHERE (rusers.status <> ?) AND (rusers.id IN ?) AND (user_tags.value LIKE ?) AND (REPLACE(user_tags.value, ?, ?) BETWEEN ? AND ?) ORDER BY created_at DESC LIMIT ?
That was all just one query, for the period of time before the April 14th optimization.
This next one was after the optimization, and the query was simpler:
SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = ?
It does seem that most slowness after April 14th was to do with user_tags
for a single user... that's weird.
Could it be that for large sets of users, this is just these lines running over and over?
If true (and we should be able to test that on the console?) we should be able to do this filtering a different way, as we should already have loaded the user tags and be able to additionally sort by them. But maybe we haven't eager-loaded the associations?
Could we a) do this in 2 batches, one for users with a location:blurred
tag and one for users without, and/or b) eager-load the user tags so we can filter without doing individual queries for each user?
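Option (a) could be sketched in plain Ruby like this (stand-in hashes, not the real models): with each user's tags already loaded by one query up front, splitting into the two batches needs no further queries at all.

```ruby
# Stand-in: users with their tags already eager-loaded (one query total).
users = [
  { id: 1, tags: ["lat:44.0", "lon:20.0"] },
  { id: 2, tags: ["lat:44.1", "lon:20.5", "location:blurred"] },
  { id: 3, tags: ["lat:44.0", "lon:21.0"] },
]

# partition splits into [blurred, precise] in memory -- no per-user queries.
blurred, precise = users.partition do |u|
  u[:tags].include?("location:blurred")
end

puts blurred.map { |u| u[:id] }.inspect # => [2]
puts precise.map { |u| u[:id] }.inspect # => [1, 3]
```

In Rails terms that would presumably mean something like `includes(:user_tags)` before the loop, but I haven't verified that against this codebase.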
Also, pssst @Tlazypanda the staff and I were talking about this today and I thought you may also want to put eyes on it since it is performance related! :)
@emilyashley Yeah sure! Would need a bit of context on this first though :sweat_smile: Will look into it :smile:
Hi @daemon1024 -- i wondered if this might be one you'd like to take up in your project, if you end up having time. It may have a big impact on usability. Thanks!
Hey @jywarren, Definitely. I actually mentioned benchmarking the People API in my proposal and this gives a lot of relevant context as well.