I'm running the latest 2020.12-rc and updated one hour ago. Apart from the login and the first display of the first page of my timeline, everything is extremely slow.
Everything seems to be very slow and the browser keeps loading "forever" after I log in.
The server itself is idling and other applications on that server are working completely normally and fast.
Do you experience the same?
My Friendica server is not usable at the moment, with wait times of 3-10 minutes after each click.
Can you execute show processlist; in the mysql console?
Id User Host db Command Time State Info Progress
41 DELAYED localhost other Delayed insert Waiting for INSERT 0.000
36808 Friendica localhost other Sleep 0 "" 0.000
37349 other localhost other Sleep 180 "" 0.000
37595 other 127.0.0.1:59556 other Sleep 38 "" 0.000
37631 other localhost other Sleep 36 "" 0.000
37688 other localhost Query 0 init show processlist 0.000
Have you had a look at the number of open connections if you are using fcgi?
Yes. I have had a very high number of connections configured for over a year now. I didn't change the configuration of Friendica, nor MariaDB. Last weekend I just looked up the "default database format" and ran a mysqlcheck -o on all databases I have on the server. After that everything worked fine, until I did a git pull this morning. Since then I have had this problem.
I'm not using fcgi, just the standard apache mod.
Same situation here.
The pages are loading fine here. But I had some pretty heavy I/O load after yesterday's pull. The NVMe SSD drives were at 100% utilization. It lasted for approx. 8 hours. Then it went back to normal. I cannot say what it was (I was sleeping). Looks like some heavy background task.
There is a new background task that removes no longer needed public contacts. The initial execution can take some time.
I saw such a heavy process last week, but it was gone soon.
But still, at the moment I can't even log in any more.
Correction: I could log in after 5 minutes... and every page load takes just as long, 5-10 minutes.
Server is still idling.
Have you had a look at your webserver's log if there is a lot of activity?
The server is idle the whole time, around 0.2-0.4 load, with external connections looking normal as always. So nothing unusual is happening on the server.
The only pages that are quick are pages that don't need a timeline, like the contact and admin pages.
What does mysqltuner.pl say about the highest number of connections and possibly some other values to adjust?
Have a look at the value for table_open_cache as well. Also please check if the PDO emulation from PHP might help. Enable pdo_emulate_prepares in the local.config.php:
'database' => [
    'hostname' => 'servername',
    'username' => 'user',
    'password' => 'password',
    'database' => 'database',
    'pdo_emulate_prepares' => true,
],
What time is measured in the "Other" section of the rendertime addon? For me, that is where the 5 minutes are hiding. I have 50 MySQL connections, and the PDO emulation doesn't change anything, not even the idling of the RasPi.
"Other" is everything outside the explicitly measured parts. It can be the session handling as well. You could try activating 'session_handler' => 'native' in your local.config.php. (After activating it you have to log in again.)
No, it is not the session handling. It is _something_ with loading items for timelines or single posting views. There the "Other" value hits the 250s mark while on other pages it says 1.2s.
Look, on my server there are Friendica, a WordPress blog, a b2evolution blog and an RSS server. Everything is running smoothly as always except Friendica. So I doubt there is something wrong with my database settings, because if there were, one or all of the other services should "stutter" as well.
I don't have errors in the logs, nor is the server overloaded, as I wrote several times above.
I'll try to roll back to an earlier state of development and see if the error is still there, because I think some changes in Friendica's code or frontend are causing the issue that makes my Friendica unusable.
If it were something with loading the items, then the database should be the slow part. That's weird.
I think it has something to do with loading items for the timeline, because plain loading from the database is OK (like loading contacts). Loading single items (clicking on the notifications) or loading the whole timeline is a pain.
Maybe one more strange thing that may or may not be related: I have been getting some old posts in my timeline the past days. Posts that I already received, from 6 months ago up to 2 weeks ago. I noticed around 10 such posts.
@hoergen if you have the time, git bisect can help identify the crucial commit. It iterates between a known-good start commit and a bad end commit until the first bad commit is found. I may find some time for this, but only over the weekend.
Due to the changes in the database structure I wouldn't recommend "git bisect". For example, we added a new field to the workerqueue. This means that when going back to a version prior to that, all waiting jobs in the queue will be trashed.
But so would rolling back the git checkout.
And queued jobs are fine, but when you cannot use the system anyway, they are rather pointless.
Hmmm, ok, so I am locked into the current situation and have to wait until some magic happens ;)
TL;DR
This issue seems to be user specific. A user of my node reported the same issue. But for me everything is as fast as usual.
We've also introduced BBCode versioning, where previously cached items are re-converted when we change something in the BBCode conversion. This allows retroactively applying fixes to how posts look. Previously the cache would last indefinitely, which is great for performance, but broken posts would stay broken forever, even after a fix was introduced.
This may have had an impact on the display of stream pages with high item counts, but I'm as puzzled as @annando about the origin of the slowness for the login page. We would have to run an XDebug trace to find the exact origin of the slowness.
Oh wow, you are right. I had chosen two accounts that are affected by this "bug" and didn't try to test the other accounts. Maybe an activated addon?
But what about the login page, it wouldn't be for a specific user?
The login page here is quick as always, but when logging in with the affected users it gets slow as soon as you click on "login".
Thank you for the clarification, we're talking about the network page then, so the slowness makes more sense. Still, the BBCode cache busting mechanism should only be triggered once for each post by BBCode version increase, so the slowness shouldn't be systematic.
No, not only the "network" page. Every page with "posted items" is slow, as well as the personal wall and the display of single items.
So there are at least 3 pages I could identify.
While it is loading, the icons at the top look like this (maybe it helps):

Thank you for the added details. Still, as long as we don't experience the slowness ourselves, we have little chance to address it without an XDebug trace. I can guide you through it.
Or maybe it has something to do with jQuery? In the web console I get this error while waiting for the page to load:
Uncaught TypeError: data.notifications is undefined
<anonymous> https://horche.demkontinuum.de/view/js/main.js?v=2020.12-rc:296
jQuery 7
NavUpdate https://horche.demkontinuum.de/view/js/main.js?v=2020.12-rc:510
jQuery 7
NavUpdate https://horche.demkontinuum.de/view/js/main.js?v=2020.12-rc:507
setTimeout handler*NavUpdate https://horche.demkontinuum.de/view/js/main.js?v=2020.12-rc:527
setTimeout handler*NavUpdate https://horche.demkontinuum.de/view/js/main.js?v=2020.12-rc:527
As a hint I get this page: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Errors/Unexpected_type
It's not true for all accounts, only for some... I think aliens took over my Friendica server ;)
But... on the other accounts I didn't get this error, until :( I reloaded the page without cache via Shift+Reload (clicking the reload button in Firefox).
As soon as I do that on an account where I didn't have the problem, I have it now.
So this bug will affect everybody sooner or later... try it, if you dare ;)
PS: my feeling is it has something to do with jQuery...
BTW: Have you tried other browsers?
Yes, Opera and Chromium. The same result.
Chromium and Firefox here.
Ok. I'm on a train right now. Let me clear my browser's cache. If you don't hear from me anymore....
:-)
Nope. Network page is still loaded in less than 2 seconds.
The network page on pirati.ca loads in less than 0.8 seconds, single pages load in less than 0.2 seconds. Is the low speed reported by rendertime? If yes, then it can't be the browser. But of course it can always be some addon, so disabling them could help.
Hmm. I'm using all add-ons that are installed on the server. Still, some local users have the issue and some don't.
Wait... I don't use the markdown add-on.
Deactivating Markdown addon doesn't change the behaviour. :(
If I'm not logged in, the page loads very quickly.
If I'm logged in, loading the page takes 3-5 minutes.
example: https://horche.demkontinuum.de/display/52acf9cd-f9d18fce1f7b1d74-7785a3f3
_WTF, what is going on there? It is like cyberbullying myself ;)_
Yup. Noticed that too. But I had to leave the train, so I couldn't post the result of the test.
This isn't a browser add-on thing, is it?
I tried all my 15 accounts on 4 nodes on 2 different servers (all on current RC). None of the accounts has this issue. I really tried. But I cannot reproduce it. There's nothing suspicious in the browser console. No PHP errors. No super slow SQL queries. But still, some local users do have the issue. I'm puzzled. 🤔
The only slow query is this one:
# Query_time: 15.444776 Lock_time: 0.000020 Rows_sent: 0 Rows_examined: 33879125
# Rows_affected: 0 Bytes_sent: 76
# Tmp_tables: 7 Tmp_disk_tables: 0 Tmp_table_sizes: 1269880
# Full_scan: Yes Full_join: No Tmp_table: Yes Tmp_table_on_disk: No
# Filesort: Yes Filesort_on_disk: No Merge_passes: 0 Priority_queue: Yes
SET timestamp=1607447372;
SELECT `id` FROM `contact`
WHERE (`network` IN ('dfrn', 'dspr', 'stat', 'apub', 'zot!', 'unkn')
    AND `self` = 0
    AND (`id` IN (SELECT `author-id` FROM `item`)
        OR `id` IN (SELECT `owner-id` FROM `item`)
        OR `id` IN (SELECT `causer-id` FROM `item`)
        OR `id` IN (SELECT `cid` FROM `post-tag`)
        OR `id` IN (SELECT `cid` FROM `user-contact`)
        OR `uid` != 0)
    AND (`last-update` < '2020-11-08 17:09:01'
        OR (NOT `archive` AND `last-update` < '2020-12-01 17:09:01')))
ORDER BY `last-update` LIMIT 100;
It is user related.
I tried it with several browsers and several accounts, and it is only related to the account. The other accounts that were slow in between... maybe that was just a coincidence(?). But my main account, which manages all the other accounts, has this problem.
Maybe it is related to "manage accounts"? Maybe a database query gets some weird instruction because "this person is managing some other accounts"?
@AlfredSK that query is an expected slow query. This is the one that cleans up public contacts that aren't in use anymore.
What is this? Column '729'? This looks wrong. I see a lot of these in the log file...
2020-12-08T16:08:24Z index [ERROR]: DB Error {"code":1054,"error":"Unknown column '729' in 'field list'","callstack":"DBA::selectFirst, Photo::getAvatar, Photo::init, call_user_func, Module::run, App::runFrontend","params":"SELECT `729`, `1` FROM `contact` WHERE (1) LIMIT 1"} - {"file":"Database.php","line":664,"function":"p","uid":"17a332","process_id":1735845}
I have the same problem with very slow loading. This profile is also affected by the error "Not getting updates from some profiles" #9634. Could the errors be related?
@AlfredSK this is something different, but will be fixed in a minute.
Using the API seems to work _fine_.
From an affected user...
https://libranet.de/display/0b6b25a8-185f-cf54-3047-de7524719648
What matters is whether I'm logged in. I get the same issue with curl. When logged in, transfer stands still at zero for two minutes, then starts.
$ time curl -o /dev/null https://libranet.de/~clacke
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1085k 0 1085k 0 0 95372 0 --:--:-- 0:00:11 --:--:-- 95138
real 0m11,754s
user 0m0,068s
sys 0m0,033s
$ time curl -o /dev/null 'https://libranet.de/~clacke' -H 'Cookie: PHPSESSID=****; Friendica=****'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 3528k 0 3528k 0 0 26272 0 --:--:-- 0:02:17 --:--:-- 270k
real 2m17,610s
user 0m0,099s
sys 0m0,125s
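For what it's worth, curl can split that wait into phases by itself, so you don't need `time` to see where the two minutes go. A minimal sketch (the wrapper name `timing` is made up; the URL and cookie placeholders are the ones from above):

```shell
# timing(): report curl's built-in timers for a URL.
# 'time_starttransfer' is when the first byte arrived, i.e. roughly the
# server-side think time, as opposed to 'time_total' which also includes
# the download itself.
timing() {
  curl -s -o /dev/null \
    -w 'connect: %{time_connect}s  first byte: %{time_starttransfer}s  total: %{time_total}s\n' \
    "$@"
}

# timing 'https://libranet.de/~clacke'
# timing 'https://libranet.de/~clacke' -H 'Cookie: PHPSESSID=****; Friendica=****'
```

If `first byte` accounts for nearly all of `total` only when the session cookie is sent, the time is spent server-side before any HTML is emitted, which matches the "transfer stands still at zero" observation.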
I guess a log could help.
Sure. The problem is I cannot reproduce it myself. My accounts are not affected.
I'm currently trying to enhance the profiler with some additional data. The current state is that I'm still struggling to understand how the classes are initialized and executed. This "dependency injection" is still a mystery to me.
Hit me up if you need an explanation or specific details.
I'm the affected user on Libranet. My intuition when there is an extreme slowdown like this and it's user-dependent is that a database table somewhere is missing an index.
Hmm, then it has to be something to do with the permission settings on private posts, because anonymous access to public posts is quick. And another strange thing is that access via the API is still quick, as @tobiasd said earlier in the thread.
In my case it looks like Friendica is just waiting for something, because the database is idle too. If there were a missing index, the database should be visibly working, shouldn't it?
...and there is no slow query. It cannot be a DB problem.
A possible cause could be timing out external requests. This wouldn't impact the node load (CPU, disk I/O or MySQL) while it's waiting for a response from a remote server.
Okay. What can I look/grep for in the logs? Ideas?
When I just watch the logs I don't see anything suspicious - except some weird 404s for /photo/somerandomnumber-4.png?ts=153xxxx794 in the Apache access log.
But then this should happen on my server and on @tobiasd's, and it has persisted in my timeline and on my personal wall for over 3 days now.
Following @AlfredSK's assumption, and I think he is on the right track, maybe we should start checking the user's database entries together with the post permission settings while loading a timeline (with the logged-in user)?
I added two PRs to hopefully help detect the bottleneck. As far as I understood, the profiler says the time is being eaten in the "other" part, isn't it?
So on my system 296 out of 297 seconds come from the "Content" section in the rendertime display.
Okay. That helps a little bit with tracking it down. Please do the following: Enable at least "info" in the logging. It happens with /display/ pages as well, doesn't it? Then visit such a page. Then look in the log file for the guid of that page (the part after /display/). Check whether the timestamp matches the time you visited the page. Then look for the value of "uid" in that log entry, and collect all log entries with the same "uid" in them. Have a look at the timestamps. If they are plausible, you should have entries from the whole phase of the visit - and at some point there should be a time gap of several seconds. That part of the log would be really interesting to see.
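One possible way to spot that gap, assuming GNU date and the timestamp format of the log lines quoted in this thread (the function name and the 5-second threshold are arbitrary choices): feed it the lines pre-filtered for one "uid".

```shell
# find_gaps: read log lines (ISO timestamp first, as in the entries above)
# on stdin, echo them back, and flag any pause longer than 5 seconds
# between consecutive entries.
find_gaps() {
  prev=0
  while IFS= read -r line; do
    # take everything up to the first 'Z' as the timestamp
    ts=$(date -u -d "${line%%Z*}Z" +%s 2>/dev/null) || continue
    if [ "$prev" -ne 0 ] && [ $((ts - prev)) -gt 5 ]; then
      echo ">>> gap of $((ts - prev))s"
    fi
    printf '%s\n' "$line"
    prev=$ts
  done
}

# e.g.: grep '"uid":"da544e"' friendica.log | find_gaps
```

The lines immediately before a flagged gap are the interesting ones: they show what Friendica was doing when it started to wait.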
I doubt it somehow ;) With the log level set to "info", when opening threads via the notification menu I only see the information that the posting with that guid was received
2020-12-10T07:09:48Z index [NOTICE]: created item {"id":2566654,"uid":0,"network":"apub","uri-id":1578168,"guid":"22b2c6df-a2fc5fff39cd7155-9b3253b7"} - {"file":"Item.php","line":1925,"function":"insert","uid":"da544e","process_id":30420}
2020-12-10T07:09:49Z index [NOTICE]: created item {"id":2566655,"uid":2,"network":"apub","uri-id":1578168,"guid":"22b2c6df-a2fc5fff39cd7155-9b3253b7"} - {"file":"Item.php","line":1925,"function":"insert","uid":"da544e","process_id":30420}
or on an older thread, that it got updated
2020-12-10T08:35:31Z index [INFO]: Update thread {"item":2559109,"guid":"2196d4ee-415f-ccc7-9f98-3cd161792462","result":true} - {"file":"Item.php","line":3406,"function":"updateThread","uid":"3c378f","process_id":3222}
But I don't see a log entry for opening the /display page.
I get the same waiting times when I click on "preview" while writing comments. Hope that gives another hint to narrow the search.
@tobiasd It does happen with that display as well, doesn't it? Then you could also add some logging yourself. Since it seems to be in the content section, you could open mod/content.php and add logging entries like Logger::notice('Test-A'); to display_content: one directly on the first line, one before the return. Then you can check if the time delay falls between those two entries. And if it does, you can track it down by adding more and more entries in between.
mmh... maybe that is the problem ;-) I don't have a mod/content.php file
$ ls mod/c*
mod/cal.php
where was the content moved to?
I meant mod/display.php, sorry.
The "process_id" in the log file entries for opening a single posting on /display should only contain entries made during the opening of that display page, right?
Ok, there is "a ton" of probing of contacts that are not involved in any form with the thread I've picked. It is looking at an RSS feed entry, but probing various contacts from partially dead nodes in the fediverse for about 5 minutes before displaying the item.
I would focus on the value of "uid" in the log file. Since Friendica does a lot of things in parallel (some in the frontend, some in the backend), they all get mixed up in this single log file. So you need one anchor log entry where you know it was generated at the point you want to look at. Then filter for all log entries with that specific "uid" value. That shows a trace of what was done in that specific task.
If this probing is done under the same "uid", then I really would like to see that log.
In this case, both process_id and uid seem to have the same content. I'll send you a link.
The addon "superblock" is responsible.
If this is the case it would explain why my accounts are not affected. I don't have any entries in the superblock settings.
I will do a fix for this later today.
mmh... I had that addon in use once - but it is deactivated server-wide at the moment.
YEAH! I deleted all entries in that addon and now everything is fast again.
PS: simply deactivating it didn't work... but it should work?
Yeah, it's the entries in that addon. Simply deactivating will not work.
But normally it should work, i.e. have the same effect as deleting all entries. I think that points to another problem (a minor bug).
That addon is some kind of fake addon, since the functionality is in the main code and not in the addon.
Hmm, that is rather confusing information.
The addon only adds some front-end functionality so that the user can do some blocking, which in reality is a core functionality.
Ah ok, so it isn't really an addon any more. Maybe it should be integrated and labeled in the core (admin frontend) as such, so it is clearer to admins and gives the functionality the correct credit ;)
And it would make bug hunting easier.
This addon is superfluous anyway. We can now block any contacts directly from the core. It dates from a time when you had to follow someone to block them.
I've applied the patch on my system and the times are back down to the normal 3s :)
You applied it manually, didn't you? Not via a git pull, I mean.
Yes, manually. One of the tests is failing for some reason, but I wanted to see if the patch fixes the bug anyway. After all, it could not get more unusable ;)
Ok, I tested it with superblock activated and all its entries. The fix worked and the issue can be closed.