Matomo: Tracker: Serialization failure: 1213 Deadlock found when trying to get lock

Created on 7 Oct 2014  路  56Comments  路  Source: matomo-org/matomo

so i use the tracking api.

my code separates actions into groups of 500-800 (depending on session length)

each group is put into a queue, and threads send these groups to piwik to be tracked.

currently 15 threads send these groups simultaneously.

why i do this:
in my particular setup, i did some testing and around 500 was the sweet spot for piwik's ability to handle bulk tracks (less didn't decrease the time much, and more started increasing the time significantly)

i also noticed that the times it takes to track these hits doesn't seem to increase much if you have multiple bulk requests at the same time.

the problem:
i recently upgraded from 2.2.2 to 2.7.0, and now when i run the same importer, i start getting error 500s from the tracker requests.

looking at the apache error logs, there seems to be deadlock issues:

Bug

All 56 comments

I was able to reproduce this on my local machine while trying to import big log file (I guess it was Piwik 2.5).

thanks for reproducing

Thanks for the report, we should investigate this for sure.

Hello
Same problem for me

Thanks for the future fix

Any news about that point . Every night my importlogs script fails because of that error.
Please let me know urgently if there is a workaround for that . I am losing all my statistics since more than a week now.. !

Thanks

do you guys get other errors in your web server log or mysql error log? I don't know what can cause this issue so far.

Nothing else tnat what jas been mentioned

In piwik output I have

Fatal error: Internal Server Error
You can restart the import of "/var/tmp/stats/app/access_full" from the point it failed by specifying --skip=21056 on the command line.

And in apache log :
Hote that's help

I can also reproduce the error in 2.9.1.

I tried to import a logfile of the size of 60 MB.

This is the output of mysql.err (mariadb 5.5.34)

141120 15:22:20
** (1) TRANSACTION:
TRANSACTION 97983A, ACTIVE 1 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 6 lock struct(s), heap size 1248, 4 row lock(s), undo log entries 26
MySQL thread id 197643, OS thread handle 0x7f958fb85700, query id 37275520 piwik-db-server.tld piwik-server.tld piwik_basic Updating
UPDATE piwik_log_visit SET idvisitor = '^SscD', visit_total_time = '1265', visit_last_action_time = '2014-10-08 17:57:21', visit_exit_idaction_url = '1607782', visit_total_a
ctions = visit_total_actions + 1 WHERE idsite = '7' AND idvisit = '1300111'
*
* (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1172 page no 2854 n bits 136 index PRIMARY of table piwik_basic.piwik_log_visit trx id 97983A lock_mode X locks rec but not gap waiting
** (2) TRANSACTION:
TRANSACTION 979835, ACTIVE 1 sec starting index read, thread declared inside InnoDB 500
mysql tables in use 1, locked 1
11 lock struct(s), heap size 1248, 10 row lock(s), undo log entries 175
MySQL thread id 197639, OS thread handle 0x7fabd81b4700, query id 37276630 piwik-db-server.tld piwik-server.tld piwik_basic Updating
UPDATE piwik_log_visit SET idvisitor = '^So', visit_total_time = '1321', visit_last_action_time = '2014-10-08 18:02:39', visit_exit_idaction_url = '1607836', visit_t
otal_actions = visit_total_actions + 1 WHERE idsite = '7' AND idvisit = '1300276'
*
* (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1172 page no 2854 n bits 136 index PRIMARY of table piwik_basic.piwik_log_visit trx id 979835 lock_mode X locks rec but not gap
** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1172 page no 2855 n bits 144 index PRIMARY of table piwik_basic.piwik_log_visit trx id 979835 lock_mode X locks rec but not gap waiting
*
* WE ROLL BACK TRANSACTION (1)
InnoDB: transactions deadlock detected, dumping detailed information.

I get the same error messages with Piwik 2.9.1 and Oracle's mysql-community-server-5.5.33 on openSUSE 12.2 (x86_64), and I seem to observe gaps in the imported data.

The error messages no longer appear when I change the (non-default) importer option --recorders=4 to --recorders=1 (or omit it). Of course this workaround slows down the import on a multi-core machine.

My solution is to "downgrade" (reinstallation) to piwik 2.6.1. --recorders=24 is not a problem!
Deadlocks are also gone !

I have used a number of versions of PIWIK previously and am seeing this problem with 2.9.0.

Previously I have collected about 12 months worth of data and am running via BASh script to process through my archives, consolidate data into reasonable "chunks" and then upload with the python script...

Once I see the error on my main database, it then seems to repeat a lot. It is as if I have blocked the pipe and everything backs up. I have tried a number of things just to experiment to see if I can see a root cause:

  • I can take the same set of data, that fails to upload with this error and upload the same to a fresh new clean and empty database no problem whatsoever
  • I can retry loading to the main database and over time it will upload slowly and eventually
  • I can take other data to another machine in the network and upload from another instance of PIWIK pointing to the same dbase, at high speed

I conject that {for whatever reason} this is the dbase in a bad state, and all MySQL information about these kind of things suggests that its all about how the data is being pushed in, causing MySQL to "protect" its data (my words).

Furthermore pietsch's "fix" suggests that the database end of the connection has reached a state that can acceot data but (humanised word) "slow down" the rate of input....

So a question to PIWIK. Does the database just accept data when uploading with python OR does the php code and database together manipulate it in the upload session.

My current approach is to slow my rebuild right down and also to stop the continuing data collection from my live source... in otherwords I have one stream of data going into the database. My worry os that once a break occurs. That will be that and I will be back to uploading 20 records every 10 minutes or so.

zero-universe's suggestion is most wise at the moment....

What sort of information shall I collect to benefit this forum, assuming that I see the error once again?

Kev

any updates on this topic ?

@zero-universe nop

same here (v2.9.1), reducing recorders from 12 to 2 helped, but its only a temporary fix. i import hourly, and with only 2 threads it takes to much time. (115 req/sec instead of 280-350 req/sec)

Had to reduce recorders to 1 ... :-1:

The same affects me (2.9.1)

The same affects on version 2.10.0

Same here.
Piwik 2.10.0
IIS8.5
PHP 5.4.24
MySQL 5.6.22

Reducing recorders to 1 helps

Error query: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction
In query: UPDATE piwik_log_visit SET idvisitor = ?, visit_total_actions = visit_total_actions + 1 , visit_exit_idaction_url = ?, visit_total_time = ?, visit_last_action_time = ? WHERE idsite = ? AND idvisit = ?

We are also having this problem in this environment:

CentOS 7
Piwik 2.10.0
Apache 2.4.6
PHP 5.4.16
MariaDB 5.5.40

And like others here, reducing the number of recorders to 1 appears to have eliminated the problem.

+1 here.

Also had to reduce recorders to 1.

Debian Wheezy
Piwik 2.10.0
Apache 2.2.22-13+deb7u4
PHP 5.4.36-0+deb7u3
MySQL 5.5.41-0+wheezy1

Hi, I wrote the transaction patch in 2.4.0. I will try to find a solution, I already have an Idea that different Data handling within piwik might be the problem.

As long as the analysis takes, please try to set "bulk_requests_use_transaction = 0" in config/global.ini
That should improve the situation at the cost of speed, but not as much as "recorder = 1"

You are a genius!
I could import without generating the error.
Thank you.

Any update on this issue?

I have also noticed that if you use MySQLi instead of PDO the deadlock errors go away but the number of records/sec drop significantly.

I'm moving this to 2.12.0 and hopefully we can fix it there. Maybe @medic123de will have a solution for it :-)

FYI: this same issue is discussed in #7337 where it was identified that the deadlocks are caused by the archiving cron core:archive running at the same time which is causing those deadlocks.

@mattab I have also seen deadlocks caused by the import_logs.py script with no archive job running. I am not sure if the 2 issues are related as in the case of the archive job I am seeing the IOPS on our DB instance maxed out at over 2500 Write IOPS

I have seen deadlocks when imports were happening simulatenously with archiving. (Thanks for point that out.) But, like esapozhnikov mentions, I've also seen deadlocks when the import script was running and no archiving was happening.

@all
can someone please verify, if import_logs.py from #7348 improves or fixes that issue?

@quba are you still able to reproduce the deadlock issue with the patched importer?

Other ways of fixing (or at least making deadlocks more unlikely) could be:

  • batch smaller commits in between. E.g. when a recorder bulk loads 1000 rows, the tracker could commit in 10 (20, 30, 40) row batches to reduce locks.
  • A generally accepted way for dealing with deadlocks is to simply retry a transaction

I would prefer a fix on the tracker's side because the import-log is not the only adapter for the bulk tracking API, and therefore should be bullet proof to use without knowledge of internals.

Looks like it has been resolved in #7348.

1) Could somebody check if this proof of concept bugfix would resolve the issue? I was not able to reproduce the error anymore with a 4G input file. This will not make deadlocks occur anymore but very likely will deal with the situation (release lock and try again after a short time).

Retrying is one of the best practices with a deadlock scenario, so this is a valid way to go.
@medic123de @mattab @quba

2) It would be very interesting, if this PoC fix decreases speed significantly.

here is diff without whitespace: https://github.com/peterbo/piwik/commit/ca4084a73560410c211309ee623d13bd617be6fe?w=1

Retrying sounds good to me, +1 for this approach

@peterbo ca4084a doesn't solve the issue for me (I still have timeouts).

Note there is a workaround available for this problem (as suggested by . In config/config.ini.php under [Tracker] add: bulk_requests_use_transaction=0

(another workaround is to set recorders to 1 but this config setting may be faster on your setup)

(edited Aug 26th 2016)

Setting bulk_requests_use_transaction=0 does not work. I'm getting the same deadlock.
2nd workaround worked, but takes very long to parse our logfiles (~75 rec/sec only)

I'm unclear on the status of this bug in Piwik. There appears to have been almost no action on this in two months. The last few posts seem to disagree regarding whether there is a fix for this. Any update?

I dont thik it's solved @bryankwalton

If it can help I always get this Deadlock error on the same table (piwik_option table) when the logs import process tries to import data from logs.

The full error is:

2015/06/30 18:51:55 [error] 42998#0: *12755 FastCGI sent in stderr: "PHP message: Error in Piwik (tracker): Error query: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction                                 In query: INSERT INTO `piwik_option` (option_name, option_value, autoload)  VALUES (?, ?, ?)  ON DUPLICATE KEY UPDATE option_value = ?                                 Parameters: array (   0 => 'report_to_invalidate_2229_2015-06-25',   1 => '1',   2 => 0,   3 => '1', )" while reading response header from upstream, client: 127.0.0.1, server: piwik01.dsaw.unibo.it, request: "POST /piwik.php HTTP/1.0", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "localhost"

Ok it's not a real solution but I'm avoiding potential data loss like this:
and /var/www/piwik/libs/PiwikTracker/PiwikTracker.php:633

        $success = false;
        $count = 0;
        while(!$success) {
            try {
                $response = $this->sendRequest($this->getBaseUrl(), 'POST', $postData, $force = true);
                if ($response){
                    $success = true;
               }
            }
            catch(Exception $e){
                sleep(3);
                $count++;
                //error_log($url + " - " + $count, 3, "/var/log/piwik/test.log");
                // throw $e;
            }
        }

/var/www/piwik/libs/PiwikTracker/PiwikTracker.php:1384

            $success = false;
            $count = 0;
            while(!$success) {
                try {   
                        $response = file_get_contents($url, 0, $ctx);
                       if ($response){
                            $success = true;
                       }
                }
                catch(Exception $e){
                    sleep(3);
                    $count++;
                    //error_log($url + " - " + $count, 3, "/var/log/piwik/test.log");
                    //throw $e;
                }
            }

I know it's ugly but it works.

@mattab

Any update on this issue, I though it will be fixed in 2.13 but I ma still getting deadlocks.

Just ran into this as well. Changing bulk_requests_use_transaction = 0 had no effect. No archiving is running during the time import is occurring. Latest 2.14.3 installed.

I wrote about it. See link http://forum.piwik.org/read.php?2,127070.

@piwikjapan Thanks for looking into that. Your last comment seems to have fixed it for me. At least log_import is working again, after some Mysql issues giving 500 errors:

The use of "--replay-tracking" and "--recorder-max-payload-size=200" did the trick - recorders were already at 1 and bulk_requests_use_transaction=0 had no effect.

Since it seems that the solution is to not use a --recorders value greater than 1, and to add --replay-tracking and --recorder-max-payload-size, shouldn't this page be updated? It is still recommending that --recorder be set to the number of cores on the system.

Note that the workaround to set "bulk_requests_use_transaction=0" has to be set in the [Tracker] section, not as mentioned in https://github.com/piwik/piwik/issues/6398#issuecomment-91093146 in [General].

I'm hiting this with Piwik 3.1.1

I get this error with Piwik 3.3.0 and have some performance numbers (Core i7-6700, MariaDB 10.0.32):

  • transactions off, recorders=8: ~150 req/s
  • transactions off, recorders=4: ~120 req/s
  • transactions on, recorders=4: fails, too many attempts, yields a "--skip=" info.
  • transactions on, recorders=2: ~175 req/s
  • transactions on, recorders=1: ~460 req/s

The performance hit suffered by more and more transaction rollbacks can almost eat up all the benefits of using transactions in the first place. I reach up to 900 req/s and then there is a 0 multiple times, followed by the error message (and a transaction rollback). Setting --recorders=1 (or not setting it at all and using the default value of 1) leads by far to the best results.

Some additional information: I use the BotTracker plugin, and most of the deadlocks happen with queries from that plugin (SET botCount = botCount + 1), but I also have had some deadlocks without the plugin.

Impact of the payload size: By adjusting --recorder-max-payload-size I got these improvements:

  • transactions on, recorders=4, payload-size=500: ~330 req/s
  • transactions on, recorders=1, payload-size=1000: ~515 req/s

Specifying more than 500 using 4 recorders led to request entity too large errors. The speed using multiple recorders improves, but due to the deadlock rollbacks it never exceeds the speed of recorders=1.

My takeaway: Don't look for more requests per second, stay with safe 500 req/s. Just like @mddvul22 said, the recomendations on this page should be changed.

Try the changes in the pull request and see if you get better results. It's only two files that have minor changes.

Hi everyone

We have merged a Pull request which is supposed to address this issue.
---> Please upgrade to 3.5.0-b2 which includes the fix from #12733

It may not be completely fixed, but we'll need your feedback to confirm this. In the meantime, we'll mark as closed. To use the beta release please see: http://piwik.org/faq/how-to-update/faq_159/

Hi FYI, I am just seeing this with Matomo 3.5.0 and using import_logs.py with --recorders=2:

2018/05/11 08:38:21 [error] 1287#0: *10688 FastCGI sent in stderr: "PHP message: Error in Matomo (tracker): Error query: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction In query: UPDATE matomo_log_visit SET idvisitor = ?, visit_last_action_time = ?, visit_exit_idaction_url = ?,  visit_total_actions = visit_total_actions + 1 ,  visit_total_interactions = visit_total_interactions + 1 , visit_total_time = ?, custom_var_k1 = ?, custom_var_v1 = ? WHERE idsite = ? AND idvisit = ? Parameters: array (   0 => '<95>{<A7><B2>^P<D3><FF><B3>',   1 => '2018-05-11 08:38:18',   2 => 2184827,   3 => 248,   4 => 'App version',   5 => 'Web_3.13.3',   6 => 1,   7 => '1588134', )" while reading response header from upstream, client: 185.70.40.94, server: matomo.example.org, request: "POST //piwik.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm/php-fpm.sock:", host: "matomo.example.org"

Maybe I missed a bit.
While you're running your import, and before you run archive, can you run this query and send the result?

select *
from _yourdbnamehere_.piwik_option
where option_name like 'report_to_invalidate_%';

Hi! If you were asking me:

MariaDB [(none)]> select * from matomo.matomo_option where option_name like 'report_to_invalidate_%';
+-----------------------------------------+--------------+----------+
| option_name                             | option_value | autoload |
+-----------------------------------------+--------------+----------+
| report_to_invalidate_1_2018-05-11_20584 | 1            |        0 |
| report_to_invalidate_1_2018-05-11_22000 | 1            |        0 |
| report_to_invalidate_1_2018-05-13_20591 | 1            |        0 |
+-----------------------------------------+--------------+----------+
3 rows in set (0.00 sec)

Yes. I have to admit, I'm not using the python script for this, so maybe our custom loader has helped with this issue as well. But I had to make those changes to get it to work even with our loader. I'll look again soon and try to reproduce with the import_logs.py. I don't have the free time to look at this right now though. I think maybe the report invalidation transaction should be moved out of this path. Maybe make another API call out of it that just sets the value and commits, and get it out of the mix all together.

Does anyone have any suggestions to fix this issue in the meantime? I see that there's an open pull request but it looks like it may not be likely to fix the issue? "bulk_requests_use_transaction=0" seems to do very little, if anything.

If it helps, I'm encountering the exact same deadlock issue using the bulk tracking rather than the importer. So it doesn't seem to be related to the log importer.

Just created an issue see: #14619, Tracker mode: configure MySQL transaction isolation level to READ UNCOMMITTED. to avoid gap locks
we'll work on the issue soon. (Not sure if it's related, but it maybe could help?)

Was this page helpful?
0 / 5 - 0 ratings