There are a few reports of people having migrated their database from sqlite to postgres and ending up with missing or completely wrong indexes.
Tables known to be affected include:
user_ips (missing unique index so upserts fail)device_lists_remote_cache (missing unique index so upserts fail)state_groups_state (index is on (state_group) instead of (state_group, type, state_key))The indexes in question are created on background updates. We don't run the background updates on a migration (synapse_port_db truncates the background update table), which is normally fine because we expect the data in the sqlite database to have been updated. However, we create the postgres data schema by walking the delta files, so we'll end up with entirely the wrong schema.
The best fix for now is probably to create a new schema snapshot; however it would be good to think about how we could avoid this becoming a problem again in future.
We'll also need to think about how to put everyone's schema back together again now that we have messed it up.
I get this following error aswell (after applying the top posts corrections)
however the first two points are not clear to me..
psycopg2.ProgrammingError: there is no unique or exclusion constraint matching the ON CONFLICT specification
Does that mean for this device_lists_remote_cache table there are not just indexes misconfigured, but also Constraints missing?
Where can i look up how it should look, to sort it out once and for all?
Is there a .sql file with the base structure of the postgresql db, where all is listed?
Thanks
With Synapse 0.99.3 (schema version 53), this is how to fix the database (I diffed the schemas from a normal PostgreSQL installation and from one migrated from SQLite):
CREATE INDEX access_tokens_device_id ON public.access_tokens USING btree (user_id, device_id);
CREATE INDEX current_state_events_member_index ON public.current_state_events USING btree (state_key) WHERE (type = 'm.room.member'::text);
DROP INDEX device_inbox_stream_id;
CREATE INDEX device_inbox_stream_id_user_id ON public.device_inbox USING btree (stream_id, user_id);
CREATE UNIQUE INDEX device_lists_remote_cache_unique_id ON public.device_lists_remote_cache USING btree (user_id, device_id);
CREATE UNIQUE INDEX device_lists_remote_extremeties_unique_idx ON public.device_lists_remote_extremeties USING btree (user_id);
CREATE INDEX device_lists_stream_user_id ON public.device_lists_stream USING btree (user_id, device_id);
CREATE INDEX event_contains_url_index ON public.events USING btree (room_id, topological_ordering, stream_ordering) WHERE ((contains_url = true) AND (outlier = false));
CREATE INDEX event_push_actions_highlights_index ON public.event_push_actions USING btree (user_id, room_id, topological_ordering, stream_ordering) WHERE (highlight = 1);
CREATE INDEX event_push_actions_u_highlight ON public.event_push_actions USING btree (user_id, stream_ordering);
CREATE UNIQUE INDEX event_search_event_id_idx ON public.event_search USING btree (event_id);
CREATE INDEX event_to_state_groups_sg_index ON public.event_to_state_groups USING btree (state_group);
CREATE INDEX local_media_repository_url_idx ON public.local_media_repository USING btree (created_ts) WHERE (url_cache IS NOT NULL);
DROP INDEX state_groups_state_id;
CREATE INDEX state_groups_state_type_idx ON public.state_groups_state USING btree (state_group, type, state_key);
CREATE INDEX user_ips_device_id ON public.user_ips USING btree (user_id, device_id, last_seen);
CREATE INDEX user_ips_last_seen ON public.user_ips USING btree (user_id, last_seen);
CREATE INDEX user_ips_last_seen_only ON public.user_ips USING btree (last_seen);
DROP INDEX user_ips_user_ip;
CREATE UNIQUE INDEX user_ips_user_token_ip_unique_index ON public.user_ips USING btree (user_id, access_token, ip);
CREATE INDEX users_creation_ts ON public.users USING btree (creation_ts);
After running this, the two schemas are exactly the same (except for the presence of the port_from_sqlite3 table, of course). This solved the problem for me. :tada:
Thank you, @huguesdk ! Can you also post how you compared the schemas and the expected pristine schema of 0.99.3, so I can check mine? (I have run your fixes, but I don't know what synapse version I was running when I migrated from sqlite, so the fixes I need might be different from yours.)
Glad to help! I鈥檓 running PostgreSQL in a Docker container, so to dump the schema I use:
docker exec <container> pg_dump -s -d homeserver -U synapse > synapse_schema.sql
If you are not using Docker, just use the last part:
pg_dump -s -d homeserver -U synapse > synapse_schema.sql
I simply compared the two schemas with diff:
diff synapse_schema_expected.sql synapse_schema.sql
The expected schema is here. (It鈥檚 gzipped, because GitHub doesn鈥檛 support attaching .sql files.)
Thanks @huguesdk, had the same issue after migrating from Sqlite to Postgres, this solved my problem also.
@huguesdk I get this error when I run the first query
synapse=# CREATE INDEX access_tokens_device_id ON public.access_tokens USING btree (user_id, device_id);
ERROR: relation "access_tokens_device_id" already exists
@jayavanth: Was your database migrated from SQLite? To be sure what鈥檚 missing, dump the schema and diff it like explained above. What issue do you actually have?
@huguesdk Here's the diff https://pastebin.com/hK3Nv5Dm. We can ignore the db username in the diff right?
(Posted this in Synapse Admins but it got lost in the threads)
I'm trying to purge older messages in a particular room. I can see the older messages in Riot but when I run
SELECT event_id FROM events WHERE type='m.room.message' AND received_ts<'1553290290477' AND room_id=<room_id>;
I get (0) rows
Even when I do SELECT event_id FROM events WHERE type='m.room.message' ; I only get like 1000 events. We've been using it for a long time so there's wayyyyyyyy more messages than that
So, that's why I think something is wrong with my port.
@jayavanth: Looking at your diff, it seems that the only difference (besides the port_from_sqlite3 table of course) is the owner name, so your schema seems correct.
I have no experience in purging older messages, and I know little about the database structure. Did you actually do any operation on the database to purge older messages? If you did and still see the older messages in Riot, maybe it鈥檚 because they are still in Riot鈥檚 cache? Did you try clearing the cache? In Riot web, you can do this by going to Settings > Help & About > Clear Cache and Reload.
@huguesdk No, I didn't do any operation to purge it. Just cleared the cache but I can still see the messages 馃檨. What's a good place to ask this question? A new issue maybe?
@jayavanth: Are you trying to purge messages manually by making changes directly in the database? I would advice against this (unless you know the structure very well). Did you take a look at the Purge History API? Currently, the best place to ask questions about this is in the Synapse Admins room. Unfortunately, questions get quickly lost in the conversation, since there is no proper threading yet. I think that a forum (like Discourse for example) would be more appropriate for this, but currently there is none.
@huguesdk I'm using this script https://github.com/matrix-org/synapse/blob/master/contrib/purge_api/purge_history.sh which calls Purge History API. The script does a similar query to what I described above.
Totally agree with having different channels to avoid messages getting lost in the pile
I have 2 missing indices, which cannot be created because of duplicate values:
psycopg2.IntegrityError: could not create unique index "device_lists_remote_cache_unique_id"
psycopg2.IntegrityError: could not create unique index "user_ips_user_token_ip_unique_index"
Is it safe to shut down synapse and then wipe the 2 tables as they seem to be cache tables only?
Further to my earlier message, I can confirm that the following made synapse properly create the indices automatically with no adverse effects (that I have seen) - but please see the next message for a warning on federated e2e encrypted chats:
DELETE FROM public.device_lists_remote_cache;
DELETE FROM public.device_lists_remote_extremeties;
DELETE FROM public.user_ips;
I hope I didn't break anything but so far it looks good.
DELETE FROM public.device_lists_remote_cache;
DELETE FROM public.device_lists_remote_extremeties;
Just as a warning to others tempted to try the same thing: this has the potential to break e2e messaging. If you're not using e2e over federation, you'll be fine. Otherwise, you get to keep both parts when it breaks ;)
Do you by chance have a safe set of sql statements that can be run to clean things up? I am thinking cleaning up duplicates and the likes so that the indices can be created properly?
@ara4n @richvdh @erikjohnston
I'm sorry guys, but the ongoing problems with synapse_port_db are outrageously ridiculous... How many people should ran into that knife? Can you please concentrate on fixing the critical bugs in the official codebase (like synapse_port_db)? Testing such sensible parts of Synapse is essential, isn't it? Remove that script from the official github repo (if it doesn't work like expected). Otherwise, please fix it. Luckily, I tried it beforehand (on a test system):
cat homeserver.log
...
2019-06-02 13:39:53,824 - synapse.storage.client_ips - 359 - ERROR - update_client_ips-853 - Failed to insert client IP (('@alice:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBhbGljZTpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSBHNyprV3cwUU5YTkh1LlcrCjAwMmZzaWduYXR1cmUgf0HxJE-hAR0Trao_F6LlgqzpFjU4mZ4A40gymOKLveYK', '::1'), ('Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'VUONEXXEBP', 1559475590506)): InvalidColumnReference('there is no unique or exclusion constraint matching the ON CONFLICT specification\n')
2019-06-02 13:41:23,824 - synapse.storage.client_ips - 359 - ERROR - update_client_ips-871 - Failed to insert client IP (('@carol:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBjYXJvbDpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSA9ak8tZmtnKkB-dUtPcVlxCjAwMmZzaWduYXR1cmUgCi9fTVCB-1JJNy6av5APpbjoTy2YMZ8zMeToN0rpa2AK', '::1'), ('Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'ATQMZGBFYH', 1559475681482)): InvalidColumnReference('there is no unique or exclusion constraint matching the ON CONFLICT specification\n')
2019-06-02 13:42:13,824 - synapse.storage.client_ips - 359 - ERROR - update_client_ips-881 - Failed to insert client IP (('@alice:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBhbGljZTpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSBHNyprV3cwUU5YTkh1LlcrCjAwMmZzaWduYXR1cmUgf0HxJE-hAR0Trao_F6LlgqzpFjU4mZ4A40gymOKLveYK', '::1'), ('Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'VUONEXXEBP', 1559475729288)): InvalidColumnReference('there is no unique or exclusion constraint matching the ON CONFLICT specification\n')
2019-06-02 13:43:48,824 - synapse.storage.client_ips - 359 - ERROR - update_client_ips-900 - Failed to insert client IP (('@carol:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBjYXJvbDpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSA9ak8tZmtnKkB-dUtPcVlxCjAwMmZzaWduYXR1cmUgCi9fTVCB-1JJNy6av5APpbjoTy2YMZ8zMeToN0rpa2AK', '::1'), ('Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'ATQMZGBFYH', 1559475824841)): InvalidColumnReference('there is no unique or exclusion constraint matching the ON CONFLICT specification\n')
2019-06-02 13:44:18,824 - synapse.storage.client_ips - 359 - ERROR - update_client_ips-906 - Failed to insert client IP (('@alice:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBhbGljZTpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSBHNyprV3cwUU5YTkh1LlcrCjAwMmZzaWduYXR1cmUgf0HxJE-hAR0Trao_F6LlgqzpFjU4mZ4A40gymOKLveYK', '::1'), ('Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'VUONEXXEBP', 1559475854569)): InvalidColumnReference('there is no unique or exclusion constraint matching the ON CONFLICT specification\n')
cat pgsql_store/postgresql.log
2019-06-02 11:22:11.263 CEST [23641] LOG: listening on IPv6 address "::1", port 54320
2019-06-02 11:22:11.263 CEST [23641] LOG: listening on IPv4 address "127.0.0.1", port 54320
2019-06-02 11:22:11.329 CEST [23641] LOG: listening on Unix socket "~/pgsql_store/.s.PGSQL.54320"
2019-06-02 11:22:11.448 CEST [23642] LOG: database system was shut down at 2019-06-02 11:21:50 CEST
2019-06-02 11:22:11.492 CEST [23641] LOG: database system is ready to accept connections
2019-06-02 11:38:11.504 CEST [28307] ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
2019-06-02 11:38:11.504 CEST [28307] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@bob:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWJjaWQgdXNlcl9pZCA9IEBib2I6bmFzCjAwMTZjaWQgdHlwZSA9IGFjY2VzcwowMDIxY2lkIG5vbmNlID0gQUhRKjJtS3dPRExEMkx6NAowMDJmc2lnbmF0dXJlILxzIZUbG2qcu4bTR0QSFi1-Gz_jqbUuPRYnkziqCfcdCg', '-', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'LVVTBWVFVR', 1559468290864) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
2019-06-02 11:38:11.510 CEST [28307] ERROR: current transaction is aborted, commands ignored until end of transaction block
2019-06-02 11:38:11.510 CEST [28307] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@alice:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBhbGljZTpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSBHNyprV3cwUU5YTkh1LlcrCjAwMmZzaWduYXR1cmUgf0HxJE-hAR0Trao_F6LlgqzpFjU4mZ4A40gymOKLveYK', '-', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'VUONEXXEBP', 1559468291411) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
2019-06-02 11:40:11.505 CEST [28306] ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
2019-06-02 11:40:11.505 CEST [28306] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@bob:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWJjaWQgdXNlcl9pZCA9IEBib2I6bmFzCjAwMTZjaWQgdHlwZSA9IGFjY2VzcwowMDIxY2lkIG5vbmNlID0gQUhRKjJtS3dPRExEMkx6NAowMDJmc2lnbmF0dXJlILxzIZUbG2qcu4bTR0QSFi1-Gz_jqbUuPRYnkziqCfcdCg', '-', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'LVVTBWVFVR', 1559468411294) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
2019-06-02 11:40:16.504 CEST [28307] ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
2019-06-02 11:40:16.504 CEST [28307] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@alice:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBhbGljZTpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSBHNyprV3cwUU5YTkh1LlcrCjAwMmZzaWduYXR1cmUgf0HxJE-hAR0Trao_F6LlgqzpFjU4mZ4A40gymOKLveYK', '-', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'VUONEXXEBP', 1559468411785) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
2019-06-02 11:42:10.741 CEST [28301] ERROR: duplicate key value violates unique constraint "device_uniqueness"
2019-06-02 11:42:10.741 CEST [28301] DETAIL: Key (user_id, device_id)=(@alice:nas, VUONEXXEBP) already exists.
2019-06-02 11:42:10.741 CEST [28301] STATEMENT: INSERT INTO devices (user_id, device_id, display_name) VALUES('@alice:nas', 'VUONEXXEBP', NULL)
2019-06-02 11:42:16.504 CEST [28308] ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
2019-06-02 11:42:16.504 CEST [28308] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@bob:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWJjaWQgdXNlcl9pZCA9IEBib2I6bmFzCjAwMTZjaWQgdHlwZSA9IGFjY2VzcwowMDIxY2lkIG5vbmNlID0gQUhRKjJtS3dPRExEMkx6NAowMDJmc2lnbmF0dXJlILxzIZUbG2qcu4bTR0QSFi1-Gz_jqbUuPRYnkziqCfcdCg', '-', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'LVVTBWVFVR', 1559468531571) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
2019-06-02 11:42:19.028 CEST [28306] ERROR: duplicate key value violates unique constraint "device_uniqueness"
2019-06-02 11:42:19.028 CEST [28306] DETAIL: Key (user_id, device_id)=(@bob:nas, LVVTBWVFVR) already exists.
2019-06-02 11:42:19.028 CEST [28306] STATEMENT: INSERT INTO devices (user_id, device_id, display_name) VALUES('@bob:nas', 'LVVTBWVFVR', NULL)
2019-06-02 11:42:41.504 CEST [28301] ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
2019-06-02 11:42:41.504 CEST [28301] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@alice:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBhbGljZTpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSBHNyprV3cwUU5YTkh1LlcrCjAwMmZzaWduYXR1cmUgf0HxJE-hAR0Trao_F6LlgqzpFjU4mZ4A40gymOKLveYK', '-', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'VUONEXXEBP', 1559468561455) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
2019-06-02 11:44:26.504 CEST [28297] ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
2019-06-02 11:44:26.504 CEST [28297] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@bob:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWJjaWQgdXNlcl9pZCA9IEBib2I6bmFzCjAwMTZjaWQgdHlwZSA9IGFjY2VzcwowMDIxY2lkIG5vbmNlID0gQUhRKjJtS3dPRExEMkx6NAowMDJmc2lnbmF0dXJlILxzIZUbG2qcu4bTR0QSFi1-Gz_jqbUuPRYnkziqCfcdCg', '-', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'LVVTBWVFVR', 1559468666410) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
2019-06-02 11:45:01.504 CEST [29593] ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
2019-06-02 11:45:01.504 CEST [29593] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@alice:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBhbGljZTpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSBHNyprV3cwUU5YTkh1LlcrCjAwMmZzaWduYXR1cmUgf0HxJE-hAR0Trao_F6LlgqzpFjU4mZ4A40gymOKLveYK', '-', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'VUONEXXEBP', 1559468697985) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
...
2019-06-02 12:28:48.826 CEST [10630] ERROR: current transaction is aborted, commands ignored until end of transaction block
2019-06-02 12:28:48.826 CEST [10630] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@alice:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBhbGljZTpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSBHNyprV3cwUU5YTkh1LlcrCjAwMmZzaWduYXR1cmUgf0HxJE-hAR0Trao_F6LlgqzpFjU4mZ4A40gymOKLveYK', '::1', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'VUONEXXEBP', 1559471328473) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
2019-06-02 12:29:39.294 CEST [10629] ERROR: duplicate key value violates unique constraint "device_uniqueness"
2019-06-02 12:29:39.294 CEST [10629] DETAIL: Key (user_id, device_id)=(@bob:nas, KMOALBDPRY) already exists.
2019-06-02 12:29:39.294 CEST [10629] STATEMENT: INSERT INTO devices (user_id, device_id, display_name) VALUES('@bob:nas', 'KMOALBDPRY', NULL)
2019-06-02 12:29:48.386 CEST [10630] ERROR: duplicate key value violates unique constraint "device_uniqueness"
2019-06-02 12:29:48.386 CEST [10630] DETAIL: Key (user_id, device_id)=(@alice:nas, VUONEXXEBP) already exists.
2019-06-02 12:29:48.386 CEST [10630] STATEMENT: INSERT INTO devices (user_id, device_id, display_name) VALUES('@alice:nas', 'VUONEXXEBP', NULL)
2019-06-02 12:31:08.823 CEST [10629] ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
2019-06-02 12:31:08.823 CEST [10629] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@carol:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBjYXJvbDpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSA9ak8tZmtnKkB-dUtPcVlxCjAwMmZzaWduYXR1cmUgCi9fTVCB-1JJNy6av5APpbjoTy2YMZ8zMeToN0rpa2AK', '::1', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'ATQMZGBFYH', 1559471466274) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
2019-06-02 12:31:18.823 CEST [10620] ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
2019-06-02 12:31:18.823 CEST [10620] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@alice:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBhbGljZTpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSBHNyprV3cwUU5YTkh1LlcrCjAwMmZzaWduYXR1cmUgf0HxJE-hAR0Trao_F6LlgqzpFjU4mZ4A40gymOKLveYK', '::1', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'VUONEXXEBP', 1559471476459) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
...
2019-06-02 13:42:13.824 CEST [10620] ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
2019-06-02 13:42:13.824 CEST [10620] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@alice:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBhbGljZTpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSBHNyprV3cwUU5YTkh1LlcrCjAwMmZzaWduYXR1cmUgf0HxJE-hAR0Trao_F6LlgqzpFjU4mZ4A40gymOKLveYK', '::1', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'VUONEXXEBP', 1559475729288) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
2019-06-02 13:43:48.824 CEST [11460] ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
2019-06-02 13:43:48.824 CEST [11460] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@carol:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBjYXJvbDpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSA9ak8tZmtnKkB-dUtPcVlxCjAwMmZzaWduYXR1cmUgCi9fTVCB-1JJNy6av5APpbjoTy2YMZ8zMeToN0rpa2AK', '::1', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'ATQMZGBFYH', 1559475824841) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
2019-06-02 13:44:18.823 CEST [11458] ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
2019-06-02 13:44:18.823 CEST [11458] STATEMENT: INSERT INTO user_ips (user_id, access_token, ip, user_agent, device_id, last_seen) VALUES ('@alice:nas', 'MDAxMWxvY2F0aW9uIG5hcwowMDEzaWRlbnRpZmllciBrZXkKMDAxMGNpZCBnZW4gPSAxCjAwMWRjaWQgdXNlcl9pZCA9IEBhbGljZTpuYXMKMDAxNmNpZCB0eXBlID0gYWNjZXNzCjAwMjFjaWQgbm9uY2UgPSBHNyprV3cwUU5YTkh1LlcrCjAwMmZzaWduYXR1cmUgf0HxJE-hAR0Trao_F6LlgqzpFjU4mZ4A40gymOKLveYK', '::1', 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36', 'VUONEXXEBP', 1559475854569) ON CONFLICT (user_id, access_token, ip) DO UPDATE SET user_agent=EXCLUDED.user_agent, device_id=EXCLUDED.device_id, last_seen=EXCLUDED.last_seen
Finally, two sql dumps (if anybody is interested):
before_synapse_port_db_(sqlite3).zip
after_synapse_port_db_(pgsql).zip
What's the recommended action to take after such a failed migration? Continue using the sqlite db or is it safe to use the "broken" postgres db?
What's the recommended action to take after such a failed migration? Continue using the sqlite db or is it safe to use the "broken" postgres db?
I just applied @huguesdk's schema changes after running a few days with the wrong schema, after migrating from SQLite, so I'm very much about to find out.
I did some investigation and fiddling around this, but won't be able to go much further due to a lack of time to dedicate to it. Here's what we've decided needed to be done to fix this issue:
To do the latter, we need to create a new class in Synapse that holds the code for all background updates but doesn't require access to the hs object, otherwise the script needs to instantiate one and it becomes quite trickier to manage. This should be doable, especially as a new full schema has been created in v1.0.0 (which effectively decreases the amount of background updates to run on a new DB).
I have a similar problem. I cannot upload/attach files anymore due to missing indices. This however only manifested after upgrading to synapse v1.2.1. I would appreciate a quick fix, which reestablishes the mandatory db structure very much.
@jptsroh did you try the SQL script in huguesdk's comment?
After I chose the right db I could execute those sql statements. Now I can upload files again. Thanks!
However, the postgres error log still shows errors and now complains about duplicate entries:
ERROR: duplicate key value violates unique constraint "device_uniqueness"
2019-08-17 08:11:56.416 UTC [486] matrix_synapse@dopadb DETAIL: Key (user_id, device_id)=(@....) already exists.
STATEMENT: INSERT INTO devices (user_id, device_id, display_name) VALUES('@.....', 'xxx', NULL)
Good news, a fix to this is on its way: https://github.com/matrix-org/synapse/pull/6102! :)
Fixed in #6102 (and https://github.com/matrix-org/synapse/pull/6243)! :)
Awesome! Which version will this be included in? 1.4.2, 1.5.0?
Awesome! Which version will this be included in? 1.4.2, 1.5.0?
1.5.0, which should be out in one or two weeks (don't take my word for it though)
I'm going to reopen this I'm afraid, because I feel like an important part of it (helping out all those people who have migrated to postgres at some point in the last 4 years) hasn't been done.
Since it might be related and help people out here: I also had wrong behavior. Namely read markers didn't work correctly and (way more dangerous for the network, if my instance wasn't the only one doing it), the timeout for destinations didn't get saved properly. Turns out public.destinations and public.room_account_data didn't have constraints in place that should have been there. I manually deleted the lines that broke the unique constraints and readded them. with
ALTER TABLE ONLY public.destinations ADD CONSTRAINT destinations_pkey PRIMARY KEY (destination);
and
ALTER TABLE ONLY public.room_account_data ADD CONSTRAINT room_account_data_uniqueness UNIQUE (user_id, room_id, account_data_type);
I apologize to anyone, my instance might have spammed while their instances were already struggling.
@richvdh Do you think that there is a way to find other instances with that behavior (if there are any) and notify the administrators of them?
A good way to do that would be to configure a future version of synapse to detect inconsistencies on first startup and correct them, or at least produce a suitably informative fatal error.
the initial plan here is to produce a document to help admins fix their databases. Hopefully we'll get tot that in the next few days.
Automated detection of the missing indexes might happen later but there are no current plans to implement that.
I think a document to fix the indexes is appropriate, but I do think the bug I found for the missing constraints lead to a way more serious behavious since it floods the network of hosts that are already struggling to answer to requests in time. I think it would be appropriate to rather not start an intstance (at least without a switch), than just let it continue to do harm to the network.
I'll try and make some progress here.
Here are some schema dumps for various clean databases:
synapse_port_db script), then upgraded to v1.7.2: full.ported_141.sql.txtsynapse_port_db uses to initialise the postgres database), then upgraded to v1.7.2: full.ported_0995.sql.txtDiffs between the ported database and the clean db are at https://gist.github.com/richvdh/f1d84edf7c3da1fce2347675dd3d55e5.
So, to fix a broken database, you should be able to run the following sql in psql:
CREATE INDEX IF NOT EXISTS access_tokens_device_id ON access_tokens USING btree (user_id, device_id);
CREATE INDEX IF NOT EXISTS current_state_events_member_index ON current_state_events USING btree (state_key) WHERE (type = 'm.room.member'::text);
CREATE INDEX IF NOT EXISTS device_inbox_stream_id_user_id ON device_inbox USING btree (stream_id, user_id);
DROP INDEX IF EXISTS device_inbox_stream_id;
CREATE UNIQUE INDEX IF NOT EXISTS device_lists_remote_cache_unique_id ON device_lists_remote_cache USING btree (user_id, device_id);
CREATE UNIQUE INDEX IF NOT EXISTS device_lists_remote_extremeties_unique_idx ON device_lists_remote_extremeties USING btree (user_id);
CREATE INDEX IF NOT EXISTS device_lists_stream_user_id ON device_lists_stream USING btree (user_id, device_id);
CREATE INDEX IF NOT EXISTS event_contains_url_index ON events USING btree (room_id, topological_ordering, stream_ordering) WHERE ((contains_url = true) AND (outlier = false));
CREATE INDEX IF NOT EXISTS event_push_actions_highlights_index ON event_push_actions USING btree (user_id, room_id, topological_ordering, stream_ordering) WHERE (highlight = 1);
CREATE INDEX IF NOT EXISTS event_push_actions_u_highlight ON event_push_actions USING btree (user_id, stream_ordering);
CREATE UNIQUE INDEX IF NOT EXISTS event_search_event_id_idx ON event_search USING btree (event_id);
CREATE INDEX IF NOT EXISTS event_to_state_groups_sg_index ON event_to_state_groups USING btree (state_group);
CREATE INDEX IF NOT EXISTS local_media_repository_url_idx ON local_media_repository USING btree (created_ts) WHERE (url_cache IS NOT NULL);
CREATE INDEX IF NOT EXISTS state_groups_state_type_idx ON state_groups_state USING btree (state_group, type, state_key);
DROP INDEX IF EXISTS state_groups_state_id;
CREATE UNIQUE INDEX IF NOT EXISTS user_ips_user_token_ip_unique_index ON user_ips USING btree (user_id, access_token, ip);
CREATE INDEX IF NOT EXISTS user_ips_device_id ON user_ips USING btree (user_id, device_id, last_seen);
CREATE INDEX IF NOT EXISTS user_ips_last_seen ON user_ips USING btree (user_id, last_seen);
CREATE INDEX IF NOT EXISTS user_ips_last_seen_only ON user_ips USING btree (last_seen);
DROP INDEX IF EXISTS user_ips_user_ip;
CREATE INDEX IF NOT EXISTS users_creation_ts ON users USING btree (creation_ts);
CREATE INDEX IF NOT EXISTS room_memberships_user_room_forgotten ON room_memberships USING btree (user_id, room_id) WHERE (forgotten = 1);
Note that some of these tables (especially state_groups_state!) may be large, and adding an index will lock the table, potentially making synapse hang while it waits for database operations to complete, so it may be best to either schedule synapse downtime, or to add CONCURRENTLY to the CREATE INDEX statements (in which case, be sure to run them individually rather than pasting all of the above at once).
Note also that some of the CREATE UNIQUE INDEX commands may fail if there are duplicates in the table. user_ips is particularly likely to be affected in this way; the easiest solution there may be simply to delete the existing data in the table (TRUNCATE TABLE user_ips;): its primary use is in populating the 'last seen IP address' in a user's device list so if that information is not valuable to you or your users, it can be deleted.
I'd welcome any feedback from administrators who have success or otherwise running the above.
so we think the above looks about right. It would be nice to figure out a way to either automate the repair process, or at least point people to the commands when necessary, but I'm going to consider that as a potential issue for the future and close this one.
Most helpful comment
With Synapse 0.99.3 (schema version 53), this is how to fix the database (I diffed the schemas from a normal PostgreSQL installation and from one migrated from SQLite):
After running this, the two schemas are exactly the same (except for the presence of the
port_from_sqlite3table, of course). This solved the problem for me. :tada: