If you're interested please comment here and come join our "Contributors" community channel on our daily build server, where you can discuss questions with community members and the Mattermost core team. For technical advice or questions, please join our "Developers" community channel.
New contributors please see our Developer's Guide.
Notes: Jira ticket
When adding the materialized public channels table (https://github.com/mattermost/mattermost-server/pull/9349), I intentionally duplicated the indices (for present columns) on the Channels table so as to reduce the complexity of the changes and rely on similar query planner behaviour.
I suspect some of these indices could be removed altogether, reducing database load overall (mostly on write). This ticket is an open-ended task to review our use of indices in the sql store.
If you have any questions or want to discuss the ticket, feel free to ask @jesse.hallam or @christopher in the ~Developers: Performance channel on community.mattermost.com
@jwilander @esethna I'm interested in taking a look at this and have mentioned how I would go about taking care of the task on the "Developers : Performance" channel. The discovery of duplicate keys can be automated fairly simply using a tool such as https://www.percona.com/doc/percona-toolkit/2.1/pt-duplicate-key-checker.html
Sounds good @BenKnigge 👍
Hey @BenKnigge,
Did all your questions got resolved?
Are you still working on this @BenKnigge ?
@hanzei If @BenKnigge is not able to contribute I would like to.
Let's give @BenKnigge a few days to respond and see then.
I would also be interested in taking this issue.
@RajatVaryani Would you be open on passing this ticket to @Pomyk or do you want to work on this yourself?
I've started a new role and haven't done anything with this since my
initial research into it.
Finding a duplicate index is fairly simple.
Finding unnecessary indexes is more complicated.
Ideally whomever works on this would have access to usage stats for one or
more live databases that are under active participation.
On Mon, Nov 4, 2019 at 2:31 PM Ben Schumacher notifications@github.com
wrote:
@RajatVaryani https://github.com/RajatVaryani Would you be open on
passing this ticket to @Pomyk https://github.com/Pomyk or do you want
to work on this yourself?—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/mattermost/mattermost-server/issues/10347?email_source=notifications&email_token=ACJHMF2UB3EYU45C2FRWXKTQSCIENA5CNFSM4GZGTW42YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDAZBGI#issuecomment-549556377,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/ACJHMF7C5GDUIXQ3ZYOOP5DQSCIENANCNFSM4GZGTW4Q
.
--
Benjamin Knigge
[email protected]
tel: +1-310-570-2880
@hanzei I don't mind passing this ticket.
Then it's all yours @Pomyk
Thanks @RajatVaryani
I have analyzed code in channel_store.go, ran some explains and checked used indices stats in percona server after some basic app usage.
I have found that the following indices are unnecessary on PublicChannels:
Index on TeamId could be extended to another column, for example:
These index changes could be done for Channels table too.
Also found some queries that look over complicated and will try to make a PR to simplify them.
Duplicate indices found by pt-duplicate-key-checker: https://gist.github.com/Pomyk/b374964f059f694234981d7d49227ec6
cc @lieut-data
@Pomyk, wow this is a terrific analysis -- thank you!
I have analyzed code in channel_store.go, ran some explains and checked used indices stats in percona server after some basic app usage.
Would you be open to sharing your setup and testing strategy? We'd likely want to replicate your setup both with a loadtest (https://github.com/mattermost/mattermost-load-test) and possibly even on our community Mattermost instance, given the potential presence of enterprise-only queries and such.
@reflog and @streamer45, would it make sense for you to work with @Pomyk on this issue, given your existing loadtest and overall performance investigations?
Most of the work was reading code and extracting example queries. Usually you can guess which indexes would be used just by looking at the query, but I have also used explain on the more complicated ones in mysql and postgres.
I have also used https://www.percona.com/doc/percona-server/LATEST/diagnostics/user_stats.html from Percona as it's easier to find used indexes than with performance_schema. Looks like PostgreSQL has index statistics too, but I haven't used them yet.
I will gladly work with you, I have even applied for the Back End Engineer role at Mattermost ;)
Reactions table has inefficient primary key (UserId,PostId,EmojiName). Using (PostId,UserId,EmojiName) would be better as it would use index scan when fetching reactions for a post. Now it scans the whole table.
@Pomyk Thank you for your excellent analysis :+1:
Would love to see some benchmarks of the queries involved before and after your proposed changes. Does that sound good?
Also, if you need any help, we can move the discussion to the ~Developers: Performance channel.
OK, I will do some benchmarks.
benchmarks: https://github.com/Pomyk/mattermost-server/commit/1b4f4d3c703dbbfe9967efb50be22e82f73cfeb2
some helper scripts I used: https://github.com/Pomyk/mattermost-benchmark-scripts
sample results:
benchmark old ns/op new ns/op delta
BenchmarkChannels/mysql/GetPublicChannelsForTeam/1-16 514718 511093 -0.70%
BenchmarkChannels/mysql/GetMoreChannels/1-16 610817 600723 -1.65%
BenchmarkChannels/mysql/GetDeleted/1-16 819924 832318 +1.51%
BenchmarkChannels/mysql/SearchInTeam/1-16 628560 628895 +0.05%
BenchmarkChannels/mysql/SearchArchivedInTeam/1-16 1328315 1443086 +8.64%
BenchmarkChannels/mysql/SearchMore/1-16 748883 732619 -2.17%
BenchmarkChannels/mysql/SearchGroupChannels/1-16 687369 683080 -0.62%
BenchmarkChannels/postgres/GetPublicChannelsForTeam/1-16 704532 676958 -3.91%
BenchmarkChannels/postgres/GetMoreChannels/1-16 1042366 1011997 -2.91%
BenchmarkChannels/postgres/GetDeleted/1-16 1077233 1060135 -1.59%
BenchmarkChannels/postgres/SearchInTeam/1-16 911217 820803 -9.92%
BenchmarkChannels/postgres/SearchArchivedInTeam/1-16 2281084 2154007 -5.57%
BenchmarkChannels/postgres/SearchMore/1-16 1257868 1199514 -4.64%
BenchmarkChannels/postgres/SearchGroupChannels/1-16 1487470 1500727 +0.89%
BenchmarkReactions/mysql/GetForPost-16 17370649 280530 -98.39%
BenchmarkReactions/postgres/GetForPost-16 3060279 269196 -91.20%
Modified tables:
CREATE TABLE `Channels` (
`Id` varchar(26) NOT NULL,
`CreateAt` bigint(20) DEFAULT NULL,
`UpdateAt` bigint(20) DEFAULT NULL,
`DeleteAt` bigint(20) DEFAULT NULL,
`TeamId` varchar(26) DEFAULT NULL,
`Type` varchar(1) DEFAULT NULL,
`DisplayName` varchar(64) DEFAULT NULL,
`Name` varchar(64) DEFAULT NULL,
`Header` text,
`Purpose` varchar(250) DEFAULT NULL,
`LastPostAt` bigint(20) DEFAULT NULL,
`TotalMsgCount` bigint(20) DEFAULT NULL,
`ExtraUpdateAt` bigint(20) DEFAULT NULL,
`CreatorId` varchar(26) DEFAULT NULL,
`SchemeId` varchar(26) DEFAULT NULL,
`GroupConstrained` tinyint(1) DEFAULT NULL,
PRIMARY KEY (`Id`),
UNIQUE KEY `Name` (`Name`,`TeamId`),
KEY `idx_channels_create_at` (`CreateAt`),
KEY `idx_channels_team_id` (`TeamId`,`DeleteAt`,`DisplayName`),
FULLTEXT KEY `idx_channel_search_txt` (`Name`,`DisplayName`,`Purpose`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `PublicChannels` (
`Id` varchar(26) NOT NULL,
`DeleteAt` bigint(20) DEFAULT NULL,
`TeamId` varchar(26) DEFAULT NULL,
`DisplayName` varchar(64) DEFAULT NULL,
`Name` varchar(64) DEFAULT NULL,
`Header` text,
`Purpose` varchar(250) DEFAULT NULL,
PRIMARY KEY (`Id`),
UNIQUE KEY `Name` (`Name`,`TeamId`),
KEY `idx_publicchannels_team_id` (`TeamId`,`DeleteAt`,`DisplayName`),
FULLTEXT KEY `idx_publicchannels_search_txt` (`Name`,`DisplayName`,`Purpose`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `Reactions` (
`UserId` varchar(26) NOT NULL,
`PostId` varchar(26) NOT NULL,
`EmojiName` varchar(64) NOT NULL,
`CreateAt` bigint(20) DEFAULT NULL,
PRIMARY KEY (`PostId`,`UserId`,`EmojiName`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;