我有一个370万条红包数据的mysql 表和 一个 单机(cpu i7 ,256GB固态硬盘,16G内存,Centos7.3)Docker最新版部署的 TiDb 同样是红包数据表,结构都一样。
测试使用
语句
SELECT count(*) FROM biz_user_red_packet;
表结构
int id,int user_id,int order_form_id,int red_packet_id,date create_time,date use_time,int profit,int status,int delete_flag,int planid
发现 TiDB(花费了33秒)非常缓慢,反而mysql(花费1.45秒) 一下就算出表总数了.能否问下是什么原因?
=======
I have a mysql table with 3.7 million red envelope data and a single machine (CPU i7,256GB solid-state drive, 16gb of ram, Centos7.3). The latest version of the TiDb is also a red envelope data table with the same structure.
Test using
statements
SELECT count (*) FROM biz_user_red_packet;
Table structure
Int id,int user_id,int order_form_id,int red_packet_id,date create_time,date use_time,int profit,int status,int delete_flag,int planid.
It is very slow to find TiDB (it takes 33 seconds), but mysql (1.45 seconds) is the total number of tables. Can you ask why?
Doc文档里 建议使用 SELECT count(1) FROM test.biz_user_red_packet;
代替 SELECT count(*) FROM test.biz_user_red_packet;
但是结果是一样缓慢的。
A SELECT count(1) FROM test.biz_user_red_packet is recommended in the Doc document.
Instead of SELECT count(*) FROM test.biz_user_red_packet;
But the result is as slow as the last one.
@zhuxiujia Yes, the count(1) will be as slow as count(*) because they are essentially the same, so you can use methods recommended in the docs to speed up your query.
What's the configuration of your TiDB cluster? How many TiDB, TiKV, PD in it?
And please provide us your TiDB github version.
@XuHuaiyu Only one machine. @zhuxiujia I think your questions is why TiDB run count(*) slower than MySQL? And you sure you have no pd or tikv?
1个TiDb 没有 Tikv pd
i have One TiDb has no Tikv or Pd.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2ea42358c2fb pingcap/tidb:latest "/tidb-server" 23 hours ago Up 23 hours 0.0.0.0:4000->4000/tcp, 0.0.0.0:10080->10080/tcp tidb-server
62a2fcb6242f tomcat8.5 "catalina.sh run" 8 weeks ago Up 3 weeks 0.0.0.0:9993->8080/tcp api-doc
One PC was deployed with Docker.
If you deploy TiDB upon tikv, TiDB retrieve data by GRPC, which needs network round trip. So this is why TiDB is slower than MySQL in a small table.
tidb_distsql_scan_concurrency like this doc saidI only use one TiDB without TiKV and TiPD, no network requests such as your GRPC. It's several times slower than mysql without a network request.
我只用了一个 TiDB 没有 TiKV 和 TiPD,没有网络请求例如你说的GRPC。在没有网络请求耗时的情况下,比mysql慢了好几倍
If you only start a tidb-server, there will be a mock-tikv to store data.
Mock-tikv is used for testing.
You may try to deploy a cluster with at lease 1TiDB, 1TiKV, 1*PD.
@XuHuaiyu I used Docker Compose to deploy the cluster on my machine, with three TiDB, three Tikv, three TiPd, but again, count() is 30 times slower than mysql.
我使用 Docker Compose 部署了集群在我那个机器上,包含3个TiDB,3个Tikv,3个TiPd,但是 还是一样,count()比mysql慢30倍
TiDB accesses TiKV in every query. When the amount of data is small, all MySQL is in memory, and TiDB needs to conduct a network access.
So you can try this Chinese doc or this English docas what @winkyao mentioned before.
Is SELECT count(*) is slow? Well, it depends.
Is that clear? @zhuxiujia
请问这个问题解决了么?我这边4000W到数据count(1)需要7,8秒。这个算是慢么?还是正常的?我个人认为比较慢。
@liujunhui Yes, it's slower than expected, can you provide the deployment of your cluster?
@liujunhui Yes, it's slower than expected, can you provide the deployment of your cluster?
三台PC电脑node1(8G内存)部署一个KV,一个PD,一个TIBD,node2(4G内存)部署一个KV,一个PD,node3(4G内存)部署一个KV,一个PD。node2和node3的内存在查询count(1)的时候会上升90%多,而且查询之后就不会降下来。查询慢的原因是内存不够么?还是其他原因?
@liujunhui I think the hardware is the bottleneck of select count(1) from t too slow. The memory is small and cannot make the TiKV cache data in memory.
Most helpful comment
Is SELECT count(*) is slow? Well, it depends.
TiDB on the local storage is used for testing purpose, it's not optimized for performance.
Communication costs in multiple processes is much higher than a single MySQL process.
Is that clear? @zhuxiujia