Slow search and decreased retention on Graylog 3.2.2, because each index holds fewer documents. Retention decreased from 13 days to 10 days.
We have two Graylog clusters (5+10), on versions 3.0.2 and 3.2.2, running side by side (blue-green deployment) with the same settings:
Indices settings:
Shards: 10
Replicas: 1
Index Rotation strategy: Index Size
Max index size: 16GB
Index retention strategy: Delete
Max number of indices: 200
Indices from version 3.2.2:
graylog_270 Contains messages from 4 days ago up to 4 days ago (14.3GB / 29,612,635 messages)
graylog_269 Contains messages from 4 days ago up to 4 days ago (14.1GB / 29,234,745 messages)
graylog_268 Contains messages from 4 days ago up to 4 days ago (13.9GB / 28,817,818 messages)
graylog_267 Contains messages from 4 days ago up to 4 days ago (14.2GB / 31,456,249 messages)
graylog_266 Contains messages from 4 days ago up to 4 days ago (14.5GB / 29,905,324 messages)
AVG: 29,805,354 docs per index
Indices from version 3.0.2:
graylog_2713 Contains messages from 4 days ago up to 4 days ago (13.3GB / 34,044,967 messages)
graylog_2712 Contains messages from 4 days ago up to 4 days ago (12.9GB / 32,813,180 messages)
graylog_2711 Contains messages from 4 days ago up to 4 days ago (14.4GB / 38,896,557 messages)
graylog_2710 Contains messages from 4 days ago up to 4 days ago (13.0GB / 33,717,885 messages)
graylog_2709 Contains messages from 4 days ago up to 4 days ago (13.3GB / 34,285,394 messages)
AVG: 34,751,597 docs per index
The newer version of Graylog has fewer documents per index.
RATIO: 29,805,354 / 34,751,597 * 100 = 85.77%
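To make the comparison easy to re-check, here is a minimal Python sketch that recomputes the per-version averages and their ratio directly from the message counts in the two listings above. The variable names are only illustrative, and the five indices per cluster are a small sample, so the result is indicative rather than exact.

```python
# Message counts copied from the two index listings above.
docs_v322 = [29_612_635, 29_234_745, 28_817_818, 31_456_249, 29_905_324]
docs_v302 = [34_044_967, 32_813_180, 38_896_557, 33_717_885, 34_285_394]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


avg_v322 = mean(docs_v322)
avg_v302 = mean(docs_v302)

# Share of documents a 3.2.2 index holds relative to a 3.0.2 index.
ratio_percent = avg_v322 / avg_v302 * 100

print(f"avg 3.2.2: {avg_v322:,.0f} docs per index")
print(f"avg 3.0.2: {avg_v302:,.0f} docs per index")
print(f"ratio:     {ratio_percent:.2f}%")
```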
I compared messages in both Graylog versions. The only difference I found is the field gl2_accounted_message_size, which exists only in 3.2.2.
Is this field responsible for that?
GUI searches also take more time in version 3.2.2.
| Search range | v3.0.2 (ms) | v3.2.2 (ms) |
| -- | -- | -- |
| 5 min | 239 | 469 |
| 1 hour | 1 105 | 3 036 |
| 8 hours | 2 292 | 9 094 |
| 1 day | 5 558 | 15 431 |
| 2 days | 12 343 | 34 426 |
| 5 days | 33 911 | timed out |
Slow GUI searches are a big showstopper.
Hi!
Thanks for the detailed numbers!
gl2_accounted_message_size is simply an integer field, so even in the worst possible case, without any compression effects etc, that field should not account for the difference in size alone (e.g. if we assume 8 bytes per field, even the most pessimistic case would amount to around 400 MB per index, which in practice it won't be). We will have to add metadata from time to time, though, there's no way around that.
For the search times, I will raise the issue with the team; they will likely have some clarification questions of their own.
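As a rough cross-check of the field-size argument, the sketch below estimates the average stored bytes per document on each cluster from the listings in the original report and compares the gap with the pessimistic 8 bytes assumed for gl2_accounted_message_size. It is an estimate under stated assumptions, not a diagnosis: the sizes are the rounded values shown in the UI, "GB" is taken as 10^9 bytes, and both clusters are assumed to report size the same way.

```python
GB = 10**9  # assumption: the UI reports decimal gigabytes

# (index size in GB, message count), copied from the listings in the report.
indices_v322 = [
    (14.3, 29_612_635), (14.1, 29_234_745), (13.9, 28_817_818),
    (14.2, 31_456_249), (14.5, 29_905_324),
]
indices_v302 = [
    (13.3, 34_044_967), (12.9, 32_813_180), (14.4, 38_896_557),
    (13.0, 33_717_885), (13.3, 34_285_394),
]

FIELD_BYTES = 8  # pessimistic per-document cost assumed in the comment above


def bytes_per_doc(indices):
    """Average stored bytes per document across the given indices."""
    total_bytes = sum(size_gb for size_gb, _ in indices) * GB
    total_docs = sum(docs for _, docs in indices)
    return total_bytes / total_docs


per_doc_v322 = bytes_per_doc(indices_v322)
per_doc_v302 = bytes_per_doc(indices_v302)
extra_per_doc = per_doc_v322 - per_doc_v302

print(f"3.2.2: {per_doc_v322:.0f} bytes per document")
print(f"3.0.2: {per_doc_v302:.0f} bytes per document")
print(f"gap:   {extra_per_doc:.0f} bytes, versus {FIELD_BYTES} for one integer field")
```

If the gap comes out well above 8 bytes, that is consistent with the point that a single integer field cannot explain the difference on its own.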
@jozefbarcin: How did you measure the times exactly?
> @jozefbarcin: How did you measure the times exactly?
Sorry for the delay, I missed your comment. I measured it with:
time curl -LI -o /dev/null -w "response_code: %{http_code}" -u $TOKEN:token -H 'Accept: application/json' -X GET 'https://graylog_url/api/search/universal/relative?query=*&range=432000&decorate=true'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:37 --:--:-- 0
response_code: 200
real 0m38,011s
user 0m0,062s
sys 0m0,043s
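For anyone who wants to repeat the measurement across all ranges in the table, here is a minimal Python sketch of the same request. The endpoint, query parameters and the `token` password convention are taken from the curl command above; the function names, the `RANGES` mapping and the 120-second timeout are assumptions for illustration. Unlike the curl call, which uses `-I`, this version reads the response body, so its timings are not directly comparable with the table.

```python
import base64
import time
import urllib.parse
import urllib.request

# Relative ranges in seconds, matching the rows of the comparison table.
RANGES = {
    "5 min": 300,
    "1 hour": 3_600,
    "8 hours": 28_800,
    "1 day": 86_400,
    "2 days": 172_800,
    "5 days": 432_000,
}


def build_search_url(base_url, range_seconds, query="*"):
    """Build the relative-search URL used in the curl command above."""
    params = urllib.parse.urlencode(
        {"query": query, "range": range_seconds, "decorate": "true"}
    )
    return f"{base_url}/api/search/universal/relative?{params}"


def timed_search(base_url, token, range_seconds, timeout=120):
    """Run one search and return (HTTP status, elapsed milliseconds)."""
    # Same credentials as `curl -u $TOKEN:token`: the token is the user name.
    credentials = base64.b64encode(f"{token}:token".encode()).decode()
    request = urllib.request.Request(
        build_search_url(base_url, range_seconds),
        headers={
            "Accept": "application/json",
            "Authorization": f"Basic {credentials}",
        },
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        response.read()  # include the body transfer in the measurement
        status = response.status
    return status, (time.perf_counter() - start) * 1000
```

A loop such as `for label, seconds in RANGES.items(): print(label, timed_search("https://graylog_url", token, seconds))` would then produce one timing per row, with `token` holding the access token.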
My company has had a similar issue. We were on a Graylog 3.1 cluster that was working fine. Then we upgraded to 3.3, and the new search UI and functionality have been much slower, to the point of being nearly unusable. The page periodically crashes in Chrome, even on simple searches over a short time frame.
@kroepke @dennisoelkers any update?