Victoriametrics: vmagent and scrape_timeout

Created on 4 Mar 2020  ·  3Comments  ·  Source: VictoriaMetrics/VictoriaMetrics

Describe the bug
Глобальный параметр scrape_timeout: 30s не игнорируется, но не применяется

To Reproduce
Запустить snmp_exporter и обратиться к недоступным хостам

# arp
Address                  HWtype  HWaddress           Flags Mask            Iface
10.0.0.209                    (incomplete)                              ens32
10.0.0.211            ether   00:50:56:ba:1a:fe   C                     ens32

# time curl http://127.0.0.1:9116/snmp?module=cisco_l2\&target=10.0.0.209
An error has occurred while serving metrics:

error collecting metric Desc{fqName: "snmp_error", help: "Error scraping target", constLabels: {}, variableLabels: []}: error getting target 10.0.0.209: Request timeout (after 3 retries)

real    4m0.019s
user    0m0.016s
sys     0m0.016s


#time curl http://127.0.0.1:9116/snmp?module=cisco_l2\&target=10.0.0.211
An error has occurred while serving metrics:

error collecting metric Desc{fqName: "snmp_error", help: "Error scraping target", constLabels: {}, variableLabels: []}: error getting target 10.0.0.211: Error reading from socket: read udp 10.0.0.214:43613->10.0.0.211:161: read: connection refused

real    0m0.033s
user    0m0.016s
sys     0m0.008s

scrape.yml

global:
  scrape_interval: 60s
  scrape_timeout:  30s

Логи vmagent

2020-03-04T02:14:38.728Z        info    VictoriaMetrics/lib/promscrape/scraper.go:162   restarting scrapers for changed `file_sd_config` targets
2020-03-04T02:14:39.089Z        info    VictoriaMetrics/lib/promscrape/scraper.go:149   stopped all the 141 scrapers for `file_sd_config` targets
2020-03-04T02:14:39.089Z        info    VictoriaMetrics/lib/promscrape/scraper.go:145   starting 141 scrapers for `file_sd_config` targets
2020-03-04T02:14:44.931Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:15:44.932Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:16:44.929Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:17:44.935Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:18:44.933Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:19:44.931Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:20:44.931Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:21:44.928Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:22:44.931Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:23:44.930Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:24:44.932Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:25:44.928Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:26:44.934Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:27:44.930Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:28:44.930Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:29:44.930Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:30:44.930Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:31:44.935Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:32:44.931Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:33:44.930Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:34:44.931Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:35:44.928Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:36:44.930Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:37:44.929Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:38:44.934Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:39:44.930Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211": unexpected status code returned when scraping "http://
2020-03-04T02:40:38.730Z        info    VictoriaMetrics/lib/promscrape/scraper.go:162   restarting scrapers for changed `file_sd_config` targets
2020-03-04T02:40:38.906Z        info    VictoriaMetrics/lib/promscrape/scraper.go:149   stopped all the 141 scrapers for `file_sd_config` targets
2020-03-04T02:40:38.906Z        info    VictoriaMetrics/lib/promscrape/scraper.go:145   starting 141 scrapers for `file_sd_config` targets
2020-03-04T02:43:48.624Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.209": error when scraping "http://127.0.0.1:9116/snmp?module
2020-03-04T02:46:18.628Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.209": error when scraping "http://127.0.0.1:9116/snmp?module
2020-03-04T02:48:48.634Z        error   VictoriaMetrics/lib/promscrape/scrapework.go:127        error when scraping "http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.209": error when scraping "http://127.0.0.1:9116/snmp?module

Хочу обратить внимание на scrape_duration=150.007s и with timeout 30s: timeout, который совпадает с указанным в конфиге.

# curl http://localhost:8429/targets
job="cisco_l2" (67/69 up)
        state=down, endpoint=http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.209, labels={instance="10.0.0.209", job="cisco_l2", obj="twr"}, last_scrape=179.283s ago, scrape_duration=150.007s, error="error when scraping \"http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.209\" with timeout 30s: timeout"
        state=down, endpoint=http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211, labels={instance="10.0.0.211", job="cisco_l2", obj="twr"}, last_scrape=272.973s ago, scrape_duration=0.003s, error="unexpected status code returned when scraping \"http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.0.211\": 500; expecting 200; response body: \"An error has occurred while serving metrics:\\n\\nerror collecting metric Desc{fqName: \\\"snmp_error\\\", help: \\\"Error scraping target\\\", constLabels: {}, variableLabels: []}: error getting target 10.0.0.211: Error reading from socket: read udp 10.0.0.214:35534->10.0.0.211:161: read: connection refused\\n\""


#curl http://localhost:8429/targets
job="cisco_l2" (67/68 up)

Количество targets отличается какое-то время. В логах видно в обоих случаях.

starting 141 scrapers for `file_sd_config` targets

Т.е. в целом кол-во таргетов не изменилось. Фактически была замена адреса 10.0.0.211 на 10.0.0.209, но в /targets какое-то время показывается оба хоста.

Expected behavior

# arp
Address                  HWtype  HWaddress           Flags Mask            Iface
10.0.1.122                     (incomplete)                              ens160
# time curl http://127.0.0.1:9116/snmp?module=cisco_l2\&target=10.0.1.122
An error has occurred while serving metrics:

error collecting metric Desc{fqName: "snmp_error", help: "Error scraping target", constLabels: {}, variableLabels: []}: error getting target 10.0.1.122: Request timeout (after 3 retries)

real    4m0.018s
user    0m0.010s
sys     0m0.018s

Опрос так же 4 минуты.
Данные с Prometheus http://10.0.1.20:9090/targets

Endpoint    State   Labels  Last Scrape     Scrape Duration     Error
http://127.0.0.1:9116/snmp
module="cisco_l2" target="10.0.1.122"   down    instance="10.0.1.122" job="cisco_l2" obj="r51"  1m29.085s ago   30s     Get http://127.0.0.1:9116/snmp?module=cisco_l2&target=10.0.1.122: context deadline exceeded

Scrape Duration 30s по причине context deadline exceeded

Version

vmagent-20200303-194035-tags-v1.34.2-0-g0b1e877a

Used command-line flags

bug

Most helpful comment

спасибо. все работает.
scrape_duration=30.002s, error="error when scraping .. with timeout 30s: timeout"
vmagent-20200310-015134-v1.34.2-14-g7c432da

All 3 comments

The issue should be fixed in the commit 7c432da7885a0aa8d4822a387809db30d4509b6d .

It will be available in the next release. In the mean time it is possible to build vmagent from sources according to these docs.

спасибо. все работает.
scrape_duration=30.002s, error="error when scraping .. with timeout 30s: timeout"
vmagent-20200310-015134-v1.34.2-14-g7c432da

The bugfix is available starting from v1.34.3.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

abualy picture abualy  ·  3Comments

ozn0417 picture ozn0417  ·  3Comments

cristihcd picture cristihcd  ·  4Comments

oOHenry picture oOHenry  ·  4Comments

jelmd picture jelmd  ·  3Comments