VictoriaMetrics: Deduplication via Influx command line

Created on 23 Feb 2020 · 3 Comments · Source: VictoriaMetrics/VictoriaMetrics

I'm trying to see whether I can use the VictoriaMetrics deduplication feature (per its deduplication description) when my jobs or I inject duplicate data. Perhaps I misinterpreted the intended use, but I tried injecting data in Influx line protocol via curl:

(venv) my_vm@ubuntu:~$ curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__=~"hi.*"}'
(venv) my_vm@ubuntu:~$ curl -d 'hi,tag1=value1 value=15 1581919391144000000' -X POST 'http://localhost:8428/write'
(venv) my_vm@ubuntu:~$ curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__=~"hi.*"}'
{"metric":{"__name__":"hi_value","tag1":"value1"},"values":[15],"timestamps":[1581919391144]}
(venv) my_vm@ubuntu:~$ curl -d 'hi,tag1=value1 value=15 1581919391144000000' -X POST 'http://localhost:8428/write'
(venv) my_vm@ubuntu:~$ curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__=~"hi.*"}'
{"metric":{"__name__":"hi_value","tag1":"value1"},"values":[15,15],"timestamps":[1581919391144,1581919391144]}
(venv) my_vm@ubuntu:~$ 

It looks like it didn't work, but when I try to start my VictoriaMetrics Docker container with the -dedup.minScrapeInterval argument, I get the following error (the argument isn't supported):

(venv) my_vm@ubuntu:~$ docker container run -d -p 8428:8428 -v victoria_metrics:/victoria-metrics-data valyala/victoria-metrics:latest -retentionPeriod=24 -dedup.minScrapeInterval=1s
7b41cf84c9250f4f657910e5941dfa91afe40f35f38aa4fab43b926f7cdc7207
(venv) my_vm@ubuntu:~$ docker container ls --all
CONTAINER ID        IMAGE                             COMMAND                  CREATED             STATUS                     PORTS               NAMES
7b41cf84c925        valyala/victoria-metrics:latest   "/victoria-metrics-p…"   5 seconds ago       Exited (2) 4 seconds ago                       vigorous_brahmagupta
(venv) my_vm@ubuntu:~$ docker container logs vigorous_brahmagupta 
flag provided but not defined: -dedup.minScrapeInterval
victoria-metrics-20191202-131048-tags-v1.30.2-0-g62a915f2
Usage of /victoria-metrics-prod:
  -bigMergeConcurrency int
        The maximum number of CPU cores to use for big merges. Default value is used if set to 0
  -deleteAuthKey string
        authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series
  -enableTCP6
        Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP is used
  -graphiteListenAddr string
        TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty
  -http.disableResponseCompression
        Disable compression of HTTP responses for saving CPU resources. By default compression is enabled to save network bandwidth
  -httpAuth.password string
        Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
  -httpAuth.username string
        Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
  -httpListenAddr string
        TCP address to listen for http connections (default ":8428")
  -influxMeasurementFieldSeparator {measurement}{separator}{field_name}
        Separator for {measurement}{separator}{field_name} metric name when inserted via Influx line protocol (default "_")
  -influxSkipSingleField {measurement}
        Uses {measurement} instead of `{measurement}{separator}{field_name}` for metic name if Influx line contains only a single field
  -loggerLevel string
        Minimum level of errors to log. Possible values: INFO, ERROR, FATAL, PANIC (default "INFO")
  -maxConcurrentInserts int
        The maximum number of concurrent inserts (default 4)
  -maxInsertRequestSize int
        The maximum size of a single insert request in bytes (default 33554432)
  -maxLabelsPerTimeseries int
        The maximum number of labels accepted per time series. Superflouos labels are dropped (default 30)
  -memory.allowedPercent float
        Allowed percent of system memory VictoriaMetrics caches may occupy (default 60)
  -metricsAuthKey string
        Auth key for /metrics. It overrides httpAuth settings
  -opentsdbHTTPListenAddr string
        TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty
  -opentsdbListenAddr string
        TCP and UDP address to listen for OpentTSDB put messages. Usually :4242 must be set. Doesn't work if empty
  -pprofAuthKey string
        Auth key for /debug/pprof. It overrides httpAuth settings
  -precisionBits int
        The number of precision bits to store per each value. Lower precision bits improves data compression at the cost of precision loss (default 64)
  -retentionPeriod int
        Retention period in months (default 1)
  -search.disableCache
        Whether to disable response caching. This may be useful during data backfilling
  -search.latencyOffset duration
        The time when data points become visible in query results after the colection. Too small value can result in incomplete last points for query results (default 30s)
  -search.logSlowQueryDuration duration
        Log queries with execution time exceeding this value. Zero disables slow query logging (default 5s)
  -search.maxConcurrentRequests int
        The maximum number of concurrent search requests. It shouldn't exceed 2*vCPUs for better performance. See also -search.maxQueueDuration (default 2)
  -search.maxLookback -search.lookback-delta
        Synonim to -search.lookback-delta from Prometheus. The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via `max_lookback` arg
  -search.maxPointsPerTimeseries int
        The maximum points per a single timeseries returned from the search (default 30000)
  -search.maxQueryDuration duration
        The maximum time for search query execution (default 30s)
  -search.maxQueryLen int
        The maximum search query length in bytes (default 16384)
  -search.maxQueueDuration duration
        The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached (default 10s)
  -search.maxTagKeys int
        The maximum number of tag keys returned per search (default 100000)
  -search.maxTagValues int
        The maximum number of tag values returned per search (default 100000)
  -search.maxUniqueTimeseries int
        The maximum number of unique time series each search can scan (default 300000)
  -smallMergeConcurrency int
        The maximum number of CPU cores to use for small merges. Default value is used if set to 0
  -snapshotAuthKey string
        authKey, which must be passed in query string to /snapshot* pages
  -storageDataPath string
        Path to storage data (default "victoria-metrics-data")
  -tls -tlsCertFile
        Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and `-tlsKeyFile` must be set if `-tls` is set
  -tlsCertFile -tls
        Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs, since RSA certs are slow
  -tlsKeyFile -tls
        Path to file with TLS key. Used only if -tls is set
  -version
        Show VictoriaMetrics version

Did I misinterpret the "deduplication" capability, or am I not using it correctly when I start the Docker container? Does VictoriaMetrics deduplicate the data from Prometheus export/remote_write only?

question

All 3 comments

Hi @ozn0417, it looks like you are using the wrong image, valyala/victoria-metrics:latest (this path is obsolete). You can find the right one here: https://hub.docker.com/r/victoriametrics/victoria-metrics

De-duplication was added in v1.31.0, while it looks like you are using v1.30.2; see the victoria-metrics-20191202-131048-tags-v1.30.2-0-g62a915f2 line in the log output.
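
For anyone who wants to script this check, here is a minimal sketch that pulls the version out of a build string like the one in the log above and compares it against v1.31.0. The helper names vm_version and version_ge are made up for illustration, the parsing assumes the build string keeps the "-tags-vX.Y.Z-" shape seen in this thread, and the comparison relies on `sort -V` being available (it is in GNU coreutils).

```shell
# Hypothetical helper: extract the version (e.g. "1.30.2") from a
# VictoriaMetrics build string of the form "...-tags-vX.Y.Z-...".
vm_version() {
    printf '%s\n' "$1" | sed -n 's/.*-tags-v\([0-9][0-9.]*\)-.*/\1/p'
}

# Hypothetical helper: succeeds if version $1 >= version $2.
# Uses version sort; the smaller of the two must be $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Build string copied from the container log in the question.
build='victoria-metrics-20191202-131048-tags-v1.30.2-0-g62a915f2'

if version_ge "$(vm_version "$build")" '1.31.0'; then
    echo 'new enough for -dedup.minScrapeInterval'
else
    echo 'too old for -dedup.minScrapeInterval'
fi
```

With the build string from the question, this takes the "too old" branch, which matches the "flag provided but not defined" error.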

The de-duplication is applied to all the ingested data via all the supported ingestion protocols.
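
To check whether duplicates are actually collapsed after re-running the two writes from the question, one option is to count the samples in the /api/v1/export output. The sketch below is only an illustration: count_samples is a made-up helper, and it assumes the export lines keep the JSON shape shown in the question (a single "timestamps":[...] array per line).

```shell
# Hypothetical helper: count samples in /api/v1/export output read from stdin
# by counting the entries of each "timestamps":[...] array.
count_samples() {
    sed -n 's/.*"timestamps":\[\([^]]*\)\].*/\1/p' | tr ',' '\n' | grep -c .
}

# Export line copied from the question (duplicate sample present).
line='{"metric":{"__name__":"hi_value","tag1":"value1"},"values":[15,15],"timestamps":[1581919391144,1581919391144]}'
printf '%s\n' "$line" | count_samples   # prints 2

# Against a running instance (not executed here):
#   curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__=~"hi.*"}' | count_samples
```

If deduplication is in effect for the two identical writes, the count for that series should drop from 2 to 1.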

As @tenmozes already said, http://hub.docker.com/r/valyala/victoria-metrics was deprecated a long time ago, so it is recommended to switch to https://hub.docker.com/r/victoriametrics/victoria-metrics and use the latest image from there.
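
As a sketch, the command from the question with only the image path swapped would look roughly like this. The start_vm wrapper and the VM_IMAGE variable are just illustrative names; the ports, volume, and flags are copied unchanged from the original command, and it assumes the :latest tag on the new repository is v1.31.0 or newer.

```shell
# Image path recommended in this thread (tag kept as "latest", as in the question).
VM_IMAGE='victoriametrics/victoria-metrics:latest'

# Hypothetical wrapper: start VictoriaMetrics with the same ports, volume,
# and flags as the original command, using the new image path.
start_vm() {
    docker container run -d -p 8428:8428 \
        -v victoria_metrics:/victoria-metrics-data \
        "$VM_IMAGE" -retentionPeriod=24 -dedup.minScrapeInterval=1s
}

# Run `start_vm` to launch the container. If it exits right away,
# `docker container logs <name>` shows whether a flag was rejected.
```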

This worked. Thanks to both of you.
Didn't realize the valyala/vict... was deprecated. I'll use the suggested image.
