Argo: Possible memory leak?

Created on 15 Nov 2018  ·  3 Comments  ·  Source: argoproj/argo

Is this a BUG REPORT or FEATURE REQUEST?: BUG REPORT

What happened: Executing a long-running script might lead to a memory leak.

[Screenshot, 2018-11-15 at 18:04:18]

- name: stream
  script:
    image: 'node:10-alpine'
    imagePullPolicy: Always
    resources:
      requests:
        cpu: 1
        memory: 2Gi
      limits:
        cpu: 1.5
        memory: 6Gi
    command:
      - dumb-init
      - node
      - '--optimize_for_size'
      - '--max_old_space_size=920'
      - '--gc_interval=100'
    source: |-
      const { exec } = require('/ds-stream/index.js')
      exec().then(({ uri }) => console.log(uri)).catch((e) => {
        console.error(e)
        process.exit(1)
      })

What you expected to happen:
The memory usage should be approximately equal to actual usage.

[Screenshot, 2018-11-15 at 18:04:28]

which is ~ 23 MB.
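One way to narrow down where the gap between the two numbers comes from is to have the script log its own view of memory while it runs. The following is a minimal sketch, not part of the original report: `formatMemoryUsage` and `startMemoryLogger` are hypothetical helper names, and the only real API used is Node's `process.memoryUsage()`.

```javascript
// Hypothetical helpers (not from the original report) that log the Node
// process's own memory figures so they can be compared with container metrics.

// Format a process.memoryUsage()-shaped object as megabytes.
function formatMemoryUsage(usage = process.memoryUsage()) {
  const toMB = (bytes) => (bytes / 1024 / 1024).toFixed(1)
  return `rss=${toMB(usage.rss)}MB heapUsed=${toMB(usage.heapUsed)}MB external=${toMB(usage.external)}MB`
}

// Log memory usage every `intervalMs` milliseconds (to stderr by default, so
// stdout stays clean for the script's own output).
function startMemoryLogger(intervalMs = 60000, log = console.error) {
  const timer = setInterval(() => log(formatMemoryUsage()), intervalMs)
  // unref() keeps this timer from holding the process open after the work is done.
  timer.unref()
  return timer
}
```

Calling `startMemoryLogger()` at the top of the script source would print one line per minute. If `rss` stays flat while the container's reported usage grows, the growth is probably being counted outside the Node process itself.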

How to reproduce it (as minimally and precisely as possible):

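const mysql = require('mysql')
const { Storage } = require('@google-cloud/storage')
const JSONStream = require('JSONStream')

const storage = new Storage({
  projectId: process.env.GCLOUD_PROJECT
})
const file = storage.bucket('bucket').file('filename')

// `options` was not defined in the original report. The fields below are an
// assumption based on the MYSQL_* variables visible in the executor logs.
const options = {
  host: process.env.MYSQL_HOSTNAME,
  user: process.env.MYSQL_USERNAME,
  password: process.env.MYSQL_PASSWORD,
  database: process.env.MYSQL_DATABASE
}
const conn = mysql.createConnection(options)

// Stream rows from MySQL, serialize them as newline-delimited JSON,
// and upload the result to Google Cloud Storage.
conn.query('select * from large_table')
  .stream({ highWaterMark: 1000 })
  .pipe(JSONStream.stringify(false))
  .pipe(file.createWriteStream({ contentType: 'application/json', resumable: true }))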

Anything else we need to know?:
GKE

Environment:

  • Argo version:
argo: v2.2.1
  BuildDate: 2018-10-11T16:25:59Z
  GitCommit: 3b52b26190163d1f72f3aef1a39f9f291378dafb
  GitTreeState: clean
  GitTag: v2.2.1
  GoVersion: go1.10.3
  Compiler: gc
  Platform: darwin/amd64
  • Kubernetes version :
clientVersion:
  buildDate: 2018-08-20T10:09:03Z
  compiler: gc
  gitCommit: 0c38c362511b20a098d7cd855f1314dad92c2780
  gitTreeState: clean
  gitVersion: v1.10.7
  goVersion: go1.9.3
  major: "1"
  minor: "10"
  platform: darwin/amd64
serverVersion:
  buildDate: 2018-11-02T23:07:38Z
  compiler: gc
  gitCommit: 9b635efce81582e1da13b35a7aa539c0ccb32987
  gitTreeState: clean
  gitVersion: v1.9.7-gke.7
  goVersion: go1.9.3b4
  major: "1"
  minor: 9+
  platform: linux/amd64

Other debugging information (if applicable):

  • workflow result:
Name:                ds-cron-bvn9l
Namespace:           default
ServiceAccount:      default
Status:              Running
Created:             Thu Nov 15 17:42:41 +0900 (24 minutes ago)
Started:             Thu Nov 15 17:42:41 +0900 (24 minutes ago)
Duration:            24 minutes 39 seconds
Parameters:          
  project:           toreta-ds-staging

STEP                        PODNAME                   DURATION  MESSAGE
 ● ds-cron-bvn9l                                                
 ├-✔ information-schema(0)  ds-cron-bvn9l-3300678462  6s        
 └-● customers                                                  
   └---● stream(0)          ds-cron-bvn9l-3257813657  24m
  • executor logs:
time="2018-11-15T08:42:50Z" level=info msg="Creating a docker executor"
time="2018-11-15T08:42:50Z" level=info msg="Executor (version: v2.2.0, build_date: 2018-08-30T08:52:54Z) initialized with template:\narchiveLocation: {}\ninputs:\n  parameters:\n  - name: table\n    value: customers\n  - name: dsl\n    value: |2-\n\n                (comp\n                  (replace \"company_name\" (mask))\n                  (replace \"first_name\" (mask))\n                  (replace \"first_name_reading\" (mask))\n                  (replace \"last_name\" (mask))\n                  (replace \"last_name_reading\" (mask))\n                  (replace \"note\" (mask)))\nmetadata:\n  annotations:\n    api.toreta.in/table: customers\nname: stream\noutputs: {}\nretryStrategy:\n  limit: 3\nscript:\n  command:\n  - dumb-init\n  - node\n  - --optimize_for_size\n  - --max_old_space_size=920\n  - --gc_interval=100\n  env:\n  - name: DEBUG\n    value: ds-stream:*\n  - name: GCLOUD_PROJECT\n    value: toreta-ds-staging\n  - name: MYSQL_BUFFER\n    value: \"0\"\n  - name: MYSQL_HOSTNAME\n    value: toreta-rails\n  - name: MYSQL_USERNAME\n    valueFrom:\n      secretKeyRef:\n        key: username\n        name: toreta-rails\n  - name: MYSQL_PASSWORD\n    valueFrom:\n      secretKeyRef:\n        key: password\n        name: toreta-rails\n  - name: MYSQL_DATABASE\n    valueFrom:\n      secretKeyRef:\n        key: database\n        name: toreta-rails\n  - name: SALT\n    valueFrom:\n      secretKeyRef:\n        key: SALT\n        name: ds-stream\n  - name: MYSQL_TABLE\n    value: customers\n  - name: GCLOUD_BUCKET\n    value: toreta-ds-staging-rails-snapshots\n  - name: DSL\n    value: |2-\n\n                (comp\n                  (replace \"company_name\" (mask))\n                  (replace \"first_name\" (mask))\n                  (replace \"first_name_reading\" (mask))\n                  (replace \"last_name\" (mask))\n                  (replace \"last_name_reading\" (mask))\n                  (replace \"note\" (mask)))\n  image: 
gcr.io/toreta-ds-staging/github-toreta-ds-stream:latest\n  imagePullPolicy: Always\n  name: \"\"\n  resources:\n    limits:\n      cpu: 1500m\n      memory: 6Gi\n    requests:\n      cpu: \"1\"\n      memory: 2Gi\n  source: |-\n    const { exec } = require('/ds-stream/index.js')\n    exec().then(({ uri }) => console.log(uri)).catch((e) => {\n      console.error(e)\n      process.exit(1)\n    })\n"
time="2018-11-15T08:42:50Z" level=info msg="Loading script source to /argo/staging/script"
time="2018-11-15T08:42:50Z" level=info msg="Start loading input artifacts..."
time="2018-11-15T08:42:50Z" level=info msg="Alloc=3267 TotalAlloc=9592 Sys=9286 NumGC=3 Goroutines=3"
time="2018-11-15T08:42:55Z" level=info msg="Creating a docker executor"
time="2018-11-15T08:42:55Z" level=info msg="Executor (version: v2.2.0, build_date: 2018-08-30T08:52:54Z) initialized with template:\narchiveLocation: {}\ninputs:\n  parameters:\n  - name: table\n    value: customers\n  - name: dsl\n    value: |2-\n\n                (comp\n                  (replace \"company_name\" (mask))\n                  (replace \"first_name\" (mask))\n                  (replace \"first_name_reading\" (mask))\n                  (replace \"last_name\" (mask))\n                  (replace \"last_name_reading\" (mask))\n                  (replace \"note\" (mask)))\nmetadata:\n  annotations:\n    api.toreta.in/table: customers\nname: stream\noutputs: {}\nretryStrategy:\n  limit: 3\nscript:\n  command:\n  - dumb-init\n  - node\n  - --optimize_for_size\n  - --max_old_space_size=920\n  - --gc_interval=100\n  env:\n  - name: DEBUG\n    value: ds-stream:*\n  - name: GCLOUD_PROJECT\n    value: toreta-ds-staging\n  - name: MYSQL_BUFFER\n    value: \"0\"\n  - name: MYSQL_HOSTNAME\n    value: toreta-rails\n  - name: MYSQL_USERNAME\n    valueFrom:\n      secretKeyRef:\n        key: username\n        name: toreta-rails\n  - name: MYSQL_PASSWORD\n    valueFrom:\n      secretKeyRef:\n        key: password\n        name: toreta-rails\n  - name: MYSQL_DATABASE\n    valueFrom:\n      secretKeyRef:\n        key: database\n        name: toreta-rails\n  - name: SALT\n    valueFrom:\n      secretKeyRef:\n        key: SALT\n        name: ds-stream\n  - name: MYSQL_TABLE\n    value: customers\n  - name: GCLOUD_BUCKET\n    value: toreta-ds-staging-rails-snapshots\n  - name: DSL\n    value: |2-\n\n                (comp\n                  (replace \"company_name\" (mask))\n                  (replace \"first_name\" (mask))\n                  (replace \"first_name_reading\" (mask))\n                  (replace \"last_name\" (mask))\n                  (replace \"last_name_reading\" (mask))\n                  (replace \"note\" (mask)))\n  image: 
gcr.io/toreta-ds-staging/github-toreta-ds-stream:latest\n  imagePullPolicy: Always\n  name: \"\"\n  resources:\n    limits:\n      cpu: 1500m\n      memory: 6Gi\n    requests:\n      cpu: \"1\"\n      memory: 2Gi\n  source: |-\n    const { exec } = require('/ds-stream/index.js')\n    exec().then(({ uri }) => console.log(uri)).catch((e) => {\n      console.error(e)\n      process.exit(1)\n    })\n"
time="2018-11-15T08:42:55Z" level=info msg="Waiting on main container"
time="2018-11-15T08:42:56Z" level=info msg="main container started with container ID: a53ad97e0a7bdb0ce7990443348e034ff17bac963ffbbe5fe952aa95de84525f"
time="2018-11-15T08:42:56Z" level=info msg="Starting annotations monitor"
time="2018-11-15T08:42:56Z" level=info msg="Execution control set from API: {2018-11-15 14:42:41 +0000 UTC}"
time="2018-11-15T08:42:56Z" level=info msg="docker wait a53ad97e0a7bdb0ce7990443348e034ff17bac963ffbbe5fe952aa95de84525f"
time="2018-11-15T08:42:56Z" level=info msg="Starting deadline monitor"
time="2018-11-15T08:47:55Z" level=info msg="Alloc=2953 TotalAlloc=11215 Sys=10342 NumGC=6 Goroutines=11"
time="2018-11-15T08:52:55Z" level=info msg="Alloc=2957 TotalAlloc=11222 Sys=10342 NumGC=8 Goroutines=11"
time="2018-11-15T08:57:55Z" level=info msg="Alloc=2957 TotalAlloc=11224 Sys=10342 NumGC=11 Goroutines=11"
time="2018-11-15T09:02:55Z" level=info msg="Alloc=2959 TotalAlloc=11229 Sys=10342 NumGC=13 Goroutines=11"
time="2018-11-15T09:07:55Z" level=info msg="Alloc=2959 TotalAlloc=11232 Sys=10342 NumGC=16 Goroutines=11"
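
The periodic `Alloc=... Sys=...` lines above are the executor's own runtime statistics, and they can be compared across the run to check whether the executor sidecar itself is growing. The following is a rough sketch, not part of the original report: `parseStatsLine` and `allocGrowth` are hypothetical helper names, and the units of the values are not stated in the logs, so the code treats them as plain numbers.

```javascript
// Hypothetical helpers (not from the original report) for reading the
// executor's periodic stats lines.

// Parse one executor log line; returns null if it is not a stats line.
function parseStatsLine(line) {
  const match = line.match(
    /msg="Alloc=(\d+) TotalAlloc=(\d+) Sys=(\d+) NumGC=(\d+) Goroutines=(\d+)"/
  )
  if (!match) return null
  const [alloc, totalAlloc, sys, numGC, goroutines] = match.slice(1).map(Number)
  const time = (line.match(/time="([^"]+)"/) || [])[1] || null
  return { time, alloc, totalAlloc, sys, numGC, goroutines }
}

// Difference in Alloc between the first and last stats line,
// or null if fewer than two stats lines are present.
function allocGrowth(lines) {
  const stats = lines.map(parseStatsLine).filter(Boolean)
  if (stats.length < 2) return null
  return stats[stats.length - 1].alloc - stats[0].alloc
}
```

Applied to the 08:47:55Z and 09:07:55Z lines above, `allocGrowth` returns 6 (2953 to 2959) while `Sys` stays at 10342. That suggests the executor process is flat over those 20 minutes and any growth is elsewhere.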

All 3 comments

Was this fixed? I seem to have the workflow-controller get OOMKilled every week or so.

@radcheb This is different from kubeflow/pipelines#3550, where people report a memory leak in KFP, not in Argo.
