Elasticsearch: Out of memory (invoked oom-killer)

Created on 31 Mar 2017  ·  4 Comments  ·  Source: elastic/elasticsearch

Elasticsearch version: 2.4.0
Plugins installed: [ "license", "marvel-agent" ]
JVM version: 1.8.0_101
OS version: Ubuntu 16.04.1 LTS
Kernel version: 4.4.0-67-generic #88-Ubuntu

Kibana version: 4.6.0
Plugins installed: [ "marvel", "sense" ]
Filebeat version: 5.2.2
nginx version: nginx/1.10.0 (Ubuntu)

Description of the problem including expected versus actual behavior:
This node had been running for months under normal load. The other day Filebeat was installed to ship system logs to a separate Elasticsearch log cluster. About 24 hours after the installation, the server ran out of memory. The syslog shows that filebeat invoked the oom-killer, which in turn killed the JVM running Elasticsearch.
The server was rebooted and seemed to work fine for about 1.5 hours. Then the same thing happened again, but this time it was java that invoked the oom-killer. Once again the server was rebooted, and this time Filebeat was stopped. After some days Filebeat was started again, and the server has now been running stably for three days.

I do not know if this problem is caused by Filebeat or if it was a coincidence that it occurred after the installation.
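One way to check how much memory each process actually held is to read the `rss` column of the process table that the kernel prints in each OOM report (see the logs below). The following is a minimal sketch, not a definitive tool: it assumes a 4 kB page size, the three rows are copied from the first report with the syslog prefix stripped, and the names `ROW_RE` and `rss_by_process` are made up for illustration.

```python
import re

# Three rows copied from the process table of the first OOM report.
# Columns: pid uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
ROWS = """\
[  833]   113   833   488348   194272     865     721        0             0 node
[  975]     0   975     4914     2158      15       5        0             0 filebeat
[  990]   112   990 18475811  1993312    5428      62        0             0 java
"""

PAGE_KB = 4  # assumption: 4 kB pages (the x86-64 default)

# Hypothetical pattern for one table row; group 5 is rss (in pages), group 7 is the name.
ROW_RE = re.compile(
    r"\[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+\d+\s+\d+\s+\d+\s+(-?\d+)\s+(\S+)"
)


def rss_by_process(text):
    """Return {process name: resident set size in MiB}, summed per name."""
    result = {}
    for match in ROW_RE.finditer(text):
        rss_pages = int(match.group(5))
        name = match.group(7)
        result[name] = result.get(name, 0.0) + rss_pages * PAGE_KB / 1024
    return result


usage = rss_by_process(ROWS)
# Print the largest consumers first.
for name, mib in sorted(usage.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:10s} {mib:10.1f} MiB")
```

With these rows, java holds several GiB of resident memory while filebeat holds under 10 MiB. The process that invokes the oom-killer is only the one whose allocation failed, so this does not by itself show which component caused the shortage.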

Other things to note:

  • Nothing interesting in the elasticsearch logs or the Filebeat logs.
  • The other nodes in the cluster have been stable since the Filebeat installation. The only difference in their setup is that they all have kernel version 4.4.0-66-generic #87-Ubuntu.
  • When I look at the system load graph in Marvel, at the time of the error the load went from an average of 1.5 to 200.

Steps to reproduce:

  1. Do not know how to reproduce since the error occurred after over 24 hours of uptime with normal load.

Similar issue: https://github.com/elastic/elasticsearch/issues/22788

Logs:
/var/log/syslog

Mar 24 11:35:24 search06 kernel: [70429.840539] filebeat invoked oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=0
Mar 24 11:35:24 search06 kernel: [70429.840542] filebeat cpuset=/ mems_allowed=0
Mar 24 11:35:24 search06 kernel: [70429.840548] CPU: 3 PID: 1142 Comm: filebeat Not tainted 4.4.0-67-generic #88-Ubuntu
Mar 24 11:35:24 search06 kernel: [70429.840549] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
Mar 24 11:35:24 search06 kernel: [70429.840551]  0000000000000286 00000000df728432 ffff880370b9f9f8 ffffffff813f86d3
Mar 24 11:35:24 search06 kernel: [70429.840553]  ffff880370b9fbb0 ffff88036c833300 ffff880370b9fa68 ffffffff8120b24e
Mar 24 11:35:24 search06 kernel: [70429.840555]  ffffffff8113f60a ffff880370b9fa98 ffffffff811a6e0d ffff880387ff2ef8
Mar 24 11:35:24 search06 kernel: [70429.840557] Call Trace:
Mar 24 11:35:24 search06 kernel: [70429.840564]  [<ffffffff813f86d3>] dump_stack+0x63/0x90
Mar 24 11:35:24 search06 kernel: [70429.840567]  [<ffffffff8120b24e>] dump_header+0x5a/0x1c5
Mar 24 11:35:24 search06 kernel: [70429.840572]  [<ffffffff8113f60a>] ? __delayacct_freepages_end+0x2a/0x30
Mar 24 11:35:24 search06 kernel: [70429.840576]  [<ffffffff811a6e0d>] ? do_try_to_free_pages+0x2ed/0x410
Mar 24 11:35:24 search06 kernel: [70429.840578]  [<ffffffff811928c2>] oom_kill_process+0x202/0x3c0
Mar 24 11:35:24 search06 kernel: [70429.840580]  [<ffffffff81192ce9>] out_of_memory+0x219/0x460
Mar 24 11:35:24 search06 kernel: [70429.840582]  [<ffffffff81198cd8>] __alloc_pages_slowpath.constprop.88+0x938/0xad0
Mar 24 11:35:24 search06 kernel: [70429.840585]  [<ffffffff811990f6>] __alloc_pages_nodemask+0x286/0x2a0
Mar 24 11:35:24 search06 kernel: [70429.840588]  [<ffffffff811e2b8c>] alloc_pages_current+0x8c/0x110
Mar 24 11:35:24 search06 kernel: [70429.840591]  [<ffffffff8118eeab>] __page_cache_alloc+0xab/0xc0
Mar 24 11:35:24 search06 kernel: [70429.840593]  [<ffffffff8119139a>] filemap_fault+0x14a/0x3f0
Mar 24 11:35:24 search06 kernel: [70429.840597]  [<ffffffff812a2e16>] ext4_filemap_fault+0x36/0x50
Mar 24 11:35:24 search06 kernel: [70429.840601]  [<ffffffff811be230>] __do_fault+0x50/0xe0
Mar 24 11:35:24 search06 kernel: [70429.840604]  [<ffffffff811c1d3e>] handle_mm_fault+0xf8e/0x1820
Mar 24 11:35:24 search06 kernel: [70429.840607]  [<ffffffff8106b527>] __do_page_fault+0x197/0x400
Mar 24 11:35:24 search06 kernel: [70429.840609]  [<ffffffff8106b7b2>] do_page_fault+0x22/0x30
Mar 24 11:35:24 search06 kernel: [70429.840612]  [<ffffffff8183e7f8>] page_fault+0x28/0x30
Mar 24 11:35:24 search06 kernel: [70429.840613] Mem-Info:
Mar 24 11:35:24 search06 kernel: [70429.840617] active_anon:280484 inactive_anon:2137 isolated_anon:0
Mar 24 11:35:24 search06 kernel: [70429.840617]  active_file:44 inactive_file:87 isolated_file:0
Mar 24 11:35:24 search06 kernel: [70429.840617]  unevictable:1938018 dirty:0 writeback:16 unstable:0
Mar 24 11:35:24 search06 kernel: [70429.840617]  slab_reclaimable:43288 slab_unreclaimable:1259852
Mar 24 11:35:24 search06 kernel: [70429.840617]  mapped:7273 shmem:2255 pagetables:8296 bounce:0
Mar 24 11:35:24 search06 kernel: [70429.840617]  free:23264 free_pcp:281 free_cma:0
Mar 24 11:35:24 search06 kernel: [70429.840620] Node 0 DMA free:15872kB min:72kB low:88kB high:108kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15996kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:32kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Mar 24 11:35:24 search06 kernel: [70429.840625] lowmem_reserve[]: 0 3833 13943 13943 13943
Mar 24 11:35:24 search06 kernel: [70429.840628] Node 0 DMA32 free:58888kB min:18560kB low:23200kB high:27840kB active_anon:225444kB inactive_anon:1404kB active_file:128kB inactive_file:292kB unevictable:2545532kB isolated(anon):0kB isolated(file):0kB present:4046236kB managed:3965580kB mlocked:2545532kB dirty:0kB writeback:64kB mapped:8536kB shmem:1540kB slab_reclaimable:38096kB slab_unreclaimable:1057936kB kernel_stack:5072kB pagetables:10660kB unstable:0kB bounce:0kB free_pcp:1124kB local_pcp:120kB free_cma:0kB writeback_tmp:0kB pages_scanned:9100 all_unreclaimable? yes
Mar 24 11:35:24 search06 kernel: [70429.840632] lowmem_reserve[]: 0 0 10110 10110 10110
Mar 24 11:35:24 search06 kernel: [70429.840634] Node 0 Normal free:18296kB min:48948kB low:61184kB high:73420kB active_anon:896492kB inactive_anon:7144kB active_file:48kB inactive_file:56kB unevictable:5206540kB isolated(anon):0kB isolated(file):0kB present:10616832kB managed:10352916kB mlocked:5206540kB dirty:0kB writeback:0kB mapped:20556kB shmem:7480kB slab_reclaimable:135056kB slab_unreclaimable:3981440kB kernel_stack:10144kB pagetables:22524kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:576900 all_unreclaimable? yes
Mar 24 11:35:24 search06 kernel: [70429.840638] lowmem_reserve[]: 0 0 0 0 0
Mar 24 11:35:24 search06 kernel: [70429.840640] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15872kB
Mar 24 11:35:24 search06 kernel: [70429.840649] Node 0 DMA32: 11402*4kB (UME) 1570*8kB (UME) 36*16kB (UM) 1*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 58776kB
Mar 24 11:35:24 search06 kernel: [70429.840655] Node 0 Normal: 4142*4kB (UMH) 8*8kB (H) 8*16kB (H) 6*32kB (H) 9*64kB (H) 4*128kB (H) 1*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18296kB
Mar 24 11:35:24 search06 kernel: [70429.840664] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Mar 24 11:35:24 search06 kernel: [70429.840665] 8524 total pagecache pages
Mar 24 11:35:24 search06 kernel: [70429.840666] 0 pages in swap cache
Mar 24 11:35:24 search06 kernel: [70429.840667] Swap cache stats: add 0, delete 0, find 0/0
Mar 24 11:35:24 search06 kernel: [70429.840668] Free swap  = 0kB
Mar 24 11:35:24 search06 kernel: [70429.840669] Total swap = 0kB
Mar 24 11:35:24 search06 kernel: [70429.840670] 3669766 pages RAM
Mar 24 11:35:24 search06 kernel: [70429.840670] 0 pages HighMem/MovableOnly
Mar 24 11:35:24 search06 kernel: [70429.840671] 86166 pages reserved
Mar 24 11:35:24 search06 kernel: [70429.840672] 0 pages cma reserved
Mar 24 11:35:24 search06 kernel: [70429.840672] 0 pages hwpoisoned
Mar 24 11:35:24 search06 kernel: [70429.840673] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Mar 24 11:35:24 search06 kernel: [70429.840677] [  394]     0   394    10972     2493      24       3        0             0 systemd-journal
Mar 24 11:35:24 search06 kernel: [70429.840679] [  424]     0   424    42126      261      19       3        0             0 lvmetad
Mar 24 11:35:24 search06 kernel: [70429.840681] [  447]     0   447    11183      575      23       3        0         -1000 systemd-udevd
Mar 24 11:35:24 search06 kernel: [70429.840683] [  691]   100   691    25081      542      19       3        0             0 systemd-timesyn
Mar 24 11:35:24 search06 kernel: [70429.840685] [  820]     0   820     9559      401      22       3        0             0 systemd-logind
Mar 24 11:35:24 search06 kernel: [70429.840703] [  822]     0   822     1100      315       8       3        0             0 acpid
Mar 24 11:35:24 search06 kernel: [70429.840705] [  824]     0   824    95627       91      21       3        0             0 lxcfs
Mar 24 11:35:24 search06 kernel: [70429.840707] [  826]     0   826    71369     1236      41       4        0             0 accounts-daemon
Mar 24 11:35:24 search06 kernel: [70429.840708] [  829]     0   829     6511      483      19       3        0             0 atd
Mar 24 11:35:24 search06 kernel: [70429.840710] [  833]   113   833   488348   194272     865     721        0             0 node
Mar 24 11:35:24 search06 kernel: [70429.840712] [  837]   107   837    13129      515      30       3        0          -900 dbus-daemon
Mar 24 11:35:24 search06 kernel: [70429.840714] [  901]     0   901    10176      666      25       3        0             0 cron
Mar 24 11:35:24 search06 kernel: [70429.840715] [  903]   104   903    66503      312      31       3        0             0 rsyslogd
Mar 24 11:35:24 search06 kernel: [70429.840717] [  949]     0   949    69296      176      40       3        0             0 polkitd
Mar 24 11:35:24 search06 kernel: [70429.840719] [  954]     0   954     3344       37      11       3        0             0 mdadm
Mar 24 11:35:24 search06 kernel: [70429.840720] [  959]     0   959   107590     2155      35       5        0             0 snapd
Mar 24 11:35:24 search06 kernel: [70429.840722] [  969]     0   969    16380      506      35       4        0         -1000 sshd
Mar 24 11:35:24 search06 kernel: [70429.840724] [  975]     0   975     4914     2158      15       5        0             0 filebeat
Mar 24 11:35:24 search06 kernel: [70429.840726] [  990]   112   990 18475811  1993312    5428      62        0             0 java
Mar 24 11:35:24 search06 kernel: [70429.840727] [  999]     0   999     1306       30       9       3        0             0 iscsid
Mar 24 11:35:24 search06 kernel: [70429.840729] [ 1000]     0  1000     1431      877       9       3        0           -17 iscsid
Mar 24 11:35:24 search06 kernel: [70429.840731] [ 1072]     0  1072     4868      219      14       3        0             0 irqbalance
Mar 24 11:35:24 search06 kernel: [70429.840733] [ 1074]     0  1074   140928     1084      58       3        0             0 lwsmd
Mar 24 11:35:24 search06 kernel: [70429.840734] [ 1099]     0  1099   231193      375      73       4        0             0 lwsmd
Mar 24 11:35:24 search06 kernel: [70429.840736] [ 1136]     0  1136     3985      414      13       3        0             0 agetty
Mar 24 11:35:24 search06 kernel: [70429.840737] [ 1146]     0  1146    36396      591      45       3        0             0 nginx
Mar 24 11:35:24 search06 kernel: [70429.840739] [ 1147]    33  1147    39417      791      55       3        0             0 nginx
Mar 24 11:35:24 search06 kernel: [70429.840741] [ 1148]    33  1148    39428      533      55       3        0             0 nginx
Mar 24 11:35:24 search06 kernel: [70429.840742] [ 1149]    33  1149    39423      594      55       3        0             0 nginx
Mar 24 11:35:24 search06 kernel: [70429.840744] [ 1150]    33  1150    39440      872      55       3        0             0 nginx
Mar 24 11:35:24 search06 kernel: [70429.840746] [ 1154]     0  1154   148263      941      63       3        0             0 lwsmd
Mar 24 11:35:24 search06 kernel: [70429.840747] [ 1181]     0  1181   197559      318      69       4        0             0 lwsmd
Mar 24 11:35:24 search06 kernel: [70429.840749] [ 1213]   111  1213    19018     1401      38       3        0             0 snmpd
Mar 24 11:35:24 search06 kernel: [70429.840751] [ 1223]     0  1223   232151     1253      74       4        0             0 lwsmd
Mar 24 11:35:24 search06 kernel: [70429.840752] [ 1252]     0  1252   320092     2792     106       4        0             0 lwsmd
Mar 24 11:35:24 search06 kernel: [70429.840754] [ 1303]     0  1303   166416     1224      67       3        0             0 lwsmd
Mar 24 11:35:24 search06 kernel: [70429.840768] [23403]     0 23403    26514      805      56       3        0             0 sshd
Mar 24 11:35:24 search06 kernel: [70429.840770] [23416] 926419527 23416    13716      683      30       3        0             0 systemd
Mar 24 11:35:24 search06 kernel: [70429.840772] [23421] 926419527 23421    18035      618      38       3        0             0 (sd-pam)
Mar 24 11:35:24 search06 kernel: [70429.840774] [23499] 926419527 23499    26514      709      53       3        0             0 sshd
Mar 24 11:35:24 search06 kernel: [70429.840775] [23500] 926419527 23500     9116     1114      22       3        0             0 bash
Mar 24 11:35:24 search06 kernel: [70429.840777] [23516] 926419527 23516    12970      790      30       3        0             0 top
Mar 24 11:35:24 search06 kernel: [70429.840779] [23896]     0 23896    26514      797      54       3        0             0 sshd
Mar 24 11:35:24 search06 kernel: [70429.840780] [23904] 926420501 23904    13716      593      31       3        0             0 systemd
Mar 24 11:35:24 search06 kernel: [70429.840783] [23911] 926420501 23911    18035      618      38       3        0             0 (sd-pam)
Mar 24 11:35:24 search06 kernel: [70429.840784] [23943] 926420501 23943    26514      436      52       3        0             0 sshd
Mar 24 11:35:24 search06 kernel: [70429.840786] [23944] 926420501 23944     9116      956      23       3        0             0 bash
Mar 24 11:35:24 search06 kernel: [70429.840788] [24002]     0 24002    26514      788      54       3        0             0 sshd
Mar 24 11:35:24 search06 kernel: [70429.840789] [24013] 926420521 24013    13716      646      29       3        0             0 systemd
Mar 24 11:35:24 search06 kernel: [70429.840791] [24016] 926420521 24016    18035      618      38       3        0             0 (sd-pam)
Mar 24 11:35:24 search06 kernel: [70429.840793] [24046] 926420521 24046    26514      645      53       3        0             0 sshd
Mar 24 11:35:24 search06 kernel: [70429.840794] [24047] 926420521 24047     7238      592      19       3        0             0 sftp-server
Mar 24 11:35:24 search06 kernel: [70429.840796] [24077]     0 24077    26514      566      54       3        0             0 sshd
Mar 24 11:35:24 search06 kernel: [70429.840798] [24096] 926420501 24096    12922      509      29       3        0             0 top
Mar 24 11:35:24 search06 kernel: [70429.840799] Out of memory: Kill process 990 (java) score 557 or sacrifice child
Mar 24 11:35:24 search06 kernel: [70429.841000] Killed process 990 (java) total-vm:73903244kB, anon-rss:7973248kB, file-rss:0kB
Mar 24 11:35:24 search06 systemd[1]: elasticsearch.service: Main process exited, code=killed, status=9/KILL
Mar 24 11:35:24 search06 systemd[1]: elasticsearch.service: Unit entered failed state.
Mar 24 11:35:24 search06 systemd[1]: elasticsearch.service: Failed with result 'signal'.
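
To see where the memory went in the report above, the page counts in its Mem-Info block can be converted to GiB. This is a rough sketch under stated assumptions: the 4 kB page size is assumed (it is consistent with the per-zone kB figures in the same report), the values are copied from the log, and the names `mem_info` and `pages_to_gib` are illustrative only.

```python
PAGE_KB = 4  # assumption: 4 kB pages, consistent with the per-zone kB totals

# Page counts copied from the Mem-Info block of the first OOM report.
mem_info = {
    "active_anon": 280484,
    "unevictable": 1938018,
    "slab_reclaimable": 43288,
    "slab_unreclaimable": 1259852,
    "free": 23264,
}
total_ram_pages = 3669766  # from the "3669766 pages RAM" line


def pages_to_gib(pages):
    """Convert a page count to GiB using the assumed page size."""
    return pages * PAGE_KB / (1024 * 1024)


# Print the counters from largest to smallest with their share of RAM.
for name, pages in sorted(mem_info.items(), key=lambda kv: kv[1], reverse=True):
    share = 100 * pages / total_ram_pages
    print(f"{name:20s} {pages_to_gib(pages):6.2f} GiB  {share:5.1f}% of RAM")
```

Under these assumptions, unevictable (mlocked) pages come to about 7.4 GiB and unreclaimable slab to about 4.8 GiB of roughly 14 GiB of RAM. Unreclaimable slab is kernel memory rather than memory owned by the JVM or Filebeat, so it may be worth tracking over time, although these numbers alone do not identify a cause.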
/var/log/syslog

Mar 24 13:09:13 search06 kernel: [ 4771.257280] java invoked oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=0
Mar 24 13:09:13 search06 kernel: [ 4771.257283] java cpuset=/ mems_allowed=0
Mar 24 13:09:13 search06 kernel: [ 4771.257288] CPU: 2 PID: 1446 Comm: java Not tainted 4.4.0-67-generic #88-Ubuntu
Mar 24 13:09:13 search06 kernel: [ 4771.257289] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
Mar 24 13:09:13 search06 kernel: [ 4771.257291] 0000000000000286 000000008bfad918 ffff880370f1fa50 ffffffff813f86d3
Mar 24 13:09:13 search06 kernel: [ 4771.257293] ffff880370f1fc08 ffff880035429980 ffff880370f1fac0 ffffffff8120b24e
Mar 24 13:09:13 search06 kernel: [ 4771.257295] ffffffff8113f60a ffff880370f1faf0 ffffffff811a6e0d ffff880387ff2ef8
Mar 24 13:09:13 search06 kernel: [ 4771.257296] Call Trace:
Mar 24 13:09:13 search06 kernel: [ 4771.257302] [] dump_stack+0x63/0x90
Mar 24 13:09:13 search06 kernel: [ 4771.257305] [] dump_header+0x5a/0x1c5
Mar 24 13:09:13 search06 kernel: [ 4771.257310] [] ? __delayacct_freepages_end+0x2a/0x30
Mar 24 13:09:13 search06 kernel: [ 4771.257314] [] ? do_try_to_free_pages+0x2ed/0x410
Mar 24 13:09:13 search06 kernel: [ 4771.257316] [] oom_kill_process+0x202/0x3c0
Mar 24 13:09:13 search06 kernel: [ 4771.257318] [] out_of_memory+0x219/0x460
Mar 24 13:09:13 search06 kernel: [ 4771.257320] [] __alloc_pages_slowpath.constprop.88+0x938/0xad0
Mar 24 13:09:13 search06 kernel: [ 4771.257323] [] __alloc_pages_nodemask+0x286/0x2a0
Mar 24 13:09:13 search06 kernel: [ 4771.257326] [] alloc_pages_current+0x8c/0x110
Mar 24 13:09:13 search06 kernel: [ 4771.257329] [] __page_cache_alloc+0xab/0xc0
Mar 24 13:09:13 search06 kernel: [ 4771.257330] [] generic_file_read_iter+0x545/0x670
Mar 24 13:09:13 search06 kernel: [ 4771.257333] [] new_sync_read+0x94/0xd0
Mar 24 13:09:13 search06 kernel: [ 4771.257335] [] __vfs_read+0x26/0x40
Mar 24 13:09:13 search06 kernel: [ 4771.257337] [] vfs_read+0x86/0x130
Mar 24 13:09:13 search06 kernel: [ 4771.257339] [] SyS_pread64+0x95/0xb0
Mar 24 13:09:13 search06 kernel: [ 4771.257342] [] entry_SYSCALL_64_fastpath+0x16/0x71
Mar 24 13:09:13 search06 kernel: [ 4771.257344] Mem-Info:
Mar 24 13:09:13 search06 kernel: [ 4771.257347] active_anon:114452 inactive_anon:2124 isolated_anon:0
Mar 24 13:09:13 search06 kernel: [ 4771.257347] active_file:72 inactive_file:0 isolated_file:0
Mar 24 13:09:13 search06 kernel: [ 4771.257347] unevictable:1932437 dirty:1 writeback:0 unstable:0
Mar 24 13:09:13 search06 kernel: [ 4771.257347] slab_reclaimable:27304 slab_unreclaimable:1449229
Mar 24 13:09:13 search06 kernel: [ 4771.257347] mapped:6509 shmem:2211 pagetables:6742 bounce:0
Mar 24 13:09:13 search06 kernel: [ 4771.257347] free:23289 free_pcp:227 free_cma:0
Mar 24 13:09:13 search06 kernel: [ 4771.257350] Node 0 DMA free:15872kB min:72kB low:88kB high:108kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15996kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:32kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Mar 24 13:09:13 search06 kernel: [ 4771.257354] lowmem_reserve[]: 0 3833 13943 13943 13943
Mar 24 13:09:13 search06 kernel: [ 4771.257356] Node 0 DMA32 free:58976kB min:18560kB low:23200kB high:27840kB active_anon:102916kB inactive_anon:1372kB active_file:244kB inactive_file:0kB unevictable:2554900kB isolated(anon):0kB isolated(file):0kB present:4046236kB managed:3965580kB mlocked:2554900kB dirty:0kB writeback:0kB mapped:8548kB shmem:1516kB slab_reclaimable:21960kB slab_unreclaimable:1192596kB kernel_stack:3792kB pagetables:7444kB unstable:0kB bounce:0kB free_pcp:908kB local_pcp:244kB free_cma:0kB writeback_tmp:0kB pages_scanned:5284 all_unreclaimable? yes
Mar 24 13:09:13 search06 kernel: [ 4771.257360] lowmem_reserve[]: 0 0 10110 10110 10110
Mar 24 13:09:13 search06 kernel: [ 4771.257362] Node 0 Normal free:18308kB min:48948kB low:61184kB high:73420kB active_anon:354892kB inactive_anon:7124kB active_file:44kB inactive_file:60kB unevictable:5174848kB isolated(anon):0kB isolated(file):0kB present:10616832kB managed:10352916kB mlocked:5174848kB dirty:4kB writeback:0kB mapped:17488kB shmem:7328kB slab_reclaimable:87256kB slab_unreclaimable:4604288kB kernel_stack:10928kB pagetables:19524kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:523972 all_unreclaimable? yes
Mar 24 13:09:13 search06 kernel: [ 4771.257365] lowmem_reserve[]: 0 0 0 0 0
Mar 24 13:09:13 search06 kernel: [ 4771.257368] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15872kB
Mar 24 13:09:13 search06 kernel: [ 4771.257376] Node 0 DMA32: 9052*4kB (UME) 2058*8kB (ME) 394*16kB (UM) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 58976kB
Mar 24 13:09:13 search06 kernel: [ 4771.257382] Node 0 Normal: 4217*4kB (UMH) 14*8kB (H) 15*16kB (H) 18*32kB (H) 4*64kB (H) 2*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18308kB
Mar 24 13:09:13 search06 kernel: [ 4771.257391] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Mar 24 13:09:13 search06 kernel: [ 4771.257393] 8385 total pagecache pages
Mar 24 13:09:13 search06 kernel: [ 4771.257394] 0 pages in swap cache
Mar 24 13:09:13 search06 kernel: [ 4771.257396] Swap cache stats: add 0, delete 0, find 0/0
Mar 24 13:09:13 search06 kernel: [ 4771.257397] Free swap = 0kB
Mar 24 13:09:13 search06 kernel: [ 4771.257398] Total swap = 0kB
Mar 24 13:09:13 search06 kernel: [ 4771.257399] 3669766 pages RAM
Mar 24 13:09:13 search06 kernel: [ 4771.257400] 0 pages HighMem/MovableOnly
Mar 24 13:09:13 search06 kernel: [ 4771.257401] 86166 pages reserved
Mar 24 13:09:13 search06 kernel: [ 4771.257402] 0 pages cma reserved
Mar 24 13:09:13 search06 kernel: [ 4771.257403] 0 pages hwpoisoned
Mar 24 13:09:13 search06 kernel: [ 4771.257404] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
Mar 24 13:09:13 search06 kernel: [ 4771.257410] [ 401] 0 401 9104 683 21 3 0 0 systemd-journal
Mar 24 13:09:13 search06 kernel: [ 4771.257413] [ 446] 0 446 42126 274 19 3 0 0 lvmetad
Mar 24 13:09:13 search06 kernel: [ 4771.257414] [ 453] 0 453 11301 864 24 3 0 -1000 systemd-udevd
Mar 24 13:09:13 search06 kernel: [ 4771.257416] [ 699] 100 699 25081 496 20 3 0 0 systemd-timesyn
Mar 24 13:09:13 search06 kernel: [ 4771.257418] [ 827] 113 827 350358 56458 268 210 0 0 node
Mar 24 13:09:13 search06 kernel: [ 4771.257420] [ 830] 0 830 6511 472 18 3 0 0 atd
Mar 24 13:09:13 search06 kernel: [ 4771.257421] [ 835] 107 835 13129 604 29 3 0 -900 dbus-daemon
Mar 24 13:09:13 search06 kernel: [ 4771.257423] [ 894] 0 894 1100 294 8 3 0 0 acpid
Mar 24 13:09:13 search06 kernel: [ 4771.257425] [ 896] 0 896 10176 662 25 3 0 0 cron
Mar 24 13:09:13 search06 kernel: [ 4771.257426] [ 898] 0 898 71371 526 41 3 0 0 accounts-daemon
Mar 24 13:09:13 search06 kernel: [ 4771.257428] [ 900] 0 900 40226 273 15 4 0 0 lxcfs
Mar 24 13:09:13 search06 kernel: [ 4771.257429] [ 902] 0 902 9561 525 22 3 0 0 systemd-logind
Mar 24 13:09:13 search06 kernel: [ 4771.257431] [ 907] 104 907 66503 679 30 3 0 0 rsyslogd
Mar 24 13:09:13 search06 kernel: [ 4771.257432] [ 911] 0 911 69251 1184 28 5 0 0 snapd
Mar 24 13:09:13 search06 kernel: [ 4771.257434] [ 972] 0 972 3344 37 11 3 0 0 mdadm
Mar 24 13:09:13 search06 kernel: [ 4771.257436] [ 978] 0 978 16380 487 35 3 0 -1000 sshd
Mar 24 13:09:13 search06 kernel: [ 4771.257437] [ 993] 0 993 4914 1991 14 5 0 0 filebeat
Mar 24 13:09:13 search06 kernel: [ 4771.257439] [ 1003] 112 1003 19540547 1967765 5214 58 0 0 java
Mar 24 13:09:13 search06 kernel: [ 4771.257441] [ 1009] 0 1009 1306 30 8 3 0 0 iscsid
Mar 24 13:09:13 search06 kernel: [ 4771.257442] [ 1010] 0 1010 1431 877 8 3 0 -17 iscsid
Mar 24 13:09:13 search06 kernel: [ 4771.257444] [ 1030] 0 1030 69296 1122 38 3 0 0 polkitd
Mar 24 13:09:13 search06 kernel: [ 4771.257446] [ 1037] 0 1037 140928 376 58 3 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.257447] [ 1063] 0 1063 214809 1122 70 4 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.257449] [ 1134] 0 1134 36396 351 44 3 0 0 nginx
Mar 24 13:09:13 search06 kernel: [ 4771.257450] [ 1138] 33 1138 39407 505 52 3 0 0 nginx
Mar 24 13:09:13 search06 kernel: [ 4771.257452] [ 1139] 33 1139 39407 629 53 3 0 0 nginx
Mar 24 13:09:13 search06 kernel: [ 4771.257453] [ 1140] 33 1140 39407 805 52 3 0 0 nginx
Mar 24 13:09:13 search06 kernel: [ 4771.257455] [ 1142] 33 1142 39407 722 53 3 0 0 nginx
Mar 24 13:09:13 search06 kernel: [ 4771.257456] [ 1171] 0 1171 4868 153 14 3 0 0 irqbalance
Mar 24 13:09:13 search06 kernel: [ 4771.257458] [ 1180] 0 1180 3985 396 13 3 0 0 agetty
Mar 24 13:09:13 search06 kernel: [ 4771.257459] [ 1182] 0 1182 148263 2376 63 4 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.257461] [ 1209] 0 1209 197559 565 69 4 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.257462] [ 1237] 111 1237 19017 1356 37 3 0 0 snmpd
Mar 24 13:09:13 search06 kernel: [ 4771.257464] [ 1246] 0 1246 232151 356 73 4 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.257465] [ 1286] 0 1286 340573 2135 106 4 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.257466] [ 1325] 0 1325 166416 685 66 4 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.257473] Out of memory: Kill process 1003 (java) score 550 or sacrifice child
Mar 24 13:09:13 search06 kernel: [ 4771.257615] Killed process 1003 (java) total-vm:78162188kB, anon-rss:7865052kB, file-rss:6008kB
Mar 24 13:09:13 search06 kernel: [ 4771.273077] systemd-journal invoked oom-killer: gfp_mask=0x24280ca, order=0, oom_score_adj=0
Mar 24 13:09:13 search06 kernel: [ 4771.273080] systemd-journal cpuset=/ mems_allowed=0
Mar 24 13:09:13 search06 kernel: [ 4771.273085] CPU: 3 PID: 401 Comm: systemd-journal Not tainted 4.4.0-67-generic #88-Ubuntu
Mar 24 13:09:13 search06 kernel: [ 4771.273087] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
Mar 24 13:09:13 search06 kernel: [ 4771.273088] 0000000000000286 000000003ba6b27c ffff8800f28bfaf8 ffffffff813f86d3
Mar 24 13:09:13 search06 kernel: [ 4771.273090] ffff8800f28bfcb0 ffff880370c9f2c0 ffff8800f28bfb68 ffffffff8120b24e
Mar 24 13:09:13 search06 kernel: [ 4771.273092] ffffffff8113f60a ffff8800f28bfb98 ffffffff811a6e0d 0000000100000001
Mar 24 13:09:13 search06 kernel: [ 4771.273095] Call Trace:
Mar 24 13:09:13 search06 kernel: [ 4771.273100] [] dump_stack+0x63/0x90
Mar 24 13:09:13 search06 kernel: [ 4771.273103] [] dump_header+0x5a/0x1c5
Mar 24 13:09:13 search06 kernel: [ 4771.273108] [] ? __delayacct_freepages_end+0x2a/0x30
Mar 24 13:09:13 search06 kernel: [ 4771.273111] [] ? do_try_to_free_pages+0x2ed/0x410
Mar 24 13:09:13 search06 kernel: [ 4771.273114] [] oom_kill_process+0x202/0x3c0
Mar 24 13:09:13 search06 kernel: [ 4771.273117] [] out_of_memory+0x219/0x460
Mar 24 13:09:13 search06 kernel: [ 4771.273119] [] __alloc_pages_slowpath.constprop.88+0x938/0xad0
Mar 24 13:09:13 search06 kernel: [ 4771.273121] [] __alloc_pages_nodemask+0x286/0x2a0
Mar 24 13:09:13 search06 kernel: [ 4771.273124] [] alloc_pages_vma+0xad/0x250
Mar 24 13:09:13 search06 kernel: [ 4771.273128] [] handle_mm_fault+0x1491/0x1820
Mar 24 13:09:13 search06 kernel: [ 4771.273130] [] ? vma_merge+0x22e/0x330
Mar 24 13:09:13 search06 kernel: [ 4771.273133] [] __do_page_fault+0x197/0x400
Mar 24 13:09:13 search06 kernel: [ 4771.273135] [] do_page_fault+0x22/0x30
Mar 24 13:09:13 search06 kernel: [ 4771.273138] [] page_fault+0x28/0x30
Mar 24 13:09:13 search06 kernel: [ 4771.273156] Mem-Info:
Mar 24 13:09:13 search06 kernel: [ 4771.273159] active_anon:114711 inactive_anon:2124 isolated_anon:0
Mar 24 13:09:13 search06 kernel: [ 4771.273159] active_file:64 inactive_file:18 isolated_file:0
Mar 24 13:09:13 search06 kernel: [ 4771.273159] unevictable:1932437 dirty:1 writeback:0 unstable:0
Mar 24 13:09:13 search06 kernel: [ 4771.273159] slab_reclaimable:27304 slab_unreclaimable:1449266
Mar 24 13:09:13 search06 kernel: [ 4771.273159] mapped:6509 shmem:2211 pagetables:6742 bounce:0
Mar 24 13:09:13 search06 kernel: [ 4771.273159] free:23323 free_pcp:0 free_cma:0
Mar 24 13:09:13 search06 kernel: [ 4771.273165] Node 0 DMA free:15872kB min:72kB low:88kB high:108kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15996kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:32kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Mar 24 13:09:13 search06 kernel: [ 4771.273172] lowmem_reserve[]: 0 3833 13943 13943 13943
Mar 24 13:09:13 search06 kernel: [ 4771.273181] Node 0 DMA32 free:58904kB min:18560kB low:23200kB high:27840kB active_anon:103952kB inactive_anon:1372kB active_file:212kB inactive_file:12kB unevictable:2554900kB isolated(anon):0kB isolated(file):0kB present:4046236kB managed:3965580kB mlocked:2554900kB dirty:0kB writeback:0kB mapped:8548kB shmem:1516kB slab_reclaimable:21960kB slab_unreclaimable:1192744kB kernel_stack:3792kB pagetables:7444kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1620 all_unreclaimable? yes
Mar 24 13:09:13 search06 kernel: [ 4771.273186] lowmem_reserve[]: 0 0 10110 10110 10110
Mar 24 13:09:13 search06 kernel: [ 4771.273200] Node 0 Normal free:18516kB min:48948kB low:61184kB high:73420kB active_anon:354892kB inactive_anon:7124kB active_file:44kB inactive_file:60kB unevictable:5174848kB isolated(anon):0kB isolated(file):0kB present:10616832kB managed:10352916kB mlocked:5174848kB dirty:4kB writeback:0kB mapped:17488kB shmem:7328kB slab_reclaimable:87256kB slab_unreclaimable:4604288kB kernel_stack:10928kB pagetables:19524kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:780 all_unreclaimable? yes
Mar 24 13:09:13 search06 kernel: [ 4771.273204] lowmem_reserve[]: 0 0 0 0 0
Mar 24 13:09:13 search06 kernel: [ 4771.273206] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15872kB
Mar 24 13:09:13 search06 kernel: [ 4771.273214] Node 0 DMA32: 8975*4kB (UME) 2060*8kB (UME) 400*16kB (UM) 10*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 59100kB
Mar 24 13:09:13 search06 kernel: [ 4771.273221] Node 0 Normal: 4215*4kB (UMH) 14*8kB (H) 30*16kB (UH) 23*32kB (UMH) 4*64kB (H) 3*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18828kB
Mar 24 13:09:13 search06 kernel: [ 4771.273229] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Mar 24 13:09:13 search06 kernel: [ 4771.273229] 8385 total pagecache pages
Mar 24 13:09:13 search06 kernel: [ 4771.273231] 0 pages in swap cache
Mar 24 13:09:13 search06 kernel: [ 4771.273232] Swap cache stats: add 0, delete 0, find 0/0
Mar 24 13:09:13 search06 kernel: [ 4771.273232] Free swap = 0kB
Mar 24 13:09:13 search06 kernel: [ 4771.273233] Total swap = 0kB
Mar 24 13:09:13 search06 kernel: [ 4771.273234] 3669766 pages RAM
Mar 24 13:09:13 search06 kernel: [ 4771.273235] 0 pages HighMem/MovableOnly
Mar 24 13:09:13 search06 kernel: [ 4771.273235] 86166 pages reserved
Mar 24 13:09:13 search06 kernel: [ 4771.273236] 0 pages cma reserved
Mar 24 13:09:13 search06 kernel: [ 4771.273237] 0 pages hwpoisoned
Mar 24 13:09:13 search06 kernel: [ 4771.273237] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
Mar 24 13:09:13 search06 kernel: [ 4771.273241] [ 401] 0 401 10970 1115 23 3 0 0 systemd-journal
Mar 24 13:09:13 search06 kernel: [ 4771.273243] [ 446] 0 446 42126 274 19 3 0 0 lvmetad
Mar 24 13:09:13 search06 kernel: [ 4771.273245] [ 453] 0 453 11301 864 24 3 0 -1000 systemd-udevd
Mar 24 13:09:13 search06 kernel: [ 4771.273247] [ 699] 100 699 25081 496 20 3 0 0 systemd-timesyn
Mar 24 13:09:13 search06 kernel: [ 4771.273249] [ 827] 113 827 350358 56458 268 210 0 0 node
Mar 24 13:09:13 search06 kernel: [ 4771.273250] [ 830] 0 830 6511 472 18 3 0 0 atd
Mar 24 13:09:13 search06 kernel: [ 4771.273252] [ 835] 107 835 13129 604 29 3 0 -900 dbus-daemon
Mar 24 13:09:13 search06 kernel: [ 4771.273253] [ 894] 0 894 1100 294 8 3 0 0 acpid
Mar 24 13:09:13 search06 kernel: [ 4771.273255] [ 896] 0 896 10176 662 25 3 0 0 cron
Mar 24 13:09:13 search06 kernel: [ 4771.273257] [ 898] 0 898 71371 526 41 3 0 0 accounts-daemon
Mar 24 13:09:13 search06 kernel: [ 4771.273258] [ 900] 0 900 40226 273 15 4 0 0 lxcfs
Mar 24 13:09:13 search06 kernel: [ 4771.273260] [ 902] 0 902 9561 525 22 3 0 0 systemd-logind
Mar 24 13:09:13 search06 kernel: [ 4771.273261] [ 907] 104 907 66503 581 30 3 0 0 rsyslogd
Mar 24 13:09:13 search06 kernel: [ 4771.273263] [ 911] 0 911 69251 1184 28 5 0 0 snapd
Mar 24 13:09:13 search06 kernel: [ 4771.273264] [ 972] 0 972 3344 37 11 3 0 0 mdadm
Mar 24 13:09:13 search06 kernel: [ 4771.273266] [ 978] 0 978 16380 487 35 3 0 -1000 sshd
Mar 24 13:09:13 search06 kernel: [ 4771.273268] [ 993] 0 993 4914 1991 14 5 0 0 filebeat
Mar 24 13:09:13 search06 kernel: [ 4771.273270] [ 1440] 112 1003 19540547 1973609 5214 58 0 0 java
Mar 24 13:09:13 search06 kernel: [ 4771.273281] [ 1009] 0 1009 1306 30 8 3 0 0 iscsid
Mar 24 13:09:13 search06 kernel: [ 4771.273283] [ 1010] 0 1010 1431 877 8 3 0 -17 iscsid
Mar 24 13:09:13 search06 kernel: [ 4771.273284] [ 1030] 0 1030 69296 1122 38 3 0 0 polkitd
Mar 24 13:09:13 search06 kernel: [ 4771.273286] [ 1037] 0 1037 140928 376 58 3 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.273287] [ 1063] 0 1063 214809 1122 70 4 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.273289] [ 1134] 0 1134 36396 351 44 3 0 0 nginx
Mar 24 13:09:13 search06 kernel: [ 4771.273290] [ 1138] 33 1138 39407 505 52 3 0 0 nginx
Mar 24 13:09:13 search06 kernel: [ 4771.273292] [ 1139] 33 1139 39407 629 53 3 0 0 nginx
Mar 24 13:09:13 search06 kernel: [ 4771.273293] [ 1140] 33 1140 39407 805 52 3 0 0 nginx
Mar 24 13:09:13 search06 kernel: [ 4771.273295] [ 1142] 33 1142 39407 722 53 3 0 0 nginx
Mar 24 13:09:13 search06 kernel: [ 4771.273296] [ 1171] 0 1171 4868 153 14 3 0 0 irqbalance
Mar 24 13:09:13 search06 kernel: [ 4771.273298] [ 1180] 0 1180 3985 396 13 3 0 0 agetty
Mar 24 13:09:13 search06 kernel: [ 4771.273299] [ 1182] 0 1182 148263 2376 63 4 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.273301] [ 1209] 0 1209 197559 565 69 4 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.273302] [ 1237] 111 1237 19017 1356 37 3 0 0 snmpd
Mar 24 13:09:13 search06 kernel: [ 4771.273304] [ 1246] 0 1246 232151 356 73 4 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.273305] [ 1286] 0 1286 340573 2135 106 4 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.273307] [ 1325] 0 1325 166416 685 66 4 0 0 lwsmd
Mar 24 13:09:13 search06 kernel: [ 4771.273314] Out of memory: Kill process 1440 (java) score 552 or sacrifice child
Mar 24 13:09:13 search06 kernel: [ 4771.273352] Killed process 1440 (java) total-vm:78162188kB, anon-rss:7871044kB, file-rss:23392kB
Mar 24 13:09:13 search06 systemd[1]: elasticsearch.service: Main process exited, code=killed, status=9/KILL
Mar 24 13:09:13 search06 systemd[1]: elasticsearch.service: Unit entered failed state.
Mar 24 13:09:13 search06 systemd[1]: elasticsearch.service: Failed with result 'signal'.
```
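
For anyone reading a dump like the one above, here is a minimal sketch (not part of the original report) that pulls the per-process table out of the syslog lines and ranks processes by resident memory. It assumes 4 kB pages, which is typical for x86-64 but not stated in the log; the names `PAGE_KB`, `ROW` and `top_rss` are made up for this example.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages; the rss column is counted in pages

# Matches process-table rows such as:
# "... kernel: [ 4771.273270] [ 1440] 112 1003 19540547 1973609 5214 58 0 0 java"
# Columns: [pid] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
ROW = re.compile(
    r"\]\s+\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+"
    r"(?P<total_vm>\d+)\s+(?P<rss>\d+)\s+\d+\s+\d+\s+\d+\s+"
    r"(?P<adj>-?\d+)\s+(?P<name>\S+)\s*$"
)


def top_rss(lines, n=3):
    """Return the n largest (name, pid, rss_kB) rows from an OOM process table."""
    rows = []
    for line in lines:
        m = ROW.search(line)
        if m:  # header and non-table lines simply do not match
            rows.append((m["name"], int(m["pid"]), int(m["rss"]) * PAGE_KB))
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]
```

Calling `top_rss(open("/var/log/syslog"))` on this dump should put `java` first: 1973609 pages × 4 kB = 7894436 kB, which equals the anon-rss (7871044 kB) plus file-rss (23392 kB) on the "Killed process" line. The table only covers userspace processes; kernel-side figures such as `slab_unreclaimable` appear in the per-zone lines instead.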

All 4 comments

@simenflatby

Please ask questions like these in the forum instead: https://discuss.elastic.co/

The GitHub issues list is reserved for bug reports and feature requests only. If you manage to determine the steps to reproduce and confirm it is a bug in Elasticsearch, then feel free to raise an issue here.

thanks

@simenflatby Hello, we've got the exact same issue: Elasticsearch is killed by the oom-killer, and the only difference for us is the kernel version too. I was wondering if you found a solution?

@vmarchaud Yes, I have solved it. Have a look here: https://discuss.elastic.co/t/out-of-memory-invoked-oom-killer/80795

@simenflatby I ran into the same problem as you. As you described, removing the CD-ROM from the virtual machine solves it. Thank you~
