Server: GPU memory leak on HTML producer

Created on 24 Mar 2020 · 25 Comments · Source: CasparCG/server

Expected behaviour

VRAM usage should be freed upon closing of HTML templates or HTML producers.

Current behaviour

GPU memory usage keeps increasing when removing and adding HTML templates until VRAM is full and starts using shared video memory, slowing down rendering.


Steps to reproduce

  1. Specify <html><enable-gpu>true</enable-gpu></html> in config
  2. CG 1-1 ADD 0 "template1" 1 "data"
  3. CG 1-1 STOP
  4. Repeat steps 2 and 3 indefinitely.
  5. Watch casparcg.exe GPU memory usage go up in task manager.
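
The repro loop in steps 2 to 4 can be automated. Below is a minimal sketch, assuming the server listens for AMCP on localhost port 5250 and that commands are terminated with `\r\n`. The function names, the one-second delay, and the iteration count are hypothetical choices for illustration, not part of the original report.

```python
import socket
import time

AMCP_HOST = "localhost"  # assumption: server runs on the same machine
AMCP_PORT = 5250         # assumption: AMCP TCP port from the server config


def build_repro_commands(iterations, template="template1", data="data"):
    """Return the AMCP lines for steps 2 and 3, repeated `iterations` times."""
    commands = []
    for _ in range(iterations):
        commands.append(f'CG 1-1 ADD 0 "{template}" 1 "{data}"\r\n')
        commands.append("CG 1-1 STOP\r\n")
    return commands


def run_repro(iterations, delay=1.0, host=AMCP_HOST, port=AMCP_PORT):
    """Send the add/stop cycle to a running server, pausing between commands."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.settimeout(0.2)
        for line in build_repro_commands(iterations):
            conn.sendall(line.encode("utf-8"))
            time.sleep(delay)
            try:
                conn.recv(65536)  # discard whatever the server replied
            except socket.timeout:
                pass

# Example (needs a running server): run_repro(100)
```

Watch GPU memory in Task Manager while it runs, as in step 5.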

Environment

  • Commit: 6985aac9cd164dea5c1049272d005caace55b2c3
  • Server version: 2.3 dev
  • Operating system: Windows 10

Screenshots

Timelapse: https://img.mehir.ar/template.gif

Labels: html, type/bug

All 25 comments

Hi @rrebuffo, thanks for reporting this. I'm just curious how long it takes until VRAM is full on your hardware. How many add/stop iterations do you run before VRAM is full?

Best regards,
Armin

Every template added increases VRAM usage by around 16MB.
It fills up every few hours when developing templates (testing small changes more than a hundred times). The system gets really slow when total GPU memory usage is around 25% above dedicated GPU memory (my card has 2GB of VRAM; it gets slower approaching 1900MB and becomes laggy system-wide above 2600MB, clearly falling back to shared memory).
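
To put those numbers in perspective, here is a rough estimate of how many template loads it takes to fill the card. It is only a sketch: it assumes every load leaks a fixed amount and that nothing else is using VRAM (baseline of 0), which is an idealization. The function name is hypothetical.

```python
import math


def iterations_until_full(capacity_mb, per_template_mb, baseline_mb=0):
    """Number of template loads needed to reach `capacity_mb`,
    assuming a constant leak per load (an idealization)."""
    return math.ceil((capacity_mb - baseline_mb) / per_template_mb)


# 2GB card (2048MB), 16MB leaked per template
print(iterations_until_full(2048, 16))  # 128
# Slowdown point reported around 1900MB
print(iterations_until_full(1900, 16))  # 119
```

Either way it is on the order of a hundred loads, which fits "more than a hundred" small test changes in a session.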

@rrebuffo, thanks for the additional information. I have put the issue into the v2.3.0 LTS milestone.

@dotarmin is there an announcement somewhere about the LTS version? (sorry for hijacking the issue)

@dimitry-ishenko, it will be announced very soon :)

I'm unable to reproduce this issue.
Maybe it is template related? Or try updating the graphics driver, etc.

@rrebuffo, can you provide us with the template you're using to reproduce this error?

Any template would do. I just checked with PLAY 1-1 [HTML] google.com
Same problem. Same 16MB increases.
I can also point out that this happens on two different machines with different OS, driver versions and hardware.

On a fresh copy of the latest build, with casparcg.config edited only to add <html><enable-gpu>true</enable-gpu></html>, run from casparcg_auto_restart.bat:

[2020-04-11 05:28:47.679] [info]    ############################################################################
[2020-04-11 05:28:47.680] [info]    CasparCG Server is distributed by the Swedish Broadcasting Corporation (SVT)
[2020-04-11 05:28:47.680] [info]    under the GNU General Public License GPLv3 or higher.
[2020-04-11 05:28:47.680] [info]    Please see LICENSE.TXT for details.
[2020-04-11 05:28:47.680] [info]    http://www.casparcg.com/
[2020-04-11 05:28:47.680] [info]    ############################################################################
[2020-04-11 05:28:47.680] [info]    Starting CasparCG Video and Graphics Playout Server 2.3.0 4176a9b1 Dev
[2020-04-11 05:28:48.270] [info]    Initializing OpenGL Device.
[2020-04-11 05:28:48.279] [info]    Initialized OpenGL 4.5.0 NVIDIA 441.66 NVIDIA Corporation
[2020-04-11 05:28:48.350] [info]    D3D11: Selected adapter: NVIDIA GeForce GTX 960
[2020-04-11 05:28:48.350] [info]    D3D11: Selected feature level: 45312
[2020-04-11 05:28:48.359] [info]    Initialized ffmpeg module.
[2020-04-11 05:28:48.359] [info]    Initialized oal module.
[2020-04-11 05:28:48.360] [info]    Initialized decklink module.
[2020-04-11 05:28:48.360] [info]    Initialized screen module.
[2020-04-11 05:28:48.360] [info]    Initialized newtek module.
[2020-04-11 05:28:48.412] [info]    Initialized html module.
[2020-04-11 05:28:48.685] [info]    Initialized flash module.
[2020-04-11 05:28:48.687] [info]    Initialized bluefish module.
[2020-04-11 05:28:48.687] [info]    Initialized image module.
[2020-04-11 05:28:48.687] [info]    "C:/CasparCG\server_2.3_4176a9b1\casparcg.config":
[2020-04-11 05:28:48.687] [info]    -----------------------------------------
[2020-04-11 05:28:48.687] [info]    <?xml version="1.0" encoding="utf-8"?>
[2020-04-11 05:28:48.687] [info]    <configuration>
[2020-04-11 05:28:48.687] [info]       <paths>
[2020-04-11 05:28:48.687] [info]          <media-path>media/</media-path>
[2020-04-11 05:28:48.687] [info]          <log-path>log/</log-path>
[2020-04-11 05:28:48.687] [info]          <data-path>data/</data-path>
[2020-04-11 05:28:48.687] [info]          <template-path>template/</template-path>
[2020-04-11 05:28:48.687] [info]       </paths>
[2020-04-11 05:28:48.687] [info]       <lock-clear-phrase>secret</lock-clear-phrase>
[2020-04-11 05:28:48.687] [info]       <channels>
[2020-04-11 05:28:48.687] [info]          <channel>
[2020-04-11 05:28:48.687] [info]             <video-mode>720p5000</video-mode>
[2020-04-11 05:28:48.687] [info]             <consumers>
[2020-04-11 05:28:48.687] [info]                <screen/>
[2020-04-11 05:28:48.687] [info]                <system-audio/>
[2020-04-11 05:28:48.687] [info]             </consumers>
[2020-04-11 05:28:48.687] [info]          </channel>
[2020-04-11 05:28:48.687] [info]       </channels>
[2020-04-11 05:28:48.687] [info]       <controllers>
[2020-04-11 05:28:48.687] [info]          <tcp>
[2020-04-11 05:28:48.687] [info]             <port>5250</port>
[2020-04-11 05:28:48.687] [info]             <protocol>AMCP</protocol>
[2020-04-11 05:28:48.687] [info]          </tcp>
[2020-04-11 05:28:48.687] [info]       </controllers>
[2020-04-11 05:28:48.687] [info]       <amcp>
[2020-04-11 05:28:48.687] [info]          <media-server>
[2020-04-11 05:28:48.687] [info]             <host>localhost</host>
[2020-04-11 05:28:48.687] [info]             <port>8000</port>
[2020-04-11 05:28:48.687] [info]          </media-server>
[2020-04-11 05:28:48.687] [info]       </amcp>
[2020-04-11 05:28:48.687] [info]       <html>
[2020-04-11 05:28:48.687] [info]          <enable-gpu>true</enable-gpu>
[2020-04-11 05:28:48.687] [info]       </html>
[2020-04-11 05:28:48.687] [info]    </configuration>
[2020-04-11 05:28:48.687] [info]    -----------------------------------------
[2020-04-11 05:28:48.707] [info]    Initialized OpenGL Accelerated GPU Image Mixer for channel 1
[2020-04-11 05:28:48.709] [info]    video_channel[1|720p5000] Successfully Initialized.
[2020-04-11 05:28:48.711] [info]    Screen consumer [1|720p5000] Initialized.
[2020-04-11 05:28:48.774] [info]    oal[1|720p5000] Initialized.
[2020-04-11 05:28:48.775] [info]    Initialized channels.
[2020-04-11 05:28:48.776] [info]    Initialized controllers.
[2020-04-11 05:28:48.777] [info]    Initialized osc.
[2020-04-11 05:29:02.698] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:05.306] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:05.515] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:06.306] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:06.614] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:07.178] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:07.474] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:08.082] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:08.394] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:09.194] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:09.474] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:17.290] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:17.594] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:18.322] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:18.574] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:22.738] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:23.054] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:23.882] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:24.094] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:24.994] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:25.294] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:26.074] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:26.374] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:52.874] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:53.194] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:53.314] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:53.594] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:53.738] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:53.954] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:54.130] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:54.434] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:54.522] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:54.794] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:54.890] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:55.154] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:55.250] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:55.534] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:55.634] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:55.914] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:55.962] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:56.294] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:56.314] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:56.594] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:56.674] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:56.974] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:58.018] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:58.374] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:58.506] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:58.814] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:59.010] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:59.294] [info]    html[google.com] Destroyed.
[2020-04-11 05:29:59.514] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:29:59.794] [info]    html[google.com] Destroyed.
[2020-04-11 05:30:00.018] [info]    Received message from Console: PLAY 1-1 [html] google.com\r\n
[2020-04-11 05:30:00.234] [info]    html[google.com] Destroyed.

The final VRAM usage after that is 278.108K
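
In the log above, every PLAY after the first is followed by a "Destroyed" line for the previous producer, so the producers themselves appear to be torn down. A small sketch for checking this on a longer log is below. The regular expressions are assumptions based only on the line format shown above, and the function name is hypothetical.

```python
import re

# Patterns derived from the log lines shown in this thread (assumed format).
PLAY_RE = re.compile(r"Received message from Console: PLAY \S+ \[html\]", re.IGNORECASE)
DESTROYED_RE = re.compile(r"html\[[^\]]*\] Destroyed\.")


def count_html_lifecycle(log_lines):
    """Return (plays, destroyed) counted over an iterable of log lines."""
    plays = 0
    destroyed = 0
    for line in log_lines:
        if PLAY_RE.search(line):
            plays += 1
        elif DESTROYED_RE.search(line):
            destroyed += 1
    return plays, destroyed

# Example: with open("log/caspar.log") as f: print(count_html_lifecycle(f))
# (the log file name here is a placeholder)
```

If the two counts differ by at most one (the producer still playing), the leak is probably not leaked producer objects but memory held elsewhere.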

@rrebuffo maybe it is system related now, because I tried it again with your same config and PLAY 1-1 [HTML] google.com. CasparCG is working normally.

I'm testing it on Win 10 x64
Dell T5500
Quadro 2000 with the latest driver (with the old driver CasparCG was crashing)

@rrebuffo can you test with <system-audio/> removed? I know we had issues with a memory leak when using system-audio but that should have been fixed as far as I can see.

Can you also try upgrading your drivers as advised above? 😊

Thanks

Tried updating to latest graphics driver with clean settings and also removing system audio, even though I had previously tested it with only one Decklink or one NDI consumer.
The only change is that the increments are now 8MB.
This is very frustrating; like I said, this happens on two different machines. Could it be the Windows installation? The only thing they have in common is the installation media. Windows is 18363.418 (I did not update at all).

Tested again on an old Windows 7 test installation and got the same results with 4176a9b1bcd03dd22061f30110b1664d8216c881 build and latest nvidia drivers.
Clean config file only modified with GPU enabled on HTML.
Playing out PLAY 1-1 [html] google.com
[Screenshot: Win7]
Memory load goes straight up to 100% after around 150 commands.
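
The per-command growth can also be measured instead of read off Task Manager. This is a sketch only, assuming an NVIDIA card with `nvidia-smi` on the PATH. It samples memory used on the whole GPU, not per process, so other applications will add noise. The function names are hypothetical.

```python
import subprocess


def parse_memory_used(output):
    """Parse nvidia-smi CSV output (one integer MiB value per GPU, per line)."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def read_memory_used_mib():
    """Query memory used on each GPU in MiB via nvidia-smi."""
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_memory_used(output)


def growth_per_command(samples_mib):
    """Average growth between consecutive samples taken after each command."""
    if len(samples_mib) < 2:
        return 0.0
    return (samples_mib[-1] - samples_mib[0]) / (len(samples_mib) - 1)
```

Take one sample after each PLAY and pass the list for a single GPU to `growth_per_command`; a steady positive value matches the behaviour described here.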

When you've gotten to 100%, does a CLEAR 1-1 free the VRAM or not?

No. It sits there until the server is shut down.
Neither CLEAR 1-1 nor CLEAR 1 have any effect.

OK could you see if GL GC clears it?

It goes down a couple MB but it's not freed.

Alright, I think that suggests this is a problem with CEF not freeing the memory.

Can you show what GL INFO reports and what the DIAG window looks like?

I need to try and reproduce this myself, and if I cannot I shall be back with a special build designed to try and narrow down the cause.
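
The diagnostic commands asked for in this exchange (CLEAR 1-1, GL GC, GL INFO) can be sent in one go and their replies collected. A minimal sketch follows, assuming an already connected AMCP socket and `\r\n` line endings. It reads a single chunk per reply, which may truncate a long multi-line response, and the function names are hypothetical.

```python
import socket


def send_amcp(conn, command, bufsize=65536):
    """Send one AMCP command and return the first chunk of the reply as text."""
    conn.sendall((command + "\r\n").encode("utf-8"))
    return conn.recv(bufsize).decode("utf-8", errors="replace")


def run_diagnostics(conn):
    """Clear the layer, ask for a GL garbage collection, then request GL info."""
    replies = {}
    for command in ("CLEAR 1-1", "GL GC", "GL INFO"):
        replies[command] = send_amcp(conn, command)
    return replies

# Example (needs a running server; host and port are assumptions):
# with socket.create_connection(("localhost", 5250), timeout=5) as conn:
#     for command, reply in run_diagnostics(conn).items():
#         print(command, "->", reply)
```

Comparing GPU memory before and after this sequence shows whether anything the server still tracks is actually being released.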

[Screenshot: baseline]
[Screenshot: past_100%]
Diag is not telling much because there's no other load on the system. UWP apps and Visual Studio Code become unusable, a simple scroll takes half a second to render. Also I'm testing with one less monitor (only one 1080p, usually I have this one and a 2160p one) and that helps with performance.

Is the second screenshot from when it has run out of memory?
I was wondering if perhaps we were leaking producers, and that might be shown in diag. And GL INFO will show the total memory we still have allocated/cached on the GPU (unless we have truly leaked some).

Yes it is.
I think the producers are cleaned up and they don't show up in diag (like the ffmpeg rtmp bug does) but clearly the memory from them is not being released.

Problem appears only if HTML GPU option is turned on in config.

Is it still present in 2.3.2?

Yes it is https://github.com/CasparCG/server/issues/1363
