Core: Inspecting the managed heap in a dotnet core application in Linux

Created on 23 Oct 2018 · 13 Comments · Source: dotnet/core

Hi there,

I want to inspect the memory of an application that is consuming more memory than expected and run a simple !dumpheap to figure out which types of objects are piling up.

Please check this self-contained example. It builds a Docker image with all the tools already installed, then creates a .NET Core 2.1 application that runs in a continuous loop, printing a message every second, and uses it as the entry point:

FROM microsoft/dotnet:2.1-sdk
WORKDIR /app
RUN apt-get update && apt-get install curl -y && \ 
    apt-get install gpg -y && \
    apt-get install apt-transport-https -y && \
    curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.gpg && \
    mv microsoft.gpg /etc/apt/trusted.gpg.d/microsoft.gpg && \
    sh -c 'echo "deb [arch=amd64] https://packages.microsoft.com/repos/microsoft-ubuntu-xenial-prod xenial main" > /etc/apt/sources.list.d/microsoft.list' && \
    apt-get update && \
    apt-get install procdump -y && \
    apt-get install lldb-3.9 -y
RUN dotnet tool install -g dotnet-symbol && \
    export PATH="$PATH:/root/.dotnet/tools" && \
    dotnet symbol coredump && \
    dotnet new console && \
    echo 'using System; namespace app { class Program { static void Main(string[] args) { while(true) { Console.WriteLine($"{DateTime.Now:HH:mm:ss} Hello!"); System.Threading.Thread.Sleep(1000);} } } } '  > Program.cs && \ 
    dotnet build -c Release
WORKDIR /app/bin/Release/netcoreapp2.1
ENTRYPOINT [ "dotnet", "app.dll" ]

Then you need to run it with ptrace support and connect to it:

$ docker build --no-cache --rm -t lldb-poc .
$ docker run --cap-add=SYS_PTRACE --security-opt seccomp:unconfined -d --name lldb-poc-c lldb-poc 
$ docker exec -it --privileged lldb-poc-c /bin/bash

Once in, the first thing I tried was to take a memory dump and open it with LLDB. The application has PID 1 (as usual when it is the entry point), and you can take the dump using procdump, which is also installed:

# procdump --pid 1

ProcDump v1.0.1 - Sysinternals process dump utility
Copyright (C) 2017 Microsoft Corporation. All rights reserved. Licensed under the MIT license.
Mark Russinovich, Mario Hewardt, John Salem, Javid Habibi
Monitors a process and writes a dump file when the process exceeds the
specified criteria.

Process:                dotnet (1)
CPU Threshold:          n/a
Commit Threshold:       n/a
Threshold Seconds:      10
Number of Dumps:        1

Press Ctrl-C to end monitoring without terminating the process.

[07:31:51 - INFO]: Timed:
[07:31:54 - INFO]: Core dump 1 generated: dotnet_time_2018-10-22_07:31:51.1

Now that the dump is taken, I try to open it in LLDB...

# lldb-3.9 `which dotnet` -c dotnet_time_2018-10-22_07\:31\:51.1
(lldb) target create "/usr/bin/dotnet" --core "dotnet_time_2018-10-22_07:31:51.1"

But it hangs forever: the (lldb) prompt never comes back. I tried several variants, like loading the SOS plugin first and running target create afterwards, but the outcome is always the same.

I also tried to attach to the running process, but no luck:

# lldb-3.9 `which dotnet`
(lldb) target create "/usr/bin/dotnet"
Current executable set to '/usr/bin/dotnet' (x86_64).
(lldb) plugin load /usr/share/dotnet/shared/Microsoft.NETCore.App/2.1.5/libsosplugin.so
(lldb) process attach -p 1
error: attach failed: unable to attach

Could you please give me some guidance on how to explore the process's memory with LLDB on Linux?

All 13 comments

@mikem8361 @noahfalk @tommcdon can you please help here?

You need to be superuser to attach with lldb: sudo -E lldb-3.9

I am root; isn't that enough?

Same outcome:

# sudo -E lldb-3.9 `which dotnet`
(lldb) target create "/usr/bin/dotnet"
Current executable set to '/usr/bin/dotnet' (x86_64).
(lldb) process attach --pid 1
error: attach failed: unable to attach
(lldb)

I'm not sure what is going on. I'll have to follow your repro steps, and it may be a while before I can get to it. This looks like an lldb setup/install/configuration problem more than a .NET Core or SOS problem, but I do want to document anything helpful about lldb itself.

Thanks for looking into this.

It is indeed strange, and the problem may boil down to the microsoft/dotnet image itself. For example, despite all the flags and capabilities indicated, I still cannot do things like these:

# echo 0 | tee /proc/sys/kernel/yama/ptrace_scope
tee: /proc/sys/kernel/yama/ptrace_scope: Read-only file system
0

# sysctl -w kernel.yama.ptrace_scope=0
sysctl: setting key "kernel.yama.ptrace_scope": Read-only file system
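Even though /proc/sys is read-only inside the container, the current values can still be read. The following is a minimal sketch, not something from the thread, that reports the Yama ptrace scope and whether the current shell holds CAP_SYS_PTRACE (capability number 19 in linux/capability.h). The helper names has_cap and ptrace_report are made up for illustration.

```shell
#!/bin/sh
# Sketch: report two settings that commonly block ptrace attach in a container.
# has_cap and ptrace_report are illustrative names, not part of any tool.

CAP_SYS_PTRACE=19   # capability number from linux/capability.h

# has_cap HEXMASK BIT: succeeds when bit BIT is set in the hexadecimal mask.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# ptrace_report [PID]: print the Yama scope and whether PID holds CAP_SYS_PTRACE.
# Defaults to the calling shell, since the debugger inherits its capabilities.
ptrace_report() {
    pid="${1:-self}"
    scope_file=/proc/sys/kernel/yama/ptrace_scope
    if [ -r "$scope_file" ]; then
        echo "yama ptrace_scope: $(cat "$scope_file")"
    else
        echo "yama ptrace_scope: not available"
    fi
    # CapEff in /proc/PID/status is the effective capability set as a hex mask.
    mask=$(awk '/^CapEff:/ {print $2}' "/proc/$pid/status" 2>/dev/null)
    if [ -z "$mask" ]; then
        echo "CapEff: not available"
    elif has_cap "$mask" "$CAP_SYS_PTRACE"; then
        echo "CAP_SYS_PTRACE: present (CapEff=$mask)"
    else
        echo "CAP_SYS_PTRACE: missing (CapEff=$mask)"
    fi
}

ptrace_report
```

If the capability is reported as present and attach still fails, the cause probably lies elsewhere, for example in the debugger build itself.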

I am investigating, although my knowledge of Linux as a developer is pretty much none. Judging by things like this, it does not seem easy to fix: https://community.c9.io/t/cant-attach-gdb-to-process-ptrace-readonly/6343

Being unable to do even a basic memory analysis on a production containerized dotnet core app is definitely a downer, especially nowadays, when there are a lot of gen 2 collectable objects like RabbitMQ subscriptions, connected WebSockets, etc. You are probably right that this is not a .NET Core or SOS problem, although it has a potential impact on its perception as a production-ready tool (at least for Linux containers!). Let me know of any findings, or whether you definitively discard it as a relevant issue for this repo, and I will continue my pilgrimage repo by repo.

Cheers.

It may be the microsoft/dotnet:2.1-sdk Docker image you are starting with, in which case it is something we (.NET Core) need to look at.

Any update? What are the next steps?

Adding @tommcdon
It sounded like this was on @mikem8361's todo list, but he is on vacation right now, so there won't be an update from him for a while.

Hi there, any update on this? :) Thanks

This issue looks like a dup of https://github.com/dotnet/diagnostics/issues/79, to which I have added some comments. If they are the same issue, this one should be closed.

I haven't tried creating/using the Docker image yet; I hope to do that this week.

I don't think this is a dup. The referenced issue was solved by using createdump, but here I cannot attach to a running dotnet process.
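For readers who only need a dump, the createdump route mentioned above could be sketched as follows. This assumes createdump ships next to the shared runtime under /usr/share/dotnet (the same directory the SOS plugin was loaded from earlier in the thread) and that it accepts -f for the output name. find_runtime_file and DOTNET_ROOT_DIR are hypothetical names, and the sketch only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: locate a runtime-bundled tool without hardcoding the runtime version.
# find_runtime_file is an illustrative helper, not part of the .NET tooling.

# find_runtime_file ROOT NAME: print the path of NAME inside the highest-versioned
# Microsoft.NETCore.App directory under ROOT; fail if there is no match.
find_runtime_file() {
    found=$(ls "$1"/shared/Microsoft.NETCore.App/*/"$2" 2>/dev/null | sort -V | tail -n 1)
    [ -n "$found" ] && printf '%s\n' "$found"
}

# Assumption: the runtime lives under /usr/share/dotnet unless overridden.
DOTNET_ROOT_DIR="${DOTNET_ROOT_DIR:-/usr/share/dotnet}"

if createdump=$(find_runtime_file "$DOTNET_ROOT_DIR" createdump); then
    # Only print the command; the -f flag and %d placeholder are assumptions
    # to verify against the createdump usage text of your runtime version.
    echo "would run: $createdump -f /tmp/coredump.%d 1"
else
    echo "createdump not found under $DOTNET_ROOT_DIR"
fi
```

The same helper can locate libsosplugin.so, for example `find_runtime_file /usr/share/dotnet libsosplugin.so`, instead of spelling out the 2.1.5 directory by hand.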

This is definitely not a duplicate. FWIW I'm leaving my results here for those whose google search leads them here.

I noticed that a default install of lldb (apt-get -y install lldb) works in combination with --cap-add=SYS_PTRACE when the container is based on mcr.microsoft.com/dotnet/core/runtime:3.0-buster-slim, but fails with the frustrating and uninformative "could not attach" error message when the container is based on mcr.microsoft.com/dotnet/core/runtime:2.2-stretch-slim. It turns out that in Buster apt-get install lldb installs version 7.0, whereas in Stretch it installs version 3.8. Trying newer versions available for Stretch, I found that none of 3.8, 3.9, and 4.0 can attach in a container, but 7.0 can, although installing it pulls in many dependencies and produces a very fat image (~500M extra). If you aren't able or willing to upgrade to .NET Core 3.0, changing the dockerfile to use

RUN apt-get -y install lldb-7
RUN ln /usr/bin/lldb-7 /usr/bin/lldb

may be an acceptable workaround.
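To confirm which lldb a container actually ends up with, a small check along these lines may help. It is only a sketch: lldb_major and the LLDB variable are illustrative names, and the threshold of 7 reflects the observation above (3.8, 3.9 and 4.0 failed, 7.0 worked); versions in between were not tested in this thread.

```shell
#!/bin/sh
# Sketch: warn when the installed lldb is older than the version observed to work.
# lldb_major is an illustrative helper, not part of lldb.

# lldb_major TEXT: print the major version from text such as "lldb version 7.0.1".
lldb_major() {
    printf '%s\n' "$1" | sed -n 's/.*version \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

lldb_bin="${LLDB:-lldb}"   # override with LLDB=lldb-3.9, for example

if command -v "$lldb_bin" >/dev/null 2>&1; then
    major=$(lldb_major "$("$lldb_bin" --version 2>/dev/null)")
    if [ "${major:-0}" -ge 7 ]; then
        echo "$lldb_bin major version $major: attach was reported to work with 7.0"
    else
        echo "$lldb_bin major version ${major:-unknown}: older than 7, attach may fail in a container"
    fi
else
    echo "$lldb_bin not found"
fi
```

Running it once in the image build, or from the docker exec shell, shows whether the lldb-7 package was really picked up.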
