Runtime: Making the info GC.GetMemoryInfo provides public

Created on 16 Jan 2019 · 38Comments · Source: dotnet/runtime

EDIT @stephentoub 3/22/2019. Added updated API from later in the thread:
```C#
namespace System
{
struct GCMemoryInfo
{
///

/// High Memory Load Threshold when the last GC occured
///

public int HighMemoryLoadThreshold { get; } // or HighMemoryLoadThresholdPercent? float/double? 0-1 vs 0-100?

    /// <summary>
    /// Memory Load when the last GC ocurred
    /// </summary>
    public int MemoryLoad { get; } // or MemoryLoadPercent? float/double? 0-1 vs 0-100?

    /// <summary>
    /// Total available memory for the GC to use when the last GC ocurred. By default this is the physical memory on the machine, but it may be customized by specifying a HardLimit.
    /// </summary>
    public long TotalAvailableMemory { get; }

    /// <summary>
    /// The total heap size when the last GC ocurred
    /// </summary>
    public long HeapSize { get; }

    /// <summary>
    /// The total fragmentation of the last GC ocurred
    /// </summary>
    public long Fragmentation { get; } // should we include "Bytes" in all names where appropriate?
}

class GC
{
    /// <summary>
    /// Get information about GC state at the time of the last GC. 
    /// </summary>
    public static GCMemoryInfo GetGCMemoryInfo();

    /// <summary>
    /// Get a count of the bytes allocated over the lifetime of the process.
    /// <param name="precise">If true, gather a precise number, otherwise gather a fairly count. Gathering a precise value triggers at a significant performance penalty.</param>
    /// </summary>
    public static long GetTotalBytesAllocated(bool precise);
}

}

Original proposal:

Expose the GC.GetMemoryInfo API publicly so things like asp.net core can use it to determine whether they should trim their caches and etc. Right now since it’s internal it’s only used by BCL. [#25617](https://github.com/dotnet/corefx/issues/25617) can be satisfied by this.

Currently this is the internal API's signature:

```csharp
// highMemLoadThreshold - what GC considers as "high memory load" which is usually 
// 90% of the totally physical memory but can be changed via a config. This is when
// GC would become more aggressive to do a full compacting GC to reduce heap size.
//
// totalPhysicalMem - what GC sees as total physical memory. If you are running in a 
// container this would be the physical memory limit you specify on the container.
//
// lastRecordedMemLoad - this was the memory load GC saw in the last GC. When the
// process is not running in a container this is the physical memory load of the machine;
// otherwise it's (the physical memory used by the process / totalPhysicalMem).
//
// lastRecordedHeapSize - this was the total heap size at the end of last GC.
// 
// lastRecordedFragmentation - this was the total fragmentation on the GC heap 
// at the end of last GC.
[MethodImplAttribute(MethodImplOptions.InternalCall)]
internal static extern void GetMemoryInfo(out uint highMemLoadThreshold,
                                            out ulong totalPhysicalMem,
                                            out uint lastRecordedMemLoad,
                                            // The next two are size_t
                                            out UIntPtr lastRecordedHeapSize,
                                            out UIntPtr lastRecordedFragmentation);

Does this signature need to change for a public API?

api-approved area-System.Runtime

Source

Maoni0

👍5

Most helpful comment

How about this as an api surface?

namespace System
{
    struct GCMemoryInfo
    {
        /// <summary>
        /// HighMemoryLoadThreshold when the GC occured
        /// </summary>
        public long HighMemoryLoadThreshold { get; }

        /// <summary>
        /// Total available memory for the GC to use. By default this is the physical memory on the machine, but it may be customized by specifying a HardLimit.
        /// </summary>
        public long TotalAvailableMemory { get; }

        /// <summary>
        /// Memory Load when the GC completes
        /// </summary>
        public int MemoryLoad { get; }

        /// <summary>
        /// The total heap size when the GC completes
        /// </summary>
        public long HeapSize { get; }

        /// <summary>
        /// The total fragmentation of the GC Heap
        /// </summary>
        public long Fragmentation { get; }
    }

    class GC
    {
        /// <summary>
        /// Get information about GC state at the time of the last GC. 
        /// </summary>
        public static GCMemoryInfo LastRecordedGCMemoryInfo { get; }
    }
}

davidwrighton on 5 Feb 2019

👍4

All 38 comments

cc @KKhurin (who owns S.R.Caching and has asked for this)

danmosemsft on 16 Jan 2019

@JunTaoLuo @Tratcher

davidfowl on 20 Jan 2019

This should probably return a struct. That's cleaner because it does not force side-effects and calling code can treat the call as an expression. Also, new members can be added later.

GSPP on 21 Jan 2019

👍3

@davidfowl how do you envision AspNetCore using this information? And when would we call it?

The caches want an event that says the server is low on memory. This API isn't an event, you'd have to poll it. Then you'd have to decide from the available data what constitutes low memory.

Tratcher on 22 Jan 2019

@Maoni0 Can you add some comments on what each field means and how they are measured?

JunTaoLuo on 22 Jan 2019

@GSPP seems reasonable to me.

@JunTaoLuo updated.

Maoni0 on 22 Jan 2019

Why do we have a single function here, why do we use unsigned types? I believe we generally try to avoid those throughout most of our api surface. I think we'd be better served with a series of properties instead. Also, I don't understand the meaning of lastRecordedHeapSize and lastRecordedFragmentation. Could you explain further what those mean, and what sorts of actions one could perform based on them? Finally, I think those should just be typed as long or ulong, just to be similar to totalPhysicalMem.

davidwrighton on 30 Jan 2019

Multiple static properties would force multiple native calls and would introduce race conditions. I don't know if this is relevant. But if it is, a struct would solve that.

In order to design this API we should think about what other "data" APIs could be made available in the future. It should be a concept that is extensible enough to maintain a clean and uniform API surface.

I believe all byte counts should be given as long. Unsigned is inappropriate for consistency reasons (and VB.NET?) and IntPtr is unnecessarily complex for calling code. It is OK to use a wider integer type on 32 bit platforms.

GSPP on 30 Jan 2019

@davidwrighton because they are all values from the last GC which makes it very natural to group them together. things like highMemLoadThreshold currently don't change per GC but they could which was why they are returned with each call. as @GSPP pointed out, multiple calls to properties can mean you are now getting values from different GCs. not sure how I can explain the last 2 parameters more clear? they are what the last GC observed as heap size and fragmentation. fragmentation means the sum of free objects in the heap. if you observe a lot of fragmentation and you want to keep the heap size down (ie, you have a lot of memory so GC isn't very aggressive but you want to keep the total heap size smaller) you could consider inducing a compacting GC). you can use the heap size as a measure to see how worthwhile it is to even care about the GC heap, eg if the GC heap is a small percentage of the total memory, it wouldn't get you much by reducing it.

Maoni0 on 30 Jan 2019

How about this as an api surface?

namespace System
{
    struct GCMemoryInfo
    {
        /// <summary>
        /// HighMemoryLoadThreshold when the GC occured
        /// </summary>
        public long HighMemoryLoadThreshold { get; }

        /// <summary>
        /// Total available memory for the GC to use. By default this is the physical memory on the machine, but it may be customized by specifying a HardLimit.
        /// </summary>
        public long TotalAvailableMemory { get; }

        /// <summary>
        /// Memory Load when the GC completes
        /// </summary>
        public int MemoryLoad { get; }

        /// <summary>
        /// The total heap size when the GC completes
        /// </summary>
        public long HeapSize { get; }

        /// <summary>
        /// The total fragmentation of the GC Heap
        /// </summary>
        public long Fragmentation { get; }
    }

    class GC
    {
        /// <summary>
        /// Get information about GC state at the time of the last GC. 
        /// </summary>
        public static GCMemoryInfo LastRecordedGCMemoryInfo { get; }
    }
}

davidwrighton on 5 Feb 2019

👍4

LGTM!

Maoni0 on 5 Feb 2019

What is "MemoryLoad"? Is it a number of bytes? Then it should be long.

LastRecordedGCMemoryInfo should be a method. A property suggests that you can write something like:

Console.WriteLine($@"
    {GC.LastRecordedGCMemoryInfo.PropertyA},
    {GC.LastRecordedGCMemoryInfo.PropertyB},
    ...");

But that fetches those values multiple times. This is racy and not good for performance. Each call will fill, create and pass the entire struct. As members are added to the struct over time this becomes more and more inefficient.

Properties are by convention cheap to access and assumed to return the same values in successive calls.

The code pattern should be:

var gcMemoryInfo = GC.GetLastRecordedGCMemoryInfo();

Console.WriteLine($@"
    {gcMemoryInfo.PropertyA},
    {gcMemoryInfo.PropertyB},
    ...");

GSPP on 5 Feb 2019

memory load is a percentage. @davidwrighton do you want to respond to what @GSPP said about reading fields from the struct? I presumed the idea was you get the struct back and then read fields off of the same struct as some sort of good practice. I trust your expertise in this area more than mine :-)

Maoni0 on 5 Feb 2019

Well, while it might be reasonable to trust my opinion in writing C# more than yours for the design of C# apis, I still don't feel all that comfortable with making statements about the implication of a property vs a getter. I'd really prefer an actual api design expert to chime in there. @danmosemsft Could you recommend someone?

@davidfowl @Tratcher I've got some further thoughts involving an event that could be used in dotnet/coreclr#18619 . I'm very much not convinced that event is the right one, but its a thought. Would it be useful for your ASP.NET caches?

davidwrighton on 6 Feb 2019

I agree with @GSPP that a method helps lead to the pit of success in that you are more likely to avoid unnecessary calls, and get a self consistent set of numbers.

c# class GC { /// <summary> /// Get information about GC state at the time of the last GC. /// </summary> public static GCMemoryInfo GetLastRecordedGCMemoryInfo() }

@stephentoub do you agree? (BTW, I thought design guidelines actually explicitly recommend that property getters be "cheap", but I cannot actually find that written here: https://docs.microsoft.com/en-us/dotnet/standard/design-guidelines/property. Anyway I have no idea whether this implementation would be cheap, even if so, I think the self consistent set is a good enough reason for a method)

danmosemsft on 6 Feb 2019

There is a whole Choosing Between Properties and Methods chapter in the framework design guidelines. https://docs.microsoft.com/en-us/previous-versions/dotnet/netframework-4.0/ms229054(v=vs.100)

This case is close to borderline, but I think method returning struct is more appropriate in this case.

jkotas on 6 Feb 2019

@jkotas, intersting, your link is to a fuller version (the version I remember) and my link is to the updated version, which is missing that chapter. I opened a doc issue. https://github.com/dotnet/docs/issues/10332

danmosemsft on 6 Feb 2019

@stephentoub do you agree?

I like:
C# public static GCMemoryInfo GetLastRecordedGCMemoryInfo()
though "Recorded" is a bit of an oddity in API names... maybe just "GetLastGCMemoryInfo()", or even just "GetGCMemoryInfo()" and it can be documented how and when the data is collected... if you're using this method, you're likely going to need to read the docs, anyway. Regardless, I think it makes sense to be a method and return a struct.

stephentoub on 6 Feb 2019

👍3

your link is to a fuller version

The original full version is in the book (mslibrary has ebook that you checkout pretty fast). Online docs always had a subset of what is in the book.

jkotas on 6 Feb 2019

I'm confused, we are comparing 2 versions of the online docs, not comparing the book to the docs 😺

danmosemsft on 6 Feb 2019

The old online version is from the first edition of the book. The current online version is from the second edition of the book. The subset reprinted in the docs from the first edition and the subset reprinted in the docs from the second edition is different. (I guess we did not get the permission for the whole book.) I was just pointing out that the book is the place to go if you want the actual complete guidelines.

jkotas on 6 Feb 2019

Ah gotcha, thanks.

danmosemsft on 6 Feb 2019

the one I thought was the most natural is a method call that returns a struct, as discussed very early on on this thread. so taking the feedback on the name into consideration how about

public static GCMemoryInfo GetLastGCMemoryInfo()

Maoni0 on 6 Feb 2019

👍1

If public int MemoryLoad { get; } is a percentage, should it maybe be a float? That's semantically the natural type. An integer would introduce artificial quantization of the possible values. The additional precision likely is not much of a problem but just based on cleanliness it should be float.

Unfortunately, there are multiple precedents in the BCL that use an int for a percentage. I could not find any float percentage except for the UI assemblies. I searched a big list of assemblies with .NET Reflector. So I guess int is better after all for consistency.

GSPP on 6 Feb 2019

Would this be a good place to address https://github.com/dotnet/corefx/issues/30644 as well? i.e. add a property:
C# public struct GCMemoryInfo { ... public long TotalBytesAllocated { get; } }
?

stephentoub on 6 Feb 2019

Well, I don't think TotalBytesAllocated is a value that is recorded per GC operation, so that's more indication that we should tweak the name of the method to be something like GetGCMemoryInfo(), and document that some of the values are based on the last gc to occur, and some of them are instantaneous values, but I think it makes sense to put them here.

davidwrighton on 6 Feb 2019

👍1

Actually looking at the discussion in dotnet/runtime#26602, if it means that this api will have to do a walk over all threads instead of being a pretty simple accessor, it becomes quite a bit scarier to include it in this structure. I'd rather that customers could use this api without expecting an O(N) cost to calling it.

davidwrighton on 6 Feb 2019

Following some of the feedback, how about this as an api surface? And as well the AppDomain api would be adjusted to call the non-precise variant of GC.GetTotalBytesAllocated.

namespace System
{
    struct GCMemoryInfo
    {
        /// <summary>
        /// HighMemoryLoadThreshold when the last GC occured
        /// </summary>
        public long HighMemoryLoadThreshold { get; }

        /// <summary>
        /// Total available memory for the GC to use when the last GC ocurred. By default this is the physical memory on the machine, but it may be customized by specifying a HardLimit.
        /// </summary>
        public long TotalAvailableMemory { get; }

        /// <summary>
        /// Memory Load when the last GC ocurred
        /// </summary>
        public int MemoryLoad { get; }

        /// <summary>
        /// The total heap size when the last GC ocurred
        /// </summary>
        public long HeapSize { get; }

        /// <summary>
        /// The total fragmentation of the last GC ocurred
        /// </summary>
        public long Fragmentation { get; }
    }

    class GC
    {
        /// <summary>
        /// Get information about GC state at the time of the last GC. 
        /// </summary>
        public static GCMemoryInfo GetGCMemoryInfo();

        /// <summary>
        /// Get a count of the bytes allocated over the lifetime of the process.
        /// <param name="precise">If true, gather a precise number, otherwise gather a fairly count. Gathering a precise value triggers at a significant performance penalty.</param>
        /// </summary>
        public static long GetTotalBytesAllocated(bool precise);
    }
}

davidwrighton on 7 Feb 2019

I suggest renaming public int MemoryLoad { get; } to MemoryLoadPercent.

Fragmentation is not very descriptive. Without having read any comments about it in this thread I have no idea what it means or even what unit it has.

fragmentation means the sum of free objects in the heap

So maybe call it FragmentationLostBytes.

Maybe all byte values should have the suffix Bytes. (Stephen did this as well in his proposed TotalBytesAllocated property.)

There is little use in making a short name. Nobody is typing these names thanks to intellisense. It's better to make them descriptive. That makes them easier to discover and the code becomes easier to read.

Really, for all of these names I do not know what they mean just from the name alone.

GSPP on 8 Feb 2019

👍1

@terrajobst Could you organize the api folks to take a look at this and provide some comments? Thanks.

davidwrighton on 28 Feb 2019

Video

Looks good! Comments from the review:

GCMemoryInfo should be marked readonly
We should suffix all members with Bytes
We should normalzie all values into bytes
HighMemoryLoadThreshold should return long and be named HighMemoryLoadThresholdBytes
MemoryLoad should return long and be named MemoryLoadBytes
Fragmentation should be FragmentedBytes
GetTotalAllocatedBytes() should default the parameter to false

terrajobst on 9 Apr 2019

We have GetAllocatedBytesForCurrentThread() already. Should GetTotalBytesAllocated be renamed to GetTotalAllocatedBytes instead, to match?

ahsonkhan on 9 Apr 2019

👍1

@ahsonkhan that sounds good to me!

Maoni0 on 9 Apr 2019

Updated my comment above to match conclusion.

terrajobst on 9 Apr 2019

@sergiy-k to get it into 3.0, we would need to cherry-pick https://github.com/dotnet/corefx/commit/9e2f3b630a51875fc0455a7cabb8803f6527f81e to release/3.0. What would be the process?

luhenry on 23 Apr 2019

@luhenry You do not need to do anything to get it to 3.0. The change will get to release/3.0 once we snap next preview.

jkotas on 24 Apr 2019

👍1

@jkotas my understanding is that with Preview 5, we need to go through shiproom to get something released. Do you mean that Preview 6 will still get snapshoted from master and shiproon is only needed to get something as part of Preview 5?

luhenry on 24 Apr 2019

@luhenry Yes. Preview 6 will get snapshotted from master in the future.

davidwrighton on 24 Apr 2019

👍1

Was this page helpful?

0 / 5 - 0 ratings