EDIT @stephentoub 3/22/2019. Added updated API from later in the thread:
```C#
namespace System
{
struct GCMemoryInfo
{
///
/// High Memory Load Threshold when the last GC occured
///
public int HighMemoryLoadThreshold { get; } // or HighMemoryLoadThresholdPercent? float/double? 0-1 vs 0-100?
/// <summary>
/// Memory Load when the last GC ocurred
/// </summary>
public int MemoryLoad { get; } // or MemoryLoadPercent? float/double? 0-1 vs 0-100?
/// <summary>
/// Total available memory for the GC to use when the last GC ocurred. By default this is the physical memory on the machine, but it may be customized by specifying a HardLimit.
/// </summary>
public long TotalAvailableMemory { get; }
/// <summary>
/// The total heap size when the last GC ocurred
/// </summary>
public long HeapSize { get; }
/// <summary>
/// The total fragmentation of the last GC ocurred
/// </summary>
public long Fragmentation { get; } // should we include "Bytes" in all names where appropriate?
}
class GC
{
/// <summary>
/// Get information about GC state at the time of the last GC.
/// </summary>
public static GCMemoryInfo GetGCMemoryInfo();
/// <summary>
/// Get a count of the bytes allocated over the lifetime of the process.
/// <param name="precise">If true, gather a precise number, otherwise gather a fairly count. Gathering a precise value triggers at a significant performance penalty.</param>
/// </summary>
public static long GetTotalBytesAllocated(bool precise);
}
}
Original proposal:
Expose the GC.GetMemoryInfo API publicly so things like asp.net core can use it to determine whether they should trim their caches and etc. Right now since it鈥檚 internal it鈥檚 only used by BCL. [#25617](https://github.com/dotnet/corefx/issues/25617) can be satisfied by this.
Currently this is the internal API's signature:
```csharp
// highMemLoadThreshold - what GC considers as "high memory load" which is usually
// 90% of the totally physical memory but can be changed via a config. This is when
// GC would become more aggressive to do a full compacting GC to reduce heap size.
//
// totalPhysicalMem - what GC sees as total physical memory. If you are running in a
// container this would be the physical memory limit you specify on the container.
//
// lastRecordedMemLoad - this was the memory load GC saw in the last GC. When the
// process is not running in a container this is the physical memory load of the machine;
// otherwise it's (the physical memory used by the process / totalPhysicalMem).
//
// lastRecordedHeapSize - this was the total heap size at the end of last GC.
//
// lastRecordedFragmentation - this was the total fragmentation on the GC heap
// at the end of last GC.
[MethodImplAttribute(MethodImplOptions.InternalCall)]
internal static extern void GetMemoryInfo(out uint highMemLoadThreshold,
out ulong totalPhysicalMem,
out uint lastRecordedMemLoad,
// The next two are size_t
out UIntPtr lastRecordedHeapSize,
out UIntPtr lastRecordedFragmentation);
Does this signature need to change for a public API?
cc @KKhurin (who owns S.R.Caching and has asked for this)
@JunTaoLuo @Tratcher
This should probably return a struct. That's cleaner because it does not force side-effects and calling code can treat the call as an expression. Also, new members can be added later.
@davidfowl how do you envision AspNetCore using this information? And when would we call it?
The caches want an event that says the server is low on memory. This API isn't an event, you'd have to poll it. Then you'd have to decide from the available data what constitutes low memory.
@Maoni0 Can you add some comments on what each field means and how they are measured?
@GSPP seems reasonable to me.
@JunTaoLuo updated.
Why do we have a single function here, why do we use unsigned types? I believe we generally try to avoid those throughout most of our api surface. I think we'd be better served with a series of properties instead. Also, I don't understand the meaning of lastRecordedHeapSize and lastRecordedFragmentation. Could you explain further what those mean, and what sorts of actions one could perform based on them? Finally, I think those should just be typed as long or ulong, just to be similar to totalPhysicalMem.
Multiple static properties would force multiple native calls and would introduce race conditions. I don't know if this is relevant. But if it is, a struct
would solve that.
In order to design this API we should think about what other "data" APIs could be made available in the future. It should be a concept that is extensible enough to maintain a clean and uniform API surface.
I believe all byte counts should be given as long
. Unsigned is inappropriate for consistency reasons (and VB.NET?) and IntPtr
is unnecessarily complex for calling code. It is OK to use a wider integer type on 32 bit platforms.
@davidwrighton because they are all values from the last GC which makes it very natural to group them together. things like highMemLoadThreshold currently don't change per GC but they could which was why they are returned with each call. as @GSPP pointed out, multiple calls to properties can mean you are now getting values from different GCs. not sure how I can explain the last 2 parameters more clear? they are what the last GC observed as heap size and fragmentation. fragmentation means the sum of free objects in the heap. if you observe a lot of fragmentation and you want to keep the heap size down (ie, you have a lot of memory so GC isn't very aggressive but you want to keep the total heap size smaller) you could consider inducing a compacting GC). you can use the heap size as a measure to see how worthwhile it is to even care about the GC heap, eg if the GC heap is a small percentage of the total memory, it wouldn't get you much by reducing it.
How about this as an api surface?
namespace System
{
struct GCMemoryInfo
{
/// <summary>
/// HighMemoryLoadThreshold when the GC occured
/// </summary>
public long HighMemoryLoadThreshold { get; }
/// <summary>
/// Total available memory for the GC to use. By default this is the physical memory on the machine, but it may be customized by specifying a HardLimit.
/// </summary>
public long TotalAvailableMemory { get; }
/// <summary>
/// Memory Load when the GC completes
/// </summary>
public int MemoryLoad { get; }
/// <summary>
/// The total heap size when the GC completes
/// </summary>
public long HeapSize { get; }
/// <summary>
/// The total fragmentation of the GC Heap
/// </summary>
public long Fragmentation { get; }
}
class GC
{
/// <summary>
/// Get information about GC state at the time of the last GC.
/// </summary>
public static GCMemoryInfo LastRecordedGCMemoryInfo { get; }
}
}
LGTM!
What is "MemoryLoad"? Is it a number of bytes? Then it should be long
.
LastRecordedGCMemoryInfo
should be a method. A property suggests that you can write something like:
Console.WriteLine($@"
{GC.LastRecordedGCMemoryInfo.PropertyA},
{GC.LastRecordedGCMemoryInfo.PropertyB},
...");
But that fetches those values multiple times. This is racy and not good for performance. Each call will fill, create and pass the entire struct. As members are added to the struct over time this becomes more and more inefficient.
Properties are by convention cheap to access and assumed to return the same values in successive calls.
The code pattern should be:
var gcMemoryInfo = GC.GetLastRecordedGCMemoryInfo();
Console.WriteLine($@"
{gcMemoryInfo.PropertyA},
{gcMemoryInfo.PropertyB},
...");
memory load is a percentage. @davidwrighton do you want to respond to what @GSPP said about reading fields from the struct? I presumed the idea was you get the struct back and then read fields off of the same struct as some sort of good practice. I trust your expertise in this area more than mine :-)
Well, while it might be reasonable to trust my opinion in writing C# more than yours for the design of C# apis, I still don't feel all that comfortable with making statements about the implication of a property vs a getter. I'd really prefer an actual api design expert to chime in there. @danmosemsft Could you recommend someone?
@davidfowl @Tratcher I've got some further thoughts involving an event that could be used in dotnet/coreclr#18619 . I'm very much not convinced that event is the right one, but its a thought. Would it be useful for your ASP.NET caches?
I agree with @GSPP that a method helps lead to the pit of success in that you are more likely to avoid unnecessary calls, and get a self consistent set of numbers.
c#
class GC
{
/// <summary>
/// Get information about GC state at the time of the last GC.
/// </summary>
public static GCMemoryInfo GetLastRecordedGCMemoryInfo()
}
@stephentoub do you agree? (BTW, I thought design guidelines actually explicitly recommend that property getters be "cheap", but I cannot actually find that written here: https://docs.microsoft.com/en-us/dotnet/standard/design-guidelines/property. Anyway I have no idea whether this implementation would be cheap, even if so, I think the self consistent set is a good enough reason for a method)
There is a whole Choosing Between Properties and Methods
chapter in the framework design guidelines. https://docs.microsoft.com/en-us/previous-versions/dotnet/netframework-4.0/ms229054(v=vs.100)
This case is close to borderline, but I think method returning struct is more appropriate in this case.
@jkotas, intersting, your link is to a fuller version (the version I remember) and my link is to the updated version, which is missing that chapter. I opened a doc issue. https://github.com/dotnet/docs/issues/10332
@stephentoub do you agree?
I like:
C#
public static GCMemoryInfo GetLastRecordedGCMemoryInfo()
though "Recorded" is a bit of an oddity in API names... maybe just "GetLastGCMemoryInfo()", or even just "GetGCMemoryInfo()" and it can be documented how and when the data is collected... if you're using this method, you're likely going to need to read the docs, anyway. Regardless, I think it makes sense to be a method and return a struct.
your link is to a fuller version
The original full version is in the book (mslibrary has ebook that you checkout pretty fast). Online docs always had a subset of what is in the book.
I'm confused, we are comparing 2 versions of the online docs, not comparing the book to the docs 馃樅
The old online version is from the first edition of the book. The current online version is from the second edition of the book. The subset reprinted in the docs from the first edition and the subset reprinted in the docs from the second edition is different. (I guess we did not get the permission for the whole book.) I was just pointing out that the book is the place to go if you want the actual complete guidelines.
Ah gotcha, thanks.
the one I thought was the most natural is a method call that returns a struct, as discussed very early on on this thread. so taking the feedback on the name into consideration how about
public static GCMemoryInfo GetLastGCMemoryInfo()
If public int MemoryLoad { get; }
is a percentage, should it maybe be a float
? That's semantically the natural type. An integer would introduce artificial quantization of the possible values. The additional precision likely is not much of a problem but just based on cleanliness it should be float
.
Unfortunately, there are multiple precedents in the BCL that use an int
for a percentage. I could not find any float
percentage except for the UI assemblies. I searched a big list of assemblies with .NET Reflector. So I guess int
is better after all for consistency.
Would this be a good place to address https://github.com/dotnet/corefx/issues/30644 as well? i.e. add a property:
C#
public struct GCMemoryInfo
{
...
public long TotalBytesAllocated { get; }
}
?
Well, I don't think TotalBytesAllocated is a value that is recorded per GC operation, so that's more indication that we should tweak the name of the method to be something like GetGCMemoryInfo(), and document that some of the values are based on the last gc to occur, and some of them are instantaneous values, but I think it makes sense to put them here.
Actually looking at the discussion in dotnet/runtime#26602, if it means that this api will have to do a walk over all threads instead of being a pretty simple accessor, it becomes quite a bit scarier to include it in this structure. I'd rather that customers could use this api without expecting an O(N) cost to calling it.
Following some of the feedback, how about this as an api surface? And as well the AppDomain api would be adjusted to call the non-precise variant of GC.GetTotalBytesAllocated.
namespace System
{
struct GCMemoryInfo
{
/// <summary>
/// HighMemoryLoadThreshold when the last GC occured
/// </summary>
public long HighMemoryLoadThreshold { get; }
/// <summary>
/// Total available memory for the GC to use when the last GC ocurred. By default this is the physical memory on the machine, but it may be customized by specifying a HardLimit.
/// </summary>
public long TotalAvailableMemory { get; }
/// <summary>
/// Memory Load when the last GC ocurred
/// </summary>
public int MemoryLoad { get; }
/// <summary>
/// The total heap size when the last GC ocurred
/// </summary>
public long HeapSize { get; }
/// <summary>
/// The total fragmentation of the last GC ocurred
/// </summary>
public long Fragmentation { get; }
}
class GC
{
/// <summary>
/// Get information about GC state at the time of the last GC.
/// </summary>
public static GCMemoryInfo GetGCMemoryInfo();
/// <summary>
/// Get a count of the bytes allocated over the lifetime of the process.
/// <param name="precise">If true, gather a precise number, otherwise gather a fairly count. Gathering a precise value triggers at a significant performance penalty.</param>
/// </summary>
public static long GetTotalBytesAllocated(bool precise);
}
}
I suggest renaming public int MemoryLoad { get; }
to MemoryLoadPercent
.
Fragmentation
is not very descriptive. Without having read any comments about it in this thread I have no idea what it means or even what unit it has.
fragmentation means the sum of free objects in the heap
So maybe call it FragmentationLostBytes
.
Maybe all byte values should have the suffix Bytes
. (Stephen did this as well in his proposed TotalBytesAllocated
property.)
There is little use in making a short name. Nobody is typing these names thanks to intellisense. It's better to make them descriptive. That makes them easier to discover and the code becomes easier to read.
Really, for all of these names I do not know what they mean just from the name alone.
@terrajobst Could you organize the api folks to take a look at this and provide some comments? Thanks.
Looks good! Comments from the review:
GCMemoryInfo
should be marked readonlyBytes
HighMemoryLoadThreshold
should return long
and be named HighMemoryLoadThresholdBytes
MemoryLoad
should return long
and be named MemoryLoadBytes
Fragmentation
should be FragmentedBytes
GetTotalAllocatedBytes()
should default the parameter to false
We have GetAllocatedBytesForCurrentThread()
already. Should GetTotalBytesAllocated
be renamed to GetTotalAllocatedBytes
instead, to match?
@ahsonkhan that sounds good to me!
Updated my comment above to match conclusion.
@sergiy-k to get it into 3.0, we would need to cherry-pick https://github.com/dotnet/corefx/commit/9e2f3b630a51875fc0455a7cabb8803f6527f81e to release/3.0
. What would be the process?
@luhenry You do not need to do anything to get it to 3.0. The change will get to release/3.0
once we snap next preview.
@jkotas my understanding is that with Preview 5, we need to go through shiproom to get something released. Do you mean that Preview 6 will still get snapshoted from master
and shiproon is only needed to get something as part of Preview 5?
@luhenry Yes. Preview 6 will get snapshotted from master in the future.
Most helpful comment
How about this as an api surface?