This is a tracking issue for a few things we've discussed with @dsyme and wanted to discuss further during F# Compiler Office Hours.
Now:
Later:
https://github.com/fsharp/FsAutoComplete/issues/264 is related. If we start shipping the portable pdb files with Ionide and FCS powered apps, we could performance profile them and see where the hotspots are to help tune.
It would really help if this project shipped the pdb files https://github.com/Microsoft/visualfsharp/issues/294. Speak to the assignees of https://github.com/aspnet/Universe/issues/131 to figure out the Microsoft strategy for doing this.
@ctaggart Some previous memory-related measurements are available here.
Based on a discussion in the FSSF compiler slack channel I added a note about reference assemblies and the compiler flags listed here: https://github.com/dotnet/roslyn/blob/master/docs/features/refout.md
From the FSSF #compiler slack channel:
@auduchinok and I wanted to discuss one of the items in https://github.com/Microsoft/visualfsharp/issues/4258. In particular, we thought that one of the items might be a good way for people to get into the compiler: "Pack attributes in ILTypeDef/etc like how it is done in TypeProviders SDK, use Reflection API enums".
If you look at the memory stats described in the compiler guide https://fsharp.github.io/2015/09/29/fsharp-compiler-guide.html you can see that some types of objects contribute much more to memory usage of the compiler than others. There are quite a few culprits, but two are ILTypeDef and ILMethodDef, which are representations of data read from an assembly.
ILTypeDef is defined here: https://github.com/Microsoft/visualfsharp/blob/master/src/absil/il.fs#L1556
As you can see, ILTypeDef is fairly large. The point of the work item is that it can be made smaller. A good comparison is a derived copy of the same code that already does this, see https://github.com/fsprojects/FSharp.TypeProviders.SDK/blob/master/src/ProvidedTypes.fs#L2736.
This is a much smaller object - perhaps half the size. The main trick used is that the information stored in IsSerializable, IsSealed, IsAbstract, Access and so on is all encoded in the "Attributes: TypeAttributes" instead.
This compact-information-into-an-attributes-field can be applied for ILTypeDef, ILMethodDef, ILPropertyDef, ILEventDef and so on. A draft version of the code doing this is basically already implemented in the ProvidedTypes SDK https://github.com/fsprojects/FSharp.TypeProviders.SDK/blob/master/src/ProvidedTypes.fs#L2736.
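As a rough illustration of the "compact-information-into-an-attributes-field" idea, here is a minimal sketch. The type `PackedTypeDef` and the helper `hasFlag` are hypothetical names made up for this example; this is not the actual ILTypeDef or the ProvidedTypes SDK code. It only assumes the standard `System.Reflection.TypeAttributes` enum: instead of storing one field per boolean, the record stores a single enum value and derives each property from it.

```fsharp
open System.Reflection

// Hypothetical helper: true when every bit of `flag` is set in `attrs`.
let hasFlag (flag: TypeAttributes) (attrs: TypeAttributes) =
    (attrs &&& flag) = flag

// Hypothetical, simplified stand-in for a packed type definition.
// One enum field replaces separate IsSealed / IsAbstract / Access fields.
type PackedTypeDef =
    { Name: string
      Attributes: TypeAttributes }

    member x.IsSealed = hasFlag TypeAttributes.Sealed x.Attributes
    member x.IsAbstract = hasFlag TypeAttributes.Abstract x.Attributes

    // Multi-bit information is read back through the corresponding mask.
    member x.IsInterface =
        (x.Attributes &&& TypeAttributes.ClassSemanticsMask) = TypeAttributes.Interface
    member x.IsPublic =
        (x.Attributes &&& TypeAttributes.VisibilityMask) = TypeAttributes.Public

let td =
    { Name = "Example"
      Attributes = TypeAttributes.Public ||| TypeAttributes.Sealed }

printfn "sealed=%b abstract=%b interface=%b public=%b"
    td.IsSealed td.IsAbstract td.IsInterface td.IsPublic
```

The derived properties cost a mask and a compare on each access, which is the price paid for dropping the per-flag fields from every object.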
OK, so in short - if you're a good F# dev and looking for a contribution to make to some core data structures, then doing the "compact-information-into-an-attributes-field" work described above would be very welcome.
Followed by:
If people get ambitious, a considerably harder (but very valuable) job is compaction of the "Val" and "Entity" types. In particular, one or more "Val" nodes get used for each function, member AND simple value ("x" in "let f x = ..."). The most common case is probably a basic, simple value.
I think there are some code comments about this already in the code, but basically the simplest, dumbest kind of value - the kind we all know and love - "x" in "let f x = ..." etc - will always have about 3/4 of the fields of the Val type with zero, default or empty values, i.e. very simple data. In this case, we should likely use a "Val" and "ValExtraData" to find a cut between simple values and more complex but rarer information.
If someone is really good at this kind of stuff, then I suppose the "right" thing to do is to insert some manual profiling code to work out the statistically most common shape of Val objects, and then choose a representation to match.
TBH I suspect just repurposing the existing "member data" field (used if the value is a member) to be "data for fat values" would give good bang-for-buck. https://github.com/Microsoft/visualfsharp/blob/master/src/fsharp/tast.fs#L2204. Using an interface, union or base class would still require careful estimation of which kinds of implementations to support.
For example, most values don't have attributes (think "x" in "let f x = ..."), so val_attribs is almost always empty. Thus moving val_attribs into the "fat" part of the data seems sensible.
Likewise val_declaring_entity, val_xmldoc, val_xmldocsig, val_access, val_defn, val_const, val_other_range
See comment here : https://github.com/Microsoft/visualfsharp/blob/master/src/fsharp/tast.fs#L2163
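To make the slim/fat split concrete, here is a minimal sketch. All names (`SlimVal`, `ValExtraData` as defined here, and their fields) are simplified assumptions for illustration; the real Val type in tast.fs has many more fields and different types. The rarely populated data moves behind one optional field, and accessors return the default when it is absent.

```fsharp
// Hypothetical "fat" part: only allocated for values that need it.
type ValExtraData =
    { Attribs: string list
      XmlDoc: string
      Access: string }

// Hypothetical "slim" part: the fields every value needs, plus one
// optional pointer to the rarely used data.
type SlimVal =
    { LogicalName: string
      Stamp: int64
      // None for the common case, e.g. "x" in "let f x = ..."
      Extra: ValExtraData option }

    // Accessors fall back to the default when there is no extra data.
    member v.Attribs =
        match v.Extra with
        | Some e -> e.Attribs
        | None -> []

    member v.XmlDoc =
        match v.Extra with
        | Some e -> e.XmlDoc
        | None -> ""

// A simple value pays for a single None instead of many empty fields.
let simple = { LogicalName = "x"; Stamp = 1L; Extra = None }

// A "fat" value carries the extra record.
let fat =
    { LogicalName = "f"
      Stamp = 2L
      Extra = Some { Attribs = [ "Obsolete" ]; XmlDoc = "Doc for f"; Access = "internal" } }

printfn "%s: %d attribs; %s: %d attribs"
    simple.LogicalName simple.Attribs.Length fat.LogicalName fat.Attribs.Length
```

Whether this beats an interface, union or base class depends on how often the extra data is actually present, which is why measuring the common shape first matters.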
Added point about a compiler server, as per a discussion we had with @KevinRansom about perf improvements
@dsyme The 2nd point can be marked as done. Which point can I work on next?
@dsyme 3rd point can be marked as done
Do you want me to apply the same technique to Entity?
@AviAvni I marked the item as complete, thanks!
@AviAvni Yes, doing the same for Entity would be great
Note that the vast majority of Entity objects are not F#-defined but rather come from .NET assemblies. We read the assemblies lazily, namespace by namespace, but a lot of types still come into existence.
Here's my guess about the optimal split:
These are the fields with no good default value (for the majority of Entity objects, which are type definitions coming from .NET assemblies)
entity_flags : EntityFlags
entity_stamp: Stamp
entity_logical_name: string
entity_range: range
entity_attribs: Attribs // .NET types usually have some attributes, unlike most values
entity_tycon_repr: TyconRepresentation // contains the real information for .NET types
entity_pubpath : PublicPath option // normally populated
entity_cpath : CompilationPath option // normally populated
entity_modul_contents: MaybeLazy<ModuleOrNamespaceType> // usually "empty" but not so easy to remove
These are the fields with good default values:
entity_typars: LazyWithContext<Typars, range> // usually empty, might be hard
entity_kind : TyparKind // usually "Type"
entity_compiled_name: string option // usually None
entity_other_range: (range * bool) option // usually None
entity_tycon_repr_accessibility: Accessibility // usually Public
entity_tycon_abbrev: TType option // usually None
entity_tycon_tcaug: TyconAugmentation // None for types from .NET
entity_exn_info: ExceptionInfo // usually TExnNone
entity_xmldoc : XmlDoc // usually empty
entity_xmldocsig : string // usually empty, may be updated if XML doc is computed
entity_accessiblity: Accessibility // usually taccessPublic
This one is borderline:
mutable entity_il_repr_cache : CompiledTypeRepr cache
This is a guess and absolutely needs to be statistically verified. (Note: it might be more important to verify this w.r.t. a typical user compilation that references the entire .NET Framework rather than the F# compiler)
NOTES:
entity_il_repr_cache is only needed for types that are actually referred to by F# code being compiled. The vast majority of .NET types in reference assemblies are not referred to. That's why I say it's borderline.
entity_modul_contents usually points through to an "empty" blob of information, but it's not so easy to characterize this, as a lazy thunk is used to convert any nested types into TAST nodes
entity_typars is read lazily and it might not be so easy to determine that the typars are empty when first creating the ILTypeDef
the information in entity_pubpath and entity_cpath is two similar representations of the same thing that just both grew up historically. Reducing this duplication may be a good target for later work.
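One way to check a guess like this statistically is to count how often each candidate field holds its default value. The sketch below is only an illustration: `EntitySnapshot`, its fields and `fractionWhere` are hypothetical, and the sample list is made-up data, not a measurement. In practice the snapshots would have to come from instrumentation inside the compiler.

```fsharp
// Hypothetical, simplified snapshot of a few Entity fields.
type EntitySnapshot =
    { CompiledName: string option
      XmlDoc: string
      IsPublic: bool }

/// Fraction of snapshots for which the predicate holds (0.0 for an empty list).
let fractionWhere (pred: EntitySnapshot -> bool) (xs: EntitySnapshot list) =
    match xs with
    | [] -> 0.0
    | _ ->
        let hits = xs |> List.filter pred |> List.length
        float hits / float xs.Length

// Made-up sample data, for illustration only.
let sample =
    [ { CompiledName = None; XmlDoc = ""; IsPublic = true }
      { CompiledName = None; XmlDoc = ""; IsPublic = true }
      { CompiledName = Some "Foo"; XmlDoc = ""; IsPublic = true }
      { CompiledName = None; XmlDoc = "docs"; IsPublic = false } ]

// One row per candidate field: how often does it hold its default value?
let report =
    [ "compiled_name is None", fractionWhere (fun e -> e.CompiledName.IsNone) sample
      "xmldoc is empty", fractionWhere (fun e -> e.XmlDoc = "") sample
      "accessibility is public", fractionWhere (fun e -> e.IsPublic) sample ]

for (name, frac) in report do
    printfn "%-26s %.0f%%" name (100.0 * frac)
```

Fields that are at their default in nearly all entities are the natural candidates for the optional part of the split.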
See notes on ByteFile/byte[] here: https://github.com/Microsoft/visualfsharp/pull/4429#issuecomment-372662705
Another thought about ByteFile: perhaps we could
That may have the combined effect of soundness and zero memory usage when the DLLs are no longer being accessed during compilation. We could also collect statistics about how many times we re-read sections.
nb. It's annoying to have to do this sort of thing, since ByteFile and MemoryMappedFile are nice enough and work well for 64-bit address spaces. This is only really necessary for 32-bit devenv.exe.
With #4628 I think we will have completed the core of this memory reduction work. I've updated the description to show the progress we've made. @AviAvni will also be doing further profiling of Ionide to determine next opportunities
I'm closing this since the "do now" work has been completed