Omr: Compiler Memory Management Hierarchy Discussion

Created on 26 Oct 2017  ·  14 Comments  ·  Source: eclipse/omr

This issue includes structure from https://github.com/eclipse/openj9.

Background

Providers: These classes have APIs to provide a segment (indentation is inheritance; hierarchy/naming could be subject to change)

TR::SegmentProvider
'-> J9::DebugSegmentProvider
'-> TR::SegmentPool
'-> TR::SegmentAllocator
    '-> J9::SystemSegmentProvider
    '-> OMR::SystemSegmentProvider
    '-> OMR::DebugSegmentProvider

Allocators: These classes are used by the above classes to do the actual allocation of memory (indentation is inheritance; hierarchy/naming could be subject to change)

TR::RawAllocator

J9::J9SegmentProvider
'-> J9::J9SegmentCache
'-> J9::SegmentAllocator

See https://github.com/eclipse/omr/issues/788 for what it means for a class to be in the TR namespace as opposed to J9 or OMR.

Discussion

The point of this discussion is to help create a firm understanding of the memory management hierarchy in TR, in order to help me with the documentation I will be writing in the near future, but also to separate distinct concepts.

This discussion requires a distinction between what it means to be a Provider and an Allocator. Given the way the code exists today, I’ve come up with these definitions:

  • An Allocator is the “allocator” part of a Memory Manager. It is used by a Provider to acquire and free memory. Its job is simply to get memory and free memory. When an Allocator is asked to free memory, it must free that memory; it does not manage any memory by holding it in a free list.
  • A Provider is the “manager” part of a Memory Manager. It manages memory it requests from an Allocator. It is the external interface by which higher level constructs request and release memory. When a Provider is asked to free memory, it does not have to release it back to the OS; it can keep it in free lists, or protect that memory to help debug memory corruption issues. A Provider is also not obligated to use an Allocator (as defined above) but instead can use mmap or malloc or sbrk or whatever - as the “manager” part of a Memory Manager, it is free to use whatever it wants to actually acquire the memory it will manage.
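To make the two definitions concrete, here is a minimal sketch. The class names (MallocAllocator, CachingProvider) and their methods are hypothetical and are not part of OMR or OpenJ9; the sketch only assumes the behaviour described in the two bullets above: the Allocator frees immediately, while the Provider is allowed to hold released memory on a free list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

namespace definitions_sketch {

// Hypothetical Allocator: it only gets and frees memory, and keeps nothing.
class MallocAllocator {
public:
    void *allocate(std::size_t bytes) {
        void *p = std::malloc(bytes);
        if (p == nullptr) throw std::bad_alloc();
        ++_liveBlocks;
        return p;
    }
    void deallocate(void *p) {
        assert(p != nullptr);
        std::free(p);  // freed right away, never cached
        --_liveBlocks;
    }
    int liveBlocks() const { return _liveBlocks; }
private:
    int _liveBlocks = 0;
};

// Hypothetical Provider: hands out fixed-size blocks obtained from an
// Allocator and is free to keep released blocks for later reuse.
class CachingProvider {
public:
    CachingProvider(MallocAllocator &backing, std::size_t blockSize)
        : _backing(backing), _blockSize(blockSize) {}
    CachingProvider(const CachingProvider &) = delete;
    CachingProvider &operator=(const CachingProvider &) = delete;
    ~CachingProvider() {
        // Only now does the cached memory go back to the Allocator.
        for (void *p : _freeList) _backing.deallocate(p);
    }
    void *request() {
        if (!_freeList.empty()) {
            void *p = _freeList.back();
            _freeList.pop_back();
            return p;
        }
        return _backing.allocate(_blockSize);
    }
    void release(void *p) {
        assert(p != nullptr);
        _freeList.push_back(p);  // kept on a free list, not freed
    }
private:
    MallocAllocator &_backing;
    std::size_t _blockSize;
    std::vector<void *> _freeList;
};

} // namespace definitions_sketch
```

With this sketch, releasing a block to the CachingProvider leaves MallocAllocator::liveBlocks() unchanged, and the next request() returns the same block.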

Now, whether these are actually good definitions is subject to debate, but the point remains that the grouping in the Background above shows how the classes therein semantically differ. For example, in OpenJ9, J9::SystemSegmentProvider uses J9::J9SegmentProvider to allocate segments, whereas in OMR, OMR::SystemSegmentProvider uses TR::RawAllocator to allocate segments, but TR::Region (a higher level construct) works with either one of the Providers; it doesn’t need to know anything about the underlying Allocators.

With that preamble, I want to point out an inconsistency with the naming of TR::SegmentAllocator. The distinction between TR::SegmentProvider and TR::SegmentAllocator is that TR::SegmentAllocator has additional APIs to support concepts such as allocation limits, as well as the amount of region and system bytes allocated (the concept of region here has nothing to do with TR::Region). However, I’m not entirely convinced I see the need for this additional layer. If one wishes to create a Provider that only provides functionality to request and release memory without having to worry about allocation limits or region/system memory, then all they would have to do is implement the appropriate empty methods (perhaps with asserts in them). Therefore, I propose that we merge TR::SegmentAllocator into TR::SegmentProvider, removing it from the hierarchy. However, if there is real value in maintaining the distinction between TR::SegmentProvider and TR::SegmentAllocator, then we need to find a better name for TR::SegmentAllocator.
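A rough sketch of what the merge proposal could look like, for illustration only. The Segment struct, the MergedSegmentProvider interface, and the method names are assumptions made for this example and are not copied from the OMR headers; the sketch also treats an allocation limit of 0 as "no limit", which is purely a convention of the example.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

namespace merged_sketch {

// Hypothetical stand-in for a memory segment.
struct Segment {
    void *base;
    std::size_t size;
};

// Hypothetical single interface after merging the limit and accounting
// APIs into the provider interface.
class MergedSegmentProvider {
public:
    virtual ~MergedSegmentProvider() = default;
    virtual Segment request(std::size_t requiredSize) = 0;
    virtual void release(Segment segment) = 0;
    virtual std::size_t regionBytesAllocated() const = 0;
    virtual std::size_t systemBytesAllocated() const = 0;
    virtual std::size_t allocationLimit() const = 0;
    virtual void setAllocationLimit(std::size_t limit) = 0;
};

// A provider that only cares about request/release fills in the
// remaining methods with trivial bodies, with an assert where a call
// would be a usage error.
class UnlimitedProvider : public MergedSegmentProvider {
public:
    Segment request(std::size_t requiredSize) override {
        return Segment{::operator new(requiredSize), requiredSize};
    }
    void release(Segment segment) override {
        ::operator delete(segment.base);
    }
    std::size_t regionBytesAllocated() const override { return 0; }
    std::size_t systemBytesAllocated() const override { return 0; }
    std::size_t allocationLimit() const override { return 0; }  // 0: no limit in this sketch
    void setAllocationLimit(std::size_t limit) override {
        (void)limit;
        assert(false && "UnlimitedProvider does not support allocation limits");
    }
};

} // namespace merged_sketch
```

The cost of this design is visible in the sketch: misuse of setAllocationLimit is only caught at run time, whereas a separate interface would reject it at compile time.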

@mstoodle @mpirvu @rwy0717 @lmaisons Thoughts?

compiler discussion

All 14 comments

Given my definitions above, J9::J9SegmentProvider should also be renamed (since it is an Allocator), but that's a discussion that I'll bring up in https://github.com/eclipse/openj9/issues/459.

I'm obviously biased, but I liked having the TR::SegmentProvider / TR::SegmentAllocator distinction because it separated the two sides of concern for objects that provision TR::MemorySegments to TR::Regions, and enforced that separation at compile time.

It also allows for the chaining of capabilities that don't necessarily care about doing more than being a TR::SegmentProvider (like the TR::SegmentPool).

FWIW, chaining is also why I had a J9::J9SegmentProvider in OpenJ9.

Also, thanks for writing this up. I owe you a beer next time we meet.

In my previous reply, I didn't do the last paragraph justice, so as a follow-up:

I. An allocator creates something
II. A provider provides a means of acquiring something

Note that II is a 'weaker', less informative contract than I.

Under that model, anything implementing TR::SegmentAllocator _is_ allocating TR::MemorySegments. Even if the space is coming from an already internally-managed memory reservation, each one of the implementers is creating TR::MemorySegment objects for use by a consumer.

If anything this whole thing is demonstrating that the inconsistency resulting from my laziness of not refactoring J9::DebugSegmentProvider to use TR::SegmentAllocator is causing more harm than I realized by muddying the waters.

> I. An allocator creates something
> II. A provider provides a means of acquiring something

That makes a ton of sense.

That said, how does that definition fit with TR::SegmentPool, which actually does create TR::MemorySegments in its _segmentStack? It also means that, for example, OMR::SystemSegmentProvider should be renamed to OMR::SystemSegmentAllocator, since it implements TR::SegmentAllocator and is therefore an Allocator.

It seems like there are two ideas here with TR::SegmentProvider and TR::SegmentAllocator.

  1. Separation of APIs: Implementors of TR::SegmentProvider need to implement only a subset of the APIs they would have to implement for TR::SegmentAllocator. In this context, having a reference to TR::SegmentProvider means having access to fewer APIs, which makes sense in the case of TR::SegmentPool, which doesn't care about allocation limits.
  2. Separation of semantics: Having a reference to TR::SegmentProvider means you're saying "Hey, provide me a segment"; it implies (at least in my mind) that TR::SegmentProvider will use an Allocator to actually create the segments since it is just responsible for providing segments.
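The first idea (separation of APIs) can be sketched as two interfaces, one extending the other. Everything in the sketch is hypothetical: the Segment struct, the method names, and LimitedAllocator are stand-ins chosen for illustration, not the real TR:: declarations. The point is only that a consumer holding a reference to the narrower interface cannot call the limit APIs, and the compiler enforces that.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

namespace split_sketch {

// Hypothetical stand-in for a memory segment.
struct Segment {
    void *base;
    std::size_t size;
};

// Narrow interface: request and release only.
class SegmentProvider {
public:
    virtual ~SegmentProvider() = default;
    virtual Segment request(std::size_t requiredSize) = 0;
    virtual void release(Segment segment) = 0;
};

// Wider interface: adds accounting and allocation limits.
class SegmentAllocator : public SegmentProvider {
public:
    virtual std::size_t bytesAllocated() const = 0;
    virtual std::size_t allocationLimit() const = 0;
    virtual void setAllocationLimit(std::size_t limit) = 0;
};

// Hypothetical implementer of the wider interface (0 means "no limit").
class LimitedAllocator : public SegmentAllocator {
public:
    Segment request(std::size_t requiredSize) override {
        if (_limit != 0 && _bytesAllocated + requiredSize > _limit)
            throw std::bad_alloc();
        void *base = ::operator new(requiredSize);
        _bytesAllocated += requiredSize;
        return Segment{base, requiredSize};
    }
    void release(Segment segment) override {
        assert(_bytesAllocated >= segment.size);
        _bytesAllocated -= segment.size;
        ::operator delete(segment.base);
    }
    std::size_t bytesAllocated() const override { return _bytesAllocated; }
    std::size_t allocationLimit() const override { return _limit; }
    void setAllocationLimit(std::size_t limit) override { _limit = limit; }
private:
    std::size_t _bytesAllocated = 0;
    std::size_t _limit = 0;
};

// A consumer that only needs segments takes the narrow interface.
// Calling provider.setAllocationLimit(...) here would not compile.
std::size_t useOneSegment(SegmentProvider &provider, std::size_t size) {
    Segment segment = provider.request(size);
    std::size_t got = segment.size;
    provider.release(segment);
    return got;
}

} // namespace split_sketch
```

Whoever owns the LimitedAllocator sets the limit through the wider interface, then passes it to consumers as a SegmentProvider reference.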

I think it's the fact that both of these ideas are present in the memory hierarchy that both justifies the naming and causes a bit of confusion.

Actually now that I think of it, with your definitions, it can be asserted an Allocator can be a Provider, but a Provider cannot be an Allocator. I think in that case, having J9::DebugSegmentProvider and TR::MemoryPool (both technically Allocators) extend TR::SegmentProvider is fine. The distinction just needs to be well documented is all. What do you think?

> an Allocator can be a Provider, but a Provider cannot be an Allocator.

I think you mean this: An allocator is sufficient for a provider, but a provider is not sufficient for an allocator. If so, our models match.

For TR::MemoryPool did you mean TR::SegmentPool ? If so, that gets segments from another TR::SegmentProvider. When a segment is released, if it's the right size, it keeps it around to re-use later, but it shouldn't be creating new TR::MemorySegment objects.

> An allocator is sufficient for a provider, but a provider is not sufficient for an allocator.

Yes, that's a more precise way of putting it.

> For TR::MemoryPool did you mean TR::SegmentPool ?

Yeah sorry, I did mean TR::SegmentPool.

> If so, that gets segments from another TR::SegmentProvider. When a segment is released, if it's the right size, it keeps it around to re-use later, but it shouldn't be creating new TR::MemorySegment objects.

I see, so the distinction here is that when an implementer of TR::SegmentProvider uses another TR::SegmentProvider, it is not an Allocator because it doesn't create something, but instead uses the other TR::SegmentProvider to provide it with the memory? How does that differ from what you said earlier:

> Even if the space is coming from an already internally-managed memory reservation, each one of the implementers is creating TR::MemorySegment objects for use by a consumer.

In the case of TR::SegmentPool, the space is coming from an already internally-managed memory reservation, namely the other TR::SegmentProvider. So in essence, isn't TR::SegmentPool creating TR::MemorySegment objects for use by a consumer?

(BTW, sorry if it seems like I'm grilling you, I'm just trying to make this all as clear as possible for me and anyone else who wishes to learn about the memory hierarchy. I appreciate all the clarifications 😄 )

Thanks for pulling this together @dsouzai (and thanks for helping to fill in the gaps @lmaisons). This is useful reference material.

> In the case of TR::SegmentPool, the space is coming from an already internally-managed memory reservation, namely the other TR::SegmentProvider. So in essence, isn't TR::SegmentPool creating TR::MemorySegment objects for use by a consumer?

The easiest answer is that it's not creating new TR::MemorySegment objects. Notice that the only type it ever stuffs into its internal data structures is TR::reference_wrapper<TR::MemorySegment>; it always deals in _references_ to TR::MemorySegments.

> Even if the space is coming from an already internally-managed memory reservation, each one of the implementers is creating TR::MemorySegment objects for use by a consumer.

That was probably an overly-pedantic distinction on my part. However, as it now muddies the waters, what I meant by that was acknowledging that in J9, the J9::SystemSegmentProvider wasn't allocating new memory directly, but instead using J9MemorySegments (which could be considered as already 'managed' to some extent). It was still redistributing the allocated memory, and it was still creating new TR::MemorySegment objects. TR::SegmentPool doesn't do either of those things. It simply holds on to TR::MemorySegment objects that are released by a consumer (usually a dead TR::Region) and uses them to fulfill new requests before it goes and asks the lower provider for new ones.

When it can't fulfill a segment request from the pool, and does get a new one, it forwards the new segment directly to the requesting consumer. It doesn't do any carving or redistribution of its own.
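The pooling behaviour described above could be sketched roughly as follows. This is not the actual TR::SegmentPool source: the types live in a made-up pool_sketch namespace, std::reference_wrapper stands in for TR::reference_wrapper, and the pooledSize and maxPooled parameters are assumptions of the example. What it tries to show is that the pool only stores references to segments created by the lower provider and forwards everything else unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

namespace pool_sketch {

// Hypothetical stand-in for TR::MemorySegment.
struct Segment {
    std::size_t size;
};

class SegmentProvider {
public:
    virtual ~SegmentProvider() = default;
    virtual Segment &request(std::size_t requiredSize) = 0;
    virtual void release(Segment &segment) = 0;
};

// Lower provider: the only place where Segment objects are created.
class BackingProvider : public SegmentProvider {
public:
    Segment &request(std::size_t requiredSize) override {
        assert(requiredSize > 0);
        _segments.push_back(std::make_unique<Segment>(Segment{requiredSize}));
        ++_created;
        return *_segments.back();
    }
    void release(Segment &) override {
        // Sketch only: segments are destroyed together with the provider.
    }
    int created() const { return _created; }
private:
    std::vector<std::unique_ptr<Segment>> _segments;
    int _created = 0;
};

// Pool: keeps references to released segments of one size and reuses
// them before asking the lower provider. It never creates a Segment.
class SegmentPool : public SegmentProvider {
public:
    SegmentPool(SegmentProvider &backing, std::size_t pooledSize, std::size_t maxPooled)
        : _backing(backing), _pooledSize(pooledSize), _maxPooled(maxPooled) {}

    Segment &request(std::size_t requiredSize) override {
        if (requiredSize <= _pooledSize && !_stack.empty()) {
            Segment &recycled = _stack.back().get();
            _stack.pop_back();
            return recycled;
        }
        // Nothing suitable in the pool: forward the request, no carving.
        return _backing.request(requiredSize);
    }

    void release(Segment &segment) override {
        if (segment.size == _pooledSize && _stack.size() < _maxPooled)
            _stack.push_back(std::ref(segment));
        else
            _backing.release(segment);
    }

    std::size_t pooledCount() const { return _stack.size(); }
private:
    SegmentProvider &_backing;
    std::size_t _pooledSize;
    std::size_t _maxPooled;
    std::vector<std::reference_wrapper<Segment>> _stack;
};

} // namespace pool_sketch
```

In this sketch, a release followed by a request of the pooled size returns the very same Segment object, and BackingProvider::created() does not change.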

In case it isn't obvious, the above was me. Apparently my password manager decided to use my mothballed username to log in.

Ah ok I see.

Then perhaps we should rename TR::SegmentAllocator to something that conveys the idea that it defines APIs that can be implemented either by a Provider or an Allocator. It just so happens that all implementers of TR::SegmentAllocator are Allocators, but one could easily create something similar to TR::SegmentPool if they wished to impose allocation limits.

For example, if TR::SegmentAllocator was renamed to something like TR::SegmentProviderExtension (I know, terrible name, but it doesn't imply implementers have to be Allocators), then both TR::SegmentProvider and TR::SegmentProviderExtension externally have Provider APIs (release/request). Whether the implementer is really an Allocator under the covers doesn't matter (since as stated above, an Allocator is sufficient for a Provider).

Therefore, perhaps a good rule of memory allocation should be: anyone who requests memory must do so through a Provider. TR::Region obeys this rule since it acquires memory through a TR::SegmentProvider, and anyone using STL containers can use the automatic conversion inside TR::Region to get a TR::typed_allocator that obeys STL Allocator semantics.

So, I think renaming TR::SegmentAllocator to something that implies it is a Provider, or more specifically, that implementers are not obligated to implement Allocator semantics, is something that should be considered, though that's very clearly easier said than done hehe.

Does that conclusion seem sound to you?

Edit: As I reread this, another possibility comes to mind. We can leave TR::SegmentAllocator named as is, with the requirement that all implementers of TR::SegmentAllocator have to be Allocators, whereas not all implementers of TR::SegmentProvider have to be Allocators. The hierarchy as it stands is consistent with this definition. It does mean, however, that if someone wanted to create a Provider with some concept of limiting how much memory it could provide, they would need to create a new class rather than just extend TR::SegmentAllocator.

I had an offline chat with @jdmpapin regarding this a few days ago. We came to the consensus that we don't really gain anything significant by trying to maintain a distinction between a Provider and an Allocator. For the most part, it is sufficient to just view the difference between TR::SegmentProvider and TR::SegmentAllocator as a separation of APIs.

As such, going forward, we should think of the terms Provider and Allocator as interchangeable. When I write up the documentation regarding the compiler memory management, I won't bother with any sort of conceptual distinction between the two, and simply focus on the various implementers and the reason for each one's existence.

Closing this issue since I'm happy with the information I received as part of the discussion. Feel free to reopen if anyone feels there's more to discuss.
