We have discovered an unexpected behavior of the runtime regarding layout of a struct in memory with .NET CLR.
In the following example, the struct BoolAndExplicitStruct
layout will be promoted to an auto-layout because of the presence of the field public bool Bool
and public ExplicitLayoutStruct ExplicitLayout
_together_:
[StructLayout(LayoutKind.Explicit)]
struct ExplicitLayoutStruct {
[FieldOffset(0)] public byte B00;
}
struct RegularStruct
{
public int Field;
}
struct BoolAndExplicitStruct
{
// bool AND a field with an explicit layout struct
// change the layout of this struct to Auto instead of sequential
public bool Bool;
public ExplicitLayoutStruct ExplicitLayout;
public int Int32;
public RegularStruct RegularStruct;
public long Int64;
}
// Field Int64: starts at offset 0
// Field Int32: starts at offset 8
// Field Bool: starts at offset 12
// Field ExplicitLayout: starts at offset 16
// Field RegularStruct: starts at offset 24
This is causing lots of trouble on the Burst compiler on our side, because as we are using these structs in native code, we expect the same sequential+explicit layout than .NET, but we were not expecting an implicit auto-layout promotion in this case (We haven't implemented today the auto-layout because we fear its implementation dependent behavior)
It seems to be related around this function CheckIfDisqualifiedFromManagedSequential or the caller of this function, but haven't identified exactly in the code when this flip occurs.
Do you know the reason of this auto-layout promotion? Could we revert that behavior? (We will be happy to make a PR)
Happens on all OS, all CPU and all .NET version (including .NET framework) but not on Mono (because Mono doesn't have auto-layout promotion afaik)
Nope.
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.
I'll just paste the relevant part of the doc for StructLayoutAttribute like I did in discord: "For non-blittable types, it controls the layout when the class or structure is marshaled to unmanaged code, but does not control the layout in managed memory."
The native layout of the struct is as one would expect based on the fact that C# structs are implicitly marked Sequential: Marshal.OffsetOf<BoolAndExplicitStruct>("Bool") == 0
Sequential layout specification doesn't control the managed view because BoolAndExplicitStruct
is not blittable (there's the bool field) - this behavior falls out from the doc.
There was a similar conversation about StructLayout specification not applying to the managed view e.g. here: https://github.com/dotnet/runtime/issues/1770#issuecomment-574979786
Sequential layout specification doesn't control the managed view because BoolAndExplicitStruct is not blittable (there's the bool field) - this behavior falls out from the doc.
That's a bit incorrect. The presence of a bool doesn't change the layout. You can check by yourself. Here, that's the presence of a non-blittable type AND an explicit struct layout field. That's why it is so unexpected.
For example the following BoolAndExplicitStruct
struct is perfectly sequential:
[StructLayout(LayoutKind.Explicit)]
struct ExplicitLayoutStruct {
[FieldOffset(0)] public byte B00;
}
struct RegularStruct
{
public int Field;
public bool Bool;
}
struct BoolAndExplicitStruct
{
// bool AND a field with an explicit layout struct
// change the layout of this struct to Auto instead of sequential
public bool Bool;
//public char Char; // same behavior than bool
//public ExplicitLayoutStruct ExplicitLayout;
public int Int32;
public RegularStruct RegularStruct;
public long Int64;
}
but if you uncomment public ExplicitLayoutStruct ExplicitLayout
you will get an auto-layout.
I was not aware of any auto-layout promotion except when you have a managed object inside one of the struct. But this auto-layout seems completely arbitrary and, well, quite wrong to me.
Until now, I was always expecting that auto-layout promotion would only happen in the presence of some managed objects in any embedded structs. But this auto-layout promotion in the presence of only unmanaged type is breaking these expectations quite hard, it makes basically native struct sharing barely impossible without using explicit layout everywhere (and explicit layout is awful to work with when you start to have a variable pointer type 32/64 bits).
The presence of a bool doesn't change the layout.
The presence of the bool makes the struct non-blittable - at that point the runtime is free not to respect the StructLayoutAttribute. It will often end up sequential in the managed view too, but details of when that happens are undocumented. I tried to look up the reasoning behind why the runtime ends up laying it out differently in this case, but that change was made before 2010 and is more difficult to dig out.
The presence of the bool makes the struct non-blittable - at that point the runtime is free not to respect the StructLayoutAttribute. It will often end up sequential in the managed view too, but details of when that happens are undocumented. I tried to look up the reasoning behind why the runtime ends up laying it out differently in this case, but that change was made before 2010 and is more difficult to dig out.
Indeed, and I would love to understand the reason for this specific case, and if we can revert that behavior for unmanaged types (blittable or not)
I always wondered why it was decided to map a boolean against an int in memory instead of a byte?
Once the struct is considered non-blittable, the runtime checks if the struct is disqualified from being managed sequential. Since one of the fields is not managed sequential (it's explicit), the type is disqualified from having a sequential managed layout.
Once the struct is considered non-blittable, the runtime checks if the struct is disqualified from being managed sequential. Since one of the fields is not managed sequential (it's explicit), the type is disqualified from having a sequential managed layout.
Why having an explicit layout of a field is disqualifying the struct for being sequential? Why promoting to auto-layout? Also, if you remove the bool but keep the explicit layout field in my example above, it won't promote the struct to auto-layout. So what you describe is not exactly what is happening.
I always wondered why it was decided to map a boolean against an int in memory instead of a byte?
In managed, a bool is mapped to a byte. The problem mentioned above doesn't change this layout for the bool type (it's still 1 byte). The problem is implicit promotion to auto-layout of the struct, but an auto-layout won't change the size of the blittable types, unlike P/Invoke interop that will involve marshalling behaviors.
Why in the context of interop marshalling a bool is translated to int? I believe it goes back to the Windows API having declared the C++ BOOL
to actually map to a DWORD
so an int32
, so .NET selected this behavior back in the days to match the Windows layout for BOOL.
If you remove the bool field then the structure is blittable, so layout is preserved. Explicit only disqualifies the type if the type is non-blittable, as I said.
The reasoning behind this design is from long before my time and from before the source code was imported into the CoreCLR repo, so I'm not sure what the original reasoning was.
If you remove the bool field then the structure is blittable, so layout is preserved. Explicit only disqualifies the type if the type is non-blittable, as I said.
Oh, ok, I see, sorry for my misunderstanding.
The reasoning behind this design is from long before my time and from before the source code was imported into the CoreCLR repo, so I'm not sure what the original reasoning was.
Ok. Maybe @jkotas knows about this? I'm still struggling to see a practical reasons for this choice. And it's really hurting interop to native scenario where you don't want to involve marshalling rules. It makes almost impossible to use sequential struct with native code, unless you are sure that you don't hit the implicit auto-layout promotion rules.
Some of our users are starting to request us to implement the rules of auto-layout of CoreCLR in our compiler, which is pretty bad, because it will put us in a risky positions of being broken very easily by the CLR. I know all of these are implementation details but really, that's not great to have such behavior. I have been doing low level interop for more than 10 years, and I'm quite choked to discover such a rule so late, it's scary.
And it's really hurting interop to native scenario where you don't want to involve marshalling rules.
If that's the scenario, you just need to make sure the structure is blittable - the bool field makes the structure not blittable, so interop is going to have to do marshalling (I think this applies even if the managed layout happens to match the native layout by accident). The difference in layout is only happening because the structure is not blittable.
If that's the scenario, you just need to make sure the structure is blittable - the bool field makes the structure not blittable, so interop is going to have to do marshalling (I think this applies even if the managed layout happens to match the native layout by accident). The difference in layout is only happening because the structure is not blittable.
Yeah, but it is basically asking that we won't allow our users to use bool or char or we will have to post process the assemblies and do horrible things with IL to make it pass. That's quite a big deal. Specially because also Mono doesn't do this kind of re-auto-layout, so we discovered only this case because some of our users are running Burst on .NET Core.
This rule of non-blittable promoting to auto-layout doesn't make sense for me, it seems completely arbitrary without any technical reasons.
Maybe @jkotas knows about this?
I do not know why it is done this way.
The casting would be a perf killer, but this one is blittable ...
[System.Serializable]
public struct bBool // : IComparable, IComparable<bBool>, IConvertible, IEquatable<bBool>, IFormattable ... etc.
{
private byte m_value; // Add all attributes for Unity that you may need
public bBool(bool value) => m_value = (byte)(value ? 1 : 0);
public bool IsTrue => m_value == 1;
public bool IsFalse => m_value != 1;
public void Switch() => m_value = (byte)(m_value == 1 ? 0 : 1);
public void ToTrue() => m_value = (Byte)1;
public void ToFalse() => m_value = (Byte)0;
public override string ToString() => m_value == 1 ? "True" : "False";
public static implicit operator bBool(bool value) => new bBool(value);
public static implicit operator bool(bBool value) => value.m_value == 1;
}
I have been doing low level interop for more than 10 years, and I'm quite choked to discover such a rule so late, it's scary.
馃憜馃徏 This
The runtime may be _permitted_ to to reorder fields on this struct, but I don't think anyone outside of people on the CoreCLR team would think that blatantly ignoring sequential layout is reasonable.
Would you expect this struct to have a completely different layout compared to @xoofx's BoolAndExplicitStruct
?
public struct BoolAndByteInsteadOfStruct
{
public bool Bool;
public byte B;
public int Int32;
public RegularStruct RegularStruct;
public long Int64;
}
When I look at both structs, I see the following:
B
/ExplicitLayout
Int32
RegularStruct
Int64
Even if I knew that the bool makes this non-blittable and therefore layout rules go out the window, I'd at least expect the runtime to choose the same layout for each of them but it doesn't:
==== Managed Layout of BoolAndExplicitStruct ====
System.Int64 Int64 @ 0
System.Int32 Int32 @ 8
System.Boolean Bool @ 12
ExplicitLayoutStruct ExplicitLayout @ 16
RegularStruct RegularStruct @ 24
==== Managed Layout of BoolAndByteInsteadOfStruct ====
System.Boolean Bool @ 0
System.Byte B @ 1
System.Int32 Int32 @ 4
RegularStruct RegularStruct @ 8
System.Int64 Int64 @ 16
If it consistently performed automatic layout (and the layouts are the same if you explicitly give the runtime permission to do automatic layout with LayoutKind.Auto
) it'd be one thing. However the fact that it generally does the expected thing makes this gotcha even more unpleasant.
the bool field makes the structure not blittable
In my eyes the fact that the runtime considers bool
and char
to be non-blittable is an unfortunate artifact of a decision made in 2002 to make interop with Windows easier (at the expense of everything else.) In modern times this feels more like a gotcha than something beneficial.
Is it at all feasible to have either a module attribute or a feature flag that says "No really, unless I say otherwise using MarshalAs
, I accept bool is 8 bits and char is a 16 bit UTF16 character"? Or barring that, an additional struct layout which is essentially SequentialManaged
?
(Heck, I'd love a ISeriouslyNeverWantMarshalingLogicToRunForPInvokesOnThisAssemblyAndYouShouldThrowAnExceptionIfItWereNeededAttribute
, but that's getting off topic.)
Is it at all feasible to have either a module attribute or a feature flag that says "No really, unless I say otherwise using
MarshalAs
, I accept bool is 8 bits and char is a 16 bit UTF16 character"?
So I've been doing some digging, and I found out that this kinda already exists for char
.
IsFieldBlittable
in fieldmarshaler
is what determines whether or not a field is considered blittable or not. In this function one of the check is affectively "If a char field is marked as unicode or marshaled as an I2 or U2, it is blittable":
This means if you mark your module with [module: DefaultCharSet(CharSet.Unicode)]
, char
will be blittable by default.
Unfortunately there is no such escape hatch for ELEMENT_TYPE_BOOLEAN
(even if you manually specify a 1 byte MarshalAs
.) I would like propose two changes:
1) Add an escape hatch for booleans when they are marshaled in a blittable fashion:
// The lack of NATIVE_TYPE_DEFAULT is intentional here.
// This value really just means "no marshaling specified",
// which does not represent the native type in the case of booleans.
case ELEMENT_TYPE_BOOLEAN:
isBlittable = (nativeType == NATIVE_TYPE_I1) || (nativeType == NATIVE_TYPE_U1);
break;
2) Introduce a new module-level attribute similar to DefaultCharSetAttribute
but for booleans:
namespace System.Runtime.InteropServices
{
[AttributeUsage(AttributeTargets.Module, Inherited = false)]
public sealed class DefaultBooleanMarshalingAttribute : Attribute
{
// Note that I'm not at all attached to the idea of specifying an UnmanagedType here.
// I'm only expressing it this way to keep with the spirit of DefaultCharSetAttribute
public DefaultBooleanMarshalingAttribute(UnmanagedType unmanagedType)
=> UnmanagedType = unmanagedType;
public UnmanagedType UnmanagedType { get; }
}
}
As with DefaultCharSet
, Roslyn would have special understanding of this attribute and use it to set the marshaling mode of boolean fields when it is present.
Edit: Hmm, one possible issue with this (and probably why there's no escape hatch today) is that bool marshaling protects against awkward bools. In that case perhaps a new NATIVE_TYPE_BLITTABLEBOOL
could be added to CorNativeType
.
I'm not sure that we need an attribute to fix this problem, because today, the promotion to auto-layout doesn't make the struct blittable either, it changes its layout but it doesn't promote a bool to an int, so it's not a problem of marshalling. It seems more an arbitrary encoded rule that is not backup by a technical reasons.
If you look at the code, blittable and sequentiability are two different aspects, but here the trouble comes for sequentiability:
So I would be more in favor to just remove this auto-layout promotion unless there is a managed object.
In my eyes the fact that the runtime considers
bool
andchar
to be non-blittable is an unfortunate artifact of a decision made in 2002 to make interop with Windows easier (at the expense of everything else.) In modern times this feels more like a gotcha than something beneficial.
I agree a 100% with that comment
In that case perhaps a new NATIVE_TYPE_BLITTABLEBOOL could be added to CorNativeType.
Agree. That's why I proposed a workaround bBool struct.
So I would be more in favor to just remove this auto-layout promotion unless there is a managed object.
That and to make a boolean blittable by mapping it to a byte on the native side.
That and to make a boolean blittable by mapping it to a byte on the native side.
I would prefer to not conflate the auto-layout promotion described here and the default non-blittabilty of bool. This can be opened in a separate issue, but here, I don't propose to change this behavior.
It seems more an arbitrary encoded rule that is not backup by a technical reasons.
I do agree, I was basing my investigation/proposal on the other comments stating the fact that bool
is non-blittable is what indicates to the runtime that it has permission to mess with the layout.
I would prefer to not conflate the auto-layout promotion described here and the default non-blittabilty of bool. This can be opened in a separate issue, but here, I don't propose to change this behavior.
The logic in the runtime is that the type will use sequential layout if it is blittable or managed sequential. If both are false (as is with BoolAndExplicitStruct
) it will automatically determine the layout using PlaceInstanceFields
:
They're two sides of the same coin, so addressing either would fix your core issue.
That being said, I do agree it'd be better to consider each issue separately. I'll open up a separate issue tomorrow morning.
The logic in the runtime is that the type will use sequential layout if it is blittable or managed sequential. If both are false (as is with BoolAndExplicitStruct) it will automatically determine the layout using PlaceInstanceFields:
Ah, good catch, I was not able to find where this logic was encoded from the code I linked before. I understand now how things are connected in the code now, thank you.
I think this thread is conflating two things: managed layout and native layout of a type. A type can have two layouts: one that is used within the managed environment, when managed code is operating on the data structures, and one that is used when the type is exposed to unmanaged code through e.g. p/invoke interop, Marshal.StructureToPtr
/PtrToStructure
, etc.
StructLayoutAttribute
controls the native layout. It's documented to do only that. Period.
Sentences like "I have been doing low level interop for more than 10 years, and I'm quite choked to discover such a rule so late, it's scary" make it seem like this thread is about the native layout of the type and the runtime not respecting the attribute when computing the native layout (i.e. when the type is passed as a parameter to a p/invoke that might have an opinion about how the type is laid out in memory). But this discussion and the issue is actually about managed layout. __Managed layout is not relevant to interop. Native code should never see the managed layout of a type. Native code only sees the native layout__.
There's one special case and that's blittable types, where the runtime is required to say managed layout == native layout. In that case StructLayoutAttribute
will obviously be respected, because it needs to be respected for native layout.
Here's some code:
using System;
using System.Runtime.InteropServices;
using System.Runtime.CompilerServices;
Console.WriteLine("Offset of Bool field in the native layout: " + Marshal.OffsetOf<BoolAndExplicitStruct>("Bool"));
BoolAndExplicitStruct b = default;
Console.WriteLine("Offset of Bool field in the managed layout: " +
Unsafe.ByteOffset<bool>(ref Unsafe.As<BoolAndExplicitStruct, bool>(ref b), ref b.Bool));
[StructLayout(LayoutKind.Explicit)]
struct ExplicitLayoutStruct
{
[FieldOffset(0)]
public byte B00;
}
struct BoolAndExplicitStruct
{
public bool Bool;
public ExplicitLayoutStruct ExplicitLayout;
public int Int32;
public long Int64;
}
This will print:
Offset of Bool field in the native layout: 0
Offset of Bool field in the managed layout: 12
Now change the bool Bool
field to be a byte
. This will make the struct blittable. Now the program will print:
Offset of Bool field in the native layout: 0
Offset of Bool field in the managed layout: 0
There's no bug in the interop scenario - the type will be laid out sequentially for p/invokes as expected.
I think the runtime should be free to organize the managed layout of non-blittable types however it sees fit.
I think the runtime should be free to organize the managed layout of non-blittable types however it sees fit.
You describe a current behavior but nobody seems to be able to describe the rationale behind "it sees fit".
I would prefer a simple rule that says: For all unmanaged types (blittable or not), we should stick to the layout requested.
You describe a current behavior but nobody seems to be able to describe the rationale behind "it sees fit".
Hypothetically - it should be possible for the runtime to collect access patterns and drive the field layout using PGO data to optimize CPU cache utilization.
Hypothetically - it should be possible for the runtime to collect access patterns and drive the field layout using PGO data to optimize CPU cache utilization.
Well, yeah, that's for example what I opted in stark-lang where structs are by default auto-layout, not sequential 馃槈
But in .NET, structs are by default sequential, so we are not in the position to fulfill this hypothetical dream. The promotion rule to auto-layout are not described anywhere, so I would prefer to stick to a more predictive rule For all unmanaged types (blittable or not), we should stick to the layout requested
.
The attribute does not control the managed layout.
For reference, the ManagedSequential feature was added in .NET Framework 2.0 and the algorithm has remained unchanged since then. The rationale I would posit by the design is "if the user makes their struct sequential and every nested struct sequential and we don't need to align any reference type fields, we might as well use the sequential layout since we're already calculating it" (before .NET 5, interop layout was calculated at type load time so we calculated the managed sequential layout at the same time as the native sequential layout). Once you add an explicit layout struct field, the runtime figures "I need to do extra validation for explicit layout anyway, so don't do ManagedSequential".
Since I can't find any docs from that era I don't know the rationale for sure, but @MichalStrehovsky is correct that we don't guarantee this behavior. It's just a convenience. If you want your managed layout to match your native layout, then you need to use blittable types.
Based on when it was released, my guess would be that the feature was added so that in C++/CLI, regular managed structs that didn鈥檛 have reference fields would have the same layout as if they were regular C++ structs in unmanaged code. Since C++ has no concept of explicit layout, the user would have to explicitly apply the StructLayout attribute, which means that the type must be a managed type.
Based on when it was released, my guess would be that the feature was added so that in C++/CLI, regular managed structs that didn鈥檛 have reference fields would have the same layout as if they were regular C++ structs in unmanaged code
Makes me wonder whether we would be able to get rid of ManagedSequential if C++/CLI provably can't get to the struct (e.g. a private struct that is not in a C++/CLI assembly). Automatic layout typically ends up with better packing. I also found some hints that ManagedSequential was new in 2.0 - if it wasn't too breaking to go from "auto layout for all non-blittable struct" to "sequential layout for many non-blittable structs" back then between .NET 1.0 to .NET 2.0, maybe it wouldn't be breaking to go back the other way again.
Since I can't find any docs from that era I don't know the rationale for sure, but @MichalStrehovsky is correct that we don't guarantee this behavior. It's just a convenience. If you want your managed layout to match your native layout, then you need to use blittable types.
Well, this is not guaranteed today but we could guarantee it. We could change the doc above and the implementation:
For unmanaged types, LayoutKind.Sequential controls both the layout in managed memory and unmanaged memory.
At least also it would make the Mono runtime and .NET runtime behaving similarly on that and would make things a lot more predictable.
I always wondered why it was decided to map a boolean against an int in memory instead of a byte?
@Ultrahead Since you asked: https://stackoverflow.com/a/28515361
Well, this is not guaranteed today but we could guarantee it. We could change the doc above and the implementation:
Why would we want to cement a behavior with actual perf disadvantages in docs? I assume you made the choice in Stark to have structs be auto by default for perf reasons as well. The managed view of the struct is under runtime's discretion for the same reason. It happens to sometime be sequential in the managed view too, but it's not documented and it might be worth investigating getting rid of that.
I still don't understand why managed layout matters in your scenario. I see references to Burst being used with CoreCLR - but that sounds like interop like any other - Burst code would be considered native code from the runtime perspective and should see the native view of the struct, not the managed view. In the native view, the bool field is always at offset 0 as expected because StructLayout is respected. The managed view is under control of the managed runtime and is not suitable to be exposed to unmanaged code.
Why would we want to cement a behavior with actual perf disadvantages in docs? I assume you made the choice in Stark to have structs be auto by default for perf reasons as well. The managed view of the struct is under runtime's discretion for the same reason. It happens to sometime be sequential in the managed view too, but it's not documented and it might be worth investigating getting rid of that.
Yes, but I made the choice to make struct with auto-layout _by default_, which is very different from the benefits that you are claiming here. Here, you are saying that if we have a default sequential layout but are using char/bool, we might start to layout differently. While the only case where it makes sense today for non-blittable is for managed objects, but for char/bool, really? What do you expect to gain while you won't be able to do this auto-layout promotion/optimizations if there are no char/bool? It seems that you are trying to legitimate the existing behavior for something that has no practical benefits (if struct layout was defaulted to auto, that would be a different discussion)
I still don't understand why managed layout matters in your scenario. I see references to Burst being used with CoreCLR - but that sounds like interop like any other - Burst code would be considered native code from the runtime perspective and should see the native view of the struct, not the managed view. In the native view, the bool field is always at offset 0 as expected because StructLayout is respected. The managed view is under control of the managed runtime and is not suitable to be exposed to unmanaged code.
If you pass a pointer to a struct like this to native code (not a by ref, but an unsafe pointer), the interop layout will do nothing. it will pass directly the pointer of the managed layout, the compiler will not complain, the JIT will be happy also.
For us, the transition to Burst should be transparent for our users, ultimately, we would like Burst to be able to compile entirely a code base, maybe even managed.
In order to detect the issue above, we will have to basically disallow the usage of bool/char entirely in any structs used in Burst code (or we find a way to recompile all our user assemblies by transforming the bool to byte, not even sure it's practical...)
My higher management might start to ask us if we move one day to CoreCLR to fork CoreCLR just because of that issue. I don't want to fork CoreCLR for that. I know that it's not your problem, but at least you get the context about why I'm so keen on trying to revert that blittable behavior. 馃槈
In order to detect the issue above, we will have to basically disallow the usage of bool/char entirely in any structs used in Burst code (or we find a way to recompile all our user assemblies by transforming the bool to byte, not even sure it's practical...
Or force devs to find workarounds like bBool.
All in all it's either quite restrictive or cumbersome
That being said, I do agree it'd be better to consider each issue separately. I'll open up a separate issue tomorrow morning.
@PathogenDavid: great. Please let us know the link to it when you do :)
Or force devs to find workarounds like bBool.
Yeah, we already tried that years ago. Our users hated it. My CTO hates it.
Because also Unity has been always working with Mono, passing a pointer to a struct to native code was giving the same layout. These kind of rules fuels the idea of forking, and I hate that as well.
Yeah, we already tried that years ago. Our users hated it. My CTO hates it
I would hate it also ...
it will pass directly the pointer of the managed layout, the compiler will not complain, the JIT will be happy also.
Right, the runtime/compiler/JIT will be happy about all sorts of incorrect p/invoke declaration. It doesn't make them correct. I guess the runtime could have blocked that, but the users would just go "a-ha! I'll just declare it as void*" so at that point it doesn't matter. If you pass it as ref
instead of pointer, I expect the runtime to fix the layout issue for you.
What do you expect to gain while you won't be able to do this auto-layout promotion/optimizations if there are no char/bool?
Bool is specifically a type that auto layout might be able to pack more efficiently than sequential (unless the user makes conscious packing choices) and is pretty common (much more common to see a bool field than say byte/sbyte/short/ushort that are also prone to have packing problems).
Right, the runtime/compiler/JIT will be happy about all sorts of incorrect p/invoke declaration. It doesn't make them correct. I guess the runtime could have blocked that, but the users would just go "a-ha! I'll just declare it as void*" so at that point it doesn't matter.
Is it possible to warn on call-sites, then? DoNotPassNonBlittableStructPointersToExternMethodsAnalyzer
馃檭
Right, the runtime/compiler/JIT will be happy about all sorts of incorrect p/invoke declaration. It doesn't make them correct. I guess the runtime could have blocked that, but the users would just go "a-ha! I'll just declare it as void*" so at that point it doesn't matter. If you pass it as ref instead of pointer, I expect the runtime to fix the layout issue for you.
We have many unmanaged struct containers in DOTS Unity that are starting to rely on where T : unmanaged
and have plenty of T*
inside them. it's not as simple as just fixing a ref
on the P/invoke layer.
Do many of those structs contain char?
I think it is a good idea to work towards eliminating corner case differences between CLR blittable and C# unmanaged types. The difference between the two concepts is hard to explain, so the less explaining of corner cases we have to do the better. For example, https://github.com/dotnet/runtime/pull/1866 is going in this direction.
Changing layout rules is a breaking change, even if it is in the unspecified territory. It is potentially a breaking change for the R2R file format too.
I do not think we will be able to ever remove the managed sequential concept. It would be too breaking. It means that we have given up the opportunity to auto layout 99+% of the C# unmanaged types unless they are explicitly marked auto layout.
I think getting rid of the explicit struct inside sequential struct promoting to auto-layout may be plausible. We need some data to understand the breaking potential.
I think getting rid of the explicit struct inside sequential struct promoting to auto-layout may be plausible. We need some data to understand the breaking potential.
If we can do that without requiring any user code changes, that would be fantastic.
But I agree, if it is breaking any actual code relying on this undocumented layout then we will have to find another way to support that (e.g maybe a [MarshalAs(U1)]
on the bool to tell the layout that it should not promote to auto in that case because we want the same layout for native than managed, even though that would still be super painful for our users)
@xoofx Do many of those structs you mentioned contain char types?
Most helpful comment
Yeah, we already tried that years ago. Our users hated it. My CTO hates it.
Because also Unity has been always working with Mono, passing a pointer to a struct to native code was giving the same layout. These kind of rules fuels the idea of forking, and I hate that as well.