Powershell: Cannot modify array of custom structs directly

Created on 23 Apr 2020  路  20Comments  路  Source: PowerShell/PowerShell

Steps to reproduce

# custom struct via C#
$source = @"
public struct MyIntStruct
{
   public int x;
}
"@
Add-Type -TypeDefinition $source

# array of 1 MyIntStruct
$arr = New-Object -TypeName MyIntStruct[] -ArgumentList 1
$arr[0].x       # 0 
$arr[0].x = 1   # change to 1
$arr[0].x       # 0 -> Did not change value

Expected behavior

$arr[0].x should be set to 1

Actual behavior

$arr[0].x is not changed

Environment data

Name                           Value
----                           -----
PSVersion                      7.0.0
PSEdition                      Core
GitCommitId                    7.0.0
OS                             Microsoft Windows 10.0.18363
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0鈥
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

Issue-Enhancement Up-for-Grabs WG-Engine

Most helpful comment

Not sure if the rest of this post should be it's own issue, or if it even belongs in this repo since it isn't really actionable, but I'mma put it here.

So I've mentioned that a large part of why this happens and in general why mutable structs are sort of wonky is because PowerShell doesn't and can't support ref. It does have PSReference (aka [ref]), which sort of emulates ref but only really for variables.

Well I do have an idea for how the concept of ref could be implemented in PowerShell safely. It should however be stressed that:

_No one should work on this_

Until a compelling enough reason is demonstrated. The scope of these changes are too large to currently justify the benefit, not just in implementation time but also maintenance.

If there comes a time when ref becomes super important to PowerShell, maybe this will help the poor soul taking on that implementation.

PSDeferredReference

The problem with ref when it comes to PowerShell is that you can't store a ref on the heap. You can however, store all the information needed to create one. This is sort of like what [ref] does,
but it's a bit unwieldy.

  1. It only works with PSVariable as it's source
  2. It requires you to manipulate it through it's Value property

To solve these problems, ref should be implemented as a new type of dynamic meta object similar to PSObject. Given the example of an array reference, the meta object would contain a reference to the whole array and the index requested.

Every binder would need to be updated to account for this.

PSInvokeMemberBinder (and similar)

If passed as an argument, use for the following

  • Strictly typed ref arguments (maybe requiring [ref] or a new keyword/type accelerator)
  • Loosely typed by value arguments (e.g. allow type conversion as usual)
  • Most importantly, the actual meta object should never be passed as is. C# code shouldn't have
    to know what it is

If used as the instance

  • If reference type or immutable value type use the value as the instance
  • If mutable value type use the reference as the instance

PSVariableAssignmentBinder, PSPipeWriterBinder and PSToArrayBinder

Always process by value. Never save or write the meta object somewhere accessible.

Maybe allow saving to a variable with [ref] syntax to mirror C#, but never let the user really see it. For example:

$array = [int[]](0..10)
[ref] $value = [ref] $array[0]

[string]$value.GetType()
# int

[object]$value
# 0

$value = 30
$array[0]
# 30

The most important requirement for the meta object imo is that it must be as invisible as a managed pointer (ref) in C#.

PSGetIndexBinder

When the [ref] syntax is used and a ref return is supported by the indexer, return a deferred reference. This should include at least arrays and ideally also Memory<T>.

PSGetMemberBinder

When the [ref] syntax is used with a mutable field access or (maybe) ref property, return a deferred reference.

All other binders

All others need to be aware of the meta object and operate on the value directly.

ref parameters in PowerShell class methods

This could also be used to add support for ref/out/in parameters in PowerShell class methods. Since class methods are implemented by a generated method calling a scriptblock, the generated method could look something like this:

class MyRefTest {
    static [void] Test([int] $byValueArg, [ref] [int] $byRefArg) {
        $byRefArg = 10
    }
}
public class MyRefTest
{
    public static unsafe void Test(int byValueArg, ref int byRefArg)
    {
        fixed (void* pByRefArg = &byRefArg)
        {
            using var mpByRefArg = new PSManagedPointer((IntPtr)pByRefArg);
            MyRefTest_<staticHelpers>.<Test_1_1>.InvokeHelper(
                instance: null,
                sessionStateInternal: null,
                new object[] { byValueArg, mpByRefArg });
        }
    }
}

internal class PSManagedPointer : PSDeferredReference, IDisposable
{
    private readonly IntPtr _managedPointer;

    private bool _isDisposed;

    public PSManagedPointer(IntPtr managedPointer) => _managedPointer = managedPointer;

    // Unsure if this would be an abstract method on `PSDeferredReference` or just part of
    // some kind of new `PSGetReference` binder.
    public unsafe ref T GetRef<T>()
    {
        // Use this as the runtime equivalent to Roslyn's stackalloc escape analysis.
        if (_isDisposed) throw new ObjectDisposedException(this);
        return ref Unsafe.AsRef((void*)_managedPointer);
    }
}

This could remain safe because you can ensure that the managed pointer would only be used for the lifetime of the method invocation.

Caveats

Since this meta object would need to perform an action every time the deferred reference is used, there are some notable problems.

  1. ref return methods would still not be possible to support. Every access would require another method invocation
  2. Performance would be worse than actual managed pointers. Though probably still not measurably worse than current behavior in most contexts
  3. The binder would need to be less forgiving with out of range and/or null handling. Or require some trickery with Unsafe.AsRef<T>((void*)null) and Unsafe.IsNull<T>(ref T)
  4. Binder code would likely double, maybe triple. Note that Binders.cs is currently 7,762 lines

Edit: Added how it could be used to add support for ref parameters in PowerShell classes as well.

All 20 comments

Best practice for struct/valuetype are to make them immutable.
Reference type and value type don't work the same (try to change a char inside a string)

This code works :

$arr =@([MyIntStruct]@{ x = 1 })
$arr[0].x 
$arr[0] = [MyIntStruct]@{ x = 2 }
$arr[0].x 

Best practice for struct/valuetype are to make then immutable.
reference type and value type don't work the same (try to change a char inside a string)

That is helpful, but not as elegant as you can do in C#.

I just tested in C# and this works...

public struct MyIntStruct
{
   public int x;
}
var arr = new MyIntStruct[1];
arr[0].x = 1;

Shouldn't PowerShell be able to do the same?

Depends on quite a bit, I'm afraid. PS boxes value types implicitly in pretty much every operation, so trying to address things by reference like you're doing here when they're a value type aren't guaranteed to work. C# is able to get a pointer to the struct in-memory, so it's able to modify it. PowerShell, though, loses that reference by boxing the value type (since to box it, it has to copy it and now has a different reference, since it can't box the reference itself).

So by the time you get to setting a value, the struct you set the value on is actually not the same one as in the array.

Should it work? I guess so, probably, depends on what your goal is really. For a quick workaround, I think you can do a simple cast to [object[]] after creating the array and modify the structs within that array instead, since they're already boxed and the reference will remain intact through indexing operations.

Can it be made to work without breaking a significant portion of PS's functionality is a completely different question, though.

This is a few shades out of my usual depth, but that's my present understanding of the problem, as it were. 馃檪

@vexx32 I confirm your workaround is working. Thanks

@iSazonov, if I interpret @SeeminglyScience's comments in https://github.com/PowerShell/PowerShell/issues/12411#issuecomment-619065783 correctly (the details are beyond me), it sounds like this may not be doable, so perhaps the label "Up-for-Grabs" is not appropriate - but I'll let @SeeminglyScience weigh in.

Thanks @mklement0, yeah I would say this isn't feasible without a significant time investment and possibly some large architectural changes.

I think it is not critical and rare case now but we _could_ enhance support for the scenario in future. I guess we will get the request more and more because .Net will use structs in APIs more and more.

Makes sense, @iSazonov.

Given @SeeminglyScience's assessment, however, I wonder if we need a _new label_, say, "Experts-Only" - just to give would-be implementers fair warning that a big and complicated task (that also needs to be carefully evaluated for breaking changes) awaits.

This would complement the existing "First-Time-Issue" label.

It looks like a provocation 馃槃

Yeah, I'm not sure that would be especially constructive? You're almost labelling it "warning: this is hard as heck" which will probably just deter pretty much everyone.

It might be best to create a meta issue so we can discuss that specifically and let the PS team weigh in on that particular point. 馃檪

If "First-Time-Issue" is considered helpful guidance rather than an insult (which it certainly isn't), then so should "Experts-Only" be (if it's about the _wording_, by contrast, that's negotiable).

The point is that some issues are much more intricate and/or high-risk than they look, whereas "Up-for-Grabs" tends to invite casual contributions. The idea is to give would-be contributors fair warning, so as to prevent wasted effort.

In other words: This label _should_ say "this is hard", if there are good reasons for this assessment.

However, I don't feel strongly enough to create a meta issue myself, so I'll leave it at that.

We use "First-Time-Issue" and "Up-for-Grabs" as special labels because they are grabbed external systems like https://up-for-grabs.net/
Do you know systems which grabs "Experts-Only"?

Want to contribute to open source, but not sure where to start?

That's good to know - I had no idea this site existed.

Given that the - perhaps previously unstated - recommendation is to _combine_ "experts-only"
with "up-for-grabs" (just like "first-time-issue" already is), users who come to issues via this site (which is certainly not the only route) would benefit too.

To be honest if I was looking through Up-for-Grabs and this was the first issue I hit, I'd probably move on to a different project.

It's definitely within the realm of possibility that I'm wrong and that there's a simple fix I'm not thinking of, wouldn't be the first time. If I'm not though, then someone either needs to implement a safe version of managed pointers for PowerShell or reorganize the compiler in a pretty complicated way. Neither of those would really be feasible for someone not already very familiar with the internals. I'm also fairly sure neither would be accepted as a PR given the scope and minimal gain.

It's a fair point: perhaps a task of this nature shouldn't be up-for-grabs (which was my original thought too) - especially if you think a PR wouldn't be accepted.

My "experts-only" suggestion was meant to help in case we do leave it up-for-grabs - the latter being the signal that the fix / feature is recognized as potentially valuable, even though the core team has no plans to implement it themselves.

Irrespective of what the right answer is for _this_ issue, I think such a label could be useful in general - except if the consensus is that issues that _would_ warrant this new label should by definition _not_ be up-for-grabs.
Perhaps then we need an "acceptable-limitation" label instead (although, arguably, "resolution-by-design" is sufficient, albeit not as descriptive).

Not sure if the rest of this post should be it's own issue, or if it even belongs in this repo since it isn't really actionable, but I'mma put it here.

So I've mentioned that a large part of why this happens and in general why mutable structs are sort of wonky is because PowerShell doesn't and can't support ref. It does have PSReference (aka [ref]), which sort of emulates ref but only really for variables.

Well I do have an idea for how the concept of ref could be implemented in PowerShell safely. It should however be stressed that:

_No one should work on this_

Until a compelling enough reason is demonstrated. The scope of these changes are too large to currently justify the benefit, not just in implementation time but also maintenance.

If there comes a time when ref becomes super important to PowerShell, maybe this will help the poor soul taking on that implementation.

PSDeferredReference

The problem with ref when it comes to PowerShell is that you can't store a ref on the heap. You can however, store all the information needed to create one. This is sort of like what [ref] does,
but it's a bit unwieldy.

  1. It only works with PSVariable as it's source
  2. It requires you to manipulate it through it's Value property

To solve these problems, ref should be implemented as a new type of dynamic meta object similar to PSObject. Given the example of an array reference, the meta object would contain a reference to the whole array and the index requested.

Every binder would need to be updated to account for this.

PSInvokeMemberBinder (and similar)

If passed as an argument, use for the following

  • Strictly typed ref arguments (maybe requiring [ref] or a new keyword/type accelerator)
  • Loosely typed by value arguments (e.g. allow type conversion as usual)
  • Most importantly, the actual meta object should never be passed as is. C# code shouldn't have
    to know what it is

If used as the instance

  • If reference type or immutable value type use the value as the instance
  • If mutable value type use the reference as the instance

PSVariableAssignmentBinder, PSPipeWriterBinder and PSToArrayBinder

Always process by value. Never save or write the meta object somewhere accessible.

Maybe allow saving to a variable with [ref] syntax to mirror C#, but never let the user really see it. For example:

$array = [int[]](0..10)
[ref] $value = [ref] $array[0]

[string]$value.GetType()
# int

[object]$value
# 0

$value = 30
$array[0]
# 30

The most important requirement for the meta object imo is that it must be as invisible as a managed pointer (ref) in C#.

PSGetIndexBinder

When the [ref] syntax is used and a ref return is supported by the indexer, return a deferred reference. This should include at least arrays and ideally also Memory<T>.

PSGetMemberBinder

When the [ref] syntax is used with a mutable field access or (maybe) ref property, return a deferred reference.

All other binders

All others need to be aware of the meta object and operate on the value directly.

ref parameters in PowerShell class methods

This could also be used to add support for ref/out/in parameters in PowerShell class methods. Since class methods are implemented by a generated method calling a scriptblock, the generated method could look something like this:

class MyRefTest {
    static [void] Test([int] $byValueArg, [ref] [int] $byRefArg) {
        $byRefArg = 10
    }
}
public class MyRefTest
{
    public static unsafe void Test(int byValueArg, ref int byRefArg)
    {
        fixed (void* pByRefArg = &byRefArg)
        {
            using var mpByRefArg = new PSManagedPointer((IntPtr)pByRefArg);
            MyRefTest_<staticHelpers>.<Test_1_1>.InvokeHelper(
                instance: null,
                sessionStateInternal: null,
                new object[] { byValueArg, mpByRefArg });
        }
    }
}

internal class PSManagedPointer : PSDeferredReference, IDisposable
{
    private readonly IntPtr _managedPointer;

    private bool _isDisposed;

    public PSManagedPointer(IntPtr managedPointer) => _managedPointer = managedPointer;

    // Unsure if this would be an abstract method on `PSDeferredReference` or just part of
    // some kind of new `PSGetReference` binder.
    public unsafe ref T GetRef<T>()
    {
        // Use this as the runtime equivalent to Roslyn's stackalloc escape analysis.
        if (_isDisposed) throw new ObjectDisposedException(this);
        return ref Unsafe.AsRef((void*)_managedPointer);
    }
}

This could remain safe because you can ensure that the managed pointer would only be used for the lifetime of the method invocation.

Caveats

Since this meta object would need to perform an action every time the deferred reference is used, there are some notable problems.

  1. ref return methods would still not be possible to support. Every access would require another method invocation
  2. Performance would be worse than actual managed pointers. Though probably still not measurably worse than current behavior in most contexts
  3. The binder would need to be less forgiving with out of range and/or null handling. Or require some trickery with Unsafe.AsRef<T>((void*)null) and Unsafe.IsNull<T>(ref T)
  4. Binder code would likely double, maybe triple. Note that Binders.cs is currently 7,762 lines

Edit: Added how it could be used to add support for ref parameters in PowerShell classes as well.

Thanks for taking the time to put this together, @SeeminglyScience.

Given that this may have to wait for a - at this point hypothetical - future PowerShell version not burdened with the need to maintain backward compatibility (see #6745), I wonder how this fits into the larger sentiment expressed by @lzybkr here (where, as an aside, he also wishes that ETS instance members on value types had not been allowed via the resurrection tables):

Note that I consider PSObject a legacy artifact that might never have been introduced if the DLR existed before PowerShell v1.

I wonder how this fits into the larger sentiment expressed by @lzybkr here (where, as an aside, he also wishes that ETS instance members on value types had not been allowed via the resurrection tables):

Note that I consider PSObject a legacy artifact that might never have been introduced if the DLR existed before PowerShell v1.

I believe he's referring to (@lzybkr feel free to correct me if I'm wrong) PSObject's use as a member bag (e.g. pso.Properties), not it's use as a DLR meta object.

@SeeminglyScience - when implementing DLR support - PSObject as a DLR metaobject was done for completeness, so you are more or less correct. PowerShell obviously needs a property bag type but probably doesn't need the apis exposed by PSObject given the DLR and resurrection tables.

Was this page helpful?
0 / 5 - 0 ratings