PowerShell: Enhance hash table syntax

Created on 20 Oct 2020  ·  10 Comments  ·  Source: PowerShell/PowerShell

Runtime Construct and Populate

At design time the syntax of an array and a dictionary (hash table) may look quite consistent:

$Array = @(1, 4, 9) # or just $a = 1, 4, 9
$Hashtable = @{1 = 1; 2 = 3; 3 = 9}

But at runtime there is quite a difference; it is possible to construct and populate an array at once:

$Array = @(1..3 | %{ $_ * $_ })

but it is not possible to construct and populate a dictionary at once. AFAIK, I always need to construct it first and then populate it:

$Hashtable = @{}
1..3 | %{ $Hashtable[$_] = $_ * $_ }

Basically, it is not possible to put a function call inside the hash table syntax, like @{ MyFunction }, as there is no way to get out of argument mode; compare this with the array syntax, where it is easy to jump to expression mode: @( MyFunction ). For the same reason it is not possible to initialize and populate the hash table from within @{...} at runtime.

The same limitation applies when using a .NET constructor. I can populate an array within an array constructor:

$Array = [array](1..3 | %{ $_ * $_ })

But it is not possible to do something similar from within a hash table constructor:

$Hashtable = [hashtable](1..3 | %{ @{ $_ = $_ * $_ } })

Or to invoke a function like:

$Hashtable = [hashtable]MyKeyValuePairs

InvalidArgument: Cannot convert the "System.Object[]" value of type "System.Object[]" to type "System.Collections.Hashtable".

Purpose
It would be nice if the hash table constructor accepted a list of KeyValuePair (hash table) objects (Object[]) to better support the single-responsibility principle.

Caveats

  • A single item (anything but a key-value pair, e.g. the 2 in @{ 1 = 'One' }, 2, @{ 3 = 'Three' }) probably needs to be considered a key with a $Null value.
  • Duplicate keys need to be handled, e.g.: overwrite by default and throw an error when Set-StrictMode is set or put the related values of the duplicated keys in an array: [hashtable]@{ 1 = 'One' }, @{2 = 'Two' }, @{ 1 = 'Also One' } 🡆 [hashtable]@{ 1 = 'One', 'Also One'; 2 = 'Two' }
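
For illustration, the merge behaviour described in these caveats can be approximated today with a small helper function. This is only a sketch: the name Merge-Hashtable, the -OnDuplicate parameter, and its three policy values are hypothetical and exist nowhere in PowerShell. Items that are not dictionaries (the first caveat) are not handled here.

```powershell
# Hypothetical helper (not part of PowerShell): merges a list of hashtables into one.
function Merge-Hashtable {
    param(
        [Parameter(ValueFromPipeline)]
        [System.Collections.IDictionary] $InputObject,

        # Assumed policy names for duplicate keys; chosen for this sketch only.
        [ValidateSet('Overwrite', 'Collect', 'Throw')]
        [string] $OnDuplicate = 'Overwrite'
    )
    begin {
        # A plain hashtable is used so that integer keys are treated as keys,
        # not as positional indexes (which would happen with [ordered]).
        $result = @{}
    }
    process {
        foreach ($entry in $InputObject.GetEnumerator()) {
            if (-not $result.Contains($entry.Key)) {
                $result[$entry.Key] = $entry.Value
            }
            elseif ($OnDuplicate -eq 'Overwrite') {
                $result[$entry.Key] = $entry.Value
            }
            elseif ($OnDuplicate -eq 'Collect') {
                # Gather the values of duplicated keys in an array.
                $result[$entry.Key] = @($result[$entry.Key]) + $entry.Value
            }
            else {
                throw "Duplicate key: $($entry.Key)"
            }
        }
    }
    end { $result }
}

$merged = @{ 1 = 'One' }, @{ 2 = 'Two' }, @{ 1 = 'Also One' } |
    Merge-Hashtable -OnDuplicate Collect
$merged[1]   # One, Also One
```

Whether overwriting, collecting, or throwing should be the default is exactly the open design question raised above; the sketch merely makes the three options concrete.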

The same goes for a [PSCustomObject]: it isn't possible to construct and populate a new PSCustomObject at once at runtime. That is, to construct a [PSCustomObject] and populate it with custom properties at runtime, you must construct the [PSCustomObject] first and then populate it using Add-Member. See e.g. the StackOverflow question Is it possible to do an for “for” in an PSObject?.

$Object = [PSCustomObject](1..3 | %{ @{ "Name$_" = $_ * $_ } })

There is no error, but instead, I get a list of hash table objects ([hashtable[]]) rather than a single PSCustomObject with 3 properties:

PS C:\> $Object.Count
3
PS C:\> $Object[0].GetType()
IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     Hashtable                                System.Object

There is probably a good explanation for this but I would expect a single [PSCustomObject] with three properties, like:

Name1 Name2 Name3
----- ----- -----
    1     4     9

Presuming that for a [PSCustomObject] it would mean a breaking change, a [Hashtable] constructor as proposed would still allow something like this:

$Object = [PSCustomObject][Hashtable]MyPropertyFunction

Or directly:

$Object = [PSCustomObject][Hashtable](1..3 | %{ @{ "Name$_" = $_ * $_ } })
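
For comparison, a sketch of what works in current PowerShell versions: populate an ordered hash table first, then cast that single dictionary to [PSCustomObject]. The variable name $Properties is only illustrative.

```powershell
# Two steps today: populate an ordered dictionary, then cast it.
$Properties = [ordered] @{}
1..3 | ForEach-Object { $Properties["Name$_"] = $_ * $_ }

$Object = [PSCustomObject] $Properties
$Object.Name2   # 4
```

The string keys avoid the positional-index ambiguity that integer keys have with [ordered].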

This proposal is related to #5643.
In addition to the shortcut proposal from @iSazonov for the object @[] shortcut accelerator, @{} could also be a shortcut for the hash table accelerator (wishful thinking?):

$Hashtable = @{}MyPropertyFunction
$Object = @[]$Hashtable

Design-time Construct and Populate

I am not sure whether this part makes sense, in the sense of being more or less intuitive/consistent with the current syntax, but I'd like to mention it anyway as it has at least some consistency with the runtime proposal above:

To assign an array, it is possible to just supply a list of items separated with a comma:

$Array = 1, 2, 3

It would be nice to also be able to do that for creating a list of key-value pairs, e.g.:

$KeyValues = 1 = 'One', 2 = 'Two', 3 = 'Three' 

Knowing that this currently has no meaning and returns an error:

Line |
   1 |  $KeyValues = 1 = 'One', 2 = 'Two', 3 = 'Three'
     |               ~
     | The assignment expression is not valid. The input to an assignment operator must be an object that is able to accept assignments, such as a variable or a property.

But it could return a list of [hashtable] items, similar to:

$KeyValues = @{ 1 = 'One' }, @{ 2 = 'Two' }, @{ 3 = 'Three' }

Which on its own could be used for simple enumerations (even though there is no binary index and it requires enumerating twice):

$KeyValues.GetEnumerator().GetEnumerator() | Foreach-Object { Write-Host ('{0} = {1}' -f $_.Key, $_.Value) }

Which could then be cast to a single hash table as suggested in the Runtime section above, like:

$Hashtable = [hashtable](1 = 'One', 2 = 'Two', 3 = 'Three')

Caveats

  • String keys would always need to be quoted: $Hashtable = [hashtable]('One' = 1, 'Two' = 2, 'Three' = 3)
  • Variable keys would need to be double-quoted: $Hashtable = [hashtable](1..3 | %{ "$_" = $_ * $_ }), which is probably unsafe.
Issue-Enhancement WG-Language

Most helpful comment

but it is not possible to _construct and populate_ a dictionary at once. AFAIK, I always need to construct it first and then populate it:

$Hashtable = @{}
1..3 | %{ $Hashtable[$_] = $_ * $_ }

In PSv7 you can do this:

1..3 | % { ($h ??= @{})[$_] = $_ * $_ }

and prior to that you can do

1..3 | % -b { $h = @{} } -pr { $h[$_] = $_ * $_ }

Personally if something were to be added I'd rather it be a command like:

0..10 | ConvertTo-Hashtable { $_ } { $_ * $_ }

I think there's an issue or RFC for something like that but I'm not sure.

All 10 comments

at runtime there is quite a difference; it is possible to construct and populate an array at once:

$Array = @(1..3 | %{ $_ * $_ })

but it is not possible to construct and populate a dictionary at once. AFAIK, I always need to construct it first and then populate it:

$Hashtable = @{}
1..3 | %{ $Hashtable[$_] = $_ * $_ }

Yes, this is normal. You don't need the @() array-subexpression operator in the first one: $Array = 1..3 | %{ $_ * $_ } gives the same result. Multiple things naturally become [object[]] in PowerShell; if you want a string array or an int array you have to define it up front. So that's a special behaviour.

You can concatenate hash tables.
$h=@{} ; 1..3 | %{ $h+= @{$_ = $_* $_}}
Also works, although it is no better.
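
A small sketch of the concatenation approach, written out with ForEach-Object instead of the % alias. Two things are worth knowing: each += builds a new hash table rather than extending the existing one, and adding a table that repeats an existing key raises an error instead of overwriting.

```powershell
$h = @{}
1..3 | ForEach-Object { $h += @{ $_ = $_ * $_ } }
$h[2]   # 4

# Adding a hash table that repeats an existing key throws; $h is left unchanged.
try {
    $h += @{ 2 = 'again' }
}
catch {
    "Merge failed: $($_.Exception.Message)"
}
```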

@{ "Name$_" = $_ * $_ } says "make a one-item hash table", so 1..3 | %{ @{ "Name$_" = $_ * $_ } } says "make 3 one-item hash tables", which is why it's not so great with [PsCustomObject].

$Array = [array](1..3 | %{ $_ * $_ })

Says take 1,4,9 (already an array) and cast it to array. Simple.

$H = [hashtable](1..3 | %{ @{ $_ = $_ * $_ } })

Says take an array of 3 hash tables and cast them to a hash table, which is unwieldy. And as a way of saying "initialize a hash table and put this in it", it is longer than where you started:
$H = @{}; 1..3 | %{ $H[$_] = $_ * $_ }

To assign an array, it is possible to just supply a list of items separated with a comma:
$Array = 1, 2, 3

Well yes but if you were consistent with what you first wrote you'd use $Array = @(1, 2, 3)

It would be nice to also be able to do that for creating a list of key-value pairs, e.g.:
$KeyValues = 1 = 'One', 2 = 'Two', 3 = 'Three'

Well if we were being consistent that would be $KeyValues = @{1 = 'One', 2 = 'Two', 3 = 'Three'}

Don't forget you can write $KeyValues = @{$won = 'One',} but this
$KeyValues = $won = 'One'
already has a meaning, so I don't think it would be nice at all.

Incidentally

$KeyValues = 1 = 'One', 2 = 'Two', 3 = 'Three'
are value pairs until we take them somewhere where one becomes a key. We might have

$KeyValues = @{One = 1    
               Two = 2}

Without needing the quotes

$Hashtable = [hashtable](1 = 'One', 2 = 'Two', 3 = 'Three')
So what you've done is you've replaced @ with [hashtable] {} with () and ";" or linebreak with "," and made a requirement for string literal keys to be quoted.

I'm not seeing an improvement here. @SeeminglyScience's last suggestion is a good one and could support lots of ways of passing in the keys and values.

I also like @SeeminglyScience's idea, but the command should probably be named differently (and probably needs an option to create an _ordered_ hashtable or, better yet, do so by default):

0..10 | New-Hashtable { $_ } { $_ * $_ }

You could then do [pscustomobject] (New-Hashtable ...) if you wanted a custom object.

I suppose another option would be to add a new parameter set to New-Object (with the two script blocks) to construct a [pscustomobject] instance by default and give it an -AsHashtable switch.

(A ConvertTo-Hashtable might be nice for converting [pscustomobject]s to hash tables, though you could argue that syntactic sugar [hashtable] $customObject is called for, to complement [pscustomobject] $hashTable; as usual, fitting the [ordered] in there would be awkward; it is too late, but I really wish that PowerShell would simply _always_ use ordered hashtables; sigh).


@iRon7:

I get a list of hash table objects ([hashtable[]])
There is probably a good explanation for this

The explanation is that PowerShell only implements the [pscustomobject] _syntactic sugar_ for a single operand _of type (ordered) [hashtable]_ (but sadly not for a generic Dictionary`2, though #13727 may change that).

The unfortunate part is that for non-supported operands the cast is the same as casting to [psobject], which is a virtual no-op: it simply wraps the operand in a - largely invisible, except when not (#5579) - [psobject] instance.
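
The difference between the two cases can be seen with a short sketch (variable names are illustrative):

```powershell
# A single hash table operand: the cast creates a custom object with properties.
$single = [pscustomobject] @{ Name1 = 1 }
$single.GetType().Name      # PSCustomObject

# An array of hash tables: the cast is effectively a no-op; the array stays an array.
$array = [pscustomobject] (@{ Name1 = 1 }, @{ Name2 = 4 })
$array.Count                # 2
$array[0].GetType().Name    # Hashtable
```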

See #13836 and #13838

@mklement0, Thanks for the comments and the interesting linked issues.
The idea from @SeeminglyScience's:

0..10 | New-Hashtable { $_ } { $_ * $_ }

Is indeed nice, although it is a little confusing next to the input-processing script blocks of the Foreach-Object cmdlet (unless you use named parameters):

0..10 | Foreach-Object { $_ } { $_ * $_ }

Because the first script block has a different meaning.
You could also consider:

0..10 | New-Hashtable { @{ $_ = $_ * $_ } }

But that causes embedded curly brackets (which is not so nice but quite common in PowerShell; think of calculated properties).

it is too late, but I really wish that PowerShell would simply always use ordered hashtables; sigh

I completely agree, I was thinking: _what would a change for this actually break? Enumerations will be similar and you can't index on an unordered hashtable anyway (it would be very fragile, as it will likely break immediately when you add something). But the problem is all the explicit (parameter) definitions which are currently in the field._
One step in this direction can be taken anyway, without breaking anything yet, by simply encouraging the use of the IDictionary interface for parameters: #13852

unless you use named parameters

I think the separate script blocks are preferable, and using named arguments would bring the necessary clarity, along the lines of:

0..10 | New-Hashtable -KeyScriptBlock { $_ } -ValueScriptBlock { $_ * $_ }
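
To make the idea concrete, here is a minimal sketch of how such a command could be written as a script function. New-Hashtable is not an existing cmdlet; the function body, and the choice to return an unordered hash table, are assumptions for illustration only.

```powershell
# Hypothetical sketch; New-Hashtable does not exist in PowerShell.
function New-Hashtable {
    param(
        [Parameter(Mandatory)] [scriptblock] $KeyScriptBlock,
        [Parameter(Mandatory)] [scriptblock] $ValueScriptBlock,
        [Parameter(ValueFromPipeline)] $InputObject
    )
    begin { $table = @{} }
    process {
        # ForEach-Object -InputObject binds $_ to the current item without enumerating it.
        $key   = ForEach-Object -InputObject $InputObject -Process $KeyScriptBlock
        $value = ForEach-Object -InputObject $InputObject -Process $ValueScriptBlock
        $table[$key] = $value
    }
    end { $table }
}

$squares = 0..10 | New-Hashtable -KeyScriptBlock { $_ } -ValueScriptBlock { $_ * $_ }
$squares[3]   # 9
```

A later duplicate key silently overwrites the earlier value in this sketch; a real implementation would have to decide on that behaviour explicitly.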

what would a change for this actually break?

Probably not too much overall, but, in addition to [hashtable]-typed parameters you mention, think of cases where someone uses a "polymorphic" (untyped, implicitly [object]) parameter and then reflects on its type: $param -is [hashtable].
(By the way, there's a typo in your comment: you meant IDictionary, not IDirectory.)

It is similar to the idea of making PowerShell use System.Collections.Generic.List`1 instead of arrays by default (see https://github.com/PowerShell/PowerShell/issues/5643#issuecomment-349811857): again, most scripts wouldn't notice, but some could break.

Both ideas - switching to default use of ordered hashtables and lists rather than arrays - are prime candidates for #6745.

(As an aside: System.Collections.Generic.Dictionary`2 is _not_ ordered, and from what I can tell there is currently no generic counterpart to System.Collections.Specialized.OrderedDictionary - see https://stackoverflow.com/q/2629027/45375)


As an aside:

you can't index on an unordered hashtable anyways (it will be very fragile

You _fundamentally_ cannot _numerically_ index into a [hashtable], that only works with [ordered].
With [ordered], there is a risk of ambiguity with numeric keys:

([ordered] @{ 1 = 'one'; 2 = 'two' })[1]
two # !! `1` was interpreted as *positional* index, not as a key.
# Workarounds: `.1` or `[[object] key]`

what would a change for this actually break?

Probably not too much overall, but, in addition to [hashtable]-typed parameters you mention, think of cases where someone uses a "polymorphic" (untyped, implicitly [object]) parameter and then reflects on its type: $param -is [hashtable].

Yeah that's a good example. Another is a map of ints:

$h = @{
    1 = 2
}

# Check if the key `0` exists.
if (-not $h[0]) { do something }
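
A short sketch of why this pattern would change meaning if hash table literals became ordered: with an integer argument, the ordered dictionary's indexer is positional.

```powershell
$plain   = @{ 1 = 2 }
$ordered = [ordered] @{ 1 = 2 }

$null -eq $plain[0]      # True: there is no key 0, so the lookup yields $null
$ordered[0]              # 2: the integer is taken as a position, not as a key

# Explicit key tests are unambiguous for both types.
$plain.ContainsKey(0)    # False
$ordered.Contains(0)     # False
```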

Also worth noting that it would probably double the allocation size (or at least add one third), since it's basically a hashtable and also an ArrayList. I've seen hashtables get pretty big in PS scripts too, so it might push some into OOM (but that's probably unlikely).

0..10 | New-Hashtable -KeyScriptBlock { $_ } -ValueScriptBlock { $_ * $_ }

In relation to Encourage using IDictionary interface for parameters #13852, it might be preferable to create an _ordered_ dictionary by default:

0..10 | New-Dictionary -KeyScriptBlock { $_ } -ValueScriptBlock { $_ * $_ }

and have mutually exclusive switches like -AsHashTable (default ordered) and -Sorted along with a -CaseSensitive switch.
(Would it make sense to introduce a new PowerShell specific type, something like: [PSDictionary]/[PSCustomDictionary] with PowerShell specific features and defaults. Or is this _way out of line_?)

By the way, a common use case for runtime construction and populating, is (binary) indexing a collection of PSCustomObjects for a fast join with another PSCustomObjects collection, where I would like to use a syntax like this:

$IndexByName = $PSObjectCollection | New-Dictionary -KeyScriptBlock { $_.Name } -ValueScriptBlock { $_ }
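
For reference, a sketch of how such an index can be built with what exists today; the sample data is made up. Besides the explicit loop, Group-Object with its -AsHashTable and -AsString switches gives a similar result, except that each value is a group (a collection) rather than a single object.

```powershell
# Made-up sample data for illustration.
$PSObjectCollection = @(
    [pscustomobject] @{ Name = 'Alice'; Age = 30 }
    [pscustomobject] @{ Name = 'Bob';   Age = 25 }
)

# Explicit construct-then-populate:
$IndexByName = @{}
foreach ($item in $PSObjectCollection) { $IndexByName[$item.Name] = $item }
$IndexByName['Bob'].Age   # 25

# Built-in alternative; values are groups of objects sharing the same Name.
$GroupedByName = $PSObjectCollection | Group-Object -Property Name -AsHashTable -AsString
$GroupedByName['Bob'].Age
```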

I definitely like the idea of naming the cmdlet New-Dictionary, defaulting to an ordered hashtable (System.Collections.Specialized.OrderedDictionary), and the -CaseSensitive switch.

When you say -Sorted, did you mean -Ordered or were you thinking of _sorted_ dictionaries (sorted by _keys_, as opposed to _ordered_, which maintain _definition order_): SortedList`2 and SortedDictionary`2 (they fundamentally behave the same, but are optimized for different things).
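
The distinction between ordered and sorted can be shown with a brief sketch, using the existing .NET types (the ::new() syntax assumes PowerShell 5 or later):

```powershell
$ordered = [ordered] @{}
$sorted  = [System.Collections.Generic.SortedDictionary[string, int]]::new()

foreach ($name in 'cherry', 'apple', 'banana') {
    $ordered[$name] = $name.Length
    $sorted[$name]  = $name.Length
}

$ordered.Keys -join ', '   # cherry, apple, banana  (definition order)
$sorted.Keys  -join ', '   # apple, banana, cherry  (sorted by key)
```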

Instead of having multiple switches, I suggest a single -As or perhaps better -Kind parameter that accepts an enumeration value; e.g.:

New-Dictionary -Kind { Unordered | Ordered | Sorted | SortedList }

I don't see the need for a custom, PS-specific dictionary type, but perhaps you can elaborate on why you think it would be a good idea.

@mklement0,

did you mean -Ordered or were you thinking of sorted dictionaries

I really meant Sorted. Needless to say, I am not familiar with most dictionary types (except for Hashtable, Ordered and Sorted dictionaries) and do not quite understand the hierarchy and which type/interface is derived from which type/interface.

Instead of having multiple switches, I suggest a single -As

A single -As parameter is indeed better than multiple (mutually exclusive) switches, except for the Unordered value, as Unordered would also imply that the dictionary is "Unsorted" (which is confusing). As far as I can tell, the common type name for this (anything that is not Ordered | Sorted | SortedList) is in fact the well-known [Hashtable]:

New-Dictionary -As { Hashtable | Ordered | Sorted | SortedList }

a custom, PS-specific dictionary type, but perhaps you can elaborate

  • This is partly related to the Encourage using IDictionary interface for parameters #13852 and your new Support hashtable initializers for all IDictionary types #13873 issues. I can't fully oversee this, but using IDictionary as a parameter type might be too wide a filter, letting in dictionaries that fail when iterated further down in a script. A specific PowerShell dictionary type (which can also be used as a parameter type) might give more control over dictionary types that are supposed to iterate well in a PowerShell script.
  • Get a consistent set of members, as e.g. a [Hashtable] has a ContainsKey and a ContainsValue method and an [Ordered] type does not.
    (btw, I don't understand why the .NET get_Keys() method is hidden, as using the Keys property could potentially conflict with keys based on a custom list during runtime)
  • To also allow for a [PSDictionary]@{ 1 = 'a'; 2 = 'b'; 3 = 'c' } syntax with (runtime) features like [PSDictionary]{ $PSObjectCollection | % { $_.Name = $_ } } as initially proposed
  • A new _common_ PS-specific dictionary type might be easier to understand for a novice programmer (than dealing with all the dictionary types/interfaces) and with that, could be an easier way to move away from the _unordered_ hash table
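
The member inconsistency mentioned in the second bullet can be checked with a short sketch. Contains() is declared on the IDictionary interface, so it is the one key test that both types share:

```powershell
$hashtable = @{ a = 1 }
$ordered   = [ordered] @{ a = 1 }

$hashtable.ContainsKey('a')   # True
$hashtable.Contains('a')      # True
$ordered.Contains('a')        # True

# The ordered dictionary type exposes no ContainsKey method at all.
$null -eq $ordered.PSObject.Methods['ContainsKey']   # True
```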

Re Hashtable as the enum value: yes, that'll be more familiar to users.

Re custom PS dictionary type:

What you're describing is the purpose that the IDictionary interface already fulfills.

Your ContainsKey / ContainsValue example is a manifestation of the challenges of working with interfaces I've tried to summarize in the linked issue: by only working with types as themselves as opposed to through interfaces, interfaces are all but invisible in PowerShell, though implementing #13865 should help a bit.

VSCode already helps you with IntelliSense, but absent that you must simply know what members the interface itself supports - as opposed to a given type implementing it, which has additional members, which differs across implementing types.

Rather than trying to solve this with a custom type that in effect would emulate an interface, the better solution would be for PowerShell to support interface-typed variables, so that only the interface members are accessible through them, as in C#.

Come to think of it, this may be exactly what @SeeminglyScience may have been advocating in his comment (I didn't see it at the time), though it sounds like a nontrivial undertaking.

Correct, @SeeminglyScience? Do you see the potential for breaking changes if that were to be introduced? Also, casting back to an implementing type would for symmetry also have to be supported (though -as may do in a pinch).
