PowerShell: why unbox single item array?

Created on 14 Nov 2019 · 10 Comments · Source: PowerShell/PowerShell

I know the behavior exists from the beginning and there is a way to cast it as array.

However, this is unexpected behavior and no other languages are doing it, making it very hard to read/write other formats (json/yaml).

Is there a plan to correct this?

Issue-Question Resolution-Answered

All 10 comments

I speak in no official capacity, but the behavior is at the heart of PowerShell and very unlikely to change.

I understand that it may be unexpected if you're not familiar with PowerShell, but once you understand it, it is manageable - and you may see the benefits.

In the context of creating JSON, you now have -AsArray to ensure that even a single input object received through the pipeline is treated as an array (there is no official YAML support yet, but see #3607):

PS> @(1) | ConvertTo-Json -AsArray  # pipeline enumerates the input array, -AsArray treats the input as array
[
  1
]

An alternative is to pass the array as an _argument_:

PS> ConvertTo-Json @(1)
[
  1
]
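
The two invocation styles can be compared side by side. The following is a minimal sketch, assuming PowerShell 7 or later (where -AsArray is available); -Compress is used only to keep each result on one line, and the variable names are illustrative:

```powershell
$single = @(1)   # a single-element array

# Piping enumerates the array, so ConvertTo-Json sees only the scalar 1.
$piped = $single | ConvertTo-Json -Compress

# -AsArray wraps the pipeline input in a JSON array again.
$pipedAsArray = $single | ConvertTo-Json -Compress -AsArray

# Passing the array as an argument avoids pipeline enumeration altogether.
$asArgument = ConvertTo-Json -InputObject $single -Compress

$piped          # -> 1
$pipedAsArray   # -> [1]
$asArgument     # -> [1]
```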

_From_-JSON conversion with ConvertFrom-Json actually _is_ array-preserving at the moment, but that behavior runs so counter to PowerShell's usual behavior that it will be changed - see #3424.

Once the change has been made, you'll be able to use -NoEnumerate to preserve the array status of the input:

# After #3424 is fixed.
PS> ('[ 1 ]' | ConvertFrom-Json -NoEnumerate).GetType().Name
Object[]

> no other languages are doing it

PowerShell is designed both as an interactive shell and as a scripting engine. As a result, it makes some "strange" compromises in order to serve such different scenarios.

Also note that the distinction between arrays and scalar is often unproblematic in PowerShell, because it allows you to treat scalars _as if they were_ arrays; e.g.:

$var = 42 # scalar

$var.Count  # -> 1 - a scalar is a "collection of 1"

$var[0] # -> 42 - a scalar's "first collection element" is the scalar itself

$var[-1] # -> 42 - a scalar's "last collection element" is the scalar itself

Pitfalls:

  • Indexing a scalar [string] instance indexes into its _character array_ instead; e.g., 'foo'[0] is 'f'

  • Calling .Count on lazy IEnumerables (which don't have a .Count property) forces enumeration and then reports 1 _for each enumerated (scalar) object_ (unless they happen to have a .Count property themselves); e.g., ([IO.File]::ReadLines("$pwd/file.txt")).Count reports 1 for each line of the input file.
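
Both pitfalls can be tried out with a short sketch. The scratch file and the variable names are made up for the demonstration; wrapping the lazy enumerable in @(...) is one way to get a real item count:

```powershell
$s = 'foo'

$count  = $s.Count      # -> 1 (the string counts as a single item)
$length = $s.Length     # -> 3 (number of characters)
$first  = $s[0]         # -> f (a [char], not the whole string)
$whole  = @($s)[0]      # -> foo (indexing into the one-element array instead)

# Counting the items of a lazy enumerable: collect it into an array first.
$tmp = [IO.Path]::GetTempFileName()          # scratch file for the demo
Set-Content -LiteralPath $tmp -Value 'a', 'b', 'c'
$lineCount = @([IO.File]::ReadLines($tmp)).Count
Remove-Item -LiteralPath $tmp

$lineCount              # -> 3
```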


Conversely, you can use the "array guarantor" operator (array-subexpression operator), @(...), to ensure that a scalar becomes an array:

$var = 42 # scalar

# @(...) ensures that $var is treated as an array.
# If $var already *is* an array, @(...) is - loosely speaking - a no-op (see pitfall below).
@($var).GetType().FullName  # -> System.Object[]

Alternatively, if you're capturing output from a command or expression in a variable, a simpler alternative is to type-constrain the variable with [array] (effectively the same as System.Object[]):

[array] $arr = 42   # $arr is now a single-element array of type System.Object[]
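
Applied to command output, the two techniques might look like the sketch below. The Where-Object filters and variable names are only illustrative stand-ins for any command that can return zero, one, or several objects:

```powershell
# Without help, the result type depends on how many objects match.
$plain = 1..10 | Where-Object { $_ -eq 5 }
$plain -is [array]          # -> False (a single [int])

# Type-constraining the variable turns a single result into an array.
[array] $constrained = 1..10 | Where-Object { $_ -eq 5 }
$constrained -is [array]    # -> True
$constrained.Count          # -> 1

# @(...) also covers the "no output" case by yielding an empty array.
$none = @(1..10 | Where-Object { $_ -gt 99 })
$none.Count                 # -> 0
```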

Pitfall:

  • @(...) technically _always_[1] creates a _new_ array of type System.Object[].

    • With _command_ output (e.g., Get-ChildItem), this makes no difference: System.Object[] is the array type PowerShell uses anyway when it collects _multiple_ outputs from the pipeline for you, such as when assigning command output to a variable.

    • However, if you wrap something that _already is an array in memory_ in @(...), you not only needlessly create a new array, but its specific typing may be lost; e.g., @([int[]] (1..3)).GetType().FullName yields System.Object[], despite the [int[]]-typed input array.


[1] There is one exception: array _literals_. Because needlessly constructing array literals as @( 1, 2 ) - instead of just 1, 2 (the array-construction operator , is enough) - was such a widespread practice, the @( ... ) is optimized away starting in v5.1, so as to avoid constructing the array _twice_.
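
The loss of typing described in the pitfall can be observed directly. A small sketch, with illustrative variable names:

```powershell
[int[]] $typed = 1..3
$typed.GetType().FullName       # -> System.Int32[]

# Wrapping an existing array in @(...) yields a new, generically typed array.
$wrapped = @($typed)
$wrapped.GetType().FullName     # -> System.Object[]

# The wrapper is a separate array instance, not the original.
[object]::ReferenceEquals($typed, $wrapped)   # -> False
```

If the specific element type matters, it may be simpler to skip @(...) for values already known to be arrays.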

In addition to the above comments, you might find the PSKoans topics concerning arrays to be a useful resource to familiarise yourself with the behaviours in practice. 🙂

Nice, @vexx32 - that looks like a labor of love.

@smartpcr, let me also attempt to describe the _design rationale_ for the current pipeline behavior (it's fine if no one is listening anymore, I'm also writing this up for myself).


PowerShell is built around _pipelines_: data conduits through which objects stream, one object at a time[1].

PowerShell commands output to the pipeline by default, and any command can write _any number of objects_, including none - and _that number isn't known in advance_, because it can vary depending on arguments and external state.

E.g., Get-ChildItem *.txt can situationally emit none, 1, or multiple objects.

Since the pipeline is just a stream of objects of unspecified count, there is no concept of an _array_ in the pipeline itself, both on input and on output:

  • On input, arrays (and most enumerables) are _enumerated_, i.e. the elements are sent one by one to the pipeline. Therefore, there is no difference between sending a _scalar_ (single object) and sending a _single-element array_ through the pipeline.

  • On output, multiple objects are simply output one at a time. It is possible, though rare, to send an array _as a whole_, but it is then itself just another single output object in the pipeline.

It is only when you _collect_ a pipeline's output that arrays come into play, of necessity:

  • A single output object needs no container, and can just be received as itself.

  • Multiple objects need a container, and PowerShell automatically creates a System.Object[] array to collect the output objects in.


[1] You can introduce _buffering_ of multiple objects with the common -OutBuffer parameter, but the next command in a pipeline still receives the buffered objects one by one.
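
The collection rules above can be illustrated with a sketch. Get-Number and its -HowMany parameter are hypothetical, standing in for any command whose output count varies:

```powershell
# Hypothetical helper that emits a variable number of objects.
function Get-Number {
    param([int] $HowMany)
    for ($i = 1; $i -le $HowMany; $i++) { $i }   # each $i is written to the pipeline
}

$none = Get-Number 0
$one  = Get-Number 1
$many = Get-Number 3

$null -eq $none          # -> True (nothing was output)
$one.GetType().Name      # -> Int32 (received as itself)
$many.GetType().Name     # -> Object[] (collected automatically)

# Rare case: emit an array as a single object via the unary comma operator.
$whole = & { , @(1) }
$whole.GetType().Name    # -> Object[] (the one-element array survived)
```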

This issue has been marked as answered and has not had any activity for 1 day. It has been closed for housekeeping purposes.

I'd like to add that when doing dynamic manipulation it's mind blowing to have to wrap the result of a where clause over an array to account for the cases where it may result in an array (multiple matches) or a single item.

@hrimhari: For collections already in memory / ones that can be, you can use the .Where() array _method_, which _always_ returns a collection; e.g.:

PS> (1..10).Where({ $_ -eq 5 }).GetType().Name
Collection`1

@mklement0 good to know it returns Collection. I actually tried it before posting but I was testing against type [array] and getting false.

Specifically, you'll get an instance of [System.Collections.ObjectModel.Collection[psobject]].

If you want to test more generally for something that is _enumerable_, use -is [System.Collections.IEnumerable].
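
Putting the last few comments together, a short sketch (variable names are illustrative) of how a .Where() result behaves whether or not anything matches:

```powershell
$hit  = (1..10).Where({ $_ -eq 5 })
$miss = (1..10).Where({ $_ -gt 99 })

$hit.Count      # -> 1
$miss.Count     # -> 0 (an empty collection, not $null)

$hit -is [array]                            # -> False
$hit -is [System.Collections.IEnumerable]   # -> True
$hit[0]                                     # -> 5
```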
