PowerShell: We need a new operator '..' to access members of the child items of a collection

Created on 3 Aug 2018 · 29 Comments · Source: PowerShell/PowerShell

#example 1
$files = Get-ChildItem -File
#Let's get the length of every file.
$files..Length
#example 2
$hash = @{
    first = @{
        name = 'one'
        value = 1
    }
    second = 'two'
    'first name' = 'Jobs'
}
#Member IntelliSense and Tab completion would be clearer in the PowerShell ISE editor.
$hash..first
Issue-Discussion WG-Language

All 29 comments

If I understand you correctly, you want a dedicated operator named .. that makes member enumeration explicit and thereby avoids the ambiguity with the current .-based member enumeration, where a property of a given name at the collection level takes precedence over element-level properties (such as .Length in your first example).
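To make the ambiguity concrete, here is a minimal sketch that uses strings instead of files, so it runs anywhere. The variable names are illustrative and not from the original post. Both an array and a string have a Length property, so what . returns depends on whether you happen to hold a collection or a single item:

```powershell
# Illustrative data: the array and each string element all have a .Length property.
$items = 'a', 'bbbb'

# With a multi-element array, '.' finds Length on the array itself.
$items.Length          # 2 (the element count, not the string lengths)

# Indexing into the array targets one element's own property.
$items[1].Length       # 4 (length of 'bbbb')

# With a single (scalar) item, the same expression returns the item's property.
$single = 'bbbb'
$single.Length         # 4

# A property that only the elements have is enumerated across them.
$dates = [datetime]'2020-01-05', [datetime]'2020-01-06'
$dates.Day             # 5, 6
```

The last case is the member enumeration that works as intended; the first case is the naming conflict under discussion.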

I don't quite understand how your 2nd example fits into this, however. Can you clarify by updating the original post, and, while you're at it, use code blocks (```powershell ...) to better format your code?

Having a dedicated operator would potentially allow us to solve 3 existing problems:

  • as stated, it would resolve the ambiguity of .: the new operator signals the explicit intent to operate on the _elements_ of the collection, not the collection itself.

  • it would allow us to provide an alternative to the pipeline-like behavior of . (implicit unwrapping, flattening of arrays) - see #6802 (though perhaps having different behaviors would ultimately be too confusing)

  • it would allow implementing _assignment_ operations too, which are currently not supported for fear of unintended use - see #5271 and, in particular, https://github.com/PowerShell/PowerShell/issues/5271#issuecomment-340285892

How does PowerShell differentiate .. when used as the range operator from .. when used to access the members of a collection?

I'm not arguing against the need to access the collection members' properties; I just don't understand how the two uses of the same operator symbol will work.

@RichardSiddaway:

I was focusing on making the case for such an operator in the abstract, but you're right, the syntactic form needs careful consideration too.

.. is problematic, if only for its potential to be confused with the range operator.

Groovy uses *., for instance, and at first blush that may work, given that we don't allow unquoted property names to start with digits (unless it's an all-digit name, in which quoting or use of an expression could disambiguate - see #6946).

But let's see if there is a fundamental willingness to introduce such an operator.

Sorry, my English is poor!
Yes, that's what I mean, thanks @mklement0.
About the confusion you mentioned: the range operator operates on two variables or literal values, like:

1..10

or (PowerShell 6.0+)

'a'..'z'

or

$from..$to

and mixed combinations of these.

But when we use the new operator as I described, it looks like this:

$files..Length

The second operand is a member name, not a variable, which must start with '$'.
Even in the following case:

function one { 1 }
#3..one is invalid code
#we must write it like this
3..(one)
#or 
3..$(one)

The second example is intended to treat the entries of an IDictionary object as the elements of a collection, just like the first example. It's an additional suggestion that extends the idea.

Thanks for updating the initial post, @nutix56.

Re the 2nd example: I'm still not entirely clear what you expect the output of $hash..first to be: The _key-value pairs_ of the nested hashtable (2 [System.Collections.DictionaryEntry] instances representing the name and value entries each)? Or just their _values_?

As for the syntax: I meant confusing to the _user_, not necessarily whether it _technically_ works (syntactically):

  • In terms of being confusing, perhaps the fact that ranges _can_ start with a variable, but _rarely do_, makes that less of a concern.

  • Even technically, though, there's ambiguity that would have to be resolved with non-obvious rules (and I can't speak to how much it would complicate parsing):

PowerShell supports expressions as property names, so $files..Length could be written as $b='Length'; $files..$b (at the very least, it wouldn't be a good idea to disallow this with .., given that it's supported with .). So with an expression such as $a..$b you would then have to infer from the _values_ of $a and $b whether they're both _numeric_ (-> range operator) or not (-> member enumeration). Note that the range operator is extremely flexible in what it accepts as numeric input, so that ' 5 '..$False is equivalent to 5..0, for instance.
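As a small sketch of the two existing behaviors that would collide (the variable names are illustrative only): . already accepts a variable as the member name, and .. already accepts variables as range endpoints, so the token sequence $a..$b by itself would not reveal which one is meant.

```powershell
$item = 'hello'
$b = 'Length'

# '.' accepts a variable (or expression) as the member name.
$item.$b               # 5, same as $item.Length

# '..' currently always means "range" when placed between two variables.
$from = 1
$to = 3
$from..$to             # 1, 2, 3
```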

For the first example, a workaround is $files | Select-Object Length; I didn't realize listing all the file lengths was so difficult. This seems peculiar, since in a hashtable the keys shadow the base properties, but here an element property is hidden by a base PowerShell property. The workaround for the hashtable is to use the method call get_[propertyname]().

I don't think the second example shows anything. There is no 'first' property that is being shadowed. $hash['first'] is also acceptable. I do think this should be the correct way to retrieve values from the hashtable, however.

Thanks, @msftrncs.

There's a more concise workaround: (Get-ChildItem).ForEach('Length') (and you'd need to add -ExpandProperty to yours).

Interesting that the logic is reversed with hashtables (entry keys shadowing the hashtable's own properties when using dot notation).

Unfortunately, that precludes meaningful application of the potential new operator to hashtables, because . already acts like that operator.

If the logic weren't reversed, that is, if the behavior were consistent with member enumeration, @{ Count = 'Chocula' }.Count would return 1 (it currently returns Chocula), i.e., the hashtable's own property, whereas @{ Count = 'Chocula' }*.Count would then unambiguously target the _entry_, returning Chocula.
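A short sketch of the current hashtable behavior described above, including the get_ workaround mentioned earlier in the thread. The psbase line is an additional workaround I am adding for comparison; it is not something the thread proposes.

```powershell
$h = @{ Count = 'Chocula' }

# Dot notation returns the entry whose key is 'Count'...
$h.Count               # Chocula

# ...so reaching the hashtable's own Count property needs a workaround:
$h.get_Count()         # 1 (explicit call to the property getter)
$h.psbase.Count        # 1 (psbase exposes the underlying .NET object's members)

# Index syntax always targets entries, never the type's own members.
$h['Count']            # Chocula
```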

@mklement0 This is an aside, but WRT assigning to a property on each member of a collection, you can do it with .foreach() as follows:

PS[2] (8) > $l = @{a=1},@{a=2},@{a=3}
PS[2] (9) > $l.foreach("a", 13)
PS[2] (10) > $l

Name                           Value
----                           -----
a                              13
a                              13
a                              13

Now, regarding what operator we might use (since '..' is definitely out for the reasons stated): Groovy uses '*' for spreading (=== "splatting"), whereas PowerShell uses '@'. Consequently, the corresponding operator in PowerShell would be "@.", as in:

 $lengths = $list@.Length

Now in a previous language I designed, I used '->' for a similar purpose (but slightly different semantics)

$lengths = $list->Length

I liked the fact that it was clearly visible in the code. Another possibility is to have a semantic such that if the RHS is a scriptblock, then it gets applied to each element, minifying the foreach scenario even further.

(1..10)->{$_ * 2}

(Early in PowerShell we spent a bunch of time talking to the folks in the languages group in MS Research. One of the researchers was very keen on having a minimal foreach/map/apply/select notation so this would probably make him happy :-)

Thanks, @BrucePay.

Good point about @. being more appropriate.

Thanks for the reminder about .ForEach() also being able to _set_ properties.

Though .ForEach() therefore technically gives us everything we'd need the new operator for, I still think the latter is a useful addition (and it doesn't sound like you disagree, I'm just spelling it out), because:

  • with .ForEach() you need to _quote_ the property name, and you need method syntax.

  • the fact that .ForEach() even supports property extraction by a _string_ as the name and especially _assignment_ is not intuitively obvious and easy to forget - probably also because the use case of passing a single script block, as you would to ForEach-Object and the foreach loop, is so prevalent.

  • @. is much more concise, and concision is important for frequently used core features
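For reference, a minimal sketch of the two .ForEach() overloads being compared here, using made-up objects (the property name Size is purely illustrative):

```powershell
# Illustrative objects; 'Size' is a made-up property name for this sketch.
$list = [pscustomobject]@{ Size = 1 }, [pscustomobject]@{ Size = 2 }

# Property extraction: the property name is passed as a quoted string.
$sizes = $list.ForEach('Size')     # 1, 2

# Assignment: a second argument sets that property on every element.
$list.ForEach('Size', 13)
$list.Size                          # 13, 13
```

Both forms need method syntax and a quoted name, which is the verbosity the bullet points above refer to.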

that if the RHS is a scriptblock
(1..10)->{$_ * 2}

In the case of a script block, however, I feel that .ForEach() is more appropriate (not least because there wouldn't be a . analog for something like (1..10)@.{$_ * 2}):

(1..10).ForEach({$_ * 2}) # I know you can omit the (...), but that seems fraught to me.

Though, as I've proposed, I wish we had .ForEach() in _operator_ form:

(1..10) -foreach {$_ * 2}

@chuanjiao10:

Yes, getting carried away with abstract symbols is generally problematic, and there is an inherent tension between readability and concision.

However, if the . operator (member access) makes sense to you, then @. should make sense to you as a natural extension of . for an inherently collection-focused language such as PowerShell.

The use of symbols is less problematic for frequently used features, because seeing them often makes them second nature.

Here, @. creates an operator whose @ symbol is already established as pertaining to collections, as Bruce states (argument splatting, @(...)), which helps with remembering its purpose.

I don't see the need for this. I find it rare that a property conflicts with a collection property, and for the other cases, you can easily do $collection |% property. What's better about $collection..property?

As an aside: I suggest we use @. in the discussion from now on.

What's better about $collection@.property?

  • No situational ambiguity, especially with property names provided as _variables or expressions_.

  • Better performance, especially compared to ... | % property, but even relative to .ForEach('property')

  • Conceptual clarity: you signal the intent to perform member enumeration as opposed to accessing a property of the collection itself.

  • Concision

All your points except your second one can be made for |% property too.

By that logic, member enumeration via . should never have been implemented - yet it became a very popular feature.

The second point, performance, matters - member enumeration is fast, and you don't want to sacrifice speed just to work around its limitations.

Additionally, re concision: Use of aliases is discouraged in scripts, so your alternative would really be ... | ForEach-Object property.

. already existed to access properties, so there was no additional operator introduced. It's just PowerShell being a bit smarter in certain situations to do what you probably meant to do. It makes sense, because PowerShell is smart enough to wrap and unwrap values and collections in other contexts as well (e.g. when passing string to a string[] parameter, or a single-element array to a string parameter).
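The parameter-binding behavior mentioned here can be sketched as follows. The two helper functions are hypothetical and exist only for this illustration:

```powershell
# Hypothetical helper functions, defined only for this illustration.
function Get-NameCount { param([string[]] $Name) $Name.Count }
function Get-SingleName { param([string] $Name) $Name }

# A scalar string is wrapped into a one-element string[] during binding.
Get-NameCount -Name 'alpha'        # 1

# A single-element array is accepted for a [string] parameter
# and is converted to that element's string value.
Get-SingleName -Name @('alpha')    # alpha
```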

But whenever a new operator is introduced you have to consider that this is one additional thing for people to learn, one more thing that someone might not understand when they read a PowerShell script.

. already existed to access properties, so there was no additional operator introduced.

Yes, but one _should have been_ introduced.

As great a feature as member enumeration was to introduce in principle, conflating it with the regular member accessor, ., resulted in the problematic behavior that @. aims to fix.

Maybe it should just be $Collection%Property or $Collection.%Property 😉

% is worth considering, @Jaykul, for symmetry with the % alias.

However:

  • Just % could break existing code:
PS> $foo = $bar = 1; $foo%$bar  # '%' interpreted as modulo operator
0 
  • .% may be confusing, due to reversing the logic: conceptually, you enumerate first (%), and then you member-access (.)

Therefore, it is the form %. that is a potential alternative to @.

Although that is also technically valid with numbers (it's not very often we do fractional modulo):

$ten = 10; $ten%.2

PowerShell won't allow a .$variable (or .text) on the right-hand side of the modulo operator, so I think it's safe enough...

The operator focuses on:

[Image: QQ screenshot 20191213105900]

But, maybe it's better to use '...'

We can use it as a replacement for the simplest loop statement, like

$lengths = $(foreach($f in Get-ChildItem -File) { $f.Length})
$lengths

=>

$lengths = (Get-ChildItem -File)...Length

'...' means more enumerable elements,

when we use it between a collection object and its element property name, it means enumerate the property on each element of the collection object.

I think it's very informal.

If all you want is a single property, why not use Select-Object?

PS> Get-ChildItem -File | select -ExpandProperty length

If all you want is a single property, why not use Select-Object?

PS> Get-ChildItem -File | select -ExpandProperty length

It's more convenient, isn't it?

To recap the issue succinctly:

Member enumeration with . is:

  • (a) delightfully concise
  • (b) fast

Its only drawback is the inability to _predictably_ target the _elements'_ members rather than the _collection object_'s in case of naming conflicts.

Having a distinct syntax - next to . - that unambiguously targets the _elements_ would solve that problem (and would also allow addressing other . limitations - see above).

The current workarounds fall short:

  • Select-Object -ExpandProperty length satisfies neither (a) nor (b): it is verbose and slow.

  • .ForEach('Length') _somewhat_ satisfies (b) (still slower than member enumeration), but not (a)
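For comparison, here is a sketch of the existing workarounds side by side, again with strings as stand-in data (variable names are illustrative). It only shows that they return the same values; it makes no attempt to measure the speed differences mentioned above.

```powershell
$words = 'ab', 'cde', 'f', 'gh'

# Three existing ways to reach the elements' Length despite the name conflict.
$viaSelect = $words | Select-Object -ExpandProperty Length
$viaCmdlet = $words | ForEach-Object Length
$viaMethod = $words.ForEach('Length')

# Each of the three yields 2, 3, 1, 2,
# whereas plain member access returns the array's own Length:
$words.Length                      # 4
```

To compare timings on your own data, each line could be wrapped in Measure-Command { ... }.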

To my mind, the best candidate for the new syntax, after the discussion above, is %.:

$collection%.Length # Return the *elements'* .Length property values.

I'm not sure we need it (since collections don't have many members), but that syntax seems clean, concise, and unambiguous.

Technically, isn't $collection%.property then the same as $collection.foreach{$_.property}? (but shorter)

@msftrncs: yes, though that is somewhat slower - and more verbose - than the aforementioned $collection.ForEach('Property') syntax (which is still verbose) - the latter syntax also supports _assigning_ to property values more efficiently (pass an extra argument with the value).

Speaking of which: %., with its unambiguous intent, could and should support _assigning_ to properties as well (which with . was deemed too dangerous - see https://github.com/PowerShell/PowerShell/issues/5271#issuecomment-340285892); e.g., you could do something like:

# Assign the - invariably *same* - property value to all elements of the array.
(Get-Item *.txt)%.LastWriteTime = Get-Date

@Jaykul, glad to hear you like the syntax; re the need for it:

  • PowerShell treats collections uniformly in many contexts (which is a good thing), and you may not even know what particular collection type you're dealing with in a given situation (not least because PowerShell itself returns different collection types situationally: [object[]] vs. [System.Collections.ArrayList] vs. [System.Collections.ObjectModel.Collection[PSObject]]).

    • That is, in a given situation you may not even be aware of the potential for name conflicts, and you won't find out until later.
  • As for specific members:

    • The Item method present on most collections is a pitfall when dealing with XML or JSON data with such element / property names, for instance; Length and Count to lesser degrees.

    • ToString is currently _invariably_ invoked on the collection itself (where it is more or less useless), but having a concise, predictable way to call the ToString method on all elements of a collection would be helpful, so that you could do the following (which with just . currently _fails_):

# Predictably call .ToString() on the elements, not the collection itself.
((Get-date), (Get-Date).AddDays(1))%.ToString('dd')

# The above would be the concise equivalent of:
((Get-date), (Get-Date).AddDays(1)).ForEach('ToString', 'dd')
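A runnable version of the current behavior, with fixed dates so the output is predictable (the $failed flag is just an illustrative way to record the error):

```powershell
$dates = [datetime]'2020-01-05', [datetime]'2020-01-06'

# '.' resolves ToString on the array, which has no overload taking a format string.
$failed = $false
try {
    $null = $dates.ToString('dd')
} catch {
    $failed = $true                # the call is bound to the array, so it errors out
}
$failed                            # True

# The .ForEach() form invokes the method on each element instead.
$days = $dates.ForEach('ToString', 'dd')
$days                              # 05, 06
```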

Yes, but I don't find this argument compelling at all:

In a given situation you may not even be aware of the potential for name conflicts, and you won't find out until later.

I mean, that's obviously true. But most of the time, it doesn't matter. It's not that this is non-deterministic and might be different on different computers, so I can just write it and run it.

Adding a _longer_ syntax that's more explicit about what it's doing and therefore _theoretically_ clearer about what will happen doesn't mean that anyone's going to use it, because while it's technically clearer, we can't go back in time and do this unambiguously (i.e. we won't remove the fact that . unrolls child properties as long as they don't conflict with properties on the collection).

So this will be great in those specific but rare situations where we actually run into a duplicate -- but people will have to actually know about it, and be 100% sure they don't need their code to run in PowerShell 7 or earlier.

Potentially this would just add an optional operator that's confusingly similar to several other operators and practically not used in the real world, and is therefore always confusing when people see it.

But most of the time, it doesn't matter.
So this will be great in those specific but rare situations where we actually run into a duplicate

The point is: You shouldn't have to worry about when it _does_ matter.

When you write code, your intent is always clear: you'll know ahead of time whether you're looking for a property of the _collection_ or properties of its _elements_.

Currently, if you're not careful (and remember all member names of the collection you're dealing with), that intent can be thwarted in the latter case, with potentially confusing symptoms.

Having a _direct, unambiguous expression of one's intent_ in the form of %. is therefore a great asset in my book.

Yes, the behavior of . can't be changed anymore, but anyone who understands what %. offers may consider adopting it as a matter of _habit_.

people will have to actually know about it, and be 100% sure they don't need their code to run in PowerShell 7 or earlier.

That applies to any new feature; it's the price of progress.
We have the same problem with the ternary operator, null-coalescing, pipeline chains...

As for the concern about adding another, similar-looking operator: Let's make it easy to look up help for operators: #11339

The problem is that this isn't true:

When you write code, your intent is always clear: you'll know ahead of time whether you're looking for a property of the collection or properties of its elements.

Even _if_ we add this operator, only half of that is possible: you'll be able to specify that you're looking for a property of the elements. That's it. When you're looking at code that uses %. you'll know the author meant a property of the elements. The other half isn't possible -- when you're looking at code, you won't know for sure that they deliberately chose to use . instead of %. because they're looking for a property of the collection...

If we do add this feature, we should also add a script analyzer rule that recommends people use %. whenever the analyzer can tell that they're addressing the elements of the collection. It would be a hard rule to write, because you don't know for sure whether a command returns multiple items or not, which is one of the reasons we got the automatic unrolling in the first place -- so we could just write it without worrying about it 😉

When you're looking at code that uses %. you'll know the author meant a property of the elements.

That's all I ask for. You can't fix the past, but you can make things better in the future.
That is, going forward you'll be able to - and should - express your intent unambiguously, and the reward is predictable results.

we should also add a script analyzer rule that recommends people use %. whenever the analyzer can tell that they're addressing the elements of the collection.

Great, if feasible - but it sounds like it isn't.

one of the reasons we got the automatic unrolling in the first place -- so we could just write it without worrying about it 😉

That would be great if you _didn't_ have to worry about it, but the whole reason for this discussion is that you currently _do_ (situationally, and you may not anticipate when it'll hit you), and %. offers a way so you _no longer have to_.
