Not sure if -UnifyProperties is the correct name for the parameter described below.
Anyway, it would be nice to have an easy and _standard_ way to resolve the (common) issue where properties aren't displayed or picked up by the next cmdlet because the first object in the pipeline doesn't contain all the properties of the following objects.
See e.g. "Not all properties displayed"; I believe a new StackOverflow question with the same cause just came in: "Trying to get all Teams with their owners, members and guest in a CSV using Powershell".
Select-Object -UnifyProperties should behave similarly to the proposed Union-Object function (v0.2.1) described in the "Not all properties displayed" answer. That means it is expected to stall the pipeline in order to collect the property names of all objects in the pipeline and use them as the property selection.
With the proposal in place, the properties in the issues below could be obtained by simply piping the objects to | Select-Object -UnifyProperties:
[pscustomobject]@{name='joe';address='home'}, [pscustomobject]@{phone='1'} | Select-Object -UnifyProperties
name address phone
---- ------- -----
joe  home
             1
$Adapters, $IPAddress | Select-Object -UnifyProperties
$printers, $process | Select-Object -UnifyProperties
-Union seems to imply objects are joined / data is added or shared in some fashion, when in reality you're not really adding data to any objects, just padding out extra properties so the display format is clear.
I do agree it would be helpful for Select-Object to have this kind of functionality, but I don't think the parameter name communicates the intent very well.
Agreed. I am not a native English speaker, and I have no clue what the correct name should be. Any suggestions? -UniteProperties? -AlignObjects?
(I will change the title and content of this issue accordingly)
I'm not sure myself really. -StandardizeProperties would _maybe_ be ok but it feels like there should be a simpler term for it that I can't find at the moment.
I have changed the title/content using -UniteProperties for now, as -Union is definitely wrong and confusing.
It is all about join-object, is it not?
Typically I see Join-Object mentioned where you have two discrete sets of objects that you want to combine together based on a shared property/properties. My understanding of this suggestion (correct me if I'm wrong, please, @iRon7) is that we're mainly concerned with ensuring all the objects have the same property names, rather than merging related objects?
Correct, it is _not_ something like Join-Object; it is what @vexx32 describes: ensuring all the objects have the same property names.
Another option would be to attach this proposed feature to a specific (bogus) wildcard property (e.g. a double asterisk: -Property **):
# Wishful thinking...
[pscustomobject]@{name='joe';address='home'}, [pscustomobject]@{phone='1'} | Select-Object **
name address phone
---- ------- -----
joe  home
             1
@iRon7 your example is exactly what any implementation of join-object does, including my own. So I vote for a new native cmdlet join-object.
Does it? Most Join-Object implementations I've seen would result in a single object. @iRon7's example is still two objects.
Single object is a result of merge. Join just adds object to another one. That is my understanding.
That's not what's happening here, either? Neither object is being added to the other. Just ensuring that all the objects have the same set of property names.
I like the idea in general, but we need to distinguish between whether this is for _display formatting_ or really about _creating objects_ that have the union of all properties across all input objects.
If this is about display formatting, then Select-Object isn't the right cmdlet to extend - Format-Table would be, perhaps with an -AllProperties switch.
Here's a quick prototype:
function Format-TableAllProperties {
  # $propNames keeps first-seen order; $hashSet is only used to detect duplicates.
  [System.Collections.Generic.List[string]] $propNames = @()
  [System.Collections.Generic.HashSet[string]] $hashSet = @()
  # Collect all pipeline input up front.
  $inputCollected = @($input)
  $inputCollected.ForEach({
    foreach ($name in $_.psobject.Properties.Name) {
      if ($hashSet.Add($name)) { $propNames.Add($name) }
    }
  })
  $inputCollected | Format-Table $propNames
}
PS> [pscustomobject] @{ one = 1; two = 2; three = 3 }, [pscustomobject] @{ one = 10; three = 30; four = 4 } |
Format-TableAllProperties
one two three four
--- --- ----- ----
  1   2     3
 10        30    4
@dfinke had to put Update-FirstObjectProperties into ImportExcel for exactly this reason. There must be a stack of places which need it.
Whether adding it to Select-Object is better than having a self-contained command can be argued both ways.
That function could be implemented in a proxy command wrapping Select-Object, but I've grown to expect it to be in its own command, so that's my bias. I'm also thinking Select-Object is one of those widely used cmdlets that people might prefer to leave alone.
So it sounds like _both_ a for-display and an extend-actual-objects solution may be desirable.
(I suspect you're aware of it , @jhoneill, and the function name suggests it, but just to make it explicit: the linked function uses only the _first_ input object as the source for the set of properties whose presence should be ensured on all subsequent ones).
@iRon7, can we get clarity on which one you were looking for - the examples in the OP suggest the former - and perhaps create a _separate_ issue for the respective other, or at least clearly distinguish these use cases.
@mklement0,
for _display formatting_ or really about _creating objects_
It is about creating objects (at least from my point of view). I had something more like this in mind:
function UniteProperties { # Select-Object -UniteProperties
  # $propNames keeps first-seen order; $hashSet is only used to detect duplicates.
  [System.Collections.Generic.List[string]] $propNames = @()
  [System.Collections.Generic.HashSet[string]] $hashSet = @()
  # Collect all pipeline input up front.
  $inputCollected = @($input)
  $inputCollected.ForEach({
    foreach ($name in $_.psobject.Properties.Name) {
      if ($hashSet.Add($name)) { $propNames.Add($name) }
    }
  })
  # Every output object gets the full set of property names.
  $inputCollected | Select-Object $propNames
}
Current situation:
[pscustomobject] @{ one = 1; two = 2; three = 3 },
[pscustomobject] @{ one = 10; three = 30; four = 4 } |
ConvertTo-Csv
"one","two","three"
"1","2","3"
"10",,"30"
Future situation
[pscustomobject] @{ one = 1; two = 2; three = 3 },
[pscustomobject] @{ one = 10; three = 30; four = 4 } |
UniteProperties | ConvertTo-Csv # Select-Object -UniteProperties | ConvertTo-Csv
"one","two","three","four"
"1","2","3",
"10",,"30","4"
(I suspect you're aware of it , @jhoneill, and the function name suggests it, but just to make it explicit: the linked function uses only the _first_ input object as the source for the set of properties whose presence should be ensured on all subsequent ones).
Yes, it should be on all subsequent objects, because if you correct only the first object and then ... | Sort-Object | ..., it might lose some properties again (see the description in the "Not all properties displayed" answer).
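To make the Sort-Object point concrete, here is a minimal sketch. Update-FirstObjectOnly is a made-up helper written only for this illustration (it is not the ImportExcel function); it pads just the first object with the property names found on all objects and passes the rest through unchanged.

```powershell
# Hypothetical helper for illustration: pads ONLY the first object.
function Update-FirstObjectOnly {
    $all = @($input)
    $names = [System.Collections.Generic.List[string]]::new()
    foreach ($obj in $all) {
        foreach ($name in $obj.psobject.Properties.Name) {
            if (-not $names.Contains($name)) { $names.Add($name) }
        }
    }
    # The first object gets the full property set, the others are left as-is.
    $all[0] | Select-Object -Property $names
    $all | Select-Object -Skip 1
}

$fixed = [pscustomobject] @{ one = 1; two = 2; three = 3 },
         [pscustomobject] @{ one = 10; three = 30; four = 4 } |
         Update-FirstObjectOnly

# The header is taken from the first object, so all four columns appear.
$fixed | ConvertTo-Csv

# After sorting, the unpadded object comes first and column 'two' is lost.
$fixed | Sort-Object one -Descending | ConvertTo-Csv
```

Padding every object, as the UniteProperties prototype does, avoids this because it no longer matters which object ends up first.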
Thanks for clarifying, @iRon7.
As for the parameter name, maybe -UnifyProperties is better?
As for a potential separate cmdlet instead: I struggle to even think of a good name, because there is no fitting approved verb that I can see (there's some conceptual similarity to Add-Member, but adding it to that is worse, I think).
I think adding this to Select-Object is a good fit in terms of user expectations, even though the collect-all-input-up-front behavior is a departure, but addressing that through the help should suffice.
However, I don't know if there are implementation challenges, and we would need to decide whether we want to allow combining the new switch with any of the existing sub-selection functionality (-First, -Skip, -Unique, ...)
(I suspect you're aware of it , @jhoneill, and the function name suggests it, but just to make it explicit: the linked function uses only the _first_ input object as the source for the set of properties whose presence should be ensured on all subsequent ones).
There are two situations, one is where the first object determines what will be displayed / exported and the other objects can safely be left with a subset of the fields.
Someone wedded to strict mode might hit problems with properties missing, so I wouldn't rule out ensuring presence, but
It is about _creating objects_ (at least from my view), I had more something in mind like:
function UniteProperties { # Select-Object -UniteProperties
  [System.Collections.Generic.List[string]] $propNames = @()
  [System.Collections.Generic.HashSet[string]] $hashSet = @()
  $inputCollected = @($input)
  $inputCollected.ForEach({
    foreach ($name in $_.psobject.Properties.Name) {
      if ($hashSet.Add($name)) { $propNames.Add($name) }
    }
  })
  $inputCollected | Select-Object $propNames
}
Or more simply, and more powershell-styled.
function UniteProperties {
    $hash = @{}
    $i = @($input)
    foreach ($obj in $i) { foreach ($p in $obj.psobject.properties) { $hash[$p.name] = $true } }
    $i | Select-Object ($hash.keys | ForEach-Object tostring)
}
And test with
$y [pscustomobject] @{ one = 1; two = 2; three = 3 },
[pscustomobject] @{ one = 10; three = 30; four = 4 } |
UniteProperties
$y | convertto-csv
$y[1].four.gettype()
gm -in $y[0]
$y[0].four.gettype()
Have a look at what the last three lines do. That's part of what I was trying to explain to @mklement0 and probably didn't make sense. That may be what you want ...
@jhoneill,
I guess you meant: $y = [pscustomobject] ...
In that case, all the examples behave as I would expect:
$y | convertto-csv
All objects are converted to CSV format, where every property value (regardless of its type) is wrapped in double quotes, except for $Null, which is left empty:
"one","two","three","four"
"1","2","3",
"10",,"30","4"
$y[1].four.gettype()
The property of the second object ([1]) is set to an integer (four = 4):
IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     Int32                                    System.ValueType
gm -in $y[0]
I am not familiar with this syntax, but assume it is similar to $y[0] | gm.
The properties one, two, and three are set to integers (as above) and the property four is null (Get-Member shows a general object type in the Definition column, which is unrelated to this proposal):
Name        MemberType   Definition
----        ----------   ----------
Equals      Method       bool Equals(System.Object obj)
GetHashCode Method       int GetHashCode()
GetType     Method       type GetType()
ToString    Method       string ToString()
four        NoteProperty object four=null
one         NoteProperty int one=1
three       NoteProperty int three=3
two         NoteProperty int two=2
$y[0].four.gettype()
This results in an error because the value is not supplied ($Null), similar to $Null.GetType():
InvalidOperation: You cannot call a method on a null-valued expression.
The result will be exactly the same if you manually define the properties for Select-Object:
$y = [pscustomobject] @{ one = 1; two = 2; three = 3 },
[pscustomobject] @{ one = 10; three = 30; four = 4 } |
UniteProperties | Select-Object one, two, three, four
And still similar to just:
$z = [pscustomobject] @{ one = 1; two = 2; three = 3 },
[pscustomobject] @{ one = 10; three = 30; four = 4 }
Where the _expected_ difference is:
$y[0] | gm                                                     | $z[0] | gm
                                                               |
TypeName: Selected.System.Management.Automation.PSCustomObject | TypeName: System.Management.Automation.PSCustomObject
                                                               |
Name        MemberType   Definition                            | Name        MemberType   Definition
----        ----------   ----------                            | ----        ----------   ----------
Equals      Method       bool Equals(System.Object obj)        | Equals      Method       bool Equals(System.Object obj)
GetHashCode Method       int GetHashCode()                     | GetHashCode Method       int GetHashCode()
GetType     Method       type GetType()                        | GetType     Method       type GetType()
ToString    Method       string ToString()                     | ToString    Method       string ToString()
four        NoteProperty object four=null                      | one         NoteProperty int one=1
one         NoteProperty int one=1                             | three       NoteProperty int three=3
three       NoteProperty int three=3                           | two         NoteProperty int two=2
two         NoteProperty int two=2                             |
In the current situation ($z[0]), the property four is missing, whereas with ... | Select-Object -UnifyProperties ($y[0]), the property four is set to $Null.
By default the result will be the same: both $z[0].four and $y[0].four eventually result in $Null.
There will be a difference with Set-StrictMode -Version latest, where $y[0].four will conveniently return $Null and $z[0].four will return an error:
PropertyNotFoundException: The property 'four' cannot be found on this object. Verify that the property exists.
This is yet another reason to apply the unification to the properties of all objects and not just to the first one.
$y = [pscustomobject] @{ one = 1; two = 2; three = 3 },
[pscustomobject] @{ one = 4; two = 5; three = 6 },
[pscustomobject] @{ one = 10; three = 30; four = 4 } |
UniteProperties | ConvertTo-Csv
(It would be inconsistent if $y[0].four and $y[1].four behaved differently in certain StrictModes)
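The strict-mode difference described above can be reproduced with a short sketch. It assumes that spelling out the property list by hand for Select-Object stands in for the proposed switch, which does not exist yet.

```powershell
Set-StrictMode -Version Latest

$z = [pscustomobject] @{ one = 1; two = 2; three = 3 },
     [pscustomobject] @{ one = 10; three = 30; four = 4 }

# Stand-in for the proposed switch: the full property list is given manually.
$y = $z | Select-Object one, two, three, four

# Padded object: the property exists and holds $null.
$null -eq $y[0].four        # True

# Original object: the property does not exist, so strict mode throws.
try   { $z[0].four }
catch { "Error: $($_.Exception.Message)" }

Set-StrictMode -Off
```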
@mklement0,
As for the parameter name, maybe -UnifyProperties is better?
I am fine with this, although I did have some more thoughts about it: rather than choosing a _verb_ for a parameter, rely on the cmdlet's verb, which results in something like this: ... | Select -EveryProperty or simply ... | Select -AllProperties
This also shows why I think it should be a feature of Select-Object. Besides, Select-Object already produces similar output if you define the properties yourself, like ... | Select-Object one, two, three, four (this is also supported by the fact that the prototype already uses the Select-Object cmdlet); it just needs to automatically figure out which properties are used.
even though the collect-all-input-up-front behavior is a departure
Some of the existing parameters of the Select-Object cmdlet already do this, like -Last <int> (e.g. -Last 1000 for fewer objects) and (unexpectedly, see #11221) -Unique.
@iRon7:
My concern about -AllProperties and -EveryProperty is that they sound like Select-Object * and don't express the aspect of _unifying_ (making _uniform_) the set of properties across all objects. In general, it is not uncommon for parameter names to use verbs (-Wait, -Skip, -Force, ...), if that is your concern.
And to be clear: I'm fine with adding this to Select-Object, and the collect-all-input-first behavior is just a matter of documenting it properly.
If -Last really currently collects all input up front, it amounts to an inefficient implementation that should be changed in favor of a queue of the specified length, so that only the most recent N input objects are retained on an ongoing basis.
Similarly, -Unique only needs to retain the _unique_ input objects - though, _depending on the input_, that may be all of them.
To put it differently: cmdlets that for conceptual reasons must defer pipeline output until their end block is processed may need to _look_ at all input objects first, but they don't necessarily have to _collect_ them _all_, whereas the feature we're discussing here does.
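For what it's worth, the "queue of the specified length" idea could look roughly like the sketch below. Select-LastN is a hypothetical function name used only for illustration; this is not how Select-Object -Last is actually implemented.

```powershell
# Hypothetical function: keeps at most $Count objects in memory at any time.
function Select-LastN {
    param(
        [Parameter(Mandatory)] [int] $Count,
        [Parameter(ValueFromPipeline)] $InputObject
    )
    begin { $queue = [System.Collections.Generic.Queue[object]]::new() }
    process {
        $queue.Enqueue($InputObject)
        # Drop the oldest object once the queue exceeds the requested length.
        if ($queue.Count -gt $Count) { $null = $queue.Dequeue() }
    }
    # Output is still deferred until all input has been seen.
    end { $queue }
}

1..1000 | Select-LastN -Count 3   # 998, 999, 1000
```

The function still has to _look_ at every input object before it can emit anything, but it never holds more than $Count of them, which is the distinction drawn above.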