In many contexts in PowerShell, custom objects and hashtables can conveniently be used interchangeably, such as in JSON serialization (ConvertTo-Json).
However, Export-Csv and ConvertTo-Csv currently do not support dictionaries ((ordered) hashtables, IDictionary instances) meaningfully: they serialize the dictionary _itself_.
Making these cmdlets serialize the key-value pairs, analogous to property-name-value pairs in [pscustomobject] input, would be helpful.
# OK - custom object input
[pscustomobject] @{ prop=1 } | ConvertTo-Csv | Should -Be '"prop"', '"1"'
# Currently unsupported: hashtable input
@{ prop=1 } | ConvertTo-Csv | Should -Be '"prop"', '"1"'
The latter test fails, indicating the currently useless serialization of hashtables:
Expected @('"prop"', '"1"'), but got
@('"IsReadOnly","IsFixedSize","IsSynchronized","Keys","Values","SyncRoot","Count"',
'"False","False","False","System.Collections.Hashtable+KeyCollection","System.Collections.Hashtable+ValueCollection","System.Collections.Hashtable","1"'
Make Export-Csv and ConvertTo-Csv detect IDictionary input and serialize its key-value pairs instead of the dictionary object itself.
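Until the cmdlets handle dictionaries natively, the proposed behavior can be approximated by converting each dictionary to a custom object before it reaches ConvertTo-Csv. The following is a minimal sketch of that workaround, not the cmdlets' implementation; the function name ConvertTo-PropertyBag is hypothetical.

```powershell
# Hypothetical helper (not a built-in cmdlet): turns IDictionary input into
# [pscustomobject] instances and passes everything else through unchanged.
function ConvertTo-PropertyBag {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [object] $InputObject
    )
    process {
        if ($InputObject -is [System.Collections.IDictionary]) {
            # Copy the entries into an ordered dictionary so that each key
            # becomes a property name (keys are converted to strings).
            $props = [ordered] @{}
            foreach ($entry in $InputObject.GetEnumerator()) {
                $props[[string] $entry.Key] = $entry.Value
            }
            [pscustomobject] $props
        }
        else {
            $InputObject
        }
    }
}

# -NoTypeInformation keeps the output identical in Windows PowerShell and PowerShell Core.
$csv = @{ prop = 1 } | ConvertTo-PropertyBag | ConvertTo-Csv -NoTypeInformation
$csv   # '"prop"', '"1"'
```

Because the conversion happens per input object, the helper streams and works equally for a single hashtable or a pipeline of them.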
Related #8855 (can we move it to the issue too?)
We need to add a new switch to avoid a breaking change.
Also there is a question about Collection and related interfaces.
Current behaviour for ConvertTo-Csv results in data that is essentially useless:
PS> $data = 1..10 | % { @{ Number = $_ } }
PS> $data
Name Value
---- -----
Number 1
Number 2
Number 3
Number 4
Number 5
Number 6
Number 7
Number 8
Number 9
Number 10
PS> $data | convertto-csv
"IsReadOnly","IsFixedSize","IsSynchronized","Keys","Values","SyncRoot","Count"
"False","False","False","System.Collections.Hashtable+KeyCollection","System.Collections.Hashtable+ValueCollection","System.Collections.Hashtable","1"
"False","False","False","System.Collections.Hashtable+KeyCollection","System.Collections.Hashtable+ValueCollection","System.Collections.Hashtable","1"
"False","False","False","System.Collections.Hashtable+KeyCollection","System.Collections.Hashtable+ValueCollection","System.Collections.Hashtable","1"
"False","False","False","System.Collections.Hashtable+KeyCollection","System.Collections.Hashtable+ValueCollection","System.Collections.Hashtable","1"
"False","False","False","System.Collections.Hashtable+KeyCollection","System.Collections.Hashtable+ValueCollection","System.Collections.Hashtable","1"
"False","False","False","System.Collections.Hashtable+KeyCollection","System.Collections.Hashtable+ValueCollection","System.Collections.Hashtable","1"
"False","False","False","System.Collections.Hashtable+KeyCollection","System.Collections.Hashtable+ValueCollection","System.Collections.Hashtable","1"
"False","False","False","System.Collections.Hashtable+KeyCollection","System.Collections.Hashtable+ValueCollection","System.Collections.Hashtable","1"
"False","False","False","System.Collections.Hashtable+KeyCollection","System.Collections.Hashtable+ValueCollection","System.Collections.Hashtable","1"
"False","False","False","System.Collections.Hashtable+KeyCollection","System.Collections.Hashtable+ValueCollection","System.Collections.Hashtable","1"
Is there any reason someone would rely on this behaviour?
@vexx32 I think the current design reflects PowerShell's generalized psobject serialization/deserialization, as opposed to the data serialization/deserialization that we are discussing in this issue.
@iSazonov:
If there's a way to support this as part of a more generalized feature that may even improve performance for the [pscustomobject] case, then all the better.
I haven't looked at IDataView yet, and I don't know how realistic near-term use in PowerShell is - by contrast, enabling support for IDictionary specifically seems like a pretty quick enhancement to make (that could still benefit from later under-the-hood optimizations, as long as the behavior doesn't change).
Either way, I agree with @vexx32 that there's no backward-compatibility concern here and therefore no need for a new switch.
I think we should also consider adding a switch to ConvertFrom-Csv and Import-Csv in order to import the data -AsHashtable for symmetry.
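Pending such a switch, the import direction can be approximated by copying each imported object's properties into an ordered dictionary. This is only a sketch of the idea: the proposed -AsHashtable switch is not used here, and the variable names are illustrative.

```powershell
# Sample CSV text standing in for a file read by Import-Csv.
$csvLines = '"one","two"', '"1","2"', '"3","4"'

# Convert each row object into an ordered dictionary keyed by column name.
# @( ) ensures an array even if the CSV has only one data row.
$rows = @(
    $csvLines | ConvertFrom-Csv | ForEach-Object {
        $table = [ordered] @{}
        foreach ($property in $_.psobject.Properties) {
            $table[$property.Name] = $property.Value
        }
        $table   # dictionaries are not enumerated by the pipeline
    }
)

$rows[0]['one']   # CSV values are always imported as strings
```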
Either way, I agree with @vexx32 that there's no backward-compatibility concern here and therefore no need for a new switch.
It is not clear whether we want to change the output:
Expected @('"prop"', '"1"'), but got
@('"IsReadOnly","IsFixedSize","IsSynchronized","Keys","Values","SyncRoot","Count"',
'"False","False","False","System.Collections.Hashtable+KeyCollection","System.Collections.Hashtable+ValueCollection","System.Collections.Hashtable","1"'
Great idea, @vexx32 - see #11027.
@iSazonov:
To me, it's quite obvious that no one would rely on this output - the _only_ piece of information remotely of interest in this output that is _specific to the input object_ is the _entry count_ (column "Count") - and for that you obviously don't need CSV output.
I would agree that the current output is not useful and thus very unlikely to break someone.
If we want to address more advanced scenarios, we could design new Import/Export-TabularData cmdlets where "Data" better reflects the focus on data processing.
I don't see a need to divide features into a new cmdlet at the moment. We're not spinning up impromptu SQL servers to process data, we're just adding a sensible input/output type. The addition isn't particularly significant, in my opinion, and shouldn't warrant additional commands.
What is the alternative for serializing/deserializing PowerShell objects?
@iSazonov, what do you mean?
Also I still don't understand (see my comment above) why we only consider IDictionary if there are IList, IEnumerable, ICollection.
We're definitely _not_ talking about a general serialization feature here.
(The latter is what Export-Clixml is for, and enhancing that to support type-faithful deserialization for more than the handful of currently supported well-known types would be great, but also sounds challenging; #10916, if I understand it correctly, actually proposes something different, which sounds even more challenging: it is not asking for type-faithful deserialization - categorical support for which is fundamentally impossible - but for proxy methods that call back to the remoting endpoint.)
why we only consider IDictionary if there are IList, IEnumerable, ICollection.
We're considering supporting IDictionary as a collection _element_ type, not as a _collection_ type - in the same way that ConvertTo-Json already does.
That is, the proposal is not only to support collections (enumerables, lists) _whose elements are_ [pscustomobject] instances, but also those _whose elements_ are IDictionary instances.
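The existing ConvertTo-Json behavior referred to here can be checked with a short comparison. Single-key hashtables are used so that hashtable key ordering does not affect the result, and the variable names are illustrative.

```powershell
# A custom object and a hashtable with the same content serialize identically.
$fromObject    = [pscustomobject] @{ prop = 1 } | ConvertTo-Json -Compress
$fromHashtable = @{ prop = 1 } | ConvertTo-Json -Compress
# Both variables contain: {"prop":1}

# The same holds for dictionaries as *elements* of a collection.
$fromCollection = @{ one = 1 }, @{ one = 1 } | ConvertTo-Json -Compress
# $fromCollection contains: [{"one":1},{"one":1}]
```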
[pscustomobject] instances are primarily "property bags", and IDictionary instances (at least with string-typed keys) are conceptually related and, in practice, are sometimes used interchangeably - each type has its pros and cons, but, fundamentally, they are both a (possibly ordered) collection of key-value pairs.
To give a concrete example: With this proposal implemented, the following two commands will yield the same result:
# Collection of *custom objects*
[pscustomobject] @{ one = 1; two = 2 }, [pscustomobject] @{ one = 1; two = 2 } | ConvertTo-Csv
# Conceptually equivalent collection of *hash tables*
@{ one = 1; two = 2 }, @{ one = 1; two = 2 } | ConvertTo-Csv
That is, both commands would output:
"one","two"
"1","2"
"1","2"
That is, the proposal is not only to support collections (enumerables, lists) whose elements are [pscustomobject] instances, but also those whose elements are IDictionary instances.
:-) Thanks for my education. I see your point.
My concern was about the following scenario:
Get-Date | Export-Csv c:\tmp\q.txt -IncludeTypeInformation
$a=Import-Csv C:\tmp\q.txt
$a.psobject
$a
While Export/Import-CliXml is universal, Export/Import-Csv gives great UX and better performance for the special, tabular case, and I wouldn't want to lose this. Sorry that I was not accurate enough. I mistakenly thought that IncludeTypeInformation was on by default; it was in Windows PowerShell, but in Core it was changed (by me?! :upside_down_face:)
I would like to work on this one if it's available.
@ivanshen apologies, I forgot to make a note here; I submitted #11029 to add this functionality already. 🙂