Follow-up from #7715; related to #7713, #7537, and #5797.
A pattern is emerging for asking cmdlets to output "bare" objects, which means output objects that:
are _not_ decorated with NoteProperty members, the way lines read from a file with Get-Content are, for instance.
are _not_ wrapped in instances of a _helper_ type, the way that Select-String or Compare-Object output is, for instance.
There are three, not mutually exclusive motivations for requesting such "bare" output:
NoteProperty members that subsequent processing may act on in an undesired fashion (see #5797)
There are probably more existing cmdlets that could benefit from the pattern, as would future ones.
In line with PowerShell's commitment to consistency, a _common (shared) parameter name_ should be used in all these cases.
-Bare makes the most sense to me.
To avoid confusion with -Raw as implemented in Get-Content - which simply reads the whole file while still decorating the resulting string - #7715 proposed deprecating -Raw in favor of a more descriptive name such as -Whole. (Note that by _deprecating_ I don't mean to imply _removing_ support for -Raw, just documenting -Whole first and mentioning -Raw as a legacy name).
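To make the decoration concrete, here is a minimal sketch. It assumes Get-Content's current behavior of attaching provider NoteProperty members such as PSPath; the temp file and variable names are purely illustrative, and [System.IO.File]::ReadAllLines serves only as an undecorated baseline for comparison:

```powershell
# Illustrative temp file; none of these names come from the proposal itself.
$tmp = (New-TemporaryFile).FullName
Set-Content -LiteralPath $tmp -Value 'line 1', 'line 2'

$line  = (Get-Content -LiteralPath $tmp)[0]        # first line, read line by line
$whole = Get-Content -LiteralPath $tmp -Raw         # whole file as a single string
$plain = [System.IO.File]::ReadAllLines($tmp)[0]    # .NET API, no PowerShell decoration

# Collect the names of the NoteProperty members attached to each string.
$lineProps  = @($line.psobject.Properties  | Where-Object MemberType -eq 'NoteProperty' | ForEach-Object Name)
$wholeProps = @($whole.psobject.Properties | Where-Object MemberType -eq 'NoteProperty' | ForEach-Object Name)
$plainProps = @($plain.psobject.Properties | Where-Object MemberType -eq 'NoteProperty' | ForEach-Object Name)

$lineProps    # includes PSPath and ReadCount, among others
$wholeProps   # the -Raw output is decorated too
$plainProps   # nothing: the plain .NET string carries no NoteProperty members

Remove-Item -LiteralPath $tmp
```

The point of the comparison is that -Raw changes how the output is partitioned, not whether it is decorated.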
Written as of:
PowerShell Core 6.1.0
@mklement0 Could you please list all affected cmdlets (from the repo) in the description?
@iSazonov: Do you mean additional cmdlets that _could_ benefit from this pattern?
Yes, it will help PowerShell Committee to review and approve if they see the full list.
@mklement0 @iSazonov We've already reviewed this issue as part of #7715. The committee unanimously agreed that we are going to continue to use the existing pattern -Raw as the parameter to indicate that an object should be returned unadorned. (What exactly that means is up to the cmdlet author.) Adding a new, largely indistinguishable parameter is undesirable as it will add confusion while providing no tangible benefit.
@BrucePay: #7715 was focused on _paving the way_ for the pattern proposed here - and the feedback suggested that the larger pattern and benefit perhaps wasn't fully considered in the decision.
Hence, this issue was opened, which focuses on the bigger picture.
Adding a new, largely indistinguishable parameter
The pattern proposed here is clearly distinguishable in intent from what the - unfortunately named - -Raw currently does in the context of Get-Content, as described.
With the proposed deprecation (without removal) of -Raw, the distinction problem goes away (not for legacy code, but going forward).
while providing no tangible benefit.
The benefit is the pattern described in the initial post, for which we have concrete uses already.
I'm sure there are more, and, once the pattern is established, future cmdlets can take advantage too.
And, just to give a dying horse another whack:
the existing pattern -Raw as the parameter to indicate that an object should be returned unadorned.
Get-Content's -Raw doesn't return anything _unadorned_ (undecorated), it just _changes the output partitioning_ and _still adorns_.
In a _very loose_ sense you can consider that reading the input "raw", but this loose sense:
gets in the way of a _meaningful, reusable pattern_, given that we now want Get-Content to output _truly undecorated_ _lines_ as well (#7537; i.e., without changing the output partitioning) - and I think retaining the existing -Raw while introducing something like -RawLines isn't a great solution in and of itself, let alone that it impedes the establishment of a well-defined general pattern.
is in itself poorly descriptive of the very specific action performed by Get-Content -Raw (hence the suggested alias -Whole).
OK, one more for the road:
Note: The premise is that there is value in establishing a general pattern using a shared parameter name and in applying it to #7713 and perhaps #5797, among others.
What exactly that [-Raw] means is up to the cmdlet author.
It _is_ an option to live with the loose definition of -Raw, but:
we'd then have to live with the -Raw -RawLines confusion within Get-Content, and the general confusion over the distinctly different Get-Content -Raw behavior.
-Raw, especially in the context of file I/O has a distinct connotation of raw _bytes_, which is inapplicable.
By contrast:
-Bare better connotes "lack of decoration".
With Get-Content's -Raw deprecated, the specific function it performs can be given a more descriptive name, such as -Whole (and -Whole in itself could become a standardized name for read-everything-at-once, but at this time I'm unaware of other cmdlets that could use it).
@mklement0
and the feedback suggested that the larger pattern and benefit perhaps wasn't fully considered in the decision.
It was. Sorry if that was unclear.
Thanks, @BrucePay, but further clarification is needed:
To quote @SteveL-MSFT's summary:
The current use of -Raw is acceptable and therefore no reason to make the proposed change. We would support a proposal to add a type parameter for streaming line-by-line w/o annotations although we did not come to agreement on the naming. -Bare is not different enough from -Raw to communicate functional differences.
This tells me that the decision was entirely focused on rejecting the proposal to deprecate -Raw _as currently used with Get-Content_, and what to name the new parameter that fits the pattern described in this issue _in the context of Get-Content_.
Aside from my obvious preference for this deprecation (without removal - I'll stop saying that now, consider it implied; "deemphasizing" is just too clunky), _this_ proposal's gist is to _introduce a general parameter pattern_ with _a_ shared name - while my naming preference is clearly -Bare, that is just _one_ suggestion.
Committing to this pattern means the newly agreed-upon name should be used in #7713, #7537, and perhaps #5797, as well as going forward.
By your own reasoning,
"Raw" in this context means, "undecorated", "not cooked", etc.
In PowerShell we try to choose a single term and apply it consistently so that, even if it does not seem intuitive to a person, they only have to learn it once. Sometimes we choose a sub-optimal term but we live with it because you should only have to learn something once.
[...] use the existing pattern -Raw as the parameter to indicate that an object should be returned unadorned.
In other words, this issue:
just asks more formally for defining and establishing the use of a -Raw-like parameter pattern.
and to you -Raw is an acceptable name for this general pattern, because what Get-Content -Raw currently does fits in as well, correct?
If so, and everyone's happy with this -Raw deal (if you will), then that leaves just one, incidental question:
What to name the new _streaming line-by-line w/o annotations_ parameter for Get-Content, given that -Raw is already taken there.
Given the -Raw deal, the name -RawLines, which you yourself have pondered, sounds just fine to me.
Y'know, thinking on that a bit, @mklement0, although it would be a more significant change... I would consider simply replacing the -Raw parameter with your proposed -RawLines parameter.
There's no need for an extra parameter; as has been noted in the associated issues at least once, -ReadCount can already be used to read the file in one go. Additionally, use of that is backwards compatible.
So while old code would need to be updated to work properly with the new version (thus a breaking change, I suppose), new code would be relatively backwards compatible, if the read-all-at-once was all that it was being used for. And if the undecorated output was required, it would be restricted to the newer versions.
@vexx32:
I think _removing_ -Raw would be too drastic a change (and with my proposal it wouldn't go away, we'd just tell people to use its new alias, -Whole, from now on).
Actually, what -Raw does (read the entire file as a _single string_) can _not_ be done with -ReadCount: the latter is a _chunking_ mechanism that is still array-based; -ReadCount 0 reads all lines at once, but puts them into an _array_.
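A small sketch of that difference, using an illustrative temp file (the variable names are my own):

```powershell
$tmp = (New-TemporaryFile).FullName
Set-Content -LiteralPath $tmp -Value 'a', 'b', 'c'

$viaReadCount = Get-Content -LiteralPath $tmp -ReadCount 0   # all lines at once, as an array
$viaRaw       = Get-Content -LiteralPath $tmp -Raw           # one multi-line string

$viaReadCount -is [array]    # True
$viaReadCount.Count          # 3
$viaRaw -is [string]         # True

Remove-Item -LiteralPath $tmp
```

So -ReadCount 0 still hands you individual lines (inside one array), whereas -Raw hands you the file's text as one string, newlines included.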
There's no _functional_ problem with naming the new parameter for reading lines _undecorated_ (but still line-by-line) -RawLines, but it will cause confusion, because what "raw" means in -Raw vs. -RawLines is then quite distinct.
As an aside:
Somewhat ironically, the only parameter that ever deserved to be called -Raw - the undocumented Format-Hex -Raw, which asked for a raw byte representation in certain contexts - is now obsolete.
With my proposal, there would be no more (non-obsolete, non-deprecated) -Raw parameter - _for now_.
That said, there's no reason not to revive it _where appropriate_, given that, despite -Raw and -Bare having substantial semantic overlap, the following distinction can be useful:
-Raw ... ask for _uninterpreted_ data (typically, _raw bytes_)
-Bare ... ask for _undecorated_ data (without NoteProperties and, in a wider sense, not wrapped in a helper type that provides metadata)
There, I feel better now, although you could ask: hasn't that poor equine suffered enough?
If this proposal or some variant of it were to go ahead then I would suggest this as a general pattern:
[-OutputMode {Raw | RawLines | ...}]
It would make it clear at a glance to the casual user that there are options to affect the output of the cmdlet and that there is a choice between a set of distinct modes.
If I came across a future Get-Content cmdlet that had -Bare or -RawLines in the parameters then I would have to read further on into the documentation before I realised they merely configured the output. Also, I don't think it would be obvious that they were mutually exclusive until stated.
EDIT: OK, I see now that -Raw and -Bare can be combined in your last post.
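For what it's worth, the -OutputMode idea could be sketched roughly as follows. Get-DemoContent, the mode name Default, and the use of [System.IO.File]::ReadAllLines to obtain undecorated lines are all hypothetical choices made for illustration; nothing like this exists in Get-Content today:

```powershell
# Hypothetical wrapper; not an existing cmdlet or parameter.
function Get-DemoContent {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string] $LiteralPath,

        # ValidateSet makes the mutually exclusive modes discoverable via tab completion.
        [ValidateSet('Default', 'RawLines', 'Raw')]
        [string] $OutputMode = 'Default'
    )

    switch ($OutputMode) {
        'Default'  { Get-Content -LiteralPath $LiteralPath }        # decorated lines
        'RawLines' {
            # Undecorated lines; .NET needs a full path, hence Convert-Path.
            [System.IO.File]::ReadAllLines((Convert-Path -LiteralPath $LiteralPath))
        }
        'Raw'      { Get-Content -LiteralPath $LiteralPath -Raw }   # whole file, one string
    }
}

# Usage with an illustrative temp file.
$tmp = (New-TemporaryFile).FullName
Set-Content -LiteralPath $tmp -Value 'line 1', 'line 2'
$rawLines = Get-DemoContent -LiteralPath $tmp -OutputMode RawLines
$raw      = Get-DemoContent -LiteralPath $tmp -OutputMode Raw
Remove-Item -LiteralPath $tmp
```

A single enumerated parameter cannot be combined with itself, so the mutual exclusivity is apparent from the syntax diagram alone.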
@dgc:
While -Raw and -Bare could be separate switches with distinct meanings, they'd be mutually exclusive.
Note that there's no need to distinguish between Raw and RawLines, because the aspect of _partitioning the output_ has nothing to do with the semantics of the -Bare switch proposed here (_undecorated/non-wrapped objects)_ - and is worth keeping separate.
The partitioning aspect, which is not common, is covered in Get-Content as follows:
The unfortunately named -Raw (get the _whole file_, as a _single string_). As stated, if Get-Content's -Raw were aliased to, say, -Whole, any confusion would go away.
-ReadCount (get groups of _lines_ as _arrays_) has more of a potential to be generically useful _under that name_ (or at least a _different_ one), namely for partitioning the stream of input objects into arrays of fixed size - see #8270
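As a quick illustration of the chunking that -ReadCount performs (temp file and variable names are illustrative only):

```powershell
$tmp = (New-TemporaryFile).FullName
Set-Content -LiteralPath $tmp -Value 'a', 'b', 'c', 'd'

# Each pipeline object is an array of up to 2 lines rather than a single line.
$chunkSizes = Get-Content -LiteralPath $tmp -ReadCount 2 | ForEach-Object { $_.Count }
$chunkSizes   # 2, 2

Remove-Item -LiteralPath $tmp
```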
@dgc:
To reframe my previous comment in light of the decision that -Raw will be retained as the general switch name for requesting non-decorated/non-wrapped output:
The need for the -Raw / -RawLines distinction should not arise outside of Get-Content, where it only arises in order to preserve backward compatibility.
I went exploring and, maybe irrelevant but for the record, found a few more things which, if you squint a bit, fit this issue's described pattern of asking a cmdlet to return a more basic output, or switch to another commonly desired output, or do less work for improved performance. Not all related to "undecorated" output, exactly:
Get-Date returns a [DateTime], and -Format 'yyyy-MM-dd' returns [string].
Test-Connection returns [TestConnectionCommand+PingReport], and -Quiet returns [bool].
ConvertTo-Xml returns [XmlDocument], and -As String makes it return a [string].
Get-Variable returns [PSVariable] but with -ValueOnly it returns just the variable value.
Get-CimInstance has -KeyOnly and -Shallow to ask it to get less information (KeyOnly is documented as "[returns key parameters only ..] reduces the amount of data transferred over the network." - presumably that's a performance reason).
Get-ComputerInfo -Property BiosCaption asks it to return less information, just like feeding it through | select BiosCaption would .. but apparently still makes you wait as long as it takes to get all the information, so it doesn't seem to be a performance reason. Help says it "Specifies, as a string array, the computer properties in which this cmdlet displays." as if it's intended to be an output display option for a user.
Measure-Object returns [GenericMeasureInfo] but with -Line it switches to [TextMeasureInfo] (to try and behave like the wc utility?).
and existing in Windows PowerShell, but not PS 6.1:
Get-Clipboard has a -Raw parameter which ignores newlines.
Get-EventLog has -AsBaseObject which "Indicates that this cmdlet returns a standard System.Diagnostics.EventLogEntry object for each event. Without this parameter, Get-EventLog returns an extended PSObject object with additional EventLogName, Source, and InstanceId properties."
Test-Connection returns Win32_PingStatus, and -Quiet returns [bool].
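A few of the cross-platform cases above can be checked with a short sketch like this (variable names are illustrative):

```powershell
$asDate   = Get-Date                          # [datetime]
$asString = Get-Date -Format 'yyyy-MM-dd'     # [string]

$demo = 42
$asVariable = Get-Variable -Name demo               # [psvariable] describing $demo
$asValue    = Get-Variable -Name demo -ValueOnly    # just the value

$asXmlDoc  = 'x' | ConvertTo-Xml                # XML document object
$asXmlText = 'x' | ConvertTo-Xml -As String     # [string]

# Show the output type each parameter combination produces.
$asDate, $asString, $asVariable, $asValue, $asXmlDoc, $asXmlText |
    ForEach-Object { $_.GetType().FullName }
```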
Currently we don't use Win32_PingStatus - please update your message.
@iSazonov updated.
Thanks, @HumanEquivalentUnit.
The pattern I see in your examples is to ask for an _alternative_ output data type or only _part of_ the usual output objects.
With respect to an alternative data type, it is what the -As<type> switches in _some_ of the examples do.
You could argue that -ValueOnly should therefore be -AsValue, and -Quiet (which isn't really quiet, only _quieter_) should be -AsBoolean, and, though a less clear-cut case, perhaps -KeyOnly should be -AsKey.
Similarly, Get-ChildItem's -Name could be -AsName, and it is an example of how the reasonable expectation that asking for something _simpler_ or for _less_ also results in _better performance_ doesn't always hold: see #9119 - just like you found with Get-ComputerInfo -Property in #9234.
Get-Clipboard's use of -Raw is just as unfortunate as Get-Content's - I've previously proposed -Whole in #7715, but, based on the above, -AsString could work too (though, while more consistent, it is a tad more obscure in this specific case).
Get-EventLog's -AsBaseObject is a good example of what this issue calls for: requesting the usual output object, but _undecorated_ (without tacked-on ETS properties); thus, it should be -Raw (though I wonder if I've expressed my preference for -Bare before).
Measure-Object's -Line, -Word and -Character switches are interesting, because they not only result in a different type of output object, but also _change the input processing_ to count _inside_ of the input objects, i.e., the lines, words, and characters inside multi-line strings. TextMeasureInfo is derived from the abstract MeasureInfo class, just like the default GenericMeasureInfo output type.
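A brief sketch of that switch in output type and input processing (the sample text and variable names are illustrative):

```powershell
$generic  = 1, 2, 3 | Measure-Object -Sum                 # measures the objects themselves
$textInfo = "one two`nthree" | Measure-Object -Line -Word -Character   # looks inside the string

$generic.GetType().Name    # GenericMeasureInfo
$textInfo.GetType().Name   # TextMeasureInfo

$generic.Count     # 3 input objects
$textInfo.Lines    # 2 lines inside the single input string
$textInfo.Words    # 3 words inside the single input string
```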