PowerShell: Introduce a new, single-line raw string literal (single-line here-string alternative)

Created on 17 Jul 2020 · 8 Comments · Source: PowerShell/PowerShell

Note: This idea was first proposed by @bergmeister and fleshed out by @lzybkr in https://github.com/PowerShell/PowerShell/issues/2337#issuecomment-391152107, but only in the context of a proposal focused on something different (allowing indentation of the closing delimiter of here-strings).

Summary of the new feature/enhancement

To complement the existing raw / semi-raw, invariably _multi-line_ here-string literals (@'<newline>...<newline>'@ / @"<newline>...<newline>"@), it would be convenient to have a _single-line_ variant.

For lack of a better term I'll call the new variant a _raw string literal_.

For instance, here's how you could express verbatim string 6' 2":

# Note: leading and trailing whitespace inside the value will be ignored.

# single-quoted
@' 6' 2" '@

# double-quoted - as with double-quoted here-strings, string interpolation and `-escaping would work.
@" $feet' 2" "@

To avoid the need for escaping even when the value itself contains the delimiter, allow a _variable_ number of quotes in the delimiter; e.g.:

@'' using @' in here is now fine ''@
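For comparison, here is how a verbatim value such as 6' 2" has to be written in PowerShell today, without the proposed syntax. This is a minimal sketch that uses only existing string forms; the variable names are purely illustrative.

```powershell
# Single-quoted string: each embedded ' must be doubled.
$single = '6'' 2"'

# Double-quoted string: each embedded " must be escaped with ` (or doubled).
$double = "6' 2`""

# Single-quoted here-string: no escaping needed, but it always spans at least three lines.
$here = @'
6' 2"
'@

# All three forms yield the same value.
$single -eq $double -and $double -eq $here   # True
```

The proposed single-line form aims to combine the "no escaping" property of the here-string with the compactness of the first two forms.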

A use case that would benefit is wanting to pass a command line written for the native shell _as-is_ to it (see also: #13068); e.g. (on Unix):

# WISHFUL THINKING
sh -c @' python -c 'print("hi")' | cat -n '@

Proposed technical implementation details (optional)

Backward compatibility assessment:

Since with the current here-string literals no (non-whitespace) characters are allowed on the same line after the opening delimiter, and given that trying to do that is currently a _syntax error_, this new string-literal variant is safe to introduce.

Disambiguation:

Anything that starts with @' or @" and is _followed by non-whitespace characters on the same line_ is interpreted as the new _raw_ string literal, requiring the closing delimiter to be _on the same line_.

  • Note: Conceivably, we could let the new variant span multiple lines as well, but it strikes me as conceptually cleaner to restrict the multi-line forms to the established here-string syntax.
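The disambiguation rule can be sketched as a small classifier. This is only an illustration of the rule as stated, not how the PowerShell tokenizer is implemented; the function name Get-LiteralKind and its return values are hypothetical, and the sketch assumes the token starts at the beginning of the given line.

```powershell
function Get-LiteralKind {
    param([string] $Line)

    # Existing here-string opener: @' or @" followed only by optional whitespace.
    if ($Line -match '^@[''"]\s*$') { return 'here-string' }

    # Proposed form: @' or @" followed by at least one
    # non-whitespace character on the same line.
    if ($Line -match '^@[''"].*\S') { return 'raw-literal (proposed)' }

    return 'other'
}

Get-LiteralKind "@'"          # here-string
Get-LiteralKind "@' 'hi '@"   # raw-literal (proposed)
```

Note that this simple check does not yet account for delimiters with multiple quotes, where an opener such as @'' alone on a line would need a separate decision.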

Insignificant leading and trailing whitespace:

Ignoring whitespace surrounding the value serves two purposes:

  • It makes it easier to visually distinguish the delimiters from the enclosed value (e.g., @' 6' 2" '@ vs. @'6' 2"'@)
  • Since supporting a _variable_ number of quotes in the delimiter is desirable (e.g., @'' @' is fine ''@), ignoring surrounding whitespace solves the problem of enclosed values _starting or ending with a quote_ that would otherwise break the syntax (e.g., to embed verbatim 'hi, @''hi'@ wouldn't work syntactically, but @' 'hi '@ - with surrounding whitespace stripped on parsing - does).
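To make the interplay between variable-length delimiters and whitespace stripping concrete, here is a rough sketch of how the single-quoted form could be decoded. The function name ConvertFrom-RawLiteral is hypothetical, and the sketch assumes the whole input is one literal (a real tokenizer would instead scan for the first matching closing delimiter).

```powershell
function ConvertFrom-RawLiteral {
    param([string] $Text)

    # Opening delimiter: @ followed by one or more single quotes (greedy).
    if ($Text -notmatch '^@(''+)') {
        throw "Not a single-quoted raw literal: $Text"
    }
    $open  = '@' + $Matches[1]
    $close = $Matches[1] + '@'   # closing delimiter mirrors the opener

    if ($Text.Length -lt ($open.Length + $close.Length) -or
        -not $Text.EndsWith($close, [System.StringComparison]::Ordinal)) {
        throw "Missing closing delimiter $close in: $Text"
    }

    $inner = $Text.Substring($open.Length, $Text.Length - $open.Length - $close.Length)
    # Surrounding whitespace is insignificant under this variant of the proposal.
    $inner.Trim()
}

ConvertFrom-RawLiteral "@'' using @' in here is now fine ''@"   # using @' in here is now fine
ConvertFrom-RawLiteral "@' 'hi '@"                              # 'hi
```

With this greedy reading, @''hi'@ fails: the opener is taken to be @'' and the matching closer ''@ is never found, which is the syntactic problem the whitespace rule avoids.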

@TSlivede proposes the following alternatives, which consider all whitespace significant, albeit at the expense of the visual separation between the delimiters and the value:

  • limiting the number of quotes to _3 at most_ - see below.
  • instead of making the number of _quotes_ variable, duplicate the @ symbol - see below.

Values that need surrounding whitespace to be significant:

In cases where you need the enclosed value to _start and/or end with whitespace_, the following solutions are possible:

  • Use the _double-quoted_ form, where `-escaping can be used to escape the spaces.

    • E.g., to get verbatim <space>a"b'<space>, you'd use @"` a"b'` "@.

    • Alternatively, @TSlivede proposes considering only _one_ leading/trailing space insignificant, in which case you merely need to add one extra space each to values with significant whitespace, in both the single- and double-quoted forms; e.g., @'<space> a"b <space>'@ and @"<space> a"b <space>"@. However, a concern is that users may not expect that only a _specific number_ (i.e., one) of spaces is insignificant; see below.

  • Alternatively, use the established multi-line here-string forms, where all whitespace is significant.

Issue-Enhancement WG-Language

All 8 comments

Thanks for creating an explicit issue for this!

Alternative suggestion regarding the removal of "Insignificant leading and trailing whitespace":

How about considering only a maximum of one leading space and one trailing space as "insignificant" and remove it?

This would still allow this:

  • It makes it easier to visually distinguish the delimiters from the enclosed value (e.g., @' 6' 2" '@ vs. @'6' 2"'@)
  • Since supporting a variable number of quotes in the delimiter is desirable (e.g., @'' @' is fine ''@), ignoring surrounding whitespace solves the problem of enclosed values starting or ending with a quote that would otherwise break the syntax (e.g., to embed verbatim 'hi, @''hi'@ wouldn't work syntactically, but @' 'hi '@ - with surrounding whitespace stripped on parsing - does).

And it would also improve "Values that need surrounding whitespace to be significant":
Just put one more space to the beginning and one more to the end.

Thanks, @TSlivede, I've folded your suggestion into the OP. I like it, but a slight concern is that tying the behavior to a _specific number_ of spaces, i.e., exactly one, could be a bit obscure, and that people may be more used to _any number_ of leading/trailing spaces being insignificant, such as in inline code elements enclosed in ` in Markdown.

people may be more used to any number of leading/trailing spaces being insignificant

Yes you are right, the syntax I suggested might be very surprising to new users, maybe it's not such a good idea after all.

It could, for example, be especially problematic if someone wants a string starting with e.g. 4 spaces. To test if spaces are preserved he enters @'    test string'@ (4 spaces) and PowerShell prints    test string (three spaces), which looks very similar to four leading spaces. That user would probably not notice the missing space and would now have a very subtle bug in his code...
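The pitfall can be simulated in today's PowerShell. This is a sketch only: Remove-OneSurroundingSpace is a hypothetical helper that mimics the "at most one leading and one trailing space is insignificant" rule on an ordinary string.

```powershell
function Remove-OneSurroundingSpace {
    param([string] $Value)
    # Strip at most one leading and at most one trailing space.
    $Value -replace '^ ', '' -replace ' $', ''
}

$typed  = '    test string'                      # what the user typed: 4 leading spaces
$parsed = Remove-OneSurroundingSpace $typed      # what the rule would produce

'[{0}] ({1} chars)' -f $typed,  $typed.Length    # [    test string] (15 chars)
'[{0}] ({1} chars)' -f $parsed, $parsed.Length   # [   test string] (14 chars)
```

Without the brackets and the length, the two values are hard to tell apart on screen, which is the concern raised above.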

Good points. I take it then that this would require at least the initial leading space to be escaped with `?

@vexx32, yes, but note that the use of ` for escaping requires a switch to the _double-quoted_ form, because in the single-quoted one the ` would be a literal that is retained. Using a single-quoted (invariably multi-line) here-string instead avoids that.

@TSlivede :) I've unindented the paragraph, and I've also added our (later) concern about the suggestion expressed therein.

Another alternative idea:

Allow only a maximum of three quotes in the delimiters and don't remove leading/trailing whitespace.

This way we would lose the visual advantage

  • It makes it easier to visually distinguish the delimiters from the enclosed value (e.g., @' 6' 2" '@ vs. @'6' 2"'@)

(@' 6' 2" '@ and @'6' 2"'@ would not be equivalent)

But we would gain an easy option for leading or trailing significant whitespace.

Leading quotes would also be easy: As only a maximum of three quotes are considered part of the string delimiter, the fourth quote would be a literal quote.

e.g., to embed verbatim 'hi one could use @''''hi'''@
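A rough sketch of this "at most three quotes" reading, for illustration only. ConvertFrom-TripleQuoteLiteral is a hypothetical name, and the sketch assumes the opener greedily takes up to three quotes, the closer mirrors it, and the whole input is one literal.

```powershell
function ConvertFrom-TripleQuoteLiteral {
    param([string] $Text)

    # Greedily take one to three quotes after @ as the opening delimiter;
    # any further quotes belong to the value.
    if ($Text -notmatch '^@(''{1,3})') {
        throw "Not a single-quoted raw literal: $Text"
    }
    $quotes = $Matches[1]
    $close  = $quotes + '@'
    $start  = 1 + $quotes.Length
    $length = $Text.Length - $start - $close.Length

    if ($length -lt 0 -or -not $Text.EndsWith($close, [System.StringComparison]::Ordinal)) {
        throw "Missing closing delimiter $close in: $Text"
    }

    # No trimming: all whitespace between the delimiters is significant.
    $Text.Substring($start, $length)
}

ConvertFrom-TripleQuoteLiteral "@''''hi'''@"    # 'hi
ConvertFrom-TripleQuoteLiteral "@' 6' 2`" '@"   # <space>6' 2"<space>
```

Under these assumptions, a value that starts with a quote has to be written with the full three-quote delimiter, since a shorter opener would absorb the leading quote.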

As with my previous suggestion, the behavior is tied to a specific number of quotes (not spaces in this case 😉). But three quotes is something users might already have seen for here-strings: Python, Kotlin, Scala, Groovy

Addition to that suggestion: To allow embedding literal '''@ maybe allow multiple (unlimited?) @ symbols in the string delimiters. E.g.:

@@'some string containing >>>'@<<< - delimited with multiple @-symbols'@@