Roslyn: [Discussion] Constant string interpolation

Created on 12 May 2016  Â·  20Comments  Â·  Source: dotnet/roslyn

Problems

There are two relevant cases, and both of them are related to nameof().

  1. Consider the following code:

csharp public static readonly string Sql = $"SELECT {nameof(ColumnX)} ..."; // moving to a constant public const string Sql = "SELECT " + nameof(ColumnX) + " ...";

Even though nameof is a constant I have to fall back on older syntax here.

  1. Another (more insidious) example:

csharp var sql = new StringBuilder(); sql.Append("SELECT * FROM " + nameof(C)); // vs sql.Append($"SELECT * FROM {nameof(C)}");

Even though approaches seem similar, the first one is a constant, while the second one does string allocation — defeating the purpose of a StringBuilder. This can be partially mitigated by having a FormattableString overload on StringBuilder, but FormattableString itself and its internal array would still need allocations.

Solution Discussion

I would like to evaluate string interpolation as a constant in certain contexts (that could only work if all values in it were constants). But I can't think of a simple syntax.

One option could be something like (const)$"A {nameof(B)} C", but I'm not too fond of it.

Area-Language Design Discussion

Most helpful comment

A more formal proposal (a bit ad-hoc, not spec-language yet):

If string interpolation expression is interpreted as a constant, it can be compiled to a constant string if: all values in {} can be compiled to constant strings; and there are no formatting specifiers on the values. Otherwise a compilation error will be reported.

An expression is interpreted as a constant if:

  1. (Current) It appears at the right hand side of const assignment -- e.g. const string A = $"{nameof(B)}";.
  2. (New) It appears within const() expression -- e.g. const($"{nameof(B)}") (similar to checked/unchecked).

All 20 comments

Similar changes were proposed several times before (see e.g. #4678). They were closed, because of issues regarding specific behavior of string.Format() which would have to be maintained.

@svick Yes -- it's clear that implicit approach cannot work here. However explicit approach can since it would explicitly limit what can be placed in {}. Since const string X = "X" + 5 is not currently supported, I wouldn't even expect (const)$"X{5}" to compile.

So this limits supported scenarios to just strings and nulls I think, and culture or custom formatters will be ignored in the same way as const string X = "X" + "Y" ignores them now.

A more formal proposal (a bit ad-hoc, not spec-language yet):

If string interpolation expression is interpreted as a constant, it can be compiled to a constant string if: all values in {} can be compiled to constant strings; and there are no formatting specifiers on the values. Otherwise a compilation error will be reported.

An expression is interpreted as a constant if:

  1. (Current) It appears at the right hand side of const assignment -- e.g. const string A = $"{nameof(B)}";.
  2. (New) It appears within const() expression -- e.g. const($"{nameof(B)}") (similar to checked/unchecked).

I don't see how that remedies the issue mentioned by @gafter concerning custom formatting and culture. With this proposal it would be quite possible for the two following expressions to return different values:

const string world = "World";

string greeting1 = $"Hello {world}!";
const string greeting2 = $"Hello {world}!";

context-based constness seem to be not a good option becasue the returned value might vary. I did suggest $$"..." as a culture-agnostic vanilla string interpolation which turns to string concatenations and remains a constant in #9407 but apparently nobody is listening.

@HaloFour If I understand the problem correctly, the case is "culture that overrides format insertion for string values"? To me it seems like an edge case that would break quite a few libraries as well. I think the spec should just make it clear that culture override is not applied to this case as an official limitation. E.g. select x doesn't call .Select(), so if .Select(x => x) is expected to do weird things some common refactorings can break code — but in practice it's not a problem.

@alrz I've searched for similar discussions, but unfortunately missed that one. I think $$"..." is similar — I suggested const($"") to avoid too many special symbols, but it can go either way depending on what people think. Having it always explicit might be the right way if my answer to @HaloFour doesn't seem convincing.

@ashmind Correct. It's fully possible to write a PigLatin culture and set it to the default for the current thread which will rewrite arbitrary string values, e.g.: $"Hello {world}!" becoming "Hello Orldway!".

It may be an edge case, but it's still the prescribed behavior according to the specification and why all of the previous requests to either optimize interpolation into simple concat and/or to permit interpolation in constant expressions have not moved forward.

@HaloFour Edge case you say, the thing is, string concatenation/interpolation should be considered as a low level feature and not something on top of internationalization and culture API which itself is a full-blown disappointment.

@HaloFour

it's still the prescribed behavior according to the specification

Only in non-constant situation -- we don't have prescribed behavior in constant case yet. Yes, prescribing it not to use culture would be inconsistent -- but is it a problem enough to require explicit opt-in instead? I can't think of a single real-world case where someone could run into it (and if they do, they would have other problems with libs etc).

Explicit opt-in could make it possible I suppose.

  1. amend specification to allow constant-expression to contain string interpolation iff all interpolated values are constant strings (any constant?).
  2. permit an explicit conversion to a constant-expression

String interpolation of a few string constants should be possible without making any changes.
I know that current culture can affect ToString format, but it does not affect if you concatenate only string constants together. I see it as a bug.

I know that current culture can affect ToString format, but it does not affect if you concatenate only string constants together.

Concatenation doesn't involve the current culture. Formatting does, and string interpolation is based on formatting. Using String.Format with all const strings still uses the current culture. There's no way to use String.Format to produce a const expression.

So you can change your implementation in compiler to solve that bug. String.Format is no needed to concatenate strings. By the way all constants are evaluated on compilation time and hard coded in binary.

It can be String.Format for non-constants and simple string concatenation for constraints. It does not affect way how it work and does not have any performance issues. It would just solve that bug and make possible developers to use new syntax.

@mklemarczyk

It's not a bug, it was explicitly designed to function like this. You not liking it doesn't make it a bug.

String.Format isn't needed for concatenating strings, but that's not what you're doing. You're interpolating them, and String.Format is needed for that based on the spec and implementation of how interpolation works in C#. If you want to concatenate strings, go right ahead and use the same syntax that has worked since C# 1.0. Changing the implementation to behave differently based on usage would produce unexpected results:

const string FOO = "FOO";
const string BAR = "BAR";
string foobar = $"{FOO}{BAR}";
const string FOOBAR = $"{FOO}{BAR}"; // illegal today

Debug.Assert(foobar == FOOBAR); // might not always be true

I made a little proof of concept turning interoplation of string constants into just a string constant (rather than a call to string.Format):
https://github.com/AustinWise/roslyn/commit/713c06d912ebe845d2731d34f7d6222f39a7ece9

I could be overlooking something, but currently things like current culture does not effect how strings are treated when they are the arguments to string.Format call. So this sort of optimization should be harmless, unless in the furture string.Format starts to behave differently on the same input. That seems unlikly though.

@AustinWise

Read the following: https://github.com/dotnet/roslyn/pull/6738#issuecomment-156257037

Even strings may be affected by the current culture depending on how it's implementation of IFormatProvider functions. It's entirely possible to define a PigLatinCulture which would modify all formatted strings. It may be an extreme (and possibly esoteric) edge case, but it is a part of the composite formatting implementation inherited by interpolation.

@HaloFour
I think if you limited the scope to interoplating string constants into a string (as apposed to creating a FormattableString), it's simpler. The generated code does not pass a value for IFormatProvider, so there is no worry about custom formatters. And String does not implement IFormattable, so only its ToString() method is called. And all that does is return itself. So the current culture should have no effect.

A legitimate concern though would be this type of optimization might complicate the C# spec. Currently the spec just says the meaning of an interoplated string is "call string.Format". Changing the compiler to just emit a string might require specifying the behavior of string.Format, which would complicate the spec a little bit.

@ashmind Maybe it's possible to use backticks _similarly_ to how it's used in JavaScript.

$`A {nameof(B)} C`

This will also be compatible with this:

$@`A {nameof(B)} C`

I would like this if it were limited very specifically to e.g. nameof() and maybe invariant interpolation with compile-time constants. Here's one example where I could have used this:

[Obsolete($"The {nameof(Foo)} method has been obsoleted, and {nameof(Bar)} should be used instead.")]
void Foo()
{
}

void Bar()
{
}

Issue moved to dotnet/csharplang #2077 via ZenHub

Was this page helpful?
0 / 5 - 0 ratings