Roslyn: IDE0071 - removing ToString causes boxing of the int value

Created on 27 Apr 2020  路  22Comments  路  Source: dotnet/roslyn

_This issue has been moved from a ticket on Developer Community._

    public class Class1
    {
        public int Value { get; set; }

        public string GetValue() => $"Value:{Value}";
        public string GetValueWithToString() => $"Value:{Value.ToString()}";
    }

When interpolating an int value, if we don't call ToString we are boxing the value. I would expect to not see the suggestion to remove the ToString call. Moreover by looking at the IL, in the GetValue() method there is string. Format call while in the GetValueWithToString() there is a string. Concat call. If we use a reference type, there is no boxing of course, but the underlying method change as well.

I agree that the syntax is more concise without the ToString but the behavior is different which I would not expect from a refactoring.


Original Comments

Visual Studio Feedback System on 4/24/2020, 00:15 AM:

We have directed your feedback to the appropriate engineering team for further evaluation. The team will review the feedback and notify you about the next steps.


Original Solutions

(no solutions)

Area-IDE Concept-Continuous Improvement Developer Community IDE-CodeStyle Resolution-By Design

Most helpful comment

Couldn't the compiler add the .ToString() itself? Then the refactoring would be correct and it would "unrefactor" behind the scenes to the faster variant.

All 22 comments

Hi, I was purposefully using ToString() in string interpolation for all value types to avoid boxing.

As I often write something like

public class Request {
   public int PageSize {get;set;}
   public int PageIndex {get;set;}
   // other stuff
   public override ToString() => $"PageSize={PageSize.ToString()}, PageIndex={PageIndex.ToString()}, ...";
}

// controller action
_logger.LogInformation("Request {r}", request);

The log will call the ToString method and I'm avoiding boxing the two int values using ToString() inside the interpolated string. I also did some benchmark to be sure I was on the right path but I'm not sure how meaningful this could be

https://github.com/davidelettieri/ToStringBenchmark

I could be wrong but I do not think this is related to boxing.

As a compiler optimization this:

public string GetValueWithToString() => $"Value:{Value.ToString()}";

will get transformed to

public string GetValueWithToString() => string.Concat("Value:", Value.ToString());

whereas this:

public string GetValue() => $"Value:{Value}";

gets transformed into this:

public string GetValue() => string.Format("Value:{0}", Value);;

so the performance difference is just calling string.Concat vs. string.Format

Yep, and String.Format's 2nd parameter is of type object, so if you pass a value type it'll get boxed.

Ah right, the runtime will box the int.

Anyways, there are a lot of refactorings that change application performance (foreach to linq for example). I am not convinced that this is worth not offering to the user because of a compiler optimization detail. You should be able to use existing configuration mechanisms to disable this analysis in your codebase

There is the detail that you change which ToString overload gets called: string interpolation will call IFormattable.ToString. But that doesn't matter for int (I don't know whether IDE0071 takes that into account).

Thanks for the feedback!
Disabling IDE0071 is OK but it will remove the suggestion for all cases, not only value types. It would be great to have to suggestion only for reference types but it's not vital.

Disabling it works for me since I'm using the additional ToString only for value types anyway.

I guess my question @davidelettieri and @CyrusNajmabadi is: if we changed it to not trigger in cases where it would cause you to call string.Format instead of string.Concat is there any case today where this rule would trigger?

This also corresponds to the editorconfig setting dotnet_style_prefer_simplified_interpolation which means it has a lot of configuration options.

You could set dotnet_style_prefer_simplified_interpolation in an editorconfig file in the directory for the files that use this pattern and get the suggestion everywhere else.

#  Using ToString will cause the compiler to call string.Concat instead of string.Format in interpolated strings
#  We do not want to call string.Format as we have no localization concerns (like dates)
#  and want the better performance of string.Concat
dotnet_style_prefer_simplified_interpolation = false

You could also use #pragma warning disable if you want future developers to not be tempted to change this (even without the ide suggesting it).

// Using ToString will cause the compiler to call string.Concat instead of string.Format in interpolated strings
// We do not want to call string.Format as we have no localization concerns (like dates)
// and want the better performance of string.Concat
#pragma warning disable IDE0071 
public class Class1
{
    public int Value { get; set; }

    public string GetValueWithToString() => $"Value:{Value.ToString()}";
}
#pragma warning restore IDE0071

Yeah, i would just say that you should just disable this feature if you don't like the results :)

It would be great to have to suggestion only for reference types

Even for reference types string.Concat is a lot quicker operation than string.Format. I would argue that if you have performance hot-paths that use string interpolation you should not enable this suggestion.

I used these two commands to create a benchmark

 dotnet new -i BenchmarkDotNet.Templates
 dotnet new benchmark --console-app -f netcoreapp3.1

added this code:

using BenchmarkDotNet.Attributes;

namespace benchmark {
    public class Benchmarks {
        class MyClass{}

        [Benchmark]
        public void Format() {
            string.Format("Value:{0}", new MyClass());
        }

        [Benchmark(Baseline = true)]
        public void Concat() {
            string.Concat("Value:", (new MyClass()).ToString());
        }
    }
}

and after dotnet run these are my results:

BenchmarkDotNet=v0.12.0, OS=Windows 10.0.18363
AMD Ryzen Threadripper 1950X, 1 CPU, 32 logical and 16 physical cores
.NET Core SDK=5.0.100-preview.1.20155.7
  [Host]     : .NET Core 3.1.3 (CoreCLR 4.700.20.11803, CoreFX 4.700.20.12001), X64 RyuJIT
  DefaultJob : .NET Core 3.1.3 (CoreCLR 4.700.20.11803, CoreFX 4.700.20.12001), X64 RyuJIT

| Method | Mean | Error | StdDev | Ratio | RatioSD |
|------- |---------:|---------:|---------:|------:|--------:|
| Format | 89.64 ns | 1.392 ns | 1.302 ns | 2.70 | 0.04 |
| Concat | 33.20 ns | 0.328 ns | 0.274 ns | 1.00 | 0.00 |

Given that which usages of string interpolation lower to String.Concat and which lower to String.Format is an implementation detail, if you specifically want to avoid calls to String.Format, I'd use a rule of "no string interpolation on hot paths". Use string concatenation instead.

int i = 3;
string s = "i: " + i;

Lowers to:

string s = string.Concat("i: ", 3.ToString());

If it helps, I ran several benchmarks after we were going through latest suggestions on Stack Overflow that mirror findings above in https://github.com/dotnet/roslyn/issues/43711#issuecomment-620860040. They're available in my benchmarks repo:

using BenchmarkDotNet.Attributes;

namespace Benchmarks
{
    [Config(typeof(Config))]
    public class InterpolationTests
    {
        private readonly int num = 5;
        private readonly string str = "test";

        private readonly int num2 = 5;
        private readonly string str2 = "test";
        private readonly string str3 = "test";
        private readonly string str4 = "test";

        [Benchmark]
        public string IntBoxing() => $"Some {num} thing {str}";
        [Benchmark]
        public string IntToString() => $"Some {num.ToString()} thing {str}";

        [Benchmark]
        public string IntBoxingMoreArgs() => $"Some {num} thing {str} {num2} {str2} {str3} {str4}";
        [Benchmark]
        public string IntToStringMoreArgs() => $"Some {num.ToString()} thing {str} {num2.ToString()} {str2} {str3} {str4}";
    }
}

2020-04-09 17_32_15-Benchmarks

Given the performance difference is not minor, what's the feasibility on disabling in the string.Concat() vs. string.Format() changeover case? I'm also poking here because it's a _default on_ analyzer when devs upgrade. It'd be great if we could get the benefits without the downsides.

Couldn't the compiler add the .ToString() itself? Then the refactoring would be correct and it would "unrefactor" behind the scenes to the faster variant.

Aside if its an reference type then calling .ToString() in the interpolated string lets the GC reference go whereas not having .ToString() holds onto it for longer. https://twitter.com/anandgupta08/status/1247087284857368576

Yes. it feels very weir dthat a user would need to 'optimize' this at the level instead of the simple call being the optimal one.

Especially considering the compiler already does _exactly_ this optimization for string concatenation case:

using System;
public class C {
    public string M1(int value) {
        return "first " + value + " second";
    }

    public string M2(int value) {
        return $"first {value} second";
    }
}

https://sharplab.io/#v2:C4LglgNgPgAgTARgLACgYGYAE9MGFMDeqmJ2WMCADJgLIIAUYAdsJgG4CGEArgKYCUhYqREwA7JgBEAMzAAnAM6tJmANTsufNVMwLeAYwD2TACaSA3MJIBfVFbLYqtOIxYaeAoShGiJAEhl5JUJOD2tdA2MzS29SWxRrIA==

My interpretation is this:

  1. The code fix suggests code which is easier to read
  2. The code fix suggests code which is functionally equivalent to the original
  3. For most applications/users/scenarios, the performance difference between the two approaches is negligible

    1. The cost of string allocations is likely higher than boxing, so if this did appear on a performance-sensitive path it would likely be part of a pooling or caching strategy anyway

    2. Users who do not fall into this group are likely to be aware that they fall into such a category and disable this analyzer anyway

  4. The compiler and/or runtime have been known in the past to optimize cases like those produced by the code fix

It seems the diagnostic and code fix are performing as expected for majority audiences. My recommendation would be filing a code generation feature request for either the C# compiler or the JIT over in dotnet/runtime to detect and optimize cases produced by this code fix.

Couldn't the compiler add the .ToString() itself? Then the refactoring would be correct and it would "unrefactor" behind the scenes to the faster variant.

The reason this cannot be done (as I understand it) is that the C# spec for string interpolation does not allow this. My assumption is that if this is something that we would like to change, a proposal on dotnet/csharplang first needs to be approved.

but @gafter is the real authority on whether this can be done. and if doing it would require a spec change

I'm pretty sure we would be able to take such a change (though it may need to go through LDM). We're already planning on taking a change to allow string-interpolations to be constants in C#9. It would likely make sense to roll this into that or similar efforts.

First off, this has been raised before: #194, #4678, #6669, #6738, #9212.

Neal's assertion is that $"{x}" is equivalent to string.Format("{0}", x) is equivalent to string.Format(CultureInfo.CurrentCulture, "{0}", x) (1, 2, 3). The spec says that string.Format needs to do a bunch of logic around ICustomFormatter and IFormattable, which it would be hard to get the compiler to generate each time.

I suspect that spec might have been clarified since Neal looked at it: as I read it now, a lot of the points are conditioned with "if a composite formatting method with a non-null IFormatProvider argument is called" or similar: if IFormatProvider is null, then the stuff around ICustomFormatter goes away, for instance. I also can't see any indication that string.Format("{0}", x) is equivalent to string.Format(CultureInfo.CurrentCulture, "{0}", x) either, and it's not what the runtime does: that passes null as the IFormatProvider. I couldn't reproduce the claimed behaviour where string interpolation could fetch an ICustomFormatter either (but this comment suggests that's a bug?)

If it is true that the ICustomFormatter stuff is a red herring, then I think we might be able to lower:

object o = ...;
string s = $"o: {o})";

as:

object o = ...;
string s = string.Concat("o: ", (o is IFormattable) ? ((IFormattable)o).ToString(null, null) : o?.ToString());

with all of the considerations around copying value types, null-safety, etc, that went into #35006. (Of course, in IL you can invoke ((IFormattable)o).ToString(null, null) without boxing).

Maintaining the order of evaluation is also annoying: if a and b are expressions, then string.Format("{0}{1}", a, b) is not the same as string.Format("{0}{1}", a.ToString(), b.ToString()). Mind you, string concatenation has always violated this (in different ways) and noone seems to have noticed apart from ILSpy: #38641.

There are also quite a few issues over in dotnet/runtime about changes/alternatives to string.Format which don't require boxing value types.

My interpretation is this:

1. The code fix suggests code which is easier to read

2. The code fix suggests code which is functionally equivalent to the original

3. For most applications/users/scenarios, the performance difference between the two approaches is negligible

   1. The cost of string allocations is likely higher than boxing, so if this did appear on a performance-sensitive path it would likely be part of a pooling or caching strategy anyway
   2. Users who do not fall into this group are likely to be aware that they fall into such a category and disable this analyzer anyway

4. The compiler and/or runtime have been known in the past to optimize cases like those produced by the code fix

It seems the diagnostic and code fix are performing as expected for majority audiences. My recommendation would be filing a code generation feature request for either the C# compiler or the JIT over in dotnet/runtime to detect and optimize cases produced by this code fix.

Design meeting decision: This is by design, per @sharwell's comment above. Also, a DiagnosticSuppressor can be implemented by users for their specific needs.

Note: def feel free to open issue against compiler to improve codegen here. it def seems desirable.

Opened issue for compiler to improve codegen https://github.com/dotnet/roslyn/issues/44168

Was this page helpful?
0 / 5 - 0 ratings