Roslyn: Static String.Equals() should never be allowed to silently revert to Object.Equals()

Created on 17 Sep 2016  路  3Comments  路  Source: dotnet/roslyn

Version Used: 4.5.2

Steps to Reproduce:

Evaluate this expression: string.Equals(1, "1")

Expected Behavior:
This expression should either fail to compile, or return true.

Actual Behavior:
It currently returns false. Even though string.Equals() is expressed, indicating a specific interest in a string-based comparison, object.Equals() is actually invoked, which merely does a reference comparison.

Opinion

This kind of static inheritance isn't permitted anywhere else in the language as near as I can tell. The current behavior presents a huge pitfall for refactoring.

Suppose you start with some legacy code rich in primitive envy:

            string myColor = "blue";
...
            if (string.Equals(myColor, "blue"))
            {
                // handle blue
            }

and begin encapsulating strings with types:

            var myColor = new Color("blue");
...
            if (string.Equals(myColor, "blue"))
            {
                // handle blue
            }

Colors.Blue has not been implemented to automatically convert to a string (no operator string() defined). The string.Equals() continues to compile, but quietly switches from string.Equals() to object.Equals(), where it is no longer getting a string comparison. The whole point of introducing types is to introduce safety that prevents these kinds of inadvertent changes in behavior, but in this case the vaccine is killing the patient.

Possible Fixes

I can think of several different ways to improve the situation:

  1. Do not inherit static members of object, ever. The refactored code above would fail to compile because Colors.Blue can't convert to support one of the overloads of String.Equals().
  2. Do not inherit static members of object into types which have overloads defined for those members. For example, String already has String.Equals(), which implies that it implements "semantically string" comparisons. Those overloads would hide the existence of the Equals() methods in in Object. The refactored code above would fail to compile because Colors.Blue can't convert to support one of the overloads of String.Equals().
  3. Implement a String.Equals(object, object) overload which calls through to Object.Equals(), but make it a suppressable compiler error to use it.
  4. Implement a String.Equals(object, object) overload, to directly replace the one in Object with one that is semantically correct for String. (i.e. calls ToString() on the parameters).

Any of these would provide a reasonable path for refactoring more safely. I lean toward the ones that yield a compiler error, since any of those would guide the developer to reevaluate whether a string-comparison is still functionally equivalent to what was happening prior to the refactoring. That might be when they remember to implement ToString(), or decide that they need to do a different kind of comparison.

Area-Analyzers Feature Request help wanted

Most helpful comment

This behavior is indeed confusing (another example: which encoding do you think UTF32Encoding.Default returns?), but I don't see how could it be fixed without breaking backwards compatibility.

All 3 comments

This behavior is indeed confusing (another example: which encoding do you think UTF32Encoding.Default returns?), but I don't see how could it be fixed without breaking backwards compatibility.

Yeah, I threw a bone in the compatibility direction with option 3 above. E.g.:

class String
{
    [Obsolete("Use object.Equals(object, object), or string.Equals(string, string) instead")]
    static bool Equals(object obj1, object obj2) {return object.Equals(obj1, obj2);}
}

This would just produce a warning, and provide compatible behavior by default. I suppose you can treat warnings as errors, but a warning alone might be enough.

This is an extremely targeted tactic; it can be used anywhere that static overloads exist. I suppose it could even help with UTF32Encoding.Default.

I don't know where I got the impression that C# didn't support static inheritance, that is clearly wrong. It creates a brittle surface area, one which is very challenging to change while maintaining backward compatibility.

As a result, it seems that every class has a responsibility to consider what static methods it is indirectly exposing from classes below, and to evaluate whether to allow them or replace them. String and UTF32Encoding were both failures in this regard. But even if they weren't, at any moment someone could add yet another static method to a base class and undo all of the rigor that had been performed on them. Derived classes are essentially powerless to control their own surface area.

Derived classes should have to opt into exposing base class statics, not opt out. That's _definitely_ not a change that can be made without breaking compatibility. Maybe a /requireoptinforstaticinheritance compiler option could compile types with that model. And then a syntax would be needed for declaring which methods should be re-exposed.

Going the other way, maybe Encoding should be able to define Encoding.Default, and declare it uninheritable, because they know its very idea doesn't translate well to derived classes. Intellisense could still help with discovery, but it could flip over to direct invocation via the base class to eliminate the ambiguity, leaving invocation in the derived class as a compiler error.

I've also been burned by this in the past, but at this point I think any language changes would be impractical.

This seems like a great scenario for an analyzer though. In fact I'd probably go so far as flagging any static member access which is done via a type _other than_ the declaring type.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

ilexp picture ilexp  路  167Comments

MgSam picture MgSam  路  152Comments

stephentoub picture stephentoub  路  259Comments

MadsTorgersen picture MadsTorgersen  路  542Comments

gafter picture gafter  路  279Comments