Roslyn: Reuse Local and Anonymous Functions where possible (compiler optimization)

Created on 13 Apr 2020 · 14 Comments · Source: dotnet/roslyn

The compiler currently doesn't optimize local functions and anonymous functions well.
Consider the following C# code (from SharpLab):


Source Code

using System;

public class ExampleClass {
    public object ExampleMethod() {
        bool ExampleLocal(Char t) {
            return char.IsLetter(t) && char.IsLower(t);
        }
        bool ExampleLocal2(Char t) {
            return char.IsLetter(t) && char.IsLower(t);
        }
        Func<char, bool> ExampleLocalWithAnnonymous() {
            return e => char.IsLetter(e) && char.IsLower(e);
        }
        Func<char, bool> ExampleLocalWithAnnonymous2() {
            return e => char.IsLetter(e) && char.IsLower(e);
        }
        return (
            (Func<char, bool>)(e => char.IsLetter(e) && char.IsLower(e)),
            (Func<char, bool>)(e => char.IsLetter(e) && char.IsLower(e)),
            ExampleLocalWithAnnonymous(),
            ExampleLocalWithAnnonymous2(),
            (Func<char, bool>)ExampleLocal,
            (Func<char, bool>)ExampleLocal2
        );
    }

    public object ExampleMethod2() {
        bool ExampleLocal(Char t) {
            return char.IsLetter(t) && char.IsLower(t);
        }
        bool ExampleLocal2(Char t) {
            return char.IsLetter(t) && char.IsLower(t);
        }
        Func<char, bool> ExampleLocalWithAnnonymous() {
            return e => char.IsLetter(e) && char.IsLower(e);
        }
        Func<char, bool> ExampleLocalWithAnnonymous2() {
            return e => char.IsLetter(e) && char.IsLower(e);
        }
        return (
            (Func<char, bool>)(e => char.IsLetter(e) && char.IsLower(e)),
            (Func<char, bool>)(e => char.IsLetter(e) && char.IsLower(e)),
            ExampleLocalWithAnnonymous(),
            ExampleLocalWithAnnonymous2(),
            (Func<char, bool>)ExampleLocal,
            (Func<char, bool>)ExampleLocal2
        );
    }
}

public class ExampleClass2 {
    public object ExampleMethod() {
        bool ExampleLocal(Char t) {
            return char.IsLetter(t) && char.IsLower(t);
        }
        bool ExampleLocal2(Char t) {
            return char.IsLetter(t) && char.IsLower(t);
        }
        Func<char, bool> ExampleLocalWithAnnonymous() {
            return e => char.IsLetter(e) && char.IsLower(e);
        }
        Func<char, bool> ExampleLocalWithAnnonymous2() {
            return e => char.IsLetter(e) && char.IsLower(e);
        }
        return (
            (Func<char, bool>)(e => char.IsLetter(e) && char.IsLower(e)),
            (Func<char, bool>)(e => char.IsLetter(e) && char.IsLower(e)),
            ExampleLocalWithAnnonymous(),
            ExampleLocalWithAnnonymous2(),
            (Func<char, bool>)ExampleLocal,
            (Func<char, bool>)ExampleLocal2
        );
    }

    public object ExampleMethod2() {
        bool ExampleLocal(Char t) {
            return char.IsLetter(t) && char.IsLower(t);
        }
        bool ExampleLocal2(Char t) {
            return char.IsLetter(t) && char.IsLower(t);
        }
        Func<char, bool> ExampleLocalWithAnnonymous() {
            return e => char.IsLetter(e) && char.IsLower(e);
        }
        Func<char, bool> ExampleLocalWithAnnonymous2() {
            return e => char.IsLetter(e) && char.IsLower(e);
        }
        return (
            (Func<char, bool>)(e => char.IsLetter(e) && char.IsLower(e)),
            (Func<char, bool>)(e => char.IsLetter(e) && char.IsLower(e)),
            ExampleLocalWithAnnonymous(),
            ExampleLocalWithAnnonymous2(),
            (Func<char, bool>)ExampleLocal,
            (Func<char, bool>)ExampleLocal2
        );
    }
}

This code is currently compiled as (modified a bit to make it shorter here):


Compiled Code

public class ExampleClass {
    [Serializable]
    [CompilerGenerated]
    private sealed class <>c {
        public static readonly <>c <>9 = new <>c();

        public static Func<char, bool> <>9__0_6;
        public static Func<char, bool> <>9__0_7;
        public static Func<char, bool> <>9__0_4;
        public static Func<char, bool> <>9__0_5;
        public static Func<char, bool> <>9__1_6;
        public static Func<char, bool> <>9__1_7;
        public static Func<char, bool> <>9__1_4;
        public static Func<char, bool> <>9__1_5;

        private bool <ExampleMethod>g__ExampleLocal|0_0(char t)
            => char.IsLetter(t) && char.IsLower(t);
        private bool <ExampleMethod>g__ExampleLocal2|0_1(char t)
            => char.IsLetter(t) && char.IsLower(t);
        internal bool <ExampleMethod>b__0_6(char e)
            => char.IsLetter(e) && char.IsLower(e);
        internal bool <ExampleMethod>b__0_7(char e)
            => char.IsLetter(e) && char.IsLower(e);
        internal bool <ExampleMethod>b__0_4(char e)
            => char.IsLetter(e) && char.IsLower(e);
        internal bool <ExampleMethod>b__0_5(char e)
            => char.IsLetter(e) && char.IsLower(e);
        private bool <ExampleMethod2>g__ExampleLocal|1_0(char t)
            => char.IsLetter(t) && char.IsLower(t);
        private bool <ExampleMethod2>g__ExampleLocal2|1_1(char t)
            => char.IsLetter(t) && char.IsLower(t);
        internal bool <ExampleMethod2>b__1_6(char e)
            => char.IsLetter(e) && char.IsLower(e);
        internal bool <ExampleMethod2>b__1_7(char e)
            => char.IsLetter(e) && char.IsLower(e);
        internal bool <ExampleMethod2>b__1_4(char e)
            => char.IsLetter(e) && char.IsLower(e);
        internal bool <ExampleMethod2>b__1_5(char e)
            => char.IsLetter(e) && char.IsLower(e);
    }

    public object ExampleMethod() {
        return new ValueTuple<Func<char, bool>, Func<char, bool>, Func<char, bool>, Func<char, bool>, Func<char, bool>, Func<char, bool>>(<>c.<>9__0_4 ?? (<>c.<>9__0_4 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_4)), <>c.<>9__0_5 ?? (<>c.<>9__0_5 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_5)), <ExampleMethod>g__ExampleLocalWithAnnonymous|0_2(), <ExampleMethod>g__ExampleLocalWithAnnonymous2|0_3(), new Func<char, bool>(<>c.<>9.<ExampleMethod>g__ExampleLocal|0_0), new Func<char, bool>(<>c.<>9.<ExampleMethod>g__ExampleLocal2|0_1));
    }

    public object ExampleMethod2() {
        return new ValueTuple<Func<char, bool>, Func<char, bool>, Func<char, bool>, Func<char, bool>, Func<char, bool>, Func<char, bool>>(<>c.<>9__1_4 ?? (<>c.<>9__1_4 = new Func<char, bool>(<>c.<>9.<ExampleMethod2>b__1_4)), <>c.<>9__1_5 ?? (<>c.<>9__1_5 = new Func<char, bool>(<>c.<>9.<ExampleMethod2>b__1_5)), <ExampleMethod2>g__ExampleLocalWithAnnonymous|1_2(), <ExampleMethod2>g__ExampleLocalWithAnnonymous2|1_3(), new Func<char, bool>(<>c.<>9.<ExampleMethod2>g__ExampleLocal|1_0), new Func<char, bool>(<>c.<>9.<ExampleMethod2>g__ExampleLocal2|1_1));
    }

    [CompilerGenerated]
    private static Func<char, bool> <ExampleMethod>g__ExampleLocalWithAnnonymous|0_2()
        => <>c.<>9__0_6 ?? (<>c.<>9__0_6 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_6));
    [CompilerGenerated]
    private static Func<char, bool> <ExampleMethod>g__ExampleLocalWithAnnonymous2|0_3()
        => <>c.<>9__0_7 ?? (<>c.<>9__0_7 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_7));
    [CompilerGenerated]
    private static Func<char, bool> <ExampleMethod2>g__ExampleLocalWithAnnonymous|1_2()
        => <>c.<>9__1_6 ?? (<>c.<>9__1_6 = new Func<char, bool>(<>c.<>9.<ExampleMethod2>b__1_6));    
    [CompilerGenerated]
    private static Func<char, bool> <ExampleMethod2>g__ExampleLocalWithAnnonymous2|1_3()
        => <>c.<>9__1_7 ?? (<>c.<>9__1_7 = new Func<char, bool>(<>c.<>9.<ExampleMethod2>b__1_7));
}

public class ExampleClass2 {
    [Serializable]
    [CompilerGenerated]
    private sealed class <>c {
        public static readonly <>c <>9 = new <>c();

        public static Func<char, bool> <>9__0_6;
        public static Func<char, bool> <>9__0_7;
        public static Func<char, bool> <>9__0_4;
        public static Func<char, bool> <>9__0_5;
        public static Func<char, bool> <>9__1_6;
        public static Func<char, bool> <>9__1_7;
        public static Func<char, bool> <>9__1_4;
        public static Func<char, bool> <>9__1_5;

        private bool <ExampleMethod>g__ExampleLocal|0_0(char t)
            => char.IsLetter(t) && char.IsLower(t);
        private bool <ExampleMethod>g__ExampleLocal2|0_1(char t)
            => char.IsLetter(t) && char.IsLower(t);
        internal bool <ExampleMethod>b__0_6(char e)
            => char.IsLetter(e) && char.IsLower(e);
        internal bool <ExampleMethod>b__0_7(char e)
            => char.IsLetter(e) && char.IsLower(e);
        internal bool <ExampleMethod>b__0_4(char e)
            => char.IsLetter(e) && char.IsLower(e);
        internal bool <ExampleMethod>b__0_5(char e)
            => char.IsLetter(e) && char.IsLower(e);
        private bool <ExampleMethod2>g__ExampleLocal|1_0(char t)
            => char.IsLetter(t) && char.IsLower(t);
        private bool <ExampleMethod2>g__ExampleLocal2|1_1(char t)
            => char.IsLetter(t) && char.IsLower(t);
        internal bool <ExampleMethod2>b__1_6(char e)
            => char.IsLetter(e) && char.IsLower(e);
        internal bool <ExampleMethod2>b__1_7(char e)
            => char.IsLetter(e) && char.IsLower(e);
        internal bool <ExampleMethod2>b__1_4(char e)
            => char.IsLetter(e) && char.IsLower(e);
        internal bool <ExampleMethod2>b__1_5(char e)
            => char.IsLetter(e) && char.IsLower(e);
    }

    public object ExampleMethod() {
        return new ValueTuple<Func<char, bool>, Func<char, bool>, Func<char, bool>, Func<char, bool>, Func<char, bool>, Func<char, bool>>(<>c.<>9__0_4 ?? (<>c.<>9__0_4 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_4)), <>c.<>9__0_5 ?? (<>c.<>9__0_5 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_5)), <ExampleMethod>g__ExampleLocalWithAnnonymous|0_2(), <ExampleMethod>g__ExampleLocalWithAnnonymous2|0_3(), new Func<char, bool>(<>c.<>9.<ExampleMethod>g__ExampleLocal|0_0), new Func<char, bool>(<>c.<>9.<ExampleMethod>g__ExampleLocal2|0_1));
    }

    public object ExampleMethod2() {
        return new ValueTuple<Func<char, bool>, Func<char, bool>, Func<char, bool>, Func<char, bool>, Func<char, bool>, Func<char, bool>>(<>c.<>9__1_4 ?? (<>c.<>9__1_4 = new Func<char, bool>(<>c.<>9.<ExampleMethod2>b__1_4)), <>c.<>9__1_5 ?? (<>c.<>9__1_5 = new Func<char, bool>(<>c.<>9.<ExampleMethod2>b__1_5)), <ExampleMethod2>g__ExampleLocalWithAnnonymous|1_2(), <ExampleMethod2>g__ExampleLocalWithAnnonymous2|1_3(), new Func<char, bool>(<>c.<>9.<ExampleMethod2>g__ExampleLocal|1_0), new Func<char, bool>(<>c.<>9.<ExampleMethod2>g__ExampleLocal2|1_1));
    }

    [CompilerGenerated]
    private static Func<char, bool> <ExampleMethod>g__ExampleLocalWithAnnonymous|0_2()
        => <>c.<>9__0_6 ?? (<>c.<>9__0_6 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_6));
    [CompilerGenerated]
    private static Func<char, bool> <ExampleMethod>g__ExampleLocalWithAnnonymous2|0_3()
        => <>c.<>9__0_7 ?? (<>c.<>9__0_7 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_7));
    [CompilerGenerated]
    private static Func<char, bool> <ExampleMethod2>g__ExampleLocalWithAnnonymous|1_2()
        => <>c.<>9__1_6 ?? (<>c.<>9__1_6 = new Func<char, bool>(<>c.<>9.<ExampleMethod2>b__1_6));    
    [CompilerGenerated]
    private static Func<char, bool> <ExampleMethod2>g__ExampleLocalWithAnnonymous2|1_3()
        => <>c.<>9__1_7 ?? (<>c.<>9__1_7 = new Func<char, bool>(<>c.<>9.<ExampleMethod2>b__1_7));
}

That code could be optimized to:


Hand-Written Code

public class ExampleClass {    
    private static class HelperClass {
        public static Func<char, bool> function = Function;
        private static bool Function(char e) => char.IsLetter(e) && char.IsLower(e);
    }

    private static bool LocalFunction(char e) => char.IsLetter(e) && char.IsLower(e);

    private static Func<char, bool> LocalFunction2() => HelperClass.function;

    public object ExampleMethod() {
        return (
            HelperClass.function,
            HelperClass.function,
            LocalFunction2(),
            LocalFunction2(),
            (Func<char, bool>)LocalFunction,
            (Func<char, bool>)LocalFunction
        );
    }

    public object ExampleMethod2() {
        return (
            HelperClass.function,
            HelperClass.function,
            LocalFunction2(),
            LocalFunction2(),
            (Func<char, bool>)LocalFunction,
            (Func<char, bool>)LocalFunction
        );
    }
}

public class ExampleClass2 {    
    private static class HelperClass {
        public static Func<char, bool> function = Function;
        private static bool Function(char e) => char.IsLetter(e) && char.IsLower(e);
    }

    private static bool LocalFunction(char e) => char.IsLetter(e) && char.IsLower(e);

    private static Func<char, bool> LocalFunction2() => HelperClass.function;

    public object ExampleMethod() {
        return (
            HelperClass.function,
            HelperClass.function,
            LocalFunction2(),
            LocalFunction2(),
            (Func<char, bool>)LocalFunction,
            (Func<char, bool>)LocalFunction
        );
    }

    public object ExampleMethod2() {
        return (
            HelperClass.function,
            HelperClass.function,
            LocalFunction2(),
            LocalFunction2(),
            (Func<char, bool>)LocalFunction,
            (Func<char, bool>)LocalFunction
        );
    }
}

Or with some tricky stuff:


Hand-Written Code

internal class HelperClass {
    public static Func<char, bool> function = Function;
    private static bool Function(char e) => char.IsLetter(e) && char.IsLower(e);
}

public class ExampleClass {        
    public object ExampleMethod() {
        return (HelperClass.function, HelperClass.function, HelperClass.function, HelperClass.function, HelperClass.function, HelperClass.function);
    }

    public object ExampleMethod2() {
        return (HelperClass.function, HelperClass.function, HelperClass.function, HelperClass.function, HelperClass.function, HelperClass.function);
    }
}

public class ExampleClass2 {        
    public object ExampleMethod() {
        return (HelperClass.function, HelperClass.function, HelperClass.function, HelperClass.function, HelperClass.function, HelperClass.function);
    }

    public object ExampleMethod2() {
        return (HelperClass.function, HelperClass.function, HelperClass.function, HelperClass.function, HelperClass.function, HelperClass.function);
    }
}

I think there is plenty of room here for compiler optimization. I guess the last snippet would require very advanced analysis to produce. The previous snippet, however, only requires checking for repeated compiler-generated members in a class.

Area-Compilers Code Gen Quality

All 14 comments

Is it possible to give a minimal example of the optimization you want to make, and to describe it in words?

Is it to use the same cached delegate when two lambdas are the same?

There are several optimizations that could be made. Some of them may conflict with each other, but I think that with care all of them can be applied.
I'll list them:

Function Deduplication In Same Class

Whenever a local or anonymous function is declared, the compiler should check whether an equivalent compiler-generated function already exists.
Basically, it should:

  • Find a local/anonymous function declaration in the input code.
  • Check whether this function has already been generated somewhere:

    • True: reuse the existing function.

    • False: create a compiler-generated function and use it.

To do that, the compiler could keep a list of the IL code of all compiler-generated functions. Each time the user declares a function, the compiler would look it up in that list.
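The lookup described above can be sketched as follows. This only illustrates the idea and is not how Roslyn is implemented: `GeneratedFunctionTable`, `GetOrAdd`, and the use of a plain string as a stand-in for the function's IL are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Roslyn): remembers which compiler-generated
// method was emitted first for a given signature + body.
public sealed class GeneratedFunctionTable
{
    // Key: (signature, body). The body string stands in for the real IL here.
    private readonly Dictionary<(string Signature, string Body), string> _emitted =
        new Dictionary<(string Signature, string Body), string>();

    // Returns the name of the method to reference. 'created' is true when the
    // function was not seen before and a new method would have to be emitted.
    public string GetOrAdd(string signature, string body, string proposedName, out bool created)
    {
        var key = (signature, body);
        if (_emitted.TryGetValue(key, out string existing))
        {
            created = false;   // duplicate: reuse the method emitted earlier
            return existing;
        }

        _emitted.Add(key, proposedName);
        created = true;        // first occurrence: emit under the proposed name
        return proposedName;
    }
}
```

Calling `GetOrAdd` twice with the same signature and body returns the first name both times, and `created` is `false` on the second call.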

Local Function Example


Input

public class ExampleClass {
    public bool ExampleMethod(char q) {
        bool ExampleLocal(char t)
            => char.IsLetter(t);
        return ExampleLocal(q);
    }

    public bool ExampleMethod2(string q) {
        bool ExampleLocalWithDifferentNameButSameCode(char t)
            => char.IsLetter(t);
        return ExampleLocalWithDifferentNameButSameCode(q[0]);
    }
}


Current Output

public class ExampleClass {
    public bool ExampleMethod(char q)
        => <ExampleMethod>g__ExampleLocal|0_0(q);
    public bool ExampleMethod2(string q)
        => <ExampleMethod2>g__ExampleLocalWithDifferentNameButSameCode|1_0(q[0]);

    [CompilerGenerated]
    private static bool <ExampleMethod>g__ExampleLocal|0_0(char t)
        => char.IsLetter(t);
    [CompilerGenerated]
    private static bool <ExampleMethod2>g__ExampleLocalWithDifferentNameButSameCode|1_0(char t)
        => char.IsLetter(t);
}


Expected Output

public class ExampleClass {
    public bool ExampleMethod(char q)
        => <ExampleMethod>g__ExampleLocal|0_0(q);
    public bool ExampleMethod2(string q)
        => <ExampleMethod>g__ExampleLocal|0_0(q[0]);

    [CompilerGenerated]
    private static bool <ExampleMethod>g__ExampleLocal|0_0(char t)
        => char.IsLetter(t);
}

The same should be applied to local functions which capture the scope of the method.

Anonymous Function Example

The same applies to anonymous functions:


Input

public class ExampleClass {
    public Func<char, bool> ExampleMethod(char q)
        => (e) => char.IsLetter(e);

    public Func<char, bool> ExampleMethod2(char q)
        => (e) => char.IsLetter(e);
}


Current Output

public class ExampleClass {
    [Serializable]
    [CompilerGenerated]
    private sealed class <>c
    {
        public static readonly <>c <>9 = new <>c();

        public static Func<char, bool> <>9__0_0;

        public static Func<char, bool> <>9__1_0;

        internal bool <ExampleMethod>b__0_0(char e)
            => char.IsLetter(e);

        internal bool <ExampleMethod2>b__1_0(char e)
            => char.IsLetter(e);
    }

    public Func<char, bool> ExampleMethod(char q)
        => <>c.<>9__0_0 ?? (<>c.<>9__0_0 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_0));

    public Func<char, bool> ExampleMethod2(char q)
        => <>c.<>9__1_0 ?? (<>c.<>9__1_0 = new Func<char, bool>(<>c.<>9.<ExampleMethod2>b__1_0));
}


Expected Output

public class ExampleClass {
    [Serializable]
    [CompilerGenerated]
    private sealed class <>c
    {
        public static readonly <>c <>9 = new <>c();

        public static Func<char, bool> <>9__0_0;

        internal bool <ExampleMethod>b__0_0(char e)
            => char.IsLetter(e);
    }

    public Func<char, bool> ExampleMethod(char q)
        => <>c.<>9__0_0 ?? (<>c.<>9__0_0 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_0));

    public Func<char, bool> ExampleMethod2(char q)
        => <>c.<>9__0_0 ?? (<>c.<>9__0_0 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_0));
}

The same should apply to anonymous functions that capture the scope of the method. The <>c__DisplayClass1_0 class generated for them could also be reused without problems.

Local and Anonymous Combined Deduplication

Sometimes a local function can do the same as an anonymous function. In those cases, the compiler should note that they are similar and reuse the same function.


Input

public class ExampleClass {
    public Func<char, bool> ExampleMethod() {
        return (e) => char.IsLetter(e);
    }
    public Func<char, bool> ExampleMethod2() {
        bool Local(char e) => char.IsLetter(e);
        return Local;
    }
}


Current Output

public class ExampleClass
{
    [Serializable]
    [CompilerGenerated]
    private sealed class <>c {
        public static readonly <>c <>9 = new <>c();
        public static Func<char, bool> <>9__0_0;

        internal bool <ExampleMethod>b__0_0(char e)
            => char.IsLetter(e);
        private bool <ExampleMethod2>g__Local|1_0(char e)
            => char.IsLetter(e);
    }

    public Func<char, bool> ExampleMethod()
        => <>c.<>9__0_0 ?? (<>c.<>9__0_0 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_0));
    public Func<char, bool> ExampleMethod2()
        => new Func<char, bool>(<>c.<>9.<ExampleMethod2>g__Local|1_0);
}

This could be improved to:


Expected Output

public class ExampleClass
{
    [Serializable]
    [CompilerGenerated]
    private sealed class <>c {
        public static readonly <>c <>9 = new <>c();     
        public static Func<char, bool> <>9__0_0;

        internal bool <ExampleMethod>b__0_0(char e)
            => char.IsLetter(e);
    }

    public Func<char, bool> ExampleMethod()
        => <>c.<>9__0_0 ?? (<>c.<>9__0_0 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_0));
    public Func<char, bool> ExampleMethod2()
        => new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_0);
}

Or, if we reuse the same delegate:


Improved Output

public class ExampleClass
{
    [Serializable]
    [CompilerGenerated]
    private sealed class <>c {
        public static readonly <>c <>9 = new <>c();
        public static Func<char, bool> <>9__0_0;

        internal bool <ExampleMethod>b__0_0(char e)
            => char.IsLetter(e);
    }

    public Func<char, bool> ExampleMethod()
        => <>c.<>9__0_0 ?? (<>c.<>9__0_0 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_0));
    public Func<char, bool> ExampleMethod2()
        => <>c.<>9__0_0 ?? (<>c.<>9__0_0 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_0));
}

However, this last idea may be harder to implement (just a personal guess; I actually have no idea how Roslyn works).
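One thing to keep in mind with the delegate-reusing variants is that the change is observable through delegate equality. As far as I know, the C# specification permits (but does not require) semantically identical anonymous functions to be converted to the same delegate instance, so code should not rely on either result. The sketch below uses made-up names (`DelegateIdentityDemo` and its methods) to show where the difference would surface.

```csharp
using System;

public static class DelegateIdentityDemo
{
    private static bool IsLetter(char c) => char.IsLetter(c);

    // Two delegates over the same static method compare equal by value,
    // regardless of whether the compiler caches the delegate instance.
    public static bool SameMethodDelegatesAreEqual()
    {
        Func<char, bool> a = IsLetter;
        Func<char, bool> b = IsLetter;
        return a.Equals(b);
    }

    // Two textually identical lambdas. With separate generated methods these
    // are typically not equal; with deduplication they could become equal.
    // The result is compiler-dependent, so nothing should rely on it.
    public static bool IdenticalLambdasAreEqual()
    {
        Func<char, bool> a = e => char.IsLetter(e);
        Func<char, bool> b = e => char.IsLetter(e);
        return a.Equals(b);
    }
}
```

Only the first method has a predictable result (`true`); the second is exactly the case a deduplicating compiler would change.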

Don't Create <>c Classes Where Not Necessary

The compiler doesn't check whether it's really necessary to create a class for local functions or anonymous functions.
For example, the following code produces a <>c class when it isn't necessary.

Local Function


Input

public class ExampleClass {
    public Func<char, bool> ExampleMethod() {
        bool Local(char e) => char.IsLetter(e);
        return Local;
    }
}


Current Output

public class ExampleClass {
    [Serializable]
    [CompilerGenerated]
    private sealed class <>c
    {
        public static readonly <>c <>9 = new <>c();

        private bool <ExampleMethod>g__Local|0_0(char e)
            => char.IsLetter(e);
    }

    public Func<char, bool> ExampleMethod()
        => new Func<char, bool>(<>c.<>9.<ExampleMethod>g__Local|0_0);
}

This could be simplified to:


Expected Output

public class ExampleClass {
    public Func<char, bool> ExampleMethod()
        => new Func<char, bool>(<ExampleMethod>g__Local|0_0);

    [CompilerGenerated]
    private bool <ExampleMethod>g__Local|0_0(char e)
        => char.IsLetter(e);
}

Or applying additional optimizations:


Improved

public class ExampleClass {
    public Func<char, bool> ExampleMethod()
        => <ExampleMethod>g__Local|0_0F ?? (<ExampleMethod>g__Local|0_0F = new Func<char, bool>(<ExampleMethod>g__Local|0_0));

    [CompilerGenerated]
    private Func<char, bool> <ExampleMethod>g__Local|0_0F;

    [CompilerGenerated]
    private bool <ExampleMethod>g__Local|0_0(char e)
        => char.IsLetter(e);
}

Anonymous Function


Input

public class ExampleClass {
    public Func<char, bool> ExampleMethod()
        => (e) => char.IsLetter(e);
}


Current Output

public class ExampleClass {
    [Serializable]
    [CompilerGenerated]
    private sealed class <>c
    {
        public static readonly <>c <>9 = new <>c();

        public static Func<char, bool> <>9__0_0;

        internal bool <ExampleMethod>b__0_0(char e)
            => char.IsLetter(e);
    }

    public Func<char, bool> ExampleMethod()
        => <>c.<>9__0_0 ?? (<>c.<>9__0_0 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_0));
}

The code could be simplified into:


Expected Output

public class ExampleClass {
    public Func<char, bool> ExampleMethod()
        => <ExampleMethod>b__0_0F ?? (<ExampleMethod>b__0_0F = new Func<char, bool>(<ExampleMethod>b__0_0));

    [CompilerGenerated]
    private Func<char, bool> <ExampleMethod>b__0_0F;

    [CompilerGenerated]
    private bool <ExampleMethod>b__0_0(char e)
        => char.IsLetter(e);
}

Function Deduplication In Other Classes

Function deduplication could be extended beyond a single class, to local/anonymous functions declared in other classes.
A new helper class would be generated to hold all these shared functions:

Local Functions


Input

public class ExampleClass {
    public bool ExampleMethod(char q) {
        bool ExampleLocal(char t)
            => char.IsLetter(t);
        return ExampleLocal(q);
    }
}

public class ExampleClass2 {
    public bool ExampleMethod2(string q) {
        bool ExampleLocal(char t)
            => char.IsLetter(t);
        return ExampleLocal(q[0]);
    }
}


Current Output

public class ExampleClass {
    public bool ExampleMethod(char q)
        => <ExampleMethod>g__ExampleLocal|0_0(q);

    [CompilerGenerated]
    private static bool <ExampleMethod>g__ExampleLocal|0_0(char t)
        => char.IsLetter(t);
}
public class ExampleClass2 {
    public bool ExampleMethod2(string q)
        => <ExampleMethod2>g__ExampleLocal|0_0(q[0]);

    [CompilerGenerated]
    private static bool <ExampleMethod2>g__ExampleLocal|0_0(char t)
        => char.IsLetter(t);
}


Expected Output

[CompilerGenerated]
internal static class <>c_All_Local_Functions_Go_Here {
    // The name comes from the first place the function is used.
    [CompilerGenerated]
    internal static bool <ExampleClass><ExampleMethod>g__ExampleLocal|0_0(char t)
        => char.IsLetter(t);
}

public class ExampleClass {
    public bool ExampleMethod(char q)
        => <>c_All_Local_Functions_Go_Here.<ExampleClass><ExampleMethod>g__ExampleLocal|0_0(q);
}

public class ExampleClass2 {
    public bool ExampleMethod2(string q)
        => <>c_All_Local_Functions_Go_Here.<ExampleClass><ExampleMethod>g__ExampleLocal|0_0(q[0]);
}

Anonymous Functions


Input

public class ExampleClass {
    public Func<char, bool> ExampleMethod()
        => (e) => char.IsLetter(e);
}

public class ExampleClass2 {    
    public Func<char, bool> ExampleMethod()
        => (e) => char.IsLetter(e);
}


Current Output

public class ExampleClass {
    [Serializable]
    [CompilerGenerated]
    private sealed class <>c {
        public static readonly <>c <>9 = new <>c();

        public static Func<char, bool> <>9__0_0;

        internal bool <ExampleMethod>b__0_0(char e)
            => char.IsLetter(e);
    }

    public Func<char, bool> ExampleMethod()
        => <>c.<>9__0_0 ?? (<>c.<>9__0_0 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_0));
}
public class ExampleClass2 {
    [Serializable]
    [CompilerGenerated]
    private sealed class <>c
    {
        public static readonly <>c <>9 = new <>c();

        public static Func<char, bool> <>9__0_0;

        internal bool <ExampleMethod>b__0_0(char e)
            => char.IsLetter(e);
    }

    public Func<char, bool> ExampleMethod()
        => <>c.<>9__0_0 ?? (<>c.<>9__0_0 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_0));
}


Expected Output

[CompilerGenerated]
internal sealed class <>c_All_Annonymous_Functions_Go_Here {
    public static readonly <>c_All_Annonymous_Functions_Go_Here <>9 = new <>c_All_Annonymous_Functions_Go_Here();

    public static Func<char, bool> <>9__0_0;

    [CompilerGenerated]
    internal bool <ExampleClass><ExampleMethod>b__0_0(char t)
        => char.IsLetter(t);
}

public class ExampleClass {
    public Func<char, bool> ExampleMethod()
        => <>c_All_Annonymous_Functions_Go_Here.<>9__0_0 ?? (<>c_All_Annonymous_Functions_Go_Here.<>9__0_0 = new Func<char, bool>(<>c_All_Annonymous_Functions_Go_Here.<>9.<ExampleClass><ExampleMethod>b__0_0));
}
public class ExampleClass2 {
    public Func<char, bool> ExampleMethod()
        => <>c_All_Annonymous_Functions_Go_Here.<>9__0_0 ?? (<>c_All_Annonymous_Functions_Go_Here.<>9__0_0 = new Func<char, bool>(<>c_All_Annonymous_Functions_Go_Here.<>9.<ExampleClass><ExampleMethod>b__0_0));
}

Use Static Classes for Anonymous Functions

At the moment, the compiler emits anonymous functions as instance methods of a non-static class.
Whenever an anonymous function is about to be used, the generated code checks whether the cached delegate exists and creates a new one if it doesn't.
In some situations, it's possible to use static classes and static methods. This would also improve the performance, since there is no need to pass the implicit this pointer (of instance methods) to those methods.


Input

public class ExampleClass {
    public Func<char, bool> ExampleMethod(char q) {
        return (e) => char.IsLetter(e);
    }
}


Current Output

public class ExampleClass
{
    [Serializable]
    [CompilerGenerated]
    private sealed class <>c
    {
        public static readonly <>c <>9 = new <>c(); // Completely unnecessary with static methods

        public static Func<char, bool> <>9__0_0;

        internal bool <ExampleMethod>b__0_0(char e) // This is implicitly (this <>c instance, char e)
            => char.IsLetter(e);
    }

    public Func<char, bool> ExampleMethod(char q)
        => <>c.<>9__0_0 ?? (<>c.<>9__0_0 = new Func<char, bool>(<>c.<>9.<ExampleMethod>b__0_0));
}


Expected Output

public class ExampleClass
{
    [Serializable]
    [CompilerGenerated]
    private static class <>c
    {
        public static Func<char, bool> <>9__0_0;

        internal static bool <ExampleMethod>b__0_0(char e)
            => char.IsLetter(e);
    }

    public Func<char, bool> ExampleMethod(char q)
        => <>c.<>9__0_0 ?? (<>c.<>9__0_0 = new Func<char, bool>(<>c.<ExampleMethod>b__0_0));
}

Furthermore, the null check isn't needed.
Instead, we can use a static field initializer (which runs in the static constructor):


Expected Output

public class ExampleClass
{
    [Serializable]
    [CompilerGenerated]
    private static class <>c
    {
        public static Func<char, bool> <>9__0_0 = <ExampleMethod>b__0_0;

        private static bool <ExampleMethod>b__0_0(char e)
            => char.IsLetter(e);
    }

    public Func<char, bool> ExampleMethod(char q)
        => <>c.<>9__0_0;
}
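A design point worth noting about the static-initializer variant: all static field initializers of a class run together in its type initializer, so every cached delegate in that class would be allocated on first use of any one of them, whereas the `??` pattern allocates each delegate lazily. The sketch below illustrates this with hypothetical names (`InitLog`, `EagerCache`); it is hand-written code, not compiler output.

```csharp
using System;

// The counter lives in a separate class so that reading it does not itself
// trigger EagerCache's type initializer.
public static class InitLog
{
    public static int Count;
}

public static class EagerCache
{
    // Both initializers run in the same type initializer, in textual order.
    public static readonly Func<char, bool> First = Create(c => char.IsLetter(c));
    public static readonly Func<char, bool> Second = Create(c => char.IsDigit(c));

    // Records that a cached delegate was created, then returns it unchanged.
    private static Func<char, bool> Create(Func<char, bool> f)
    {
        InitLog.Count++;
        return f;
    }
}
```

Reading only `EagerCache.First` leaves `InitLog.Count` at 2, because `Second` was initialized as well. For a class with many lambdas, that trades a null check per use for up-front allocations of delegates that may never be needed.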

Dropping the instance and the null check would improve the performance of anonymous functions:


Test Code

using System;
using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;

public class AnnonymousFunctionSpeed {
        private const int length = 100_000_000;
        private Func<bool>[] array = new Func<bool>[length];

        [Serializable]
        [CompilerGenerated]
        private sealed class c_CurrentImplementation {
            public static readonly c_CurrentImplementation instance = new c_CurrentImplementation();
            public static Func<bool> __0_0;
            internal bool CurrentImplementation__0_0() => true;
        }

        [Benchmark]
        public object CurrentImplementation() {
            for (int i = 0; i < array.Length; i++)
                array[i] = (c_CurrentImplementation.__0_0 ?? (c_CurrentImplementation.__0_0 = new Func<bool>(c_CurrentImplementation.instance.CurrentImplementation__0_0)));
            return array;
        }

        [Serializable]
        [CompilerGenerated]
        private static class c_NonInstanceImplementation {
            public static Func<bool> __0_0;
            internal static bool NonInstanceImplementation__0_0() => true;
        }

        [Benchmark]
        public object NonInstanceImplementation() {
            for (int i = 0; i < array.Length; i++)
                array[i] = (c_NonInstanceImplementation.__0_0 ?? (c_NonInstanceImplementation.__0_0 = new Func<bool>(c_NonInstanceImplementation.NonInstanceImplementation__0_0)));
            return array;
        }


        [Serializable]
        [CompilerGenerated]
        private static class c_StaticConstructorImplementation {
            public static Func<bool> __0_0 = StaticConstructorImplementation__0_0;
            private static bool StaticConstructorImplementation__0_0() => true;
        }

        [Benchmark]
        public object StaticConstructorImplementation() {
            for (int i = 0; i < array.Length; i++)
                array[i] = c_StaticConstructorImplementation.__0_0;
            return array;
        }
    }


Benchmark

|                          Method |     Mean |   Error |  StdDev |
|-------------------------------- |---------:|--------:|--------:|
|           CurrentImplementation | 422.1 ms | 8.15 ms | 9.39 ms |
|       NonInstanceImplementation | 393.1 ms | 4.71 ms | 4.17 ms |
| StaticConstructorImplementation | 347.4 ms | 4.72 ms | 4.18 ms |

This feels better as a generalized IL post-pass, i.e. I don't see why I would only want this for C#. I'd want it for VB, F# and any other .NET language.

In some situations, it's possible to use static classes and static methods. This would also improve the performance, since there is no need to pass the implicit this pointer (of instance methods) to those methods.

I seem to remember that it's actually the opposite. For some reason delegates of static methods are slower than delegates of instance methods.
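The instance-versus-static distinction mentioned here can be observed directly through the Delegate.Target property. The following is a minimal sketch that is not part of the original thread; the class and method names are made up for illustration, and the lambda result relies on how current Roslyn versions lower non-capturing lambdas (as instance methods on a singleton closure class), which is an implementation detail rather than a language guarantee.

```csharp
using System;

public static class DelegateTargetDemo
{
    private static bool StaticMethod() => true;

    // Delegate built from a static method group: no receiver object is bound,
    // so Delegate.Target is null.
    public static bool StaticMethodGroupHasTarget()
    {
        Func<bool> f = StaticMethod;
        return f.Target != null;
    }

    // Delegate built from a non-capturing lambda. Assumption: Roslyn emits the
    // lambda body as an instance method on a compiler-generated singleton class,
    // so the delegate is bound to that singleton instance.
    public static bool NonCapturingLambdaHasTarget()
    {
        Func<bool> f = () => true;
        return f.Target != null;
    }
}
```

Called from any entry point, StaticMethodGroupHasTarget() should return false and NonCapturingLambdaHasTarget() should return true when compiled with Roslyn. This only shows that the two delegate shapes differ; it says nothing about how much faster either one is to invoke, which still needs a benchmark.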

Could you do some benchmarking to show this is actually slower and if so by how much?

As far as I can see all of these optimisations only apply to lambdas which don't capture anything.

The compiler already caches such lambdas so they only allocate once per program lifetime.
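The caching behaviour described here, and the lack of deduplication that the issue is about, can both be checked with reference comparisons. This is a small illustrative sketch, not code from the thread; all names are hypothetical, and the caching of non-capturing lambdas is an observed property of the Roslyn compiler rather than something the language specification promises.

```csharp
using System;

public static class LambdaCachingDemo
{
    // Non-capturing lambda: the compiler can build the delegate once and reuse it.
    public static Func<char, bool> GetIsLetter() => e => char.IsLetter(e);

    // Textually identical lambda, but compiled into its own method and cache field.
    public static Func<char, bool> GetIsLetterAgain() => e => char.IsLetter(e);

    // Capturing lambda: each call needs a fresh closure object and delegate.
    public static Func<bool> GetIsLetterFor(char q) => () => char.IsLetter(q);

    // True when repeated evaluation of the same lambda yields the same delegate object.
    public static bool SameLambdaIsCached()
        => ReferenceEquals(GetIsLetter(), GetIsLetter());

    // True only if the two identical lambdas were merged into one method.
    public static bool DuplicateLambdasAreMerged()
        => GetIsLetter().Equals(GetIsLetterAgain());

    // True only if the capturing lambda were cached across calls.
    public static bool CapturingLambdaIsCached()
        => ReferenceEquals(GetIsLetterFor('a'), GetIsLetterFor('a'));
}
```

With current Roslyn, SameLambdaIsCached() should return true, while DuplicateLambdasAreMerged() and CapturingLambdaIsCached() should both return false: each distinct lambda expression gets one cached delegate, duplicates are not merged, and capturing lambdas allocate per call.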

I think some numbers showing that this would actually reduce allocations/improve performance by enough to be worthwhile would be useful.

For example, for deduplicating lambdas, it would be useful to know in a large codebase (e.g. dotnet/runtime, Roslyn) what percentage of lambdas are actually duplicates.

This feels better as a generalized IL post-pass, i.e. I don't see why I would only want this for C#. I'd want it for VB, F# and any other .NET language.

I only use C# on .NET, which is why I raised the issue for C#, but I guess it could apply to all languages compiled by Roslyn.

In some situations, it's possible to use static classes and static methods. This would also improve the performance since there is no need to pass the implicit this pointer (of instance methods) to those methods.

I seem to remember that it's actually the opposite. For some reason, delegates of static methods are slower than delegates of instance methods.

Could you do some benchmarking to show this is actually slower and if so by how much?

Just tested; you are right. I didn't know that delegates are optimized for instance methods. However, there is still room for improvement:


Benchmark

public class AnnonymousFunctionSpeed {
    private const int length = 100_000_000;
    private Func<bool>[] array = new Func<bool>[length];

    [Serializable]
    [CompilerGenerated]
    private sealed class c_CurrentImplementation {
        public static readonly c_CurrentImplementation instance = new c_CurrentImplementation();
        public static Func<bool> __0_0;
        internal bool CurrentImplementation__0_0() => true;
    }

    [Benchmark]
    public object CurrentImplementation() {
        for (int i = 0; i < array.Length; i++)
            array[i] = (c_CurrentImplementation.__0_0 ?? (c_CurrentImplementation.__0_0 = new Func<bool>(c_CurrentImplementation.instance.CurrentImplementation__0_0)));
        return array;
    }

    [Serializable]
    [CompilerGenerated]
    private sealed class c_StaticImplementation {
        public static readonly c_StaticImplementation instance = new c_StaticImplementation();
        public static Func<bool> __0_0 = instance.CurrentImplementation__0_0;
        internal bool CurrentImplementation__0_0() => true;
    }

    [Benchmark]
    public object StaticImplementation() {
        for (int i = 0; i < array.Length; i++)
            array[i] = c_StaticImplementation.__0_0;
        return array;
    }
}


Result

|                          Method |     Mean |   Error |  StdDev |
|-------------------------------- |---------:|--------:|--------:|
|           CurrentImplementation | 431.3 ms | 6.85 ms | 6.41 ms |
|            StaticImplementation | 353.2 ms | 6.92 ms | 8.75 ms |

As far as I can see all of these optimisations only apply to lambdas which don't capture anything.

The compiler already caches such lambdas so they only allocate once per program lifetime.

I think some numbers showing that this would actually reduce allocations/improve performance by enough to be worthwhile would be useful.

For example, for deduplicating lambdas, it would be useful to know in a large codebase (e.g. dotnet/runtime, Roslyn) what percentage of lambdas are actually duplicates.

It's true that they allocate once per program lifetime, but why allocate the exact same lambda several times if the compiler could infer that they are the same?
Also, this would make .dll files smaller, since duplicated lambdas could be removed, and several lambdas could be stored in the same class instead of one class per user-defined class, which would reduce the file size further. This would indirectly reduce loading time (as I understand it, each extra type slows down CLR load time a bit) and may ultimately reduce the native code size. Finally (though less important), it would make the compiled code cleaner.

This may be especially useful for autogenerated code with tools like T4 templates.

These optimizations are not limited to lambdas which don't capture anything.
Look at this:

Local Function


Input

public class ExampleClass {
    public bool ExampleMethod(char q) {
        bool ExampleLocal()
            => char.IsLetter(q);
        return ExampleLocal();
    }

    public bool ExampleMethod2(char q) {
        bool ExampleLocal()
            => char.IsLetter(q);
        return ExampleLocal();
    }

    public bool ExampleMethod3(char q) {
        bool ExampleLocal()
            => char.IsLower(q);
        return ExampleLocal();
    }
}


Current Output

public class ExampleClass {
    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct <>c__DisplayClass0_0 {
        public char q;
    }

    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct <>c__DisplayClass1_0 {
        public char q;
    }

    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct <>c__DisplayClass2_0 {
        public char q;
    }

    public bool ExampleMethod(char q) {
        <>c__DisplayClass0_0 <>c__DisplayClass0_ = default(<>c__DisplayClass0_0);
        <>c__DisplayClass0_.q = q;
        return <ExampleMethod>g__ExampleLocal|0_0(ref <>c__DisplayClass0_);
    }

    public bool ExampleMethod2(char q) {
        <>c__DisplayClass1_0 <>c__DisplayClass1_ = default(<>c__DisplayClass1_0);
        <>c__DisplayClass1_.q = q;
        return <ExampleMethod2>g__ExampleLocal|1_0(ref <>c__DisplayClass1_);
    }

    public bool ExampleMethod3(char q) {
        <>c__DisplayClass2_0 <>c__DisplayClass2_ = default(<>c__DisplayClass2_0);
        <>c__DisplayClass2_.q = q;
        return <ExampleMethod3>g__ExampleLocal|2_0(ref <>c__DisplayClass2_);
    }

    [CompilerGenerated]
    private static bool <ExampleMethod>g__ExampleLocal|0_0(ref <>c__DisplayClass0_0 P_0)
        => char.IsLetter(P_0.q);

    [CompilerGenerated]
    private static bool <ExampleMethod2>g__ExampleLocal|1_0(ref <>c__DisplayClass1_0 P_0)
        => char.IsLetter(P_0.q);

    [CompilerGenerated]
    private static bool <ExampleMethod3>g__ExampleLocal|2_0(ref <>c__DisplayClass2_0 P_0)
        => char.IsLower(P_0.q);
}


Expected Output

public class ExampleClass {
    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct <>c__DisplayClass0_0 {
        public char q;
    }

    public bool ExampleMethod(char q) {
        <>c__DisplayClass0_0 <>c__DisplayClass0_ = default(<>c__DisplayClass0_0);
        <>c__DisplayClass0_.q = q;
        return <ExampleMethod>g__ExampleLocal|0_0(ref <>c__DisplayClass0_);
    }

    public bool ExampleMethod2(char q) {
        <>c__DisplayClass0_0 <>c__DisplayClass0_ = default(<>c__DisplayClass0_0);
        <>c__DisplayClass0_.q = q;
        return <ExampleMethod>g__ExampleLocal|0_0(ref <>c__DisplayClass0_);
    }

    public bool ExampleMethod3(char q) {
        <>c__DisplayClass2_0 <>c__DisplayClass2_ = default(<>c__DisplayClass2_0);
        <>c__DisplayClass2_.q = q;
        return <ExampleMethod3>g__ExampleLocal|2_0(ref <>c__DisplayClass2_);
    }

    [CompilerGenerated]
    private static bool <ExampleMethod>g__ExampleLocal|0_0(ref <>c__DisplayClass0_0 P_0)
        => char.IsLetter(P_0.q);

    [CompilerGenerated]
    private static bool <ExampleMethod3>g__ExampleLocal|2_0(ref <>c__DisplayClass2_0 P_0)
        => char.IsLower(P_0.q);
}

Anonymous Functions


Input

public class ExampleClass {
    public Func<bool> ExampleMethod1(char q)
        => () => char.IsLetter(q);


    public Func<bool> ExampleMethod2(char q)
        => () => char.IsLower(q);
}


Current Output

public class ExampleClass {
    [CompilerGenerated]
    private sealed class <>c__DisplayClass0_0 {
        public char q;

        internal bool <ExampleMethod1>b__0()
            => char.IsLetter(q);
    }

    [CompilerGenerated]
    private sealed class <>c__DisplayClass1_0 {
        public char q;

        internal bool <ExampleMethod2>b__0() 
            => char.IsLower(q);
    }

    public Func<bool> ExampleMethod1(char q) {
        <>c__DisplayClass0_0 <>c__DisplayClass0_ = new <>c__DisplayClass0_0();
        <>c__DisplayClass0_.q = q;
        return new Func<bool>(<>c__DisplayClass0_.<ExampleMethod1>b__0);
    }

    public Func<bool> ExampleMethod2(char q) {
        <>c__DisplayClass1_0 <>c__DisplayClass1_ = new <>c__DisplayClass1_0();
        <>c__DisplayClass1_.q = q;
        return new Func<bool>(<>c__DisplayClass1_.<ExampleMethod2>b__0);
    }
}


Expected Output

public class ExampleClass {
    [CompilerGenerated]
    private sealed class <>c__DisplayClass0_0 {
        public char q;

        internal bool <ExampleMethod1>b__0()
            => char.IsLetter(q);

        internal bool <ExampleMethod2>b__0() 
            => char.IsLower(q);
    }

    public Func<bool> ExampleMethod1(char q) {
        <>c__DisplayClass0_0 <>c__DisplayClass0_ = new <>c__DisplayClass0_0();
        <>c__DisplayClass0_.q = q;
        return new Func<bool>(<>c__DisplayClass0_.<ExampleMethod1>b__0);
    }

    public Func<bool> ExampleMethod2(char q) {
        <>c__DisplayClass0_0 <>c__DisplayClass0_ = new <>c__DisplayClass0_0();
        <>c__DisplayClass0_.q = q;
        return new Func<bool>(<>c__DisplayClass0_.<ExampleMethod2>b__0);
    }
}

As for numbers, I have no idea how to count the duplicated lambdas in the current runtime and Roslyn (other than reading line by line), nor how to measure the performance gain without rewriting that code by hand with those optimizations. I haven't counted them, but I bet both repositories have hundreds of thousands of lines...

It's true that they allocate once per program lifetime, but why allocate the exact same lambda several times if the compiler could infer that they are the same?

Likely because that's a lot of effort the team has to take on to support this, and there's not a lot of evidence that there would be significant gain, i.e. how often does this occur in a codebase? If you were to check Roslyn itself, how many lambdas would this save us (as a percentage of total)? If this is going to only optimize 3 cases out of 15,000... then that's a lot of invested engineering effort for very little payoff :)

As for numbers, I have no idea how to count the duplicated lambdas in the current runtime and Roslyn (other than reading line by line), nor how to measure the performance gain without rewriting that code by hand with those optimizations.

As an approximation, you could use Roslyn to find all lambdas in a codebase, remove or normalise all whitespace, convert each one to a string, and count/list the number of duplicates as a percentage of all lambdas. This is imperfect, since e.g. y => y != null is the same as x => x != null, and two lambdas that are both x => x + y can be different lambdas if y is a different variable. However, I think it's a good enough heuristic to give some idea of what benefit this might provide.
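The heuristic described above could be sketched roughly as follows. This is an illustrative outline rather than the code anyone in the thread actually ran: it assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced, the class name DuplicateLambdaCounter and its Count method are hypothetical, and matching is purely textual, so it has exactly the false positives and false negatives mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class DuplicateLambdaCounter
{
    // Returns the total number of anonymous functions found and how many of
    // them are textual repeats of one seen earlier.
    public static (int Total, int Duplicates) Count(IEnumerable<string> sources)
    {
        var occurrences = new Dictionary<string, int>();
        foreach (var source in sources)
        {
            var root = CSharpSyntaxTree.ParseText(source).GetRoot();

            // AnonymousFunctionExpressionSyntax is the common base of lambda
            // expressions and `delegate { ... }` anonymous methods.
            foreach (var node in root.DescendantNodes().OfType<AnonymousFunctionExpressionSyntax>())
            {
                // Whitespace-insensitive key. Comments inside the lambda are
                // kept, and parameter names are not normalised.
                var key = Regex.Replace(node.ToString(), @"\s+", "");
                occurrences[key] = occurrences.TryGetValue(key, out var n) ? n + 1 : 1;
            }
        }

        int total = occurrences.Values.Sum();
        int duplicates = total - occurrences.Count; // every repeat beyond the first
        return (total, duplicates);
    }
}
```

To scan a repository, one could pass in the contents of every .cs file, for example via Directory.EnumerateFiles with SearchOption.AllDirectories and File.ReadAllText. Local functions could be counted the same way by filtering on LocalFunctionStatementSyntax instead.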

This feels better as a generalized IL post-pass, i.e. I don't see why I would only want this for C#. I'd want it for VB, F# and any other .NET language.

Reminds me of this old proposal for something to perform an IL -> IL optimization step: #15929.

@YairHalberstadt you are right. The idea is worthless.
I analysed both Roslyn and the Runtime source code.
Across the 35746 C# files, there are:

  • 2301 local functions: 125 exact matches.
  • 1402 anonymous methods: 435 exact matches.

To determine matches, I stripped all trivia from them and converted them to lower case.
The most common local function is used only 9 times, and it is not very large either:

  • IEnumerable<Document>GetCurrentDocuments() => environment.Workspace.CurrentSolution.Projects.Single().Documents;

Anonymous methods are more common, but the duplicated ones are usually extremely small, such as:

  • delegate{} x 96
  • delegate{return true;} x 35
  • delegate(){return null;} x 34
  • delegate(int i){return i+1;} x 18
  • delegate{ran=false;} x 16
  • delegate{return ran;} x 10

Sorry. I should have done some research before raising an issue.

Nothing to be sorry about. Enthusiasm and interest are always great things to have! Thanks for doing the extra legwork and research here :)

There's one thing to consider here. Roslyn and Runtime avoid using LINQ everywhere, which I would imagine would be a sizeable source of these delegates.

Note: we don't avoid linq in Roslyn. We just avoid it in hotspots :)
