I have just encountered another performance hit issue that is much like #330 but is easier to reproduce. Essentially, I am finding that if I comment out a class in my solution, I experience a difference of 4-6us in my performance results. Extremely bizarre so I thought I would report it to see if there is an issue, maybe known.
You can find the branch with this repo here:
https://github.com/Mike-EEE/ExtendedXmlSerializer/tree/issues/Benchmark.NET/363
Please run this test:
https://github.com/Mike-EEE/ExtendedXmlSerializer/blob/issues/Benchmark.NET/363/test/ExtendedXmlSerializer.Performance.Tests/Benchmarks.cs#L70-L71
And then run it again with this file commented out, and then again with it inline and compiled into the assembly:
https://github.com/Mike-EEE/ExtendedXmlSerializer/blob/issues/Benchmark.NET/363/src/ExtendedXmlSerializer/ContentModel/Collections/MemberedCollectionItemReader.cs
This is what I am experiencing on my machine with the file commented out:
| Method | Mean | StdDev |
|---|---|---|
| DeserializationClassWithPrimitive | 44.2065 us | 0.0742 us |
And with it inline and compiled into the assembly:
| Method | Mean | StdDev |
|---|---|---|
| DeserializationClassWithPrimitive | 50.1144 us | 0.0700 us |
Please let me know if you have any questions around this and I will get them answered for you.
OK I got the links in there for you. Please let me know if you have any problems accessing and/or building the project and I will do my best to assist.
ugh. I am truly embarrassed. I will file this one under "do not file bug last thing after a long night of waayyyy too much coffee and code." First thought upon waking was ... "was there anything type-related that I had added recently?" And sure enough (to my horror) there is a type routine I have for filtering application-specific types. I just now commented that out and the 5us returns. I have no idea why at this point (as loading one more type shouldn't be such a massive hit?), but it does not look like this has anything to do with BDN. I will close this for now unless I find there is something directly related to BDN (and am deathly sure of it). Again, apologies.
Hello @Mike-EEE! I encourage you to use some profiler when you want to find out which parts of the benchmarked method are taking time. Benchmarking gives you only the sum, profiler can give you detailed view. You can start with VS Instrumentation Profiler which is built-in VS for Professional and Ultimate.
Ahhh thank you @adamsitnik for your assistance. If I could delete this issue I fully would. I cannot tell you how embarrassed I am/was to figure out that it had nothing to do with BDN. If I didn't have my issue with #330 I wouldn't have thought of it, but commenting out a class and seeing a 5us difference triggered it for me. In any case this will not be happening again. Promise. 😛

OK I am actually going to resurrect this issue as a question/discussion, as I have spent the day looking into this and have found the underlying issue to the best of my understanding. After spending a _bunch_ of time trying to figure out what is going on, I have managed to isolate the issue so that the hit is less disruptive, from 5us to ~2us.
It also turns out that this issue has nothing to do with typing/types but with static member allocations, which I will attempt to describe below.
When using classes that have dependencies, I design them so that there's usually a default dependency for them (ala pure DI). In my case, this default dependency is usually a static singleton. I know things get a little weird when you start introducing static singletons into a design, _so please know and note that my constructors are strictly assignments only_. That is, you will never see any operations within my constructors except the assignment of readonly fields of the class.
As an example of a dependency, consider a class like this, where the Provider.Default is used as a dependency:
class SampleClass
{
    public SampleClass(string name) : this(Provider.Default, name) {...}
    public SampleClass(IProvider provider, string name) {...}
}
What I have been incorporating lately is refactoring these default implementations/dependencies so that they are a static field in the dependent class, such as this:
class SampleClass
{
    static readonly IProvider ProviderImplementation = Provider.Default;
    public SampleClass(string name) : this(ProviderImplementation, name) {...}
    public SampleClass(IProvider provider, string name) {...}
}
[[Actual Sample Here](https://github.com/Mike-EEE/ExtendedXmlSerializer/blob/issues/Benchmark.NET/363/src/ExtendedXmlSerializer/ContentModel/Properties/TypeParser.cs#L35-L47)]
_I might and very well may be wrong here_, but my idea/thinking here is that since the dependency is assigned as a static field in the class, it doesn't need to go through the work to resolve the reference at the original site via the property, so it should theoretically be faster, especially in the case of allocating a bunch of instances. That is, it is one operation of looking up a field in the current class vs. two operations of looking up a class and then the property in _that_ class, if that makes sense.
(Again, not sure if that is correct, but that was/is my thinking.)
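One way to check this hunch rather than guess is to benchmark the two shapes side by side. The following is only a minimal sketch: `IProvider` and `Provider` are hypothetical stand-ins for the real types, the class names `ViaProperty` and `ViaField` are made up for illustration, and it assumes a BenchmarkDotNet version that exposes `[Benchmark]` and `BenchmarkRunner.Run<T>()`.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Hypothetical stand-ins for the IProvider/Provider types described above.
public interface IProvider { }

public sealed class Provider : IProvider
{
    public static Provider Default { get; } = new Provider();
}

// Variant 1: the default dependency is resolved through the static property.
public sealed class ViaProperty
{
    readonly IProvider _provider;
    readonly string _name;

    public ViaProperty(string name) : this(Provider.Default, name) { }

    public ViaProperty(IProvider provider, string name)
    {
        _provider = provider;
        _name = name;
    }
}

// Variant 2: the default dependency is cached in a static field of the dependent class.
public sealed class ViaField
{
    static readonly IProvider ProviderImplementation = Provider.Default;

    readonly IProvider _provider;
    readonly string _name;

    public ViaField(string name) : this(ProviderImplementation, name) { }

    public ViaField(IProvider provider, string name)
    {
        _provider = provider;
        _name = name;
    }
}

public class DefaultDependencyBenchmarks
{
    // Construct one instance per invocation and return it so the work is not optimized away.
    [Benchmark(Baseline = true)]
    public object Property() => new ViaProperty("name");

    [Benchmark]
    public object Field() => new ViaField("name");
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<DefaultDependencyBenchmarks>();
}
```

I would not be surprised if the two come out the same, since a trivial static property getter is typically a candidate for inlining by the JIT, but that is an assumption to be measured and not a result.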
Well, it turns out this design is what is causing my bizarre performance behavior here. After spending some (ok a LOT of) time today, it seems that BDN emits faster times when I pull all of the references that are used in such a way into a single collection, and then in turn use each instance in some fashion when the benchmark class is initializing, such as calling ToString on each instance as seen here.
Doing this currently shaves 1-2us off each benchmark that is run. When I say 1-2us, I am saying that the lowest number that I get out of the results after many, many tries with the hack in place is about 1-2us lower than the lowest number that I see with the hack commented out.
As a specific example, in the case of DeserializationClassWithPrimitive, I see the lowest number of around 44.8us when running the benchmark about 4 or 5 times (maybe more) with the hack above in place. Commenting out the hack, I will see the typical number ranging from ~46.5-48us (47.1us seems to be the most popular).
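For anyone who does not want to dig through the branch, the hack has roughly the following shape. This is a simplified sketch and not the code from the repository: `Provider`, `Parser`, `WarmUp`, and `Touch` are hypothetical names used purely for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical singletons standing in for the real default dependencies.
public sealed class Provider
{
    public static Provider Default { get; } = new Provider();
}

public sealed class Parser
{
    public static Parser Default { get; } = new Parser();
}

public static class WarmUp
{
    // Every static default that classes fall back to is gathered in one place.
    static readonly IReadOnlyList<object> Defaults = new object[]
    {
        Provider.Default,
        Parser.Default
    };

    // Use each instance in some trivial way so that it is created and referenced
    // before any measurement starts. Returns how many instances were touched.
    public static int Touch()
    {
        var touched = 0;
        foreach (var instance in Defaults)
        {
            if (instance.ToString() != null)
            {
                touched++;
            }
        }
        return touched;
    }
}

public class Benchmarks
{
    public Benchmarks()
    {
        // Runs while the benchmark class is being initialized, i.e. outside the measured method.
        WarmUp.Touch();
    }
}

public static class Program
{
    public static void Main()
    {
        var benchmarks = new Benchmarks();
        Console.WriteLine(WarmUp.Touch());
    }
}
```

The only point of the sketch is the ordering: the statics are forced into existence in the constructor, so in theory none of that cost should land inside the benchmarked call.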
FWIW, I did try undoing all of these static references and moving them inline into the constructor, but it didn't make anything faster, and even the hack as outlined above stopped working, so I figured there is something else at play here and reverted to the state that you see above.
So, I am way outside of my element here and thought of checking in and see if I can get my thoughts around this. Is there something incredibly/painfully obvious about this that would cause the discrepancy? It would seem as if the static instances are not actually resolved until runtime, which doesn't make sense as I ensure that they are all ultimately referenced by making warm-up calls in the constructor.
One final thought: dotTrace provides a way of comparing profiling sessions, but it doesn't seem to currently work well with xUnit in .NET Core (it compares by threads, and each xUnit session uses different threads, doh). I have an outstanding ticket with JetBrains regarding this (their testing suite has been admirably riding .NET Core's chaos, so it very well could be a limitation of tooling ATM). Before I spend some time figuring out how to run a process-based session with dotTrace and do some comparisons that way, I thought I would reach out here instead.
Lots going on here. Hopefully enough made sense here so that someone might be able to assist. :)
AHHHH... OK, please disregard this issue (AGAIN!), I have finally figured out the source of all my ails.
When doing configuration, I was doing it like this:
Job.Default.With(new GcMode {Force = false});
This of course is _wrong_. It should be in a configuration as such:
class Configuration : ManualConfig
{
    public Configuration()
    {
        Add(Job.Default.With(new GcMode {Force = false}));
    }
}
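As far as I can tell, the standalone statement never took effect because `With` hands back a new job instead of changing `Job.Default`, and I was throwing that result away. For completeness, here is a minimal sketch of how the configuration class gets attached to a benchmark class. The `Benchmarks` class and its `Sample` method are placeholders, and the `Add`/`With` calls are the ones from the BDN version I am using, so they may be named differently in other versions.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;

public class Configuration : ManualConfig
{
    public Configuration()
    {
        // The job returned by With(...) is registered with the config here,
        // which is the step the standalone statement was missing.
        Add(Job.Default.With(new GcMode { Force = false }));
    }
}

// Placeholder benchmark class; the attribute tells BDN to use the config above.
[Config(typeof(Configuration))]
public class Benchmarks
{
    [Benchmark]
    public string Sample() => 42.ToString();
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<Benchmarks>();
}
```

Passing an instance of the config to `BenchmarkRunner.Run` should work as well, if an attribute on the class is not wanted.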
Doing this solved all my problems, including my confusion with #353. OMG so relieved now. 😌
As an FYI/FWIW here, in addition to the improper configuration above, it looks like I was encountering a tooling problem with the new project.json project bits. Essentially, new changes were not being compiled even though the output said they were. So, I was actually running old code and the results were reflecting _that_ code and not the expected new code, if that makes sense. It took me having to place a throw new InvalidOperationException() in the code to discover this latent issue.
Specifically calling rebuild on the performance project does work, so that is what I will be doing from now on. So, lots and lots of variables at play here. What can I say, creation is a messy process.