Roslyn: Increase maximum size of string literals from 10 million to 15 million

Created on 8 Jan 2020  ·  19 Comments  ·  Source: dotnet/roslyn

The metadata writer (MetadataBuilder in System.Reflection.Metadata) that we use in the Roslyn compilers limits the total size of string literals to a smaller value than necessary (about 10 million characters instead of 15 million characters). This limit is smaller than what older compilers supported, so some users are unnecessarily prevented from migrating to newer tools. We should increase the limit in the hope of removing this obstacle.

The test EmitErrorTests.ToManyUserStrings demonstrates our limit.

/cc @tmat

Area-Compilers Bug Tenet-Compatibility

All 19 comments

Regarding https://github.com/dotnet/roslyn/issues/40371#issuecomment-565632616

@rogeriorlima3 Is it possible for you to provide us with your app (the final C# code to be compiled is all we need) so we would be able to confirm this change helps you?

IIRC the limit is correct, based on the spec. The native compiler didn't check correctly for the limit and emitted metadata that was invalid according to the ECMA spec.

Is the 2^24 referred to in #9852 correct? The spec appears to allow up to 536870911 (2^29 - 1).

In the .NET assembly format, a metadata token into the string table (e.g. for the ldstr instruction) is a 4-byte value, the first byte of which identifies the table (in this case the string table), and the remaining 3 bytes indicate the offset to the string. That gives us 24 bits or about 15 million characters. However, in practice we see the metadata writer giving an overflow error after about 10 million characters. I was hoping that the difference between the two is an implementation constraint in the metadata writer that we could lift.
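
The token layout described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from Roslyn or System.Reflection.Metadata: the helper names `make_user_string_token` and `split_token` are hypothetical, and the table byte 0x70 is assumed to be the value ECMA-335 uses for user-string tokens.

```python
# Hypothetical sketch of a 4-byte metadata token: 1 table byte + 3 offset bytes.
USER_STRING_TABLE = 0x70             # assumed high byte for user-string tokens
OFFSET_BITS = 24                     # the remaining 3 bytes
MAX_OFFSET = (1 << OFFSET_BITS) - 1  # largest offset that fits in 24 bits


def make_user_string_token(offset):
    """Pack a heap offset into a token, failing if it needs more than 24 bits."""
    if not 0 <= offset <= MAX_OFFSET:
        raise OverflowError(f"offset {offset} does not fit in {OFFSET_BITS} bits")
    return (USER_STRING_TABLE << OFFSET_BITS) | offset


def split_token(token):
    """Return (table byte, offset) for a 4-byte token."""
    return token >> OFFSET_BITS, token & MAX_OFFSET


if __name__ == "__main__":
    print(hex(make_user_string_token(0x1234)))  # 0x70001234
    print(1 << OFFSET_BITS)                     # 16777216 addressable offsets
```

Under these assumptions the 24-bit field addresses 16,777,216 distinct offsets; whether those offsets count characters or bytes is a separate question.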

@gafter Is it characters or bytes? #US is UTF-16 encoded, which means we only have 2^23 characters (2^24 bytes).
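
A rough Python sketch of the characters-versus-bytes distinction raised here. It assumes the 24-bit offset counts bytes and ignores any per-string overhead in the heap. The names `HEAP_LIMIT_BYTES` and `utf16_size` are made up for illustration.

```python
# Assumption: a 24-bit offset addresses bytes, so the heap tops out at 2^24 bytes.
HEAP_LIMIT_BYTES = 1 << 24


def utf16_size(text):
    """Number of bytes the text occupies when encoded as UTF-16 (no BOM)."""
    return len(text.encode("utf-16-le"))


# Each UTF-16 code unit takes 2 bytes, so the budget in code units is half.
MAX_UTF16_CODE_UNITS = HEAP_LIMIT_BYTES // 2

if __name__ == "__main__":
    print(utf16_size("abc"))       # 6
    print(MAX_UTF16_CODE_UNITS)    # 8388608
```

On this reading the ceiling would be about 8.4 million UTF-16 code units, below both the 10 million and 15 million figures mentioned in the thread.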

If we take this nice theme and copy and paste it into an .aspx page using VB.Net, the error appears:
https://keenthemes.com/metronic/preview/demo1/index.html
If we copy and paste it into an .aspx page using C#, it works. Why?

Guys, this is serious. If MS continues to force companies to migrate their systems that were written in VB.Net, we will stop using MS technologies. Simple as that. We are really tired of this.
We are talking with a postdoc group specialized in software engineering; companies can't be locked into MS, Oracle, Google, etc. If it is necessary to write our code in C or C++, we will do that. Either you change your mindset or you will lose market share. Some people here are learning NodeJS, to give you an idea.
But why? People do not trust MS anymore.

@rogeriorlima3 see https://github.com/dotnet/roslyn/issues/40820#issuecomment-572261763. It sounds like the team needs your help to ensure the issue is fixed :)

The consensus at this point appears to be that this would likely require a change to the CLI specification and assembly file format, which is a major undertaking. In the meantime ASP should aim to use resources for large strings so that things can work without a major platform overhaul.

@rogeriorlima3 why are you not using resources files for this in VB instead of string literals? That is a simple workaround while we re-architect the entire CLI specification to support this.

@jmarolf I suspect @rogeriorlima3 has little control over how ASP.net represents page contents in the generated sources.

@gafter Forgive me if I'm wrong, but wouldn't it be easier and more compatible to change the way ASP.NET and Razor handle the string literals rather than making a change to the CLI?

@06needhamt yes, definitely.

@NTaylorMullen FYI

We are not using string literals. I just copied the index.html page from the Metronic template
and tried to test it in an .ASPX page.

The problem is:
If we take this page:
https://keenthemes.com/metronic/preview/demo1/index.html
Write all its content in:
.ASPX using VB.NET - Does not work
.ASPX using C# - It works

The question is: why does it only work using C#?

The current scenario is: it only works using C#, not with VB.net.
Generate an empty .ASPX page to simulate what I am talking about.

@rogeriorlima3 Your case is extreme and far out of the norm. Your choices are to petition your edge case to get attention (which will still likely take a long time until something happens) or to refactor your code.

This is effectively how all platforms work. There isn't going to be a lot of effort put into niche scenarios as that takes away time and resources from problems that are far more impactful for the ecosystem.

Note: as these are open source projects, if this is something critical for you, you could always consider contributing fixes yourself to unblock your unique circumstances.

Hey, don't worry... :) We already decided to use Java instead.
All the VB.Net code that we have will be rewritten in Java (or another non-MS tech).
It seems to be safer and more stable. More reliable technology. Just warn MS
to warn companies in time before they decide to kill C#. (MS = technology for children.)
See you. Bye
