We have a ServiceFabric service, which is running the latest .NET Core 3.1. We publish & deploy it as self-contained. It references Microsoft.Data.SqlClient. Using version 1.1.3 everything works perfectly. After updating to 2.0.0, we can not event deploy it to the fabric anymore. Unfortunataly, ServiceFabric gives very little information why. This is what we see:
Service package for manifest 'XXXX' and service package activation ID 'XXXX' is in Error.
'System.Hosting' reported Error for property 'ServiceTypeRegistration:XXX'.
The ServiceType was disabled on the node.
It tries that several times and then gives up.
By digging into SF events i was able to find this as well:
There was an error during CodePackage activation.The process/container terminated with exit code:2147516556.
By comparing service package i can see that it used to include sni.dll when using 1.1.3 and then its gone after we updated to 2.0.0. My suspicion is that is the reason why the service fails to start.
And i think is my theory is correct because the service works just fine locally, since sni.dll is availaible on my PC. But it does not exist on the ServiceFabric's VMs.
@dmytro-gokun
Please look at 2.0.0 release notes for breaking changes in SNI naming (introduced in #570). For .NET Core, with 2.0.0 we've introduced our own replacement for runtime.native.System.Data.SqlClient.sni NuGet as well that renames sni.dll to Microsoft.Data.SqlClient.SNI.dll.
If you can provide us a repro, we can investigate more in what your observations are.
@cheenamalhotra Yes, I've read that. And the deployment package contains Microsoft.Data.SqlClient.SNI.dll, so no problem here. The problem is that it somehow still needs sni.dll to load.
About a repro. As I said, it does not reproduce with a Service Fabric Local Cluster on a typical development machine. I can make a repro that you will have to deploy to a real cluster and then you will see it works just fine with 1.1.3 and does not even start with 2.0.0. Will this work for you?
The problem is that it somehow still needs sni.dll to load.
The SqlClient driver does not need the old sni.dll anymore. Can you confirm which component needs the old 'sni.dll' by capturing more logs from your cluster? The above errors don't provide any context.
I can make a repro that you will have to deploy to a real cluster and then you will see it works just fine with 1.1.3 and does not even start with 2.0.0. Will this work for you?
I don't have that environment to deploy but if you can gather logs we can investigate the errors.
The SqlClient driver does not need the old sni.dll anymore.
Do you perhaps know, in which situations it was loaded before? That will help me build a repro.
Can you confirm which component needs the old 'sni.dll' by capturing more logs from your cluster? The above errors don't provide any context.
Well, that's the problem with ServiceFabric. It does not give you any more info than that. I've spent several hours googling and trying different tricks, but that's all I've got :( Any chance, you know anyone who can help with getting more info from SF?
Do you perhaps know, in which situations it was loaded before? That will help me build a repro.
It's a NuGet package reference, and was loaded from runtime.native.System.Data.SqlClient.sni same way as currently Microsoft.Data.SqlClient.SNI.runtime NuGet is loaded. It's a package reference that contains native runtime DLLs. The old package referenced another package implicitly based on arch, but now we added arch specific runtime folders in the same NuGet. But again. the driver doesn't need old sni.dll anymore, so I don't think it will stop you from working if it's absent from your environment.
Have you tried to remove sni.dll from your local dev environment and confirm if you see same error?
If it's related, it should fail locally too, but if not - this error is probably not related to driver and it's dependencies. I also see similar issues on service fabric (same error code): https://github.com/microsoft/service-fabric/issues/868
You can find guide to debug issues here: Debug service errors using the log files, or else contact Service Fabric teams.
So, after further investigation, I was able to find the root cause. It has nothing to do with sni.dll.
The root cause is this exception:
Error:
An assembly specified in the application dependencies manifest (XXXX.deps.json) was not found:
package: 'Microsoft.Data.SqlClient.SNI.runtime', version: '2.0.0'
path: 'runtimes/win-x64/native/Microsoft.Data.SqlClient.SNI.pdb'
This exception happens during the deployment and before our code even has a chance to run.
Now:
1) Why a .pdb is even required for the app to load? Are not pdbs only needed for debugging/stack trace formatting etc? To me this looks really weird, maybe I do not understand something about pdbs here;
2) The .pdb is not here because that's how Azure DevOps deploys services by default: without pdbs. This is described in detail here: https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-tutorial-deploy-app-with-cicd-vsts#configure-continuous-delivery-with-azure-pipelines. Particularly, on this screenshot: https://docs.microsoft.com/en-us/azure/service-fabric/media/service-fabric-tutorial-deploy-app-with-cicd-vsts/saveandqueue.png you can (partially) see the step which copies *.pdb to a separate folder (which is later zipped to an artifact) and then the next step which deletes original *.pdb from the final deployment package. This explains why pdbs are not copied to Service Fabric and why we get the exception we get.
3) After we removed steps which remove pdbs from the deployment package, it deploys and works just fine.
To me it looks like the correct fix to this issue should be not to require the pdb presence after all. If, for some reason, it is absolutely required, this should be thoroughfuly documented somewhere. Noone should kill so much time on this again.
@dmytro-gokun This looks very Service Fabric specific - wonder why the Service Fabric team does not want to deploy .pdb files? Maybe the Azure DevOps template should be updated?
@ErikEJ well, I do not know why. I can only guess that they want to preserve disk space. If you are using microservices architecture and run hundreds of services, all those pdbs (which are identical for each service) may eat all your limited disk space very fast.
Nevertheless, deploying a .pdb alongside your app or not deploying it should be the developer's choice and not a requirement for the app to run at all.
Agree, wonder if the pdb was included in the previous package? @cheenamalhotra ??
@ErikEJ It obviously was not, as the same app/DevOps pipeline worked just fine with 1.1.3
Hi @dmytro-gokun
The pdbs were included starting with v2.0.0 since the DLLs are closed source, as tools like Binskim fail to work with driver otherwise as reported in #598
I would expect these errors to be handled by applications or application frameworks and there should be a solution to this.
But products cannot NOT ship PDBs due to such issues, it looks like a generic issue to me that should be reported to Service Fabrics and documented by their teams how to handle PDB related issues their tools may encounter.
This issue is called by this step in the Azure DevOps build pipeline:
steps:
- task: DeleteFiles@1
displayName: 'Delete files from $(build.artifactstagingdirectory)\applicationpackage'
inputs:
SourceFolder: '$(build.artifactstagingdirectory)\applicationpackage'
Contents: '**\*.pdb'
@cheenamalhotra In my humble opinion, Azure DevOps does the right thing: they treat debug info as debug info. By requiring a pdb to be present for an app to start, you are essentially changing the semantic of what pdb is. IMO.
By requiring a pdb to be present for an app to start, you are essentially changing the semantic of what pdb is. IMO.
@dmytro-gokun I'm not sure I follow.. We don't require PDBs to be present, there's no dependency! They're copied with DLL when you build/deploy. You should be able to work without them.
What seems to be going on is these PDB references are copied to your deps.json file but when you remove these PDBs the deps.json fails to find them when dotnet reads the file. So you need to probably remove them before deps.json is created or fix your deps.json essentially if you absolutely need to remove them.
@cheenamalhotra deps.json is not created manually or by Service Fabric. It is created by dotnet build and i presume it uses dependency information from nuget packages. Here is the problematic piece:
"Microsoft.Data.SqlClient.SNI.runtime/2.0.0": {
"native": {
"runtimes/win-x64/native/Microsoft.Data.SqlClient.SNI.dll": {
"fileVersion": "2.0.0.0"
},
"runtimes/win-x64/native/Microsoft.Data.SqlClient.SNI.pdb": {
"fileVersion": "0.0.0.0"
}
}
},
And that's why Service Fabric looks for this pdb while deploying the service. To be honest, i'm not that familiar with nuget to tell if it's possible to mark *.pdb dependency as optional. But that's what would look like a correct solution to me.
I would then recommend this gets reported to dotnet/sdk team that they should exclude copying PDBs for "deploy" targets :)
We don't control anything in .NET Core. We only ship PDBs along with DLLs and rest is taken care by NuGet restore and dotnet sdk.
For .NET Framework we control COPY targets and also exclude PDBs for deploy tasks, but we always shipped PDBs there and there have been no such concerns reported.
@cheenamalhotra
Okay, but:
1) This does not happen to any other nuget package we reference in any other project we have (quite a lot of packages, quite a lot of projects);
2) This did not happen in version 1.1.3
The only conclusion I can draw from this is that there is some problem with this particular nuget package. Do not you agree?
Maybe have a look at any differences between the two SNI packages
Can you give me name of any other native runtime NuGet packages that contain PDBs that you reference?
I can compare and see.
For 1.1.3 it didn't happen because we referenced "runtime.native.System.Data.SqlClient.SNI" NuGet that did not contain PDBs. Reason why there were complaints regarding missing PDBs.
LOL - now there are complaints that they are there 馃槃
@cheenamalhotra Unfortunately, I cannot give you such an example. But it also obvious to me that the way it is done today is causing problems. I've spent tons of time getting to the root of this issue. I'm 100% sure hundreds of people are doing the same these days and will do in the future if it's not fixed somehow. This is not the way it should be.
If you ask me, I would say it makes sense for you to speak to the core team (or nuget team?) and ask them how to do this properly, so that .pdb is not included to .deps.json. If no such way exists at the moment, a feature request has to be made to the proper team.
Also, if you think about it ... why should the pdb go to the package itself? That's what MS symbol server is for, is not it?
I think since we also shipped the PDBs to Microsoft.Data.SqlClient.SNI Nuget (SNI for .NET Framework), we packaged it here too for consistency, but I agree the experience hasn't been great with .NET Core. I'll take it offline and see if we can take them away and only upload them on symbol server.
Hi @dmytro-gokun
Just a heads up, we're setup and planning to fix this issue with next preview release of Microsoft.Data.SqlClient v2.1.0-preview2, will keep you posted.
@cheenamalhotra Thanks a lot. Looking forward for it.
Run into the same problem here. Our single exe .net core app crashed after upgrading Sqlclient from 1.1.0 to 2.0.1. Took me whole night to realize it's the pdb missing, then I found this discussion.
Looking forward for for the next release/fix!
I had a similar problem. I was using _EFCore.BulkExtensions_ v3.2.4 (which is using _SqlClient_ v2). Downgrade to _EFCore.BulkExtensions_ v3.1.4 (with _SqlClient_ v1) solved the issue.
Hi @dmytro-gokun
Closing this issue as pdbs have been removed from Microsoft.Data.SqlClient.SNI.runtime v2.1.1 to be picked by Microsoft.Data.SqlClient v2.1.0-preview2 onwards.
This issue happened when I switched from .NET 5 rc2 to RTM today and upgraded many of my nugets to 5.0.0 versions. I don't use service fabric at all. Has there been a regression somewhere?
<PackageReference Include="FlexLabs.EntityFrameworkCore.Upsert" Version="3.1.0" />
<PackageReference Include="Microsoft.AspNetCore.Mvc.Razor.RuntimeCompilation" Version="5.0.0" />
<PackageReference Include="Humanizer" Version="2.8.26" />
<PackageReference Include="MailKit" Version="2.9.0" />
<PackageReference Include="Microsoft.EntityFrameworkCore.InMemory" Version="5.0.0" />
<PackageReference Include="Microsoft.EntityFrameworkCore.Relational" Version="5.0.0" />
<PackageReference Include="Microsoft.EntityFrameworkCore.Sqlite.Core" Version="5.0.0" />
<PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="5.0.0" />
<PackageReference Include="Microsoft.Extensions.Caching.Memory" Version="5.0.0" />
<PackageReference Include="Microsoft.Extensions.Hosting" Version="5.0.0" />
<PackageReference Include="Microsoft.Extensions.Http.Polly" Version="5.0.0" />
<PackageReference Include="Microsoft.Web.Administration" Version="11.1.0" />
<PackageReference Include="NLog" Version="4.7.5" />
<PackageReference Include="Npgsql.EntityFrameworkCore.PostgreSQL" Version="5.0.0-rc2" />
<PackageReference Include="Pomelo.EntityFrameworkCore.MySql" Version="5.0.0-alpha.1" />
<PackageReference Include="protobuf-net.Grpc.AspNetCore" Version="1.0.123" />
<PackageReference Include="SharpCompress" Version="0.26.0" />
<PackageReference Include="StackExchange.Redis" Version="2.1.58" />
<PackageReference Include="System.Net.WebSockets" Version="4.3.0" />
<PackageReference Include="System.Net.WebSockets.WebSocketProtocol" Version="5.0.0" />
Rolling the entity framework nuget packages back to 3.1.8 fixed the problem on the .NET 5 rtm release for me.
I had this issue too with a .NET Core app using Entity Framework Core 5.0.0 packaged as a single exe. I solved it by explicitly taking a dependency on Microsoft.Data.SqlClient
version 2.1.0 (which depends on Microsoft.Data.SqlClient.SNI.runtime
>= 2.1.1):
<PackageReference Include="Microsoft.Data.SqlClient" Version="2.1.0" />
<PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="5.0.0" />
I did the same thing and it worked as well.
Most helpful comment
I think since we also shipped the PDBs to Microsoft.Data.SqlClient.SNI Nuget (SNI for .NET Framework), we packaged it here too for consistency, but I agree the experience hasn't been great with .NET Core. I'll take it offline and see if we can take them away and only upload them on symbol server.