This is part 2 of https://github.com/PowerShell/PowerShell/issues/13132. I'm running PowerShell scripts executed from a .NET program using Microsoft.PowerShell.SDK. The package bundle is created using the --runtime linux-x64 switch with dotnet publish.
Initially it was always failing, and in the previous issue it was discovered that I need to move the Modules folder in the runtimes folder up to the root of the publish folder.
Now that I have implemented that workaround of moving the Modules folder to the root of the publish folder, I'm getting the following error about 10% of the time.
The repro is the same as explained in https://github.com/PowerShell/PowerShell/issues/13132, except that before executing, the runtimes/win/lib/netcoreapp3.1/Modules folder is copied up to the root of the publish folder.
@daxian-dbw
2020-07-15 23:21:04: [Error] - An unexpected error has occurred while processing ForEach-Object -Parallel input. This may mean that some of the piped input did not get processed. Error: System.IO.DirectoryNotFoundException: Could not find a part of the path '/tmp/32029fe7-1d55-4203-bdb1-898dfea7c7d0/.local/share/powershell/Modules'.
   at System.IO.FileSystem.CreateDirectory(String fullPath)
   at System.IO.Directory.CreateDirectory(String path)
   at System.Management.Automation.Platform.SelectProductNameForDirectory(XDG_Type dirpath)
   at System.Management.Automation.ModuleIntrinsics.GetPersonalModulePath()
   at System.Management.Automation.ModuleIntrinsics.GetModulePath(String currentProcessModulePath, String hklmMachineModulePath, String hkcuUserModulePath)
   at System.Management.Automation.ModuleIntrinsics.SetModulePath()
   at System.Management.Automation.ModuleIntrinsics..ctor(ExecutionContext context)
   at System.Management.Automation.ExecutionContext.InitializeCommon(AutomationEngine engine, PSHost hostInterface)
   at System.Management.Automation.ExecutionContext..ctor(AutomationEngine engine, PSHost hostInterface, InitialSessionState initialSessionState)
   at System.Management.Automation.AutomationEngine..ctor(PSHost hostInterface, InitialSessionState iss)
   at System.Management.Automation.Runspaces.LocalRunspace.DoOpenHelper()
   at System.Management.Automation.Runspaces.LocalRunspace.OpenHelper(Boolean syncCall)
   at System.Management.Automation.Runspaces.RunspaceBase.CoreOpen(Boolean syncCall)
   at System.Management.Automation.Runspaces.RunspaceBase.Open()
   at System.Management.Automation.PSTasks.PSTaskBase.Start()
   at System.Management.Automation.PSTasks.PSTaskPool.Add(PSTaskBase task)
   at Microsoft.PowerShell.Commands.ForEachObjectCommand.<InitParallelParameterSet>b__63_2(Object _).
{
  "SerializationVersion": {
    "Major": 1,
    "Minor": 1,
    "Build": 0,
    "Revision": 1,
    "MajorRevision": 0,
    "MinorRevision": 1
  },
  "WSManStackVersion": {
    "Major": 3,
    "Minor": 0,
    "Build": -1,
    "Revision": -1,
    "MajorRevision": -1,
    "MinorRevision": -1
  },
  "PSCompatibleVersions": [
    {
      "Major": 1,
      "Minor": 0,
      "Build": -1,
      "Revision": -1,
      "MajorRevision": -1,
      "MinorRevision": -1
    },
    {
      "Major": 2,
      "Minor": 0,
      "Build": -1,
      "Revision": -1,
      "MajorRevision": -1,
      "MinorRevision": -1
    },
    {
      "Major": 3,
      "Minor": 0,
      "Build": -1,
      "Revision": -1,
      "MajorRevision": -1,
      "MinorRevision": -1
    },
    {
      "Major": 4,
      "Minor": 0,
      "Build": -1,
      "Revision": -1,
      "MajorRevision": -1,
      "MinorRevision": -1
    },
    {
      "Major": 5,
      "Minor": 0,
      "Build": -1,
      "Revision": -1,
      "MajorRevision": -1,
      "MinorRevision": -1
    },
    {
      "Major": 5,
      "Minor": 1,
      "Build": 10032,
      "Revision": 0,
      "MajorRevision": 0,
      "MinorRevision": 0
    },
    {
      "Major": 6,
      "Minor": 0,
      "Build": 0,
      "Revision": -1,
      "MajorRevision": -1,
      "MinorRevision": -1
    },
    {
      "Major": 6,
      "Minor": 1,
      "Build": 0,
      "Revision": -1,
      "MajorRevision": -1,
      "MinorRevision": -1
    },
    {
      "Major": 6,
      "Minor": 2,
      "Build": 0,
      "Revision": -1,
      "MajorRevision": -1,
      "MinorRevision": -1
    },
    {
      "Major": 7,
      "Minor": 0,
      "Build": 2,
      "Revision": -1,
      "MajorRevision": -1,
      "MinorRevision": -1
    }
  ],
  "Platform": "Unix",
  "GitCommitId": "7.0.2",
  "PSRemotingProtocolVersion": {
    "Major": 2,
    "Minor": 3,
    "Build": -1,
    "Revision": -1,
    "MajorRevision": -1,
    "MinorRevision": -1
  },
  "PSVersion": {
    "Major": 7,
    "Minor": 0,
    "Patch": 2,
    "PreReleaseLabel": null,
    "BuildLabel": null
  },
  "PSEdition": "Core",
  "OS": "Linux 4.14.165-102.205.amzn2.x86_64 #1 SMP Fri Feb 14 22:46:57 UTC 2020"
}
Comments moved from the previous issue:
Error: System.IO.DirectoryNotFoundException
No idea why Directory.CreateDirectory could fail with this exception. It doesn't require any part of the path to be present ... The code is here. The path in use was /tmp/2473a89a-8513-49db-bf23-51d8af91a349/.local/share/powershell/Modules and /tmp/2473a89a-8513-49db-bf23-51d8af91a349 was already successfully created by this code.
@normj What is the environment that your application was running in? It could be a permission issue in /tmp, but the fact that it happens only 10% of the time makes it even more mysterious.
Loop in @PaulHigin @adityapatwardhan @rjmholt for help.
I assume this is on a Linux machine? It seems strange that it is looking for modules in the /tmp directory. Is pwsh installed in the normal way? It sounds like the /tmp/... path has been removed? After all it is a 'tmp' directory.
@PaulHigin According to this code, it seems the HOME env variable is not set in those running environments where the script failed.
@normj Could it be possible that it only failed on the systems where HOME env variable is not set (which happens to be 10% of the environments)?
This is part of our AWS tooling to run PowerShell scripts in AWS Lambda. In Azure terms, think of this as how we run PowerShell scripts as functions. When I say it fails 10% of the time I mean 10 Lambda function invocations that run the same script fail with this error. The rest of the time it works fine. Given that we are talking about parallelization, is it possible there is some sort of race condition?
The environment is a Linux environment. The /tmp folder is actually the only writable place in Lambda the process has access to. The rest of the filesystem is read only. Just double checked and the HOME environment variable is never set in Lambda.
Here is the host code that runs PowerShell in Lambda via the Microsoft.PowerShell.SDK package.
https://github.com/aws/aws-lambda-dotnet/tree/master/Libraries/src/Amazon.Lambda.PowerShellHost
From the stack, the exception is thrown during creation of the runspace that the parallel script block will run in, which happens before a thread is started to run it. We used to create a new runspace for each parallel script block run, but now reuse runspaces. But it is hard to see this as a concurrency issue given the stack. With the newer version there is less runspace creation, which may mitigate the problem. You can tell if you have the newer version because there is a new ForEach-Object -Parallel switch ('-UseNewRunspace') that lets you revert to the old behavior of creating a new runspace for each script block.
By new, do you mean the 7.1 preview packages? I can't use those as they are based on .NET 5 and I need .NET Core 3.1. I also tested the 7.0.3 release today and still see the same behavior, with about 10% of invocations getting this error.
Here's the place in .NET where the error is thrown.
In particular, this section suggests that the code is written to handle races to create the directory.
The error thrown looks like it's this one, implying ENOENT was the native error code.
That suggests that one of the mkdir calls returned ENOENT rather than the handled EEXIST; here's the native implementation that gets called. According to the mkdir(2) manpage, this can occur when:
A directory component in pathname does not exist or is a dangling symbolic link.
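To make the distinction between the two error codes concrete, here is a small Python sketch (not the .NET code under discussion). Python's os.mkdir is a thin wrapper over mkdir(2), so it surfaces the same errno values. The helper name mkdir_errno and the directory names are made up for illustration.

```python
import errno
import os
import tempfile


def mkdir_errno(path):
    """Call mkdir(2) via os.mkdir and return the errno name, or None on success."""
    try:
        os.mkdir(path)
        return None
    except OSError as exc:
        return errno.errorcode[exc.errno]


with tempfile.TemporaryDirectory() as root:
    existing = os.path.join(root, "home")
    # The first creation succeeds; repeating it yields EEXIST, which a
    # race-tolerant directory-creation routine can safely ignore.
    first = mkdir_errno(existing)
    second = mkdir_errno(existing)
    # Creating a child whose parent does not exist yields ENOENT instead.
    orphan = mkdir_errno(os.path.join(root, "missing-parent", "Modules"))
    print(first, second, orphan)
```

If the failure in this thread really is an ENOENT from mkdir, then some parent component of the path was missing at the moment of the call, which is a different situation from two callers racing to create the same directory.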
I believe the code I referenced is also present in the .NET Core 3.1 timeframe. Reading through, it's still not clear to me where the error occurs or why.
Some possibilities:
FileSystem.CreateDirectory() code (the 10% occurrence favours this)
I think it might be worth opening an issue on .NET.
It turns out that the runspace reuse feature is not part of the 7.0.x releases, and is slated to go into 7.1. So you can only see it today in the preview releases. But in any case, all runspaces are created on a single thread so there should be no concurrency issues. It would be interesting to see if runspace reuse affects the problem, since fewer runspace objects are created. I also wonder if this only occurs on one platform. It would be helpful if there were a simple consistent repro.
I will be OOF today.
When I say it fails 10% of the time I mean 10 Lambda function invocations that run the same script fail with this error. The rest of the time it works fine.
@normj Thanks for the additional information. Just so I'm clear, when _it works fine_, it was also running in the AWS Lambda function environment, which is supposed to be the same as for those 10 failing invocations, right?
When the env variable HOME is not defined, every Runspace startup will try creating a folder in the form of /tmp/<new-guid>/.local/share/powershell/Modules as the personal user module path. That means lots of such folders get created when ForEach-Object -Parallel is dealing with a lot of inputs. In that situation, it may not be surprising to see strange failures like this one. (_maybe we have reached the upper limit on creating new directories in /tmp/ in the Lambda sandbox?_)
I suggest you create the HOME environment variable in the Lambda function configuration (_you can define arbitrary environment variables for Azure Functions_), and point it to something like /tmp/home. Then only /tmp/home/.local/share/powershell/Modules will be created by the first Runspace, and it will be reused by the rest of the Runspace instances created by -Parallel. Please give this workaround a try and see how it goes.
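The difference the workaround makes can be sketched with a toy model in Python. This is not PowerShell's implementation; personal_module_path is a hypothetical stand-in that only mirrors the behavior described in this thread: a fresh GUID-named directory per runspace startup when HOME is unset, and a single shared directory when it is set.

```python
import os
import tempfile
import uuid


def personal_module_path(env, tmp_root):
    """Hypothetical model of the lookup described above, not PowerShell's code."""
    home = env.get("HOME")
    if not home:
        # No HOME: fall back to a brand-new GUID-named directory on every call.
        home = os.path.join(tmp_root, str(uuid.uuid4()))
    path = os.path.join(home, ".local", "share", "powershell", "Modules")
    os.makedirs(path, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp_root:
    # Five simulated runspace startups without HOME: five distinct directories.
    without_home = {personal_module_path({}, tmp_root) for _ in range(5)}
    # With HOME set, every startup resolves to the same directory.
    env = {"HOME": os.path.join(tmp_root, "home")}
    with_home = {personal_module_path(env, tmp_root) for _ in range(5)}
    print(len(without_home), len(with_home))
```

In this model the number of directories created without HOME grows with the number of runspace startups, while with HOME set it stays at one no matter how many inputs -Parallel processes.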
... every Runspace startup will try creating a folder in the form of ...
A new home directory is created for each runspace? This seems wrong. Why wouldn't the temporary home directory be static?
@daxian-dbw All failures and successes are in Lambda.
As you suggested, I tried setting the HOME environment variable, and once I did that I haven't been able to recreate any failures. So I think we can use that as a workaround and make setting the HOME environment variable part of our hosting code.
A new home directory is created for each runspace? This seems wrong. Why wouldn't the temporary home directory be static?
Yes, that's something that should be looked into. The code is here.
As you suggested, I tried setting the HOME environment variable, and once I did that I haven't been able to recreate any failures. So I think we can use that as a workaround and make setting the HOME environment variable part of our hosting code.
@normj Good to hear that. At least you are unblocked :) I think we should fix the code I referenced above.
A new home directory is created for each runspace? This seems wrong. Why wouldn't the temporary home directory be static?
Yes, that's something that should be looked into. The code is here.
I submitted #13239 to address this.
:tada:This issue was addressed in #13239, which has now been successfully released as v7.1.0-preview.6.:tada: