Consider the following two modules:
# MyModule1.psm1
function Get-CoolStuff {
    Write-Output 'Cool stuff 1!'
}

# MyModule2.psm1
function Get-CoolStuff {
    Write-Output 'Cool stuff 2!'
}
Notice that they both define (and export by default) a Get-CoolStuff function. Each one behaves differently.
Now consider these two scripts:
# myscript1.ps1
#Requires -Modules MyModule1
Get-CoolStuff

# myscript2.ps1
#Requires -Modules MyModule2
Get-CoolStuff
These two scripts are simple: they ensure the module they need is imported and then proceed to call the module function.
Now let's run them. Ensure that the scripts are set up appropriately so that #Requires will automatically import them if needed. (So MyModule1.psm1 should be in a folder named MyModule1 that's somewhere on PSModulePath, and similarly for MyModule2.psm1.)
Then:
PS > .\myscript1.ps1
Cool stuff 1!
PS > .\myscript2.ps1
Cool stuff 2!
They both work, and all seems to be well.
...Until you try to run myscript1.ps1 again:
PS > .\myscript1.ps1
Cool stuff 2!
The problem is that since the function names from the modules conflict, myscript1.ps1 has been silently broken just because MyModule2 happened to get imported.
The problem with allowing conflicting function names is that it can silently, unexpectedly, and confusingly break the behavior of anything that depends on it. The function used is effectively random; you never know what code you're actually calling if there happens to be a conflict.
One primary workaround for this is to manually import a module using a prefix:
PS > Import-Module MyModule2 -Prefix MyModule2
and then call it like this:
PS > Get-MyModule2CoolStuff
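Inside a script, the same workaround might look something like this sketch (the script name is hypothetical; the point is that `#Requires` gets replaced with an explicit `Import-Module` call):

```powershell
# myscript2-prefixed.ps1 (hypothetical): import the module manually with a
# prefix instead of letting #Requires -Modules load it.
Import-Module MyModule2 -Prefix MyModule2 -ErrorAction Stop

# The exported function is now only reachable under the prefixed name.
Get-MyModule2CoolStuff
```

Note that the prefix is applied between the verb and the noun, so `Get-CoolStuff` becomes `Get-MyModule2CoolStuff`.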
This is completely sufficient when you're working at the command line, developing commands as you go, but it doesn't work well for scripts and script modules. To use it in a script, you have to abandon #Requires -Modules (and RequiredModules) in favor of calling Import-Module with the prefix inside the script, or have some kind of orchestration script, but this is still cumbersome and rather messy.
The other workaround is somewhat better. You can be explicit about which module the function should come from:
PS > MyModule1\Get-CoolStuff
Cool stuff 1!
This is a little bit better since you can still depend on #Requires and similar, but it still means you have to expect a name collision ahead of time, and it clutters your script. There's no way to find out if you accidentally bring a new conflict into your setup.
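To be fair, you can inspect a conflict manually once you suspect one; nothing flags it for you automatically, though. A sketch of the manual check:

```powershell
# List every visible command named Get-CoolStuff, including the shadowed
# copies, along with the module each one comes from.
Get-Command -Name Get-CoolStuff -All |
    Format-Table -Property Name, ModuleName, CommandType
```

Without -All, Get-Command only shows the winner of the precedence rules, which is exactly the command you'd get by calling the unqualified name.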
What I'd like to see is for ambiguous function calls to cause an error. Following the above example, I'd like to see something like this:
PS > Get-CoolStuff
The function 'Get-CoolStuff' is ambiguous. Found in modules: MyModule1, MyModule2
At line:1 char:1
+ Get-CoolStuff
+ ~~~~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (Get-CoolStuff:function) [], RuntimeException
+ FullyQualifiedErrorId : FunctionAmbiguous
(All I did here was change a few things in the error you get when attempting to access an unknown variable under strict mode.)
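For reference, the existing strict-mode behavior being imitated can be reproduced like this (the exact wording of the error may vary by version):

```powershell
# Enable strict mode, then reference a variable that was never assigned.
Set-StrictMode -Version 2.0
$someUndefinedVariable

# This produces an error along the lines of:
#   The variable '$someUndefinedVariable' cannot be retrieved because it
#   has not been set.
#   + FullyQualifiedErrorId : VariableIsUndefined
```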
I think it makes the most sense to make this only happen under strict mode. Strict mode already has a history of introducing breaking changes that make the runtime better behaved and more predictable, and this certainly falls into that category. This would also give users an easy way to disable this particular check (by setting the strict mode version lower).
The biggest improvement here is that it follows the fail fast principle. It prevents the command from doing the wrong thing if you're unlucky enough for the differences to not cause an actual error, and it gives the caller a clearer, more immediate indication of what they need to fix, even if the conflict would have caused an error otherwise.
Since this is a rare case, no change needs to be made to how PowerShell behaves when there is no conflict. Since this would be a breaking change for some cases, it would make the most sense to include it in a major version increment. (Obviously not 6. Maybe 7.)
Test scripts all run under:
PS > $PSVersionTable
Name                      Value
----                      -----
PSVersion                 6.0.0-beta
PSEdition                 Core
GitCommitId               v6.0.0-beta.5
OS                        Microsoft Windows 6.3.9600
Platform                  Win32NT
PSCompatibleVersions      {1.0, 2.0, 3.0, 4.0...}
PSRemotingProtocolVersion 2.3
SerializationVersion      1.1.0.1
WSManStackVersion         3.0
A real world example where this is getting me into trouble is the Carbon module vs. the ScheduledTasks module. I'm using both in some installation scripts for a set of web apps, services, and scheduled tasks. Specifically, they both implement a Get-ScheduledTask function that returns different, incompatible types. E.g., the ScheduledTasks function Disable-ScheduledTask fails with a Carbon object. I had also created some convenience functions in a module of my own that conflict with Carbon, and they behave somewhat differently, which caused problems in my scripts.
One problem I see with this proposal would be modules that intentionally rewrite/wrap commands from other modules to provide domain specific functionality while not breaking functionality of the original command in consumer code. That should still be a valid practice even in strict mode, IMO.
For example, Module1 has a Get-Widget command that does a rudimentary retrieval of widgets from a "classic" widget store. Module2 provides a wrapper for Module1's Get-Widget that has the exact same user-facing functionality, but internally provides the ability to use an alternate authentication method against a custom widget store.
Also, how would this work with modules that intentionally rewrite/wrap core commands?
@markekraus
I disagree about that being a valid usage in strict mode. The reason is because that imposes a specific dependency on the order that modules are imported. What if your "override" module somehow got accidentally imported before the other one? Maybe you added a new script or module that depends on the overriding module but not the original, and it's getting executed/imported before the original in your own code. Or maybe someone re-arranges your imports because they want the original, not realizing that this will break something else. Or maybe a completely different script depends only on the overriding module, and someone executes that other script before yours.
Try the above example in reverse (run myscript2.ps1 first). You'll see that both scripts start outputting Cool stuff 1! instead of Cool stuff 2!.
So allowing your scenario still creates situations where the code is unpredictable when called manually and fragile as changes are made. The whole point of strict mode is to give you more strict behaviors that help you avoid creating situations that get you into trouble, isn't it?
Core command overriding suffers all the same problems as overriding other commands, only worse since scripts all over the place almost certainly reference core commands without qualifying them. Is there a way to qualify core commands? If not, implementing the same behavior would effectively disallow overriding core commands in strict mode. That, in my opinion, is not a terrible thing.
On the other hand, some conflicts with core commands could be the result of new ones being added as PowerShell continues to be developed. That would be unintentionally overwriting core commands, though, so a nice error might still be the best thing there (fail fast principle).
FYI, in case you're wondering, a real world example where this is getting me into trouble is the Carbon module vs. the ScheduledTasks module. There's direct conflicts between those two, and some convenience functions I created in my own module before getting Carbon also conflict with it.
(I'm editing that into the original post.)
The reason is because that imposes a specific dependency on the order that modules are imported.
Order of operations matters and the order of importing modules is usually managed by the module manifests themselves.
I don't believe strict mode has any say in the order of operations within a script. It's meant to enforce best practice, not to ensure order of operations. For example, it will prevent a user from accessing a non-existent property on an object, but it will not ensure a user is modifying a property before a Set- command is called to make the change.
is there a way to even qualify core commands?
Yes: Microsoft.PowerShell.Utility\Write-Output. Modules which need strict access to core commands use Module Qualified command names for the commands they depend on. Or they ensure the correct command by capturing it with Get-Command, like pester.psm1 does.
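The Get-Command technique looks something like this sketch (capturing a specific module's command up front, then invoking it through the variable):

```powershell
# Bind to the core Write-Output explicitly, so a later name conflict
# cannot redirect calls made through this variable.
$CoreWriteOutput = Get-Command -Name Write-Output -Module Microsoft.PowerShell.Utility

# Invoke the captured command with the call operator.
& $CoreWriteOutput 'Always the core Write-Output'
```

This pins the binding at the moment Get-Command runs, which is why test frameworks use it to stay immune to whatever the session imports later.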
I'm not sure what you mean by controlling module import order in the manifest. A module's functions can be broken by a module it doesn't even depend on. (I assume you mean #Requires for scripts, but it suffers from the same problems.)
Example:
# MyModule3.psd1
@{
    ModuleVersion     = '1.0.0'
    GUID              = 'e17b3c87-6fc0-4457-a1e4-67233073da11'
    Author            = 'bladeoflight16'
    Description       = 'Stuff'
    PowerShellVersion = '6.0'
    RequiredModules   = @('MyModule1')
    NestedModules     = @('.\MyModule3.psm1')
}

# MyModule3.psm1
function Get-EvenMoreCoolStuff {
    Get-CoolStuff | Write-Output
    Write-Output 'Even more cool stuff!'
}
Export-ModuleMember -Function Get-EvenMoreCoolStuff
Now execute:
PS> Import-Module MyModule3
PS> Get-EvenMoreCoolStuff
Cool stuff 1!
Even more cool stuff!
PS> .\myscript2.ps1
Cool stuff 2!
PS> Get-EvenMoreCoolStuff
Cool stuff 2!
Even more cool stuff!
MyModule3 has no declared dependency on MyModule2, but MyModule2 is still overriding the function that it depends on. MyModule3's function even behaves differently in the same session, just because of a name conflict. Whether the modules and scripts work depends completely on the global state of the session.
And I didn't test it, but what if module Z depends on Y to be imported after X, specifies them in that order, but Y is already imported before Z is imported? I suspect Y would not be imported again, so that means X is imported after Y, breaking Z.
In my current set up, I have a fairly large number of scripts and modules to maintain. Maybe roughly 20 scripts, 10 to 15 custom modules, and maybe 5 third party modules. A lot of the scripts require the same functionality, and I've moved those out into modules to avoid duplicating a lot of logic. Some of the scripts are even intended to run in completely different environments. (For example, we have psake build scripts and also scripts to deploy stuff.)
I've also adopted a policy of using #Requires and RequiredModules as much as possible for dependency management. The benefit was enormous: it allowed me to decouple the "what" from the "where." This allowed me to organize my project much more sanely. Scripts that directly relate to some C# component (that needs to be built and deployed, for instance) can be located close to them and included in their build for distribution, and that particular script/project doesn't need to contain any logic for packaging or distributing or finding those modules. And other scripts that aren't closely related to a project (like configuring something globally on IIS) can be anywhere as well. None of them have to care about where those modules are, as long as they're on PSModulePath, which can be set up once before trying to run anything, and the user receives a fairly clear, obvious error message if they forget to do so.
But with the current conflict resolution of PowerShell ("last one wins"), I can't be confident my scripts are going to work. They might need to be run in an order I didn't foresee and therefore didn't test. Or I might add a new script that needs to run before others and adds Carbon as a dependency, now suddenly breaking some of them. Or I might need to pull another module into an existing script because now it needs something from that.
That kind of unreliability is a major risk as the system grows. I don't know what functions might have name conflicts in the future as PowerShell gets upgraded and more functionality becomes available.
If we were talking about one or two scripts, I could see these problems being manageable. But having to go back, find all references to a few functions, and add specific qualifiers to 30 different scripts (thousands of lines of code) to function calls where the qualifier wasn't needed before, or worse, might be imposed by third party code, is a pretty big deal.
I suppose an alternative could be to just disallow any unqualified reference to a function in a script, but this struck me as an incredibly cumbersome requirement that makes the code extraordinarily more verbose and difficult to read, and it would only rarely show any benefit (despite that benefit being incredibly important to the reliability of the PowerShell code). With the current resolution mechanism, any unqualified command (even core ones!) may be a bug just waiting to happen.
I guess you could implement something which would allow you to explicitly declare what modules to look in within a script/script module. (It'd effectively be like C#'s namespace usings, I suppose.) But you'd also have to implement a strict mode where, if the function wasn't in one of the declared modules, it would error out. How would that work with dot sourcing, though? And you'd still have to deal with conflicts between those in some way. I suppose "last in the list" might work okay for that; at least then it's not, "random function in the global state blowing up my code at random."
I'm not sure what you mean by controlling module import order in the manifest.
Module2 has a Get-Widget which wraps the Get-Widget in Module1, and Module2 lists Module1 in RequiredModules in its module manifest. The consumer code imports Module2. With the proposal in place, this would cause an error with strict mode enforced due to there being an "ambiguous command". In reality, Module2 is making legitimate use of PowerShell's Command Precedence to wrap Module1\Get-Widget to provide additional back-end functionality (i.e. no new parameters available to consumer code, but it can access a custom Widget store).
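A sketch of the wrapping pattern being described (all names here are hypothetical; the key point is that the wrapper delegates through the module-qualified name, so it is unaffected by the very name conflict it creates):

```powershell
# Module2.psm1 (hypothetical): shadow Module1's Get-Widget while preserving
# its public surface, delegating to the original unambiguously.
function Get-Widget {
    [CmdletBinding()]
    param([string] $Name)

    # Back-end customization (e.g., alternate authentication against a
    # custom widget store) would happen here, then delegate:
    Module1\Get-Widget -Name $Name
}
```

Consumer code keeps calling plain `Get-Widget` and transparently gets Module2's version, which is exactly the behavior the proposal would flag under strict mode.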
Then it's not possible to write maintainable code in PowerShell. If you're actually encouraging building a system that way, no one is going to be able to track what it's actually doing when reading it.
As I said before, this is all fine if you're working manually at the command line just doing things on the fly. Putting this kind of confusion in a code base is a completely different story. That's why I said it should only be enforced under strict mode. It might even be okay if you can limit it in scope to a single script, so someone can figure out what's coming from where. But PowerShell is built so that changing a command affects all child scopes automatically, whether you want it to or not, and manipulating shared state nearly always leads to a confusing tangle of logic. In PowerShell, you don't even know what code you might be affecting by doing this. (Most of the programming world has realized that's the case just with shared variables, forget doing it with actual code.)
It is, and it can help in situations where 3rd Party Company A has Module B that does not interact with API C. You have a massive code base. Rather than rewriting all of Company A's Module B or refactoring your entire code base, you write Module D which wraps the functionality of Module B for the commands where it is relevant.
A real world example of this is the ActiveDirectory module. There is a bug with Get-ADGroupMember which causes a terminating error when it cannot resolve a ForeignSecurityPrincipal. The amount of 3rd party code which calls Get-ADGroupMember is large. You cannot refactor both your code and all the 3rd party code you need to address this. You do not want to re-write the entire ActiveDirectory module. Instead, you create your own module that wraps/re-writes Get-ADGroupMember to return the raw AD object for the FSP when it cannot be resolved. The only requirement is that you import the ActiveDirectory module first, before your own module. You can do that very easily by having your module require ActiveDirectory and then importing your module.
This most certainly helps with maintainable code.
And what happens when two third parties replace that command with different ones that do different things? Suddenly, you have a system that you don't know what it's going to do. A system that I can't predict the behavior of is not maintainable. What's worse, I might not even know when I've pulled in another one. I might not find out until my AD set up is a mess because it's doing the wrong thing.
The better answer to that bug is for ActiveDirectory to fix the bug and everyone can upgrade, and users of the library just won't be able to support that use case until they do. People who need it can stick with whatever old interface there is until it's fixed. This PowerShell "feature" even lowers the pressure on them to actually fix something that is broken because they can just tell others to replace it and pretend everything is fine. I don't think an ecosystem where important libraries can just ignore important bugs is a good thing; do you?
It is the responsibility of the code consumer to vet 3rd party code before using it. I'm not saying it's ideal, and I'm also not saying this is what you should do in every situation. I'm just saying that there is a legitimate use case for command wrapping, and this proposal would break a fundamental feature of PowerShell. Strict mode is intended to enforce best practices. In some cases, squashing functions is best practice.
It is the responsibility of the code consumer to vet 3rd party code before using it.
This is ludicrous. It's not the user's fault if a library they thought was good is broken. Sure, they should've tested their usage, and maybe they did. Testing isn't perfect. It's not going to catch everything. It's the responsibility of the libraries' authors to do as much as they can to make it safe to use. Your argument could also be used to say that anyone who uses ActiveDirectory in a library they intend to distribute should've tested it with a ForeignSecurityPrincipal and made sure it still worked. If they had, they would've either created a workaround (like a separate function that actually works) or explicitly decided not to support it until ActiveDirectory was fixed. Heck, your argument also suggests that whoever wrote the ActiveDirectory module should have tested better, since they probably aren't the authors of all the technology under the hood that they call out to. So this statement doesn't really help your case.
In some cases, squashing functions is best practice.
No, it's not. If a library can't support proper fetching of ActiveDirectory group members, the best practice is for the library to not fetch them. It can make them an input of some kind instead, and then the caller can do the fetching however is required. If it can support it by not using Get-ADGroupMember, it can do so with a differently named function. That the library was written poorly does not make hacking their code this way a best practice. It might be a viable temporary work around, but something that dangerous is never going to be a best practice.
I've seen occasions where you're tempted to do this kind of stuff in Python (replace a module member that's accessed in multiple places). And in some rare cases, it makes sense as a massive hack because you have no other viable options. But at least in Python, I know when I'm doing it, and libraries doing it behind your back are extremely rare. (I'm certain a majority of Python developers would be unable to name any library that does it without the caller explicitly requesting it, and the only one I can think of where you can explicitly request it is mock, which immediately undoes it.) And that's because even though the language allows it, the community frowns on it. Certainly no one defends it as a "best practice." And importantly, it will never happen accidentally; it only happens explicitly.
The fact that in some rare cases it's the only option you have doesn't eliminate the fact that it's dangerous, error prone, and best avoided, and it certainly doesn't make it a "best practice." Best practices are by definition not a list of everything it sometimes makes sense to do; best practices are the things you should do and avoid as much as possible and that you should tread very carefully when you have to go outside them. Overriding commands by using a conflicting name is definitely something best avoided and that requires special care when you have to do it.
Make special note of the fact that I'm not advocating for the complete removal of the ability to do what you're describing. You'd still be able to outside of strict mode. This means that in my original proposal, there is still some means of accomplishing what you suggest. But avoiding name conflicts should clearly be a best practice. The feature I'm suggesting would encourage just that and give early detection of when it happens accidentally.
Your argument could also be used to say that anyone who uses ActiveDirectory in a library they intend to distribute should've tested it with a ForeignSecurityPrincipal and made sure it still worked.
My argument is just the opposite. I'm saying that not everyone could have even detected it, as it requires a multi-forest environment with certain conditions met to make the FSP unresolvable and thus cause the terminating error in Get-ADGroupMember. If I want to use strict mode without the ability to squash a broken function like this, I would have to refactor both my entire code base and all the 3rd party code. For what purpose? It's one command. PowerShell allows for command squashing, and in this instance there is a perfectly legitimate reason to do so. What best practice is being preserved here by denying it in strict mode?
Pester is an example where commands are squashed for mocking. Several modules wrap Out-Default to provide all kinds of additional functionality. Non-JEA constrained endpoints use proxy functions. There are tons of cases where squashing a function is common and best practice.
It would also mean that modules with the same function names would require refactoring consumer code to call the Module Qualified Command Name instead of relying on their own intentional load order. That is pretty unmaintainable as now your code base is coupled with a module name that might change.
It might also break situations where you are squashing code provided via external input. For example, code where ScriptBlocks are passed in and Write-Host might be wrapped to capture output, or any other rewrite need.
It creates as many problems as it solves, and I still contend that squashing is a legitimate use of PowerShell's command precedence. Taking the ability out of strict mode doesn't preserve any best practice, but may, in fact, break some.
You can do Import-Module -Name MyModule1 -Force to make it explicitly reload a Module so that it overwrites the MyModule2\Get-CoolStuff with the one you want from Module1
So in other words you have a propagation of bad design decisions that make guesses in the face of ambiguity because it wasn't restricted sooner, and now undoing it will break a lot of stuff. This seems to be running trend in PowerShell. That still doesn't make it a best practice. That just means that the language design handled ambiguity badly and people took advantage of the features available.
If I want to use strict mode without the ability to squash a broken function like this I would have to refactor both my entire code base and all the 3rd party code.
Why can't you temporarily disable strict mode for calls to the problematic third party library? Making name collisions as a work around more difficult would likely lead to better architected libraries in the long run.
Pester is an example where commands are squashed for mocking. Several modules wrap Out-Default to provide all kinds of additional functionality. Non-JEA constrained endpoints use proxy functions.
Pester, for mocking, is the only example here that doesn't make me say, "What the heck?"
I can think of no good reason to override Out-Default. If you want to do something else, just write a function with a different name and call it. And the fact that you said "several modules" already reveals the problem: which module's version are you getting? Why can't they use the standard mechanisms for overriding formatting anyway? There's a blog post explaining how to use PSStandardMembers for exactly this, for instance. And I sure hope they're not doing something crazy like overriding the default output streams. Writing code that messes with things as basic as the standard streams should darn well be a huge no-no.
Pester is a use case worth discussing and figuring something out for. But it's also something that won't be used in prod and has a very explicit, very narrow, very careful use case. It's one of the rare cases where leveraging this makes sense, but in virtually every other case, there's a cleaner way to implement things that imposes much less risk on the caller.
It would also mean that modules with the same function names would require refactoring consumer code to call the Module Qualified Command Name instead of relying on their own intentional load order. That is pretty unmaintainable as now your code base is coupled with a module name that might change.
If the module name changes, you have to change the list of module names in the requirements list anyway. A module name change already breaks your module, so you're not losing anything there. It adds at most a Replace All of ModuleName\ throughout your code. Additionally, widely used modules are unlikely to change their names anytime soon. This would be a rare event, and as I just described, not that difficult to change per script. You might even be able to do something like use Notepad++'s Find In Files function. Combined with a little caution and tests to make sure you didn't break anything, this isn't that big a deal. Maybe a little tedious, but tedious and reliable is better than, "my code is exploding or doing the wrong thing for reasons that are not immediately obvious now," and, "My code works sometimes and sometimes not depending on the user's global environment." Even consistent failure is better than those. So I think this is a straw man.
It might also break situations where you are squashing code provided via external input. For example, code where ScriptBlocks are passed in and Write-Host might be wrapped to capture output, or any other rewrite need.
This is an example of something you shouldn't do. A module has no business making this kind of decision for the caller. If the caller writes Write-Host, they should get Write-Host. You don't take this kind of control away from the user in code that has any measure of sanity. How does the author know whether the caller wants to use regular Write-Host or their custom Write-Host? If I ask for standard built in functionality, I should get it, and not more. This is exactly the kind of ambiguity that causes unexpected behavior that's going to cause some poor developer (like me) a lot of late nights and heartache.
And it's easy to not do. You just define Write-CustomHost or whatever. If the caller imports it using Import-Module MyCrazyWriteHostModule -Prefix Crazy, their Write-Host call isn't going to call your custom one anyway. But by allowing module authors to override everything including the kitchen sink to do this, you're taking away the ability of another developer to actually use the standard #Requires and RequiredModules declarations because they'll be forced to use arguments like -Prefix to have working code. Or they'll have to prefix everything with Microsoft.PowerShell.Utility.
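The alternative being suggested is easy to sketch (the module and function names here are hypothetical):

```powershell
# MyCrazyWriteHostModule.psm1 (hypothetical): expose the custom behavior
# under its own name instead of squashing the built-in Write-Host.
function Write-CustomHost {
    param([Parameter(ValueFromPipeline)] [object] $Message)
    process {
        # Custom capture/decoration logic would go here; then pass the
        # message through to the real, module-qualified Write-Host.
        Microsoft.PowerShell.Utility\Write-Host $Message
    }
}
```

The caller opts in by calling Write-CustomHost explicitly, and a plain Write-Host call keeps meaning exactly what it says.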
It creates as many problems as it solves, and I still contend that squashing is a legitimate use of PowerShell's command precedence. Taking the ability out of strict mode doesn't preserve any best practice, but may, in fact, break some.
The best practice it breaks is having code that I can depend on to do what the author tells it to and not more, while allowing us to leverage standard functionality like #Requires and RequiredModules.
If you don't implement this, the only way I can get that is to prefix everything anyway, which accomplishes nothing but making my code harder to read except for one or two functions and removes the ability to even hack my code by overriding the commands.
Not all 3rd party code is dangerous. If the answer is to turn off strict mode for 3rd party code, what incentive would there be to write modules with strict mode in mind or to vet modules based on strict mode adherence?
I'll add that other languages do not consider function squashing a strict mode violation. This includes JavaScript, PHP, and Perl.
The incentive for having strict mode on in the first place is to write better code. You said it yourself: it encourages following best practices, and it does that by reducing the chance that you'll write code that silently passes when it shouldn't. I can't stop a third party from not caring about whether their code is any good. However, if they design it for strict mode, I can have confidence that if I need to do some weird hack in the short term, I can turn off strict mode until I find a better library or something gets fixed. And if turning strict mode on causes their code by itself to fail, maybe I just won't use their library at all or I'll report bugs to them. Enforcing strict mode compliance is something that can only really be done by community pressure, anyway, since it's off by default.
You realize that two of those languages are regarded as some of the worst designed languages in the past 30 years, right? JavaScript is frequently ridiculed for having been designed in 10 days, has a famous book called The Good Parts (which implies there are lots of bad parts to avoid), and has crazy behavior as described in the wat video. PHP is... Well, it gets stuff like fractal of bad design and PHP Singularity because of all the poor ambiguity handling in it. These reputations are not wholly undeserved, even if you think it's somewhat exaggerated. Is that the reputation PowerShell wants?
@bladeoflight16
I can think of no good reason to override
Out-Default
There are simply some things you cannot get away with on PSStandardMembers. Also, there may be a need to capture all end of pipeline operations.
If the module name changes, you have to change the list of module names in the requirements list anyway
Right, a few lines of code versus potentially many more. I'll concede that it's not exactly a high risk or overly technical operation.
This is an example of something you shouldn't do. A module has no business making this kind of decision for the caller.
I would argue that in the case of DSLs, it most certainly does.
You realize that two of those languages are regarded as some of the worst designed languages in the past 30 years, right?
Considering that Python is pretty strict and allows it (though it doesn't really have a strict mode I'm aware of), and that these languages (regardless of their pros and cons) are language peers that do have strict modes, I felt they were relevant examples. It's not a novel concept, and it's a technique that has been used in functional programming languages. It's not unique to PowerShell. This kind of behavior is common both in interactive shells (bash aliasing ls to add coloring by default) and in scripting (some phpbb add-on modules did this back in the day).
IMO, this behavior is to functional programming what polymorphism is to object oriented programming.
Right, a few lines of code versus potentially many more. I'll concede that it's not exactly a high risk or overly technical operation.
Right, and what I'm suggesting means you don't have to repeat the module name in as many places to get predictable behavior in the normal case. Without functionality somewhere in the vein of what I'm suggesting, you're forced to choose between code you can depend on and code that's clean.
I went ahead and tested for my hypothetical X, Y, Z modules. I appear to be correct:
# MyModule3.psd1 (modified to require both conflicting modules)
@{
    ModuleVersion     = '1.0.0'
    GUID              = 'e17b3c87-6fc0-4457-a1e4-67233073da11'
    Author            = 'bladeoflight16'
    Description       = 'Stuff'
    PowerShellVersion = '6.0'
    RequiredModules   = @('MyModule2', 'MyModule1')
    NestedModules     = @('.\MyModule3.psm1')
}
Then,
PS> Import-Module MyModule1
PS> Import-Module MyModule3
PS> Get-EvenMoreCoolStuff
Cool stuff 2!
Even more cool stuff!
So module order in the declaration is already unreliable. All you need to do is have some situation where the modules are already imported out of the order you expect. Developers depending on it now already have bugs waiting to happen. What if two modules depend on a conflicting order? Do you realize how hard that would be to diagnose?
I would argue that in the case of DSL's, it most certainly does.
Why does a DSL need to redefine existing PowerShell commands? Why can't it just make its own?
This kind of behavior is common in both interactive shell (bash aliasing ls to add coloring by default)
bash is not comparable to PowerShell. bash does very little. It has a few niceties like functions and some simple string processing for convenience purposes, but its primary job is to invoke other tools and manage their parameters and pipelines. In other words, it focuses on orchestration between external tools that handle most of the heavy lifting.
PowerShell is, for better or worse, becoming a one-stop shop for Windows administration and essentially just another language for the .NET runtime. This means you're much more likely to have whole programs for managing your system written in it than you are in bash. And the proof is in the lack of stuff actually written in bash and distributed; mostly, you just see simple scripts on a blog post or scripts for launching some other tool. For full blown system administration on Linux, you see Python based and Ruby based tools like Ansible and Chef because they're full fledged programming languages that have the features and design to support large scale applications. At the scales where bash is typically used, name conflicts with functions are not a big problem.

By contrast, we're having conversations about thousands of lines of code explicitly written in PowerShell for management of a single set of applications/services, and you see things like DSC being implemented in it. People don't write build scripts in bash; they write them using make or other tools. PowerShell has psake and Invoke-Build for .NET projects; it even has its own testing frameworks for testing PowerShell code.

bash also doesn't even have a concept of anything other than scripts and the interactive command line, while PowerShell has modules, module manifests, and a package manager. bash is a vastly simpler, more stripped down tool with a much narrower purpose than PowerShell is.
IMO, this behavior is to functional programming what polymorphism is to object oriented programming.
Inheritance is commonly regarded as an undesirable design decision to be avoided. The popularity of the saying "prefer composition over inheritance" proves this much. .NET decided to switch the default relative to Java and made methods overridable only if you explicitly declared them to be so. They did this because allowing anyone to override any behavior tends to make your code less maintainable. It makes it harder to understand how your code is going to behave at runtime, and impossible to predict whether someone might break it in the future. I agree that at small scales (within a single script or two), it's manageable, but not with thousands of lines of code. It is far, far preferable to explicitly design your code in such a way that it's clear what the developer can control and what they can't. You can do this by allowing arguments (including for blocks of code) or having interfaces that must be implemented; in some rare cases, it's even okay to have some kind of global switch or setting. But "replace any piece of code with your own" is not really a winning strategy in the long run.
I'm completely open to alternatives for dealing with ambiguous calls, but the fact that PowerShell makes a guess about what the developer wants in the face of ambiguity means that developers are going to get unexpectedly bitten. I understand that implementing this is going to cause some developers problems in the near term, but in the long run, getting rid of guesses is going to be worth it. I've said this before: if you're not willing to make any breaking changes to PowerShell, it will never be a good language that helps you develop good code. It will always be a necessary evil that people hate to use because of the bad decisions in its past. Part of the reason for suggesting it be made part of strict mode is for the purpose of ensuring it's opt in early on, which gives developers time to refactor and adjust. I'm open to alternatives for phasing in, too. Maybe ugly warnings now, and actually build it in a couple major versions later? I know I don't know everything, but I do know that the way PowerShell currently handles this is dangerous and error prone, difficult to debug, and hard to prevent with only the functionality we have now. That isn't a good thing for the health of the code coming out of PowerShell's ecosystem.
but the fact that PowerShell makes a guess about what the developer wants in the face of ambiguity means that developers are going to get unexpectedly bitten.
Function definitions work like any other assignment. There is no expectation in any language (that I have ever been exposed to, at least) that the following is true:
```powershell
$a = 5
$a = 6
$a -eq 5
```
Functions are just another form of assignment. It's not making some random guess; PowerShell has a well defined command precedence. Commands of the same type with the same name in the same "module scope" work on a last-assigned-wins basis, overwriting the previous definition. Commands of the same type with the same name from different "module scopes" are chosen based on which was defined last. In areas where ambiguity exists, there are tools to ensure the correct command is called (module-qualified command names or invoking the results of Get-Command).
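The "functions are assignments" point can be sketched in Python (an illustrative example, not from this thread): redefining a name simply rebinds it, and subsequent calls resolve through the current binding, exactly like reassigning a variable.

```python
# Redefining a function rebinds the name; later calls resolve to the
# latest binding, just like reassigning a variable.
def greet():
    return "first"

saved = greet  # keep a reference to the original function object

def greet():   # no error: this simply rebinds the name
    return "second"

assert greet() == "second"  # the name resolves to the newest definition
assert saved() == "first"   # the saved reference still calls the original
```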
There is nothing ambiguous about that. The language expects that the author knows the order of operations they need to execute as it cannot make assumptions or presume to know the author's intent. Getting the desired results from the order of operations is the job of the programmer, not strict mode.
There are, as have been demonstrated, legitimate reasons for making use of PowerShell's command precedence to overwrite, wrap, or supersede commands of the same name. These uses cases are perfectly acceptable, even in strict mode.
Strict mode is meant to enforce best practices, not to prevent the author from erring by confusing their own order of operations. The language should not make assumptions that it knows better in that area.
Further, other language peers with strict modes permit this behavior in strict mode. Historically, this behavior has existed and been used in many functional languages for many reasons. In some instances it is best practice, and in others it can lead to harmful results. As with many things in any language, the tool is not to be faulted for misuse when it serves legitimate purposes.
Finally, I think we have reached a point where we are just repeating ourselves. I think I have covered everything on this issue I wish to. I will bow out and let others provide input.
Functions are just another form of assignment.
I think either this isn't really true or there's something I don't understand. If it is true, then why does an auto-import triggered by myscript2.ps1 affect the behavior of myscript1.ps1? Shouldn't those scripts have separate scopes? But the module import seems to affect them globally, even though I'm not dot sourcing anything into the top level scope.
@markekraus On that note, I'd like to make a point, and I really do want to hear your thoughts on it.
Where I see a big difference in PowerShell vs. Python here is that this assignment propagates to a lot of different scopes. Python has much more well defined scoping rules. Such an assignment would be attached to a specific object (which might also be a module, class, etc.). Now, while you might reference that object in more scopes via imports, these are all very explicit, and it's easy to trace what came from where. Combine this with the fact that monkey patching is considered unusual and avoided wherever possible (which is almost everywhere), and, maybe by virtue of community attitude alone, this means it's a problem you don't have to concern yourself over very much.
But PowerShell's scoping is much looser. Anything you do in the parent scope affects all child scopes, automatically and silently. And maybe even more globally than that when modules come into play, given what I mentioned above. This is pretty close to having a fully global state. So when it comes to the question of, "What actual function is being called?" I have to know the entire set of overrides in every single parent scope of whatever one I'm looking at. Notably, this can change the behavior of a module function I didn't even write mid-program. (As in it worked the first time I called it but not the second.)
To me, this seems very difficult to manage. If I have a couple thousand lines of code, I can't effectively keep the details of what order everything happened in my head. And with module dependencies in particular, I'm trying to build a system where I can simply say, "Give me this set of functions," and have it without worrying about when or where it got loaded. I want to build this to make it easier to reason about my code and simpler to organize my project. How do I accomplish that in PowerShell as it stands? How would you go about this? How do you keep track of all of those kinds of details as you write and modify thousands of lines of code?
Given a specific piece of code, I can probably come up with a way to implement it differently than intentionally overriding a function in the majority of cases, but I have a hard time coming up with ways to better manage the complexity and cognitive load this feature introduces in a medium to large code base, where scattering around different pieces is necessary to keep a handle on the complexity. Do you have any suggestions?
Anything you do in the parent scope affects all child scopes,
That depends on how the child scopes are called, same with Python.
def testfunction():
    print(str)
    return

str = 'Test'
testfunction()
or
def outerfunction():
    innerfunction()
    return

def innerfunction():
    print('foo')
    return

outerfunction()

def innerfunction():
    print('bar')
    return

outerfunction()
You have to be cognizant of scoping in functional programming languages because functions are assignments, assignments can and do change, and the default is that changes in the parent scope affect the child scope. When you author modules and you need certainty about which command is called, you ensure the command with module-qualified command names or via invoking results from Get-Command. In Python there are similar practices. Otherwise you are implicitly allowing the commands called in your module to be something else.
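One of those similar Python practices can be sketched with a stdlib module (json is chosen purely for illustration): capturing a direct reference to a function, as `from module import name` does, pins the binding so later monkey patching of the module attribute doesn't affect it, while qualified access through the module sees the rebinding.

```python
import json

snapshot = json.dumps  # like `from json import dumps`: a direct binding
original = json.dumps  # saved so we can undo the patch below

json.dumps = lambda *a, **kw: "patched!"  # monkey patch the module attribute

assert json.dumps({}) == "patched!"   # qualified access sees the patch
assert snapshot({}) == "{}"           # the captured binding is unaffected

json.dumps = original  # restore the real function
```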
This is perfectly fine. If I write a module that makes heavy use of Invoke-WebRequest, but the host environment needs to change the behavior of Invoke-WebRequest so they can comply with strict firewall rules or something, as an author I don't need to care about that. I certainly don't want to have to write my module to work for every little edge case. If they break my module because they tamper with their Invoke-WebRequest, well, that's their prerogative. But if I know for certain I need the "real" Get-Command, you better believe I'm writing my module with Microsoft.PowerShell.Core\Get-Command.
As a consumer of other modules, it's on me to see what functions are imported and up to me to choose the import order to address any clashes. If module authors are using vague command names, I may open an issue asking them to change to something more specific. But sometimes vague commands or clashing names are on purpose, to address certain issues. It's up to me as a consumer to review their documentation, and if their module is just too greedy or poorly written, then I don't want that in production anyway.
And with module dependencies in particular, I'm trying to build a system where I can simply say, "Give me this set of functions," and I have it without worrying about when or where it got loaded. ...
It's just a matter of front loading your effort to determine if conflicts exist and figuring out the best way to address them. I have similar issues with the SharePointOnline module and the SharePointOnlinePnP module, as they complement each other but have many commands in common. Sometimes I need one from one module and sometimes one from the other, and sometimes I have to switch back and forth between them.
For stand-alone scripts I opt for swapping back and forth with Import-Module -Force, usually because I'm doing copy/paste from another script and don't want to waste time rewriting. For larger, more modular/interconnected code, I use module-qualified command names. Some of the commands from both work almost identically, and I don't need to worry about which module I'm calling them from. In some cases I write my own functions that will call the correct function. It really depends on what the situation best calls for. Obviously, you have to have the correct command referenced somewhere; sometimes it makes sense to abstract that and other times it does not.
@markekraus
It seems to me that what you're missing is the difficulty of managing this when you start using modules and scripts that then have their own set of dependencies. You seem to be focused on a single layer of dependencies: you have one script, this script has a few dependencies that could potentially clash, and the things you depend on either do not have further dependencies or the dependencies they have do not clash. The situations I'm running into are more complex than that. I have multiple layers of dependencies, and the clashes are not solely in the top level script. I'll try to expand on the details of how that complexity plays out in practice.
That depends on how the child scopes are called, same with Python.
It is not the same with Python. Consider 3 modules:
a.py
import b

def do_c():
    print('custom c from a')

b.do_b()
b.py
from c import do_c

def do_b():
    do_c()
    print('doing b')
c.py
def do_c():
    print('doing c')
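The three modules above can be condensed into one runnable sketch by faking them with `types.ModuleType` (an assumption made purely so the example fits in one file; the behavior is the same with real .py files, and `return` is used in place of `print` so the results can be checked):

```python
import sys
import types

# Fake module c, registered in sys.modules so `from c import do_c` finds it.
c = types.ModuleType("c")
exec("def do_c():\n    return 'doing c'", c.__dict__)
sys.modules["c"] = c

# Fake module b, which binds c's do_c at import time.
b = types.ModuleType("b")
exec(
    "from c import do_c\n"
    "def do_b():\n"
    "    return do_c()\n",
    b.__dict__,
)

# Fake module a, which defines its own, unrelated do_c.
a = types.ModuleType("a")
exec("def do_c():\n    return 'custom c from a'", a.__dict__)

assert a.do_c() == "custom c from a"
assert b.do_b() == "doing c"  # a's do_c did not leak into b's scope
```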
In Python, each module has its own, independent state. b's call to do_c is unaffected by a defining its own do_c because a's do_c variable does not affect b's. b is isolated from changes in a's scope, unless another block of code explicitly reaches into b and changes the state (e.g., b.do_c = a.do_c). Doing so is known as monkey patching, and while it is used from time to time, it is not a widely encouraged practice because it can cause exactly the kind of problem I am describing. While each module's state is global in a sense and I could monkey patch c before b imports from it if I was careful about it, it is also possible for me to have local operations that do not propagate everywhere. Typically, monkey patching is either done early when the program launches to affect everything that imports the patched code, or it's done during a targeted frame of time (like with the mock package). But regardless of how it's done, it never happens unless you explicitly make it happen. In Python, my code only ever gets exactly what it asks for, even if it enables me to ask for crazy things.
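The "targeted frame of time" style of patching mentioned above can be sketched with the stdlib mock tools (os.getcwd is chosen purely for illustration): the override exists only inside the `with` block and is undone automatically on exit.

```python
import os
from unittest import mock

# The patch applies only inside the `with` block and is undone on exit.
with mock.patch("os.getcwd", return_value="/fake/dir"):
    assert os.getcwd() == "/fake/dir"   # patched here

assert os.getcwd() != "/fake/dir"       # automatically restored
```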
The difference is that PowerShell gives me the crazy things I didn't ask for and doesn't give me a good way to ask for the sane things. PowerShell does not provide the sort of isolation Python does. If Python worked like PowerShell, the reassignment in a would, within the scope of a only, change the behavior of do_b. This would be completely unexpected if I didn't know b depended on c or that c happened to have a function named do_c, and even if I knew those things, it's still unexpected that defining a function with the same name as one in two completely separate modules changes the behavior of those other modules, especially if I didn't even write them. Isolation of Python's kind would simplify development greatly: when a module declares its dependencies, it should get those dependencies (and preferably no others, to avoid undeclared dependencies on parent scope state) unless some outside code explicitly overrides them. Unfortunately, the cat is already out of the bag, and just making modules more isolated like this would be a breaking change in PowerShell. So my feature request attempted to suggest an alternative that would get most of the same benefits of this isolation.
As a consumer of other modules. it's on me to see what functions are imported and up to me to chose the import order to address any clashes.
I strongly disagree. It should not be on the caller to have a full, white box understanding of every module they use. I should not need to know every dependency of a module and whether that module's dependencies clash with my current script's modules. Requiring that of script writers greatly lessens the value of even having reusable modules to developers, who now have to go examine the source code of every module they are considering using. Even if I wrote the modules I'm using, I still must have constant awareness of whether I'm possibly introducing name conflicts with any possible child scope; that's a huge cognitive load. For example, if I have a MoreScheduledTasks module to supplement the built in ScheduledTasks module and it depends on ScheduledTasks, importing Carbon into my current script should not change the behavior of functions from MoreScheduledTasks. Such behavior is unexpected and cannot be anticipated without a full understanding of every line of code of the functions I call from MoreScheduledTasks. This is exactly the situation I faced, and it was difficult to diagnose, particularly since the Carbon dependency was a new addition to an existing, working script.
Additionally, it is not always possible to achieve the desired result merely through ordering. Consider 4 modules:
A1.psm1:
function Do-A() { Write-Host 'A1' }
B.psm1:
function Do-B() {
    Do-A
    Write-Host 'B'
}
Assume that B.psd1 declares a dependency on A1.
A2.psm1:
function Do-A() { Write-Host 'A2' }
C.psm1:
function Do-C() {
    Do-A
    Write-Host 'C'
}
Assume that C.psd1 declares a dependency on A2.
myscript.ps1:
#Requires -Modules B, C
Do-C
Do-B
This outputs:
A2
C
A2
B
There is no way to make this behave correctly, short of re-importing modules my script doesn't even want to use in very specific locations:
#Requires -Modules B, C
Import-Module A2 -Force
Do-C
Import-Module A1 -Force
Do-B
And this requires praying that the module import doesn't have side effects. It also largely negates the value of declaring those dependencies in my psd1 files; declaring those dependencies explicitly in the psd1 file should be enough.
And as my original example demonstrates, this even impacts command resolution in scripts that invoke myscript.ps1 (parent scopes!!!). They might have been depending on the exact opposite import order of A1 and A2. So to ensure the expected behavior, you have this sort of constant module management spidering out through every level of the code. Or I have to ensure I always explicitly reference A1\Do-A or A2\Do-A, which then eliminates the possibility of even doing monkey patching. I'm inclined to start using module qualification up front in all my PowerShell code because I don't want to have to go back and revisit all of my code every time I bring a new third party module into the mix.
In other words, getting predictable behavior out of PowerShell's command selection as a code base evolves and grows is impossible without an inordinate amount of effort. The justification for this difficulty is to enable monkey patching. But the solutions to the unpredictability either require a massive cognitive load of understanding and controlling what modules were imported in what order at every single line of code or to disable monkey patching altogether by referencing explicit module functions. The former is untenable beyond one or two scripts and modules, and the latter defeats the justification for even having the behavior.