PowerShell: How does PowerShell make breaking changes?

Created on 7 Jul 2020  ·  8 Comments  ·  Source: PowerShell/PowerShell

From discussion in https://github.com/PowerShell/PowerShell/issues/11674, particularly in https://github.com/PowerShell/PowerShell/issues/11674#issuecomment-652712978.

There's also related discussion in https://github.com/PowerShell/PowerShell/issues/13068#issuecomment-654597300.

We have the breaking change contract, which basically promises that there will never be significant breaking changes in PowerShell, for some value of "significant".

While a reluctance to make breaking changes is a strength in a platform, and there are definitely some breaking changes that are too divergent to implement, I think no software can stay evergreen without making breaking changes. Indeed PowerShell has already made a number of them, but in an ad-hoc way that doesn't properly catalogue or otherwise handle them. In particular, PowerShell has some "bad old" legacy edge- and corner-cases that would be nice to change, but to do so we need some pathway to making those changes.

I think we need to confront PowerShell's need to make breaking changes and discuss what can be implemented, at both process and technical levels, to ease such changes.

Basically we need a way to implement breaking changes softly, so that:

  • The changes are out there, but old scripts can still run against them
  • Users have a way to discover breaking changes, see where they will be affected, and are given time to migrate

Some ideas off the top of my head:

  • The experimental feature system could be expanded to also capture and opt in/out of breaking changes
  • We already have the breaking change PR tag, but should expand this to make it easier to query
  • We should establish some sane way of funneling breaking changes into PSScriptAnalyzer. For Parser changes, this hits the hurdle of needing a particular version of the parser (one of the reasons I've argued strongly against syntax breaking changes). To address this, we might need to spin the parser out into its own NuGet package
  • We probably would need a review process to decide which breaking changes can be kept and which must be reverted. This could be very hard to establish, though, since, given our usage statistics, the likelihood is low that a breaking change will be caught early by a script that depends on the old behaviour
  • We would also need a way to ensure that breaking changes are only picked up as default behaviour in the right semantic versions, or at least in the versions desired by the PowerShell maintainers
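The first bullet above already has most of its machinery in place. As a minimal sketch, assuming breaking changes were registered as experimental features under an invented `PSBreakingChange.*` naming convention (the feature name below is hypothetical, so the enable call is shown commented out):

```powershell
# Sketch only: reusing the existing experimental-feature cmdlets to surface
# breaking changes. The 'PSBreakingChange.*' names are invented for illustration.
Get-ExperimentalFeature | Where-Object Name -Like 'PSBreakingChange.*'

# Opt in for the current user; experimental features only take effect
# in a new session:
# Enable-ExperimentalFeature -Name 'PSBreakingChange.StrictStringComparison' -Scope CurrentUser
```

Discovery and opt-in would then come for free from the existing cmdlets, and the experimental-feature list in `powershell.config.json` would record exactly which breaking behaviours a given machine has enabled.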

Crucially, anything we implement must scale so it can lean heavily on automation. Processes requiring heavy time investments are unlikely to work no matter how aspirational the language.

Issue-Enhancement

All 8 comments

Agreed with all of this. I don't have fully-formed thoughts here yet, but something I've been turning over in my mind as a potential way of _managing_ such changes in the time leading up to when it's acceptable having them in an actual release version of PowerShell might be to maintain a vNext branch that is rebased periodically to incorporate the lesser changes from the main branch, and used to house breaking changes so that we can have regular builds of those, a place to review those changes on their own merit with a little less of a worry of breaking things immediately if they're accepted.

All of what you've mentioned will be important as well, but I think the fact that these kinds of changes will naturally take significantly more time to make it to any kind of release means that unless we're willing to manage them in some way in this project itself, many otherwise quite useful / important features could end up in a more or less abandoned state with quite a lot of work required to rebase that into the current main branch. To me, it seems like it would be significantly easier to handle those piecemeal, as they come up, rather than incurring the cost of reconciling the change within (for example) a targeted release week, when suddenly one or many "breaking change" PRs then need to be rebased, updated, and merged.

That aside... _definitely_ agree that we need a proper plan for handling these. The breaking change contract makes sense to a point, but to avoid the project becoming more technical debt than commonly-used or desirable features, we still need a way to make breaking changes happen when required. The bar for what constitutes a sufficiently "valuable" breaking change proposition may need to be clearly defined, and it might make sense to "stockpile" the more incremental ones when we're able, so that we can bundle related ones together and make a recognisable, feature-centred breaking change release. That (I think) would help customers both argue for taking the new breaks and have clearly defined versions that are well known to break one or more features for the sake of improvements in that area.

Sorry about the wall of text, this is lovely to see. Thank you for writing this up! 💖

This is 100% something that is not yet an issue but is definitely a discussion, a feature that GitHub recently added. I would love to see more of these types of conversations happen in Discussions, so that actual issues can be better managed and prioritised, and not so easily lost in the sea of discussions.

So @joeyaiello, @SteveL-MSFT, I will be raising the use of Discussions in the PowerShell repos as a point for the community call, as they will be so useful to us.

But on the point of this discussion: I personally think we need a number of new language modes to supplement the NoLanguage, ConstrainedLanguage, and FullLanguage modes. That is probably the only way we can easily and safely manage this, though it will likely make the code base more brittle going forward.

We have asked for something to cover this before in the RFC repo, called optional changes. It was in some ways asked for exactly this purpose, but was rejected; see https://github.com/PowerShell/PowerShell-RFC/pull/220 for more details.

> The experimental feature system could be expanded to also capture and opt in/out of breaking changes

Otherwise, I personally really want this as a feature, as it would allow other things to adapt more quickly. And, like you say here:

> We should establish some sane way of funneling breaking changes into PSScriptAnalyzer. For Parser changes, this hits the hurdle of needing a particular version of the parser (one of the reasons I've argued strongly against syntax breaking changes). To address this, we might need to spin the parser out into its own NuGet package

there's nothing stopping us sticking to specific default versions of PSScriptAnalyzer running on certain versions of PowerShell, hosted in a zip-install fashion on customised agents. In all fairness, that isn't much of a challenge (or cost) to run and maintain.
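The zip-install pinning described above might look something like this on a customised build agent (the version number and paths are illustrative, not recommendations):

```powershell
# Sketch: pin a known PSScriptAnalyzer version on a build agent and run it
# against the repository. 1.19.1 and C:\Tools\Modules are illustrative only.
Save-Module -Name PSScriptAnalyzer -RequiredVersion 1.19.1 -Path C:\Tools\Modules
Import-Module C:\Tools\Modules\PSScriptAnalyzer
Invoke-ScriptAnalyzer -Path .\src -Recurse -Severity Warning, Error
```

Because the version is pinned explicitly rather than resolved from the gallery at run time, every agent analyses scripts against the same rule set.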

@kilasuit while those are all potential points of discussion, I'm not sure most of them are really relevant to the issue at hand here, except for the point about experimental/optional features. To me, that appears to be more a way to _avoid_ making breaking changes than a way to plan for and move towards necessary breaking changes.

And just as much as we can pin PSSA versions and build specific versions into our images, we can just as easily do about the same thing with PowerShell versions themselves, so I don't think the additional encumbrance of support and maintenance for optional/experimental features really meets the need stated by Rob here.

Nah, you've misunderstood. If you created a series of language modes that could run under PowerShell but be constrained to just the small breaking changes we want to potentially make, you'd gain ease of implementation by being able to wrap each breaking change in a language-mode `if` clause.
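A concrete sketch of that wrapping idea, with an invented `PSLegacy` mode name (today only FullLanguage, ConstrainedLanguage, RestrictedLanguage, and NoLanguage exist):

```powershell
# Hypothetical guard inside the engine or a cmdlet implementation:
# keep the old behaviour under an invented 'PSLegacy' mode, and the
# new (breaking) behaviour everywhere else.
$mode = $ExecutionContext.SessionState.LanguageMode
if ("$mode" -eq 'PSLegacy') {
    # old, pre-breaking-change behaviour
    Write-Verbose 'Running with legacy semantics'
} else {
    # new behaviour that would otherwise be a breaking change
    Write-Verbose 'Running with current semantics'
}
```

The appeal is that each break becomes a single, greppable branch point; the cost, as discussed below, is that every such branch multiplies the states that have to be tested.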

And if we had, say, 100 breaking changes in the next 7 years of PowerShell, we'd need a suitable and manageable test matrix of PowerShell versions to be able to test it easily. That would be messy, but utilising language modes would make it less messy and massively more manageable to test and maintain, particularly if we could get Pester v6 to make use of thread jobs for testing, and to do so in the various different language modes.

And yes, you are right: it's more about avoiding unnecessarily breaking someone for using something that is no longer the norm, but still supported and a valid way of working, while giving script and module authors choice and flexibility in how they adopt updates to functionality that may break them.

> And if we had, say, 100 breaking changes in the next 7 years of PowerShell, we'd need a suitable and manageable test matrix of PowerShell versions to be able to test it easily.

100 breaking changes is 100^2 unique testing states in your test matrix. How do you quantify a "_manageable_ test matrix"? Do you know of any open source projects _managing_ test matrices of 100^2 size?

I don't think we have had that number of breaking changes across all of PowerShell, or even .NET, so far, so it was a _theoretical_ figure, not a literal one.

Plus, the support lifecycle of PowerShell matching .NET's would allow us to group a number of these together in specific PowerShell versions, which would then make removing them in later versions much simpler if we just add each group as a language mode. That makes it simpler and easier to maintain in the longer term, but puts the onus of maintenance on the admin or module author who relies on, or is hesitant to remove, older, lesser-used, or suboptimal functionality, not on the PowerShell team.

It also makes the end user experience cleaner, particularly the new to PowerShell end user experience, which is actually the most important thing to consider here, not the seasoned developer experience.

Hi @kilasuit :) Thanks for adding your thoughts.

I didn't quite see an answer to my questions. I think what you're saying is that manageability depends on how you define LanguageMode. If it's individual feature flags per language feature, then it's n^2. If it's feature flags per language version, then it's asymptotic to a polynomial series (n-k_0)^2 + (n-k_1)^2 + ... + (n-k_{n-1})^2 + (n-k_n)^2. This is a fairly standard trick when dealing with combinatorial explosion, but the issue is that it is no longer a test matrix but test matrices, and those test matrices require maintenance. For an open source project, do you think that would increase the bar for collaboration when running such a test suite? I think this deserves concrete prototyping before moving forward with such an approach.
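One quick way to see the size difference between the two framings (the grouping of 100 features into 10 language versions is invented purely for illustration):

```powershell
# Pairwise states with one flag per feature: n^2.
$n = 100
$perFeatureStates = $n * $n

# Grouping the same 100 features into 10 language versions of 10 features
# each gives 10 independent matrices of 10^2 states.
$versions = 10
$featuresPerVersion = $n / $versions
$perVersionStates = $versions * ($featuresPerVersion * $featuresPerVersion)

"{0} states vs {1} states" -f $perFeatureStates, $perVersionStates   # "10000 states vs 1000 states"
```

Grouping shrinks the total, but each of those 10 smaller matrices is a separate artifact that someone has to keep up to date as features move between versions.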

Andrew Vickers of the Entity Framework team even told me flat out that while he came from a background of building complex test matrices, for EFCore he consciously chose against it, because it often made it hard to decipher best practices in configuring and using the software.

I used to be a big supporter of Orthogonal Array Testing Strategy (OATS) and feature composition and testing pairwise combinations of features to achieve a three sigma feature composition confidence interval. I am not such a supporter today.

Hi @jzabroski

> Hi @kilasuit :) Thanks for adding your thoughts.
>
> I didn't quite see an answer to my questions.

You are right, I purposely didn't respond to your question, mainly because I think we are getting too technical here about an implementation that:
1) has not been decided yet - my suggestion is just that, a suggestion, because this will be hard to get right and I'm not overly convinced it makes sense to investigate it as an option;
2) may never really be possible, for the reasons you went on to mention;
3) has in some ways been rejected already by the committee on a number of previous occasions.

> I think what you're saying is that manageability depends on how you define LanguageMode.

Possibly, but like I said above, this was a mere basic suggestion; if I felt it had more merit, I would have raised a more detailed RFC for further discussion, as per this project's long-standing process.

I also chose not to respond to this particular question because that's realistically an unknown unknown for this repo and how it may look in the future, and again it is getting too technical to be conducive to the overall direction of the thread.

> How do you quantify "manageable test matrix"?

I'm good at pushing something that I believe in, but also equally good at pushing overly complex solutions that aren't that feasible or sensible, though I have been getting much better at identifying the latter cases and bowing out of the conversation, which I was attempting to do here. I should perhaps have mentioned that I'm not confident in the suggested approach before bailing; I will make sure to do that in future 🙂

Also, whilst I was aware of the mathematics behind this stuff, it's been a while since I had a real need to work out an equation like the one below, which again points to us getting off track and falling foul of point 1 above.

> If it's individual feature flags per language feature, then it's n^2. If it's feature flags per language version, then it's asymptotic to a polynomial series (n-k_0)^2 + (n-k_1)^2 + ... + (n-k_{n-1})^2 + (n-k_n)^2. This is a fairly standard trick when dealing with combinatorial explosion, but the issue is that it is no longer a test matrix but test matrices, and those test matrices require maintenance. For an open source project, do you think that would increase the bar for collaboration when running such a test suite? I think this deserves concrete prototyping before moving forward with such an approach.

I will bail out at this point and just say that we've discussed this request before, at great length, but we almost always come to the conclusion that this is a level of hard. Getting a solution that can be agreed on is something I don't think we will be able to do easily, and I'm not sure walls of text on here will get us to a point where we can easily manage it. But it's definitely encouraging to see passion for this from @rjmholt as well as from the community; perhaps we can find a workable solution this time, who knows 🤷‍♂️
